|
2026-01-19
§
|
| 11:46 |
<moritzm> |
installing openjpeg2 security updates |
[production] |
| 11:44 |
<btullis@cumin1003> |
START - Cookbook sre.hosts.reboot-single for host an-worker1193.eqiad.wmnet |
[production] |
| 11:43 |
<btullis@cumin1003> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1206.eqiad.wmnet |
[production] |
| 11:43 |
<brouberol@deploy2002> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-search: apply |
[production] |
| 11:43 |
<brouberol@deploy2002> |
helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-search: apply |
[production] |
| 11:38 |
<marostegui@cumin1003> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1160.eqiad.wmnet with reason: Maintenance |
[production] |
| 11:37 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Depool db1160 T414542', diff saved to https://phabricator.wikimedia.org/P87748 and previous config saved to /var/cache/conftool/dbconfig/20260119-113722-marostegui.json |
[production] |
| 11:35 |
<btullis@cumin1003> |
START - Cookbook sre.hosts.reboot-single for host an-worker1206.eqiad.wmnet |
[production] |
| 11:35 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Promote db1244 to s4 primary T414542', diff saved to https://phabricator.wikimedia.org/P87747 and previous config saved to /var/cache/conftool/dbconfig/20260119-113518-marostegui.json |
[production] |
| 11:34 |
<marostegui> |
Starting s4 eqiad failover from db1160 to db1244 - T414542 |
[production] |
| 11:34 |
<brouberol@deploy2002> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply |
[production] |
| 11:34 |
<brouberol@deploy2002> |
helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply |
[production] |
| 11:33 |
<btullis@cumin1003> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1187.eqiad.wmnet |
[production] |
| 11:31 |
<vgutierrez@cumin1003> |
END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp7004.magru.wmnet |
[production] |
| 11:31 |
<vgutierrez@cumin1003> |
START - Cookbook sre.hosts.remove-downtime for cp7004.magru.wmnet |
[production] |
| 11:30 |
<vgutierrez@cumin1003> |
END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 111 hosts |
[production] |
| 11:29 |
<vgutierrez@cumin1003> |
START - Cookbook sre.hosts.remove-downtime for 111 hosts |
[production] |
| 11:28 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Set db1244 with weight 0 T414542', diff saved to https://phabricator.wikimedia.org/P87746 and previous config saved to /var/cache/conftool/dbconfig/20260119-112825-marostegui.json |
[production] |
| 11:28 |
<marostegui@cumin1003> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 42 hosts with reason: Primary switchover s4 T414542 |
[production] |
| 11:26 |
<btullis@cumin1003> |
START - Cookbook sre.hosts.reboot-single for host an-worker1187.eqiad.wmnet |
[production] |
| 10:58 |
<brouberol@deploy2002> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply |
[production] |
| 10:56 |
<brouberol@deploy2002> |
helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply |
[production] |
| 10:55 |
<brouberol@deploy2002> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. |
[production] |
| 10:54 |
<brouberol@deploy2002> |
helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. |
[production] |
| 10:52 |
<brouberol@deploy2002> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. |
[production] |
| 10:51 |
<brouberol@deploy2002> |
helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. |
[production] |
| 10:39 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db1262 (T413525)', diff saved to https://phabricator.wikimedia.org/P87745 and previous config saved to /var/cache/conftool/dbconfig/20260119-103917-marostegui.json |
[production] |
| 10:29 |
<Emperor> |
restart apus rgws in eqiad |
[production] |
| 10:29 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P87744 and previous config saved to /var/cache/conftool/dbconfig/20260119-102909-marostegui.json |
[production] |
| 10:24 |
<ayounsi@cumin1003> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2003.codfw.wmnet |
[production] |
| 10:19 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P87743 and previous config saved to /var/cache/conftool/dbconfig/20260119-101901-marostegui.json |
[production] |
| 10:12 |
<ayounsi@cumin1003> |
START - Cookbook sre.hosts.reboot-single for host sretest2003.codfw.wmnet |
[production] |
| 10:11 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Depooling db2240 (T413525)', diff saved to https://phabricator.wikimedia.org/P87742 and previous config saved to /var/cache/conftool/dbconfig/20260119-101136-marostegui.json |
[production] |
| 10:11 |
<marostegui@cumin1003> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2240.codfw.wmnet with reason: Maintenance |
[production] |
| 10:11 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db2248 (T413525)', diff saved to https://phabricator.wikimedia.org/P87741 and previous config saved to /var/cache/conftool/dbconfig/20260119-101111-marostegui.json |
[production] |
| 10:08 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db1262 (T413525)', diff saved to https://phabricator.wikimedia.org/P87740 and previous config saved to /var/cache/conftool/dbconfig/20260119-100852-marostegui.json |
[production] |
| 10:01 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P87739 and previous config saved to /var/cache/conftool/dbconfig/20260119-100103-marostegui.json |
[production] |
| 09:50 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P87738 and previous config saved to /var/cache/conftool/dbconfig/20260119-095055-marostegui.json |
[production] |
| 09:40 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db2248 (T413525)', diff saved to https://phabricator.wikimedia.org/P87737 and previous config saved to /var/cache/conftool/dbconfig/20260119-094048-marostegui.json |
[production] |
| 09:35 |
<brouberol@deploy2002> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply |
[production] |
| 09:35 |
<brouberol@deploy2002> |
helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply |
[production] |
| 09:16 |
<brouberol@deploy2002> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. |
[production] |
| 09:15 |
<brouberol@deploy2002> |
helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. |
[production] |
| 08:58 |
<physikerwelt> |
running `mkdir -p /srv/qlever2` cd |
[wikiqlever] |
| 08:50 |
<dpogorzelski@deploy2002> |
helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. |
[production] |
| 08:49 |
<dpogorzelski@deploy2002> |
helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. |
[production] |
| 08:47 |
<XioNoX> |
continue asw1-b12-drmrs troubleshooting - T413181 |
[production] |
| 08:46 |
<dpogorzelski@deploy2002> |
helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. |
[production] |
| 08:45 |
<dpogorzelski@deploy2002> |
helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. |
[production] |
| 08:20 |
<brouberol@deploy2002> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. |
[production] |