|
2024-03-01
|
| 14:08 |
<claime> |
Pooled and uncordoned mw1387.eqiad.wmnet mw1389.eqiad.wmnet mw1391.eqiad.wmnet mw1393.eqiad.wmnet mw1395.eqiad.wmnet mw1397.eqiad.wmnet - T351074 |
[production] |
| 14:08 |
<cgoubert@cumin2002> |
conftool action : set/weight=10:pooled=yes; selector: name=(mw1387.eqiad.wmnet|mw1389.eqiad.wmnet|mw1391.eqiad.wmnet|mw1393.eqiad.wmnet|mw1395.eqiad.wmnet|mw1397.eqiad.wmnet),cluster=kubernetes,service=kubesvc |
[production] |
| 14:00 |
<claime> |
Running homer 'cr*eqiad*' commit 'T351074' |
[production] |
| 13:59 |
<cgoubert@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1393.eqiad.wmnet with OS bullseye |
[production] |
| 13:57 |
<elukey@deploy2002> |
helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. |
[production] |
| 13:57 |
<elukey@deploy2002> |
helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. |
[production] |
| 13:56 |
<cgoubert@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1397.eqiad.wmnet with OS bullseye |
[production] |
| 13:53 |
<cgoubert@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1389.eqiad.wmnet with OS bullseye |
[production] |
| 13:51 |
<cgoubert@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1391.eqiad.wmnet with OS bullseye |
[production] |
| 13:48 |
<cgoubert@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1395.eqiad.wmnet with OS bullseye |
[production] |
| 13:46 |
<cgoubert@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1387.eqiad.wmnet with OS bullseye |
[production] |
| 13:43 |
<wmbot~taavi@tools-sgebastion-11> |
toolforge jobs restart redis2irc |
[tools.wikibugs] |
| 13:41 |
<cgoubert@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1393.eqiad.wmnet with reason: host reimage |
[production] |
| 13:40 |
<elukey@deploy2002> |
helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. |
[production] |
| 13:40 |
<elukey@deploy2002> |
helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. |
[production] |
| 13:38 |
<cgoubert@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1397.eqiad.wmnet with reason: host reimage |
[production] |
| 13:35 |
<cgoubert@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1389.eqiad.wmnet with reason: host reimage |
[production] |
| 13:33 |
<cgoubert@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1391.eqiad.wmnet with reason: host reimage |
[production] |
| 13:30 |
<cgoubert@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1395.eqiad.wmnet with reason: host reimage |
[production] |
| 13:28 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1169.eqiad.wmnet with reason: Maintenance |
[production] |
| 13:28 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 12:00:00 on db1169.eqiad.wmnet with reason: Maintenance |
[production] |
| 13:28 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1163 (T354015)', diff saved to https://phabricator.wikimedia.org/P58285 and previous config saved to /var/cache/conftool/dbconfig/20240301-132824-marostegui.json |
[production] |
| 13:28 |
<cgoubert@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1387.eqiad.wmnet with reason: host reimage |
[production] |
| 13:26 |
<cgoubert@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw1397.eqiad.wmnet with reason: host reimage |
[production] |
| 13:26 |
<cgoubert@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw1395.eqiad.wmnet with reason: host reimage |
[production] |
| 13:26 |
<cgoubert@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw1393.eqiad.wmnet with reason: host reimage |
[production] |
| 13:26 |
<cgoubert@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw1391.eqiad.wmnet with reason: host reimage |
[production] |
| 13:25 |
<cgoubert@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw1389.eqiad.wmnet with reason: host reimage |
[production] |
| 13:25 |
<cgoubert@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw1387.eqiad.wmnet with reason: host reimage |
[production] |
| 13:13 |
<cgoubert@cumin2002> |
START - Cookbook sre.hosts.reimage for host mw1397.eqiad.wmnet with OS bullseye |
[production] |
| 13:13 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P58284 and previous config saved to /var/cache/conftool/dbconfig/20240301-131318-marostegui.json |
[production] |
| 13:13 |
<cgoubert@cumin2002> |
START - Cookbook sre.hosts.reimage for host mw1395.eqiad.wmnet with OS bullseye |
[production] |
| 13:12 |
<cgoubert@cumin2002> |
START - Cookbook sre.hosts.reimage for host mw1393.eqiad.wmnet with OS bullseye |
[production] |
| 13:12 |
<cgoubert@cumin2002> |
START - Cookbook sre.hosts.reimage for host mw1391.eqiad.wmnet with OS bullseye |
[production] |
| 13:12 |
<cgoubert@cumin2002> |
START - Cookbook sre.hosts.reimage for host mw1389.eqiad.wmnet with OS bullseye |
[production] |
| 13:11 |
<cgoubert@cumin2002> |
START - Cookbook sre.hosts.reimage for host mw1387.eqiad.wmnet with OS bullseye |
[production] |
| 13:03 |
<jynus> |
refreshing image metadata of commons Алтарна_частина.jpg |
[production] |
| 13:02 |
<claime> |
Depooling mw1387.eqiad.wmnet,mw1389.eqiad.wmnet,mw1391.eqiad.wmnet,mw1393.eqiad.wmnet,mw1395.eqiad.wmnet,mw1397.eqiad.wmnet for reimage to k8s nodes - T351074 |
[production] |
| 12:58 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P58283 and previous config saved to /var/cache/conftool/dbconfig/20240301-125812-marostegui.json |
[production] |
| 12:43 |
<wmbot~superpes@tools-sgebastion-10> |
Restarted Stewardbot which had quit from IRC |
[tools.stewardbots] |
| 12:43 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1163 (T354015)', diff saved to https://phabricator.wikimedia.org/P58282 and previous config saved to /var/cache/conftool/dbconfig/20240301-124306-marostegui.json |
[production] |
| 12:18 |
<wmbot~superpes@tools-sgebastion-10> |
Restarted Stewardbot which had quit from IRC |
[tools.stewardbots] |
| 12:07 |
<dcaro> |
restarted nova-api on cloudcontrol100* as it was very slow |
[admin] |
| 12:04 |
<dcaro> |
restarted nova-api on cloudcontrol1005 as it was very slow |
[admin] |
| 11:58 |
<cgoubert@deploy2002> |
helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply |
[production] |
| 11:58 |
<cgoubert@deploy2002> |
helmfile [eqiad] START helmfile.d/services/mw-api-int: apply |
[production] |
| 11:56 |
<cgoubert@deploy2002> |
helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply |
[production] |
| 11:55 |
<cgoubert@deploy2002> |
helmfile [codfw] START helmfile.d/services/mw-api-int: apply |
[production] |
| 11:54 |
<btullis@cumin1002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1173.eqiad.wmnet |
[production] |
| 11:51 |
<wmbot~dcaro@urcuchillay> |
END (PASS) - Cookbook wmcs.openstack.cloudvirt.vm_console (exit_code=0) |
[codesearch] |