2024-01-15
|
11:20 |
<btullis> |
running puppet on an-master1003 to set it to active for T332573 |
[analytics] |
11:16 |
<btullis> |
running puppet on journal nodes first for T332573 |
[analytics] |
11:10 |
<btullis@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-coord[1001-1004].eqiad.wmnet with reason: Bringing new nameservers into service |
[production] |
11:10 |
<btullis@cumin1002> |
START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-coord[1001-1004].eqiad.wmnet with reason: Bringing new nameservers into service |
[production] |
11:10 |
<btullis@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-master[1001-1004].eqiad.wmnet with reason: Bringing new nameservers into service |
[production] |
11:10 |
<btullis@cumin1002> |
START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-master[1001-1004].eqiad.wmnet with reason: Bringing new nameservers into service |
[production] |
11:09 |
<jiji@cumin1002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1037.eqiad.wmnet |
[production] |
11:08 |
<btullis@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 8 hosts with reason: Bringing new nameservers into service |
[production] |
11:08 |
<btullis@cumin1002> |
START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on 8 hosts with reason: Bringing new nameservers into service |
[production] |
11:08 |
<btullis@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 97 hosts with reason: Bringing new nameservers into service |
[production] |
11:07 |
<btullis@cumin1002> |
START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on 97 hosts with reason: Bringing new nameservers into service |
[production] |
11:03 |
<btullis> |
stopping all hadoop services |
[analytics] |
11:03 |
<jiji@cumin1002> |
START - Cookbook sre.hosts.reboot-single for host mc1037.eqiad.wmnet |
[production] |
10:59 |
<btullis> |
disabling puppet on all hadoop nodes |
[analytics] |
10:58 |
<jiji@cumin1002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-gp1002.eqiad.wmnet |
[production] |
10:54 |
<btullis> |
putting HDFS into safe mode for T332573 |
[analytics] |
10:51 |
<jiji@cumin1002> |
START - Cookbook sre.hosts.reboot-single for host mc-gp1002.eqiad.wmnet |
[production] |
10:48 |
<moritzm> |
installing systemd bugfix updates from Bullseye point release |
[production] |
10:30 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc1037.eqiad.wmnet |
[production] |
10:13 |
<jmm@cumin2002> |
START - Cookbook sre.puppet.migrate-host for host mc1037.eqiad.wmnet |
[production] |
10:08 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host mc-gp1002.eqiad.wmnet |
[production] |
10:02 |
<ladsgroup@deploy2002> |
Finished scap: Backport for [[gerrit:990424|SecurePoll: Adding updated voterlist files (T349263)]] (duration: 16m 04s) |
[production] |
09:58 |
<jmm@cumin2002> |
START - Cookbook sre.puppet.migrate-host for host mc-gp1002.eqiad.wmnet |
[production] |
09:56 |
<ladsgroup@deploy2002> |
ladsgroup: Continuing with sync |
[production] |
09:48 |
<ladsgroup@deploy2002> |
ladsgroup: Backport for [[gerrit:990424|SecurePoll: Adding updated voterlist files (T349263)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
09:46 |
<ladsgroup@deploy2002> |
Started scap: Backport for [[gerrit:990424|SecurePoll: Adding updated voterlist files (T349263)]] |
[production] |
09:18 |
<taavi> |
reboot stuck tools-k8s-worker-84 |
[tools] |
09:16 |
<pfischer@deploy2002> |
helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
09:16 |
<pfischer@deploy2002> |
helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
09:15 |
<pfischer@deploy2002> |
helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
09:15 |
<pfischer@deploy2002> |
helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
09:15 |
<pfischer@deploy2002> |
helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
09:14 |
<pfischer@deploy2002> |
helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
08:45 |
<filippo@deploy2002> |
Finished deploy [performance/arc-lamp@67389a0]: (no justification provided) (duration: 00m 05s) |
[production] |
08:45 |
<filippo@deploy2002> |
Started deploy [performance/arc-lamp@67389a0]: (no justification provided) |
[production] |
08:23 |
<dcausse@deploy2002> |
Finished scap: Backport for [[gerrit:990029|enable page_rerender for 5th batch of wikis (T351503)]] (duration: 11m 40s) |
[production] |
08:17 |
<dcausse@deploy2002> |
pfischer and dcausse: Continuing with sync |
[production] |
08:13 |
<dcausse@deploy2002> |
pfischer and dcausse: Backport for [[gerrit:990029|enable page_rerender for 5th batch of wikis (T351503)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
08:12 |
<dcausse@deploy2002> |
Started scap: Backport for [[gerrit:990029|enable page_rerender for 5th batch of wikis (T351503)]] |
[production] |
04:57 |
<andrewbogott> |
restarting wikitech-static, oom |
[production] |