2023-06-13
13:02 <fabfur@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4050.ulsfo.wmnet [production]
13:01 <fabfur@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4042.ulsfo.wmnet [production]
13:01 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1167 (T336886)', diff saved to https://phabricator.wikimedia.org/P49421 and previous config saved to /var/cache/conftool/dbconfig/20230613-130129-ladsgroup.json [production]
13:01 <moritzm> installing nbconvert security updates [production]
12:55 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2109.codfw.wmnet with reason: Maintenance [production]
12:55 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on db2109.codfw.wmnet with reason: Maintenance [production]
12:51 <fabfur> reboot cp4042 and cp4050 for kernel upgrade (T335835) [production]
12:51 <fabfur@cumin1001> START - Cookbook sre.hosts.reboot-single for host cp4042.ulsfo.wmnet [production]
12:51 <fabfur@cumin1001> START - Cookbook sre.hosts.reboot-single for host cp4050.ulsfo.wmnet [production]
12:46 <akosiaris@deploy1002> helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply [production]
12:46 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P49420 and previous config saved to /var/cache/conftool/dbconfig/20230613-124623-ladsgroup.json [production]
12:45 <akosiaris@deploy1002> helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply [production]
12:45 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance [production]
12:45 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance [production]
12:45 <akosiaris@deploy1002> helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply [production]
12:44 <akosiaris@deploy1002> helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply [production]
12:44 <akosiaris@deploy1002> helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply [production]
12:44 <akosiaris@deploy1002> helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply [production]
12:35 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1225.eqiad.wmnet with reason: Maintenance [production]
12:35 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on db1225.eqiad.wmnet with reason: Maintenance [production]
12:31 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P49419 and previous config saved to /var/cache/conftool/dbconfig/20230613-123117-ladsgroup.json [production]
12:29 <fabfur@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4049.ulsfo.wmnet [production]
12:28 <fabfur@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4041.ulsfo.wmnet [production]
12:26 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1223.eqiad.wmnet with reason: Maintenance [production]
12:25 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on db1223.eqiad.wmnet with reason: Maintenance [production]
12:18 <fabfur> reboot cp4041 and cp4049 for kernel upgrade (T335835) [production]
12:18 <fabfur@cumin1001> START - Cookbook sre.hosts.reboot-single for host cp4041.ulsfo.wmnet [production]
12:18 <fabfur@cumin1001> START - Cookbook sre.hosts.reboot-single for host cp4049.ulsfo.wmnet [production]
12:16 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1167 (T336886)', diff saved to https://phabricator.wikimedia.org/P49418 and previous config saved to /var/cache/conftool/dbconfig/20230613-121611-ladsgroup.json [production]
12:15 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance [production]
12:15 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance [production]
12:15 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1212.eqiad.wmnet with reason: Maintenance [production]
12:15 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on db1212.eqiad.wmnet with reason: Maintenance [production]
12:09 <hashar> Restarted Zuul CI due to T309376 [production]
12:06 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1198.eqiad.wmnet with reason: Maintenance [production]
12:05 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on db1198.eqiad.wmnet with reason: Maintenance [production]
11:56 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1189.eqiad.wmnet with reason: Maintenance [production]
11:56 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on db1189.eqiad.wmnet with reason: Maintenance [production]
11:46 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1175.eqiad.wmnet with reason: Maintenance [production]
11:46 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on db1175.eqiad.wmnet with reason: Maintenance [production]
11:45 <Amir1> cat wikis_having_stubs | xargs -I {} bash -c 'echo {}; touch /home/ladsgroup/{}.undo.sql; chmod 777 /home/ladsgroup/{}.undo.sql; mwscript maintenance/storage/moveToExternal.php --wiki={} --end 200000000 --undo /home/ladsgroup/{}.undo.sql DB cluster26' (T299387) [production]
11:43 <fabfur@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4048.ulsfo.wmnet [production]
11:42 <fabfur@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4040.ulsfo.wmnet [production]
11:41 <hnowlan@cumin1001> END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on P{lvs1019*,lvs2013*} and A:lvs (T329049) [production]
11:40 <hnowlan@cumin1001> START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on P{lvs1019*,lvs2013*} and A:lvs (T329049) [production]
11:37 <hnowlan@cumin1001> END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on P{lvs1020*,lvs2014*} and A:lvs (T329049) [production]
11:37 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1166.eqiad.wmnet with reason: Maintenance [production]
11:37 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on db1166.eqiad.wmnet with reason: Maintenance [production]
11:36 <ladsgroup@deploy1002> Finished scap: Backport for [[gerrit:929648|moveToExternal: Also check for utf8 encoding before trying to convert]] (duration: 09m 59s) [production]
11:35 <hnowlan@cumin1001> START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on P{lvs1020*,lvs2014*} and A:lvs (T329049) [production]