2023-06-13
12:45 <akosiaris@deploy1002> helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply [production]
12:45 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance [production]
12:45 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance [production]
12:45 <akosiaris@deploy1002> helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply [production]
12:44 <akosiaris@deploy1002> helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply [production]
12:44 <akosiaris@deploy1002> helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply [production]
12:44 <akosiaris@deploy1002> helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply [production]
12:35 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1225.eqiad.wmnet with reason: Maintenance [production]
12:35 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on db1225.eqiad.wmnet with reason: Maintenance [production]
12:31 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P49419 and previous config saved to /var/cache/conftool/dbconfig/20230613-123117-ladsgroup.json [production]
12:29 <fabfur@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4049.ulsfo.wmnet [production]
12:28 <fabfur@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4041.ulsfo.wmnet [production]
12:26 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1223.eqiad.wmnet with reason: Maintenance [production]
12:25 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on db1223.eqiad.wmnet with reason: Maintenance [production]
12:18 <fabfur> reboot cp4041 and cp4049 for kernel upgrade (T335835) [production]
12:18 <fabfur@cumin1001> START - Cookbook sre.hosts.reboot-single for host cp4041.ulsfo.wmnet [production]
12:18 <fabfur@cumin1001> START - Cookbook sre.hosts.reboot-single for host cp4049.ulsfo.wmnet [production]
12:16 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1167 (T336886)', diff saved to https://phabricator.wikimedia.org/P49418 and previous config saved to /var/cache/conftool/dbconfig/20230613-121611-ladsgroup.json [production]
12:15 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance [production]
12:15 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance [production]
12:15 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1212.eqiad.wmnet with reason: Maintenance [production]
12:15 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on db1212.eqiad.wmnet with reason: Maintenance [production]
12:09 <hashar> Restarted Zuul CI due to T309376 [production]
12:06 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1198.eqiad.wmnet with reason: Maintenance [production]
12:05 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on db1198.eqiad.wmnet with reason: Maintenance [production]
11:56 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1189.eqiad.wmnet with reason: Maintenance [production]
11:56 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on db1189.eqiad.wmnet with reason: Maintenance [production]
11:46 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1175.eqiad.wmnet with reason: Maintenance [production]
11:46 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on db1175.eqiad.wmnet with reason: Maintenance [production]
11:45 <Amir1> cat wikis_having_stubs | xargs -I {} bash -c 'echo {}; touch /home/ladsgroup/{}.undo.sql; chmod 777 /home/ladsgroup/{}.undo.sql; mwscript maintenance/storage/moveToExternal.php --wiki={} --end 200000000 --undo /home/ladsgroup/{}.undo.sql DB cluster26' (T299387) [production]
11:43 <fabfur@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4048.ulsfo.wmnet [production]
11:42 <fabfur@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4040.ulsfo.wmnet [production]
11:41 <hnowlan@cumin1001> END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on P{lvs1019*,lvs2013*} and A:lvs (T329049) [production]
11:40 <hnowlan@cumin1001> START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on P{lvs1019*,lvs2013*} and A:lvs (T329049) [production]
11:37 <hnowlan@cumin1001> END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on P{lvs1020*,lvs2014*} and A:lvs (T329049) [production]
11:37 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1166.eqiad.wmnet with reason: Maintenance [production]
11:37 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on db1166.eqiad.wmnet with reason: Maintenance [production]
11:36 <ladsgroup@deploy1002> Finished scap: Backport for [[gerrit:929648|moveToExternal: Also check for utf8 encoding before trying to convert]] (duration: 09m 59s) [production]
11:35 <hnowlan@cumin1001> START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on P{lvs1020*,lvs2014*} and A:lvs (T329049) [production]
11:32 <fabfur> reboot cp4040 and cp4048 for kernel upgrade (T335835) [production]
11:32 <fabfur@cumin1001> START - Cookbook sre.hosts.reboot-single for host cp4040.ulsfo.wmnet [production]
11:32 <fabfur@cumin1001> START - Cookbook sre.hosts.reboot-single for host cp4048.ulsfo.wmnet [production]
11:31 <marostegui@cumin1001> dbctl commit (dc=all): 'db2180 (re)pooling @ 100%: Repooling after maintenance', diff saved to https://phabricator.wikimedia.org/P49417 and previous config saved to /var/cache/conftool/dbconfig/20230613-113111-root.json [production]
11:28 <ladsgroup@deploy1002> ladsgroup: Backport for [[gerrit:929648|moveToExternal: Also check for utf8 encoding before trying to convert]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet [production]
11:27 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1150.eqiad.wmnet with reason: Maintenance [production]
11:27 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on db1150.eqiad.wmnet with reason: Maintenance [production]
11:26 <ladsgroup@deploy1002> Started scap: Backport for [[gerrit:929648|moveToExternal: Also check for utf8 encoding before trying to convert]] [production]
11:26 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2118.codfw.wmnet with reason: Maintenance [production]
11:26 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on db2118.codfw.wmnet with reason: Maintenance [production]
11:26 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1181.eqiad.wmnet with reason: Maintenance [production]