2023-04-04
08:01 <vgutierrez@cumin1001> START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_esams [production]
08:00 <vgutierrez@cumin1001> END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_drmrs [production]
08:00 <vgutierrez@cumin1001> END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_drmrs [production]
07:50 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance [production]
07:50 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 12:00:00 on db1162.eqiad.wmnet with reason: Maintenance [production]
07:48 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depool db1162 T333918', diff saved to https://phabricator.wikimedia.org/P46015 and previous config saved to /var/cache/conftool/dbconfig/20230404-074848-ladsgroup.json [production]
07:46 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Promote db1122 to s2 primary T333918', diff saved to https://phabricator.wikimedia.org/P46014 and previous config saved to /var/cache/conftool/dbconfig/20230404-074656-ladsgroup.json [production]
07:46 <Amir1> Starting s2 eqiad failover from db1162 to db1122 - T333918 [production]
07:41 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pybal-test2001.codfw.wmnet [production]
07:36 <vgutierrez@cumin1001> START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_drmrs [production]
07:36 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host pybal-test2001.codfw.wmnet [production]
07:35 <vgutierrez@cumin1001> START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_drmrs [production]
07:35 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pybal-test2003.codfw.wmnet [production]
07:31 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host pybal-test2003.codfw.wmnet [production]
07:31 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pybal-test2002.codfw.wmnet [production]
07:28 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host pybal-test2002.codfw.wmnet [production]
07:28 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Set db1122 with weight 0 T333918', diff saved to https://phabricator.wikimedia.org/P46013 and previous config saved to /var/cache/conftool/dbconfig/20230404-072817-ladsgroup.json [production]
07:27 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s2 T333918 [production]
07:27 <hashar@deploy2002> Finished deploy [gerrit/gerrit@453b038]: Gerrit plugin update and switching from git-fat to git-lfs (duration: 00m 08s) [production]
07:27 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on 26 hosts with reason: Primary switchover s2 T333918 [production]
07:27 <hashar@deploy2002> Started deploy [gerrit/gerrit@453b038]: Gerrit plugin update and switching from git-fat to git-lfs [production]
07:23 <hashar@deploy2002> Finished deploy [gerrit/gerrit@453b038]: Gerrit plugin update and switching from git-fat to git-lfs (duration: 00m 05s) [production]
07:23 <hashar@deploy2002> Started deploy [gerrit/gerrit@453b038]: Gerrit plugin update and switching from git-fat to git-lfs [production]
06:09 <XioNoX> stage new Junos on asw2-c-eqiad - T331882 [production]
2023-04-03
21:52 <ryankemper> T331896 `sudo -E cumin -b 4 'wdqs*' 'sudo run-puppet-agent'` [production]
21:42 <maryum> undeployed mitigation for T333140 [production]
21:25 <inflatador> bking@cumin ban cloudelastic1003 from all cloudelastic clusters T331882 [production]
21:22 <maryum> deployed mitigation for T333140 [production]
21:17 <ryankemper@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 10 hosts with reason: T331882 eqiad row C maint [production]
21:16 <ryankemper@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 10 hosts with reason: T331882 eqiad row C maint [production]
21:12 <ryankemper@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wcqs1003.eqiad.wmnet,wdqs[1010,1013-1014].eqiad.wmnet with reason: T331882 eqiad row C maint [production]
21:12 <ryankemper@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wcqs1003.eqiad.wmnet,wdqs[1010,1013-1014].eqiad.wmnet with reason: T331882 eqiad row C maint [production]
20:37 <kindrobot> close UTC late backport window [production]
20:36 <kindrobot@deploy2002> Finished scap: Backport for [[gerrit:905287|make "advanced mode" default on ptwikinews mobile (T290812)]] (duration: 10m 47s) [production]
20:31 <brett@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host lvs5006.eqsin.wmnet with OS bullseye [production]
20:26 <kindrobot@deploy2002> jdlrobson and kindrobot: Backport for [[gerrit:905287|make "advanced mode" default on ptwikinews mobile (T290812)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet [production]
20:25 <kindrobot@deploy2002> Started scap: Backport for [[gerrit:905287|make "advanced mode" default on ptwikinews mobile (T290812)]] [production]
20:19 <kindrobot@deploy2002> Finished scap: Backport for [[gerrit:905264|[refactor] split out Minerva configuration from main config]], [[gerrit:904284|Disable Vector js/css sharing on pl.wikipedia (T332809)]] (duration: 12m 05s) [production]
20:10 <brett@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs5006.eqsin.wmnet with reason: host reimage [production]
20:08 <kindrobot@deploy2002> kindrobot and jdlrobson: Backport for [[gerrit:905264|[refactor] split out Minerva configuration from main config]], [[gerrit:904284|Disable Vector js/css sharing on pl.wikipedia (T332809)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet [production]
20:07 <brett@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on lvs5006.eqsin.wmnet with reason: host reimage [production]
20:07 <kindrobot@deploy2002> Started scap: Backport for [[gerrit:905264|[refactor] split out Minerva configuration from main config]], [[gerrit:904284|Disable Vector js/css sharing on pl.wikipedia (T332809)]] [production]
20:03 <kindrobot> start UTC late backport window [production]
19:41 <brett@cumin2002> START - Cookbook sre.hosts.reimage for host lvs5006.eqsin.wmnet with OS bullseye [production]
19:38 <brett@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host lvs5006.eqsin.wmnet with OS bullseye [production]
19:36 <otto@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. [production]
19:35 <otto@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. [production]
19:09 <cwhite> manually upgrade vopsbot on alert2001 to version 0.3.3 [production]
18:59 <brett@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs5006.eqsin.wmnet with reason: host reimage [production]
18:55 <brett@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on lvs5006.eqsin.wmnet with reason: host reimage [production]