2024-07-18
18:30 <aokoth@cumin1002> END (FAIL) - Cookbook sre.vrts.upgrade (exit_code=99) on VRTS host vrts1001.eqiad.wmnet [production]
18:27 <aokoth@cumin1002> START - Cookbook sre.vrts.upgrade on VRTS host vrts1001.eqiad.wmnet [production]
18:17 <swfrench-wmf> api-ro.discovery.wmnet now resolves to failoid - T367949 [production]
18:03 <swfrench-wmf> appservers-ro.discovery.wmnet now resolves to failoid - T367949 [production]
18:01 <aokoth@cumin1002> END (FAIL) - Cookbook sre.vrts.upgrade (exit_code=99) on VRTS host vrts1001.eqiad.wmnet [production]
18:01 <aokoth@cumin1002> START - Cookbook sre.vrts.upgrade on VRTS host vrts1001.eqiad.wmnet [production]
17:45 <marostegui@cumin1002> dbctl commit (dc=all): 'Depool db2136', diff saved to https://phabricator.wikimedia.org/P66829 and previous config saved to /var/cache/conftool/dbconfig/20240718-174547-root.json [production]
17:43 <topranks> disabling cr2-codfw port et-1/1/0 connecting to asw-c-codfw T366941 [production]
17:38 <cgoubert@cumin1002> START - Cookbook sre.hosts.provision for host mw2438.mgmt.codfw.wmnet with reboot policy GRACEFUL [production]
17:29 <cgoubert@cumin1002> END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts mw2438.codfw.wmnet [production]
17:29 <cgoubert@cumin1002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mw2438.codfw.wmnet [production]
17:29 <cgoubert@cumin1002> START - Cookbook sre.hosts.convert-disks for host mw2438 [production]
17:28 <cgoubert@cumin1002> END (ERROR) - Cookbook sre.hosts.convert-disks (exit_code=97) for host mw2438 [production]
17:24 <topranks> making cr1-codfw interfaces connecting ssw1-d1-codfw VRRP master for row c & d vlans T366941 [production]
17:20 <cgoubert@cumin1002> END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts mw2438.codfw.wmnet [production]
17:20 <cgoubert@cumin1002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mw2438.codfw.wmnet [production]
17:20 <cgoubert@cumin1002> START - Cookbook sre.hosts.convert-disks for host mw2438 [production]
17:15 <cgoubert@cumin1002> END (FAIL) - Cookbook sre.hosts.convert-disks (exit_code=99) for host mw2438 [production]
17:15 <cgoubert@cumin1002> END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts mw2438.codfw.wmnet [production]
17:15 <cgoubert@cumin1002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mw2438.codfw.wmnet [production]
17:15 <cgoubert@cumin1002> START - Cookbook sre.hosts.convert-disks for host mw2438 [production]
17:10 <cgoubert@cumin1002> END (FAIL) - Cookbook sre.hosts.convert-disks (exit_code=99) for host mw2438 [production]
17:10 <cgoubert@cumin1002> END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts mw2438.codfw.wmnet [production]
17:10 <cgoubert@cumin1002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mw2438.codfw.wmnet [production]
17:09 <cgoubert@cumin1002> START - Cookbook sre.hosts.convert-disks for host mw2438 [production]
16:52 <cgoubert@cumin1002> END (FAIL) - Cookbook sre.hosts.convert-disks (exit_code=99) for host mw2438 [production]
16:39 <topranks> resetting line card 1/1 on cr1-codfw (T366941) [production]
16:37 <cgoubert@cumin1002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mw2438.codfw.wmnet [production]
16:35 <cgoubert@cumin1002> START - Cookbook sre.hosts.reboot-single for host mw2438.codfw.wmnet [production]
16:35 <cgoubert@cumin1002> END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts mw2438.codfw.wmnet [production]
16:34 <cmooney@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on ssw1-a1-codfw.mgmt with reason: bouncing line card on cr1-codfw [production]
16:34 <cmooney@cumin1002> START - Cookbook sre.hosts.downtime for 0:20:00 on ssw1-a1-codfw.mgmt with reason: bouncing line card on cr1-codfw [production]
16:32 <papaul> re-enable option 82 on lsw1-b7-codfw [production]
16:26 <cgoubert@cumin1002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mw2438.codfw.wmnet [production]
16:25 <cgoubert@cumin1002> START - Cookbook sre.hosts.convert-disks for host mw2438 [production]
16:24 <papaul> disable option 82 on lsw1-b7-codfw to test pxe boot issue [production]
16:23 <cgoubert@cumin1002> END (FAIL) - Cookbook sre.hosts.convert-disks (exit_code=99) for host mw2433 [production]
16:21 <cmooney@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on cloudsw1-b1-codfw.mgmt,pfw3-codfw with reason: bouncing line card on cr1-codfw [production]
16:21 <cmooney@cumin1002> START - Cookbook sre.hosts.downtime for 0:20:00 on cloudsw1-b1-codfw.mgmt,pfw3-codfw with reason: bouncing line card on cr1-codfw [production]
16:13 <cgoubert@cumin1002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mw2433.codfw.wmnet [production]
16:10 <cgoubert@cumin1002> START - Cookbook sre.hosts.reboot-single for host mw2433.codfw.wmnet [production]
16:10 <cgoubert@cumin1002> END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts mw2433.codfw.wmnet [production]
16:10 <cgoubert@cumin1002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mw2433.codfw.wmnet [production]
16:10 <cgoubert@cumin1002> START - Cookbook sre.hosts.convert-disks for host mw2433 [production]
16:07 <cmooney@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on cloudsw1-b1-codfw.mgmt,pfw3-codfw with reason: bouncing line card on cr1-codfw [production]
16:07 <cmooney@cumin1002> START - Cookbook sre.hosts.downtime for 0:20:00 on cloudsw1-b1-codfw.mgmt,pfw3-codfw with reason: bouncing line card on cr1-codfw [production]
15:52 <brouberol@deploy1002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s_services/services/datahub: sync on production [production]
15:48 <brouberol@deploy1002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s_services/services/datahub: apply on production [production]
15:37 <arnaudb@cumin1002> dbctl commit (dc=all): 'db1203 (re)pooling @ 100%: maintenance rescheduled', diff saved to https://phabricator.wikimedia.org/P66827 and previous config saved to /var/cache/conftool/dbconfig/20240718-153748-arnaudb.json [production]
15:37 <arnaudb@cumin1002> dbctl commit (dc=all): 'db1202 (re)pooling @ 100%: maintenance rescheduled', diff saved to https://phabricator.wikimedia.org/P66826 and previous config saved to /var/cache/conftool/dbconfig/20240718-153731-arnaudb.json [production]