2024-07-18 §
19:42 <cmooney@cumin1002> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add entries for new IRB interfaces codfw - cmooney@cumin1002" [production]
19:39 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2155 (T367856)', diff saved to https://phabricator.wikimedia.org/P66831 and previous config saved to /var/cache/conftool/dbconfig/20240718-193927-marostegui.json [production]
19:39 <wmbot~dcaro@urcuchillay> START - Cookbook wmcs.ceph.osd.bootstrap_and_add [admin]
19:38 <wmbot~dcaro@urcuchillay> END (PASS) - Cookbook wmcs.ceph.osd.depool_and_destroy (exit_code=0) [admin]
19:38 <cmooney@cumin1002> START - Cookbook sre.dns.netbox [production]
19:37 <topranks> add IRB int on public1-c-codfw vlan to ssw1-d1-codfw and ssw1-d8-codfw T369274 [production]
19:37 <denisse> Send SIGQUIT signal to the benthos service after a goroutine was waiting forever in webrequest_live.yaml - T369256 [production]
19:34 <topranks> disable BGP between spine switches in rows A and row D prior to enabling IP GW (T369274) [production]
19:32 <cmooney@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on ssw1-a[1,8]-codfw.mgmt,ssw1-d[1,8]-codfw.mgmt with reason: Migrate codfw row c and d IP GWs from CRs to Spines [production]
19:31 <cmooney@cumin1002> START - Cookbook sre.hosts.downtime for 3:00:00 on ssw1-a[1,8]-codfw.mgmt,ssw1-d[1,8]-codfw.mgmt with reason: Migrate codfw row c and d IP GWs from CRs to Spines [production]
19:12 <topranks> enabling BGP session from cr1-codfw to ssw1-d1-codfw [production]
19:08 <wmbot~dcaro@urcuchillay> START - Cookbook wmcs.ceph.osd.depool_and_destroy [admin]
19:08 <wmbot~dcaro@urcuchillay> END (PASS) - Cookbook wmcs.ceph.osd.bootstrap_and_add (exit_code=0) [admin]
19:07 <dancy@deploy1002> Installing scap version "4.93.0" for 232 hosts [production]
18:57 <andrewbogott> shutting down main.commons-corruption-checker.eqiad1.wikimedia.cloud due to no response on T367525 or direct emails [commons-corruption-checker]
18:30 <aokoth@cumin1002> END (FAIL) - Cookbook sre.vrts.upgrade (exit_code=99) on VRTS host vrts1001.eqiad.wmnet [production]
18:27 <aokoth@cumin1002> START - Cookbook sre.vrts.upgrade on VRTS host vrts1001.eqiad.wmnet [production]
18:17 <swfrench-wmf> api-ro.discovery.wmnet now resolves to failoid - T367949 [production]
18:03 <swfrench-wmf> appservers-ro.discovery.wmnet now resolves to failoid - T367949 [production]
18:01 <aokoth@cumin1002> END (FAIL) - Cookbook sre.vrts.upgrade (exit_code=99) on VRTS host vrts1001.eqiad.wmnet [production]
18:01 <aokoth@cumin1002> START - Cookbook sre.vrts.upgrade on VRTS host vrts1001.eqiad.wmnet [production]
17:45 <marostegui@cumin1002> dbctl commit (dc=all): 'Depool db2136', diff saved to https://phabricator.wikimedia.org/P66829 and previous config saved to /var/cache/conftool/dbconfig/20240718-174547-root.json [production]
17:43 <topranks> disabling cr2-codfw port et-1/1/0 connecting to asw-c-codfw T366941 [production]
17:38 <cgoubert@cumin1002> START - Cookbook sre.hosts.provision for host mw2438.mgmt.codfw.wmnet with reboot policy GRACEFUL [production]
17:29 <cgoubert@cumin1002> END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts mw2438.codfw.wmnet [production]
17:29 <cgoubert@cumin1002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mw2438.codfw.wmnet [production]
17:29 <cgoubert@cumin1002> START - Cookbook sre.hosts.convert-disks for host mw2438 [production]
17:28 <cgoubert@cumin1002> END (ERROR) - Cookbook sre.hosts.convert-disks (exit_code=97) for host mw2438 [production]
17:24 <topranks> making cr1-codfw interfaces connecting ssw1-d1-codfw VRRP master for row c & d vlans T366941 [production]
17:21 <MacFan4000> deleted unused buster instances [wm-bot]
17:20 <cgoubert@cumin1002> END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts mw2438.codfw.wmnet [production]
17:20 <cgoubert@cumin1002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mw2438.codfw.wmnet [production]
17:20 <cgoubert@cumin1002> START - Cookbook sre.hosts.convert-disks for host mw2438 [production]
17:15 <cgoubert@cumin1002> END (FAIL) - Cookbook sre.hosts.convert-disks (exit_code=99) for host mw2438 [production]
17:15 <cgoubert@cumin1002> END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts mw2438.codfw.wmnet [production]
17:15 <cgoubert@cumin1002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mw2438.codfw.wmnet [production]
17:15 <cgoubert@cumin1002> START - Cookbook sre.hosts.convert-disks for host mw2438 [production]
17:10 <cgoubert@cumin1002> END (FAIL) - Cookbook sre.hosts.convert-disks (exit_code=99) for host mw2438 [production]
17:10 <cgoubert@cumin1002> END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts mw2438.codfw.wmnet [production]
17:10 <cgoubert@cumin1002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts mw2438.codfw.wmnet [production]
17:09 <cgoubert@cumin1002> START - Cookbook sre.hosts.convert-disks for host mw2438 [production]
16:52 <cgoubert@cumin1002> END (FAIL) - Cookbook sre.hosts.convert-disks (exit_code=99) for host mw2438 [production]
16:40 <wmbot~dcaro@urcuchillay> START - Cookbook wmcs.ceph.osd.bootstrap_and_add [admin]
16:40 <wmbot~dcaro@urcuchillay> END (PASS) - Cookbook wmcs.ceph.osd.depool_and_destroy (exit_code=0) [admin]
16:39 <topranks> resetting line card 1/1 on cr1-codfw (T366941) [production]
16:37 <cgoubert@cumin1002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mw2438.codfw.wmnet [production]
16:35 <cgoubert@cumin1002> START - Cookbook sre.hosts.reboot-single for host mw2438.codfw.wmnet [production]
16:35 <cgoubert@cumin1002> END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts mw2438.codfw.wmnet [production]
16:34 <cmooney@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on ssw1-a1-codfw.mgmt with reason: bouncing line card on cr1-codfw [production]
16:34 <cmooney@cumin1002> START - Cookbook sre.hosts.downtime for 0:20:00 on ssw1-a1-codfw.mgmt with reason: bouncing line card on cr1-codfw [production]