2022-03-30
06:35 <elukey@cumin1001> START - Cookbook sre.hosts.reboot-single for host ml-serve1001.eqiad.wmnet [production]
06:34 <elukey@cumin1001> END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host ml-serve1001.eqiad.wmnet [production]
06:34 <elukey@cumin1001> START - Cookbook sre.hosts.reboot-single for host ml-serve1001.eqiad.wmnet [production]
06:34 <elukey@cumin1001> START - Cookbook sre.hosts.reboot-single for host ores2006.codfw.wmnet [production]
06:28 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores2005.codfw.wmnet [production]
06:20 <elukey@cumin1001> START - Cookbook sre.hosts.reboot-single for host ores2005.codfw.wmnet [production]
06:15 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores2004.codfw.wmnet [production]
06:11 <elukey> restart rsyslogd on ml-serve1001 [production]
06:07 <elukey@cumin1001> START - Cookbook sre.hosts.reboot-single for host ores2004.codfw.wmnet [production]
2022-03-29
17:11 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores2003.codfw.wmnet [production]
17:03 <elukey@cumin1001> START - Cookbook sre.hosts.reboot-single for host ores2003.codfw.wmnet [production]
17:03 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores2002.codfw.wmnet [production]
16:55 <elukey@cumin1001> START - Cookbook sre.hosts.reboot-single for host ores2002.codfw.wmnet [production]
16:51 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores2001.codfw.wmnet [production]
16:45 <elukey@cumin1001> START - Cookbook sre.hosts.reboot-single for host ores2001.codfw.wmnet [production]
10:02 <elukey@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ml-cache1002.eqiad.wmnet with OS bullseye [production]
10:02 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host ml-cache1002.eqiad.wmnet with OS bullseye [production]
10:02 <elukey@cumin1001> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ml-cache1002.eqiad.wmnet with OS bullseye [production]
09:35 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host ml-cache1002.eqiad.wmnet with OS bullseye [production]
2022-03-28
13:04 <elukey@deploy1002> helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. [production]
13:03 <elukey@deploy1002> helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. [production]
06:52 <elukey> reboot ml-serve-ctrl1002 - ganeti console available but slow (attempted a root login but never got to input the password) [production]
2022-03-27
14:20 <elukey> roll restart of wdqs-blazegraph-public codfw [production]
14:18 <elukey> restart blazegraph on wdqs2003 [production]
2022-03-24
11:15 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes1017.eqiad.wmnet with OS bullseye [production]
11:04 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1017.eqiad.wmnet with reason: host reimage [production]
11:00 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1017.eqiad.wmnet with reason: host reimage [production]
10:45 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host kubernetes1017.eqiad.wmnet with OS bullseye [production]
10:40 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes1014.eqiad.wmnet with OS bullseye [production]
10:28 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1014.eqiad.wmnet with reason: host reimage [production]
10:25 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1014.eqiad.wmnet with reason: host reimage [production]
10:09 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host kubernetes1014.eqiad.wmnet with OS bullseye [production]
08:43 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes1013.eqiad.wmnet with OS bullseye [production]
08:31 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1013.eqiad.wmnet with reason: host reimage [production]
08:27 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1013.eqiad.wmnet with reason: host reimage [production]
08:12 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host kubernetes1013.eqiad.wmnet with OS bullseye [production]
07:39 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes1012.eqiad.wmnet with OS bullseye [production]
07:27 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1012.eqiad.wmnet with reason: host reimage [production]
07:24 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1012.eqiad.wmnet with reason: host reimage [production]
07:08 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host kubernetes1012.eqiad.wmnet with OS bullseye [production]
2022-03-23
16:40 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes1011.eqiad.wmnet with OS bullseye [production]
16:29 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1011.eqiad.wmnet with reason: host reimage [production]
16:25 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1011.eqiad.wmnet with reason: host reimage [production]
16:10 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host kubernetes1011.eqiad.wmnet with OS bullseye [production]
13:51 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes1010.eqiad.wmnet with OS bullseye [production]
13:39 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1010.eqiad.wmnet with reason: host reimage [production]
13:34 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1010.eqiad.wmnet with reason: host reimage [production]
13:19 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host kubernetes1010.eqiad.wmnet with OS bullseye [production]
08:24 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes1009.eqiad.wmnet with OS bullseye [production]
08:12 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1009.eqiad.wmnet with reason: host reimage [production]