2021-01-22
22:41 <reedy@deploy1001> Synchronized invalid.json: (no justification provided) (duration: 00m 58s) [production]
20:07 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1268.eqiad.wmnet with reason: REIMAGE [production]
20:05 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw1268.eqiad.wmnet with reason: REIMAGE [production]
20:05 <dzahn@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2330.codfw.wmnet with reason: REIMAGE [production]
20:05 <dzahn@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2332.codfw.wmnet with reason: REIMAGE [production]
20:03 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2328.codfw.wmnet with reason: REIMAGE [production]
20:01 <dzahn@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw1413.eqiad.wmnet with reason: REIMAGE [production]
20:01 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2334.codfw.wmnet with reason: REIMAGE [production]
20:00 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw1413.eqiad.wmnet with reason: REIMAGE [production]
20:00 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2328.codfw.wmnet with reason: REIMAGE [production]
20:00 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2330.codfw.wmnet with reason: REIMAGE [production]
20:00 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2332.codfw.wmnet with reason: REIMAGE [production]
19:59 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2334.codfw.wmnet with reason: REIMAGE [production]
19:39 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw2356.codfw.wmnet [production]
19:38 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw2354.codfw.wmnet [production]
19:38 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw2352.codfw.wmnet [production]
19:36 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw2350.codfw.wmnet [production]
19:35 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw2352.codfw.wmnet [production]
19:35 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw2350.codfw.wmnet [production]
19:35 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw2354.codfw.wmnet [production]
19:34 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw2356.codfw.wmnet [production]
19:15 <dzahn@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2350.codfw.wmnet with reason: REIMAGE [production]
19:13 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2352.codfw.wmnet with reason: REIMAGE [production]
19:11 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2354.codfw.wmnet with reason: REIMAGE [production]
19:10 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2350.codfw.wmnet with reason: REIMAGE [production]
19:09 <mutante> releases1002 systemctl reset-failed [production]
19:09 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2356.codfw.wmnet with reason: REIMAGE [production]
19:09 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2352.codfw.wmnet with reason: REIMAGE [production]
19:08 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2354.codfw.wmnet with reason: REIMAGE [production]
19:07 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2356.codfw.wmnet with reason: REIMAGE [production]
18:47 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw2364.codfw.wmnet [production]
18:47 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw2362.codfw.wmnet [production]
18:47 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw2360.codfw.wmnet [production]
18:46 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw2358.codfw.wmnet [production]
18:46 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw2362.codfw.wmnet [production]
18:46 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw2364.codfw.wmnet [production]
18:45 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw2360.codfw.wmnet [production]
18:45 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw2358.codfw.wmnet [production]
18:17 <mutante> releases2002 - rebooting to confirm works now and also new disk gets auto-mounted [production]
18:03 <mutante> releases1002 - replaced ens5 with ens6 in /etc/network/interfaces and rebooted again [production]
18:01 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on releases1002.eqiad.wmnet with reason: fixing networking - added disk [production]
18:01 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on releases1002.eqiad.wmnet with reason: fixing networking - added disk [production]
17:59 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on mw2360.codfw.wmnet with reason: new install on buster [production]
17:59 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 4:00:00 on mw2360.codfw.wmnet with reason: new install on buster [production]
17:57 <mutante> releases1002 (releases.wm.org active backend) - rebooting - hopefully it does not run into T272555 but if it does now it's known how to fix [production]
17:55 <dzahn@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2364.codfw.wmnet with reason: REIMAGE [production]
17:54 <dzahn@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2360.codfw.wmnet with reason: REIMAGE [production]
17:53 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2358.codfw.wmnet with reason: REIMAGE [production]
17:52 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2362.codfw.wmnet with reason: REIMAGE [production]
17:52 <mutante> releases2001 - create new partition table with fdisk, make ext4 filesystem on /dev/vdb1 [production]