2021-01-22
19:15 <dzahn@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2350.codfw.wmnet with reason: REIMAGE [production]
19:13 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2352.codfw.wmnet with reason: REIMAGE [production]
19:11 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2354.codfw.wmnet with reason: REIMAGE [production]
19:10 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2350.codfw.wmnet with reason: REIMAGE [production]
19:09 <mutante> releases1002 systemctl reset-failed [production]
19:09 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2356.codfw.wmnet with reason: REIMAGE [production]
19:09 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2352.codfw.wmnet with reason: REIMAGE [production]
19:08 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2354.codfw.wmnet with reason: REIMAGE [production]
19:07 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2356.codfw.wmnet with reason: REIMAGE [production]
18:47 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw2364.codfw.wmnet [production]
18:47 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw2362.codfw.wmnet [production]
18:47 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw2360.codfw.wmnet [production]
18:46 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw2358.codfw.wmnet [production]
18:46 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw2362.codfw.wmnet [production]
18:46 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw2364.codfw.wmnet [production]
18:45 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw2360.codfw.wmnet [production]
18:45 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw2358.codfw.wmnet [production]
18:17 <mutante> releases2002 - rebooting to confirm works now and also new disk gets auto-mounted [production]
18:03 <mutante> releases1002 - replaced ens5 with ens6 in /etc/network/interfaces and rebooted again [production]
18:01 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on releases1002.eqiad.wmnet with reason: fixing networking - added disk [production]
18:01 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on releases1002.eqiad.wmnet with reason: fixing networking - added disk [production]
17:59 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on mw2360.codfw.wmnet with reason: new install on buster [production]
17:59 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 4:00:00 on mw2360.codfw.wmnet with reason: new install on buster [production]
17:57 <mutante> releases1002 (releases.wm.org active backend) - rebooting - hopefully it does not run into T272555 but if it does now it's known how to fix [production]
17:55 <dzahn@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2364.codfw.wmnet with reason: REIMAGE [production]
17:54 <dzahn@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2360.codfw.wmnet with reason: REIMAGE [production]
17:53 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2358.codfw.wmnet with reason: REIMAGE [production]
17:52 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2362.codfw.wmnet with reason: REIMAGE [production]
17:52 <mutante> releases2001 - create new partition table with fdisk, make ext4 filesystem on /dev/vdb1 [production]
17:50 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2364.codfw.wmnet with reason: REIMAGE [production]
17:50 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2362.codfw.wmnet with reason: REIMAGE [production]
17:49 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2360.codfw.wmnet with reason: REIMAGE [production]
17:49 <ppchelko@deploy1001> Finished deploy [restbase/deploy@e54225d]: T270411 T270415 T270281 T270277 (duration: 65m 37s) [production]
17:49 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2358.codfw.wmnet with reason: REIMAGE [production]
17:29 <mforns@deploy1001> Finished deploy [analytics/refinery@eea071d] (thin): Extra bug-fix train THIN [analytics/refinery@eea071def90a8a856b1e04dda23b77a850134253] (duration: 00m 07s) [production]
17:29 <mforns@deploy1001> Started deploy [analytics/refinery@eea071d] (thin): Extra bug-fix train THIN [analytics/refinery@eea071def90a8a856b1e04dda23b77a850134253] [production]
17:23 <mforns@deploy1001> Finished deploy [analytics/refinery@eea071d]: Extra bug-fix train [analytics/refinery@eea071def90a8a856b1e04dda23b77a850134253] (duration: 10m 03s) [production]
17:13 <mforns@deploy1001> Started deploy [analytics/refinery@eea071d]: Extra bug-fix train [analytics/refinery@eea071def90a8a856b1e04dda23b77a850134253] [production]
16:44 <ppchelko@deploy1001> Started deploy [restbase/deploy@e54225d]: T270411 T270415 T270281 T270277 [production]
16:40 <cmjohnson1> replacing optics/fiber pfw3a-eqiad:xe-0/0/17 and fasw-c1a-eqiad:xe-0/2/0 T271295 [production]
16:19 <jynus> restart of backup source hosts on codfw T271913 [production]
15:54 <otto@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'eventstreams-internal' for release 'main' . [production]
15:40 <moritzm> installing puppetboard1002 [production]
15:24 <moritzm> installing puppetboard2002 [production]
13:44 <kormat@cumin1001> dbctl commit (dc=all): 'db1149 (re)pooling @ 100%: Reboot T272255', diff saved to https://phabricator.wikimedia.org/P13932 and previous config saved to /var/cache/conftool/dbconfig/20210122-134444-kormat.json [production]
13:33 <marostegui@cumin1001> dbctl commit (dc=all): 'Repool db1121', diff saved to https://phabricator.wikimedia.org/P13931 and previous config saved to /var/cache/conftool/dbconfig/20210122-133341-marostegui.json [production]
13:31 <marostegui> Stop replication on db1121 [production]
13:30 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1121', diff saved to https://phabricator.wikimedia.org/P13930 and previous config saved to /var/cache/conftool/dbconfig/20210122-133044-marostegui.json [production]
13:29 <kormat@cumin1001> dbctl commit (dc=all): 'db1149 (re)pooling @ 75%: Reboot T272255', diff saved to https://phabricator.wikimedia.org/P13929 and previous config saved to /var/cache/conftool/dbconfig/20210122-132939-kormat.json [production]
13:21 <jmm@cumin2001> END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host puppetboard2002.codfw.wmnet [production]