2021-04-01
16:59 <Urbanecm> Start server-side upload of two files (T279082, T279081) [production]
16:50 <razzi> rebalance kafka partitions for webrequest_text partitions 7 and 8 [analytics]
16:44 <hnowlan@puppetmaster1001> conftool action : set/pooled=yes; selector: name=aqs1007.eqiad.wmnet [production]
16:39 <urbanecm@deploy1002> Synchronized wmf-config/InitialiseSettings.php: a7acf3357d5d148bad11a2d2718b4da56e1a0cb8: hrwiki: Fix help panel links (T275684) (duration: 01m 10s) [production]
16:25 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2396.codfw.wmnet with reason: REIMAGE [production]
16:23 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2396.codfw.wmnet with reason: REIMAGE [production]
16:16 <Majavah> hard reboot unresponsive deployment-cache-text06 [releng]
16:02 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2395.codfw.wmnet with reason: REIMAGE [production]
16:00 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2395.codfw.wmnet with reason: REIMAGE [production]
15:58 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2394.codfw.wmnet with reason: REIMAGE [production]
15:56 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2394.codfw.wmnet with reason: REIMAGE [production]
15:53 <dcaro> Removed etcd member tools-k8s-etcd-5.tools.eqiad.wmflabs, adding a new member (T267082) [tools]
15:43 <dcaro> Removing etcd member tools-k8s-etcd-5.tools.eqiad.wmflabs (T267082) [tools]
15:39 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2393.codfw.wmnet with reason: REIMAGE [production]
15:37 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2393.codfw.wmnet with reason: REIMAGE [production]
15:36 <dcaro> Added new etcd member tools-k8s-etcd-9.tools.eqiad1.wikimedia.cloud (T267082) [tools]
15:32 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2391.codfw.wmnet with reason: REIMAGE [production]
15:30 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2391.codfw.wmnet with reason: REIMAGE [production]
15:18 <dcaro> adding new etcd member using the cookbook wmcs.toolforge.add_etcd_node (T267082) [tools]
15:17 <dcaro> etcd cluster shrunk to 3 members (using wmcs.toolforge.remove_etcd_node cookbook) [toolsbeta]
15:05 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2392.codfw.wmnet with reason: REIMAGE [production]
15:03 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2392.codfw.wmnet with reason: REIMAGE [production]
14:54 <dcaro> shrinking etcd cluster to 3 members, cleaning up automation runs [toolsbeta]
14:52 <volans> uploaded python3-wmflib_0.0.7 to bullseye-wikimedia [production]
14:41 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2390.codfw.wmnet with reason: REIMAGE [production]
14:39 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2390.codfw.wmnet with reason: REIMAGE [production]
14:22 <effie> disable puppet on mw* canaries, rolling depool and pooling of canaries [production]
14:06 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-test-worker1001.eqiad.wmnet with reason: REIMAGE [production]
14:04 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on an-test-worker1001.eqiad.wmnet with reason: REIMAGE [production]
14:01 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2389.codfw.wmnet with reason: REIMAGE [production]
13:59 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2389.codfw.wmnet with reason: REIMAGE [production]
13:53 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw2388.codfw.wmnet with reason: REIMAGE [production]
13:51 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw2388.codfw.wmnet with reason: REIMAGE [production]
13:24 <ema> cp3054: reboot with Linux 4.19.181+1 -- the kernel was not upgraded earlier during T273278 reboots due to broken dpkg status [production]
13:16 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1022.eqiad.wmnet [production]
13:07 <jmm@cumin2001> START - Cookbook sre.hosts.reboot-single for host ganeti1022.eqiad.wmnet [production]
12:59 <dcaro@cumin1001> END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0) [production]
12:53 <dcaro@cumin1001> START - Cookbook sre.hosts.upgrade-and-reboot [production]
12:52 <Majavah> update floating ip 185.15.56.9 from deployment-parsoid11 to deployment-parsoid12 [releng]
12:51 <dcaro@cumin1001> END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0) [production]
12:47 <moritzm> drain ganeti1022 [production]
12:46 <dcaro@cumin1001> START - Cookbook sre.hosts.upgrade-and-reboot [production]
12:45 <dcaro@cumin1001> END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0) [production]
12:45 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1021.eqiad.wmnet [production]
12:40 <dcaro@cumin1001> START - Cookbook sre.hosts.upgrade-and-reboot [production]
12:38 <dcaro@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcephosd2003-dev.codfw.wmnet [production]
12:38 <jmm@cumin2001> START - Cookbook sre.hosts.reboot-single for host ganeti1021.eqiad.wmnet [production]
12:34 <dcaro@cumin1001> START - Cookbook sre.hosts.reboot-single for host cloudcephosd2003-dev.codfw.wmnet [production]
12:23 <moritzm> drain ganeti1021 [production]
12:21 <dcaro@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcephosd2003-dev.codfw.wmnet [production]