2021-04-24
16:19 <arturo> deleting 2 leaked VMs by hand: 6aefef6f-0723-499d-895f-314f4804c377 | fullstackd-20210424153344 and af8bc9bd-ea0a-4789-b8dd-cf5cf96c31cc | fullstackd-20210424074938 (puppet check step timed out) [admin-monitoring]
08:03 <joal> Rerun failed webrequest-druid-hourly-wf-2021-4-23-13 [analytics]
2021-04-23
22:14 <Krinkle> Reloading Zuul to deploy https://gerrit.wikimedia.org/r/682029 [releng]
21:36 <foks> removing 1 file for legal compliance [production]
21:02 <wm-bot> <root> Hard restart in an attempt to reset state information at the Toolforge front proxy [tools.simple]
20:59 <wm-bot> <root> Restarting webservice which seems to have died due to grid engine instability [tools.simple]
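The two tools.simple entries above record a webservice bounce behind the Toolforge front proxy. A minimal sketch of how such a restart is typically done from the tool account (the tool name is taken from the [tools.simple] tag; exact steps for this tool are an assumption):

    become simple        # switch to the tool account (name assumed from the log tag)
    webservice stop      # tear down the wedged webservice
    webservice start     # bring it back up; `webservice restart` combines both steps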
20:15 <mutante> [apt1001:~] $ sudo -i reprepro -C main includedeb bullseye-wikimedia /home/dzahn/rawdog_2.23-2_all.deb (T280989) [production]
19:41 <mutante> [apt1001:~] $ sudo -i reprepro copy bullseye-wikimedia buster-wikimedia envoyproxy - copy envoy package from buster to bullseye T280989 [production]
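The two apt1001 entries above show the reprepro workflow for getting a package into the bullseye-wikimedia distribution: either copy it from another distribution, or include a locally built .deb. The commands as logged (run as root on the apt host):

    # copy the envoyproxy package from buster-wikimedia to bullseye-wikimedia (T280989)
    sudo -i reprepro copy bullseye-wikimedia buster-wikimedia envoyproxy
    # include a locally built .deb into the main component of bullseye-wikimedia (T280989)
    sudo -i reprepro -C main includedeb bullseye-wikimedia /home/dzahn/rawdog_2.23-2_all.deb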
19:09 <ebernhardson> closing duplicate/wrong cluster indices in cloudelastic [production]
18:51 <Framawiki> ran apt updates without issues on all 4 servers. T266386 looks fixed. [quarry]
17:24 <bstorm> rebooting toolsbeta-test-k8s-control-6 because it was "notready" for some reason [toolsbeta]
17:02 <elukey@puppetmaster1001> conftool action : set/pooled=yes; selector: name=cp1087.eqiad.wmnet [production]
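The conftool entry above records cp1087 being pooled again. A minimal sketch of an equivalent confctl invocation (selector and value are from the log; whether a wrapper script was used instead is unknown):

    # pool cp1087 (set pooled=yes on the matching conftool object)
    sudo confctl select 'name=cp1087.eqiad.wmnet' set/pooled=yes
    # confirm the resulting state
    sudo confctl select 'name=cp1087.eqiad.wmnet' get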
16:35 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
16:32 <cmjohnson@cumin1001> START - Cookbook sre.dns.netbox [production]
16:30 <Majavah> remove deployment-prep hiera settings for phabricator, given there is no phabricator instance on that project [releng]
16:24 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
16:19 <cmjohnson@cumin1001> START - Cookbook sre.dns.netbox [production]
14:59 <jbond@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on theemin.codfw.wmnet with reason: REIMAGE [production]
14:59 <jbond@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on theemin.codfw.wmnet with reason: REIMAGE [production]
14:25 <moritzm> revert bullseye image back to the daily build from last week (to rule out a potential reimage issue) [production]
14:23 <elukey> roll restart an-master100[1,2] daemons to pick up new log4j settings - T276906 [analytics]
13:49 <dcaro> testing the drain_cloudvirt cookbook on codfw1 openstack cluster, draining cloudvirt2001 (T280641) [admin]
13:33 <elukey> roll restart of all thanos-swift proxies to pick up new ML account - T280773 [production]
12:50 <jbond42> upload new debmonitor-client packages [production]
11:50 <moritzm> installing perf updates from Buster 10.9 point release [production]
11:12 <dcaro> testing the drain_cloudvirt cookbook on codfw1 openstack cluster (T280641) [admin]
10:30 <elukey> restart hadoop daemons (NM, DN, JN) on an-worker1080 to further test the new log4j config - T276906 [analytics]
10:06 <moritzm> installing Linux 4.19.181 updates from Buster 10.9 point release (no reboots, just updating the packages) [production]
09:54 <moritzm> installing xen security updates [production]
09:49 <moritzm> installing xorg-server security updates [production]
09:37 <marostegui@cumin1001> dbctl commit (dc=all): 'db1079 (re)pooling @ 100%: Repool db1079', diff saved to https://phabricator.wikimedia.org/P15512 and previous config saved to /var/cache/conftool/dbconfig/20210423-093723-root.json [production]
09:32 <dcaro> finished upgrade of ceph cluster on codfw1 using exclusively cookbooks (T280641) [admin]
09:22 <marostegui@cumin1001> dbctl commit (dc=all): 'db1079 (re)pooling @ 75%: Repool db1079', diff saved to https://phabricator.wikimedia.org/P15511 and previous config saved to /var/cache/conftool/dbconfig/20210423-092220-root.json [production]
09:17 <dcaro> testing the upgrade_osds cookbook on codfw1 ceph cluster (T280641) [admin]
09:12 <elukey> change default log4j hadoop config to include rolling gzip appender [analytics]
09:12 <Majavah> signing puppet certs for deployment-eventlog08 and running puppet for the first time to stop annoying email alerts [releng]
09:07 <marostegui@cumin1001> dbctl commit (dc=all): 'db1079 (re)pooling @ 50%: Repool db1079', diff saved to https://phabricator.wikimedia.org/P15510 and previous config saved to /var/cache/conftool/dbconfig/20210423-090716-root.json [production]
08:52 <marostegui@cumin1001> dbctl commit (dc=all): 'db1079 (re)pooling @ 25%: Repool db1079', diff saved to https://phabricator.wikimedia.org/P15509 and previous config saved to /var/cache/conftool/dbconfig/20210423-085212-root.json [production]
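The four db1079 entries above (09:37, 09:22, 09:07, 08:52) show the staged repool pattern: traffic is returned in 25% increments, each step committed with dbctl and the resulting config diff saved to a paste. A minimal sketch of a single step, assuming the usual dbctl flags (the percentage option and commit message here are illustrative):

    # repool db1079 at 50% of its configured weight, then commit the change
    sudo dbctl instance db1079 pool -p 50
    sudo dbctl config commit -m 'db1079 (re)pooling @ 50%: Repool db1079'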
08:27 <filippo@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1020.eqiad.wmnet [production]
08:21 <filippo@cumin1001> START - Cookbook sre.hosts.reboot-single for host ms-be1020.eqiad.wmnet [production]
08:19 <filippo@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1021.eqiad.wmnet [production]
08:17 <dcaro> testing the upgrade_mons cookbook on codfw1 ceph cluster (T280641) [admin]
08:12 <moritzm> upgrading d-i image for bullseye to RC1 release T275873 [production]
08:12 <filippo@cumin1001> START - Cookbook sre.hosts.reboot-single for host ms-be1021.eqiad.wmnet [production]
08:12 <filippo@cumin1001> END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ms-be1019.eqiad.wmnet [production]
07:59 <filippo@cumin1001> START - Cookbook sre.hosts.reboot-single for host ms-be1019.eqiad.wmnet [production]
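The paired START / END (PASS|ERROR) lines above are emitted automatically when a spicerack cookbook is run from a cumin host; the exit code in END reflects the cookbook's return value (0 = PASS). A minimal sketch of the invocation behind the ms-be1019 run, assuming the standard cookbook entry point (exact arguments may differ):

    # run the single-host reboot cookbook from a cumin host
    sudo cookbook sre.hosts.reboot-single ms-be1019.eqiad.wmnet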
07:56 <jynus> deleting db1156 s2 database and reloading it from logical backups T280492 [production]
07:22 <Amir1> removing junk bounced yahoo email addresses from all mailing lists [production]
05:40 <marostegui> Stop db1079 to clone db1158 (lag will appear on s7 on wiki replicas) [production]