2021-08-14 §
19:21 <wm-bot> <lucaswerkmeister> installed TemplateStyles extension (turns out it doesn’t do what I wanted to but let’s keep it anyways) [tools.notwikilambda]
17:35 <majavah> add k8s job to rebuild stretch report, with same parameters as now-deactivated grid cron job for jessie [tools.os-deprecation]
17:21 <bd808> Added majavah as co-maintainer and granted git repo access [tools.os-deprecation]
15:11 <bd808> Transferred ownership from [[User:Owner of abandoned tools]] to [[User:Ash Crow]] (T288890) [tools.macommune]
15:03 <wm-bot> <bd808> Updated config for channel renaming that has happened after libera.chat migration [tools.stashbot]
14:57 <majavah> restart after irc disconnect [tools.bridgebot]
12:46 <wm-bot> <lucaswerkmeister> deployed 7a1980f4e2 (l10n updates) [tools.lexeme-forms]
03:54 <legoktm[m]> restarting mailman3 on lists1001, bounce runner crashed (T288880) [production]
2021-08-13 §
20:09 <urbanecm> Manually start `beta-update-databases-eqiad` CI job [releng]
20:06 <urbanecm> deployment-prep: sudo -u jenkins-deploy /usr/local/bin/wmf-beta-update-databases.py [releng]
20:03 <urbanecm> Kill beta-scap-sync-world job for the usual reason [releng]
18:43 <bblack> reprepro: uploaded gdnsd-3.8.0-1~wmf1 to buster-wikimedia - T252132 [production]
17:32 <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on mw[1451-1452,1454-1455].eqiad.wmnet with reason: setup new mediawiki servers in eqiad https://phabricator.wikimedia.org/T279309 [production]
17:32 <jelto@cumin1001> START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on mw[1451-1452,1454-1455].eqiad.wmnet with reason: setup new mediawiki servers in eqiad https://phabricator.wikimedia.org/T279309 [production]
17:06 <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw[1451-1452,1454-1455].eqiad.wmnet with reason: setup new mediawiki servers in eqiad https://phabricator.wikimedia.org/T279309 [production]
17:05 <jelto@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw[1451-1452,1454-1455].eqiad.wmnet with reason: setup new mediawiki servers in eqiad https://phabricator.wikimedia.org/T279309 [production]
16:46 <elukey> cleanup /srv/discovery on stat1007 after https://gerrit.wikimedia.org/r/c/operations/puppet/+/712422 [analytics]
15:39 <mutante> mw1451, mw1452, mw1454 - rebooting after reimage, memcached needs one [production]
15:30 <mutante> mw1453 - racadm serveraction powercycle (down and was working until right before the switch issue) [production]
15:18 <godog> restart pybal on lvs2009, to clear CRITICAL - thanos-swift_443: Servers thanos-fe2002.codfw.wmnet are marked down but pooled [production]
15:16 <milimetric> reran the other three failed jobs successfully [analytics]
15:14 <godog> restart pybal on lvs2010, to clear CRITICAL - thanos-swift_443: Servers thanos-fe2002.codfw.wmnet are marked down but pooled [production]
15:02 <mutante> etherpad1002 - started failed ferm [production]
15:00 <mutante> an-worker1117, an-worker1118 - started failed ferm (why are these slowly trickling in?) [production]
14:57 <jelto@cumin1001> conftool action : set/pooled=no; selector: name=mw1450.eqiad.wmnet [production]
14:57 <jelto@cumin1001> conftool action : set/pooled=no; selector: name=mw144[7-9].eqiad.wmnet [production]
14:54 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw[1451-1452,1454-1455].eqiad.wmnet with reason: new setup [production]
14:54 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw[1451-1452,1454-1455].eqiad.wmnet with reason: new setup [production]
14:52 <milimetric> rerunning webrequest-druid-hourly-wf-2021-8-13-13 because of failure to connect to Hive metastore [analytics]
14:50 <mutante> an-worker1079 - started failed ferm [production]
14:47 <jelto@cumin1001> conftool action : set/weight=25; selector: name=mw1450.eqiad.wmnet [production]
14:46 <jelto@cumin1001> conftool action : set/weight=25; selector: name=mw144[7-9].eqiad.wmnet [production]
14:45 <mutante> an-worker1095 - started ferm, service failed [production]
14:44 <mutante> an-worker1082 - started ferm (was failed due to DNS hiccup) [production]
14:44 <jelto@cumin1001> conftool action : set/pooled=inactive; selector: name=mw1450.eqiad.wmnet [production]
14:43 <jelto@cumin1001> conftool action : set/pooled=inactive; selector: name=mw144[7-9].eqiad.wmnet [production]
14:41 <mutante> mw1419 - started ferm [production]
13:35 <sukhe> ran homer for Gerrit 712400: Set up BGP peering to doh4002 in ulsfo [production]
13:23 <mutante> mw1453 - manual powercycle after it never rebooted when the reimage cookbook tried to trigger one [production]
13:22 <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1450.eqiad.wmnet with reason: setup new mediawiki servers in eqiad https://phabricator.wikimedia.org/T279309 [production]
13:21 <jelto@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw1450.eqiad.wmnet with reason: setup new mediawiki servers in eqiad https://phabricator.wikimedia.org/T279309 [production]
13:21 <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw[1447-1449].eqiad.wmnet with reason: setup new mediawiki servers in eqiad https://phabricator.wikimedia.org/T279309 [production]
13:21 <jelto@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw[1447-1449].eqiad.wmnet with reason: setup new mediawiki servers in eqiad https://phabricator.wikimedia.org/T279309 [production]
13:13 <majavah> `mwscript extensions/CentralAuth/maintenance/importMissingLocalNames.php --wiki metawiki` on the beta cluster [releng]
12:54 <dzahn@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw1454.eqiad.wmnet with reason: REIMAGE [production]
12:53 <godog> set runtime envoy.reloadable_features.strict_1xx_and_204_response_headers=false on thanos-fe* - T288815 [production]
12:53 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-fe2001.codfw.wmnet with reason: new setup [production]
12:53 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-fe2001.codfw.wmnet with reason: new setup [production]
12:52 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw1454.eqiad.wmnet with reason: REIMAGE [production]
12:33 <dzahn@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw1452.eqiad.wmnet with reason: REIMAGE [production]