2021-08-15
|
18:23 |
<wm-bot> |
<lucaswerkmeister> deployed 9235b38189 (Python 3.9, CC T284590) |
[tools.ranker] |
18:06 |
<wm-bot> |
<lucaswerkmeister> deployed de504073a8 (style fix) |
[tools.pagepile-visual-filter] |
17:54 |
<wm-bot> |
<lucaswerkmeister> deployed 9e864a3b9b (Python 3.9, no issues so far; CC T284590) |
[tools.pagepile-visual-filter] |
17:44 |
<James_F> |
Zuul: [mediawiki/extensions/CIForms] Add basic quibble CI |
[releng] |
17:30 |
<majavah> |
deploying update to jobs-framework-api container list to include bullseye images |
[tools] |
17:21 |
<majavah> |
finished initial build of images: php74, jdk17, python39, ruby27 - T284590 |
[tools] |
16:51 |
<majavah> |
starting build of initial bullseye based images - T284590 |
[tools] |
16:44 |
<majavah> |
tagged and building toollabs-webservice 0.76 with bullseye images defined T284590 |
[tools] |
16:13 |
<andrew@deploy1002> |
Finished deploy [horizon/deploy@c23a155]: adding cinder volume resize warning (duration: 03m 52s) |
[production] |
16:10 |
<andrew@deploy1002> |
Started deploy [horizon/deploy@c23a155]: adding cinder volume resize warning |
[production] |
15:14 |
<majavah> |
building tools-webservice 0.74 (currently live version) to bullseye-tools and bullseye-toolsbeta |
[tools] |
2021-08-14
|
19:21 |
<wm-bot> |
<lucaswerkmeister> installed TemplateStyles extension (turns out it doesn’t do what I wanted it to but let’s keep it anyways) |
[tools.notwikilambda] |
17:35 |
<majavah> |
add k8s job to rebuild stretch report, with same parameters as now-deactivated grid cron job for jessie |
[tools.os-deprecation] |
17:21 |
<bd808> |
Added majavah as co-maintainer and granted git repo access |
[tools.os-deprecation] |
15:11 |
<bd808> |
Transferred ownership from [[User:Owner of abandoned tools]] to [[User:Ash Crow]] (T288890) |
[tools.macommune] |
15:03 |
<wm-bot> |
<bd808> Updated config for channel renaming that has happened after libera.chat migration |
[tools.stashbot] |
14:57 |
<majavah> |
restart after irc disconnect |
[tools.bridgebot] |
12:46 |
<wm-bot> |
<lucaswerkmeister> deployed 7a1980f4e2 (l10n updates) |
[tools.lexeme-forms] |
03:54 |
<legoktm[m]> |
restarting mailman3 on lists1001, bounce runner crashed (T288880) |
[production] |
2021-08-13
|
20:09 |
<urbanecm> |
Manually start `beta-update-databases-eqiad` CI job |
[releng] |
20:06 |
<urbanecm> |
deployment-prep: sudo -u jenkins-deploy /usr/local/bin/wmf-beta-update-databases.py |
[releng] |
20:03 |
<urbanecm> |
Kill beta-scap-sync-world job for the usual reason |
[releng] |
18:43 |
<bblack> |
reprepro: uploaded gdnsd-3.8.0-1~wmf1 to buster-wikimedia - T252132 |
[production] |
17:32 |
<jelto@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on mw[1451-1452,1454-1455].eqiad.wmnet with reason: setup new mediawiki servers in eqiad https://phabricator.wikimedia.org/T279309 |
[production] |
17:32 |
<jelto@cumin1001> |
START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on mw[1451-1452,1454-1455].eqiad.wmnet with reason: setup new mediawiki servers in eqiad https://phabricator.wikimedia.org/T279309 |
[production] |
17:06 |
<jelto@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw[1451-1452,1454-1455].eqiad.wmnet with reason: setup new mediawiki servers in eqiad https://phabricator.wikimedia.org/T279309 |
[production] |
17:05 |
<jelto@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw[1451-1452,1454-1455].eqiad.wmnet with reason: setup new mediawiki servers in eqiad https://phabricator.wikimedia.org/T279309 |
[production] |
16:46 |
<elukey> |
cleanup /srv/discovery on stat1007 after https://gerrit.wikimedia.org/r/c/operations/puppet/+/712422 |
[analytics] |
15:39 |
<mutante> |
mw1451, mw1452, mw1454 - rebooting after reimage, memcached needs one |
[production] |
15:30 |
<mutante> |
mw1453 - racadm serveraction powercycle (host is down; it was working until right before the switch issue) |
[production] |
15:18 |
<godog> |
restart pybal on lvs2009, to clear CRITICAL - thanos-swift_443: Servers thanos-fe2002.codfw.wmnet are marked down but pooled |
[production] |
15:16 |
<milimetric> |
reran the other three failed jobs successfully |
[analytics] |
15:14 |
<godog> |
restart pybal on lvs2010, to clear CRITICAL - thanos-swift_443: Servers thanos-fe2002.codfw.wmnet are marked down but pooled |
[production] |
15:02 |
<mutante> |
etherpad1002 - started failed ferm |
[production] |
15:00 |
<mutante> |
an-worker1117, an-worker1118 - started failed ferm (why are these slowly trickling in?) |
[production] |
14:57 |
<jelto@cumin1001> |
conftool action : set/pooled=no; selector: name=mw1450.eqiad.wmnet |
[production] |
14:57 |
<jelto@cumin1001> |
conftool action : set/pooled=no; selector: name=mw144[7-9].eqiad.wmnet |
[production] |
14:54 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw[1451-1452,1454-1455].eqiad.wmnet with reason: new setup |
[production] |
14:54 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw[1451-1452,1454-1455].eqiad.wmnet with reason: new setup |
[production] |
14:52 |
<milimetric> |
rerunning webrequest-druid-hourly-wf-2021-8-13-13 because of failure to connect to Hive metastore |
[analytics] |
14:50 |
<mutante> |
an-worker1079 - started failed ferm |
[production] |
14:47 |
<jelto@cumin1001> |
conftool action : set/weight=25; selector: name=mw1450.eqiad.wmnet |
[production] |
14:46 |
<jelto@cumin1001> |
conftool action : set/weight=25; selector: name=mw144[7-9].eqiad.wmnet |
[production] |
14:45 |
<mutante> |
an-worker1095 - started ferm, service failed |
[production] |
14:44 |
<mutante> |
an-worker1082 - started ferm (was failed due to DNS hiccup) |
[production] |
14:44 |
<jelto@cumin1001> |
conftool action : set/pooled=inactive; selector: name=mw1450.eqiad.wmnet |
[production] |
14:43 |
<jelto@cumin1001> |
conftool action : set/pooled=inactive; selector: name=mw144[7-9].eqiad.wmnet |
[production] |
14:41 |
<mutante> |
mw1419 - started ferm |
[production] |
13:35 |
<sukhe> |
ran homer for Gerrit 712400: Set up BGP peering to doh4002 in ulsfo |
[production] |
13:23 |
<mutante> |
mw1453 - manual powercycle after it never rebooted when the reimage cookbook tried to trigger one |
[production] |