2025-03-27
09:41 | <elukey@cumin1002> | START - Cookbook sre.hosts.reimage for host ml-serve-ctrl2002.codfw.wmnet with OS bookworm | [production]
09:33 | <jmm@cumin2002> | START - Cookbook sre.hosts.reboot-single for host ganeti4005.ulsfo.wmnet | [production]
09:32 | <dcausse@deploy1003> | helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply | [production]
09:32 | <dcausse@deploy1003> | helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply | [production]
09:30 | <brouberol@deploy1003> | helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply | [production]
09:29 | <brouberol@deploy1003> | helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply | [production]
09:29 | <godog> | silence LogstashKafkaConsumerLag and LogstashIndexingFailures for today for 1d - T390140 | [production]
09:29 | <dcausse@deploy1003> | helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply | [production]
09:29 | <dcausse@deploy1003> | helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply | [production]
09:28 | <dcausse@deploy1003> | helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply | [production]
09:28 | <dcausse@deploy1003> | helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply | [production]
09:27 | <dcausse@deploy1003> | helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply | [production]
09:27 | <dcausse@deploy1003> | helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply | [production]
09:14 | <aklapper@deploy1003> | rebuilt and synchronized wikiversions files: group1 to 1.44.0-wmf.22 refs T386217 | [production]
09:01 | <aklapper@deploy1003> | Finished scap sync-world: Backport for [[gerrit:1131348|Instead of calling deprecated parserOptions(), parse content ourselves (T390032)]] (duration: 12m 24s) | [production]
08:54 | <aklapper@deploy1003> | aklapper, jforrester: Continuing with sync | [production]
08:53 | <aklapper@deploy1003> | aklapper, jforrester: Backport for [[gerrit:1131348|Instead of calling deprecated parserOptions(), parse content ourselves (T390032)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) | [production]
08:50 | <jmm@cumin2002> | START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ganeti4005.ulsfo.wmnet | [production]
08:48 | <aklapper@deploy1003> | Started scap sync-world: Backport for [[gerrit:1131348|Instead of calling deprecated parserOptions(), parse content ourselves (T390032)]] | [production]
08:44 | <aklapper@deploy1003> | Finished scap sync-world: Backport for [[gerrit:1131018|Allow arwikisource bureaucrat to manage "import" (T389952)]] (duration: 13m 28s) | [production]
08:42 | <jmm@cumin2002> | DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ganeti4005.ulsfo.wmnet with reason: remove from cluster for reimage | [production]
08:41 | <fabfur@cumin1002> | conftool action : set/pooled=yes; selector: name=cp7001.magru.wmnet | [production]
08:41 | <fabfur@cumin1002> | conftool action : set/pooled=yes; selector: name=cp7009.magru.wmnet | [production]
08:41 | <fabfur> | repooling cp7001 and cp7009 with new TLS certificate path (T384227) | [production]
08:40 | <jmm@cumin2002> | END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4005.ulsfo.wmnet | [production]
08:37 | <aklapper@deploy1003> | hubaishan, aklapper: Continuing with sync | [production]
08:37 | <aklapper@deploy1003> | hubaishan, aklapper: Backport for [[gerrit:1131018|Allow arwikisource bureaucrat to manage "import" (T389952)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) | [production]
08:30 | <aklapper@deploy1003> | Started scap sync-world: Backport for [[gerrit:1131018|Allow arwikisource bureaucrat to manage "import" (T389952)]] | [production]
08:29 | <aklapper@deploy1003> | Finished scap sync-world: Backport for [[gerrit:1131410|Make officewiki readonly after moving flow pages (T380909)]] (duration: 14m 14s) | [production]
08:28 | <fabfur@cumin1002> | conftool action : set/pooled=no; selector: name=cp7009.magru.wmnet | [production]
08:28 | <fabfur@cumin1002> | conftool action : set/pooled=no; selector: name=cp7001.magru.wmnet | [production]
08:28 | <fabfur> | depooling cp7001 and cp7009 to apply https://gerrit.wikimedia.org/r/c/operations/puppet/+/1131052 (T384227) | [production]
08:21 | <aklapper@deploy1003> | zoe, aklapper: Continuing with sync | [production]
08:21 | <aklapper@deploy1003> | zoe, aklapper: Backport for [[gerrit:1131410|Make officewiki readonly after moving flow pages (T380909)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) | [production]
08:14 | <aklapper@deploy1003> | Started scap sync-world: Backport for [[gerrit:1131410|Make officewiki readonly after moving flow pages (T380909)]] | [production]
08:06 | <marostegui@cumin1002> | END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db1211.eqiad.wmnet onto db1255.eqiad.wmnet | [production]
07:54 | <marostegui@cumin1002> | START - Cookbook sre.mysql.clone of db1211.eqiad.wmnet onto db1255.eqiad.wmnet | [production]
07:37 | <jmm@cumin2002> | START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4005.ulsfo.wmnet | [production]
07:30 | <jmm@cumin2002> | END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4005.ulsfo.wmnet | [production]
07:30 | <jmm@cumin2002> | START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4005.ulsfo.wmnet | [production]
06:01 | <marostegui@cumin1002> | END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) of db2181.codfw.wmnet onto db2242.codfw.wmnet | [production]
04:06 | <cwhite> | restart grafana-server on grafana1002 - appears hung | [production]
2025-03-26
23:06 | <toyofuku@deploy1003> | Finished scap sync-world: Backport for [[gerrit:1131469|Restore simplified watchlist for logged in users (T388445)]] (duration: 12m 29s) | [production]
22:59 | <toyofuku@deploy1003> | jdlrobson, toyofuku: Continuing with sync | [production]
22:59 | <jclark@cumin1002> | END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host elastic1123.eqiad.wmnet with OS bullseye | [production]
22:58 | <toyofuku@deploy1003> | jdlrobson, toyofuku: Backport for [[gerrit:1131469|Restore simplified watchlist for logged in users (T388445)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) | [production]
22:57 | <brett@cumin2002> | END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on P{cp6009.drmrs.wmnet} and A:cp | [production]
22:56 | <brett@cumin2002> | END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on P{cp6008.drmrs.wmnet} and A:cp | [production]
22:54 | <toyofuku@deploy1003> | Started scap sync-world: Backport for [[gerrit:1131469|Restore simplified watchlist for logged in users (T388445)]] | [production]
22:51 | <toyofuku@deploy1003> | Finished scap sync-world: Backport for [[gerrit:1131365|Set wgMinervaDonateBanner to default base true (T388438)]] (duration: 15m 15s) | [production]