2025-06-05
|
05:20 |
<marostegui@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2180.codfw.wmnet with reason: Maintenance |
[production] |
05:20 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depool db2180 T395989', diff saved to https://phabricator.wikimedia.org/P77075 and previous config saved to /var/cache/conftool/dbconfig/20250605-052003-marostegui.json |
[production] |
04:45 |
<Krinkle> |
gitpuppet@deployment-puppetserver-1:/srv/git/operations/puppet$ Cherry-pick https://gerrit.wikimedia.org/r/c/operations/puppet/+/1153764, ref T289318 |
[releng] |
03:58 |
<Krinkle> |
Update profile::cache::haproxy::available_unified_certificates under deployment-cache in Horizon, to include the remaining wildcard and m-dot subdomains under beta.wmcloud.org for wikibooks, wikimedia, wikinews, wikiquote, wikisource, wikiversity, wiktionary. Remove `*.zero.wikipedia.beta.wmflabs.org`, which no longer responded or worked. ref T289318 |
[releng] |
03:34 |
<Krinkle> |
Update profile::acme_chief::certificates under deployment-acme-chief prefix in Horizon, to include the remaining wildcard and m-dot subdomains under beta.wmcloud.org for wikibooks, wikimedia, wikinews, wikiquote, wikisource, wikiversity, wiktionary (wikipedia and wikivoyage were already there), ref T289318 |
[releng] |
03:34 |
<Krinkle> |
Update profile::acme_chief::certificates under deployment-acme-chief prefix in Horizon, to include the remaining wildcard and m-dot subdomains under beta.wmcloud.org for wikibooks, wikimedia, wikinews, wikiquote, wikisource, wikiversity, wiktionary (wikipedia and wikivoyage were already there) |
[releng] |
00:32 |
<Krinkle> |
Add `TXT *.wikimedia.beta.wmcloud.org. "v=spf1 -all"` to match beta.wmflabs.org DNS (ref T289318, changing email is out of scope for now, but might as well add the DNS records). |
[releng] |
00:22 |
<Krinkle> |
Adding missing DNS entries under beta.wmcloud.org. There was already: *.wikipedia, *.m.wikimedia, *.wikivoyage, *.m.wikivoyage (for T355281). Adding: wikibooks, wikimedia, wikinews, wikiquote, wikisource, wikiversity, wiktionary, wikidata, upload (T289318). |
[releng] |
2025-06-04
|
23:55 |
<brett@cumin2002> |
END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir |
[production] |
23:21 |
<raymond-ndibe@cloudcumin1001> |
END (PASS) - Cookbook wmcs.openstack.quota_increase (exit_code=0) (T396073) |
[codesearch] |
23:21 |
<raymond-ndibe@cloudcumin1001> |
START - Cookbook wmcs.openstack.quota_increase (T396073) |
[codesearch] |
23:20 |
<raymond-ndibe@cloudcumin1001> |
END (PASS) - Cookbook wmcs.openstack.quota_increase (exit_code=0) (T396073) |
[codesearch] |
23:20 |
<raymond-ndibe@cloudcumin1001> |
START - Cookbook wmcs.openstack.quota_increase (T396073) |
[codesearch] |
22:45 |
<brett@cumin2002> |
START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir |
[production] |
22:30 |
<robh@cumin2002> |
END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cp7016.magru.wmnet |
[production] |
22:27 |
<vriley@cumin1002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1185.eqiad.wmnet with OS bullseye |
[production] |
22:20 |
<robh@cumin2002> |
END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cp7014.magru.wmnet |
[production] |
22:18 |
<damilare> |
SmashPig upgraded from d08693e5 to 3222a1f3 |
[production] |
22:16 |
<ladsgroup@deploy1003> |
Finished scap sync-world: Backport for [[gerrit:1153725|Bump cache key version in EventStore (T396075)]] (duration: 13m 54s) |
[production] |
22:12 |
<robh@cumin2002> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7016.magru.wmnet |
[production] |
22:12 |
<robh@cumin2002> |
END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts cp7015.magru.wmnet |
[production] |
22:12 |
<robh@cumin2002> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7015.magru.wmnet |
[production] |
22:11 |
<robh@cumin2002> |
END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts cp7015.magru.wmnet |
[production] |
22:11 |
<brett> |
sudo -i cumin 'A:ncredir' 'depool && apt-get update && apt-get upgrade -y && pool' -b1 -s10 |
[production] |
22:09 |
<ladsgroup@deploy1003> |
ladsgroup: Continuing with sync |
[production] |
22:04 |
<ladsgroup@deploy1003> |
ladsgroup: Backport for [[gerrit:1153725|Bump cache key version in EventStore (T396075)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. |
[production] |
22:02 |
<ladsgroup@deploy1003> |
Started scap sync-world: Backport for [[gerrit:1153725|Bump cache key version in EventStore (T396075)]] |
[production] |
22:02 |
<robh@cumin2002> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7015.magru.wmnet |
[production] |
22:02 |
<robh@cumin2002> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7014.magru.wmnet |
[production] |
22:02 |
<robh@cumin2002> |
END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cp7013.magru.wmnet |
[production] |
21:58 |
<robh@cumin2002> |
END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cp7012.magru.wmnet |
[production] |
21:43 |
<robh@cumin2002> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7013.magru.wmnet |
[production] |
21:42 |
<robh@cumin2002> |
END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts cp7011.magru.wmnet |
[production] |
21:40 |
<robh@cumin2002> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7012.magru.wmnet |
[production] |
21:40 |
<robh@cumin2002> |
END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts cp7010.magru.wmnet |
[production] |
21:39 |
<robh@cumin2002> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7010.magru.wmnet |
[production] |
21:35 |
<robh@cumin2002> |
END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts cp7010.magru.wmnet |
[production] |
21:29 |
<ryankemper@cumin2002> |
START - Cookbook sre.wdqs.data-reload reloading scholarly_articles on wdqs1023.eqiad.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/scholarly/20250526/ using stat1011.eqiad.wmnet) |
[production] |
21:27 |
<James_F> |
Zuul: [mediawiki/extensions/Springboard] Add basic CI, for T395981 |
[releng] |
21:25 |
<robh@cumin2002> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7011.magru.wmnet |
[production] |
21:25 |
<robh@cumin2002> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7010.magru.wmnet |
[production] |
21:24 |
<ryankemper@cumin2002> |
START - Cookbook sre.wdqs.data-reload reloading wikidata_main on wdqs1022.eqiad.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/main/20250526/ using stat1009.eqiad.wmnet) |
[production] |
21:22 |
<robh@cumin2002> |
END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts cp7009.magru.wmnet |
[production] |
21:14 |
<robh@cumin2002> |
END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts cp7008.magru.wmnet |
[production] |
21:07 |
<vriley@cumin1002> |
START - Cookbook sre.hosts.reimage for host an-worker1185.eqiad.wmnet with OS bullseye |
[production] |
21:06 |
<vriley@cumin1002> |
END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host an-worker1186.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED |
[production] |
21:05 |
<robh@cumin2002> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7009.magru.wmnet |
[production] |
21:05 |
<robh@cumin2002> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7008.magru.wmnet |
[production] |
21:04 |
<cjming> |
end of UTC late backport window |
[production] |
21:04 |
<robh@cumin2002> |
END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts cp7007.magru.wmnet |
[production] |