2024-01-26
|
12:30 |
<ayounsi@cumin1002> |
START - Cookbook sre.dns.netbox |
[production] |
11:43 |
<taavi> |
reprepro: copy helm-diff_3.1.3-2 from bullseye-wikimedia to bookworm-wikimedia |
[production] |
11:28 |
<eoghan@cumin1002> |
START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Gitlab security upgrade |
[production] |
10:52 |
<arnaudb@cumin1002> |
END (FAIL) - Cookbook sre.mysql.clone (exit_code=99) Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet |
[production] |
10:51 |
<arnaudb@cumin1002> |
START - Cookbook sre.mysql.clone Will create a clone of db2169.codfw.wmnet onto db2194.codfw.wmnet |
[production] |
10:50 |
<eoghan@cumin1002> |
END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Gitlab security upgrade |
[production] |
10:44 |
<eoghan@cumin1002> |
START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Gitlab security upgrade |
[production] |
10:36 |
<moritzm> |
prune obsolete nginx packages from eventschema hosts after migration to new library scheme T329529 |
[production] |
10:25 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Cloning db2169 in db2194 for T343674', diff saved to https://phabricator.wikimedia.org/P55737 and previous config saved to /var/cache/conftool/dbconfig/20240126-102550-arnaudb.json |
[production] |
08:01 |
<moritzm> |
rebalance codfw/B following switch maintenance T355549 |
[production] |
07:54 |
<moritzm> |
failover ganeti master for codfw back to ganeti2022, switch maintenance is completed T355549 |
[production] |
01:01 |
<dzahn@cumin1002> |
END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1004.wikimedia.org with reason: security release |
[production] |
00:07 |
<dzahn@cumin1002> |
START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: security release |
[production] |
00:00 |
<dzahn@cumin1002> |
END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: security release |
[production] |
2024-01-25
|
23:54 |
<zabe> |
zabe@mwmaint2002:~$ mwscript namespaceDupes.php --wiki=wikimaniawiki --fix # T347622 |
[production] |
23:54 |
<zabe@deploy2002> |
Finished scap: Backport for [[gerrit:961963|Setup namespace for 2025, 2026, enable subpages for 2023-2026 (T347622)]] (duration: 08m 30s) |
[production] |
23:47 |
<zabe@deploy2002> |
robertsky and zabe: Continuing with sync |
[production] |
23:46 |
<zabe@deploy2002> |
robertsky and zabe: Backport for [[gerrit:961963|Setup namespace for 2025, 2026, enable subpages for 2023-2026 (T347622)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
23:45 |
<zabe@deploy2002> |
Started scap: Backport for [[gerrit:961963|Setup namespace for 2025, 2026, enable subpages for 2023-2026 (T347622)]] |
[production] |
23:29 |
<zabe> |
zabe@mwmaint2002:/tmp/uploads$ mwscript importImages.php --wiki=commonswiki --comment-ext=txt --user=Sturm . # T355485 |
[production] |
23:17 |
<bking@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cloudelastic1010.wikimedia.org with reason: migration canary T355617 |
[production] |
23:17 |
<bking@cumin2002> |
START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on cloudelastic1010.wikimedia.org with reason: migration canary T355617 |
[production] |
22:54 |
<bking@cumin2002> |
END (PASS) - Cookbook sre.elasticsearch.ban (exit_code=0) Banning hosts: cloudelastic1010.wikimedia.org for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617 |
[production] |
22:53 |
<bking@cumin2002> |
START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic1010.wikimedia.org for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617 |
[production] |
22:53 |
<bking@cumin2002> |
END (FAIL) - Cookbook sre.elasticsearch.ban (exit_code=99) Banning hosts: cloudelastic1010 for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617 |
[production] |
22:53 |
<bking@cumin2002> |
START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic1010 for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617 |
[production] |
22:52 |
<bking@cumin2002> |
END (FAIL) - Cookbook sre.elasticsearch.ban (exit_code=99) Banning hosts: cloudelastic1010 for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617 |
[production] |
22:52 |
<bking@cumin2002> |
START - Cookbook sre.elasticsearch.ban Banning hosts: cloudelastic1010 for use cloudelastic1010 as migration canary - bking@cumin2002 - T355617 |
[production] |
22:40 |
<ryankemper> |
T351354 Restarting `cloudelastic1006` (final restart for today) |
[production] |
22:34 |
<ryankemper> |
T351354 Now restarting new masters to keep configs in sync; restarting `cloudelastic1009` |
[production] |
22:33 |
<ryankemper> |
T351354 Now restarting new masters to keep configs in sync; restarting `cloudelastic1007` |
[production] |
22:25 |
<ryankemper> |
T351354 Restarting `cloudelastic1002` |
[production] |
22:19 |
<ebernhardson@deploy2002> |
helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
22:19 |
<ebernhardson@deploy2002> |
helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
22:15 |
<ryankemper> |
T351354 Restarting `cloudelastic1004` following puppet run |
[production] |
22:12 |
<dzahn@cumin1002> |
START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: security release |
[production] |
22:11 |
<ryankemper> |
T351354 Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/993038; restarting `cloudelastic1001` following puppet run |
[production] |
22:08 |
<ryankemper> |
T351354 Downtimed `cloudelastic*`; shortly will restart `cloudelastic100[1,2,4]` one host at a time to make them no longer masters |
[production] |
22:08 |
<ryankemper@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 10 hosts with reason: cloudelastic maintenance |
[production] |
22:07 |
<ryankemper@cumin2002> |
START - Cookbook sre.hosts.downtime for 1:00:00 on 10 hosts with reason: cloudelastic maintenance |
[production] |
21:55 |
<ebernhardson@deploy2002> |
helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
21:55 |
<ebernhardson@deploy2002> |
helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
21:44 |
<ebernhardson@deploy2002> |
helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
21:44 |
<ebernhardson@deploy2002> |
helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
21:44 |
<ebernhardson@deploy2002> |
helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
21:44 |
<ebernhardson@deploy2002> |
helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
21:19 |
<ebernhardson@deploy2002> |
helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
21:19 |
<ebernhardson@deploy2002> |
helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
21:14 |
<ebernhardson@deploy2002> |
helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
21:14 |
<ebernhardson@deploy2002> |
helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply |
[production] |