2024-05-07
|
19:57 |
<denisse@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 12 hosts with reason: Downtiming the Logstash hosts serving OpenSearch Dashboards as part of the cergen to CFSSL migration - T360414 |
[production] |
19:57 |
<denisse@cumin2002> |
START - Cookbook sre.hosts.downtime for 0:30:00 on 12 hosts with reason: Downtiming the Logstash hosts serving OpenSearch Dashboards as part of the cergen to CFSSL migration - T360414 |
[production] |
19:46 |
<denisse> |
Disabling Puppet on the Logstash hosts that serve OpenSearch Dashboards to test the CFSSL certificates - T360414 |
[production] |
19:34 |
<jhuneidi@deploy1002> |
Finished scap: Backport for [[gerrit:1028865|Partial cherry-pick of I9d8409fdbd757e (T361398 T362566)]] (duration: 15m 39s) |
[production] |
19:21 |
<jhuneidi@deploy1002> |
ladsgroup and jhuneidi: Continuing with sync |
[production] |
19:21 |
<jhuneidi@deploy1002> |
ladsgroup and jhuneidi: Backport for [[gerrit:1028865|Partial cherry-pick of I9d8409fdbd757e (T361398 T362566)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
19:18 |
<jhuneidi@deploy1002> |
Started scap: Backport for [[gerrit:1028865|Partial cherry-pick of I9d8409fdbd757e (T361398 T362566)]] |
[production] |
18:40 |
<eevans@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on aqs1013.eqiad.wmnet with reason: Decommissioning — T364422 |
[production] |
18:40 |
<eevans@cumin1002> |
START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on aqs1013.eqiad.wmnet with reason: Decommissioning — T364422 |
[production] |
17:33 |
<swfrench@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/apertium: apply |
[production] |
17:32 |
<swfrench@deploy1002> |
helmfile [eqiad] START helmfile.d/services/apertium: apply |
[production] |
17:21 |
<swfrench@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/apertium: apply |
[production] |
17:20 |
<swfrench@deploy1002> |
helmfile [codfw] START helmfile.d/services/apertium: apply |
[production] |
17:14 |
<swfrench@deploy1002> |
helmfile [staging] DONE helmfile.d/services/apertium: apply |
[production] |
17:13 |
<swfrench@deploy1002> |
helmfile [staging] START helmfile.d/services/apertium: apply |
[production] |
16:48 |
<elukey@cumin1002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2001.codfw.wmnet |
[production] |
16:39 |
<elukey@cumin1002> |
START - Cookbook sre.hosts.reboot-single for host ml-staging2001.codfw.wmnet |
[production] |
16:34 |
<zabe@deploy1002> |
Finished scap: T363825 (duration: 07m 42s) |
[production] |
16:26 |
<zabe@deploy1002> |
Started scap: T363825 |
[production] |
16:08 |
<zabe@deploy1002> |
sync-world aborted: (no justification provided) (duration: 00m 00s) |
[production] |
16:08 |
<zabe@deploy1002> |
Started scap: (no justification provided) |
[production] |
16:05 |
<ladsgroup@deploy1002> |
Finished scap: Backport for [[gerrit:1028778|Stop writing to old columns of pagelinks in most wikis (T352010 T299947)]] (duration: 32m 29s) |
[production] |
15:58 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Depooling db2179 (T352010)', diff saved to https://phabricator.wikimedia.org/P61983 and previous config saved to /var/cache/conftool/dbconfig/20240507-155822-ladsgroup.json |
[production] |
15:58 |
<ladsgroup@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance |
[production] |
15:58 |
<ladsgroup@cumin1002> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance |
[production] |
15:52 |
<ladsgroup@deploy1002> |
ladsgroup: Continuing with sync |
[production] |
15:38 |
<ladsgroup@deploy1002> |
ladsgroup: Backport for [[gerrit:1028778|Stop writing to old columns of pagelinks in most wikis (T352010 T299947)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
15:34 |
<ejegg> |
switched Adyen IPN format to JSON in merchant console |
[production] |
15:32 |
<ladsgroup@deploy1002> |
Started scap: Backport for [[gerrit:1028778|Stop writing to old columns of pagelinks in most wikis (T352010 T299947)]] |
[production] |
15:31 |
<ejegg> |
SmashPig (standalone IPN listener) upgraded from 71b9be53 to 67db9d96 |
[production] |
15:29 |
<hnowlan> |
depooling 5 eqiad api appservers in advance of reimaging to k8s workers |
[production] |
15:19 |
<moritzm> |
imported nodejs 20.5.1-deb-1nodesource1 to thirdparty/node20 T362681 |
[production] |
15:14 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2122.codfw.wmnet |
[production] |
15:13 |
<godog> |
remove accidentally set site!=magru silence, add site=magru silence instead - T364016 |
[production] |
15:12 |
<elukey> |
repool ms-fe1009's envoy with PKI TLS cert |
[production] |
15:12 |
<elukey@puppetmaster1001> |
conftool action : set/pooled=yes; selector: name=ms-fe1009.eqiad.wmnet |
[production] |
14:55 |
<elukey> |
depool ms-fe1009's nginx (swift proxy) to safely apply https://gerrit.wikimedia.org/r/c/operations/puppet/+/1026927 |
[production] |
14:54 |
<elukey@puppetmaster1001> |
conftool action : set/pooled=no; selector: name=ms-fe1009.eqiad.wmnet |
[production] |
14:53 |
<sukhe> |
A:cp and A:magru: running haproxy-restart |
[production] |
14:53 |
<jmm@cumin2002> |
START - Cookbook sre.puppet.migrate-host for host db2122.codfw.wmnet |
[production] |
14:53 |
<hnowlan@cumin1002> |
conftool action : set/weight=10:pooled=yes; selector: name=(mw2305.codfw.wmnet|mw2325.codfw.wmnet|mw2338.codfw.wmnet|mw2359.codfw.wmnet|mw2390.codfw.wmnet|mw2407.codfw.wmnet),cluster=kubernetes,service=kubesvc |
[production] |
14:52 |
<moritzm> |
installing mariadb-10.5 security updates (as packaged in Debian, not the wmf-mariadb packages) |
[production] |
14:51 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2121.codfw.wmnet |
[production] |
14:50 |
<godog> |
silence site=magru alerts during prometheus7001 - T364016 |
[production] |
14:44 |
<jmm@cumin2002> |
START - Cookbook sre.puppet.migrate-host for host db2121.codfw.wmnet |
[production] |
14:41 |
<hnowlan> |
running homer 'cr*codfw*' commit to configure BGP for new k8s codfw workers |
[production] |
14:39 |
<hnowlan@cumin1002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2338.codfw.wmnet with OS bullseye |
[production] |
14:33 |
<hnowlan@cumin1002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw2325.codfw.wmnet with OS bullseye |
[production] |
14:31 |
<filippo@cumin1002> |
END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host prometheus7001.magru.wmnet |
[production] |
14:31 |
<filippo@cumin1002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host prometheus7001.magru.wmnet with OS bullseye |
[production] |