2025-09-08
§
|
23:28 |
<rzl@deploy1003> |
helmfile [eqiad] START helmfile.d/services/mw-debug: apply |
[production] |
23:21 |
<jdlrobson@deploy1003> |
Finished scap sync-world: Backport for [[gerrit:1182944|Cleanup: Simplify configuration for wgSpecialContributeSkinsEnabled]], [[gerrit:1186044|Temporarily use production for summary endpoint (T400694)]] (duration: 16m 06s) |
[production] |
23:16 |
<jdlrobson@deploy1003> |
jdlrobson: Continuing with sync |
[production] |
23:11 |
<jdlrobson@deploy1003> |
jdlrobson: Backport for [[gerrit:1182944|Cleanup: Simplify configuration for wgSpecialContributeSkinsEnabled]], [[gerrit:1186044|Temporarily use production for summary endpoint (T400694)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. |
[production] |
23:10 |
<ladsgroup@cumin1003> |
START - Cookbook sre.mysql.pool db1223 gradually with 4 steps - Maint over |
[production] |
23:08 |
<ladsgroup@cumin1002> |
END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1223.eqiad.wmnet |
[production] |
23:08 |
<eileen> |
civicrm upgraded from c7ebd726 to 1ec5de94 |
[production] |
23:05 |
<jdlrobson@deploy1003> |
Started scap sync-world: Backport for [[gerrit:1182944|Cleanup: Simplify configuration for wgSpecialContributeSkinsEnabled]], [[gerrit:1186044|Temporarily use production for summary endpoint (T400694)]] |
[production] |
23:02 |
<ladsgroup@cumin1002> |
END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db1223 - Upgrading db1223.eqiad.wmnet |
[production] |
23:02 |
<ladsgroup@cumin1002> |
START - Cookbook sre.mysql.depool db1223 - Upgrading db1223.eqiad.wmnet |
[production] |
23:02 |
<ladsgroup@cumin1002> |
START - Cookbook sre.mysql.upgrade for db1223.eqiad.wmnet |
[production] |
22:56 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Depool db1223 T404025', diff saved to https://phabricator.wikimedia.org/P82789 and previous config saved to /var/cache/conftool/dbconfig/20250908-225603-ladsgroup.json |
[production] |
22:54 |
<ladsgroup@dns1004> |
END - running authdns-update |
[production] |
22:53 |
<ladsgroup@cumin1003> |
dbctl commit (dc=all): 'Depooling db2176 (T402925)', diff saved to https://phabricator.wikimedia.org/P82788 and previous config saved to /var/cache/conftool/dbconfig/20250908-225313-ladsgroup.json |
[production] |
22:53 |
<ladsgroup@dns1004> |
START - running authdns-update |
[production] |
22:53 |
<ladsgroup@cumin1003> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance |
[production] |
22:52 |
<ladsgroup@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db2174 (T402925)', diff saved to https://phabricator.wikimedia.org/P82787 and previous config saved to /var/cache/conftool/dbconfig/20250908-225250-ladsgroup.json |
[production] |
22:50 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Promote db1189 to s3 primary and set section read-write T404025', diff saved to https://phabricator.wikimedia.org/P82786 and previous config saved to /var/cache/conftool/dbconfig/20250908-225054-ladsgroup.json |
[production] |
22:49 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Set s3 eqiad as read-only for maintenance - T404025', diff saved to https://phabricator.wikimedia.org/P82785 and previous config saved to /var/cache/conftool/dbconfig/20250908-224914-ladsgroup.json |
[production] |
22:48 |
<Amir1> |
Starting s3 eqiad failover from db1223 to db1189 - T404025 |
[production] |
22:43 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Set db1189 with weight 0 T404025', diff saved to https://phabricator.wikimedia.org/P82784 and previous config saved to /var/cache/conftool/dbconfig/20250908-224330-ladsgroup.json |
[production] |
22:42 |
<ladsgroup@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Primary switchover s3 T404025 |
[production] |
22:37 |
<ladsgroup@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P82783 and previous config saved to /var/cache/conftool/dbconfig/20250908-223742-ladsgroup.json |
[production] |
22:35 |
<ladsgroup@cumin1003> |
dbctl commit (dc=all): 'Depooling db1235 (T402925)', diff saved to https://phabricator.wikimedia.org/P82782 and previous config saved to /var/cache/conftool/dbconfig/20250908-223528-ladsgroup.json |
[production] |
22:35 |
<ladsgroup@cumin1003> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1235.eqiad.wmnet with reason: Maintenance |
[production] |
22:35 |
<ladsgroup@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db1234 (T402925)', diff saved to https://phabricator.wikimedia.org/P82781 and previous config saved to /var/cache/conftool/dbconfig/20250908-223504-ladsgroup.json |
[production] |
22:23 |
<andrew@cumin2002> |
START - Cookbook sre.hosts.reimage for host cloudcephmon1004.eqiad.wmnet with OS bullseye |
[production] |
22:22 |
<ladsgroup@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P82780 and previous config saved to /var/cache/conftool/dbconfig/20250908-222235-ladsgroup.json |
[production] |
22:19 |
<ladsgroup@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P82779 and previous config saved to /var/cache/conftool/dbconfig/20250908-221956-ladsgroup.json |
[production] |
22:07 |
<andrew@cumin2002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephmon1004.eqiad.wmnet with OS bullseye |
[production] |
22:07 |
<ladsgroup@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db2174 (T402925)', diff saved to https://phabricator.wikimedia.org/P82778 and previous config saved to /var/cache/conftool/dbconfig/20250908-220728-ladsgroup.json |
[production] |
22:04 |
<ladsgroup@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P82777 and previous config saved to /var/cache/conftool/dbconfig/20250908-220449-ladsgroup.json |
[production] |
21:58 |
<jhancock@cumin1002> |
END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART |
[production] |
21:57 |
<jhancock@cumin1002> |
START - Cookbook sre.hosts.provision for host dse-k8s-worker1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART |
[production] |
21:52 |
<jhancock@cumin1002> |
END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART |
[production] |
21:51 |
<jhancock@cumin1002> |
START - Cookbook sre.hosts.provision for host dse-k8s-worker1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART |
[production] |
21:49 |
<ladsgroup@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db1234 (T402925)', diff saved to https://phabricator.wikimedia.org/P82776 and previous config saved to /var/cache/conftool/dbconfig/20250908-214941-ladsgroup.json |
[production] |
21:46 |
<jhancock@cumin1002> |
END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART |
[production] |
21:45 |
<jhancock@cumin1002> |
START - Cookbook sre.hosts.provision for host dse-k8s-worker1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART |
[production] |
21:43 |
<jhancock@cumin1002> |
END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART |
[production] |
21:42 |
<jhancock@cumin1002> |
START - Cookbook sre.hosts.provision for host dse-k8s-worker1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART |
[production] |
21:38 |
<jhancock@cumin1002> |
END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART |
[production] |
21:38 |
<maryum> |
Deployed security fix for T403408 |
[production] |
21:37 |
<jhancock@cumin1002> |
START - Cookbook sre.hosts.provision for host dse-k8s-worker1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART |
[production] |
21:31 |
<ladsgroup@cumin1003> |
dbctl commit (dc=all): 'Depooling db2174 (T402925)', diff saved to https://phabricator.wikimedia.org/P82775 and previous config saved to /var/cache/conftool/dbconfig/20250908-213103-ladsgroup.json |
[production] |
21:30 |
<ladsgroup@cumin1003> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2174.codfw.wmnet with reason: Maintenance |
[production] |
21:30 |
<ladsgroup@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db2173 (T402925)', diff saved to https://phabricator.wikimedia.org/P82774 and previous config saved to /var/cache/conftool/dbconfig/20250908-213040-ladsgroup.json |
[production] |
21:26 |
<jclark@cumin1002> |
END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART |
[production] |
21:25 |
<jclark@cumin1002> |
START - Cookbook sre.hosts.provision for host dse-k8s-worker1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART |
[production] |
21:24 |
<jclark@cumin1002> |
END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART |
[production] |