2025-06-18
§
|
19:43 |
<jhancock@cumin1003> |
START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker2008.codfw.wmnet with reason: host reimage |
[production] |
19:43 |
<jhancock@cumin1003> |
START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker2007.codfw.wmnet with reason: host reimage |
[production] |
19:43 |
<jhancock@cumin1003> |
START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker2006.codfw.wmnet with reason: host reimage |
[production] |
19:32 |
<ryankemper> |
T393966 Ran puppet on `titan1001` following merge of https://gerrit.wikimedia.org/r/c/operations/puppet/+/1155335. Puppet looks happy and I see the new recording rules getting created |
[production] |
19:31 |
<jhancock@cumin1003> |
START - Cookbook sre.hosts.reimage for host aux-k8s-worker2009.codfw.wmnet with OS bookworm |
[production] |
19:31 |
<marostegui@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance |
[production] |
19:31 |
<jhancock@cumin1003> |
START - Cookbook sre.hosts.reimage for host aux-k8s-worker2008.codfw.wmnet with OS bookworm |
[production] |
19:31 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1251 (T396130)', diff saved to https://phabricator.wikimedia.org/P78387 and previous config saved to /var/cache/conftool/dbconfig/20250618-193101-marostegui.json |
[production] |
19:31 |
<jhancock@cumin1003> |
START - Cookbook sre.hosts.reimage for host aux-k8s-worker2007.codfw.wmnet with OS bookworm |
[production] |
19:30 |
<jhancock@cumin1003> |
START - Cookbook sre.hosts.reimage for host aux-k8s-worker2006.codfw.wmnet with OS bookworm |
[production] |
19:15 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P78386 and previous config saved to /var/cache/conftool/dbconfig/20250618-191553-marostegui.json |
[production] |
19:14 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Testing T395696', diff saved to https://phabricator.wikimedia.org/P78385 and previous config saved to /var/cache/conftool/dbconfig/20250618-191440-ladsgroup.json |
[production] |
19:09 |
<ladsgroup@deploy1003> |
Finished scap sync-world: Backport for [[gerrit:1160990|etcd: Check for array key (T395696)]] (duration: 12m 39s) |
[production] |
19:07 |
<ejegg> |
civicrm upgraded from 63302c18 to 670b3f6b |
[production] |
19:05 |
<cdobbins@cumin2002> |
START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade of ATS on A:cp-codfw and A:cp - 9.2.10 upgrade (T390912) |
[production] |
19:05 |
<ChrisDobbins901_> |
cdobbins@cumin2002:~$ sudo -i cookbook sre.cdn.roll-upgrade-ats --query 'A:cp-codfw' --task-id T390912 --reason '9.2.10 upgrade' |
[production] |
19:03 |
<ladsgroup@deploy1003> |
ladsgroup: Continuing with sync |
[production] |
19:00 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P78384 and previous config saved to /var/cache/conftool/dbconfig/20250618-190045-marostegui.json |
[production] |
19:00 |
<wfan> |
payments-wiki upgraded from aa102260 to f56db8e6 |
[production] |
18:59 |
<ladsgroup@deploy1003> |
ladsgroup: Backport for [[gerrit:1160990|etcd: Check for array key (T395696)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. |
[production] |
18:57 |
<ladsgroup@deploy1003> |
Started scap sync-world: Backport for [[gerrit:1160990|etcd: Check for array key (T395696)]] |
[production] |
18:56 |
<ryankemper@cumin1003> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on 6 hosts with reason: T395772 hosts not serving production traffic |
[production] |
18:55 |
<ladsgroup@deploy1003> |
Finished scap sync-world: Backport for [[gerrit:1152853|etcd: Remove ES clusters from "write clusters" if section is RO (T395696)]] (duration: 26m 55s) |
[production] |
18:49 |
<ladsgroup@deploy1003> |
ladsgroup: Continuing with sync |
[production] |
18:47 |
<brett@cumin2002> |
END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-eqiad and A:cp - 9.2.10 upgrade (T390912) |
[production] |
18:45 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1251 (T396130)', diff saved to https://phabricator.wikimedia.org/P78383 and previous config saved to /var/cache/conftool/dbconfig/20250618-184538-marostegui.json |
[production] |
18:43 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Testing T395696', diff saved to https://phabricator.wikimedia.org/P78382 and previous config saved to /var/cache/conftool/dbconfig/20250618-184325-ladsgroup.json |
[production] |
18:38 |
<cdobbins@cumin2002> |
END (PASS) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=0) Rolling upgrade of ATS on A:cp-eqsin and A:cp - 9.2.10 upgrade (T390912) |
[production] |
18:31 |
<ladsgroup@deploy1003> |
ladsgroup: Backport for [[gerrit:1152853|etcd: Remove ES clusters from "write clusters" if section is RO (T395696)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. |
[production] |
18:28 |
<ladsgroup@deploy1003> |
Started scap sync-world: Backport for [[gerrit:1152853|etcd: Remove ES clusters from "write clusters" if section is RO (T395696)]] |
[production] |
18:28 |
<jhancock@cumin1003> |
END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART |
[production] |
18:27 |
<jhancock@cumin1003> |
START - Cookbook sre.hosts.provision for host aux-k8s-worker2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART |
[production] |
18:27 |
<jhancock@cumin1003> |
END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART |
[production] |
18:26 |
<jhancock@cumin1003> |
END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART |
[production] |
18:26 |
<jhancock@cumin1003> |
END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART |
[production] |
18:26 |
<jhancock@cumin1003> |
START - Cookbook sre.hosts.provision for host aux-k8s-worker2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART |
[production] |
18:26 |
<jhancock@cumin1003> |
END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART |
[production] |
18:26 |
<jhancock@cumin1003> |
START - Cookbook sre.hosts.provision for host aux-k8s-worker2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART |
[production] |
18:25 |
<jhancock@cumin1003> |
START - Cookbook sre.hosts.provision for host aux-k8s-worker2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART |
[production] |
18:24 |
<jhancock@cumin1003> |
END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART |
[production] |
18:23 |
<jhancock@cumin1003> |
END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART |
[production] |
18:23 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depooling db1251 (T396130)', diff saved to https://phabricator.wikimedia.org/P78381 and previous config saved to /var/cache/conftool/dbconfig/20250618-182313-marostegui.json |
[production] |
18:23 |
<marostegui@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1251.eqiad.wmnet with reason: Maintenance |
[production] |
18:22 |
<jhancock@cumin1003> |
END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2006.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART |
[production] |
18:20 |
<jgleeson> |
civicrm rolled back from 10eac2f8 to 63302c18 |
[production] |
18:18 |
<jhancock@cumin1003> |
START - Cookbook sre.hosts.provision for host aux-k8s-worker2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART |
[production] |
18:17 |
<jhancock@cumin1003> |
START - Cookbook sre.hosts.provision for host aux-k8s-worker2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART |
[production] |
18:17 |
<jhancock@cumin1003> |
END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker2009.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART |
[production] |
18:17 |
<jhancock@cumin1003> |
START - Cookbook sre.hosts.provision for host aux-k8s-worker2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART |
[production] |
18:17 |
<jhancock@cumin1003> |
END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker2008.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART |
[production] |