|
2025-11-24
§
|
| 19:27 |
<bking@cumin2002> |
START - Cookbook sre.hosts.reimage for host wdqs1030.eqiad.wmnet with OS trixie |
[production] |
| 19:25 |
<cdobbins@cumin2002> |
END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P{lvs7003*} and A:liberica |
[production] |
| 19:12 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db2194 (T410531)', diff saved to https://phabricator.wikimedia.org/P85545 and previous config saved to /var/cache/conftool/dbconfig/20251124-191200-marostegui.json |
[production] |
| 18:58 |
<cdobbins@cumin2002> |
START - Cookbook sre.loadbalancer.admin rebooting P{lvs7003*} and A:liberica |
[production] |
| 18:50 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Depooling db2194 (T410531)', diff saved to https://phabricator.wikimedia.org/P85544 and previous config saved to /var/cache/conftool/dbconfig/20251124-185050-marostegui.json |
[production] |
| 18:50 |
<marostegui@cumin1003> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2194.codfw.wmnet with reason: Maintenance |
[production] |
| 18:50 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db2190 (T410531)', diff saved to https://phabricator.wikimedia.org/P85543 and previous config saved to /var/cache/conftool/dbconfig/20251124-185026-marostegui.json |
[production] |
| 18:41 |
<jclark@cumin1003> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003" |
[production] |
| 18:36 |
<swfrench-wmf> |
deleted EtcdReplicationDown silence. f75c71c9-62d3-449f-860a-9b5e4570717a - T405950 |
[production] |
| 18:36 |
<jclark@cumin1003> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003" |
[production] |
| 18:35 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P85542 and previous config saved to /var/cache/conftool/dbconfig/20251124-183518-marostegui.json |
[production] |
| 18:34 |
<swfrench-wmf> |
begin restarts of eqiad-associated confds, navtiming, requestctl - T405950 |
[production] |
| 18:32 |
<swfrench@deploy2002> |
Unlocked for deployment [ALL REPOSITORIES]: Hold deployments during etcd ToR switch migration - T405950 (duration: 08m 43s) |
[production] |
| 18:31 |
<swfrench-wmf> |
manually transferred etcd-mirror replication source back to conf1009 - T405950 |
[production] |
| 18:25 |
<robh@cumin2002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on conf1009.eqiad.wmnet with reason: C/D Migration |
[production] |
| 18:24 |
<jclark@cumin1003> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1031.eqiad.wmnet with reason: host reimage |
[production] |
| 18:23 |
<swfrench@deploy2002> |
Locking from deployment [ALL REPOSITORIES]: Hold deployments during etcd ToR switch migration - T405950 |
[production] |
| 18:21 |
<bking@cumin2002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1032.eqiad.wmnet with OS trixie |
[production] |
| 18:21 |
<swfrench-wmf> |
manually transferred etcd-mirror replication source to conf1008 - T405950 |
[production] |
| 18:20 |
<jclark@cumin1003> |
START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1031.eqiad.wmnet with reason: host reimage |
[production] |
| 18:20 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P85541 and previous config saved to /var/cache/conftool/dbconfig/20251124-182011-marostegui.json |
[production] |
| 18:19 |
<jclark@cumin1003> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1030.eqiad.wmnet with reason: host reimage |
[production] |
| 18:16 |
<swfrench-wmf> |
silenced EtcdReplicationDown. f75c71c9-62d3-449f-860a-9b5e4570717a - T405950 |
[production] |
| 18:11 |
<jclark@cumin1003> |
START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1030.eqiad.wmnet with reason: host reimage |
[production] |
| 18:09 |
<jclark@cumin1003> |
START - Cookbook sre.hosts.reimage for host wdqs1028.eqiad.wmnet with OS bookworm |
[production] |
| 18:05 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db2190 (T410531)', diff saved to https://phabricator.wikimedia.org/P85540 and previous config saved to /var/cache/conftool/dbconfig/20251124-180503-marostegui.json |
[production] |
| 18:05 |
<jclark@cumin1003> |
START - Cookbook sre.hosts.reimage for host wdqs1031.eqiad.wmnet with OS trixie |
[production] |
| 17:55 |
<jclark@cumin1003> |
START - Cookbook sre.hosts.reimage for host wdqs1030.eqiad.wmnet with OS trixie |
[production] |
| 17:51 |
<jclark@cumin1003> |
START - Cookbook sre.hosts.reimage for host wdqs1029.eqiad.wmnet with OS bookworm |
[production] |
| 17:45 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Depooling db2190 (T410531)', diff saved to https://phabricator.wikimedia.org/P85539 and previous config saved to /var/cache/conftool/dbconfig/20251124-174501-marostegui.json |
[production] |
| 17:44 |
<marostegui@cumin1003> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2190.codfw.wmnet with reason: Maintenance |
[production] |
| 17:44 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db2177 (T410531)', diff saved to https://phabricator.wikimedia.org/P85538 and previous config saved to /var/cache/conftool/dbconfig/20251124-174437-marostegui.json |
[production] |
| 17:29 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P85537 and previous config saved to /var/cache/conftool/dbconfig/20251124-172929-marostegui.json |
[production] |
| 17:27 |
<cgoubert@deploy2002> |
helmfile [staging] DONE helmfile.d/services/rest-gateway: apply |
[production] |
| 17:26 |
<cgoubert@deploy2002> |
helmfile [staging] START helmfile.d/services/rest-gateway: apply |
[production] |
| 17:24 |
<urbanecm@deploy2002> |
helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply |
[production] |
| 17:23 |
<urbanecm@deploy2002> |
helmfile [codfw] START helmfile.d/services/mw-experimental: apply |
[production] |
| 17:14 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P85536 and previous config saved to /var/cache/conftool/dbconfig/20251124-171418-marostegui.json |
[production] |
| 17:11 |
<bking@cumin2002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1032.eqiad.wmnet with OS trixie |
[production] |
| 17:10 |
<bking@cumin2002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1030.eqiad.wmnet with OS trixie |
[production] |
| 17:10 |
<bking@cumin2002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1029.eqiad.wmnet with OS trixie |
[production] |
| 17:09 |
<bking@cumin2002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1028.eqiad.wmnet with OS trixie |
[production] |
| 17:09 |
<btullis@cumin1003> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1019.eqiad.wmnet |
[production] |
| 17:03 |
<btullis@cumin1003> |
START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1019.eqiad.wmnet |
[production] |
| 17:02 |
<btullis@cumin1003> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1013.eqiad.wmnet |
[production] |
| 17:01 |
<bking@cumin2002> |
START - Cookbook sre.hosts.reimage for host wdqs1032.eqiad.wmnet with OS trixie |
[production] |
| 17:00 |
<bking@cumin2002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1031.eqiad.wmnet with OS trixie |
[production] |
| 16:59 |
<marostegui@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db2177 (T410531)', diff saved to https://phabricator.wikimedia.org/P85535 and previous config saved to /var/cache/conftool/dbconfig/20251124-165910-marostegui.json |
[production] |
| 16:56 |
<btullis@cumin1003> |
START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1013.eqiad.wmnet |
[production] |
| 16:43 |
<jdrewniak@deploy2002> |
Synchronized portals: Wikimedia Portals Update: [[gerrit:1210618| Bumping portals to master (T128546)]] (duration: 01m 59s) |
[production] |