2023-04-17
§
|
15:42 |
<jhancock@cumin2002> |
START - Cookbook sre.dns.netbox |
[production] |
15:41 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1119 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P47018 and previous config saved to /var/cache/conftool/dbconfig/20230417-154149-root.json |
[production] |
15:34 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2104 (T333332)', diff saved to https://phabricator.wikimedia.org/P47017 and previous config saved to /var/cache/conftool/dbconfig/20230417-153412-ladsgroup.json |
[production] |
15:31 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db2104 (T333332)', diff saved to https://phabricator.wikimedia.org/P47016 and previous config saved to /var/cache/conftool/dbconfig/20230417-153134-ladsgroup.json |
[production] |
15:31 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2104.codfw.wmnet with reason: Maintenance |
[production] |
15:31 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 8:00:00 on db2104.codfw.wmnet with reason: Maintenance |
[production] |
15:30 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2097.codfw.wmnet with reason: Maintenance |
[production] |
15:30 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 8:00:00 on db2097.codfw.wmnet with reason: Maintenance |
[production] |
15:30 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance |
[production] |
15:29 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 8:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance |
[production] |
15:29 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1225.eqiad.wmnet with reason: Maintenance |
[production] |
15:29 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 8:00:00 on db1225.eqiad.wmnet with reason: Maintenance |
[production] |
15:29 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1222 (T333332)', diff saved to https://phabricator.wikimedia.org/P47015 and previous config saved to /var/cache/conftool/dbconfig/20230417-152916-ladsgroup.json |
[production] |
15:27 |
<urbanecm@deploy2002> |
Finished scap: Expose the sfsblock-bypass right so it can be assigned to global groups (T334856; second try) (duration: 06m 22s) |
[production] |
15:26 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1119 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P47014 and previous config saved to /var/cache/conftool/dbconfig/20230417-152644-root.json |
[production] |
15:21 |
<urbanecm@deploy2002> |
Started scap: Expose the sfsblock-bypass right so it can be assigned to global groups (T334856; second try) |
[production] |
15:20 |
<urbanecm@deploy2002> |
Unlocked for deployment [ALL REPOSITORIES]: LVS Maint - Outage (duration: 23m 03s) |
[production] |
15:18 |
<sukhe> |
run authdns-update and repool eqiad |
[production] |
15:14 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P47013 and previous config saved to /var/cache/conftool/dbconfig/20230417-151409-ladsgroup.json |
[production] |
15:11 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1119 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P47012 and previous config saved to /var/cache/conftool/dbconfig/20230417-151138-root.json |
[production] |
15:09 |
<sukhe@cumin2002> |
END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host lvs1020 |
[production] |
15:09 |
<sukhe@cumin2002> |
START - Cookbook sre.network.configure-switch-interfaces for host lvs1020 |
[production] |
15:07 |
<sukhe@cumin2002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host lvs1020.eqiad.wmnet with OS bullseye |
[production] |
15:07 |
<vgutierrez> |
rolling restart of HAProxy in the text cluster - T334448 |
[production] |
14:59 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P47011 and previous config saved to /var/cache/conftool/dbconfig/20230417-145902-ladsgroup.json |
[production] |
14:57 |
<urbanecm@deploy2002> |
Locking from deployment [ALL REPOSITORIES]: LVS Maint - Outage |
[production] |
14:57 |
<urbanecm@deploy2002> |
Unlocked for deployment [ALL REPOSITORIES]: LVS Maint - Outage (duration: 00m 01s) |
[production] |
14:57 |
<urbanecm@deploy2002> |
Locking from deployment [ALL REPOSITORIES]: LVS Maint - Outage |
[production] |
14:56 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1119 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P47010 and previous config saved to /var/cache/conftool/dbconfig/20230417-145633-root.json |
[production] |
14:55 |
<claime> |
repooled mw1375.eqiad.wmnet |
[production] |
14:54 |
<claime> |
depooling mw1375.eqiad.wmnet |
[production] |
14:53 |
<ladsgroup@deploy2002> |
Unlocked for deployment [ALL REPOSITORIES]: LVS Maint - Outage (T334703) (duration: 13m 39s) |
[production] |
14:43 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1222 (T333332)', diff saved to https://phabricator.wikimedia.org/P47009 and previous config saved to /var/cache/conftool/dbconfig/20230417-144356-ladsgroup.json |
[production] |
14:41 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1222 (T333332)', diff saved to https://phabricator.wikimedia.org/P47008 and previous config saved to /var/cache/conftool/dbconfig/20230417-144133-ladsgroup.json |
[production] |
14:41 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1119 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P47007 and previous config saved to /var/cache/conftool/dbconfig/20230417-144128-root.json |
[production] |
14:41 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1222.eqiad.wmnet with reason: Maintenance |
[production] |
14:41 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 8:00:00 on db1222.eqiad.wmnet with reason: Maintenance |
[production] |
14:41 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1197 (T333332)', diff saved to https://phabricator.wikimedia.org/P47006 and previous config saved to /var/cache/conftool/dbconfig/20230417-144109-ladsgroup.json |
[production] |
14:40 |
<ladsgroup@deploy2002> |
Locking from deployment [ALL REPOSITORIES]: LVS Maint - Outage (T334703) |
[production] |
14:31 |
<cgoubert@cumin1001> |
conftool action : set/pooled=yes; selector: dc=eqiad,cluster=parsoid |
[production] |
14:31 |
<claime> |
repooling parsoid in eqiad |
[production] |
14:31 |
<cgoubert@cumin1001> |
conftool action : set/pooled=yes; selector: dc=eqiad,cluster=appserver |
[production] |
14:31 |
<claime> |
repooling appserver in eqiad |
[production] |
14:30 |
<cgoubert@cumin1001> |
conftool action : set/pooled=yes; selector: dc=eqiad,cluster=api_appserver |
[production] |
14:30 |
<claime> |
repooling api_appserver in eqiad |
[production] |
14:30 |
<sukhe> |
running auth-dns update to depool eqiad |
[production] |
14:26 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1119 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P47005 and previous config saved to /var/cache/conftool/dbconfig/20230417-142623-root.json |
[production] |
14:26 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P47004 and previous config saved to /var/cache/conftool/dbconfig/20230417-142603-ladsgroup.json |
[production] |
14:25 |
<urbanecm@deploy2002> |
Finished scap: Backport for [[gerrit:909267|Expose the 'sfsblock-bypass' right so it can be assigned to global groups (T334856)]] (duration: 07m 36s) |
[production] |
14:24 |
<sukhe@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1020.eqiad.wmnet with reason: host reimage |
[production] |