2023-04-17
|
15:26 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1119 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P47014 and previous config saved to /var/cache/conftool/dbconfig/20230417-152644-root.json |
[production] |
15:21 |
<urbanecm@deploy2002> |
Started scap: Expose the sfsblock-bypass right so it can be assigned to global groups (T334856; second try) |
[production] |
15:20 |
<urbanecm@deploy2002> |
Unlocked for deployment [ALL REPOSITORIES]: LVS Maint - Outage (duration: 23m 03s) |
[production] |
15:18 |
<sukhe> |
run authdns-update and repool eqiad |
[production] |
15:14 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P47013 and previous config saved to /var/cache/conftool/dbconfig/20230417-151409-ladsgroup.json |
[production] |
15:11 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1119 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P47012 and previous config saved to /var/cache/conftool/dbconfig/20230417-151138-root.json |
[production] |
15:09 |
<sukhe@cumin2002> |
END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host lvs1020 |
[production] |
15:09 |
<sukhe@cumin2002> |
START - Cookbook sre.network.configure-switch-interfaces for host lvs1020 |
[production] |
15:07 |
<sukhe@cumin2002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host lvs1020.eqiad.wmnet with OS bullseye |
[production] |
15:07 |
<vgutierrez> |
rolling restart of HAProxy in the text cluster - T334448 |
[production] |
14:59 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1222', diff saved to https://phabricator.wikimedia.org/P47011 and previous config saved to /var/cache/conftool/dbconfig/20230417-145902-ladsgroup.json |
[production] |
14:57 |
<urbanecm@deploy2002> |
Locking from deployment [ALL REPOSITORIES]: LVS Maint - Outage |
[production] |
14:57 |
<urbanecm@deploy2002> |
Unlocked for deployment [ALL REPOSITORIES]: LVS Maint - Outage (duration: 00m 01s) |
[production] |
14:57 |
<urbanecm@deploy2002> |
Locking from deployment [ALL REPOSITORIES]: LVS Maint - Outage |
[production] |
14:56 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1119 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P47010 and previous config saved to /var/cache/conftool/dbconfig/20230417-145633-root.json |
[production] |
14:55 |
<claime> |
repooled mw1375.eqiad.wmnet |
[production] |
14:54 |
<claime> |
depooling mw1375.eqiad.wmnet |
[production] |
14:53 |
<ladsgroup@deploy2002> |
Unlocked for deployment [ALL REPOSITORIES]: LVS Maint - Outage (T334703) (duration: 13m 39s) |
[production] |
14:43 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1222 (T333332)', diff saved to https://phabricator.wikimedia.org/P47009 and previous config saved to /var/cache/conftool/dbconfig/20230417-144356-ladsgroup.json |
[production] |
14:41 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1222 (T333332)', diff saved to https://phabricator.wikimedia.org/P47008 and previous config saved to /var/cache/conftool/dbconfig/20230417-144133-ladsgroup.json |
[production] |
14:41 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1119 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P47007 and previous config saved to /var/cache/conftool/dbconfig/20230417-144128-root.json |
[production] |
14:41 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1222.eqiad.wmnet with reason: Maintenance |
[production] |
14:41 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 8:00:00 on db1222.eqiad.wmnet with reason: Maintenance |
[production] |
14:41 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1197 (T333332)', diff saved to https://phabricator.wikimedia.org/P47006 and previous config saved to /var/cache/conftool/dbconfig/20230417-144109-ladsgroup.json |
[production] |
14:40 |
<ladsgroup@deploy2002> |
Locking from deployment [ALL REPOSITORIES]: LVS Maint - Outage (T334703) |
[production] |
14:31 |
<cgoubert@cumin1001> |
conftool action : set/pooled=yes; selector: dc=eqiad,cluster=parsoid |
[production] |
14:31 |
<claime> |
repooling parsoid in eqiad |
[production] |
14:31 |
<cgoubert@cumin1001> |
conftool action : set/pooled=yes; selector: dc=eqiad,cluster=appserver |
[production] |
14:31 |
<claime> |
repooling appserver in eqiad |
[production] |
14:30 |
<cgoubert@cumin1001> |
conftool action : set/pooled=yes; selector: dc=eqiad,cluster=api_appserver |
[production] |
14:30 |
<claime> |
repooling api_appserver in eqiad |
[production] |
14:30 |
<sukhe> |
running authdns-update to depool eqiad |
[production] |
14:26 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1119 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P47005 and previous config saved to /var/cache/conftool/dbconfig/20230417-142623-root.json |
[production] |
14:26 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P47004 and previous config saved to /var/cache/conftool/dbconfig/20230417-142603-ladsgroup.json |
[production] |
14:25 |
<urbanecm@deploy2002> |
Finished scap: Backport for [[gerrit:909267|Expose the 'sfsblock-bypass' right so it can be assigned to global groups (T334856)]] (duration: 07m 36s) |
[production] |
14:24 |
<sukhe@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1020.eqiad.wmnet with reason: host reimage |
[production] |
14:21 |
<sukhe@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on lvs1020.eqiad.wmnet with reason: host reimage |
[production] |
14:19 |
<urbanecm@deploy2002> |
urbanecm and maurelio: Backport for [[gerrit:909267|Expose the 'sfsblock-bypass' right so it can be assigned to global groups (T334856)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet |
[production] |
14:17 |
<urbanecm@deploy2002> |
Started scap: Backport for [[gerrit:909267|Expose the 'sfsblock-bypass' right so it can be assigned to global groups (T334856)]] |
[production] |
14:14 |
<elukey> |
upload amd-k8s-device-plugin deb (1.25.2.3-1) to bullseye-wikimedia - T333009 |
[production] |
14:12 |
<claime> |
Migrated linkrecommendation to mw-api-int - T334060 |
[production] |
14:10 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P47003 and previous config saved to /var/cache/conftool/dbconfig/20230417-141056-ladsgroup.json |
[production] |
14:10 |
<cgoubert@deploy2002> |
helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply |
[production] |
14:09 |
<cgoubert@deploy2002> |
helmfile [codfw] START helmfile.d/services/linkrecommendation: apply |
[production] |
14:08 |
<cgoubert@deploy2002> |
helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply |
[production] |
14:07 |
<sukhe@cumin2002> |
START - Cookbook sre.hosts.reimage for host lvs1020.eqiad.wmnet with OS bullseye |
[production] |
14:07 |
<claime> |
Migrating linkrecommendation to mw-api-int - T334060 |
[production] |
14:06 |
<cgoubert@deploy2002> |
helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply |
[production] |
13:55 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1197 (T333332)', diff saved to https://phabricator.wikimedia.org/P47002 and previous config saved to /var/cache/conftool/dbconfig/20230417-135550-ladsgroup.json |
[production] |
13:53 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1197 (T333332)', diff saved to https://phabricator.wikimedia.org/P47001 and previous config saved to /var/cache/conftool/dbconfig/20230417-135334-ladsgroup.json |
[production] |