2024-02-22
|
15:57 |
<hnowlan> |
depooling mw[1458,1467-1468,1483-1485,1494].eqiad.wmnet in advance of reimaging |
[production] |
15:56 |
<cmooney@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 25 hosts with reason: Migrating servers in codfw rack B2 to lsw1-b2-codfw |
[production] |
15:55 |
<cmooney@cumin1002> |
START - Cookbook sre.hosts.downtime for 0:30:00 on 25 hosts with reason: Migrating servers in codfw rack B2 to lsw1-b2-codfw |
[production] |
15:54 |
<mvernon@cumin2002> |
conftool action : set/pooled=false; selector: dnsdisc=swift,name=codfw |
[production] |
15:54 |
<Emperor> |
depool codfs-mw T355868 |
[production] |
15:53 |
<Emperor> |
depool thanos-fe2002 T355868 |
[production] |
15:51 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db2149 (re)pooling @ 50%: After recloning', diff saved to https://phabricator.wikimedia.org/P57736 and previous config saved to /var/cache/conftool/dbconfig/20240222-155141-root.json |
[production] |
15:50 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P57735 and previous config saved to /var/cache/conftool/dbconfig/20240222-155005-arnaudb.json |
[production] |
15:48 |
<cmooney@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on asw-b-codfw,cr[1-2]-codfw,lsw1-b2-codfw.mgmt with reason: prepping for server uplink migration codfw rack b2 |
[production] |
15:48 |
<cmooney@cumin1002> |
START - Cookbook sre.hosts.downtime for 1:00:00 on asw-b-codfw,cr[1-2]-codfw,lsw1-b2-codfw.mgmt with reason: prepping for server uplink migration codfw rack b2 |
[production] |
15:46 |
<sukhe@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on cp[2031-2032].codfw.wmnet with reason: T355868 |
[production] |
15:46 |
<sukhe@cumin2002> |
START - Cookbook sre.hosts.downtime for 3:00:00 on cp[2031-2032].codfw.wmnet with reason: T355868 |
[production] |
15:39 |
<aqu@deploy2002> |
Finished deploy [airflow-dags/analytics_test@b115452]: Deploy Refine job POC on test cluster (duration: 00m 16s) |
[production] |
15:39 |
<aqu@deploy2002> |
Started deploy [airflow-dags/analytics_test@b115452]: Deploy Refine job POC on test cluster |
[production] |
15:36 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db2149 (re)pooling @ 25%: After recloning', diff saved to https://phabricator.wikimedia.org/P57734 and previous config saved to /var/cache/conftool/dbconfig/20240222-153636-root.json |
[production] |
15:35 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1211', diff saved to https://phabricator.wikimedia.org/P57733 and previous config saved to /var/cache/conftool/dbconfig/20240222-153459-arnaudb.json |
[production] |
15:32 |
<cmooney@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage |
[production] |
15:27 |
<moritzm> |
installing glib2.0 security updates on bullseye |
[production] |
15:27 |
<cmooney@cumin1002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2002.codfw.wmnet with reason: host reimage |
[production] |
15:21 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db2149 (re)pooling @ 10%: After recloning', diff saved to https://phabricator.wikimedia.org/P57732 and previous config saved to /var/cache/conftool/dbconfig/20240222-152131-root.json |
[production] |
15:19 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1211 (T357189)', diff saved to https://phabricator.wikimedia.org/P57731 and previous config saved to /var/cache/conftool/dbconfig/20240222-151952-arnaudb.json |
[production] |
15:17 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Depooling db1211 (T357189)', diff saved to https://phabricator.wikimedia.org/P57730 and previous config saved to /var/cache/conftool/dbconfig/20240222-151733-arnaudb.json |
[production] |
15:17 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1211.eqiad.wmnet with reason: Maintenance |
[production] |
15:17 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1211.eqiad.wmnet with reason: Maintenance |
[production] |
15:17 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1203 (T357189)', diff saved to https://phabricator.wikimedia.org/P57729 and previous config saved to /var/cache/conftool/dbconfig/20240222-151701-arnaudb.json |
[production] |
15:15 |
<cmooney@cumin1002> |
START - Cookbook sre.hosts.reimage for host testvm2002.codfw.wmnet with OS bullseye |
[production] |
15:15 |
<akosiaris@cumin1002> |
conftool action : set/pooled=yes; selector: service=parsoid-php,name=kubernetes.* |
[production] |
15:15 |
<akosiaris> |
T357392 pool 46 kubernetes hosts of parsoid-php with a weight of 1. Since the 42 parse hosts are at weight 110, that means 1% goes to mw-parsoid deployment, aka mw-on-k8s |
[production] |
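The 1% figure in the entry above can be checked with a short calculation. This is a minimal sketch, assuming the load balancer splits requests in proportion to each pooled host's weight and that all hosts are pooled and healthy; the function name `traffic_share` is hypothetical and used only for illustration.

```python
def traffic_share(hosts_a: int, weight_a: int, hosts_b: int, weight_b: int) -> float:
    """Fraction of requests reaching group A, assuming weight-proportional balancing."""
    total_weight = hosts_a * weight_a + hosts_b * weight_b
    return hosts_a * weight_a / total_weight


# Figures taken from the log entry: 46 kubernetes hosts at weight 1,
# 42 existing hosts at weight 110.
share = traffic_share(46, 1, 42, 110)
print(f"{share:.2%}")  # 46 / 4666, just under 1%
```

Under those assumptions, 46 of 4666 total weight units go to the kubernetes hosts, which rounds to the 1% stated in the entry.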
15:13 |
<akosiaris@cumin1002> |
conftool action : set/weight=1; selector: service=parsoid-php,name=kubernetes.* |
[production] |
15:12 |
<akosiaris@cumin1002> |
conftool action : set/weight=110; selector: service=parsoid-php,name=(pars.*|mw.*) |
[production] |
15:12 |
<akosiaris> |
Bump weight of old parsoid hosts from 10 to 110. This is a noop right now but will make the calculations spelled out later in T357392 possible. |
[production] |
14:55 |
<akosiaris@deploy2002> |
helmfile [codfw] DONE helmfile.d/services/mw-parsoid: apply |
[production] |
14:55 |
<akosiaris@deploy2002> |
helmfile [codfw] START helmfile.d/services/mw-parsoid: apply |
[production] |
14:55 |
<akosiaris@deploy2002> |
helmfile [eqiad] DONE helmfile.d/services/mw-parsoid: apply |
[production] |
14:55 |
<akosiaris@deploy2002> |
helmfile [eqiad] START helmfile.d/services/mw-parsoid: apply |
[production] |
14:51 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db2149 (re)pooling @ 1%: After recloning', diff saved to https://phabricator.wikimedia.org/P57726 and previous config saved to /var/cache/conftool/dbconfig/20240222-145120-root.json |
[production] |
14:46 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P57725 and previous config saved to /var/cache/conftool/dbconfig/20240222-144648-arnaudb.json |
[production] |
14:45 |
<cgoubert@deploy2002> |
Finished scap: Backport for [[gerrit:1004135|Enable $wgLocalHTTPProxy on group1 wikis (T298265)]] (duration: 17m 46s) |
[production] |
14:44 |
<jclark@cumin1002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-redacteddb1001.eqiad.wmnet with OS bullseye |
[production] |
14:44 |
<jclark@cumin1002> |
START - Cookbook sre.hosts.reimage for host an-redacteddb1001.eqiad.wmnet with OS bullseye |
[production] |
14:37 |
<cgoubert@deploy2002> |
cgoubert: Continuing with sync |
[production] |
14:31 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1203 (T357189)', diff saved to https://phabricator.wikimedia.org/P57724 and previous config saved to /var/cache/conftool/dbconfig/20240222-143141-arnaudb.json |
[production] |
14:29 |
<cgoubert@deploy2002> |
cgoubert: Backport for [[gerrit:1004135|Enable $wgLocalHTTPProxy on group1 wikis (T298265)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
14:29 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Depooling db1203 (T357189)', diff saved to https://phabricator.wikimedia.org/P57723 and previous config saved to /var/cache/conftool/dbconfig/20240222-142921-arnaudb.json |
[production] |
14:29 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1203.eqiad.wmnet with reason: Maintenance |
[production] |
14:29 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1203.eqiad.wmnet with reason: Maintenance |
[production] |
14:29 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1193 (T357189)', diff saved to https://phabricator.wikimedia.org/P57722 and previous config saved to /var/cache/conftool/dbconfig/20240222-142859-arnaudb.json |
[production] |
14:28 |
<cgoubert@deploy2002> |
Started scap: Backport for [[gerrit:1004135|Enable $wgLocalHTTPProxy on group1 wikis (T298265)]] |
[production] |
14:15 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'es1028 (re)pooling @ 100%: After migration to 10.6', diff saved to https://phabricator.wikimedia.org/P57721 and previous config saved to /var/cache/conftool/dbconfig/20240222-141508-root.json |
[production] |
14:13 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P57720 and previous config saved to /var/cache/conftool/dbconfig/20240222-141353-arnaudb.json |
[production] |