2022-09-12
|
12:11 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1034 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34505 and previous config saved to /var/cache/conftool/dbconfig/20220912-121150-root.json |
[production] |
12:08 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1021 (re)pooling @ 3%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34504 and previous config saved to /var/cache/conftool/dbconfig/20220912-120818-root.json |
[production] |
12:08 |
<btullis@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host an-worker1096.eqiad.wmnet |
[production] |
12:07 |
<btullis@cumin1001> |
END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1146.eqiad.wmnet |
[production] |
12:07 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P34503 and previous config saved to /var/cache/conftool/dbconfig/20220912-120715-ladsgroup.json |
[production] |
11:56 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1034 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34502 and previous config saved to /var/cache/conftool/dbconfig/20220912-115645-root.json |
[production] |
11:53 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1021 (re)pooling @ 2%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34501 and previous config saved to /var/cache/conftool/dbconfig/20220912-115313-root.json |
[production] |
11:52 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P34500 and previous config saved to /var/cache/conftool/dbconfig/20220912-115208-ladsgroup.json |
[production] |
11:41 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1034 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34499 and previous config saved to /var/cache/conftool/dbconfig/20220912-114140-root.json |
[production] |
11:38 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1021 (re)pooling @ 1%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34498 and previous config saved to /var/cache/conftool/dbconfig/20220912-113808-root.json |
[production] |
11:37 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T314041)', diff saved to https://phabricator.wikimedia.org/P34497 and previous config saved to /var/cache/conftool/dbconfig/20220912-113702-ladsgroup.json |
[production] |
11:35 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: apply |
[production] |
11:33 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply |
[production] |
11:33 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply |
[production] |
11:30 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply |
[production] |
11:28 |
<marostegui@deploy1002> |
Synchronized wmf-config/db-production.php: Enable writes on es4 T317522 (duration: 03m 36s) |
[production] |
11:27 |
<bmansurov@deploy1002> |
Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 09s) |
[production] |
11:27 |
<bmansurov@deploy1002> |
Started deploy [airflow-dags/research@b9be20d]: (no justification provided) |
[production] |
11:26 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1034 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34496 and previous config saved to /var/cache/conftool/dbconfig/20220912-112635-root.json |
[production] |
11:25 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: apply |
[production] |
11:23 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool es1021 T317522', diff saved to https://phabricator.wikimedia.org/P34495 and previous config saved to /var/cache/conftool/dbconfig/20220912-112343-root.json |
[production] |
11:23 |
<btullis@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host an-worker1146.eqiad.wmnet |
[production] |
11:21 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply |
[production] |
11:21 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply |
[production] |
11:20 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Promote es1020 to es4 primary T317522', diff saved to https://phabricator.wikimedia.org/P34494 and previous config saved to /var/cache/conftool/dbconfig/20220912-112039-root.json |
[production] |
11:20 |
<marostegui> |
Starting es4 eqiad failover from es1021 to es1020 - T317522 |
[production] |
11:18 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply |
[production] |
11:18 |
<marostegui@deploy1002> |
Synchronized wmf-config/db-production.php: Disable writes on es4 T317522 (duration: 04m 10s) |
[production] |
11:16 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker[1143-1148].eqiad.wmnet |
[production] |
11:15 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-etcd1001.eqiad.wmnet |
[production] |
11:14 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1033 (re)pooling @ 100%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34493 and previous config saved to /var/cache/conftool/dbconfig/20220912-111442-root.json |
[production] |
11:14 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Set es1020 with weight 0 T317522', diff saved to https://phabricator.wikimedia.org/P34492 and previous config saved to /var/cache/conftool/dbconfig/20220912-111424-root.json |
[production] |
11:13 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es4 T317522 |
[production] |
11:13 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es4 T317522 |
[production] |
11:12 |
<btullis@cumin1001> |
START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker[1143-1148].eqiad.wmnet |
[production] |
11:11 |
<btullis@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host dse-k8s-etcd1001.eqiad.wmnet |
[production] |
11:11 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1034 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34491 and previous config saved to /var/cache/conftool/dbconfig/20220912-111130-root.json |
[production] |
11:10 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker1142.eqiad.wmnet |
[production] |
11:09 |
<btullis@cumin1001> |
START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1142.eqiad.wmnet |
[production] |
11:08 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1101.eqiad.wmnet |
[production] |
11:04 |
<moritzm> |
updated bullseye install image for 11.5 release T317416 |
[production] |
10:59 |
<btullis@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host an-worker1101.eqiad.wmnet |
[production] |
10:59 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1033 (re)pooling @ 75%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34490 and previous config saved to /var/cache/conftool/dbconfig/20220912-105937-root.json |
[production] |
10:59 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1100.eqiad.wmnet |
[production] |
10:58 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db2108 (T314041)', diff saved to https://phabricator.wikimedia.org/P34489 and previous config saved to /var/cache/conftool/dbconfig/20220912-105841-ladsgroup.json |
[production] |
10:58 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2108.codfw.wmnet with reason: Maintenance |
[production] |
10:58 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2108.codfw.wmnet with reason: Maintenance |
[production] |
10:56 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1034 (re)pooling @ 3%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34488 and previous config saved to /var/cache/conftool/dbconfig/20220912-105625-root.json |
[production] |
10:55 |
<topranks> |
re-pooling esams after successful upgrade of core router cr3-esams T295690
[production] |
10:50 |
<btullis@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host an-worker1100.eqiad.wmnet |
[production] |