2022-09-12
|
08:47 |
<btullis@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host an-worker1097.eqiad.wmnet |
[production] |
08:45 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1096.eqiad.wmnet |
[production] |
08:39 |
<cmooney@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr3-esams,cr3-esams IPv6,re0.cr3-esams.mgmt with reason: router upgrade |
[production] |
08:39 |
<cmooney@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on cr3-esams,cr3-esams IPv6,re0.cr3-esams.mgmt with reason: router upgrade |
[production] |
08:38 |
<btullis@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host an-worker1096.eqiad.wmnet |
[production] |
08:37 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es2021 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34465 and previous config saved to /var/cache/conftool/dbconfig/20220912-083729-root.json |
[production] |
08:36 |
<cmooney@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cr3-esams,cr3-esams IPv6,cr3-esams.mgmt with reason: router upgrade |
[production] |
08:36 |
<cmooney@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on cr3-esams,cr3-esams IPv6,cr3-esams.mgmt with reason: router upgrade |
[production] |
08:33 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es1032 (re)pooling @ 1%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34464 and previous config saved to /var/cache/conftool/dbconfig/20220912-083308-root.json |
[production] |
08:32 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es2023 (re)pooling @ 10%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34463 and previous config saved to /var/cache/conftool/dbconfig/20220912-083258-root.json |
[production] |
08:22 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es2021 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34462 and previous config saved to /var/cache/conftool/dbconfig/20220912-082224-root.json |
[production] |
08:19 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool es1032', diff saved to https://phabricator.wikimedia.org/P34461 and previous config saved to /var/cache/conftool/dbconfig/20220912-081936-root.json |
[production] |
08:17 |
<moritzm> |
imported jenkins 2.361.1 to thirdparty/ci T317418 |
[production] |
08:17 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es2023 (re)pooling @ 5%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34460 and previous config saved to /var/cache/conftool/dbconfig/20220912-081754-root.json |
[production] |
08:09 |
<cmooney@cumin1001> |
END (PASS) - Cookbook sre.network.cf (exit_code=0) |
[production] |
08:08 |
<cmooney@cumin1001> |
START - Cookbook sre.network.cf |
[production] |
08:07 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es2021 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34459 and previous config saved to /var/cache/conftool/dbconfig/20220912-080719-root.json |
[production] |
08:06 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool es2023 T317508', diff saved to https://phabricator.wikimedia.org/P34458 and previous config saved to /var/cache/conftool/dbconfig/20220912-080602-root.json |
[production] |
08:04 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Promote es2024 to es5 codfw primary T317508', diff saved to https://phabricator.wikimedia.org/P34457 and previous config saved to /var/cache/conftool/dbconfig/20220912-080400-root.json |
[production] |
08:03 |
<marostegui> |
Starting es5 codfw failover from es2023 to es2024 - T317508 |
[production] |
08:01 |
<cmooney@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr3-esams with reason: router upgrade |
[production] |
08:01 |
<elukey> |
restart kafka on kafka2001 to pick up new PKI settings |
[production] |
08:01 |
<cmooney@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on cr3-esams with reason: router upgrade |
[production] |
08:00 |
<hashar> |
Restarting CI Jenkins for upgrade T317418 |
[production] |
08:00 |
<topranks> |
de-pooling esams in advance of upgrade to core router cr3-esams T295690 |
[production] |
07:58 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on kafka-logging2001.codfw.wmnet with reason: Kafka PKI upgrade |
[production] |
07:57 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 0:20:00 on kafka-logging2001.codfw.wmnet with reason: Kafka PKI upgrade |
[production] |
07:57 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Set es2024 with weight 0 T317508', diff saved to https://phabricator.wikimedia.org/P34456 and previous config saved to /var/cache/conftool/dbconfig/20220912-075739-root.json |
[production] |
07:56 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es5 T317508 |
[production] |
07:56 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es5 T317508 |
[production] |
07:47 |
<hashar> |
Upgraded Jenkins instances from 2.346.1 to 2.346.3 # T317418 |
[production] |
07:42 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool es2021 T317507', diff saved to https://phabricator.wikimedia.org/P34455 and previous config saved to /var/cache/conftool/dbconfig/20220912-074258-root.json |
[production] |
07:41 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Promote es2020 to es4 primary and set section read-write T317507', diff saved to https://phabricator.wikimedia.org/P34454 and previous config saved to /var/cache/conftool/dbconfig/20220912-074100-root.json |
[production] |
07:39 |
<marostegui> |
Starting es4 codfw failover from es2021 to es2020 - T317507 |
[production] |
07:34 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Set es2020 with weight 0 T317507', diff saved to https://phabricator.wikimedia.org/P34453 and previous config saved to /var/cache/conftool/dbconfig/20220912-073408-root.json |
[production] |
07:33 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es4 T317507 |
[production] |
07:33 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 1:00:00 on 6 hosts with reason: Primary switchover es4 T317507 |
[production] |
07:31 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance |
[production] |
07:31 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2098.codfw.wmnet with reason: Maintenance |
[production] |
07:31 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mwdebug: apply |
[production] |
07:29 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db2174 (T312863)', diff saved to https://phabricator.wikimedia.org/P34452 and previous config saved to /var/cache/conftool/dbconfig/20220912-072931-ladsgroup.json |
[production] |
07:29 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2174.codfw.wmnet with reason: Maintenance |
[production] |
07:29 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2174.codfw.wmnet with reason: Maintenance |
[production] |
07:29 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2173 (T312863)', diff saved to https://phabricator.wikimedia.org/P34450 and previous config saved to /var/cache/conftool/dbconfig/20220912-072909-ladsgroup.json |
[production] |
07:27 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] START helmfile.d/services/mwdebug: apply |
[production] |
07:27 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply |
[production] |
07:23 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mwdebug: apply |
[production] |
07:18 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'es2024 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34449 and previous config saved to /var/cache/conftool/dbconfig/20220912-071829-root.json |
[production] |
07:17 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1098:3317 (T314041)', diff saved to https://phabricator.wikimedia.org/P34448 and previous config saved to /var/cache/conftool/dbconfig/20220912-071700-ladsgroup.json |
[production] |
07:16 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1098.eqiad.wmnet with reason: Maintenance |
[production] |