2024-05-08
|
09:41 |
<btullis@cumin1002> |
START - Cookbook sre.hosts.reimage for host snapshot1011.eqiad.wmnet with OS bullseye |
[production] |
09:33 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1192 (T361627)', diff saved to https://phabricator.wikimedia.org/P62046 and previous config saved to /var/cache/conftool/dbconfig/20240508-093350-marostegui.json |
[production] |
09:33 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'es2023 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62045 and previous config saved to /var/cache/conftool/dbconfig/20240508-093347-root.json |
[production] |
09:29 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db1178 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62044 and previous config saved to /var/cache/conftool/dbconfig/20240508-092944-root.json |
[production] |
09:28 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1177.eqiad.wmnet with reason: host reimage |
[production] |
09:25 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es1022.eqiad.wmnet with reason: host reimage |
[production] |
09:23 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db1177.eqiad.wmnet with reason: host reimage |
[production] |
09:22 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on es1022.eqiad.wmnet with reason: host reimage |
[production] |
09:18 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'es2023 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62043 and previous config saved to /var/cache/conftool/dbconfig/20240508-091841-root.json |
[production] |
09:14 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db1178 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62042 and previous config saved to /var/cache/conftool/dbconfig/20240508-091434-root.json |
[production] |
09:10 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.reimage for host db1177.eqiad.wmnet with OS bookworm |
[production] |
09:09 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depool db1177 T363792', diff saved to https://phabricator.wikimedia.org/P62041 and previous config saved to /var/cache/conftool/dbconfig/20240508-090925-root.json |
[production] |
09:08 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depooling db1192 (T361627)', diff saved to https://phabricator.wikimedia.org/P62040 and previous config saved to /var/cache/conftool/dbconfig/20240508-090817-marostegui.json |
[production] |
09:08 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1192.eqiad.wmnet with reason: Maintenance |
[production] |
09:07 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 4:00:00 on db1192.eqiad.wmnet with reason: Maintenance |
[production] |
09:07 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1177 (T361627)', diff saved to https://phabricator.wikimedia.org/P62039 and previous config saved to /var/cache/conftool/dbconfig/20240508-090754-marostegui.json |
[production] |
09:07 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.reimage for host es1022.eqiad.wmnet with OS bookworm |
[production] |
09:06 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depool es1022 T364289', diff saved to https://phabricator.wikimedia.org/P62038 and previous config saved to /var/cache/conftool/dbconfig/20240508-090621-root.json |
[production] |
09:03 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'es2023 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P62037 and previous config saved to /var/cache/conftool/dbconfig/20240508-090334-root.json |
[production] |
08:59 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db1178 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P62036 and previous config saved to /var/cache/conftool/dbconfig/20240508-085929-root.json |
[production] |
08:58 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host es2023.codfw.wmnet with OS bookworm |
[production] |
08:52 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P62035 and previous config saved to /var/cache/conftool/dbconfig/20240508-085246-marostegui.json |
[production] |
08:44 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db1178 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P62034 and previous config saved to /var/cache/conftool/dbconfig/20240508-084422-root.json |
[production] |
08:37 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P62033 and previous config saved to /var/cache/conftool/dbconfig/20240508-083739-marostegui.json |
[production] |
08:36 |
<jmm@cumin2002> |
START - Cookbook sre.puppet.migrate-host for host db2208.codfw.wmnet |
[production] |
08:35 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2182.codfw.wmnet |
[production] |
08:35 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on es2023.codfw.wmnet with reason: host reimage |
[production] |
08:32 |
<klausman@deploy1002> |
helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'. |
[production] |
08:31 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on es2023.codfw.wmnet with reason: host reimage |
[production] |
08:31 |
<klausman@deploy1002> |
helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'. |
[production] |
08:29 |
<ladsgroup@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance |
[production] |
08:29 |
<ladsgroup@cumin1002> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance |
[production] |
08:29 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db1178 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P62032 and previous config saved to /var/cache/conftool/dbconfig/20240508-082917-root.json |
[production] |
08:24 |
<klausman@deploy1002> |
helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'. |
[production] |
08:23 |
<klausman@deploy1002> |
helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'. |
[production] |
08:22 |
<klausman@deploy1002> |
helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'. |
[production] |
08:22 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1177 (T361627)', diff saved to https://phabricator.wikimedia.org/P62031 and previous config saved to /var/cache/conftool/dbconfig/20240508-082231-marostegui.json |
[production] |
08:22 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'es2022 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62030 and previous config saved to /var/cache/conftool/dbconfig/20240508-082202-root.json |
[production] |
08:21 |
<klausman@deploy1002> |
helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'. |
[production] |
08:21 |
<jmm@cumin2002> |
START - Cookbook sre.puppet.migrate-host for host db2182.codfw.wmnet |
[production] |
08:20 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db2168.codfw.wmnet |
[production] |
08:16 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'es2025 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P62029 and previous config saved to /var/cache/conftool/dbconfig/20240508-081633-root.json |
[production] |
08:14 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db1178 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P62028 and previous config saved to /var/cache/conftool/dbconfig/20240508-081412-root.json |
[production] |
08:12 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.reimage for host es2023.codfw.wmnet with OS bookworm |
[production] |
08:08 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Give some weight to es5 master', diff saved to https://phabricator.wikimedia.org/P62027 and previous config saved to /var/cache/conftool/dbconfig/20240508-080848-marostegui.json |
[production] |
08:08 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depool es2023 T364443', diff saved to https://phabricator.wikimedia.org/P62026 and previous config saved to /var/cache/conftool/dbconfig/20240508-080812-root.json |
[production] |
08:06 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'es2022 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P62025 and previous config saved to /var/cache/conftool/dbconfig/20240508-080656-root.json |
[production] |
08:06 |
<marostegui> |
Starting es5 codfw failover from es2023 to es2024 - T364443 |
[production] |
08:03 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Set es2024 with weight 0 T364443', diff saved to https://phabricator.wikimedia.org/P62024 and previous config saved to /var/cache/conftool/dbconfig/20240508-080312-root.json |
[production] |
08:03 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: Primary switchover es5 T364443 |
[production] |