2024-07-02
|
11:50 |
<cgoubert@cumin1002> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2307 to wikikube-worker2030 - cgoubert@cumin1002" |
[production] |
11:50 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2174.codfw.wmnet with reason: Maintenance |
[production] |
11:50 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2174.codfw.wmnet with reason: Maintenance |
[production] |
11:50 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2173 (T364069)', diff saved to https://phabricator.wikimedia.org/P65657 and previous config saved to /var/cache/conftool/dbconfig/20240702-115003-marostegui.json |
[production] |
11:48 |
<root@cumin1002> |
START - Cookbook sre.hosts.reimage for host cloudcephosd1008.eqiad.wmnet with OS bullseye |
[production] |
11:46 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db2165 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P65656 and previous config saved to /var/cache/conftool/dbconfig/20240702-114627-root.json |
[production] |
11:44 |
<root@cumin1002> |
END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcephosd1008.eqiad.wmnet with OS bullseye |
[production] |
11:43 |
<cgoubert@cumin1002> |
START - Cookbook sre.dns.netbox |
[production] |
11:43 |
<cgoubert@cumin1002> |
START - Cookbook sre.hosts.rename from mw2307 to wikikube-worker2030 |
[production] |
11:37 |
<brouberol@cumin1002> |
START - Cookbook sre.druid.roll-restart-workers for Druid analytics cluster: Roll restart of Druid jvm daemons. |
[production] |
11:36 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: Long schema change |
[production] |
11:36 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: Long schema change |
[production] |
11:34 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P65655 and previous config saved to /var/cache/conftool/dbconfig/20240702-113457-marostegui.json |
[production] |
11:31 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db2165 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P65654 and previous config saved to /var/cache/conftool/dbconfig/20240702-113122-root.json |
[production] |
11:27 |
<brouberol@cumin1002> |
END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid public cluster: Roll restart of Druid jvm daemons. |
[production] |
11:26 |
<btullis@cumin1002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host eventlog1003.eqiad.wmnet |
[production] |
11:26 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depool db2129 T369021', diff saved to https://phabricator.wikimedia.org/P65653 and previous config saved to /var/cache/conftool/dbconfig/20240702-112616-root.json |
[production] |
11:25 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Promote db2214 to s6 primary T369021', diff saved to https://phabricator.wikimedia.org/P65652 and previous config saved to /var/cache/conftool/dbconfig/20240702-112518-marostegui.json |
[production] |
11:24 |
<marostegui> |
Starting s6 codfw failover from db2129 to db2214 - T369021 |
[production] |
11:24 |
<jayme> |
switched wikikube production clusters from PSP to PSS for restricted namespaces - T273507 |
[production] |
11:23 |
<jayme@deploy1002> |
helmfile [eqiad] DONE helmfile.d/admin 'apply'. |
[production] |
11:22 |
<btullis@cumin1002> |
START - Cookbook sre.hosts.reboot-single for host eventlog1003.eqiad.wmnet |
[production] |
11:22 |
<jayme@deploy1002> |
helmfile [eqiad] START helmfile.d/admin 'apply'. |
[production] |
11:22 |
<fabfur@cumin1002> |
START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqiad |
[production] |
11:22 |
<fabfur@cumin1002> |
START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqiad |
[production] |
11:21 |
<jayme@cumin1002> |
START - Cookbook sre.hosts.reboot-single for host kubernetes1051.eqiad.wmnet |
[production] |
11:21 |
<jayme@deploy1002> |
helmfile [codfw] DONE helmfile.d/admin 'apply'. |
[production] |
11:21 |
<claime> |
Uncordoning wikikube-ctrl2001.codfw.wmnet and wikikube-ctrl2002.codfw.wmnet |
[production] |
11:20 |
<jayme@deploy1002> |
helmfile [codfw] START helmfile.d/admin 'apply'. |
[production] |
11:19 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P65651 and previous config saved to /var/cache/conftool/dbconfig/20240702-111949-marostegui.json |
[production] |
11:17 |
<root@cumin1002> |
START - Cookbook sre.hosts.reimage for host cloudcephosd1008.eqiad.wmnet with OS bullseye |
[production] |
11:16 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db2165 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P65650 and previous config saved to /var/cache/conftool/dbconfig/20240702-111616-root.json |
[production] |
11:14 |
<fabfur@cumin1002> |
END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqiad |
[production] |
11:12 |
<cgoubert@cumin1002> |
conftool action : set/weight=10:pooled=yes; selector: name=(wikikube-worker2025.codfw.wmnet|wikikube-worker2026.codfw.wmnet|wikikube-worker2027.codfw.wmnet|wikikube-worker2028.codfw.wmnet|wikikube-worker2029.codfw.wmnet),cluster=kubernetes,service=kubesvc |
[production] |
11:12 |
<fabfur@cumin1002> |
END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqiad |
[production] |
11:12 |
<claime> |
pooling and uncordoning wikikube-worker2025.codfw.wmnet|wikikube-worker2026.codfw.wmnet|wikikube-worker2027.codfw.wmnet|wikikube-worker2028.codfw.wmnet|wikikube-worker2029.codfw.wmnet - T351074 |
[production] |
11:11 |
<jiji@cumin1002> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts kubemaster[2001-2002].codfw.wmnet |
[production] |
11:11 |
<jiji@cumin1002> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
11:11 |
<jiji@cumin1002> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kubemaster[2001-2002].codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1002" |
[production] |
11:07 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s6 T369021 |
[production] |
11:07 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Set db2214 with weight 0 T369021', diff saved to https://phabricator.wikimedia.org/P65649 and previous config saved to /var/cache/conftool/dbconfig/20240702-110750-root.json |
[production] |
11:07 |
<jiji@cumin1002> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: kubemaster[2001-2002].codfw.wmnet decommissioned, removing all IPs except the asset tag one - jiji@cumin1002" |
[production] |
11:07 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 1:00:00 on 27 hosts with reason: Primary switchover s6 T369021 |
[production] |
11:04 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2173 (T364069)', diff saved to https://phabricator.wikimedia.org/P65648 and previous config saved to /var/cache/conftool/dbconfig/20240702-110442-marostegui.json |
[production] |
11:01 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db2165 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P65647 and previous config saved to /var/cache/conftool/dbconfig/20240702-110111-root.json |
[production] |
10:56 |
<jiji@cumin1002> |
START - Cookbook sre.dns.netbox |
[production] |
10:50 |
<jiji@cumin1002> |
START - Cookbook sre.hosts.decommission for hosts kubemaster[2001-2002].codfw.wmnet |
[production] |
10:46 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db2165 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P65646 and previous config saved to /var/cache/conftool/dbconfig/20240702-104605-root.json |
[production] |
10:42 |
<pfischer@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
10:42 |
<pfischer@deploy1002> |
helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply |
[production] |