2024-04-11
§
|
13:46 |
<btullis@cumin1002> |
END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host matomo1003.eqiad.wmnet with OS bookworm |
[production] |
13:46 |
<sukhe@cumin1002> |
START - Cookbook sre.hosts.reimage for host cp2042.codfw.wmnet with OS bullseye |
[production] |
13:45 |
<sukhe@puppetmaster1001> |
conftool action : set/pooled=no; selector: name=cp2042.codfw.wmnet,service=(cdn|ats-be) |
[production] |
13:43 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db2177 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P60407 and previous config saved to /var/cache/conftool/dbconfig/20240411-134341-root.json |
[production] |
13:43 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'db2129 (re)pooling @ 10%: Post upgrade', diff saved to https://phabricator.wikimedia.org/P60406 and previous config saved to /var/cache/conftool/dbconfig/20240411-134312-arnaudb.json |
[production] |
13:41 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.maps.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:maps-replica-eqiad |
[production] |
13:36 |
<jmm@cumin2002> |
START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-eqiad |
[production] |
13:34 |
<sukhe@cumin1002> |
START - Cookbook sre.hosts.reimage for host cp3073.esams.wmnet with OS bullseye |
[production] |
13:32 |
<sukhe@puppetmaster1001> |
conftool action : set/pooled=no; selector: name=cp3073.esams.wmnet,service=(cdn|ats-be) |
[production] |
13:32 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on db2160.codfw.wmnet with reason: reboot multiinstance replica |
[production] |
13:32 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 0:30:00 on db2160.codfw.wmnet with reason: reboot multiinstance replica |
[production] |
13:32 |
<btullis@cumin1002> |
START - Cookbook sre.hosts.reimage for host matomo1003.eqiad.wmnet with OS bookworm |
[production] |
13:31 |
<btullis@cumin1002> |
END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host matomo1003.eqiad.wmnet with OS bookworm |
[production] |
13:30 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2133.codfw.wmnet |
[production] |
13:28 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db2177 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P60405 and previous config saved to /var/cache/conftool/dbconfig/20240411-132834-root.json |
[production] |
13:28 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'db2129 (re)pooling @ 5%: Post upgrade', diff saved to https://phabricator.wikimedia.org/P60404 and previous config saved to /var/cache/conftool/dbconfig/20240411-132807-arnaudb.json |
[production] |
13:27 |
<vgutierrez@cumin1002> |
END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-eqsin and not P{cp[5030,5032].eqsin.wmnet} and A:cp |
[production] |
13:26 |
<arnaudb@cumin1002> |
START - Cookbook sre.mysql.upgrade for db2133.codfw.wmnet |
[production] |
13:26 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on db[2133,2160].codfw.wmnet with reason: reboot |
[production] |
13:25 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 0:30:00 on db[2133,2160].codfw.wmnet with reason: reboot |
[production] |
13:23 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2135.codfw.wmnet |
[production] |
13:18 |
<arnaudb@cumin1002> |
START - Cookbook sre.mysql.upgrade for db2135.codfw.wmnet |
[production] |
13:17 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on db[2135,2160].codfw.wmnet with reason: reboot |
[production] |
13:17 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 0:30:00 on db[2135,2160].codfw.wmnet with reason: reboot |
[production] |
13:16 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2134.codfw.wmnet |
[production] |
13:13 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db2177 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P60403 and previous config saved to /var/cache/conftool/dbconfig/20240411-131327-root.json |
[production] |
13:13 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'db2129 (re)pooling @ 4%: Post upgrade', diff saved to https://phabricator.wikimedia.org/P60402 and previous config saved to /var/cache/conftool/dbconfig/20240411-131301-arnaudb.json |
[production] |
13:12 |
<btullis@cumin1002> |
START - Cookbook sre.hosts.reimage for host matomo1003.eqiad.wmnet with OS bookworm |
[production] |
13:12 |
<arnaudb@cumin1002> |
START - Cookbook sre.mysql.upgrade for db2134.codfw.wmnet |
[production] |
13:12 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on db[2134,2160].codfw.wmnet with reason: reboot |
[production] |
13:11 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 0:30:00 on db[2134,2160].codfw.wmnet with reason: reboot |
[production] |
13:00 |
<btullis@cumin1002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host matomo1003.eqiad.wmnet with OS bookworm |
[production] |
12:58 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db2132.codfw.wmnet |
[production] |
12:58 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'db2177 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P60401 and previous config saved to /var/cache/conftool/dbconfig/20240411-125821-root.json |
[production] |
12:57 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'db2129 (re)pooling @ 2%: Post upgrade', diff saved to https://phabricator.wikimedia.org/P60400 and previous config saved to /var/cache/conftool/dbconfig/20240411-125755-arnaudb.json |
[production] |
12:54 |
<akosiaris> |
lower weight of mw1437 back to 10 from the 30 I had upped it to yesterday. The backlog of videoscaling is apparently now served and CPU usage has reached "normal" levels |
[production] |
12:54 |
<arnaudb@cumin1002> |
START - Cookbook sre.mysql.upgrade for db2132.codfw.wmnet |
[production] |
12:54 |
<jayme@deploy1002> |
helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. |
[production] |
12:53 |
<akosiaris@cumin1002> |
conftool action : set/weight=10; selector: name=mw1437.*.wmnet,dc=eqiad |
[production] |
12:53 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db[2132,2160].codfw.wmnet with reason: reboot |
[production] |
12:53 |
<jayme@deploy1002> |
helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. |
[production] |
12:53 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 4:00:00 on db[2132,2160].codfw.wmnet with reason: reboot |
[production] |
12:52 |
<jayme@deploy1002> |
helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'. |
[production] |
12:52 |
<jayme@deploy1002> |
helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'. |
[production] |
12:51 |
<jayme@deploy1002> |
helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'. |
[production] |
12:50 |
<jayme@deploy1002> |
helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'. |
[production] |
12:49 |
<jayme@deploy1002> |
helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'. |
[production] |
12:49 |
<jayme@deploy1002> |
helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'. |
[production] |
12:45 |
<ayounsi@cumin1002> |
START - Cookbook sre.dns.netbox |
[production] |
12:24 |
<btullis@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply |
[production] |