2024-05-02
12:46 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2161.codfw.wmnet with reason: host reimage [production]
12:26 <marostegui@cumin1002> START - Cookbook sre.hosts.reimage for host db2161.codfw.wmnet with OS bookworm [production]
12:25 <marostegui@cumin1002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2161.codfw.wmnet with OS bookworm [production]
12:24 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P61704 and previous config saved to /var/cache/conftool/dbconfig/20240502-122409-marostegui.json [production]
12:22 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' . [production]
12:20 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' . [production]
12:19 <marostegui@cumin1002> START - Cookbook sre.hosts.reimage for host db2161.codfw.wmnet with OS bookworm [production]
12:18 <marostegui@cumin1002> dbctl commit (dc=all): 'Depool db2161', diff saved to https://phabricator.wikimedia.org/P61703 and previous config saved to /var/cache/conftool/dbconfig/20240502-121759-root.json [production]
12:17 <jmm@cumin2002> END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1230.eqiad.wmnet [production]
12:15 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' . [production]
12:09 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P61702 and previous config saved to /var/cache/conftool/dbconfig/20240502-120901-marostegui.json [production]
12:02 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' . [production]
12:00 <hnowlan@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1399.eqiad.wmnet with OS bullseye [production]
11:57 <elukey@deploy1002> helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. [production]
11:57 <hnowlan@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1435.eqiad.wmnet with OS bullseye [production]
11:57 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7003.magru.wmnet [production]
11:56 <jmm@cumin2002> START - Cookbook sre.puppet.migrate-host for host db1230.eqiad.wmnet [production]
11:55 <hnowlan@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1405.eqiad.wmnet with OS bullseye [production]
11:55 <elukey@deploy1002> helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. [production]
11:54 <elukey@deploy1002> helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. [production]
11:53 <jmm@cumin2002> END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1213.eqiad.wmnet [production]
11:53 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2210 (T361627)', diff saved to https://phabricator.wikimedia.org/P61701 and previous config saved to /var/cache/conftool/dbconfig/20240502-115353-marostegui.json [production]
11:53 <hnowlan@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1409.eqiad.wmnet with OS bullseye [production]
11:53 <elukey@deploy1002> helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. [production]
11:51 <hnowlan@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1371.eqiad.wmnet with OS bullseye [production]
11:46 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host ganeti7003.magru.wmnet [production]
11:44 <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db2210 (T361627)', diff saved to https://phabricator.wikimedia.org/P61700 and previous config saved to /var/cache/conftool/dbconfig/20240502-114448-marostegui.json [production]
11:44 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2210.codfw.wmnet with reason: Maintenance [production]
11:44 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 4:00:00 on db2210.codfw.wmnet with reason: Maintenance [production]
11:44 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2206 (T361627)', diff saved to https://phabricator.wikimedia.org/P61699 and previous config saved to /var/cache/conftool/dbconfig/20240502-114425-marostegui.json [production]
11:43 <elukey> depool LiftWing's codfw services from traffic to move all MW API calls to mw-api-int-ro [production]
11:43 <hnowlan@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1399.eqiad.wmnet with reason: host reimage [production]
11:42 <jmm@cumin2002> START - Cookbook sre.puppet.migrate-host for host db1213.eqiad.wmnet [production]
11:42 <elukey@puppetmaster1001> conftool action : set/pooled=false; selector: dnsdisc=inference,name=codfw [production]
11:41 <jiji@deploy1002> helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. [production]
11:41 <cmooney@cumin1002> END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ganeti01.svc.magru.wmnet on all recursors [production]
11:41 <cmooney@cumin1002> START - Cookbook sre.dns.wipe-cache ganeti01.svc.magru.wmnet on all recursors [production]
11:40 <jiji@deploy1002> helmfile [staging-codfw] START helmfile.d/admin 'apply'. [production]
11:39 <hnowlan@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1435.eqiad.wmnet with reason: host reimage [production]
11:37 <hnowlan@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1405.eqiad.wmnet with reason: host reimage [production]
11:35 <jmm@cumin2002> END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1210.eqiad.wmnet [production]
11:35 <hnowlan@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1409.eqiad.wmnet with reason: host reimage [production]
11:35 <hnowlan@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on mw1405.eqiad.wmnet with reason: host reimage [production]
11:34 <hnowlan@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on mw1399.eqiad.wmnet with reason: host reimage [production]
11:34 <hnowlan@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on mw1435.eqiad.wmnet with reason: host reimage [production]
11:32 <hnowlan@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1371.eqiad.wmnet with reason: host reimage [production]
11:30 <hnowlan@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on mw1409.eqiad.wmnet with reason: host reimage [production]
11:29 <hnowlan@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on mw1371.eqiad.wmnet with reason: host reimage [production]
11:29 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P61698 and previous config saved to /var/cache/conftool/dbconfig/20240502-112918-marostegui.json [production]
11:25 <jmm@cumin2002> START - Cookbook sre.puppet.migrate-host for host db1210.eqiad.wmnet [production]