2024-05-02
13:35 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 4:00:00 on db2140.codfw.wmnet with reason: Maintenance [production]
13:34 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2219 (T361627)', diff saved to https://phabricator.wikimedia.org/P61712 and previous config saved to /var/cache/conftool/dbconfig/20240502-133420-marostegui.json [production]
13:33 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7002.magru.wmnet [production]
13:32 <sukhe> running authdns-update to revert magru text geomap [production]
13:27 <marostegui@cumin1002> dbctl commit (dc=all): 'db2161 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61711 and previous config saved to /var/cache/conftool/dbconfig/20240502-132731-root.json [production]
13:24 <jiji@deploy1002> helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'. [production]
13:24 <jiji@deploy1002> helmfile [staging-eqiad] START helmfile.d/admin 'apply'. [production]
13:23 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host ganeti7002.magru.wmnet [production]
13:19 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P61710 and previous config saved to /var/cache/conftool/dbconfig/20240502-131912-marostegui.json [production]
13:12 <marostegui@cumin1002> dbctl commit (dc=all): 'db2161 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61709 and previous config saved to /var/cache/conftool/dbconfig/20240502-131225-root.json [production]
13:08 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2161.codfw.wmnet with OS bookworm [production]
13:04 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P61708 and previous config saved to /var/cache/conftool/dbconfig/20240502-130404-marostegui.json [production]
13:02 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' . [production]
12:57 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' . [production]
12:49 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' . [production]
12:48 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2219 (T361627)', diff saved to https://phabricator.wikimedia.org/P61707 and previous config saved to /var/cache/conftool/dbconfig/20240502-124857-marostegui.json [production]
12:46 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2161.codfw.wmnet with reason: host reimage [production]
12:26 <marostegui@cumin1002> START - Cookbook sre.hosts.reimage for host db2161.codfw.wmnet with OS bookworm [production]
12:25 <marostegui@cumin1002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2161.codfw.wmnet with OS bookworm [production]
12:24 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P61704 and previous config saved to /var/cache/conftool/dbconfig/20240502-122409-marostegui.json [production]
12:22 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' . [production]
12:20 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' . [production]
12:19 <marostegui@cumin1002> START - Cookbook sre.hosts.reimage for host db2161.codfw.wmnet with OS bookworm [production]
12:18 <marostegui@cumin1002> dbctl commit (dc=all): 'Depool db2161', diff saved to https://phabricator.wikimedia.org/P61703 and previous config saved to /var/cache/conftool/dbconfig/20240502-121759-root.json [production]
12:17 <jmm@cumin2002> END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1230.eqiad.wmnet [production]
12:15 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' . [production]
12:09 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P61702 and previous config saved to /var/cache/conftool/dbconfig/20240502-120901-marostegui.json [production]
12:02 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' . [production]
12:00 <hnowlan@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1399.eqiad.wmnet with OS bullseye [production]
11:57 <elukey@deploy1002> helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. [production]
11:57 <hnowlan@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1435.eqiad.wmnet with OS bullseye [production]
11:57 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7003.magru.wmnet [production]
11:56 <jmm@cumin2002> START - Cookbook sre.puppet.migrate-host for host db1230.eqiad.wmnet [production]
11:55 <hnowlan@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1405.eqiad.wmnet with OS bullseye [production]
11:55 <elukey@deploy1002> helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. [production]
11:54 <elukey@deploy1002> helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. [production]
11:53 <jmm@cumin2002> END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1213.eqiad.wmnet [production]
11:53 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2210 (T361627)', diff saved to https://phabricator.wikimedia.org/P61701 and previous config saved to /var/cache/conftool/dbconfig/20240502-115353-marostegui.json [production]
11:53 <hnowlan@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1409.eqiad.wmnet with OS bullseye [production]
11:53 <elukey@deploy1002> helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. [production]
11:51 <hnowlan@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1371.eqiad.wmnet with OS bullseye [production]
11:46 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host ganeti7003.magru.wmnet [production]
11:44 <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db2210 (T361627)', diff saved to https://phabricator.wikimedia.org/P61700 and previous config saved to /var/cache/conftool/dbconfig/20240502-114448-marostegui.json [production]
11:44 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2210.codfw.wmnet with reason: Maintenance [production]
11:44 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 4:00:00 on db2210.codfw.wmnet with reason: Maintenance [production]
11:44 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2206 (T361627)', diff saved to https://phabricator.wikimedia.org/P61699 and previous config saved to /var/cache/conftool/dbconfig/20240502-114425-marostegui.json [production]
11:43 <elukey> depool LiftWing's codfw services from traffic to move all MW API calls to mw-api-int-ro [production]
11:43 <hnowlan@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1399.eqiad.wmnet with reason: host reimage [production]
11:42 <jmm@cumin2002> START - Cookbook sre.puppet.migrate-host for host db1213.eqiad.wmnet [production]
11:42 <elukey@puppetmaster1001> conftool action : set/pooled=false; selector: dnsdisc=inference,name=codfw [production]