2024-05-02
13:56 <jmm@cumin2002> START - Cookbook sre.dns.netbox [production]
13:56 <jmm@cumin2002> START - Cookbook sre.ganeti.makevm for new host netflow7001.magru.wmnet [production]
13:54 <hnowlan> running homer 'cr*eqiad*' commit for new kubernetes workers [production]
13:53 <jmm@cumin2002> END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti7003.magru.wmnet to cluster magru01 and group B3 [production]
13:53 <jiji@deploy1002> helmfile [staging-codfw] DONE helmfile.d/admin 'sync'. [production]
13:52 <jiji@deploy1002> helmfile [staging-codfw] START helmfile.d/admin 'sync'. [production]
13:52 <jmm@cumin2002> START - Cookbook sre.ganeti.addnode for new host ganeti7003.magru.wmnet to cluster magru01 and group B3 [production]
13:50 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1150.eqiad.wmnet with reason: Maintenance [production]
13:50 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 4:00:00 on db1150.eqiad.wmnet with reason: Maintenance [production]
13:50 <jiji@deploy1002> helmfile [staging-codfw] DONE helmfile.d/admin 'sync'. [production]
13:50 <jiji@deploy1002> helmfile [staging-codfw] START helmfile.d/admin 'sync'. [production]
13:43 <jmm@cumin2002> END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti7003.magru.wmnet to cluster magru01 and group B3 [production]
13:43 <jmm@cumin2002> START - Cookbook sre.ganeti.addnode for new host ganeti7003.magru.wmnet to cluster magru01 and group B3 [production]
13:43 <marostegui@cumin1002> dbctl commit (dc=all): 'db1175 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61716 and previous config saved to /var/cache/conftool/dbconfig/20240502-134333-root.json [production]
13:43 <marostegui@cumin1002> dbctl commit (dc=all): 'db1189 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61715 and previous config saved to /var/cache/conftool/dbconfig/20240502-134328-root.json [production]
13:42 <jiji@deploy1002> helmfile [codfw] DONE helmfile.d/admin 'apply'. [production]
13:42 <marostegui@cumin1002> dbctl commit (dc=all): 'db2161 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61714 and previous config saved to /var/cache/conftool/dbconfig/20240502-134237-root.json [production]
13:42 <jiji@deploy1002> helmfile [codfw] START helmfile.d/admin 'apply'. [production]
13:41 <jiji@deploy1002> helmfile [eqiad] DONE helmfile.d/admin 'apply'. [production]
13:40 <jiji@deploy1002> helmfile [eqiad] START helmfile.d/admin 'apply'. [production]
13:40 <marostegui@cumin1002> dbctl commit (dc=all): 'Depool db1175 db1189', diff saved to https://phabricator.wikimedia.org/P61713 and previous config saved to /var/cache/conftool/dbconfig/20240502-134050-root.json [production]
13:35 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2140.codfw.wmnet with reason: Maintenance [production]
13:35 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 4:00:00 on db2140.codfw.wmnet with reason: Maintenance [production]
13:34 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2219 (T361627)', diff saved to https://phabricator.wikimedia.org/P61712 and previous config saved to /var/cache/conftool/dbconfig/20240502-133420-marostegui.json [production]
13:33 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7002.magru.wmnet [production]
13:32 <sukhe> running authdns-update to revert magru text geomap [production]
13:27 <marostegui@cumin1002> dbctl commit (dc=all): 'db2161 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61711 and previous config saved to /var/cache/conftool/dbconfig/20240502-132731-root.json [production]
13:24 <jiji@deploy1002> helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'. [production]
13:24 <jiji@deploy1002> helmfile [staging-eqiad] START helmfile.d/admin 'apply'. [production]
13:23 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host ganeti7002.magru.wmnet [production]
13:19 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P61710 and previous config saved to /var/cache/conftool/dbconfig/20240502-131912-marostegui.json [production]
13:12 <marostegui@cumin1002> dbctl commit (dc=all): 'db2161 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61709 and previous config saved to /var/cache/conftool/dbconfig/20240502-131225-root.json [production]
13:08 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2161.codfw.wmnet with OS bookworm [production]
13:04 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P61708 and previous config saved to /var/cache/conftool/dbconfig/20240502-130404-marostegui.json [production]
13:02 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' . [production]
12:57 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' . [production]
12:49 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' . [production]
12:48 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2219 (T361627)', diff saved to https://phabricator.wikimedia.org/P61707 and previous config saved to /var/cache/conftool/dbconfig/20240502-124857-marostegui.json [production]
12:46 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2161.codfw.wmnet with reason: host reimage [production]
12:26 <marostegui@cumin1002> START - Cookbook sre.hosts.reimage for host db2161.codfw.wmnet with OS bookworm [production]
12:25 <marostegui@cumin1002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2161.codfw.wmnet with OS bookworm [production]
12:24 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P61704 and previous config saved to /var/cache/conftool/dbconfig/20240502-122409-marostegui.json [production]
12:22 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' . [production]
12:20 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' . [production]
12:19 <marostegui@cumin1002> START - Cookbook sre.hosts.reimage for host db2161.codfw.wmnet with OS bookworm [production]
12:18 <marostegui@cumin1002> dbctl commit (dc=all): 'Depool db2161', diff saved to https://phabricator.wikimedia.org/P61703 and previous config saved to /var/cache/conftool/dbconfig/20240502-121759-root.json [production]
12:17 <jmm@cumin2002> END (PASS) - Cookbook sre.puppet.migrate-host (exit_code=0) for host db1230.eqiad.wmnet [production]
12:15 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' . [production]
12:09 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P61702 and previous config saved to /var/cache/conftool/dbconfig/20240502-120901-marostegui.json [production]
12:02 <elukey@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' . [production]