production SAL

2024-05-02
13:58	<jiji@deploy1002>	helmfile [staging-eqiad] START helmfile.d/admin 'apply'.	[production]
13:57	<marostegui@cumin1002>	dbctl commit (dc=all): 'db2161 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P61717 and previous config saved to /var/cache/conftool/dbconfig/20240502-135743-root.json	[production]
13:57	<jiji@deploy1002>	helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.	[production]
13:57	<jiji@deploy1002>	helmfile [staging-eqiad] START helmfile.d/admin 'sync'.	[production]
13:56	<jmm@cumin2002>	START - Cookbook sre.dns.netbox	[production]
13:56	<jmm@cumin2002>	START - Cookbook sre.ganeti.makevm for new host netflow7001.magru.wmnet	[production]
13:54	<hnowlan>	running homer 'creqiad' commit for new kubernetes workers	[production]
13:53	<jmm@cumin2002>	END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti7003.magru.wmnet to cluster magru01 and group B3	[production]
13:53	<jiji@deploy1002>	helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.	[production]
13:52	<jiji@deploy1002>	helmfile [staging-codfw] START helmfile.d/admin 'sync'.	[production]
13:52	<jmm@cumin2002>	START - Cookbook sre.ganeti.addnode for new host ganeti7003.magru.wmnet to cluster magru01 and group B3	[production]
13:50	<marostegui@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1150.eqiad.wmnet with reason: Maintenance	[production]
13:50	<marostegui@cumin1002>	START - Cookbook sre.hosts.downtime for 4:00:00 on db1150.eqiad.wmnet with reason: Maintenance	[production]
13:50	<jiji@deploy1002>	helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.	[production]
13:50	<jiji@deploy1002>	helmfile [staging-codfw] START helmfile.d/admin 'sync'.	[production]
13:43	<jmm@cumin2002>	END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti7003.magru.wmnet to cluster magru01 and group B3	[production]
13:43	<jmm@cumin2002>	START - Cookbook sre.ganeti.addnode for new host ganeti7003.magru.wmnet to cluster magru01 and group B3	[production]
13:43	<marostegui@cumin1002>	dbctl commit (dc=all): 'db1175 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61716 and previous config saved to /var/cache/conftool/dbconfig/20240502-134333-root.json	[production]
13:43	<marostegui@cumin1002>	dbctl commit (dc=all): 'db1189 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61715 and previous config saved to /var/cache/conftool/dbconfig/20240502-134328-root.json	[production]
13:42	<jiji@deploy1002>	helmfile [codfw] DONE helmfile.d/admin 'apply'.	[production]
13:42	<marostegui@cumin1002>	dbctl commit (dc=all): 'db2161 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P61714 and previous config saved to /var/cache/conftool/dbconfig/20240502-134237-root.json	[production]
13:42	<jiji@deploy1002>	helmfile [codfw] START helmfile.d/admin 'apply'.	[production]
13:41	<jiji@deploy1002>	helmfile [eqiad] DONE helmfile.d/admin 'apply'.	[production]
13:40	<jiji@deploy1002>	helmfile [eqiad] START helmfile.d/admin 'apply'.	[production]
13:40	<marostegui@cumin1002>	dbctl commit (dc=all): 'Depool db1175 db1189', diff saved to https://phabricator.wikimedia.org/P61713 and previous config saved to /var/cache/conftool/dbconfig/20240502-134050-root.json	[production]
13:35	<marostegui@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2140.codfw.wmnet with reason: Maintenance	[production]
13:35	<marostegui@cumin1002>	START - Cookbook sre.hosts.downtime for 4:00:00 on db2140.codfw.wmnet with reason: Maintenance	[production]
13:34	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2219 (T361627)', diff saved to https://phabricator.wikimedia.org/P61712 and previous config saved to /var/cache/conftool/dbconfig/20240502-133420-marostegui.json	[production]
13:33	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7002.magru.wmnet	[production]
13:32	<sukhe>	running authdns-update to revert magru text geomap	[production]
13:27	<marostegui@cumin1002>	dbctl commit (dc=all): 'db2161 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P61711 and previous config saved to /var/cache/conftool/dbconfig/20240502-132731-root.json	[production]
13:24	<jiji@deploy1002>	helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.	[production]
13:24	<jiji@deploy1002>	helmfile [staging-eqiad] START helmfile.d/admin 'apply'.	[production]
13:23	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host ganeti7002.magru.wmnet	[production]
13:19	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P61710 and previous config saved to /var/cache/conftool/dbconfig/20240502-131912-marostegui.json	[production]
13:12	<marostegui@cumin1002>	dbctl commit (dc=all): 'db2161 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P61709 and previous config saved to /var/cache/conftool/dbconfig/20240502-131225-root.json	[production]
13:08	<marostegui@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2161.codfw.wmnet with OS bookworm	[production]
13:04	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P61708 and previous config saved to /var/cache/conftool/dbconfig/20240502-130404-marostegui.json	[production]
13:02	<elukey@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .	[production]
12:57	<elukey@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .	[production]
12:49	<elukey@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .	[production]
12:48	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2219 (T361627)', diff saved to https://phabricator.wikimedia.org/P61707 and previous config saved to /var/cache/conftool/dbconfig/20240502-124857-marostegui.json	[production]
12:46	<marostegui@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2161.codfw.wmnet with reason: host reimage	[production]
12:26	<marostegui@cumin1002>	START - Cookbook sre.hosts.reimage for host db2161.codfw.wmnet with OS bookworm	[production]
12:25	<marostegui@cumin1002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2161.codfw.wmnet with OS bookworm	[production]
12:24	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P61704 and previous config saved to /var/cache/conftool/dbconfig/20240502-122409-marostegui.json	[production]
12:22	<elukey@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .	[production]
12:20	<elukey@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .	[production]
12:19	<marostegui@cumin1002>	START - Cookbook sre.hosts.reimage for host db2161.codfw.wmnet with OS bookworm	[production]
12:18	<marostegui@cumin1002>	dbctl commit (dc=all): 'Depool db2161', diff saved to https://phabricator.wikimedia.org/P61703 and previous config saved to /var/cache/conftool/dbconfig/20240502-121759-root.json	[production]