production SAL

2024-06-06
09:20	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P64175 and previous config saved to /var/cache/conftool/dbconfig/20240606-092037-marostegui.json	[production]
09:20	<stevemunene@deploy1002>	helmfile [eqiad] DONE helmfile.d/admin 'apply'.	[production]
09:18	<mvernon@cumin2002>	START - Cookbook sre.hosts.reboot-single for host thanos-be2003.codfw.wmnet	[production]
09:17	<cgoubert@cumin1002>	START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-worker-codfw	[production]
09:17	<stevemunene@deploy1002>	helmfile [eqiad] START helmfile.d/admin 'apply'.	[production]
09:15	<stevemunene@deploy1002>	helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.	[production]
09:13	<stevemunene@deploy1002>	helmfile [staging-codfw] START helmfile.d/admin 'apply'.	[production]
09:12	<stevemunene@deploy1002>	helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.	[production]
09:11	<stevemunene@deploy1002>	helmfile [staging-eqiad] START helmfile.d/admin 'apply'.	[production]
09:08	<mvernon@cumin1002>	END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host thanos-be1004.eqiad.wmnet	[production]
09:07	<arnaudb@cumin1002>	dbctl commit (dc=all): 'db1246 (re)pooling @ 10%: post maintenance repool', diff saved to https://phabricator.wikimedia.org/P64174 and previous config saved to /var/cache/conftool/dbconfig/20240606-090722-arnaudb.json	[production]
09:05	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1244 (T364069)', diff saved to https://phabricator.wikimedia.org/P64173 and previous config saved to /var/cache/conftool/dbconfig/20240606-090529-marostegui.json	[production]
09:01	<mvernon@cumin2002>	END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host thanos-be2002.codfw.wmnet	[production]
09:01	<filippo@cumin1002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus1006.eqiad.wmnet	[production]
09:01	<filippo@cumin1002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2006.codfw.wmnet	[production]
08:57	<sfaci@deploy1002>	helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply	[production]
08:56	<mvernon@cumin1002>	START - Cookbook sre.hosts.reboot-single for host thanos-be1004.eqiad.wmnet	[production]
08:56	<sfaci@deploy1002>	helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply	[production]
08:52	<filippo@cumin1002>	START - Cookbook sre.hosts.reboot-single for host prometheus1006.eqiad.wmnet	[production]
08:52	<arnaudb@cumin1002>	dbctl commit (dc=all): 'db1246 (re)pooling @ 5%: post maintenance repool', diff saved to https://phabricator.wikimedia.org/P64172 and previous config saved to /var/cache/conftool/dbconfig/20240606-085216-arnaudb.json	[production]
08:52	<filippo@cumin1002>	START - Cookbook sre.hosts.reboot-single for host prometheus2006.codfw.wmnet	[production]
08:50	<dcaro@cumin1002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcephosd1031.eqiad.wmnet	[production]
08:47	<mvernon@cumin1002>	END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host thanos-be1003.eqiad.wmnet	[production]
08:44	<dcaro@cumin1002>	START - Cookbook sre.hosts.reboot-single for host cloudcephosd1031.eqiad.wmnet	[production]
08:44	<pfischer@deploy1002>	helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply	[production]
08:43	<mvernon@cumin2002>	START - Cookbook sre.hosts.reboot-single for host thanos-be2002.codfw.wmnet	[production]
08:40	<mvernon@cumin2002>	END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host thanos-be2001.codfw.wmnet	[production]
08:39	<sfaci@deploy1002>	helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply	[production]
08:39	<sfaci@deploy1002>	helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply	[production]
08:38	<filippo@cumin1002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2005.codfw.wmnet	[production]
08:37	<arnaudb@cumin1002>	dbctl commit (dc=all): 'db1246 (re)pooling @ 2%: post maintenance repool', diff saved to https://phabricator.wikimedia.org/P64171 and previous config saved to /var/cache/conftool/dbconfig/20240606-083710-arnaudb.json	[production]
08:36	<mvernon@cumin1002>	START - Cookbook sre.hosts.reboot-single for host thanos-be1003.eqiad.wmnet	[production]
08:35	<pfischer@deploy1002>	helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
08:35	<pfischer@deploy1002>	helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
08:19	<filippo@cumin1002>	START - Cookbook sre.hosts.reboot-single for host prometheus1005.eqiad.wmnet	[production]
08:17	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2219 (T364299)', diff saved to https://phabricator.wikimedia.org/P64167 and previous config saved to /var/cache/conftool/dbconfig/20240606-081753-marostegui.json	[production]
08:14	<stevemunene@deploy1002>	helmfile [eqiad] DONE helmfile.d/admin 'apply'.	[production]
08:14	<stevemunene@deploy1002>	helmfile [eqiad] START helmfile.d/admin 'apply'.	[production]
08:14	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P64166 and previous config saved to /var/cache/conftool/dbconfig/20240606-081412-ladsgroup.json	[production]
08:02	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P64165 and previous config saved to /var/cache/conftool/dbconfig/20240606-080245-marostegui.json	[production]
08:02	<mvernon@cumin1002>	START - Cookbook sre.hosts.reboot-single for host thanos-be1002.eqiad.wmnet	[production]
08:01	<mvernon@cumin1002>	END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host thanos-be1001.eqiad.wmnet	[production]
08:00	<urbanecm@deploy1002>	Started scap: Backport for [[gerrit:1039287|Add throttle exception for an upcoming workshop (T366748)]]	[production]
07:59	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P64164 and previous config saved to /var/cache/conftool/dbconfig/20240606-075904-ladsgroup.json	[production]
07:50	<mvernon@cumin1002>	START - Cookbook sre.hosts.reboot-single for host thanos-be1001.eqiad.wmnet	[production]
07:47	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P64163 and previous config saved to /var/cache/conftool/dbconfig/20240606-074737-marostegui.json	[production]
07:43	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1169 (T352010)', diff saved to https://phabricator.wikimedia.org/P64162 and previous config saved to /var/cache/conftool/dbconfig/20240606-074356-ladsgroup.json	[production]
07:32	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2219 (T364299)', diff saved to https://phabricator.wikimedia.org/P64161 and previous config saved to /var/cache/conftool/dbconfig/20240606-073229-marostegui.json	[production]
07:30	<ryankemper@cumin2002>	END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster restart - ryankemper@cumin2002 - T366555	[production]
07:06	<hashar>	Restarting Gerrit	[production]