production SAL

2024-06-06
08:50	<dcaro@cumin1002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcephosd1031.eqiad.wmnet	[production]
08:47	<mvernon@cumin1002>	END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host thanos-be1003.eqiad.wmnet	[production]
08:44	<dcaro@cumin1002>	START - Cookbook sre.hosts.reboot-single for host cloudcephosd1031.eqiad.wmnet	[production]
08:44	<pfischer@deploy1002>	helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply	[production]
08:43	<mvernon@cumin2002>	START - Cookbook sre.hosts.reboot-single for host thanos-be2002.codfw.wmnet	[production]
08:40	<mvernon@cumin2002>	END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host thanos-be2001.codfw.wmnet	[production]
08:39	<sfaci@deploy1002>	helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply	[production]
08:39	<sfaci@deploy1002>	helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply	[production]
08:38	<filippo@cumin1002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2005.codfw.wmnet	[production]
08:37	<arnaudb@cumin1002>	dbctl commit (dc=all): 'db1246 (re)pooling @ 2%: post maintenance repool', diff saved to https://phabricator.wikimedia.org/P64171 and previous config saved to /var/cache/conftool/dbconfig/20240606-083710-arnaudb.json	[production]
08:36	<mvernon@cumin1002>	START - Cookbook sre.hosts.reboot-single for host thanos-be1003.eqiad.wmnet	[production]
08:35	<pfischer@deploy1002>	helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
08:35	<pfischer@deploy1002>	helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
08:19	<filippo@cumin1002>	START - Cookbook sre.hosts.reboot-single for host prometheus1005.eqiad.wmnet	[production]
08:17	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2219 (T364299)', diff saved to https://phabricator.wikimedia.org/P64167 and previous config saved to /var/cache/conftool/dbconfig/20240606-081753-marostegui.json	[production]
08:14	<stevemunene@deploy1002>	helmfile [eqiad] DONE helmfile.d/admin 'apply'.	[production]
08:14	<stevemunene@deploy1002>	helmfile [eqiad] START helmfile.d/admin 'apply'.	[production]
08:14	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P64166 and previous config saved to /var/cache/conftool/dbconfig/20240606-081412-ladsgroup.json	[production]
08:02	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P64165 and previous config saved to /var/cache/conftool/dbconfig/20240606-080245-marostegui.json	[production]
08:02	<mvernon@cumin1002>	START - Cookbook sre.hosts.reboot-single for host thanos-be1002.eqiad.wmnet	[production]
08:01	<mvernon@cumin1002>	END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host thanos-be1001.eqiad.wmnet	[production]
08:00	<urbanecm@deploy1002>	Started scap: Backport for [[gerrit:1039287|Add throttle exception for an upcoming workshop (T366748)]]	[production]
07:59	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P64164 and previous config saved to /var/cache/conftool/dbconfig/20240606-075904-ladsgroup.json	[production]
07:50	<mvernon@cumin1002>	START - Cookbook sre.hosts.reboot-single for host thanos-be1001.eqiad.wmnet	[production]
07:47	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P64163 and previous config saved to /var/cache/conftool/dbconfig/20240606-074737-marostegui.json	[production]
07:43	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1169 (T352010)', diff saved to https://phabricator.wikimedia.org/P64162 and previous config saved to /var/cache/conftool/dbconfig/20240606-074356-ladsgroup.json	[production]
07:32	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2219 (T364299)', diff saved to https://phabricator.wikimedia.org/P64161 and previous config saved to /var/cache/conftool/dbconfig/20240606-073229-marostegui.json	[production]
07:30	<ryankemper@cumin2002>	END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster restart - ryankemper@cumin2002 - T366555	[production]
07:06	<hashar>	Restarting Gerrit	[production]
07:05	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'Depooling db2116 (T352010)', diff saved to https://phabricator.wikimedia.org/P64160 and previous config saved to /var/cache/conftool/dbconfig/20240606-070558-ladsgroup.json	[production]
07:05	<ladsgroup@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2116.codfw.wmnet with reason: Maintenance	[production]
07:05	<ladsgroup@cumin1002>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2116.codfw.wmnet with reason: Maintenance	[production]
06:56	<dcaro@cumin1002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcephosd1034.eqiad.wmnet	[production]
06:49	<dcaro@cumin1002>	START - Cookbook sre.hosts.reboot-single for host cloudcephosd1034.eqiad.wmnet	[production]
05:40	<ryankemper@cumin2002>	END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99) reloading wikidata_full on wdqs2023.codfw.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/full/20240527/ using stat1009.eqiad.wmnet)	[production]
05:21	<ryankemper@cumin2002>	START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster restart - ryankemper@cumin2002 - T366555	[production]
05:19	<ryankemper@cumin2002>	END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster restart - ryankemper@cumin2002 - T366555	[production]
05:04	<ryankemper@cumin2002>	START - Cookbook sre.wdqs.data-reload reloading wikidata_full on wdqs2023.codfw.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/full/20240527/ using stat1009.eqiad.wmnet)	[production]
05:02	<ryankemper@cumin2002>	START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster restart - ryankemper@cumin2002 - T366555	[production]
04:17	<marostegui@cumin1002>	dbctl commit (dc=all): 'Depooling db2219 (T364299)', diff saved to https://phabricator.wikimedia.org/P64159 and previous config saved to /var/cache/conftool/dbconfig/20240606-041714-marostegui.json	[production]
04:17	<marostegui@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2219.codfw.wmnet with reason: Maintenance	[production]
04:16	<marostegui@cumin1002>	START - Cookbook sre.hosts.downtime for 6:00:00 on db2219.codfw.wmnet with reason: Maintenance	[production]
04:16	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2210 (T364299)', diff saved to https://phabricator.wikimedia.org/P64158 and previous config saved to /var/cache/conftool/dbconfig/20240606-041650-marostegui.json	[production]
04:01	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P64157 and previous config saved to /var/cache/conftool/dbconfig/20240606-040142-marostegui.json	[production]
03:47	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'Depooling db1193 (T352010)', diff saved to https://phabricator.wikimedia.org/P64156 and previous config saved to /var/cache/conftool/dbconfig/20240606-034732-ladsgroup.json	[production]
03:47	<ladsgroup@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1193.eqiad.wmnet with reason: Maintenance	[production]
03:47	<ladsgroup@cumin1002>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1193.eqiad.wmnet with reason: Maintenance	[production]
03:47	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1178 (T352010)', diff saved to https://phabricator.wikimedia.org/P64155 and previous config saved to /var/cache/conftool/dbconfig/20240606-034709-ladsgroup.json	[production]
03:46	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P64154 and previous config saved to /var/cache/conftool/dbconfig/20240606-034635-marostegui.json	[production]
03:32	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P64153 and previous config saved to /var/cache/conftool/dbconfig/20240606-033201-ladsgroup.json	[production]