production SAL

2022-03-30
14:25	<btullis@cumin1001>	START - Cookbook sre.hosts.reboot-single for host an-db1001.eqiad.wmnet	[production]
14:22	<elukey@cumin1001>	START - Cookbook sre.hosts.reboot-single for host ores2005.codfw.wmnet	[production]
14:22	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
14:21	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
14:21	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
14:20	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
14:19	<moritzm>	installing remaining tiff security updates	[production]
14:19	<elukey@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores2004.codfw.wmnet	[production]
14:17	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P23826 and previous config saved to /var/cache/conftool/dbconfig/20220330-141747-ladsgroup.json	[production]
14:15	<hashar>	deploy1002: `git fetch && git rebase` to catch up with `group1 wikis to 1.39.0-wmf.5` commit which did not get sent to Gerrit but got deployed earlier today	[production]
14:13	<elukey@cumin1001>	START - Cookbook sre.hosts.reboot-single for host ores2004.codfw.wmnet	[production]
14:11	<kormat@cumin1001>	START - Cookbook sre.hosts.reimage for host db2093.codfw.wmnet with OS bullseye	[production]
14:07	<elukey@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores2003.codfw.wmnet	[production]
14:06	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depooling db1175 (T297189)', diff saved to https://phabricator.wikimedia.org/P23825 and previous config saved to /var/cache/conftool/dbconfig/20220330-140556-marostegui.json	[production]
14:05	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1175.eqiad.wmnet with reason: Maintenance	[production]
14:05	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 8:00:00 on db1175.eqiad.wmnet with reason: Maintenance	[production]
14:05	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1166 (T297189)', diff saved to https://phabricator.wikimedia.org/P23824 and previous config saved to /var/cache/conftool/dbconfig/20220330-140549-marostegui.json	[production]
14:02	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P23823 and previous config saved to /var/cache/conftool/dbconfig/20220330-140242-ladsgroup.json	[production]
14:01	<elukey@cumin1001>	START - Cookbook sre.hosts.reboot-single for host ores2003.codfw.wmnet	[production]
13:59	<elukey@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores2002.codfw.wmnet	[production]
13:55	<kormat>	stopping orchestrator for backend move T301315	[production]
13:52	<elukey@cumin1001>	START - Cookbook sre.hosts.reboot-single for host ores2002.codfw.wmnet	[production]
13:52	<elukey@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores2001.codfw.wmnet	[production]
13:51	<elukey@deploy1002>	helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.	[production]
13:51	<elukey@deploy1002>	helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.	[production]
13:51	<elukey@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.	[production]
13:51	<elukey@deploy1002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.	[production]
13:50	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P23822 and previous config saved to /var/cache/conftool/dbconfig/20220330-135044-marostegui.json	[production]
13:47	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1158 (T298565)', diff saved to https://phabricator.wikimedia.org/P23821 and previous config saved to /var/cache/conftool/dbconfig/20220330-134737-ladsgroup.json	[production]
13:47	<elukey@cumin1001>	START - Cookbook sre.hosts.reboot-single for host ores2001.codfw.wmnet	[production]
13:40	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1129 (T298565)', diff saved to https://phabricator.wikimedia.org/P23820 and previous config saved to /var/cache/conftool/dbconfig/20220330-134010-ladsgroup.json	[production]
13:40	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance	[production]
13:40	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on db1129.eqiad.wmnet with reason: Maintenance	[production]
13:40	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T298565)', diff saved to https://phabricator.wikimedia.org/P23819 and previous config saved to /var/cache/conftool/dbconfig/20220330-134002-ladsgroup.json	[production]
13:36	<jayme>	restarting pybal on lvs1019 and lvs2009	[production]
13:35	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P23818 and previous config saved to /var/cache/conftool/dbconfig/20220330-133538-marostegui.json	[production]
13:34	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1158 (T298565)', diff saved to https://phabricator.wikimedia.org/P23817 and previous config saved to /var/cache/conftool/dbconfig/20220330-133436-ladsgroup.json	[production]
13:34	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance	[production]
13:34	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 12:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance	[production]
13:34	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance	[production]
13:34	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance	[production]
13:34	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1098:3317 (T298565)', diff saved to https://phabricator.wikimedia.org/P23816 and previous config saved to /var/cache/conftool/dbconfig/20220330-133423-ladsgroup.json	[production]
13:33	<jayme>	restarting pybal on lvs1020 and lvs2010	[production]
13:33	<elukey@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-etcd2003.codfw.wmnet	[production]
13:30	<elukey@cumin1001>	START - Cookbook sre.hosts.reboot-single for host ml-etcd2003.codfw.wmnet	[production]
13:25	<elukey@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-etcd2002.codfw.wmnet	[production]
13:24	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P23815 and previous config saved to /var/cache/conftool/dbconfig/20220330-132457-ladsgroup.json	[production]
13:22	<elukey@cumin1001>	START - Cookbook sre.hosts.reboot-single for host ml-etcd2002.codfw.wmnet	[production]
13:20	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1166 (T297189)', diff saved to https://phabricator.wikimedia.org/P23814 and previous config saved to /var/cache/conftool/dbconfig/20220330-132033-marostegui.json	[production]
13:19	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P23813 and previous config saved to /var/cache/conftool/dbconfig/20220330-131918-ladsgroup.json	[production]