production SAL

2022-09-30
05:05	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool db1126', diff saved to https://phabricator.wikimedia.org/P35200 and previous config saved to /var/cache/conftool/dbconfig/20220930-050533-root.json	[production]
04:19	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1184 (T314041)', diff saved to https://phabricator.wikimedia.org/P35199 and previous config saved to /var/cache/conftool/dbconfig/20220930-041937-ladsgroup.json	[production]
04:19	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1184.eqiad.wmnet with reason: Maintenance	[production]
04:19	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1184.eqiad.wmnet with reason: Maintenance	[production]
04:19	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1169 (T314041)', diff saved to https://phabricator.wikimedia.org/P35198 and previous config saved to /var/cache/conftool/dbconfig/20220930-041916-ladsgroup.json	[production]
04:04	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P35197 and previous config saved to /var/cache/conftool/dbconfig/20220930-040409-ladsgroup.json	[production]
03:49	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P35196 and previous config saved to /var/cache/conftool/dbconfig/20220930-034903-ladsgroup.json	[production]
03:33	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1169 (T314041)', diff saved to https://phabricator.wikimedia.org/P35195 and previous config saved to /var/cache/conftool/dbconfig/20220930-033356-ladsgroup.json	[production]
00:31	<robh@cumin2002>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4045.ulsfo.wmnet with OS bullseye	[production]
00:22	<robh@cumin2002>	START - Cookbook sre.hosts.reimage for host cp4045.ulsfo.wmnet with OS bullseye	[production]
2022-09-29
22:46	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2176 (T314041)', diff saved to https://phabricator.wikimedia.org/P35193 and previous config saved to /var/cache/conftool/dbconfig/20220929-224649-ladsgroup.json	[production]
22:31	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P35192 and previous config saved to /var/cache/conftool/dbconfig/20220929-223143-ladsgroup.json	[production]
22:16	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P35191 and previous config saved to /var/cache/conftool/dbconfig/20220929-221637-ladsgroup.json	[production]
22:01	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2176 (T314041)', diff saved to https://phabricator.wikimedia.org/P35190 and previous config saved to /var/cache/conftool/dbconfig/20220929-220130-ladsgroup.json	[production]
21:53	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1169 (T314041)', diff saved to https://phabricator.wikimedia.org/P35189 and previous config saved to /var/cache/conftool/dbconfig/20220929-215333-ladsgroup.json	[production]
21:53	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance	[production]
21:53	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance	[production]
21:43	<sukhe>	alert1001: restart icinga	[production]
21:43	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
21:42	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
21:42	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
21:41	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
21:26	<robh@cumin2002>	END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cp4045.mgmt.ulsfo.wmnet with reboot policy FORCED	[production]
21:21	<robh@cumin2002>	START - Cookbook sre.hosts.provision for host cp4045.mgmt.ulsfo.wmnet with reboot policy FORCED	[production]
21:18	<robh@cumin2002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
21:18	<ejegg>	payments-wiki upgraded from 839d6dde to aeee9676	[production]
21:14	<robh@cumin2002>	START - Cookbook sre.dns.netbox	[production]
21:14	<brennen>	end of utc late backport and config window	[production]
21:14	<brennen@deploy1002>	Finished scap: Backport for [[gerrit:836719|cirrus: Don't configure cloud clusters for private wikis]] (duration: 08m 22s)	[production]
21:10	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
21:09	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
21:09	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
21:08	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
21:06	<brennen@deploy1002>	brennen and ebernhardson: Backport for [[gerrit:836719|cirrus: Don't configure cloud clusters for private wikis]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet	[production]
21:05	<brennen@deploy1002>	Started scap: Backport for [[gerrit:836719|cirrus: Don't configure cloud clusters for private wikis]]	[production]
21:03	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
21:02	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
21:02	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
21:01	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
20:59	<ryankemper>	T313431 Repooled `elastic[2073-2074,2080-2081,2083,2086].codfw.wmnet`. Codfw's all on 5 masters now and cluster is back to green.	[production]
20:58	<brennen@deploy1002>	Sync cancelled.	[production]
20:58	<brennen@deploy1002>	brennen and trainbranchbot: Backport for [[gerrit:836928|Revert "cirrus: Don't configure cloud clusters for private wikis"]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet	[production]
20:58	<ryankemper>	T313431 Updated cross-cluster seed conf with new masters; should resolve the settings check alerts	[production]
20:58	<brennen@deploy1002>	Started scap: Backport for [[gerrit:836928|Revert "cirrus: Don't configure cloud clusters for private wikis"]]	[production]
20:57	<robh@cumin2002>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cp4027.ulsfo.wmnet	[production]
20:57	<robh@cumin2002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
20:56	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
20:55	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
20:55	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
20:54	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]