production SAL

2023-05-03
11:01	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbproxy[2001-2004].codfw.wmnet with reason: Reboot T335845	[production]
11:00	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbproxy[2001-2004].codfw.wmnet with reason: Reboot T335845	[production]
10:57	<cgoubert@cumin1001>	START - Cookbook sre.hosts.reboot-single for host kubestagemaster1001.eqiad.wmnet	[production]
10:56	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2123 (T335838)', diff saved to https://phabricator.wikimedia.org/P47343 and previous config saved to /var/cache/conftool/dbconfig/20230503-105639-ladsgroup.json	[production]
10:52	<cgoubert@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/recommendation-api: apply	[production]
10:51	<cgoubert@deploy1002>	helmfile [eqiad] START helmfile.d/services/recommendation-api: apply	[production]
10:51	<claime>	Migrating recommendation-api eqiad to mw-api-int-async - T334062	[production]
10:50	<marostegui@cumin1001>	dbctl commit (dc=all): 'db2124 (re)pooling @ 50%: Repooling after migrating', diff saved to https://phabricator.wikimedia.org/P47342 and previous config saved to /var/cache/conftool/dbconfig/20230503-105028-root.json	[production]
10:50	<cgoubert@deploy1002>	helmfile [codfw] DONE helmfile.d/services/recommendation-api: apply	[production]
10:50	<claime>	Migrating recommendation-api codfw to mw-api-int-async - T334062	[production]
10:50	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db2123 (T335838)', diff saved to https://phabricator.wikimedia.org/P47341 and previous config saved to /var/cache/conftool/dbconfig/20230503-105004-ladsgroup.json	[production]
10:49	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance	[production]
10:49	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance	[production]
10:49	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2111 (T335838)', diff saved to https://phabricator.wikimedia.org/P47340 and previous config saved to /var/cache/conftool/dbconfig/20230503-104939-ladsgroup.json	[production]
10:49	<cgoubert@deploy1002>	helmfile [codfw] START helmfile.d/services/recommendation-api: apply	[production]
10:48	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2117', diff saved to https://phabricator.wikimedia.org/P47339 and previous config saved to /var/cache/conftool/dbconfig/20230503-104851-ladsgroup.json	[production]
10:47	<cgoubert@deploy1002>	helmfile [staging] DONE helmfile.d/services/recommendation-api: apply	[production]
10:45	<cgoubert@cumin1001>	END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host kubestagemaster2001.codfw.wmnet	[production]
10:45	<cgoubert@deploy1002>	helmfile [staging] START helmfile.d/services/recommendation-api: apply	[production]
10:45	<filippo@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host graphite1005.eqiad.wmnet	[production]
10:45	<filippo@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-fe2004.codfw.wmnet	[production]
10:41	<volans@cumin1001>	END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) netbox to netbox-dev2002.codfw.wmnet with reason: Release v3.2.9-wmf2 to netbox-next - volans@cumin1001	[production]
10:40	<claime>	Migrating recommendation-api staging to mw-api-int-async - T334062	[production]
10:39	<filippo@cumin1001>	START - Cookbook sre.hosts.reboot-single for host thanos-fe2004.codfw.wmnet	[production]
10:38	<filippo@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-fe1004.eqiad.wmnet	[production]
10:38	<filippo@cumin1001>	START - Cookbook sre.hosts.reboot-single for host graphite1005.eqiad.wmnet	[production]
10:35	<lucaswerkmeister-wmde@deploy1002>	Finished scap: Backport for [[gerrit:914297|wblistentityusage: Deprecate wbeu prefix, new output format (T300460 T196962)]] (duration: 34m 53s)	[production]
10:35	<marostegui@cumin1001>	dbctl commit (dc=all): 'db2124 (re)pooling @ 25%: Repooling after migrating', diff saved to https://phabricator.wikimedia.org/P47338 and previous config saved to /var/cache/conftool/dbconfig/20230503-103523-root.json	[production]
10:35	<volans@cumin1001>	START - Cookbook sre.deploy.python-code netbox to netbox-dev2002.codfw.wmnet with reason: Release v3.2.9-wmf2 to netbox-next - volans@cumin1001	[production]
10:34	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P47337 and previous config saved to /var/cache/conftool/dbconfig/20230503-103433-ladsgroup.json	[production]
10:34	<cgoubert@cumin1001>	START - Cookbook sre.hosts.reboot-single for host kubestagemaster2001.codfw.wmnet	[production]
10:33	<cgoubert@cumin1001>	END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host kubestagemaster2001.codfw.wmnet	[production]
10:33	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2117 (T335838)', diff saved to https://phabricator.wikimedia.org/P47336 and previous config saved to /var/cache/conftool/dbconfig/20230503-103345-ladsgroup.json	[production]
10:33	<cgoubert@cumin1001>	START - Cookbook sre.hosts.reboot-single for host kubestagemaster2001.codfw.wmnet	[production]
10:32	<filippo@cumin1001>	START - Cookbook sre.hosts.reboot-single for host thanos-fe1004.eqiad.wmnet	[production]
10:27	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db2117 (T335838)', diff saved to https://phabricator.wikimedia.org/P47335 and previous config saved to /var/cache/conftool/dbconfig/20230503-102719-ladsgroup.json	[production]
10:27	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2117.codfw.wmnet with reason: Maintenance	[production]
10:27	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2117.codfw.wmnet with reason: Maintenance	[production]
10:26	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2114 (T335838)', diff saved to https://phabricator.wikimedia.org/P47334 and previous config saved to /var/cache/conftool/dbconfig/20230503-102654-ladsgroup.json	[production]
10:25	<jelto@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2003.wikimedia.org	[production]
10:21	<akosiaris@deploy1002>	helmfile [staging] DONE helmfile.d/services/machinetranslation: apply	[production]
10:20	<marostegui@cumin1001>	dbctl commit (dc=all): 'db2124 (re)pooling @ 10%: Repooling after migrating', diff saved to https://phabricator.wikimedia.org/P47333 and previous config saved to /var/cache/conftool/dbconfig/20230503-102018-root.json	[production]
10:19	<jelto@cumin1001>	START - Cookbook sre.hosts.reboot-single for host gitlab2003.wikimedia.org	[production]
10:19	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P47332 and previous config saved to /var/cache/conftool/dbconfig/20230503-101926-ladsgroup.json	[production]
10:18	<ayounsi@cumin1001>	END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling update on A:netbox	[production]
10:18	<ayounsi@cumin1001>	START - Cookbook sre.netbox.update-extras rolling update on A:netbox	[production]
10:18	<lucaswerkmeister-wmde@deploy1002>	lucaswerkmeister-wmde and migr: Backport for [[gerrit:914297|wblistentityusage: Deprecate wbeu prefix, new output format (T300460 T196962)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet	[production]
10:18	<jelto@cumin1001>	END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host gitlab2003.wikimedia.org	[production]
10:17	<filippo@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host arclamp2001.codfw.wmnet	[production]
10:16	<eoghan@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on aphlict1001.eqiad.wmnet with reason: aphlict1002 is now active	[production]