production SAL

2025-11-21
14:42	<ladsgroup@cumin1003>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1159.eqiad.wmnet with reason: Maintenance	[production]
14:35	<bking@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1008.eqiad.wmnet with OS bookworm	[production]
14:30	<bking@cumin2002>	START - Cookbook sre.hosts.reimage for host relforge1009.eqiad.wmnet with OS bookworm	[production]
14:27	<bking@cumin2002>	START - Cookbook sre.hosts.reimage for host relforge1008.eqiad.wmnet with OS bookworm	[production]
14:26	<jmm@cumin2002>	DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging West1 out of all services on: 2410 hosts	[production]
14:25	<sukhe>	homer "creqsin" commit "bring up hcaptcha-proxy500[12]": T409780	[production]
14:25	<bking@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1009.eqiad.wmnet with OS bookworm	[production]
14:25	<ladsgroup@cumin1003>	dbctl commit (dc=all): 'Repool pc8 (T405942)', diff saved to https://phabricator.wikimedia.org/P85440 and previous config saved to /var/cache/conftool/dbconfig/20251121-142500-ladsgroup.json	[production]
14:24	<bking@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host relforge1008.eqiad.wmnet with OS bookworm	[production]
14:23	<sukhe>	homer "crulsfo" commit "bring up hcaptcha-proxy400[12]": T409780	[production]
14:21	<ladsgroup@cumin1003>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1018.eqiad.wmnet with reason: Maint	[production]
14:21	<ladsgroup@cumin1003>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc2018.codfw.wmnet with reason: Maint	[production]
14:21	<ladsgroup@cumin1003>	dbctl commit (dc=all): 'Depool pc8 (T405942)', diff saved to https://phabricator.wikimedia.org/P85439 and previous config saved to /var/cache/conftool/dbconfig/20251121-142059-ladsgroup.json	[production]
14:18	<sukhe>	homer "crcodfw" commit "bring up hcaptcha-proxy200[12]": T409780	[production]
14:17	<ladsgroup@cumin1003>	dbctl commit (dc=all): 'Repool pc7 (T405942)', diff saved to https://phabricator.wikimedia.org/P85438 and previous config saved to /var/cache/conftool/dbconfig/20251121-141747-ladsgroup.json	[production]
14:14	<ladsgroup@cumin1003>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on pc2017.codfw.wmnet with reason: Maint	[production]
14:14	<ladsgroup@cumin1003>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on pc1017.eqiad.wmnet with reason: Maint	[production]
14:13	<sukhe>	homer "creqiad" commit "bring up hcaptcha-proxy100[12]": T409780	[production]
14:13	<ladsgroup@cumin1003>	dbctl commit (dc=all): 'Depool pc7 (T405942)', diff saved to https://phabricator.wikimedia.org/P85437 and previous config saved to /var/cache/conftool/dbconfig/20251121-141345-ladsgroup.json	[production]
14:09	<ladsgroup@cumin1003>	dbctl commit (dc=all): 'Repool pc6 (T405942)', diff saved to https://phabricator.wikimedia.org/P85436 and previous config saved to /var/cache/conftool/dbconfig/20251121-140903-ladsgroup.json	[production]
14:05	<ladsgroup@cumin1003>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on pc2016.codfw.wmnet with reason: Maint	[production]
14:05	<ladsgroup@cumin1003>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on pc1016.eqiad.wmnet with reason: Maint	[production]
14:03	<ladsgroup@cumin1003>	dbctl commit (dc=all): 'Depool pc6 (T405942)', diff saved to https://phabricator.wikimedia.org/P85435 and previous config saved to /var/cache/conftool/dbconfig/20251121-140327-ladsgroup.json	[production]
13:52	<ayounsi@cumin1003>	END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox	[production]
13:52	<ayounsi@cumin1003>	START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox	[production]
13:24	<mwpresync@deploy2002>	Started scap build-images: Publishing wmf/next image	[production]
13:19	<btullis@cumin1003>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1007.eqiad.wmnet	[production]
13:12	<btullis@cumin1003>	START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1007.eqiad.wmnet	[production]
12:26	<cmooney@cumin1003>	END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: Homer release v0.11.0 minor update - cmooney@cumin1003	[production]
12:24	<cmooney@cumin1003>	START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1003.eqiad.wmnet with reason: Homer release v0.11.0 minor update - cmooney@cumin1003	[production]
10:42	<ayounsi@cumin1003>	START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1005.eqiad.wmnet	[production]
10:19	<jnuche@deploy2002>	Finished deploy [releng/jenkins-deploy@a809ec3] (releasing): T410680 (duration: 02m 13s)	[production]
10:16	<jnuche@deploy2002>	Started deploy [releng/jenkins-deploy@a809ec3] (releasing): T410680	[production]
09:45	<dpogorzelski@deploy2002>	helmfile [staging] DONE helmfile.d/services/changeprop: sync	[production]
09:45	<dpogorzelski@deploy2002>	helmfile [staging] START helmfile.d/services/changeprop: sync	[production]
09:37	<bwojtowicz@deploy2002>	helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .	[production]
09:34	<bwojtowicz@deploy2002>	helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .	[production]
09:16	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cumin1002.eqiad.wmnet	[production]
09:16	<jmm@cumin2002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
09:16	<jmm@cumin2002>	END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cumin1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"	[production]
09:13	<jmm@cumin2002>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cumin1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"	[production]
08:46	<jnuche@deploy2002>	Finished deploy [releng/jenkins-deploy@f3216ec] (releasing): testing issue with instance (duration: 01m 48s)	[production]
08:44	<jnuche@deploy2002>	Started deploy [releng/jenkins-deploy@f3216ec] (releasing): testing issue with instance	[production]
08:41	<jmm@cumin2002>	START - Cookbook sre.dns.netbox	[production]
08:36	<jmm@cumin2002>	START - Cookbook sre.hosts.decommission for hosts cumin1002.eqiad.wmnet	[production]
07:58	<ladsgroup@cumin1003>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance	[production]
07:51	<bwojtowicz@deploy2002>	helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .	[production]
07:50	<bwojtowicz@deploy2002>	helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .	[production]
04:18	<ryankemper@cumin2002>	END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (2 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reboot (apply updates) - ryankemper@cumin2002 - T390860	[production]
02:09	<ejegg>	payments-wiki upgraded from 40f6f252 to 2a73a08d	[production]