production SAL

2024-11-21
18:49	<cdanis@deploy2002>	cdanis, bvibber: Backport for [[gerrit:1093983|Follow-up fix for Charts enable on commons/test2 (T379689)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)	[production]
18:45	<cdanis@deploy2002>	Started scap sync-world: Backport for [[gerrit:1093983|Follow-up fix for Charts enable on commons/test2 (T379689)]]	[production]
18:43	<gmodena@deploy2002>	Started deploy [analytics/refinery@199401a]: Ad-hoc deployment [analytics/refinery@199401a6]	[production]
18:21	<cdanis@deploy2002>	Finished scap sync-world: Backport for [[gerrit:1091328|Enabling Charts on commons+test2 (T379689)]] (duration: 14m 05s)	[production]
18:16	<jayme@cumin2002>	conftool action : set/pooled=yes; selector: name=kubestage200[34].codfw.wmnet	[production]
18:15	<jayme@cumin2002>	conftool action : set/weight=10; selector: name=kubestage200[34].codfw.wmnet	[production]
18:13	<cdanis@deploy2002>	cdanis, bvibber: Continuing with sync	[production]
18:12	<cdanis@deploy2002>	cdanis, bvibber: Backport for [[gerrit:1091328|Enabling Charts on commons+test2 (T379689)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)	[production]
18:10	<sukhe>	running puppet on A:cp to resolve failed puppet run	[production]
18:10	<sukhe>	sudo cumin -b11 'A:cp' 'run-puppet-agent'	[production]
18:09	<sukhe@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cp2038.codfw.wmnet with reason: DIMM replacement in progress	[production]
18:09	<sukhe@cumin1002>	START - Cookbook sre.hosts.downtime for 1:00:00 on cp2038.codfw.wmnet with reason: DIMM replacement in progress	[production]
18:07	<cdanis@deploy2002>	Started scap sync-world: Backport for [[gerrit:1091328|Enabling Charts on commons+test2 (T379689)]]	[production]
17:58	<sukhe@puppetserver1001>	conftool action : set/pooled=no; selector: name=cp2038.codfw.wmnet [reason: DIMM failure T308459]	[production]
17:45	<jayme@cumin2002>	END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) check for host kubestage2003.codfw.wmnet	[production]
17:45	<jayme@cumin2002>	START - Cookbook sre.k8s.pool-depool-node check for host kubestage2003.codfw.wmnet	[production]
17:40	<andrew@cumin1002>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts clouddb2002-dev.codfw.wmnet	[production]
17:40	<andrew@cumin1002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
17:40	<andrew@cumin1002>	END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: clouddb2002-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1002"	[production]
17:39	<andrew@cumin1002>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: clouddb2002-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin1002"	[production]
17:39	<fabfur>	adding acls to kafka-jumbo cluster (T380373)	[production]
17:36	<andrew@cumin1002>	START - Cookbook sre.dns.netbox	[production]
17:31	<andrew@cumin1002>	START - Cookbook sre.hosts.decommission for hosts clouddb2002-dev.codfw.wmnet	[production]
17:02	<cgoubert@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2157.codfw.wmnet with OS bookworm	[production]
16:54	<sukhe@cumin1002>	END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2013.codfw.wmnet	[production]
16:54	<sukhe@cumin1002>	START - Cookbook sre.hosts.remove-downtime for lvs2013.codfw.wmnet	[production]
16:54	<sukhe>	enable puppet on lvs2013 and start pybal	[production]
16:48	<sukhe@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2013.codfw.wmnet with reason: rebooting	[production]
16:47	<sukhe@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on lvs2013.codfw.wmnet with reason: rebooting	[production]
16:47	<cgoubert@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2140.codfw.wmnet with OS bookworm	[production]
16:47	<cgoubert@cumin1002>	END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - cgoubert@cumin1002"	[production]
16:46	<sukhe@cumin1002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet	[production]
16:46	<cgoubert@cumin1002>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - cgoubert@cumin1002"	[production]
16:43	<sukhe@cumin1002>	START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet	[production]
16:43	<sukhe>	rebooting drained lvs2013	[production]
16:43	<cgoubert@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2157.codfw.wmnet with reason: host reimage	[production]
16:39	<cgoubert@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2157.codfw.wmnet with reason: host reimage	[production]
16:26	<cgoubert@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2140.codfw.wmnet with reason: host reimage	[production]
16:23	<cgoubert@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2140.codfw.wmnet with reason: host reimage	[production]
16:21	<cgoubert@cumin1002>	START - Cookbook sre.hosts.reimage for host wikikube-worker2157.codfw.wmnet with OS bookworm	[production]
16:20	<cgoubert@cumin1002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2157.codfw.wmnet with OS bookworm	[production]
16:13	<sukhe@puppetserver1001>	conftool action : set/pooled=no; selector: name=cluster=dnsbox,dc=magru [reason: testing]	[production]
16:08	<dancy@deploy2002>	Finished scap sync-world: testing (duration: 03m 01s)	[production]
16:05	<dancy@deploy2002>	Started scap sync-world: testing	[production]
16:04	<cgoubert@cumin1002>	START - Cookbook sre.hosts.reimage for host wikikube-worker2140.codfw.wmnet with OS bookworm	[production]
16:03	<cgoubert@cumin1002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-worker2140.codfw.wmnet with OS bookworm	[production]
16:00	<dancy@deploy2002>	Installing scap version "4.127.0" for 209 hosts	[production]
15:39	<kartik@deploy2002>	Finished scap sync-world: Backport for [[gerrit:1093927|Fix layout broken by display:flex on HorizontalLayout (T380471)]], [[gerrit:1093928|Revert "ExperimentUserDefaultsManager: use read latest when retrieving central id"]] (duration: 15m 51s)	[production]
15:34	<gmodena@deploy2002>	Finished deploy [analytics/refinery@358ccf5] (hadoop-test): Ad-hoc deployment TEST [analytics/refinery@358ccf55] (duration: 03m 30s)	[production]
15:33	<kartik@deploy2002>	abi, sgimeno, kartik: Continuing with sync	[production]