production SAL

2024-05-14
20:26	<cjming@deploy1002>	Started scap: Backport for [[gerrit:1031029|cirrus: Shift 25% of public wikis writes in eqiad to replacement updater (T363475)]]	[production]
20:24	<cjming@deploy1002>	Finished scap: Backport for [[gerrit:1031495|Enable night mode on Vector on testwiki, disable on Special:Homepage (T357699 T363814)]] (duration: 18m 40s)	[production]
20:14	<ebernhardson@deploy1002>	Finished deploy [airflow-dags/search@ecf603d]: update discolytics to 0.18.0 (duration: 00m 27s)	[production]
20:14	<ebernhardson@deploy1002>	Started deploy [airflow-dags/search@ecf603d]: update discolytics to 0.18.0	[production]
20:11	<cjming@deploy1002>	jdlrobson and cjming: Continuing with sync	[production]
20:08	<cjming@deploy1002>	jdlrobson and cjming: Backport for [[gerrit:1031495|Enable night mode on Vector on testwiki, disable on Special:Homepage (T357699 T363814)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)	[production]
20:08	<cdanis@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/opentelemetry-collector: apply	[production]
20:07	<cdanis@deploy1002>	helmfile [eqiad] START helmfile.d/services/opentelemetry-collector: apply	[production]
20:06	<cdanis@deploy1002>	helmfile [staging] DONE helmfile.d/services/opentelemetry-collector: apply	[production]
20:06	<vriley@cumin1002>	END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
20:06	<cdanis@deploy1002>	helmfile [staging] START helmfile.d/services/opentelemetry-collector: apply	[production]
20:05	<cjming@deploy1002>	Started scap: Backport for [[gerrit:1031495|Enable night mode on Vector on testwiki, disable on Special:Homepage (T357699 T363814)]]	[production]
20:04	<cdanis@deploy1002>	helmfile [staging] DONE helmfile.d/services/opentelemetry-collector: apply	[production]
20:04	<cdanis@deploy1002>	helmfile [staging] START helmfile.d/services/opentelemetry-collector: apply	[production]
20:01	<jclark@cumin1002>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"	[production]
19:53	<cdanis@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/opentelemetry-collector: apply	[production]
19:53	<cdanis@deploy1002>	helmfile [eqiad] START helmfile.d/services/opentelemetry-collector: apply	[production]
19:47	<vriley@cumin1002>	START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
19:47	<vriley@cumin1002>	END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
19:47	<vriley@cumin1002>	START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
19:46	<vriley@cumin1002>	END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
19:45	<jclark@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1006.eqiad.wmnet with reason: host reimage	[production]
19:41	<jclark@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1006.eqiad.wmnet with reason: host reimage	[production]
19:39	<vriley@cumin1002>	START - Cookbook sre.hosts.provision for host kafka-main1010.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
19:38	<vriley@cumin1002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
19:38	<vriley@cumin1002>	END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt kafka-main1010 - vriley@cumin1002"	[production]
19:37	<vriley@cumin1002>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt kafka-main1010 - vriley@cumin1002"	[production]
19:32	<vriley@cumin1002>	START - Cookbook sre.dns.netbox	[production]
19:30	<vriley@cumin1002>	END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-main1008.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
19:26	<jclark@cumin1002>	START - Cookbook sre.hosts.reimage for host kafka-main1006.eqiad.wmnet with OS bullseye	[production]
19:25	<vriley@cumin1002>	END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['kafka-main1006']	[production]
19:23	<vriley@cumin1002>	START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-main1006']	[production]
19:19	<vriley@cumin1002>	START - Cookbook sre.hosts.provision for host kafka-main1009.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
19:18	<vriley@cumin1002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
19:18	<cdanis>	T364907 💔cdanis@apt1002.wikimedia.org ~ 🕞🍵 sudo -i reprepro --keepunreferencedfiles includedeb bullseye-wikimedia ~/otelcol-contrib_0.100.0_linux_amd64.deb	[production]
19:18	<vriley@cumin1002>	START - Cookbook sre.hosts.provision for host kafka-main1008.mgmt.eqiad.wmnet with reboot policy FORCED	[production]
19:17	<vriley@cumin1002>	START - Cookbook sre.dns.netbox	[production]
19:16	<vriley@cumin1002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
19:16	<vriley@cumin1002>	END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt kafka-main1008 - vriley@cumin1002"	[production]
19:16	<vriley@cumin1002>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt kafka-main1008 - vriley@cumin1002"	[production]
19:13	<vriley@cumin1002>	START - Cookbook sre.dns.netbox	[production]
18:18	<sukhe>	restart pybal on backup LVSes	[production]
18:17	<sukhe>	[CORRECTION] above pybal restart was NOT run	[production]
18:15	<amastilovic@deploy1002>	Finished deploy [airflow-dags/analytics@6270c72]: (no justification provided) (duration: 00m 34s)	[production]
18:14	<amastilovic@deploy1002>	Started deploy [airflow-dags/analytics@6270c72]: (no justification provided)	[production]
18:10	<sukhe>	sudo cumin -b1 -s120 'A:lvs' 'systemctl restart pybal.service': clearing up alert for reverted pybal.conf CR 1031470	[production]
17:47	<ejegg>	donorwiki upgraded from b005071a to fa7de70f	[production]
17:33	<ryankemper@cumin2002>	END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons.	[production]
17:27	<ryankemper@cumin2002>	START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons.	[production]
17:25	<ryankemper@cumin2002>	END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-public cluster: Roll restart of jvm daemons.	[production]