production SAL

2023-07-07
11:04	<aborrero@cumin1001>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: wikimediacloud - aborrero@cumin1001"	[production]
11:02	<aborrero@cumin1001>	START - Cookbook sre.dns.netbox	[production]
10:13	<moritzm>	rebooting puppetdb1003	[production]
10:09	<moritzm>	rebooting puppetserver1001	[production]
10:06	<jmm@cumin2002>	END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host puppetdb2003.codfw.wmnet	[production]
10:05	<moritzm>	rebooting puppetserver2001	[production]
10:05	<jiji@deploy1002>	helmfile [staging] DONE helmfile.d/services/ipoid: apply	[production]
10:03	<jiji@deploy1002>	helmfile [staging] START helmfile.d/services/ipoid: apply	[production]
09:59	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1002.eqiad.wmnet	[production]
09:55	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host puppetdb2003.codfw.wmnet	[production]
09:55	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host netflow1002.eqiad.wmnet	[production]
09:52	<jmm@cumin2002>	END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host debmonitor2003.codfw.wmnet	[production]
09:52	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2003.codfw.wmnet	[production]
09:46	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host netflow2003.codfw.wmnet	[production]
09:46	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host debmonitor2003.codfw.wmnet	[production]
09:45	<stevemunene@cumin1001>	END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) restart masters for Hadoop analytics cluster: Restart of jvm daemons.	[production]
09:39	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor1003.eqiad.wmnet	[production]
09:37	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow6001.drmrs.wmnet	[production]
09:35	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host debmonitor1003.eqiad.wmnet	[production]
09:34	<jmm@cumin2002>	END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host lists1003.wikimedia.org	[production]
09:33	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host netflow6001.drmrs.wmnet	[production]
09:29	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow6001.drmrs.wmnet	[production]
09:29	<stevemunene@cumin1001>	START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop analytics cluster: Restart of jvm daemons.	[production]
09:26	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host netflow6001.drmrs.wmnet	[production]
09:24	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow3002.esams.wmnet	[production]
09:24	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host lists1003.wikimedia.org	[production]
09:20	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people1004.eqiad.wmnet	[production]
09:19	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host people1004.eqiad.wmnet	[production]
09:19	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host netflow3002.esams.wmnet	[production]
09:18	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow5002.eqsin.wmnet	[production]
09:17	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2003.codfw.wmnet	[production]
09:13	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host people2003.codfw.wmnet	[production]
09:12	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host netflow5002.eqsin.wmnet	[production]
08:53	<btullis@deploy1002>	helmfile [staging] DONE helmfile.d/services/datahub: sync on main	[production]
08:50	<btullis@deploy1002>	helmfile [staging] START helmfile.d/services/datahub: apply on main	[production]
08:48	<moritzm>	installing bookworm kernel updates	[production]
08:47	<jmm@cumin2002>	END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: xhgui2002.codfw.wmnet	[production]
08:47	<jmm@cumin2002>	START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: xhgui2002.codfw.wmnet	[production]
08:46	<jmm@cumin2002>	END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: xhgui1002.eqiad.wmnet	[production]
08:46	<jmm@cumin2002>	START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: xhgui1002.eqiad.wmnet	[production]
08:05	<elukey@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on kafka-test[1006-1010].eqiad.wmnet with reason: resetting cluster	[production]
08:05	<elukey@cumin1001>	START - Cookbook sre.hosts.downtime for 0:30:00 on kafka-test[1006-1010].eqiad.wmnet with reason: resetting cluster	[production]
01:55	<bking@cumin1001>	END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)	[production]
00:28	<bking@cumin1001>	START - Cookbook sre.wdqs.data-transfer	[production]
2023-07-06
23:14	<mutante>	mx1001 - rm /usr/local/bin/otrs_aliases ; rm /lib/systemd/system/generate_otrs_aliases.* after deploying gerrit:932316 which renamed script and timer without absenting them	[production]
23:08	<mutante>	mx2001 - rm /usr/local/bin/otrs_aliases ; rm /lib/systemd/system/generate_otrs_aliases.* after deploying gerrit:932316 which renamed script and timer without absenting them	[production]
21:12	<thcipriani@deploy1002>	Finished scap: Clean up font directory [[gerrit:723652]] (duration: 06m 33s)	[production]
21:10	<bking@deploy1002>	Finished deploy [wdqs/wdqs@dff41b7]: 0.3.124 (duration: 14m 56s)	[production]
21:06	<thcipriani@deploy1002>	Started scap: Clean up font directory [[gerrit:723652]]	[production]
21:04	<thcipriani@deploy1002>	Finished scap: Backport for [[gerrit:936084|pawikibooks: Install Quiz extension (T340613)]] (duration: 12m 19s)	[production]