production SAL

2023-03-28
09:35	<vgutierrez>	depool cp2035 - T333312	[production]
09:28	<elukey@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kafka-main1001.eqiad.wmnet with reason: stop kafka and dist-upgrade	[production]
09:28	<elukey@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on kafka-main1001.eqiad.wmnet with reason: stop kafka and dist-upgrade	[production]
09:12	<jbond@cumin1001>	END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Nicolas Fraison out of all services on: 2048 hosts	[production]
09:11	<jbond@cumin1001>	START - Cookbook sre.idm.logout Logging Nicolas Fraison out of all services on: 2048 hosts	[production]
09:11	<jbond@cumin1001>	END (ERROR) - Cookbook sre.idm.logout (exit_code=97) Logging Nicolas Fraison out of systemdlogoutd on: 2048 hosts	[production]
09:11	<jbond@cumin1001>	START - Cookbook sre.idm.logout Logging Nicolas Fraison out of systemdlogoutd on: 2048 hosts	[production]
08:58	<vgutierrez>	restart ipmiseld on cp2035	[production]
08:50	<aborrero@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2005-dev.wikimedia.org	[production]
08:49	<ayounsi@deploy1002>	helmfile [eqiad] DONE helmfile.d/admin 'apply'.	[production]
08:48	<AndyRussG>	update payments.wiki config 65bedd4a -> e31ffd7d, payments (automatic updates only) a6c6c2b1 -> f5ec2677	[production]
08:45	<ayounsi@deploy1002>	helmfile [eqiad] START helmfile.d/admin 'apply'.	[production]
08:43	<ayounsi@deploy1002>	helmfile [codfw] DONE helmfile.d/admin 'apply'.	[production]
08:42	<aborrero@cumin2002>	START - Cookbook sre.hosts.reboot-single for host cloudservices2005-dev.wikimedia.org	[production]
08:39	<ayounsi@deploy1002>	helmfile [codfw] START helmfile.d/admin 'apply'.	[production]
08:37	<ayounsi@deploy1002>	helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.	[production]
08:35	<ayounsi@deploy1002>	helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.	[production]
08:34	<ayounsi@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.	[production]
08:32	<ayounsi@deploy1002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.	[production]
08:32	<phedenskog@deploy2002>	Finished deploy [performance/navtiming@e757bdf]: (no justification provided) (duration: 00m 06s)	[production]
08:32	<phedenskog@deploy2002>	Started deploy [performance/navtiming@e757bdf]: (no justification provided)	[production]
08:31	<ayounsi@deploy1002>	helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.	[production]
08:29	<ayounsi@deploy1002>	helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.	[production]
08:25	<ayounsi@deploy1002>	helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.	[production]
08:21	<ayounsi@deploy1002>	helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.	[production]
08:14	<ayounsi@deploy1002>	helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.	[production]
08:11	<oblivian@deploy2002>	Finished scap: Backport for [[gerrit:903209|Failover statsd to graphite2004 (T330165)]] (duration: 08m 48s)	[production]
08:08	<ayounsi@deploy1002>	helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.	[production]
08:06	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on 16 hosts with reason: Switch maintenance	[production]
08:05	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 8:00:00 on 16 hosts with reason: Switch maintenance	[production]
08:05	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on 21 hosts with reason: Switch maintenance	[production]
08:05	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 8:00:00 on 21 hosts with reason: Switch maintenance	[production]
08:04	<oblivian@deploy2002>	oblivian and filippo: Backport for [[gerrit:903209|Failover statsd to graphite2004 (T330165)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet	[production]
08:03	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on es[1020-1022].eqiad.wmnet with reason: Switch maintenance	[production]
08:03	<ayounsi@deploy1002>	helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.	[production]
08:03	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 8:00:00 on es[1020-1022].eqiad.wmnet with reason: Switch maintenance	[production]
08:02	<oblivian@deploy2002>	Started scap: Backport for [[gerrit:903209|Failover statsd to graphite2004 (T330165)]]	[production]
08:02	<ayounsi@deploy1002>	helmfile [staging-eqiad] START helmfile.d/admin 'apply'.	[production]
08:00	<ayounsi@deploy1002>	helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.	[production]
08:00	<godog>	move graphite reads to codfw - T330165	[production]
07:56	<jayme@deploy1002>	helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.	[production]
07:56	<jayme@deploy1002>	helmfile [staging-codfw] START helmfile.d/admin 'apply'.	[production]
07:56	<ayounsi@deploy1002>	helmfile [staging-codfw] START helmfile.d/admin 'apply'.	[production]
07:54	<root@deploy1002>	helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.	[production]
07:54	<root@deploy1002>	helmfile [staging-codfw] START helmfile.d/admin 'apply'.	[production]
07:51	<ayounsi@deploy1002>	helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.	[production]
07:51	<ayounsi@deploy1002>	helmfile [staging-codfw] START helmfile.d/admin 'apply'.	[production]
07:31	<marostegui@cumin1001>	dbctl commit (dc=all): 'db1179 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P45965 and previous config saved to /var/cache/conftool/dbconfig/20230328-073122-root.json	[production]
07:28	<ayounsi@cumin1001>	END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'clear' for AS: 17806	[production]
07:27	<ayounsi@cumin1001>	START - Cookbook sre.network.peering with action 'clear' for AS: 17806	[production]