production SAL

2021-04-27
19:44	<dzahn@cumin1001>	END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host people1003.eqiad.wmnet	[production]
19:37	<herron@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2004.codfw.wmnet with reason: REIMAGE	[production]
19:35	<papaul>	powerdown ms-backup2001 for maintenance	[production]
19:35	<herron@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2004.codfw.wmnet with reason: REIMAGE	[production]
19:07	<papaul>	powerdown logstash2035 for maintenance	[production]
19:03	<dzahn@cumin1001>	START - Cookbook sre.ganeti.makevm for new host people1003.eqiad.wmnet	[production]
19:00	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts people1003.eqiad.wmnet	[production]
18:50	<mutante>	people1003 - destroying VM and recreating again from scratch to test if issue of no console and no access is repeatable	[production]
18:50	<dzahn@cumin1001>	START - Cookbook sre.hosts.decommission for hosts people1003.eqiad.wmnet	[production]
18:37	<herron@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1005.eqiad.wmnet with reason: REIMAGE	[production]
18:35	<herron@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1005.eqiad.wmnet with reason: REIMAGE	[production]
18:33	<mutante>	people1003 - rebooting, trying to get new VM to work	[production]
18:33	<Urbanecm>	Morning B&C window done	[production]
18:32	<urbanecm@deploy1002>	Synchronized wmf-config/InitialiseSettings.php: 91a85f2: ac770bf: Enable language in header for office and testwiki users (T280526) (duration: 01m 19s)	[production]
18:32	<bblack>	lvs2009 - restart pybal + re-run puppet agent - T279457	[production]
18:23	<robh@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
18:20	<bblack@cumin1001>	conftool action : set/pooled=yes; selector: name=cp203[56].codfw.wmnet	[production]
18:20	<bblack>	cp203[56] - repooling in etcd - T279457	[production]
18:19	<robh@cumin1001>	START - Cookbook sre.dns.netbox	[production]
18:17	<robh@cumin1001>	END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)	[production]
18:17	<robh@cumin1001>	START - Cookbook sre.dns.netbox	[production]
18:16	<robh@cumin1001>	END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)	[production]
18:12	<robh@cumin1001>	START - Cookbook sre.dns.netbox	[production]
18:11	<bblack>	dns2001 - restarting bird to repool, then re-enabling puppet - T279457	[production]
18:04	<pt1979@cumin2001>	END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)	[production]
18:02	<pt1979@cumin2001>	START - Cookbook sre.dns.netbox	[production]
18:02	<ejegg>	update payments-wiki from 9a4eef1375 to 44570561f2	[production]
18:00	<herron@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1004.eqiad.wmnet with reason: REIMAGE	[production]
17:58	<herron@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1004.eqiad.wmnet with reason: REIMAGE	[production]
17:34	<papaul>	powerdown moss-fe2001 for maintenance	[production]
17:32	<robh@cumin1001>	END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)	[production]
17:29	<robh@cumin1001>	START - Cookbook sre.dns.netbox	[production]
17:25	<mbsantos@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'mobileapps' for release 'production' .	[production]
17:23	<mbsantos@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'mobileapps' for release 'production' .	[production]
17:21	<mbsantos@deploy1002>	helmfile [staging] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .	[production]
17:19	<ryankemper>	T281215 Banned `elastic2043` from codfw cirrussearch cluster	[production]
17:16	<mbsantos@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'proton' for release 'production' .	[production]
17:14	<papaul>	powerdown kafka-logging2003 for maintenance	[production]
17:14	<mbsantos@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'proton' for release 'production' .	[production]
17:10	<mbsantos@deploy1002>	helmfile [staging] Ran 'sync' command on namespace 'proton' for release 'production' .	[production]
17:09	<mbsantos@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .	[production]
17:07	<mbsantos@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .	[production]
17:04	<mbsantos@deploy1002>	helmfile [staging] Ran 'sync' command on namespace 'wikifeeds' for release 'staging' .	[production]
16:52	<papaul>	powerdown elastic2045 for maintenance	[production]
16:49	<papaul>	powerdown ms-be2042 for maintenance	[production]
16:39	<dcaro>	reprepro updating packages on thirdparty/ceph-nautilus-buster	[production]
16:34	<pt1979@cumin2001>	END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)	[production]
16:29	<pt1979@cumin2001>	START - Cookbook sre.dns.netbox	[production]
16:23	<andrew@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 39 hosts with reason: upgrading openstack	[production]
16:23	<andrew@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on 39 hosts with reason: upgrading openstack	[production]