production SAL

2021-04-28
16:23	<andrew@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1043.eqiad.wmnet with reason: REIMAGE	[production]
16:22	<andrew@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1041.eqiad.wmnet with reason: REIMAGE	[production]
16:21	<andrew@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1042.eqiad.wmnet with reason: REIMAGE	[production]
16:19	<andrew@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1041.eqiad.wmnet with reason: REIMAGE	[production]
16:19	<pt1979@cumin2001>	END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)	[production]
16:12	<pt1979@cumin2001>	START - Cookbook sre.dns.netbox	[production]
15:25	<hnowlan@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on sessionstore2001.codfw.wmnet with reason: Server relocation	[production]
15:25	<hnowlan@cumin1001>	START - Cookbook sre.hosts.downtime for 0:15:00 on sessionstore2001.codfw.wmnet with reason: Server relocation	[production]
15:24	<jayme@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
15:20	<jayme@cumin1001>	START - Cookbook sre.dns.netbox	[production]
15:19	<jayme@cumin1001>	END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts conf[2001-2003].codfw.wmnet	[production]
15:12	<pt1979@cumin2001>	END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)	[production]
15:09	<hnowlan@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on sessionstore2001.codfw.wmnet with reason: Server relocation	[production]
15:09	<hnowlan@cumin1001>	START - Cookbook sre.hosts.downtime for 0:15:00 on sessionstore2001.codfw.wmnet with reason: Server relocation	[production]
15:03	<pt1979@cumin2001>	START - Cookbook sre.dns.netbox	[production]
15:00	<moritzm>	imported python-poolcounter 0.0.2-1+deb11u1 to apt.wikimedia.org T275873	[production]
14:53	<jayme@cumin1001>	START - Cookbook sre.hosts.decommission for hosts conf[2001-2003].codfw.wmnet	[production]
14:44	<moritzm>	imported gitlab-ce 13.9.7-ce.0 to apt.wikimedia.org	[production]
14:40	<milimetric@deploy1002>	Finished deploy [analytics/refinery@559d98d] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@559d98d] (duration: 04m 59s)	[production]
14:35	<milimetric@deploy1002>	Started deploy [analytics/refinery@559d98d] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@559d98d]	[production]
14:34	<milimetric@deploy1002>	Finished deploy [analytics/refinery@559d98d] (thin): Regular analytics weekly train THIN [analytics/refinery@559d98d] (duration: 00m 06s)	[production]
14:34	<milimetric@deploy1002>	Started deploy [analytics/refinery@559d98d] (thin): Regular analytics weekly train THIN [analytics/refinery@559d98d]	[production]
14:34	<milimetric@deploy1002>	Finished deploy [analytics/refinery@559d98d]: Regular analytics weekly train [analytics/refinery@559d98d] (duration: 03m 07s)	[production]
14:32	<moritzm>	installing iproute2 updates from buster point release	[production]
14:31	<milimetric@deploy1002>	Started deploy [analytics/refinery@559d98d]: Regular analytics weekly train [analytics/refinery@559d98d]	[production]
14:30	<milimetric@deploy1002>	deploy aborted: - (duration: 00m 00s)	[production]
14:30	<milimetric@deploy1002>	Started deploy [analytics/refinery@559d98d]: -	[production]
14:30	<milimetric@deploy1002>	Finished deploy [analytics/refinery@559d98d]: Regular analytics weekly train [analytics/refinery@559d98d] (duration: 12m 31s)	[production]
14:26	<moritzm>	installing net-snmp updates from buster point release	[production]
14:17	<milimetric@deploy1002>	Started deploy [analytics/refinery@559d98d]: Regular analytics weekly train [analytics/refinery@559d98d]	[production]
13:59	<andrew@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1040.eqiad.wmnet with reason: REIMAGE	[production]
13:57	<andrew@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1040.eqiad.wmnet with reason: REIMAGE	[production]
13:15	<jayme>	restarting pybal on lvs5001,lvs4005,lvs2007 - T271573	[production]
13:14	<liw@deploy1002>	rebuilt and synchronized wikiversions files: Revert "group1 wikis to 1.37.0-wmf.3"	[production]
13:10	<jayme>	restarting pybal on lvs5002,lvs4006,lvs2008 - T271573	[production]
13:04	<liw@deploy1002>	Synchronized php: group1 wikis to 1.37.0-wmf.3 (duration: 01m 07s)	[production]
13:03	<jmm@cumin2001>	END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0)	[production]
13:03	<liw@deploy1002>	rebuilt and synchronized wikiversions files: group1 wikis to 1.37.0-wmf.3	[production]
13:02	<moritzm>	upgrading deployment servers to PHP 7.4.32	[production]
12:55	<moritzm>	upgrading snapshot hosts to PHP 7.4.32	[production]
12:48	<jayme>	restarting pybal on lvs2009 - T271573	[production]
12:45	<moritzm>	upgrading labweb to PHP 7.4.32	[production]
12:43	<jmm@cumin2001>	START - Cookbook sre.cassandra.roll-restart	[production]
12:42	<jayme>	restarting pybal on lvs5003,lvs4007 - T271573	[production]
12:39	<jayme>	restarting pybal on lvs2010 - T271573	[production]
12:36	<jmm@cumin2001>	END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0)	[production]
12:28	<apergos>	manually edited /srv/deployment/dumps/dumps-cache/config on snapshots1011,12,13 to change deploy1001 to deploy1002 (where did it get the old value from? these are new installs!)	[production]
12:16	<moritzm>	rolling restart of cassandra in restbase-dev to pick up Java security updates	[production]
12:15	<jmm@cumin2001>	START - Cookbook sre.cassandra.roll-restart	[production]
12:15	<jmm@cumin2001>	END (FAIL) - Cookbook sre.cassandra.roll-restart (exit_code=99)	[production]