production SAL

2021-05-11
19:37	<mforns@deploy1002>	Started deploy [analytics/refinery@7e0598d] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@7e0598d3f0805bf3dda4e01b637d95c16a6a668b]	[production]
19:33	<dancy@deploy1002>	rebuilt and synchronized wikiversions files: group0 wikis to 1.37.0-wmf.5	[production]
19:29	<mforns@deploy1002>	Finished deploy [analytics/refinery@7e0598d] (thin): Regular analytics weekly train THIN [analytics/refinery@7e0598d3f0805bf3dda4e01b637d95c16a6a668b] (duration: 00m 07s)	[production]
19:29	<mforns@deploy1002>	Started deploy [analytics/refinery@7e0598d] (thin): Regular analytics weekly train THIN [analytics/refinery@7e0598d3f0805bf3dda4e01b637d95c16a6a668b]	[production]
19:28	<mforns@deploy1002>	Finished deploy [analytics/refinery@7e0598d]: Regular analytics weekly train [analytics/refinery@7e0598d3f0805bf3dda4e01b637d95c16a6a668b] (duration: 45m 45s)	[production]
18:54	<herron@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logstash1011.eqiad.wmnet with reason: REIMAGE	[production]
18:53	<otto@deploy1002>	Synchronized wmf-config/InitialiseSettings.php: Migrate VirtualPageView to EventPlatform on testwiki - T238138 (duration: 01m 09s)	[production]
18:52	<herron@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on logstash1011.eqiad.wmnet with reason: REIMAGE	[production]
18:43	<mforns@deploy1002>	Started deploy [analytics/refinery@7e0598d]: Regular analytics weekly train [analytics/refinery@7e0598d3f0805bf3dda4e01b637d95c16a6a668b]	[production]
18:20	<dancy@deploy1002>	Finished scap: testwikis wikis to 1.37.0-wmf.5 (duration: 09m 43s)	[production]
18:10	<dancy@deploy1002>	Started scap: testwikis wikis to 1.37.0-wmf.5	[production]
17:36	<andrew@deploy1002>	Finished deploy [horizon/deploy@acc3c68]: testing default policy deployment in codfw1dev (again) (duration: 01m 25s)	[production]
17:35	<herron@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logstash1010.eqiad.wmnet with reason: REIMAGE	[production]
17:35	<andrew@deploy1002>	Started deploy [horizon/deploy@acc3c68]: testing default policy deployment in codfw1dev (again)	[production]
17:34	<andrew@deploy1002>	Finished deploy [horizon/deploy@acc3c68]: testing default policy deployment in codfw1dev (again) (duration: 02m 27s)	[production]
17:33	<herron@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on logstash1010.eqiad.wmnet with reason: REIMAGE	[production]
17:32	<andrew@deploy1002>	Started deploy [horizon/deploy@acc3c68]: testing default policy deployment in codfw1dev (again)	[production]
17:31	<andrew@deploy1002>	Finished deploy [horizon/deploy@2604d7b]: testing default policy deployment in codfw1dev (duration: 01m 59s)	[production]
17:29	<andrew@deploy1002>	Started deploy [horizon/deploy@2604d7b]: testing default policy deployment in codfw1dev	[production]
17:20	<mutante>	the backend for people.wikimedia.org switched from people1002 to people1003, the people.wikimedia.org CNAME has been updated. MOTD is about to be updated to inform users.	[production]
17:18	<legoktm>	disabled pipermail redirects on lists.wikimedia.org	[production]
17:07	<dancy@deploy1002>	scap failed: average error rate on 9/9 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/83629bcb5560d11e61d3085c89dd9ed6 for details)	[production]
16:12	<jynus>	restarting bacula-dir on backup1001, stuck process	[production]
15:59	<dancy@deploy1002>	rebuilt and synchronized wikiversions files: (no justification provided)	[production]
15:58	<herron@cumin1001>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mwlog1001.eqiad.wmnet	[production]
15:55	<bstorm>	restart haproxy on dbproxy1018/9 to remove old config	[production]
15:47	<herron@cumin1001>	START - Cookbook sre.hosts.decommission for hosts mwlog1001.eqiad.wmnet	[production]
15:38	<herron@cumin1001>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mwlog2001.codfw.wmnet	[production]
15:37	<dancy@deploy1002>	scap failed: average error rate on 9/9 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/83629bcb5560d11e61d3085c89dd9ed6 for details)	[production]
15:36	<dancy@deploy1002>	sync-world aborted: testwikis wikis to 1.37.0-wmf.4 (duration: 02m 04s)	[production]
15:34	<dancy@deploy1002>	Started scap: testwikis wikis to 1.37.0-wmf.4	[production]
15:33	<cmjohnson@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
15:31	<dancy@deploy1002>	scap failed: RuntimeError scap failed: average error rate on 9/9 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/83629bcb5560d11e61d3085c89dd9ed6 for details) (duration: 17m 36s)	[production]
15:31	<dancy@deploy1002>	scap failed: average error rate on 9/9 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/83629bcb5560d11e61d3085c89dd9ed6 for details)	[production]
15:27	<herron@cumin1001>	START - Cookbook sre.hosts.decommission for hosts mwlog2001.codfw.wmnet	[production]
15:24	<cmjohnson@cumin1001>	START - Cookbook sre.dns.netbox	[production]
15:13	<dancy@deploy1002>	Started scap: testwikis wikis to 1.37.0-wmf.5	[production]
15:03	<cmjohnson@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
15:01	<cmjohnson@cumin1001>	START - Cookbook sre.dns.netbox	[production]
14:59	<herron@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logstash1010.eqiad.wmnet with reason: REIMAGE	[production]
14:57	<herron@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on logstash1010.eqiad.wmnet with reason: REIMAGE	[production]
14:49	<moritzm>	installing busybox security updates	[production]
14:38	<cmjohnson@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
14:31	<cmjohnson@cumin1001>	START - Cookbook sre.dns.netbox	[production]
14:29	<cmjohnson@cumin1001>	END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)	[production]
14:27	<moritzm>	installing cgal security updates	[production]
14:26	<cmjohnson@cumin1001>	START - Cookbook sre.dns.netbox	[production]
14:14	<cmjohnson@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
14:14	<hashar>	Restarted CI Jenkins with a snapshot of the Gearman Jenkins plugin # T281737	[production]
14:10	<hashar>	Restarted CI Jenkins for plugin upgrade # T282433	[production]