production SAL

2020-12-11
19:28	<razzi@cumin1001>	START - Cookbook sre.ganeti.makevm	[production]
19:18	<razzi@cumin1001>	END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97)	[production]
18:47	<razzi@cumin1001>	END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)	[production]
18:30	<elukey@cumin1001>	END (PASS) - Cookbook sre.hadoop.upgrade-bigtop-distro (exit_code=0) for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001	[production]
18:19	<elukey@cumin1001>	START - Cookbook sre.hadoop.upgrade-bigtop-distro for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001	[production]
18:19	<elukey@cumin1001>	END (PASS) - Cookbook sre.hadoop.stop-cluster (exit_code=0) for Hadoop test cluster: Stop the Hadoop cluster before maintenance. - elukey@cumin1001	[production]
18:13	<elukey@cumin1001>	START - Cookbook sre.hadoop.stop-cluster for Hadoop test cluster: Stop the Hadoop cluster before maintenance. - elukey@cumin1001	[production]
18:13	<mutante>	doc1001 restarted apache2 just in case after DOC_PATH change	[production]
17:53	<razzi@cumin1001>	START - Cookbook sre.hosts.decommission	[production]
17:52	<razzi@cumin1001>	START - Cookbook sre.ganeti.makevm	[production]
17:48	<elukey@cumin1001>	END (FAIL) - Cookbook sre.hadoop.upgrade-bigtop-distro (exit_code=99) for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001	[production]
17:41	<elukey@cumin1001>	START - Cookbook sre.hadoop.upgrade-bigtop-distro for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001	[production]
16:40	<elukey@cumin1001>	END (FAIL) - Cookbook sre.hadoop.upgrade-bigtop-distro (exit_code=99) for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001	[production]
16:28	<elukey@cumin1001>	START - Cookbook sre.hadoop.upgrade-bigtop-distro for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001	[production]
16:15	<elukey@cumin1001>	END (PASS) - Cookbook sre.hadoop.stop-cluster (exit_code=0) for Hadoop test cluster: Stop the Hadoop cluster before maintenance. - elukey@cumin1001	[production]
16:10	<elukey@cumin1001>	START - Cookbook sre.hadoop.stop-cluster for Hadoop test cluster: Stop the Hadoop cluster before maintenance. - elukey@cumin1001	[production]
15:35	<jbond@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1002.eqiad.wmnet with reason: REIMAGE	[production]
15:33	<jbond@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1002.eqiad.wmnet with reason: REIMAGE	[production]
15:20	<elukey@cumin1001>	END (FAIL) - Cookbook sre.hadoop.upgrade-bigtop-distro (exit_code=99) for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001	[production]
15:15	<jbond@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1001.eqiad.wmnet with reason: REIMAGE	[production]
15:12	<jbond@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1001.eqiad.wmnet with reason: REIMAGE	[production]
15:10	<jayme@deploy1001>	helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.	[production]
15:06	<elukey@cumin1001>	START - Cookbook sre.hadoop.upgrade-bigtop-distro for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001	[production]
14:59	<jayme@deploy1001>	helmfile [staging-codfw] START helmfile.d/admin 'sync'.	[production]
14:45	<jayme@deploy1001>	helmfile [staging-codfw] START helmfile.d/admin 'sync'.	[production]
14:30	<jbond@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1001.eqiad.wmnet with reason: REIMAGE	[production]
14:28	<jbond@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1001.eqiad.wmnet with reason: REIMAGE	[production]
14:26	<elukey@cumin1001>	END (FAIL) - Cookbook sre.hadoop.upgrade-bigtop-distro (exit_code=99) for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001	[production]
14:23	<elukey@cumin1001>	START - Cookbook sre.hadoop.upgrade-bigtop-distro for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001	[production]
14:16	<elukey@cumin1001>	END (FAIL) - Cookbook sre.hadoop.upgrade-bigtop-distro (exit_code=99) for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001	[production]
14:04	<elukey@cumin1001>	START - Cookbook sre.hadoop.upgrade-bigtop-distro for Hadoop test cluster: Change Hadoop distribution - elukey@cumin1001	[production]
14:03	<elukey@cumin1001>	END (PASS) - Cookbook sre.hadoop.stop-cluster (exit_code=0) for Hadoop test cluster: Stop the Hadoop cluster before maintenance. - elukey@cumin1001	[production]
14:00	<jbond@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1001.eqiad.wmnet with reason: REIMAGE	[production]
13:58	<jbond@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1001.eqiad.wmnet with reason: REIMAGE	[production]
13:57	<elukey@cumin1001>	START - Cookbook sre.hadoop.stop-cluster for Hadoop test cluster: Stop the Hadoop cluster before maintenance. - elukey@cumin1001	[production]
13:38	<jbond@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1001.eqiad.wmnet with reason: REIMAGE	[production]
13:36	<jbond@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1001.eqiad.wmnet with reason: REIMAGE	[production]
12:02	<jbond@cumin1001>	END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 2:00:00 on sretest1001.eqiad.wmnet with reason: REIMAGE	[production]
12:00	<jbond@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1001.eqiad.wmnet with reason: REIMAGE	[production]
09:57	<akosiaris@cumin1001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on kubestage2001.codfw.wmnet with reason: REIMAGE	[production]
09:55	<akosiaris@cumin1001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on kubestage2002.codfw.wmnet with reason: REIMAGE	[production]
09:54	<akosiaris@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage2001.codfw.wmnet with reason: REIMAGE	[production]
09:53	<akosiaris@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage2002.codfw.wmnet with reason: REIMAGE	[production]
09:26	<elukey>	add thirdparty/bigtop15 to buster-wikimedia	[production]
08:13	<elukey>	restart memcached on mwdebug1002 to pick up the correct port (11210 instead of the default 11211)	[production]
07:12	<elukey@cumin1001>	END (PASS) - Cookbook sre.presto.roll-restart-workers (exit_code=0)	[production]
07:04	<elukey@cumin1001>	START - Cookbook sre.presto.roll-restart-workers	[production]
01:24	<ejegg>	updated payments-wiki from df80a99b40 to 63ae7413a8	[production]
2020-12-10
23:35	<Urbanecm>	[urbanecm@mwmaint1002 ~]$ mwscript resetAuthenticationThrottle.php --wiki=enwiki --login --ip 'REDACTED' --user 'WP 1.0 bot' # T269898	[production]
23:15	<twentyafterfour@deploy1001>	rebuilt and synchronized wikiversions files: all wikis to 1.36.0-wmf.21	[production]