production SAL

2022-08-18
20:09	<brennen@deploy1002>	Started scap: [[gerrit:824395|Deploy partial action blocks to cswiki (T315525)]]	[production]
20:00	<robh@cumin1001>	START - Cookbook sre.hosts.reimage for host dumpsdata1006.eqiad.wmnet with OS bullseye	[production]
19:57	<ottomata>	renable puppet on an-master*	[production]
19:47	<ottomata>	temporarily disable puppet on an-master100* while applying change in test cluster - T312858	[production]
19:34	<robh@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dumpsdata1007.eqiad.wmnet with OS bullseye	[production]
19:19	<robh@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dumpsdata1007.eqiad.wmnet with reason: host reimage	[production]
19:16	<robh@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on dumpsdata1007.eqiad.wmnet with reason: host reimage	[production]
19:10	<cmooney@cumin1001>	END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)	[production]
19:00	<robh@cumin1001>	START - Cookbook sre.hosts.reimage for host dumpsdata1007.eqiad.wmnet with OS bullseye	[production]
18:58	<robh@cumin1001>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dumpsdata1006.eqiad.wmnet with OS bullseye	[production]
18:57	<robh@cumin1001>	START - Cookbook sre.hosts.reimage for host dumpsdata1006.eqiad.wmnet with OS bullseye	[production]
18:55	<robh@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-stretch2001.codfw.wmnet with OS bullseye	[production]
18:52	<cmooney@cumin1001>	START - Cookbook sre.dns.netbox	[production]
18:40	<robh@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-stretch2001.codfw.wmnet with reason: host reimage	[production]
18:36	<robh@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-stretch2001.codfw.wmnet with reason: host reimage	[production]
18:17	<robh@cumin1001>	START - Cookbook sre.hosts.reimage for host kafka-stretch2001.codfw.wmnet with OS bullseye	[production]
18:15	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
18:14	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
18:14	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
18:13	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
18:13	<dancy@deploy1002>	rebuilt and synchronized wikiversions files: group2 wikis to 1.39.0-wmf.25 refs T314186	[production]
18:08	<dancy>	Testing stashbot behavior #2. T315444, T314613	[production]
18:07	<dancy>	Testing stashbot behavior #1 T315444	[production]
17:56	<robh@cumin1001>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host kafka-stretch2001.codfw.wmnet with OS bullseye	[production]
17:54	<bd808@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply	[production]
17:53	<bd808@deploy1002>	helmfile [eqiad] START helmfile.d/services/developer-portal: apply	[production]
17:53	<bd808@deploy1002>	helmfile [codfw] DONE helmfile.d/services/developer-portal: apply	[production]
17:52	<bd808@deploy1002>	helmfile [codfw] START helmfile.d/services/developer-portal: apply	[production]
17:52	<bd808@deploy1002>	helmfile [staging] DONE helmfile.d/services/developer-portal: apply	[production]
17:52	<bd808@deploy1002>	helmfile [staging] START helmfile.d/services/developer-portal: apply	[production]
17:48	<robh@cumin1001>	START - Cookbook sre.hosts.reimage for host kafka-stretch2001.codfw.wmnet with OS bullseye	[production]
17:46	<dancy@deploy1002>	backport aborted: (duration: 00m 21s)	[production]
17:16	<pt1979@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-stretch2001.codfw.wmnet with OS bullseye	[production]
17:08	<hashar@deploy1002>	Finished deploy [integration/docroot@1aca57b]: doc: update links from /mw-tools-scap/ to /scap/ - T315541 (duration: 00m 09s)	[production]
17:08	<hashar@deploy1002>	Started deploy [integration/docroot@1aca57b]: doc: update links from /mw-tools-scap/ to /scap/ - T315541	[production]
16:51	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 9 hosts with reason: Maintenance	[production]
16:51	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on 9 hosts with reason: Maintenance	[production]
16:51	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance	[production]
16:51	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance	[production]
16:47	<demon@deploy1002>	Synchronized php: group1 wikis to 1.39.0-wmf.25 refs T314186 (duration: 03m 20s)	[production]
16:45	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 13 hosts with reason: Maintenance	[production]
16:45	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 12:00:00 on 13 hosts with reason: Maintenance	[production]
16:45	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance	[production]
16:45	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance	[production]
16:44	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1143 (T312972)', diff saved to https://phabricator.wikimedia.org/P32541 and previous config saved to /var/cache/conftool/dbconfig/20220818-164456-marostegui.json	[production]
16:44	<demon@deploy1002>	rebuilt and synchronized wikiversions files: group1 wikis to 1.39.0-wmf.25 refs T314186	[production]
16:29	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P32540 and previous config saved to /var/cache/conftool/dbconfig/20220818-162950-marostegui.json	[production]
16:26	<mvernon@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ms-be2067.codfw.wmnet with reason: disk fault investigation	[production]
16:26	<mvernon@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on ms-be2067.codfw.wmnet with reason: disk fault investigation	[production]
16:21	<pt1979@cumin2002>	START - Cookbook sre.hosts.reimage for host kafka-stretch2001.codfw.wmnet with OS bullseye	[production]