production SAL

2022-08-18
18:58	<robh@cumin1001>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dumpsdata1006.eqiad.wmnet with OS bullseye	[production]
18:57	<robh@cumin1001>	START - Cookbook sre.hosts.reimage for host dumpsdata1006.eqiad.wmnet with OS bullseye	[production]
18:55	<robh@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-stretch2001.codfw.wmnet with OS bullseye	[production]
18:52	<cmooney@cumin1001>	START - Cookbook sre.dns.netbox	[production]
18:40	<robh@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-stretch2001.codfw.wmnet with reason: host reimage	[production]
18:36	<robh@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-stretch2001.codfw.wmnet with reason: host reimage	[production]
18:17	<robh@cumin1001>	START - Cookbook sre.hosts.reimage for host kafka-stretch2001.codfw.wmnet with OS bullseye	[production]
18:15	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
18:14	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
18:14	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
18:13	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
18:13	<dancy@deploy1002>	rebuilt and synchronized wikiversions files: group2 wikis to 1.39.0-wmf.25 refs T314186	[production]
18:08	<dancy>	Testing stashbot behavior #2. T315444, T314613	[production]
18:07	<dancy>	Testing stashbot behavior #1 T315444	[production]
17:56	<robh@cumin1001>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host kafka-stretch2001.codfw.wmnet with OS bullseye	[production]
17:54	<bd808@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply	[production]
17:53	<bd808@deploy1002>	helmfile [eqiad] START helmfile.d/services/developer-portal: apply	[production]
17:53	<bd808@deploy1002>	helmfile [codfw] DONE helmfile.d/services/developer-portal: apply	[production]
17:52	<bd808@deploy1002>	helmfile [codfw] START helmfile.d/services/developer-portal: apply	[production]
17:52	<bd808@deploy1002>	helmfile [staging] DONE helmfile.d/services/developer-portal: apply	[production]
17:52	<bd808@deploy1002>	helmfile [staging] START helmfile.d/services/developer-portal: apply	[production]
17:48	<robh@cumin1001>	START - Cookbook sre.hosts.reimage for host kafka-stretch2001.codfw.wmnet with OS bullseye	[production]
17:46	<dancy@deploy1002>	backport aborted: (duration: 00m 21s)	[production]
17:16	<pt1979@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-stretch2001.codfw.wmnet with OS bullseye	[production]
17:08	<hashar@deploy1002>	Finished deploy [integration/docroot@1aca57b]: doc: update links from /mw-tools-scap/ to /scap/ - T315541 (duration: 00m 09s)	[production]
17:08	<hashar@deploy1002>	Started deploy [integration/docroot@1aca57b]: doc: update links from /mw-tools-scap/ to /scap/ - T315541	[production]
16:51	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 9 hosts with reason: Maintenance	[production]
16:51	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on 9 hosts with reason: Maintenance	[production]
16:51	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance	[production]
16:51	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance	[production]
16:47	<demon@deploy1002>	Synchronized php: group1 wikis to 1.39.0-wmf.25 refs T314186 (duration: 03m 20s)	[production]
16:45	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 13 hosts with reason: Maintenance	[production]
16:45	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 12:00:00 on 13 hosts with reason: Maintenance	[production]
16:45	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance	[production]
16:45	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on db2110.codfw.wmnet with reason: Maintenance	[production]
16:44	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1143 (T312972)', diff saved to https://phabricator.wikimedia.org/P32541 and previous config saved to /var/cache/conftool/dbconfig/20220818-164456-marostegui.json	[production]
16:44	<demon@deploy1002>	rebuilt and synchronized wikiversions files: group1 wikis to 1.39.0-wmf.25 refs T314186	[production]
16:29	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P32540 and previous config saved to /var/cache/conftool/dbconfig/20220818-162950-marostegui.json	[production]
16:26	<mvernon@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ms-be2067.codfw.wmnet with reason: disk fault investigation	[production]
16:26	<mvernon@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on ms-be2067.codfw.wmnet with reason: disk fault investigation	[production]
16:21	<pt1979@cumin2002>	START - Cookbook sre.hosts.reimage for host kafka-stretch2001.codfw.wmnet with OS bullseye	[production]
16:17	<pt1979@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-stretch2001.codfw.wmnet with OS bullseye	[production]
16:14	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1143', diff saved to https://phabricator.wikimedia.org/P32539 and previous config saved to /var/cache/conftool/dbconfig/20220818-161444-marostegui.json	[production]
15:59	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1143 (T312972)', diff saved to https://phabricator.wikimedia.org/P32538 and previous config saved to /var/cache/conftool/dbconfig/20220818-155938-marostegui.json	[production]
15:54	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depooling db1143 (T312972)', diff saved to https://phabricator.wikimedia.org/P32537 and previous config saved to /var/cache/conftool/dbconfig/20220818-155410-marostegui.json	[production]
15:54	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance	[production]
15:53	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on db1143.eqiad.wmnet with reason: Maintenance	[production]
15:53	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1149 (T312972)', diff saved to https://phabricator.wikimedia.org/P32536 and previous config saved to /var/cache/conftool/dbconfig/20220818-155348-marostegui.json	[production]
15:38	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P32535 and previous config saved to /var/cache/conftool/dbconfig/20220818-153842-marostegui.json	[production]
15:23	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P32534 and previous config saved to /var/cache/conftool/dbconfig/20220818-152335-marostegui.json	[production]