production SAL

2022-08-05
00:53	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for gerrit2002.wikimedia.org	[production]
00:53	<dzahn@cumin1001>	START - Cookbook sre.hosts.remove-downtime for gerrit2002.wikimedia.org	[production]
00:52	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8 days, 0:00:00 on gerrit2002.wikimedia.org with reason: decom, replaced by gerrit2002	[production]
00:52	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime for 8 days, 0:00:00 on gerrit2002.wikimedia.org with reason: decom, replaced by gerrit2002	[production]
00:18	<mutante>	restarting gerrit for config change - removing old replica T313250	[production]
2022-08-04
23:06	<mutante>	switching gerrit-replica.wikimedia.org to new machine gerrit2002, dropping gerrit-replica-new.wikimedia.org T313250	[production]
21:07	<ryankemper@deploy1002>	helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply	[production]
20:59	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
20:57	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
20:57	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
20:56	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
20:56	<thcipriani@deploy1002>	Finished scap: Backport for [[gerrit:819774]] tkwiki: Update wordmark (duration: 06m 12s)	[production]
20:51	<ryankemper@deploy1002>	helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply	[production]
20:51	<ryankemper@deploy1002>	helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply	[production]
20:51	<ryankemper@deploy1002>	helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply	[production]
20:50	<thcipriani@deploy1002>	Started scap: Backport for [[gerrit:819774]] tkwiki: Update wordmark	[production]
20:48	<thcipriani@deploy1002>	Finished scap: Backport for [[gerrit:812391]] [config]: Add click event logging for mobile and desktop (duration: 39m 16s)	[production]
20:45	<ryankemper@deploy1002>	helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply	[production]
20:24	<ryankemper@deploy1002>	helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply	[production]
20:23	<ryankemper@deploy1002>	helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply	[production]
20:22	<ryankemper@deploy1002>	helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply	[production]
20:16	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
20:15	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
20:15	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
20:14	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
20:13	<ryankemper@deploy1002>	helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply	[production]
20:13	<ryankemper@deploy1002>	helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply	[production]
20:10	<ryankemper@deploy1002>	helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply	[production]
20:09	<ryankemper@deploy1002>	helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply	[production]
20:08	<thcipriani@deploy1002>	Started scap: Backport for [[gerrit:812391]] [config]: Add click event logging for mobile and desktop	[production]
19:59	<ryankemper@deploy1002>	helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply	[production]
19:55	<dancy@deploy1002>	rebuilt and synchronized wikiversions files: resync	[production]
19:49	<mvernon@cumin1001>	END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for thanos-be2001.codfw.wmnet	[production]
19:49	<mvernon@cumin1001>	START - Cookbook sre.hosts.remove-downtime for thanos-be2001.codfw.wmnet	[production]
19:44	<mvernon@cumin1001>	END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 8 hosts	[production]
19:44	<mvernon@cumin1001>	START - Cookbook sre.hosts.remove-downtime for 8 hosts	[production]
19:42	<Emperor>	rebooting thanos-be2001 to fix drive ordering	[production]
19:37	<bking@cumin1001>	END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for elastic2071.codfw.wmnet	[production]
19:37	<bking@cumin1001>	START - Cookbook sre.hosts.remove-downtime for elastic2071.codfw.wmnet	[production]
19:31	<bking@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on elastic2071.codfw.wmnet with reason: T310146	[production]
19:31	<bking@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on elastic2071.codfw.wmnet with reason: T310146	[production]
19:13	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
19:12	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
19:12	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
19:12	<ryankemper@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/changeprop: apply	[production]
19:11	<ryankemper@deploy1002>	helmfile [eqiad] START helmfile.d/services/changeprop: apply	[production]
19:11	<dancy>	There were many errors during php-fpm restart due to failure to contact http://lvs2009:9090/pools/appservers-https_443/mw2361.codfw.wmnet and the like.	[production]
19:11	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
19:10	<dancy@deploy1002>	rebuilt and synchronized wikiversions files: group2 wikis to 1.39.0-wmf.23 refs T308076	[production]
19:09	<ryankemper@deploy1002>	helmfile [codfw] DONE helmfile.d/services/changeprop: apply	[production]