production SAL

2022-05-16
20:10	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
18:44	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
18:43	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
18:43	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
18:42	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
18:42	<ladsgroup@deploy1002>	Synchronized php-1.39.0-wmf.10/includes/api/ApiQueryBacklinksprop.php: Backport: [[gerrit:792140|ApiQueryBacklinksprop: Make sure the index setting exists (T306673)]] (duration: 00m 50s)	[production]
18:12	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
18:11	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
18:11	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
18:10	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
17:25	<mutante>	ACKing again all unhandled CRIT alerts on hosts with "dev" in their name - (imho dev hosts should not have prod CRIT alerts?)	[production]
15:59	<ayounsi@cumin1001>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts netbox-dev2001.wikimedia.org	[production]
15:59	<ayounsi@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
15:54	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
15:50	<ayounsi@cumin1001>	START - Cookbook sre.dns.netbox	[production]
15:50	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
15:50	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
15:49	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
15:47	<ayounsi@cumin1001>	START - Cookbook sre.hosts.decommission for hosts netbox-dev2001.wikimedia.org	[production]
15:46	<jdrewniak@deploy1002>	Synchronized portals: Wikimedia Portals Update: [[gerrit:792229| Bumping portals to master (T128546)]] (duration: 00m 51s)	[production]
15:46	<jdrewniak@deploy1002>	Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:792229| Bumping portals to master (T128546)]] (duration: 00m 50s)	[production]
15:44	<ayounsi@cumin1001>	END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts netbox2001-dev.wikimedia.org	[production]
15:44	<ayounsi@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
15:42	<ayounsi@cumin1001>	START - Cookbook sre.dns.netbox	[production]
15:39	<ayounsi@cumin1001>	START - Cookbook sre.hosts.decommission for hosts netbox2001-dev.wikimedia.org	[production]
15:24	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
15:23	<ayounsi@cumin1001>	END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: update homer wmf-netbox plugin - ayounsi@cumin1001	[production]
15:23	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
15:23	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
15:22	<ayounsi@cumin1001>	START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: update homer wmf-netbox plugin - ayounsi@cumin1001	[production]
15:21	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
15:18	<papaul>	rebooting pfw3[a-b]-eqiad for Junos upgrade	[production]
14:50	<ladsgroup@deploy1002>	Synchronized php-1.39.0-wmf.10/includes/api/ApiQueryBacklinksprop.php: Backport: Revert: [[gerrit:792136|ApiQueryBacklinksprop: Force the correct templatelinks index on read new (T306673)]] (duration: 00m 50s)	[production]
14:47	<ladsgroup@deploy1002>	scap failed: average error rate on 3/8 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org for details)	[production]
14:45	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
14:44	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
14:44	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
14:43	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
14:42	<XioNoX>	fix MTUs on asw-c-codfw	[production]
14:14	<godog>	bump disk space in prometheus codfw k8s-ml-serve (+30G)	[production]
14:14	<Lucas_WMDE>	UTC afternoon backport+config window done (just for the record; actual last backport was half an hour ago)	[production]
13:54	<btullis@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/datahub: sync on main	[production]
13:52	<btullis@deploy1002>	helmfile [eqiad] START helmfile.d/services/datahub: apply on main	[production]
13:50	<XioNoX>	fix MTUs on asw-b-codfw	[production]
13:47	<btullis@deploy1002>	helmfile [codfw] DONE helmfile.d/services/datahub: sync on main	[production]
13:46	<btullis@deploy1002>	helmfile [codfw] START helmfile.d/services/datahub: apply on main	[production]
13:43	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
13:42	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
13:42	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
13:41	<btullis@deploy1002>	helmfile [staging] DONE helmfile.d/services/datahub: sync on main	[production]