production SAL

2022-07-26
07:41	<vgutierrez>	rolling restart of ats-be on cp[1080,1083,1085,1087,5006,6001,6006,6009,6011,6015]	[production]
07:30	<_joe_>	running a restart-all for php-fpm on appservers in codfw to test python-poolcounter 0.0.3 T310835	[production]
06:58	<_joe_>	upgrade all of codfw to python3-poolcounter 0.0.3 T310835	[production]
06:54	<ayounsi@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1001.eqiad.wmnet with OS bullseye	[production]
06:40	<ayounsi@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1001.eqiad.wmnet with reason: host reimage	[production]
06:36	<ayounsi@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1001.eqiad.wmnet with reason: host reimage	[production]
06:24	<ayounsi@cumin1001>	START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS bullseye	[production]
06:21	<ayounsi@cumin1001>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1001.eqiad.wmnet with OS bullseye	[production]
06:07	<ayounsi@cumin1001>	START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS bullseye	[production]
02:30	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
02:30	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
02:30	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
02:29	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
02:09	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
02:08	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
02:08	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
02:05	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
00:11	<TimStarling>	restarted php7.2-fpm on the 9 canary hosts in eqiad T313770	[production]
2022-07-25
22:54	<pt1979@cumin2002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
22:50	<pt1979@cumin2002>	START - Cookbook sre.dns.netbox	[production]
22:41	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T312863)', diff saved to https://phabricator.wikimedia.org/P31900 and previous config saved to /var/cache/conftool/dbconfig/20220725-224153-ladsgroup.json	[production]
22:26	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P31899 and previous config saved to /var/cache/conftool/dbconfig/20220725-222648-ladsgroup.json	[production]
22:11	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P31898 and previous config saved to /var/cache/conftool/dbconfig/20220725-221143-ladsgroup.json	[production]
21:56	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1096:3315 (T312863)', diff saved to https://phabricator.wikimedia.org/P31897 and previous config saved to /var/cache/conftool/dbconfig/20220725-215637-ladsgroup.json	[production]
21:27	<brennen@deploy1002>	Finished scap: no-op deploy to get wmf.21 on all boxen (T313770) (duration: 03m 33s)	[production]
21:24	<brennen@deploy1002>	Started scap: no-op deploy to get wmf.21 on all boxen (T313770)	[production]
21:20	<brennen>	running a no-op sync-world for T313770 to hopefully get 1.39.0-wmf.21 (T308074) to all servers.	[production]
20:28	<cjming>	end of UTC late backport window	[production]
20:17	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
20:16	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
20:16	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
20:15	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
20:10	<cjming@deploy1002>	Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:816706|[cirrus] Increase shard count for ruwikinews]] (duration: 03m 15s)	[production]
20:10	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
20:09	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
20:09	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
20:08	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
20:05	<cjming@deploy1002>	Synchronized wmf-config: Config: [[gerrit:810405|Remove Table of Contents config (T310527)]] (duration: 03m 13s)	[production]
19:24	<mutante>	after new wikis have been created apparently they need a "initSiteStats.php" run to make statistics work but this only runs in a timer on mwmaint once weekly or so	[production]
19:23	<mutante>	[mwmaint1002:~] $ sudo systemctl start mediawiki_job_initsitestats.service	[production]
17:07	<jbond>	enable puppet fleet wide	[production]
16:59	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1178 (T312863)', diff saved to https://phabricator.wikimedia.org/P31895 and previous config saved to /var/cache/conftool/dbconfig/20220725-165931-ladsgroup.json	[production]
16:49	<jbond>	disable puppet fleet wide	[production]
16:44	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P31894 and previous config saved to /var/cache/conftool/dbconfig/20220725-164426-ladsgroup.json	[production]
16:31	<ejegg>	updated payments-wiki from f56e9391 to 4487bd31	[production]
16:29	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P31893 and previous config saved to /var/cache/conftool/dbconfig/20220725-162921-ladsgroup.json	[production]
16:14	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1178 (T312863)', diff saved to https://phabricator.wikimedia.org/P31892 and previous config saved to /var/cache/conftool/dbconfig/20220725-161416-ladsgroup.json	[production]
16:14	<bblack>	cp*: re-enable puppet for normal staggered rollout (cp4027 tested all the esitest stuff without incident)	[production]
16:05	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1178 (T312863)', diff saved to https://phabricator.wikimedia.org/P31891 and previous config saved to /var/cache/conftool/dbconfig/20220725-160532-ladsgroup.json	[production]
16:05	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1178.eqiad.wmnet with reason: Maintenance	[production]