production SAL

2025-04-16
11:41	<hnowlan@deploy1003>	helmfile [staging] START helmfile.d/services/rest-gateway: apply	[production]
11:37	<jelto>	temporarily disable query sites on miscweb vms - T350793	[production]
11:29	<fceratto@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1182 (T391056)', diff saved to https://phabricator.wikimedia.org/P75102 and previous config saved to /var/cache/conftool/dbconfig/20250416-112948-fceratto.json	[production]
11:18	<fceratto@cumin1002>	dbctl commit (dc=all): 'Depooling db1182 (T391056)', diff saved to https://phabricator.wikimedia.org/P75101 and previous config saved to /var/cache/conftool/dbconfig/20250416-111822-fceratto.json	[production]
11:18	<fceratto@cumin1002>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1182.eqiad.wmnet with reason: Maintenance	[production]
11:18	<fceratto@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1156 (T391056)', diff saved to https://phabricator.wikimedia.org/P75100 and previous config saved to /var/cache/conftool/dbconfig/20250416-111759-fceratto.json	[production]
11:11	<cmooney@cumin1002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
11:10	<cgoubert@deploy1003>	Started scap sync-world: Move mwscript wrapper from base image to copy on build - T391665	[production]
11:09	<cmooney@cumin1002>	START - Cookbook sre.dns.netbox	[production]
11:06	<claime>	Rebuilding php base images to pick up 1135922 - T391665	[production]
11:02	<fceratto@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P75099 and previous config saved to /var/cache/conftool/dbconfig/20250416-110252-fceratto.json	[production]
10:58	<cgoubert@deploy1003>	Finished scap build-images: (no justification provided) (duration: 05m 36s)	[production]
10:52	<cgoubert@deploy1003>	Started scap build-images: (no justification provided)	[production]
10:47	<fceratto@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P75098 and previous config saved to /var/cache/conftool/dbconfig/20250416-104744-fceratto.json	[production]
10:32	<fceratto@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1156 (T391056)', diff saved to https://phabricator.wikimedia.org/P75097 and previous config saved to /var/cache/conftool/dbconfig/20250416-103236-fceratto.json	[production]
10:29	<MichaelG_WMF>	migr@mwmaint1002:/srv/mediawiki/php-1.44.0-wmf.24$ time mwscript ./extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki ruwiki --verbose #T391695	[production]
10:23	<MichaelG_WMF>	migr@mwmaint1002:/srv/mediawiki/php-1.44.0-wmf.24$ time mwscript ./extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki frwiki --verbose #T391695	[production]
10:21	<fceratto@cumin1002>	dbctl commit (dc=all): 'Depooling db1156 (T391056)', diff saved to https://phabricator.wikimedia.org/P75096 and previous config saved to /var/cache/conftool/dbconfig/20250416-102110-fceratto.json	[production]
10:21	<fceratto@cumin1002>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance	[production]
10:20	<fceratto@cumin1002>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1156.eqiad.wmnet with reason: Maintenance	[production]
10:19	<fnegri@cumin1002>	END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database nupwiki (T390714)	[production]
10:19	<fnegri@cumin1002>	START - Cookbook sre.wikireplicas.add-wiki for database nupwiki (T390714)	[production]
10:17	<MichaelG_WMF>	migr@mwmaint1002:/srv/mediawiki/php-1.44.0-wmf.25$ mwscript ./extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki testwiki --verbose #T391695	[production]
10:13	<cmooney@cumin1002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
10:11	<cmooney@cumin1002>	START - Cookbook sre.dns.netbox	[production]
09:54	<elukey@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve-ctrl1001.eqiad.wmnet with OS bookworm	[production]
09:43	<ladsgroup@deploy1003>	Finished scap sync-world: Backport for [[gerrit:1136964|Change default thumbnail size to 250px (T355914)]] (duration: 19m 35s)	[production]
09:36	<ladsgroup@deploy1003>	ladsgroup: Continuing with sync	[production]
09:36	<elukey@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve-ctrl1001.eqiad.wmnet with reason: host reimage	[production]
09:35	<ladsgroup@deploy1003>	ladsgroup: Backport for [[gerrit:1136964|Change default thumbnail size to 250px (T355914)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)	[production]
09:32	<elukey@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve-ctrl1001.eqiad.wmnet with reason: host reimage	[production]
09:23	<ladsgroup@deploy1003>	Started scap sync-world: Backport for [[gerrit:1136964|Change default thumbnail size to 250px (T355914)]]	[production]
09:22	<ladsgroup@deploy1003>	Finished scap sync-world: Backport for [[gerrit:1136963|Bump thumbnail steps to 100% (T360589)]] (duration: 19m 05s)	[production]
09:18	<elukey@cumin1002>	START - Cookbook sre.hosts.reimage for host ml-serve-ctrl1001.eqiad.wmnet with OS bookworm	[production]
09:15	<vgutierrez>	repooling cp4047 - T387238	[production]
09:15	<ladsgroup@deploy1003>	ladsgroup: Continuing with sync	[production]
09:15	<ladsgroup@deploy1003>	ladsgroup: Backport for [[gerrit:1136963|Bump thumbnail steps to 100% (T360589)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)	[production]
09:02	<ladsgroup@deploy1003>	Started scap sync-world: Backport for [[gerrit:1136963|Bump thumbnail steps to 100% (T360589)]]	[production]
09:02	<ladsgroup@deploy1003>	sync-world failed: <CalledProcessError> Command '['helmfile', '-e', 'eqiad', '--selector', 'name=main', 'write-values', '--output-file-template', '/tmp/tmpsh_tee3p']' returned non-zero exit status 3. (scap version: 4.153.0) (duration: 15m 58s)	[production]
08:59	<ladsgroup@deploy1003>	ladsgroup: Continuing with sync	[production]
08:58	<ladsgroup@deploy1003>	ladsgroup: Backport for [[gerrit:1136963|Bump thumbnail steps to 100% (T360589)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)	[production]
08:46	<ladsgroup@deploy1003>	Started scap sync-world: Backport for [[gerrit:1136963|Bump thumbnail steps to 100% (T360589)]]	[production]
08:16	<akosiaris>	destroy the "main" helmfile releases for mw-wikifunctions. The service is now being powered by the single version MediaWiki HTTP routing solution releases, this is a cleanup.	[production]
07:50	<aikochou@deploy1003>	helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .	[production]
07:26	<brouberol@deploy1003>	helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.	[production]
07:26	<brouberol@deploy1003>	helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.	[production]
07:02	<elukey>	powercycle ml-serve2007 - OEM event registered in getsel (seems DIMM-related)	[production]
06:09	<volans>	installing spicerack v10.1.0 on cumin1002	[production]
05:38	<volans>	installing spicerack v10.1.0 on cumin2002	[production]
02:30	<fceratto@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2222 (T391056)', diff saved to https://phabricator.wikimedia.org/P75094 and previous config saved to /var/cache/conftool/dbconfig/20250416-023052-fceratto.json	[production]