2025-04-16 §
11:41 <hnowlan@deploy1003> helmfile [staging] START helmfile.d/services/rest-gateway: apply [production]
11:37 <jelto> temporarily disable query sites on miscweb vms - T350793 [production]
11:29 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1182 (T391056)', diff saved to https://phabricator.wikimedia.org/P75102 and previous config saved to /var/cache/conftool/dbconfig/20250416-112948-fceratto.json [production]
11:18 <fceratto@cumin1002> dbctl commit (dc=all): 'Depooling db1182 (T391056)', diff saved to https://phabricator.wikimedia.org/P75101 and previous config saved to /var/cache/conftool/dbconfig/20250416-111822-fceratto.json [production]
11:18 <fceratto@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1182.eqiad.wmnet with reason: Maintenance [production]
11:18 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1156 (T391056)', diff saved to https://phabricator.wikimedia.org/P75100 and previous config saved to /var/cache/conftool/dbconfig/20250416-111759-fceratto.json [production]
11:11 <cmooney@cumin1002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
11:10 <cgoubert@deploy1003> Started scap sync-world: Move mwscript wrapper from base image to copy on build - T391665 [production]
11:09 <cmooney@cumin1002> START - Cookbook sre.dns.netbox [production]
11:06 <claime> Rebuilding php base images to pick up 1135922 - T391665 [production]
11:02 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P75099 and previous config saved to /var/cache/conftool/dbconfig/20250416-110252-fceratto.json [production]
10:58 <cgoubert@deploy1003> Finished scap build-images: (no justification provided) (duration: 05m 36s) [production]
10:52 <cgoubert@deploy1003> Started scap build-images: (no justification provided) [production]
10:47 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P75098 and previous config saved to /var/cache/conftool/dbconfig/20250416-104744-fceratto.json [production]
10:32 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1156 (T391056)', diff saved to https://phabricator.wikimedia.org/P75097 and previous config saved to /var/cache/conftool/dbconfig/20250416-103236-fceratto.json [production]
10:29 <MichaelG_WMF> migr@mwmaint1002:/srv/mediawiki/php-1.44.0-wmf.24$ time mwscript ./extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki ruwiki --verbose #T391695 [production]
10:23 <MichaelG_WMF> migr@mwmaint1002:/srv/mediawiki/php-1.44.0-wmf.24$ time mwscript ./extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki frwiki --verbose #T391695 [production]
10:21 <fceratto@cumin1002> dbctl commit (dc=all): 'Depooling db1156 (T391056)', diff saved to https://phabricator.wikimedia.org/P75096 and previous config saved to /var/cache/conftool/dbconfig/20250416-102110-fceratto.json [production]
10:21 <fceratto@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance [production]
10:20 <fceratto@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1156.eqiad.wmnet with reason: Maintenance [production]
10:19 <fnegri@cumin1002> END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database nupwiki (T390714) [production]
10:19 <fnegri@cumin1002> START - Cookbook sre.wikireplicas.add-wiki for database nupwiki (T390714) [production]
10:17 <MichaelG_WMF> migr@mwmaint1002:/srv/mediawiki/php-1.44.0-wmf.25$ mwscript ./extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki testwiki --verbose #T391695 [production]
10:13 <cmooney@cumin1002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
10:11 <cmooney@cumin1002> START - Cookbook sre.dns.netbox [production]
09:54 <elukey@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve-ctrl1001.eqiad.wmnet with OS bookworm [production]
09:43 <ladsgroup@deploy1003> Finished scap sync-world: Backport for [[gerrit:1136964|Change default thumbnail size to 250px (T355914)]] (duration: 19m 35s) [production]
09:36 <ladsgroup@deploy1003> ladsgroup: Continuing with sync [production]
09:36 <elukey@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve-ctrl1001.eqiad.wmnet with reason: host reimage [production]
09:35 <ladsgroup@deploy1003> ladsgroup: Backport for [[gerrit:1136964|Change default thumbnail size to 250px (T355914)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
09:32 <elukey@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve-ctrl1001.eqiad.wmnet with reason: host reimage [production]
09:23 <ladsgroup@deploy1003> Started scap sync-world: Backport for [[gerrit:1136964|Change default thumbnail size to 250px (T355914)]] [production]
09:22 <ladsgroup@deploy1003> Finished scap sync-world: Backport for [[gerrit:1136963|Bump thumbnail steps to 100% (T360589)]] (duration: 19m 05s) [production]
09:18 <elukey@cumin1002> START - Cookbook sre.hosts.reimage for host ml-serve-ctrl1001.eqiad.wmnet with OS bookworm [production]
09:15 <vgutierrez> repooling cp4047 - T387238 [production]
09:15 <ladsgroup@deploy1003> ladsgroup: Continuing with sync [production]
09:15 <ladsgroup@deploy1003> ladsgroup: Backport for [[gerrit:1136963|Bump thumbnail steps to 100% (T360589)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
09:02 <ladsgroup@deploy1003> Started scap sync-world: Backport for [[gerrit:1136963|Bump thumbnail steps to 100% (T360589)]] [production]
09:02 <ladsgroup@deploy1003> sync-world failed: <CalledProcessError> Command '['helmfile', '-e', 'eqiad', '--selector', 'name=main', 'write-values', '--output-file-template', '/tmp/tmpsh_tee3p']' returned non-zero exit status 3. (scap version: 4.153.0) (duration: 15m 58s) [production]
08:59 <ladsgroup@deploy1003> ladsgroup: Continuing with sync [production]
08:58 <ladsgroup@deploy1003> ladsgroup: Backport for [[gerrit:1136963|Bump thumbnail steps to 100% (T360589)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
08:46 <ladsgroup@deploy1003> Started scap sync-world: Backport for [[gerrit:1136963|Bump thumbnail steps to 100% (T360589)]] [production]
08:16 <akosiaris> destroy the "main" helmfile releases for mw-wikifunctions. The service is now being powered by the single version MediaWiki HTTP routing solution releases, this is a cleanup. [production]
07:50 <aikochou@deploy1003> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . [production]
07:26 <brouberol@deploy1003> helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. [production]
07:26 <brouberol@deploy1003> helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. [production]
07:02 <elukey> powercycle ml-serve2007 - OEM event registered in getsel (seems DIMM-related) [production]
06:09 <volans> installing spicerack v10.1.0 on cumin1002 [production]
05:38 <volans> installing spicerack v10.1.0 on cumin2002 [production]
02:30 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2222 (T391056)', diff saved to https://phabricator.wikimedia.org/P75094 and previous config saved to /var/cache/conftool/dbconfig/20250416-023052-fceratto.json [production]