production SAL

2025-09-08
23:59	<ladsgroup@cumin1003>	dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P82798 and previous config saved to /var/cache/conftool/dbconfig/20250908-235927-ladsgroup.json	[production]
23:53	<ladsgroup@cumin1003>	START - Cookbook sre.mysql.pool db1223 gradually with 4 steps - Maint over	[production]
23:45	<ladsgroup@cumin1003>	dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P82796 and previous config saved to /var/cache/conftool/dbconfig/20250908-234506-ladsgroup.json	[production]
23:44	<ladsgroup@cumin1003>	dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P82795 and previous config saved to /var/cache/conftool/dbconfig/20250908-234419-ladsgroup.json	[production]
23:33	<ladsgroup@cumin1003>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1223.eqiad.wmnet with reason: Upgrade to 10.11	[production]
23:31	<rzl>	helmfile -e eqiad -i apply --set mesh.image_name=envoy-future --set mesh.image_version=1.29.12-1 --context=5 # T403663	[production]
23:30	<rzl@deploy1003>	helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply	[production]
23:30	<ladsgroup@cumin1003>	dbctl commit (dc=all): 'Upgrade db1223 to MariaDB 10.11 (T399548)', diff saved to https://phabricator.wikimedia.org/P82794 and previous config saved to /var/cache/conftool/dbconfig/20250908-233042-ladsgroup.json	[production]
23:29	<ladsgroup@cumin1003>	dbctl commit (dc=all): 'Repooling after maintenance db1235 (T402925)', diff saved to https://phabricator.wikimedia.org/P82793 and previous config saved to /var/cache/conftool/dbconfig/20250908-232958-ladsgroup.json	[production]
23:29	<ladsgroup@cumin1003>	END (ERROR) - Cookbook sre.mysql.pool (exit_code=97) db1223 gradually with 4 steps - Maint over	[production]
23:29	<ladsgroup@cumin1003>	dbctl commit (dc=all): 'Repooling after maintenance db2176 (T402925)', diff saved to https://phabricator.wikimedia.org/P82791 and previous config saved to /var/cache/conftool/dbconfig/20250908-232912-ladsgroup.json	[production]
23:28	<rzl@deploy1003>	helmfile [eqiad] START helmfile.d/services/mw-debug: apply	[production]
23:21	<jdlrobson@deploy1003>	Finished scap sync-world: Backport for [[gerrit:1182944|Cleanup: Simplify configuration for wgSpecialContributeSkinsEnabled]], [[gerrit:1186044|Temporarily use production for summary endpoint (T400694)]] (duration: 16m 06s)	[production]
23:16	<jdlrobson@deploy1003>	jdlrobson: Continuing with sync	[production]
23:11	<jdlrobson@deploy1003>	jdlrobson: Backport for [[gerrit:1182944|Cleanup: Simplify configuration for wgSpecialContributeSkinsEnabled]], [[gerrit:1186044|Temporarily use production for summary endpoint (T400694)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.	[production]
23:10	<ladsgroup@cumin1003>	START - Cookbook sre.mysql.pool db1223 gradually with 4 steps - Maint over	[production]
23:08	<ladsgroup@cumin1002>	END (PASS) - Cookbook sre.mysql.upgrade (exit_code=0) for db1223.eqiad.wmnet	[production]
23:08	<eileen>	civicrm upgraded from c7ebd726 to 1ec5de94	[production]
23:05	<jdlrobson@deploy1003>	Started scap sync-world: Backport for [[gerrit:1182944|Cleanup: Simplify configuration for wgSpecialContributeSkinsEnabled]], [[gerrit:1186044|Temporarily use production for summary endpoint (T400694)]]	[production]
23:02	<ladsgroup@cumin1002>	END (PASS) - Cookbook sre.mysql.depool (exit_code=0) db1223 - Upgrading db1223.eqiad.wmnet	[production]
23:02	<ladsgroup@cumin1002>	START - Cookbook sre.mysql.depool db1223 - Upgrading db1223.eqiad.wmnet	[production]
23:02	<ladsgroup@cumin1002>	START - Cookbook sre.mysql.upgrade for db1223.eqiad.wmnet	[production]
22:56	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'Depool db1223 T404025', diff saved to https://phabricator.wikimedia.org/P82789 and previous config saved to /var/cache/conftool/dbconfig/20250908-225603-ladsgroup.json	[production]
22:54	<ladsgroup@dns1004>	END - running authdns-update	[production]
22:53	<ladsgroup@cumin1003>	dbctl commit (dc=all): 'Depooling db2176 (T402925)', diff saved to https://phabricator.wikimedia.org/P82788 and previous config saved to /var/cache/conftool/dbconfig/20250908-225313-ladsgroup.json	[production]
22:53	<ladsgroup@dns1004>	START - running authdns-update	[production]
22:53	<ladsgroup@cumin1003>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2176.codfw.wmnet with reason: Maintenance	[production]
22:52	<ladsgroup@cumin1003>	dbctl commit (dc=all): 'Repooling after maintenance db2174 (T402925)', diff saved to https://phabricator.wikimedia.org/P82787 and previous config saved to /var/cache/conftool/dbconfig/20250908-225250-ladsgroup.json	[production]
22:50	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'Promote db1189 to s3 primary and set section read-write T404025', diff saved to https://phabricator.wikimedia.org/P82786 and previous config saved to /var/cache/conftool/dbconfig/20250908-225054-ladsgroup.json	[production]
22:49	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'Set s3 eqiad as read-only for maintenance - T404025', diff saved to https://phabricator.wikimedia.org/P82785 and previous config saved to /var/cache/conftool/dbconfig/20250908-224914-ladsgroup.json	[production]
22:48	<Amir1>	Starting s3 eqiad failover from db1223 to db1189 - T404025	[production]
22:43	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'Set db1189 with weight 0 T404025', diff saved to https://phabricator.wikimedia.org/P82784 and previous config saved to /var/cache/conftool/dbconfig/20250908-224330-ladsgroup.json	[production]
22:42	<ladsgroup@cumin1002>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Primary switchover s3 T404025	[production]
22:37	<ladsgroup@cumin1003>	dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P82783 and previous config saved to /var/cache/conftool/dbconfig/20250908-223742-ladsgroup.json	[production]
22:35	<ladsgroup@cumin1003>	dbctl commit (dc=all): 'Depooling db1235 (T402925)', diff saved to https://phabricator.wikimedia.org/P82782 and previous config saved to /var/cache/conftool/dbconfig/20250908-223528-ladsgroup.json	[production]
22:35	<ladsgroup@cumin1003>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1235.eqiad.wmnet with reason: Maintenance	[production]
22:35	<ladsgroup@cumin1003>	dbctl commit (dc=all): 'Repooling after maintenance db1234 (T402925)', diff saved to https://phabricator.wikimedia.org/P82781 and previous config saved to /var/cache/conftool/dbconfig/20250908-223504-ladsgroup.json	[production]
22:23	<andrew@cumin2002>	START - Cookbook sre.hosts.reimage for host cloudcephmon1004.eqiad.wmnet with OS bullseye	[production]
22:22	<ladsgroup@cumin1003>	dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P82780 and previous config saved to /var/cache/conftool/dbconfig/20250908-222235-ladsgroup.json	[production]
22:19	<ladsgroup@cumin1003>	dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P82779 and previous config saved to /var/cache/conftool/dbconfig/20250908-221956-ladsgroup.json	[production]
22:07	<andrew@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephmon1004.eqiad.wmnet with OS bullseye	[production]
22:07	<ladsgroup@cumin1003>	dbctl commit (dc=all): 'Repooling after maintenance db2174 (T402925)', diff saved to https://phabricator.wikimedia.org/P82778 and previous config saved to /var/cache/conftool/dbconfig/20250908-220728-ladsgroup.json	[production]
22:04	<ladsgroup@cumin1003>	dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P82777 and previous config saved to /var/cache/conftool/dbconfig/20250908-220449-ladsgroup.json	[production]
21:58	<jhancock@cumin1002>	END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART	[production]
21:57	<jhancock@cumin1002>	START - Cookbook sre.hosts.provision for host dse-k8s-worker1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART	[production]
21:52	<jhancock@cumin1002>	END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART	[production]
21:51	<jhancock@cumin1002>	START - Cookbook sre.hosts.provision for host dse-k8s-worker1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART	[production]
21:49	<ladsgroup@cumin1003>	dbctl commit (dc=all): 'Repooling after maintenance db1234 (T402925)', diff saved to https://phabricator.wikimedia.org/P82776 and previous config saved to /var/cache/conftool/dbconfig/20250908-214941-ladsgroup.json	[production]
21:46	<jhancock@cumin1002>	END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART	[production]
21:45	<jhancock@cumin1002>	START - Cookbook sre.hosts.provision for host dse-k8s-worker1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART	[production]