production SAL

2026-07-28
09:10	<ayounsi@cumin1003>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr2-drmrs,cr2-drmrs IPv6,cr2-drmrs.mgmt with reason: router upgrade	[production]
09:06	<XioNoX>	draining cr2-drmrs - T431749	[production]
09:06	<cwilliams@cumin1003>	dbctl commit (dc=all): 'Depooling db2163 (T431660)', diff saved to https://phabricator.wikimedia.org/P95320 and previous config saved to /var/cache/conftool/dbconfig/20260728-090638-cwilliams.json	[production]
09:06	<cwilliams@cumin1003>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2163.codfw.wmnet with reason: Maintenance	[production]
09:06	<root@cumin1003>	END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2161: Maintenance	[production]
09:04	<klausman@cumin1003>	START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2002.codfw.wmnet	[production]
09:04	<klausman@cumin1003>	END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host ml-staging2001.codfw.wmnet	[production]
09:04	<klausman@cumin1003>	START - Cookbook sre.k8s.pool-depool-node pool for host ml-staging2001.codfw.wmnet	[production]
08:57	<klausman@cumin1003>	END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host ml-staging2001.codfw.wmnet	[production]
08:48	<XioNoX>	un-drain cr1-drmrs - T431749	[production]
08:47	<klausman@cumin1003>	START - Cookbook sre.k8s.pool-depool-node depool for host ml-staging2001.codfw.wmnet	[production]
08:47	<klausman@cumin1003>	START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-staging-worker	[production]
08:42	<kevinbazira@deploy1003>	helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .	[production]
08:35	<XioNoX>	rebooting cr1-drmrs - T431749	[production]
08:33	<XioNoX>	draining cr1-drmrs - T431749	[production]
08:31	<cwilliams@cumin1003>	START - Cookbook sre.mysql.pool pool db2210: Maintenance	[production]
08:29	<cwilliams@cumin1003>	START - Cookbook sre.mysql.pool pool db2182: Maintenance	[production]
08:21	<root@cumin1003>	END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool db2182: Maintenance	[production]
08:17	<root@cumin1003>	START - Cookbook sre.mysql.pool pool db2161: Maintenance	[production]
08:16	<root@cumin1003>	START - Cookbook sre.mysql.pool pool db2182: Maintenance	[production]
08:12	<root@cumin1003>	END (FAIL) - Cookbook sre.mysql.pool (exit_code=99) pool db2210: Maintenance	[production]
08:10	<cwilliams@cumin1003>	dbctl commit (dc=all): 'Depooling db2161 (T431660)', diff saved to https://phabricator.wikimedia.org/P95309 and previous config saved to /var/cache/conftool/dbconfig/20260728-081044-cwilliams.json	[production]
08:10	<cwilliams@cumin1003>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2161.codfw.wmnet with reason: Maintenance	[production]
08:10	<root@cumin1003>	END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2154: Maintenance	[production]
08:09	<cwilliams@cumin1003>	dbctl commit (dc=all): 'Depooling db2182 (T431660)', diff saved to https://phabricator.wikimedia.org/P95307 and previous config saved to /var/cache/conftool/dbconfig/20260728-080947-cwilliams.json	[production]
08:09	<cwilliams@cumin1003>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2182.codfw.wmnet with reason: Maintenance	[production]
08:09	<root@cumin1003>	END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2168: Maintenance	[production]
08:06	<ayounsi@cumin1003>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr1-drmrs,cr1-drmrs IPv6,cr1-drmrs.mgmt with reason: router upgrade	[production]
08:06	<root@cumin1003>	START - Cookbook sre.mysql.pool pool db2210: Maintenance	[production]
08:05	<ayounsi@cumin1003>	END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool drmrs [reason: router upgrade, T431749]	[production]
08:05	<ayounsi@cumin1003>	START - Cookbook sre.dns.admin DNS admin: depool drmrs [reason: router upgrade, T431749]	[production]
08:00	<cwilliams@cumin1003>	dbctl commit (dc=all): 'Depooling db2210 (T431660)', diff saved to https://phabricator.wikimedia.org/P95305 and previous config saved to /var/cache/conftool/dbconfig/20260728-080008-cwilliams.json	[production]
08:00	<cwilliams@cumin1003>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2210.codfw.wmnet with reason: Maintenance	[production]
07:59	<root@cumin1003>	END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2206: Maintenance	[production]
07:50	<gkyziridis@deploy1003>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .	[production]
07:50	<gkyziridis@deploy1003>	helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'edit-check' for release 'main' .	[production]
07:22	<root@cumin1003>	START - Cookbook sre.mysql.pool pool db2154: Maintenance	[production]
07:22	<root@cumin1003>	START - Cookbook sre.mysql.pool pool db2168: Maintenance	[production]
07:16	<cwilliams@cumin1003>	dbctl commit (dc=all): 'Depooling db2154 (T431660)', diff saved to https://phabricator.wikimedia.org/P95295 and previous config saved to /var/cache/conftool/dbconfig/20260728-071640-cwilliams.json	[production]
07:16	<cwilliams@cumin1003>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2154.codfw.wmnet with reason: Maintenance	[production]
07:16	<cwilliams@cumin1003>	dbctl commit (dc=all): 'Depooling db2168 (T431660)', diff saved to https://phabricator.wikimedia.org/P95294 and previous config saved to /var/cache/conftool/dbconfig/20260728-071604-cwilliams.json	[production]
07:15	<cwilliams@cumin1003>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance	[production]
07:08	<root@cumin1003>	START - Cookbook sre.mysql.pool pool db2206: Maintenance	[production]
07:02	<cwilliams@cumin1003>	dbctl commit (dc=all): 'Depooling db2206 (T431660)', diff saved to https://phabricator.wikimedia.org/P95292 and previous config saved to /var/cache/conftool/dbconfig/20260728-070219-cwilliams.json	[production]
07:02	<cwilliams@cumin1003>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2206.codfw.wmnet with reason: Maintenance	[production]
06:44	<marostegui>	Failover m5 from db1164 to db1228 - T432967	[production]
06:39	<marostegui@cumin1003>	DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2160,2235].codfw.wmnet,db[1164,1217,1228].eqiad.wmnet with reason: m5 master switch T432967	[production]
04:02	<mwpresync@deploy1003>	Pruned MediaWiki: 1.47.0-wmf.10 (duration: 02m 34s)	[production]
03:39	<mwpresync@deploy1003>	Finished scap sync-world: testwikis to 1.47.0-wmf.13 refs T430832 (duration: 36m 06s)	[production]
03:03	<mwpresync@deploy1003>	Started scap sync-world: testwikis to 1.47.0-wmf.13 refs T430832	[production]