production SAL

2025-01-06
10:13	<cgoubert@deploy2002>	helmfile [codfw] START helmfile.d/admin 'apply'.	[production]
10:13	<cgoubert@deploy2002>	helmfile [eqiad] DONE helmfile.d/admin 'apply'.	[production]
10:13	<jelto@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2050.codfw.wmnet with reason: host reimage	[production]
10:12	<cgoubert@deploy2002>	helmfile [eqiad] START helmfile.d/admin 'apply'.	[production]
10:12	<cgoubert@deploy2002>	helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.	[production]
10:11	<cgoubert@deploy2002>	helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.	[production]
10:11	<cgoubert@deploy2002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.	[production]
10:10	<cgoubert@deploy2002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.	[production]
10:10	<cgoubert@deploy2002>	helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.	[production]
10:09	<cgoubert@deploy2002>	helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.	[production]
10:09	<cgoubert@deploy2002>	helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.	[production]
10:09	<cgoubert@deploy2002>	helmfile [staging-codfw] START helmfile.d/admin 'apply'.	[production]
10:08	<cgoubert@deploy2002>	helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.	[production]
10:08	<jayme@cumin1002>	START - Cookbook sre.hosts.reimage for host wikikube-worker1242.eqiad.wmnet with OS bookworm	[production]
10:08	<cgoubert@deploy2002>	helmfile [staging-eqiad] START helmfile.d/admin 'apply'.	[production]
10:07	<cgoubert@deploy2002>	helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.	[production]
10:07	<cgoubert@deploy2002>	helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.	[production]
10:07	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'Depooling db1167 (T371742)', diff saved to https://phabricator.wikimedia.org/P71801 and previous config saved to /var/cache/conftool/dbconfig/20250106-100706-ladsgroup.json	[production]
10:07	<ladsgroup@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance	[production]
10:06	<claime>	Deploying admin_ng external services changes on all kubernetes clusters	[production]
10:06	<ladsgroup@cumin1002>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance	[production]
10:06	<ladsgroup@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db1167.eqiad.wmnet with reason: Maintenance	[production]
10:06	<ladsgroup@cumin1002>	START - Cookbook sre.hosts.downtime for 12:00:00 on db1167.eqiad.wmnet with reason: Maintenance	[production]
10:06	<jayme@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1241.eqiad.wmnet with OS bookworm	[production]
09:55	<jelto@cumin1002>	END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2050	[production]
09:55	<jelto@cumin1002>	END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker2049	[production]
09:55	<jelto@cumin1002>	START - Cookbook sre.hosts.move-vlan for host wikikube-worker2050	[production]
09:55	<jelto@cumin1002>	START - Cookbook sre.hosts.move-vlan for host wikikube-worker2049	[production]
09:55	<jelto@cumin1002>	START - Cookbook sre.hosts.reimage for host wikikube-worker2050.codfw.wmnet with OS bookworm	[production]
09:55	<jelto@cumin1002>	START - Cookbook sre.hosts.reimage for host wikikube-worker2049.codfw.wmnet with OS bookworm	[production]
09:54	<jelto@cumin1002>	END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2049-2050].codfw.wmnet	[production]
09:54	<jayme@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1246.eqiad.wmnet with reason: host reimage	[production]
09:53	<jelto@cumin1002>	START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2049-2050].codfw.wmnet	[production]
09:52	<jelto@cumin1002>	END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2052.codfw.wmnet	[production]
09:52	<jelto@cumin1002>	START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2052.codfw.wmnet	[production]
09:52	<jelto@cumin1002>	END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2051.codfw.wmnet	[production]
09:52	<jelto@cumin1002>	START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2051.codfw.wmnet	[production]
09:51	<jelto@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2052.codfw.wmnet with OS bookworm	[production]
09:50	<jayme@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1246.eqiad.wmnet with reason: host reimage	[production]
09:47	<jayme@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1241.eqiad.wmnet with reason: host reimage	[production]
09:46	<jelto@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2051.codfw.wmnet with OS bookworm	[production]
09:43	<jayme@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1241.eqiad.wmnet with reason: host reimage	[production]
09:41	<dcausse>	repooling wdqs1012	[production]
09:31	<jelto@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2052.codfw.wmnet with reason: host reimage	[production]
09:29	<jayme@cumin1002>	START - Cookbook sre.hosts.reimage for host wikikube-worker1246.eqiad.wmnet with OS bookworm	[production]
09:27	<jelto@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2051.codfw.wmnet with reason: host reimage	[production]
09:26	<dcausse>	depooling wdqs1012 (high lag, forgot to keep it depooled after restarting blazegraph)	[production]
09:26	<marostegui>	Reboot db2160 for kernel upgrade T376905	[production]
09:25	<marostegui@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2160.codfw.wmnet with reason: upgrade kernel	[production]
09:25	<marostegui@cumin1002>	START - Cookbook sre.hosts.downtime for 5:00:00 on db2160.codfw.wmnet with reason: upgrade kernel	[production]