production SAL

2024-09-24
12:30	<jiji@cumin1002>	START - Cookbook sre.dns.wipe-cache wikikube-worker2127.codfw.wmnet on all recursors	[production]
12:30	<jiji@cumin1002>	END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mw2427 to wikikube-worker2127	[production]
12:29	<elukey@deploy1003>	helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: sync	[production]
12:29	<jiji@cumin1002>	END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker2127	[production]
12:29	<elukey@deploy1003>	helmfile [staging] START helmfile.d/services/tegola-vector-tiles: sync	[production]
12:29	<jiji@cumin1002>	START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker2127	[production]
12:29	<jiji@cumin1002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
12:29	<jiji@cumin1002>	END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2427 to wikikube-worker2127 - jiji@cumin1002"	[production]
12:28	<jiji@cumin1002>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mw2427 to wikikube-worker2127 - jiji@cumin1002"	[production]
12:28	<akosiaris@deploy1003>	helmfile [eqiad] DONE helmfile.d/services/mw-web: apply	[production]
12:28	<akosiaris@deploy1003>	helmfile [eqiad] START helmfile.d/services/mw-web: apply	[production]
12:26	<jiji@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on mc2038.codfw.wmnet with reason: CPU failure - T375495	[production]
12:25	<jiji@cumin1002>	START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on mc2038.codfw.wmnet with reason: CPU failure - T375495	[production]
12:24	<btullis@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1001.eqiad.wmnet with OS bullseye	[production]
12:21	<jiji@cumin1002>	START - Cookbook sre.dns.netbox	[production]
12:21	<jiji@cumin1002>	START - Cookbook sre.hosts.rename from mw2427 to wikikube-worker2127	[production]
12:17	<jiji@cumin1002>	END (FAIL) - Cookbook sre.hosts.rename (exit_code=93) from mw2427 to wikikube-worker2127	[production]
12:17	<jiji@cumin1002>	START - Cookbook sre.hosts.rename from mw2427 to wikikube-worker2127	[production]
12:16	<jiji@cumin1002>	END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2427.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL	[production]
12:16	<akosiaris@deploy1003>	helmfile [eqiad] DONE helmfile.d/services/mw-web: apply	[production]
12:16	<akosiaris@deploy1003>	helmfile [eqiad] START helmfile.d/services/mw-web: apply	[production]
12:13	<akosiaris@deploy1003>	helmfile [eqiad] DONE helmfile.d/services/mw-web: apply	[production]
12:13	<akosiaris@deploy1003>	helmfile [eqiad] START helmfile.d/services/mw-web: apply	[production]
12:12	<jiji@cumin1002>	START - Cookbook sre.hosts.provision for host mw2427.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL	[production]
12:12	<jynus>	running db-compare on s2, s3 T375186	[production]
12:05	<btullis@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1001.eqiad.wmnet with reason: host reimage	[production]
12:05	<jiji@cumin1002>	END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host mw2427.codfw.wmnet	[production]
12:05	<jiji@cumin1002>	START - Cookbook sre.k8s.pool-depool-node depool for host mw2427.codfw.wmnet	[production]
12:01	<btullis@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1001.eqiad.wmnet with reason: host reimage	[production]
11:59	<jiji@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2427.codfw.wmnet with reason: reimage	[production]
11:59	<jiji@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on mw2427.codfw.wmnet with reason: reimage	[production]
11:48	<btullis@cumin1002>	START - Cookbook sre.hosts.reimage for host dse-k8s-worker1001.eqiad.wmnet with OS bullseye	[production]
11:48	<btullis@cumin1002>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1001.eqiad.wmnet with OS bullseye	[production]
11:45	<gmodena@deploy1003>	helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich: apply	[production]
11:45	<gmodena@deploy1003>	helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich: apply	[production]
11:42	<btullis@cumin1002>	START - Cookbook sre.hosts.reimage for host dse-k8s-worker1001.eqiad.wmnet with OS bullseye	[production]
11:39	<gmodena@deploy1003>	helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich-next: apply	[production]
11:39	<gmodena@deploy1003>	helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-dump-rev-content-reconcile-enrich-next: apply	[production]
11:38	<jiji@cumin1002>	END (PASS) - Cookbook sre.k8s.renumber-node (exit_code=0) Renumbering for host wikikube-worker2126.codfw.wmnet	[production]
11:38	<jiji@cumin1002>	END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2126.codfw.wmnet	[production]
11:38	<jiji@cumin1002>	START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2126.codfw.wmnet	[production]
11:38	<moritzm>	installing systemd bugfix updates from Bookworm point release	[production]
11:31	<effie>	homer lsw1-a6-codfw* commit 'T372878'	[production]
11:31	<effie>	homer crcodfw commit 'T372878'	[production]
10:57	<jiji@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2126.codfw.wmnet with OS bullseye	[production]
10:34	<jiji@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2126.codfw.wmnet with reason: host reimage	[production]
10:32	<btullis@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1001.eqiad.wmnet with OS bullseye	[production]
10:30	<jiji@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2126.codfw.wmnet with reason: host reimage	[production]
10:25	<godog>	force deletion of older thanos blocks - T351927	[production]
10:17	<stevemunene@cumin1002>	START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1176.eqiad.wmnet	[production]