production SAL

2024-11-20
08:35	<jmm@cumin2002>	START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7004.magru.wmnet	[production]
08:35	<jmm@cumin2002>	END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7004.magru.wmnet	[production]
08:34	<jmm@cumin2002>	START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7004.magru.wmnet	[production]
08:18	<hashar>	Restarted CI Jenkins to upgrade Leastload plugin and remove the SSH server plugin	[production]
2024-11-19
22:50	<ryankemper@deploy2002>	Started deploy [wdqs/wdqs@9927a5a] (wcqs): Deploy 0.3.150 to WCQS	[production]
22:00	<urbanecm@deploy2002>	Finished scap sync-world: Backport for [[gerrit:1092341|Enable experimental Parsoid fragment support on labs and test wikis (T374661)]], [[gerrit:1092850|Revert "editcheck: Remove try/catch around transaction squashing" (T333710 T380234)]], [[gerrit:1092851|Revert "editcheck: Remove try/catch around transaction squashing" (T333710 T380234)]] (duration: 20m 39s)	[production]
21:53	<urbanecm@deploy2002>	cscott, kemayo, urbanecm: Continuing with sync	[production]
21:45	<urbanecm@deploy2002>	cscott, kemayo, urbanecm: Backport for [[gerrit:1092341|Enable experimental Parsoid fragment support on labs and test wikis (T374661)]], [[gerrit:1092850|Revert "editcheck: Remove try/catch around transaction squashing" (T333710 T380234)]], [[gerrit:1092851|Revert "editcheck: Remove try/catch around transaction squashing" (T333710 T380234)]] synced to the testservers (https://wikitech.wikimedia.or	[production]
21:39	<jhancock@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2041.codfw.wmnet with OS bookworm	[production]
21:39	<urbanecm@deploy2002>	Started scap sync-world: Backport for [[gerrit:1092341|Enable experimental Parsoid fragment support on labs and test wikis (T374661)]], [[gerrit:1092850|Revert "editcheck: Remove try/catch around transaction squashing" (T333710 T380234)]], [[gerrit:1092851|Revert "editcheck: Remove try/catch around transaction squashing" (T333710 T380234)]]	[production]
21:38	<urbanecm@deploy2002>	Finished scap sync-world: Backport for [[gerrit:1092296|Promote Vector 2022 as default on 3 wikis (T379765)]], [[gerrit:1092912|Separate cache key space for test & production JsonConfig data (T380320)]] (duration: 14m 38s)	[production]
21:31	<urbanecm@deploy2002>	bvibber, jdlrobson, urbanecm: Continuing with sync	[production]
21:29	<urbanecm@deploy2002>	bvibber, jdlrobson, urbanecm: Backport for [[gerrit:1092296|Promote Vector 2022 as default on 3 wikis (T379765)]], [[gerrit:1092912|Separate cache key space for test & production JsonConfig data (T380320)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)	[production]
21:23	<urbanecm@deploy2002>	Started scap sync-world: Backport for [[gerrit:1092296|Promote Vector 2022 as default on 3 wikis (T379765)]], [[gerrit:1092912|Separate cache key space for test & production JsonConfig data (T380320)]]	[production]
21:16	<eevans@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2038.codfw.wmnet with reason: Bootstrapping — T380236	[production]
21:15	<eevans@cumin1002>	START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2038.codfw.wmnet with reason: Bootstrapping — T380236	[production]
21:15	<eevans@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2037.codfw.wmnet with reason: Bootstrapping — T380236	[production]
21:15	<eevans@cumin1002>	START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2037.codfw.wmnet with reason: Bootstrapping — T380236	[production]
21:15	<eevans@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2036.codfw.wmnet with reason: Bootstrapping — T380236	[production]
21:14	<eevans@cumin1002>	START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2036.codfw.wmnet with reason: Bootstrapping — T380236	[production]
20:56	<jhancock@cumin2002>	START - Cookbook sre.hosts.reimage for host es2041.codfw.wmnet with OS bookworm	[production]
20:50	<jhathaway@cumin2002>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host thanos-be2005.codfw.wmnet with OS bullseye	[production]
20:40	<jhathaway@cumin2002>	START - Cookbook sre.hosts.reimage for host thanos-be2005.codfw.wmnet with OS bullseye	[production]
20:40	<jhathaway@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host thanos-be2005.codfw.wmnet with OS bullseye	[production]
20:32	<sukhe@cumin1002>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp7007.magru.wmnet with OS bullseye	[production]
20:29	<sukhe@cumin1002>	START - Cookbook sre.hosts.reimage for host cp7007.magru.wmnet with OS bullseye	[production]
20:24	<jhancock@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es2041.codfw.wmnet with OS bookworm	[production]
20:24	<jhathaway@cumin2002>	START - Cookbook sre.hosts.reimage for host thanos-be2005.codfw.wmnet with OS bullseye	[production]
20:10	<jhathaway@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on ms-be2082.codfw.wmnet with reason: T371400	[production]
20:10	<jhathaway@cumin1002>	START - Cookbook sre.hosts.downtime for 3:00:00 on ms-be2082.codfw.wmnet with reason: T371400	[production]
20:05	<jhancock@cumin2002>	START - Cookbook sre.hosts.reimage for host es2041.codfw.wmnet with OS bookworm	[production]
20:03	<jclark@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-worker1183.eqiad.wmnet with OS bullseye	[production]
20:03	<jclark@cumin1002>	END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"	[production]
19:47	<pt1979@cumin2002>	END (FAIL) - Cookbook sre.hosts.dhcp (exit_code=99) for host cp7007.magru.wmnet	[production]
19:41	<sukhe@cumin1002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp7007.magru.wmnet with OS bullseye	[production]
19:40	<pt1979@cumin2002>	START - Cookbook sre.hosts.dhcp for host cp7007.magru.wmnet	[production]
19:34	<jclark@cumin1002>	START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"	[production]
19:17	<ebernhardson@deploy2002>	Finished deploy [airflow-dags/search@a4d0954]: mjolnir: T379045 Increase maxResultSize (duration: 00m 26s)	[production]
19:16	<ebernhardson@deploy2002>	Started deploy [airflow-dags/search@a4d0954]: mjolnir: T379045 Increase maxResultSize	[production]
19:15	<sukhe@cumin1002>	START - Cookbook sre.hosts.reimage for host cp7007.magru.wmnet with OS bullseye	[production]
19:14	<sukhe@cumin1002>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp7007.magru.wmnet with OS bullseye	[production]
19:12	<jclark@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1183.eqiad.wmnet with reason: host reimage	[production]
19:08	<sukhe@cumin1002>	START - Cookbook sre.hosts.reimage for host cp7007.magru.wmnet with OS bullseye	[production]
19:08	<sukhe@cumin1002>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp7007.magru.wmnet with OS bullseye	[production]
19:08	<jclark@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1183.eqiad.wmnet with reason: host reimage	[production]
19:05	<jhathaway@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on ms-be2082.codfw.wmnet with reason: T371400	[production]
19:05	<jhathaway@cumin1002>	START - Cookbook sre.hosts.downtime for 3:00:00 on ms-be2082.codfw.wmnet with reason: T371400	[production]
18:53	<jclark@cumin1002>	START - Cookbook sre.hosts.reimage for host an-worker1183.eqiad.wmnet with OS bullseye	[production]
18:53	<brett>	Import ncmonitor 1.3.0-1 into main apt repo	[production]
18:52	<jclark@cumin1002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1183.eqiad.wmnet with OS bullseye	[production]