production SAL

2023-11-08
22:12	<milimetric@deploy2002>	helmfile [eqiad] START helmfile.d/services/edit-analytics: apply	[production]
22:12	<milimetric@deploy2002>	helmfile [staging] DONE helmfile.d/services/edit-analytics: apply	[production]
22:12	<milimetric@deploy2002>	helmfile [staging] START helmfile.d/services/edit-analytics: apply	[production]
22:08	<ryankemper@cumin1001>	START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster restart (java 11 sec updates) - ryankemper@cumin1001 - T350703	[production]
21:50	<eevans@cumin1001>	START - Cookbook sre.cassandra.roll-restart for nodes matching restbase10[25-27,30,33].eqiad.wmnet: Applying JVM security upgrade (row A) - eevans@cumin1001	[production]
21:48	<eevans@cumin1001>	END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching restbase10[22-24,29,32].eqiad.wmnet: Applying JVM security upgrade (row A) - eevans@cumin1001	[production]
20:28	<brett@cumin2002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host acmechief-test1001.eqiad.wmnet with OS bookworm	[production]
20:26	<otto@deploy2002>	helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply	[production]
20:25	<otto@deploy2002>	helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply	[production]
20:23	<eevans@cumin1001>	START - Cookbook sre.cassandra.roll-restart for nodes matching sessionstore2*.codfw.wmnet: Applying JVM security upgrade - eevans@cumin1001	[production]
20:21	<otto@deploy2002>	helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply	[production]
20:21	<otto@deploy2002>	helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply	[production]
20:19	<eevans@cumin1001>	END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching sessionstore1*.eqiad.wmnet: Applying JVM security upgrade - eevans@cumin1001	[production]
20:10	<eevans@cumin1001>	START - Cookbook sre.cassandra.roll-restart for nodes matching sessionstore1*.eqiad.wmnet: Applying JVM security upgrade - eevans@cumin1001	[production]
20:08	<eevans@cumin1001>	END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:cassandra-dev: Applying JVM security upgrade - eevans@cumin1001	[production]
20:05	<brett@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on acmechief-test1001.eqiad.wmnet with reason: host reimage	[production]
20:02	<brett@cumin2002>	START - Cookbook sre.hosts.downtime for 2:00:00 on acmechief-test1001.eqiad.wmnet with reason: host reimage	[production]
20:01	<ebernhardson@deploy2002>	helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply	[production]
20:01	<ebernhardson@deploy2002>	helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
19:57	<eevans@cumin1001>	START - Cookbook sre.cassandra.roll-restart for nodes matching A:cassandra-dev: Applying JVM security upgrade - eevans@cumin1001	[production]
19:49	<brett@cumin2002>	START - Cookbook sre.hosts.reimage for host acmechief-test1001.eqiad.wmnet with OS bookworm	[production]
17:43	<sukhe@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp3079.esams.wmnet	[production]
16:56	<hnowlan@deploy2002>	helmfile [staging] START helmfile.d/services/editor-analytics: apply	[production]
16:54	<otto@deploy2002>	helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply	[production]
16:54	<otto@deploy2002>	helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply	[production]
16:48	<btullis@cumin1001>	END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad	[production]
16:47	<jforrester@deploy2002>	Finished scap: Backport for [[gerrit:972721\|Skip PerformanceBudgetTest::testTotalModulesSize (T350338)]], [[gerrit:972720\|Modify regex to reflect updated DOM (T350777)]] (duration: 07m 29s)	[production]
16:41	<jforrester@deploy2002>	jforrester: Continuing with sync	[production]
16:40	<jforrester@deploy2002>	jforrester: Backport for [[gerrit:972721\|Skip PerformanceBudgetTest::testTotalModulesSize (T350338)]], [[gerrit:972720\|Modify regex to reflect updated DOM (T350777)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)	[production]
16:39	<jforrester@deploy2002>	Started scap: Backport for [[gerrit:972721\|Skip PerformanceBudgetTest::testTotalModulesSize (T350338)]], [[gerrit:972720\|Modify regex to reflect updated DOM (T350777)]]	[production]
16:38	<btullis@cumin1001>	END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons.	[production]
16:34	<ebernhardson@deploy2002>	Finished deploy [airflow-dags/search@869cca4]: Set group ownership of processed sparql queries (duration: 00m 27s)	[production]
16:33	<ebernhardson@deploy2002>	Started deploy [airflow-dags/search@869cca4]: Set group ownership of processed sparql queries	[production]
16:31	<btullis@cumin1001>	START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons.	[production]
16:24	<jmm@cumin2002>	END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: dse_k8s::master	[production]
16:23	<btullis@cumin1001>	START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad	[production]
16:19	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host pc1014.eqiad.wmnet with OS bookworm	[production]
16:11	<jmm@cumin2002>	START - Cookbook sre.puppet.migrate-role for role: dse_k8s::master	[production]
16:09	<jmm@cumin2002>	END (PASS) - Cookbook sre.puppet.migrate-role (exit_code=0) for role: dse_k8s::worker	[production]
16:04	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on pc1014.eqiad.wmnet with reason: host reimage	[production]
16:01	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on pc1014.eqiad.wmnet with reason: host reimage	[production]
15:57	<jmm@cumin2002>	START - Cookbook sre.puppet.migrate-role for role: dse_k8s::worker	[production]
15:57	<btullis@cumin1001>	END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) restart workers for Hadoop analytics cluster: Roll restart of jvm daemons for openjdk upgrade.	[production]
15:51	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kubestage2001.codfw.wmnet	[production]
15:48	<bvibber>	brion running requeueTranscodes.php on mwmaint2002 to continue backfill for iOS-compatible low-res video (throttled)	[production]
15:43	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host kubestage2001.codfw.wmnet	[production]
15:43	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kubestagemaster2001.codfw.wmnet	[production]
15:41	<hnowlan@deploy2002>	helmfile [staging] DONE helmfile.d/services/editor-analytics: apply	[production]
15:41	<hnowlan@deploy2002>	helmfile [staging] START helmfile.d/services/editor-analytics: apply	[production]
15:33	<bvibber>	brion running requeueTranscodes.php to batch-remove old low-res VP9 WebM transcodes (should be low impact)	[production]