production SAL

2024-08-14
09:11	<arnaudb@cumin1002>	START - Cookbook sre.hosts.downtime for 4:00:00 on db2189.codfw.wmnet with reason: replication still catching up	[production]
08:53	<jayme@cumin1002>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host kafka-main2010.codfw.wmnet with OS bullseye	[production]
08:46	<jayme@cumin1002>	START - Cookbook sre.hosts.reimage for host kafka-main2010.codfw.wmnet with OS bullseye	[production]
07:45	<arnaudb@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2189.codfw.wmnet with reason: index corruption	[production]
07:45	<arnaudb@cumin1002>	START - Cookbook sre.hosts.downtime for 4:00:00 on db2189.codfw.wmnet with reason: index corruption	[production]
00:54	<eileen>	config revision changed from d6f17100 to f569b590	[production]
00:41	<eileen>	civicrm upgraded from dd54b9ae to eecbba5d	[production]
00:11	<eileen>	civicrm upgraded from 686c7c5f to dd54b9ae	[production]
00:04	<eileen>	config revision changed from e8cc0ed6 to d6f17100	[production]
2024-08-13
23:08	<ejegg>	payments-wiki upgraded from 2d48f432 to 3eb3be67	[production]
21:56	<inflatador>	bking@cumin2002 reboot wdqs101[3-5],1018,1020 from DRAC due to unresponsiveness T372442	[production]
21:16	<ebernhardson@deploy1003>	helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply	[production]
21:16	<ebernhardson@deploy1003>	helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
21:15	<ebernhardson@deploy1003>	helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply	[production]
21:15	<ebernhardson@deploy1003>	helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
21:09	<ebernhardson@deploy1003>	helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply	[production]
21:09	<ebernhardson@deploy1003>	helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
21:07	<ebernhardson@deploy1003>	helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply	[production]
21:07	<ebernhardson@deploy1003>	helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
20:51	<ryankemper@cumin2002>	END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T370754, transfer fresh wdqs-main journal to codfw host) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs2021.codfw.wmnet w/ force delete existing files, repooling neither afterwards	[production]
20:22	<brett>	Update ncmonitor to 1.2.0 via apt1002	[production]
19:57	<ryankemper@cumin2002>	START - Cookbook sre.wdqs.data-transfer (T370754, transfer fresh wdqs-main journal to codfw host) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs2021.codfw.wmnet w/ force delete existing files, repooling neither afterwards	[production]
19:44	<ebernhardson@deploy1003>	helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply	[production]
19:43	<ebernhardson@deploy1003>	helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
19:32	<bking@cumin2002>	END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (2 nodes at a time) for ElasticSearch cluster search_eqiad: security update - bking@cumin2002 - T371874	[production]
19:29	<ryankemper@cumin2002>	END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T370754, transfer fresh wdqs-main journal to codfw host) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs2021.codfw.wmnet, repooling neither afterwards	[production]
19:27	<ryankemper@cumin2002>	START - Cookbook sre.wdqs.data-transfer (T370754, transfer fresh wdqs-main journal to codfw host) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs2021.codfw.wmnet, repooling neither afterwards	[production]
19:25	<ebernhardson@deploy1003>	helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply	[production]
19:25	<ebernhardson@deploy1003>	helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
19:25	<ebernhardson@deploy1003>	helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply	[production]
19:25	<ebernhardson@deploy1003>	helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
19:24	<ebernhardson@deploy1003>	helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
19:24	<ebernhardson@deploy1003>	helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
19:05	<jhuneidi@deploy1003>	rebuilt and synchronized wikiversions files: group0 to 1.43.0-wmf.18 refs T366963	[production]
18:54	<jhuneidi@deploy1003>	Finished scap sync-world: Backport for [[gerrit:1062284|Revert "Prevent dark-mode styles from affecting print media" (T372370)]] (duration: 10m 58s)	[production]
18:50	<jhuneidi@deploy1003>	jdlrobson, jhuneidi: Continuing with sync	[production]
18:46	<jhuneidi@deploy1003>	jdlrobson, jhuneidi: Backport for [[gerrit:1062284|Revert "Prevent dark-mode styles from affecting print media" (T372370)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)	[production]
18:43	<jhuneidi@deploy1003>	Started scap sync-world: Backport for [[gerrit:1062284|Revert "Prevent dark-mode styles from affecting print media" (T372370)]]	[production]
18:42	<ebernhardson@deploy1003>	helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply	[production]
18:41	<ebernhardson@deploy1003>	helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
18:41	<ebernhardson@deploy1003>	helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply	[production]
18:41	<ebernhardson@deploy1003>	helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
18:40	<ebernhardson@deploy1003>	helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply	[production]
18:40	<ebernhardson@deploy1003>	helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
17:45	<eevans@cumin1002>	END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching sessionstore1*: Apply openjdk upgrade — T371874 - eevans@cumin1002	[production]
17:40	<bking@cumin2002>	START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (2 nodes at a time) for ElasticSearch cluster search_eqiad: security update - bking@cumin2002 - T371874	[production]
17:39	<bking@cumin2002>	END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_eqiad: security update - bking@cumin2002 - T371874	[production]
17:39	<jhuneidi@deploy1003>	Finished scap sync-world: testing T371904 (duration: 10m 31s)	[production]
17:28	<jhuneidi@deploy1003>	Started scap sync-world: testing T371904	[production]
17:27	<eevans@cumin1002>	START - Cookbook sre.cassandra.roll-restart for nodes matching sessionstore1*: Apply openjdk upgrade — T371874 - eevans@cumin1002	[production]