production SAL

2022-05-05
22:06	<razzi@cumin1001>	END (PASS) - Cookbook sre.kafka.reboot-workers (exit_code=0) for Kafka main-eqiad cluster: Reboot kafka nodes	[production]
22:01	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
22:00	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
22:00	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
21:58	<hoo@deploy1002>	Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:734722|Add missing termbox codes from Wikibase (T277836)]] (duration: 00m 48s)	[production]
21:56	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
21:35	<brennen@deploy1002>	Synchronized php-1.39.0-wmf.10/includes/user: Backport: [[gerrit:789332|Suppress "named" group when TempUser system is disabled (T307675)]] (duration: 00m 48s)	[production]
21:33	<brennen@deploy1002>	scap failed: average error rate on 7/9 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org for details)	[production]
21:26	<brennen@deploy1002>	Finished scap: Resuming previously interrupted sync-world (duration: 03m 47s)	[production]
21:25	<jhathaway@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel	[production]
21:24	<jhathaway@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel	[production]
21:22	<brennen@deploy1002>	Started scap: Resuming previously interrupted sync-world	[production]
21:21	<jhathaway@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx1001.wikimedia.org with reason: new kernel	[production]
21:21	<jhathaway@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on mx1001.wikimedia.org with reason: new kernel	[production]
21:21	<jhathaway>	reboot mx1001	[production]
21:18	<dduvall@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/blubberoid: apply	[production]
21:18	<dduvall@deploy1002>	helmfile [eqiad] START helmfile.d/services/blubberoid: apply	[production]
21:18	<dduvall@deploy1002>	helmfile [codfw] DONE helmfile.d/services/blubberoid: apply	[production]
21:17	<dduvall@deploy1002>	helmfile [codfw] START helmfile.d/services/blubberoid: apply	[production]
21:15	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
21:12	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
21:12	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
21:11	<dduvall@deploy1002>	helmfile [staging] DONE helmfile.d/services/blubberoid: apply	[production]
21:11	<dduvall@deploy1002>	helmfile [staging] START helmfile.d/services/blubberoid: apply	[production]
21:08	<jhathaway@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx2001.wikimedia.org with reason: new kernel	[production]
21:08	<jhathaway@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on mx2001.wikimedia.org with reason: new kernel	[production]
21:08	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
21:05	<jhathaway>	reboot mx2001 for kernel update	[production]
21:05	<brennen@deploy1002>	Synchronized php-1.39.0-wmf.10/includes/user: Backport: Revert: [[gerrit:789332|Suppress "named" group when TempUser system is disabled (T307675)]] (duration: 00m 50s)	[production]
21:03	<brennen@deploy1002>	sync-world aborted: Backport: Revert: [[gerrit:789333|Add messages for the "named" user group (T307675)]] and Backport: [[gerrit:789332|Suppress "named" group when TempUser system is disabled (T307675)]] (duration: 11m 53s)	[production]
20:53	<brennen>	sync of last patch ongoing, otherwise closing UTC late backport and config window	[production]
20:51	<brennen@deploy1002>	Started scap: Backport: Revert: [[gerrit:789333|Add messages for the "named" user group (T307675)]] and Backport: [[gerrit:789332|Suppress "named" group when TempUser system is disabled (T307675)]]	[production]
20:32	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
20:32	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
20:31	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
20:31	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
20:28	<thcipriani@deploy1002>	Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:789630|urwiki: allow "sysop" to add/remove "eliminator" (T307029)]] (duration: 00m 49s)	[production]
20:22	<thcipriani@deploy1002>	backport aborted: (duration: 00m 41s)	[production]
20:20	<thcipriani@deploy1002>	backport aborted: (duration: 00m 02s)	[production]
20:10	<razzi@cumin1001>	START - Cookbook sre.kafka.reboot-workers for Kafka main-eqiad cluster: Reboot kafka nodes	[production]
18:55	<mwdebug-deploy@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mwdebug: apply	[production]
18:54	<mwdebug-deploy@deploy1002>	helmfile [codfw] START helmfile.d/services/mwdebug: apply	[production]
18:54	<mwdebug-deploy@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply	[production]
18:53	<herron@cumin1001>	END (PASS) - Cookbook sre.kafka.reboot-workers (exit_code=0) for Kafka main-eqiad cluster: Reboot kafka nodes	[production]
18:53	<mwdebug-deploy@deploy1002>	helmfile [eqiad] START helmfile.d/services/mwdebug: apply	[production]
18:51	<ladsgroup@deploy1002>	Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:789562|Set cebwiki to read new in templatelinks migration (T306673)]] (duration: 00m 49s)	[production]
18:51	<mutante>	contint1001 - apt-get remove --purge docker.io after docker-ce was installed by puppet for T300682 (different behaviour from contint2001 since it did not have /var/lib/docker)	[production]
18:47	<razzi@cumin1001>	END (PASS) - Cookbook sre.kafka.reboot-workers (exit_code=0) for Kafka main-codfw cluster: Reboot kafka nodes	[production]
18:42	<mutante>	contint2001 - apt-get remove --purge docker.io after docker-ce was installed by puppet for T300682	[production]
18:38	<robh@cumin1001>	END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dumpsdata1006.eqiad.wmnet with OS bullseye	[production]