production SAL

2023-07-27 §
09:56	<elukey@cumin1001>	START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-main-codfw cluster: Roll restart of jvm daemons.	[production]
09:54	<fabfur>	begin restarting lvs3005 (T335835)	[production]
09:44	<fabfur>	done restarting lvs3007 (T335835)	[production]
09:42	<fabfur@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs3007.esams.wmnet	[production]
09:40	<fabfur@cumin1001>	START - Cookbook sre.hosts.reboot-single for host lvs3007.esams.wmnet	[production]
09:38	<fabfur>	begin restarting lvs3007 (T335835)	[production]
09:20	<urbanecm>	Run `mwscript extensions/GrowthExperiments/maintenance/refreshLinkRecommendations.php --wiki=frwiki --page="Sensibilité électromagnétique" --force` to debug T342488	[production]
09:12	<fabfur>	done restarting lvs1019 (T335835)	[production]
09:11	<fabfur@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1019.eqiad.wmnet	[production]
09:07	<fabfur@cumin1001>	START - Cookbook sre.hosts.reboot-single for host lvs1019.eqiad.wmnet	[production]
08:42	<fabfur>	begin restarting lvs1019 (T335835)	[production]
08:34	<elukey@cumin1001>	END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-main-eqiad cluster: Roll restart of jvm daemons.	[production]
08:15	<jnuche@deploy1002>	rebuilt and synchronized wikiversions files: group2 wikis to 1.41.0-wmf.19 refs T340247	[production]
07:54	<oblivian@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mw-misc: apply	[production]
07:54	<oblivian@deploy1002>	helmfile [eqiad] START helmfile.d/services/mw-misc: apply	[production]
07:54	<oblivian@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mw-misc: apply	[production]
07:54	<oblivian@deploy1002>	helmfile [codfw] START helmfile.d/services/mw-misc: apply	[production]
07:40	<XioNoX>	reboot lsw1-a1-codfw (test device)	[production]
06:53	<elukey@cumin1001>	START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-main-eqiad cluster: Roll restart of jvm daemons.	[production]
06:39	<isaranto@deploy1002>	helmfile [ml-serve-eqiad] 'sync' command on namespace 'ores-legacy' for release 'main' .	[production]
06:38	<isaranto@deploy1002>	helmfile [ml-serve-codfw] 'sync' command on namespace 'ores-legacy' for release 'main' .	[production]
06:36	<isaranto@deploy1002>	helmfile [ml-staging-codfw] 'sync' command on namespace 'ores-legacy' for release 'main' .	[production]
06:03	<oblivian@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mw-misc: apply	[production]
05:57	<oblivian@deploy1002>	helmfile [eqiad] START helmfile.d/services/mw-misc: apply	[production]
05:45	<oblivian@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mw-misc: apply	[production]
05:40	<oblivian@deploy1002>	helmfile [codfw] START helmfile.d/services/mw-misc: apply	[production]
05:26	<oblivian@deploy1002>	Started scap: (no justification provided)	[production]
05:26	<_joe_>	scap is not syncing; just rebuilding the image from scratch to verify the reason for a bug.	[production]
05:22	<oblivian@deploy1002>	Started scap: (no justification provided)	[production]
03:19	<cstone>	payments-wiki upgraded from 2a68dfe2 to 1a6ca7ab	[production]
03:04	<eileen>	civicrm upgraded from 5a84b138 to 16c2e58a	[production]
00:54	<eileen>	civicrm upgraded from 68f29b70 to 5a84b138	[production]
00:51	<eileen>	civicrm upgraded from 853c14f3 to 68f29b70	[production]
00:20	<eileen>	rollback because I got an error when I tried to view - so let's see	[production]
00:20	<eileen>	civicrm rolled back from 68f29b70 to 853c14f3 (locked)	[production]
00:17	<eileen>	civicrm upgraded from 853c14f3 to 68f29b70	[production]
2023-07-26 §
23:01	<jforrester@deploy1002>	Synchronized wmf-config/interwiki.php: Update interwiki cache now that wikifunctions is here (duration: 06m 52s)	[production]
21:53	<bking@cumin1001>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wcqs2001.codfw.wmnet	[production]
21:46	<bking@cumin1001>	START - Cookbook sre.hosts.reboot-single for host wcqs2001.codfw.wmnet	[production]
21:23	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2180 (T342617)', diff saved to https://phabricator.wikimedia.org/P49745 and previous config saved to /var/cache/conftool/dbconfig/20230726-212310-ladsgroup.json	[production]
21:08	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P49744 and previous config saved to /var/cache/conftool/dbconfig/20230726-210804-ladsgroup.json	[production]
21:04	<jhancock@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host rdb1013.eqiad.wmnet with OS bullseye	[production]
21:04	<jhancock@cumin2002>	START - Cookbook sre.hosts.reimage for host rdb1013.eqiad.wmnet with OS bullseye	[production]
21:00	<taavi>	manually attach User:WikiLambda_system to SUL T342811	[production]
20:52	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P49743 and previous config saved to /var/cache/conftool/dbconfig/20230726-205257-ladsgroup.json	[production]
20:37	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db2180 (T342617)', diff saved to https://phabricator.wikimedia.org/P49742 and previous config saved to /var/cache/conftool/dbconfig/20230726-203751-ladsgroup.json	[production]
20:34	<taavi@deploy1002>	Finished scap: Backport for [[gerrit:941954|clienthints: Start collecting client hints data on testwiki (T341110)]], [[gerrit:941021|CheckUser event table migration: Write new on group0 (T330158)]] (duration: 26m 17s)	[production]
20:15	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db2180 (T342617)', diff saved to https://phabricator.wikimedia.org/P49741 and previous config saved to /var/cache/conftool/dbconfig/20230726-201554-ladsgroup.json	[production]
20:15	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance	[production]
20:15	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance	[production]