production SAL

2024-06-12
06:58	<jmm@cumin2002>	START - Cookbook sre.hosts.reimage for host ganeti1019.eqiad.wmnet with OS bullseye	[production]
06:55	<moritzm>	remove ganeti1019 from eqiad cluster T367071	[production]
06:54	<moritzm>	rebalance ganeti clusters in codfw following reboots	[production]
06:47	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P64656 and previous config saved to /var/cache/conftool/dbconfig/20240612-064733-marostegui.json	[production]
06:44	<oblivian@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply	[production]
06:43	<oblivian@deploy1002>	helmfile [eqiad] START helmfile.d/services/mw-debug: apply	[production]
06:42	<marostegui@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s6 T367262	[production]
06:42	<marostegui@cumin1002>	dbctl commit (dc=all): 'Set db2129 with weight 0 T367262', diff saved to https://phabricator.wikimedia.org/P64655 and previous config saved to /var/cache/conftool/dbconfig/20240612-064200-root.json	[production]
06:41	<marostegui@cumin1002>	START - Cookbook sre.hosts.downtime for 1:00:00 on 26 hosts with reason: Primary switchover s6 T367262	[production]
06:40	<oblivian@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mw-debug: apply	[production]
06:40	<oblivian@deploy1002>	helmfile [codfw] START helmfile.d/services/mw-debug: apply	[production]
06:38	<hashar@deploy1002>	Finished deploy [gerrit/gerrit@69984f7]: wm-zuul-status: fix reload button - T360550 (duration: 00m 07s)	[production]
06:38	<hashar@deploy1002>	Started deploy [gerrit/gerrit@69984f7]: wm-zuul-status: fix reload button - T360550	[production]
06:32	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P64654 and previous config saved to /var/cache/conftool/dbconfig/20240612-063225-marostegui.json	[production]
06:17	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2168 (T364069)', diff saved to https://phabricator.wikimedia.org/P64653 and previous config saved to /var/cache/conftool/dbconfig/20240612-061718-marostegui.json	[production]
05:59	<oblivian@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mw-debug: apply	[production]
05:59	<oblivian@deploy1002>	helmfile [codfw] START helmfile.d/services/mw-debug: apply	[production]
05:58	<oblivian@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mw-debug: apply	[production]
05:58	<oblivian@deploy1002>	helmfile [codfw] START helmfile.d/services/mw-debug: apply	[production]
05:51	<oblivian@deploy1002>	helmfile [codfw] DONE helmfile.d/services/mw-debug: apply	[production]
05:51	<oblivian@deploy1002>	helmfile [codfw] START helmfile.d/services/mw-debug: apply	[production]
05:17	<dani@deploy1002>	helmfile [codfw] DONE helmfile.d/services/miscweb: apply	[production]
05:17	<dani@deploy1002>	helmfile [codfw] START helmfile.d/services/miscweb: apply	[production]
05:17	<dani@deploy1002>	helmfile [eqiad] DONE helmfile.d/services/miscweb: apply	[production]
05:16	<dani@deploy1002>	helmfile [eqiad] START helmfile.d/services/miscweb: apply	[production]
05:16	<dani@deploy1002>	helmfile [staging] DONE helmfile.d/services/miscweb: apply	[production]
05:16	<dani@deploy1002>	helmfile [staging] START helmfile.d/services/miscweb: apply	[production]
00:54	<marostegui@cumin1002>	dbctl commit (dc=all): 'Depooling db2168 (T364069)', diff saved to https://phabricator.wikimedia.org/P64652 and previous config saved to /var/cache/conftool/dbconfig/20240612-005420-marostegui.json	[production]
00:54	<marostegui@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance	[production]
00:53	<marostegui@cumin1002>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2168.codfw.wmnet with reason: Maintenance	[production]
00:53	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2159 (T364069)', diff saved to https://phabricator.wikimedia.org/P64651 and previous config saved to /var/cache/conftool/dbconfig/20240612-005347-marostegui.json	[production]
00:53	<fabfur@cumin1002>	END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_codfw	[production]
00:38	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P64650 and previous config saved to /var/cache/conftool/dbconfig/20240612-003840-marostegui.json	[production]
00:23	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P64649 and previous config saved to /var/cache/conftool/dbconfig/20240612-002332-marostegui.json	[production]
00:08	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2159 (T364069)', diff saved to https://phabricator.wikimedia.org/P64648 and previous config saved to /var/cache/conftool/dbconfig/20240612-000825-marostegui.json	[production]
2024-06-11
23:45	<eevans@deploy1002>	helmfile [staging] DONE helmfile.d/services/data-gateway: apply	[production]
23:45	<eevans@deploy1002>	helmfile [staging] START helmfile.d/services/data-gateway: apply	[production]
22:56	<ryankemper@cumin2002>	END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99) reloading wikidata_full on wdqs2023.codfw.wmnet from DumpsSource.HDFS (hdfs:///wmf/discovery/wdqs-reload-cookbook-test-T349069/ using stat1009.eqiad.wmnet)	[production]
22:29	<eevans@cumin1002>	END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:aqs-codfw	[production]
21:56	<ladsgroup@deploy1002>	Finished scap: Backport for [[gerrit:1041297|Fix Linker::makeExternalLink build failures (T367127)]] (duration: 12m 33s)	[production]
21:51	<ejegg>	fundraising civicrm upgraded from 7252b1b9 to f7855d25	[production]
21:47	<ladsgroup@deploy1002>	matmarex, ladsgroup: Continuing with sync	[production]
21:47	<ladsgroup@deploy1002>	matmarex, ladsgroup: Backport for [[gerrit:1041297|Fix Linker::makeExternalLink build failures (T367127)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)	[production]
21:44	<ladsgroup@deploy1002>	Started scap: Backport for [[gerrit:1041297|Fix Linker::makeExternalLink build failures (T367127)]]	[production]
21:42	<ladsgroup@deploy1002>	Finished scap: Backport for [[gerrit:1041698|Reduce the threshold for section wide circuit breaking to 300]] (duration: 12m 08s)	[production]
21:33	<ladsgroup@deploy1002>	ladsgroup: Continuing with sync	[production]
21:32	<ladsgroup@deploy1002>	ladsgroup: Backport for [[gerrit:1041698|Reduce the threshold for section wide circuit breaking to 300]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)	[production]
21:30	<ladsgroup@deploy1002>	Started scap: Backport for [[gerrit:1041698|Reduce the threshold for section wide circuit breaking to 300]]	[production]
21:27	<ladsgroup@deploy1002>	Finished scap: Backport for [[gerrit:1038899|[zghwiki] Add patroller and autopatrolled groups (T357411)]] (duration: 11m 53s)	[production]
21:18	<ladsgroup@deploy1002>	pppery, ladsgroup: Continuing with sync	[production]