production SAL

2024-01-04
11:01	<kamila@cumin1002>	START - Cookbook sre.hosts.reimage for host mw1379.eqiad.wmnet with OS bullseye	[production]
10:51	<akosiaris@deploy2002>	helmfile [codfw] DONE helmfile.d/admin 'apply'.	[production]
10:33	<akosiaris@deploy2002>	helmfile [codfw] START helmfile.d/admin 'apply'.	[production]
10:32	<akosiaris@deploy2002>	helmfile [eqiad] DONE helmfile.d/admin 'apply'.	[production]
10:17	<akosiaris>	bump memory limits for calico-node in wikikube codfw/eqiad by 25% (i.e from 400Mi to 500Mi) take #3	[production]
10:17	<akosiaris@deploy2002>	helmfile [eqiad] START helmfile.d/admin 'apply'.	[production]
09:57	<akosiaris@deploy2002>	helmfile [eqiad] DONE helmfile.d/admin 'apply'.	[production]
09:38	<akosiaris>	delete mw1377-mw1383 from eqiad wikikube nodes	[production]
09:38	<akosiaris>	bump memory limits for calico-node in wikikube codfw/eqiad by 25% (i.e from 400Mi to 500Mi) take #2	[production]
09:36	<akosiaris@deploy2002>	helmfile [eqiad] START helmfile.d/admin 'apply'.	[production]
09:36	<akosiaris@deploy2002>	helmfile [eqiad] DONE helmfile.d/admin 'apply'.	[production]
09:22	<akosiaris>	bump memory limits for calico-node in wikikube codfw/eqiad by 25% (i.e from 400Mi to 500Mi)	[production]
09:22	<akosiaris@deploy2002>	helmfile [eqiad] START helmfile.d/admin 'apply'.	[production]
09:13	<pfischer@deploy2002>	helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply	[production]
09:13	<pfischer@deploy2002>	helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
09:13	<pfischer@deploy2002>	helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply	[production]
09:12	<pfischer@deploy2002>	helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
09:11	<pfischer@deploy2002>	helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply	[production]
09:09	<pfischer@deploy2002>	helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
08:49	<ladsgroup@deploy2002>	Finished scap: Backport for [[gerrit:987134|Update virtual domain for url shortener]] (duration: 12m 35s)	[production]
08:43	<ladsgroup@deploy2002>	ladsgroup: Continuing with sync	[production]
08:38	<ladsgroup@deploy2002>	ladsgroup: Backport for [[gerrit:987134|Update virtual domain for url shortener]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)	[production]
08:36	<ladsgroup@deploy2002>	Started scap: Backport for [[gerrit:987134|Update virtual domain for url shortener]]	[production]
08:34	<ladsgroup@deploy2002>	Finished scap: Backport for [[gerrit:985160|Add virtual domain config for reading lists extension (T353948)]] (duration: 09m 05s)	[production]
08:28	<ladsgroup@deploy2002>	ladsgroup: Continuing with sync	[production]
08:27	<ladsgroup@deploy2002>	ladsgroup: Backport for [[gerrit:985160|Add virtual domain config for reading lists extension (T353948)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)	[production]
08:25	<ladsgroup@deploy2002>	Started scap: Backport for [[gerrit:985160|Add virtual domain config for reading lists extension (T353948)]]	[production]
07:00	<marostegui@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1151.eqiad.wmnet with OS bookworm	[production]
06:42	<marostegui@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1151.eqiad.wmnet with reason: host reimage	[production]
06:40	<marostegui@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on db1151.eqiad.wmnet with reason: host reimage	[production]
06:28	<marostegui@cumin1002>	START - Cookbook sre.hosts.reimage for host db1151.eqiad.wmnet with OS bookworm	[production]
03:49	<rzl@deploy2002>	helmfile [codfw] START helmfile.d/admin 'apply'.	[production]
2024-01-03
23:50	<kamila@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on mw1379.eqiad.wmnet with reason: failed reimage, will fix tomorrow	[production]
23:50	<kamila@cumin1002>	START - Cookbook sre.hosts.downtime for 12:00:00 on mw1379.eqiad.wmnet with reason: failed reimage, will fix tomorrow	[production]
23:50	<kamila@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on mw1379.eqiad.wmnet with reason: failed reimage, will fix tomorrow	[production]
23:50	<kamila@cumin1002>	START - Cookbook sre.hosts.downtime for 4:00:00 on mw1379.eqiad.wmnet with reason: failed reimage, will fix tomorrow	[production]
23:33	<bking@cumin2002>	END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)	[production]
23:24	<rzl@deploy2002>	helmfile [codfw] DONE helmfile.d/admin 'apply'.	[production]
23:24	<rzl@deploy2002>	helmfile [codfw] START helmfile.d/admin 'apply'.	[production]
23:18	<kamila@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1383.eqiad.wmnet with OS bullseye	[production]
23:15	<kamila@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1380.eqiad.wmnet with OS bullseye	[production]
23:14	<kamila@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1382.eqiad.wmnet with OS bullseye	[production]
23:12	<kamila@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1378.eqiad.wmnet with OS bullseye	[production]
23:10	<kamila@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mw1381.eqiad.wmnet with OS bullseye	[production]
23:07	<kamila@cumin1002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mw1379.eqiad.wmnet with OS bullseye	[production]
23:02	<kamila@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1383.eqiad.wmnet with reason: host reimage	[production]
23:01	<bking@cumin2002>	END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)	[production]
22:59	<kamila@cumin1002>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mw1380.eqiad.wmnet with reason: host reimage	[production]
22:59	<kamila@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1379.eqiad.wmnet with reason: host reimage	[production]
22:57	<kamila@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1382.eqiad.wmnet with reason: host reimage	[production]