production SAL

2024-06-13
13:33	<pfischer@deploy1002>	helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply	[production]
13:33	<pfischer@deploy1002>	helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
13:32	<logmsgbot>	lucaswerkmeister-wmde@deploy1002 Started scap: Backport for [[gerrit:1042997|[svwikt] Add a temporary logo for the 100.000 pages (T364247)]]	[production]
13:31	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P64846 and previous config saved to /var/cache/conftool/dbconfig/20240613-133132-marostegui.json	[production]
13:30	<volans>	upgrading spicerack on cumin2002 to v8.6.0	[production]
13:26	<moritzm>	installing pillow security updates	[production]
13:25	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'db2123 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P64845 and previous config saved to /var/cache/conftool/dbconfig/20240613-132512-ladsgroup.json	[production]
13:18	<taavi@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1032.eqiad.wmnet with OS bookworm	[production]
13:17	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'Depooling db1206 (T352010)', diff saved to https://phabricator.wikimedia.org/P64844 and previous config saved to /var/cache/conftool/dbconfig/20240613-131746-ladsgroup.json	[production]
13:17	<ladsgroup@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1206.eqiad.wmnet with reason: Maintenance	[production]
13:17	<ladsgroup@cumin1002>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1206.eqiad.wmnet with reason: Maintenance	[production]
13:16	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2169', diff saved to https://phabricator.wikimedia.org/P64843 and previous config saved to /var/cache/conftool/dbconfig/20240613-131625-marostegui.json	[production]
13:10	<ladsgroup@cumin1002>	dbctl commit (dc=all): 'db2123 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P64842 and previous config saved to /var/cache/conftool/dbconfig/20240613-131006-ladsgroup.json	[production]
13:07	<ladsgroup@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance	[production]
13:07	<ladsgroup@cumin1002>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2123.codfw.wmnet with reason: Maintenance	[production]
13:06	<moritzm>	installing pillow security updates	[production]
13:03	<jmm@cumin1002>	END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet	[production]
13:01	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2169 (T367261)', diff saved to https://phabricator.wikimedia.org/P64841 and previous config saved to /var/cache/conftool/dbconfig/20240613-130117-marostegui.json	[production]
12:57	<marostegui@cumin1002>	dbctl commit (dc=all): 'Depooling db2169 (T367261)', diff saved to https://phabricator.wikimedia.org/P64840 and previous config saved to /var/cache/conftool/dbconfig/20240613-125700-marostegui.json	[production]
12:56	<marostegui@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2169.codfw.wmnet with reason: Maintenance	[production]
12:56	<marostegui@cumin1002>	START - Cookbook sre.hosts.downtime for 12:00:00 on db2169.codfw.wmnet with reason: Maintenance	[production]
12:56	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2158 (T367261)', diff saved to https://phabricator.wikimedia.org/P64839 and previous config saved to /var/cache/conftool/dbconfig/20240613-125648-marostegui.json	[production]
12:52	<jmm@cumin1002>	START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet	[production]
12:51	<taavi@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1032.eqiad.wmnet with reason: host reimage	[production]
12:48	<taavi@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1032.eqiad.wmnet with reason: host reimage	[production]
12:41	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P64838 and previous config saved to /var/cache/conftool/dbconfig/20240613-124141-marostegui.json	[production]
12:39	<elukey>	reset BIOS/BMC to factory default on sretest1001 - T365372	[production]
12:30	<taavi@cumin1002>	START - Cookbook sre.hosts.reimage for host cloudvirt1032.eqiad.wmnet with OS bookworm	[production]
12:26	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2158', diff saved to https://phabricator.wikimedia.org/P64837 and previous config saved to /var/cache/conftool/dbconfig/20240613-122634-marostegui.json	[production]
12:26	<taavi@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on cloudvirt1032.eqiad.wmnet with reason: reimage and move to OVS	[production]
12:26	<taavi@cumin1002>	START - Cookbook sre.hosts.downtime for 4:00:00 on cloudvirt1032.eqiad.wmnet with reason: reimage and move to OVS	[production]
12:21	<ladsgroup@deploy1002>	Finished scap: Backport for [[gerrit:1043006|Temporarily bump circuit breaking threshold to 350]] (duration: 12m 13s)	[production]
12:20	<pfischer@deploy1002>	helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply	[production]
12:19	<pfischer@deploy1002>	helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
12:17	<pfischer@deploy1002>	helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply	[production]
12:16	<pfischer@deploy1002>	helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply	[production]
12:15	<pfischer@deploy1002>	helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply	[production]
12:12	<ladsgroup@deploy1002>	ladsgroup: Continuing with sync	[production]
12:12	<ladsgroup@deploy1002>	ladsgroup: Backport for [[gerrit:1043006|Temporarily bump circuit breaking threshold to 350]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)	[production]
12:11	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2158 (T367261)', diff saved to https://phabricator.wikimedia.org/P64836 and previous config saved to /var/cache/conftool/dbconfig/20240613-121127-marostegui.json	[production]
12:09	<ladsgroup@deploy1002>	Started scap: Backport for [[gerrit:1043006|Temporarily bump circuit breaking threshold to 350]]	[production]
12:07	<marostegui@cumin1002>	dbctl commit (dc=all): 'Depooling db2158 (T367261)', diff saved to https://phabricator.wikimedia.org/P64835 and previous config saved to /var/cache/conftool/dbconfig/20240613-120711-marostegui.json	[production]
12:07	<marostegui@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance	[production]
12:07	<marostegui@cumin1002>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2187.codfw.wmnet with reason: Maintenance	[production]
12:07	<marostegui@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on db2158.codfw.wmnet with reason: Maintenance	[production]
12:06	<marostegui@cumin1002>	START - Cookbook sre.hosts.downtime for 12:00:00 on db2158.codfw.wmnet with reason: Maintenance	[production]
12:06	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db2151 (T367261)', diff saved to https://phabricator.wikimedia.org/P64834 and previous config saved to /var/cache/conftool/dbconfig/20240613-120644-marostegui.json	[production]
12:04	<jiji@cumin1002>	END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-worker-eqiad	[production]
11:58	<fabfur@cumin1002>	conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet	[production]
11:57	<fabfur>	enabling puppet && repool cp4037 (T360454)	[production]