production SAL

2024-04-02
14:02	<fabfur@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on cp3066.esams.wmnet with reason: host reimage	[production]
14:00	<marostegui@cumin1002>	dbctl commit (dc=all): 'db1188 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P59174 and previous config saved to /var/cache/conftool/dbconfig/20240402-140035-root.json	[production]
13:58	<arnaudb@cumin1002>	dbctl commit (dc=all): 'db1185 (re)pooling @ 50%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P59173 and previous config saved to /var/cache/conftool/dbconfig/20240402-135847-arnaudb.json	[production]
13:45	<marostegui@cumin1002>	dbctl commit (dc=all): 'db1188 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P59172 and previous config saved to /var/cache/conftool/dbconfig/20240402-134528-root.json	[production]
13:43	<arnaudb@cumin1002>	dbctl commit (dc=all): 'db1185 (re)pooling @ 25%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P59171 and previous config saved to /var/cache/conftool/dbconfig/20240402-134342-arnaudb.json	[production]
13:38	<TheresNoTime>	closing UTC afternoon backport window	[production]
13:38	<fabfur@cumin1002>	START - Cookbook sre.hosts.reimage for host cp3066.esams.wmnet with OS bullseye	[production]
13:37	<samtar@deploy1002>	Finished scap: Backport for [[gerrit:1016334|InitialiseSettings: Enable Edit Recovery on all projects (T355548)]] (duration: 16m 26s)	[production]
13:33	<fabfur@cumin1002>	conftool action : set/pooled=no; selector: name=cp3066.esams.wmnet	[production]
13:32	<fabfur>	depool cp3066 for reimage (T360430)	[production]
13:30	<marostegui@cumin1002>	dbctl commit (dc=all): 'db1188 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P59170 and previous config saved to /var/cache/conftool/dbconfig/20240402-133023-root.json	[production]
13:28	<arnaudb@cumin1002>	dbctl commit (dc=all): 'db1185 (re)pooling @ 15%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P59169 and previous config saved to /var/cache/conftool/dbconfig/20240402-132836-arnaudb.json	[production]
13:25	<samtar@deploy1002>	samtar: Continuing with sync	[production]
13:23	<samtar@deploy1002>	samtar: Backport for [[gerrit:1016334|InitialiseSettings: Enable Edit Recovery on all projects (T355548)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)	[production]
13:21	<samtar@deploy1002>	Started scap: Backport for [[gerrit:1016334|InitialiseSettings: Enable Edit Recovery on all projects (T355548)]]	[production]
13:18	<dreamyjazz@deploy1002>	Finished scap: Backport for [[gerrit:1015373|Deploy partial action blocks everywhere (T353496)]] (duration: 15m 33s)	[production]
13:15	<marostegui@cumin1002>	dbctl commit (dc=all): 'db1188 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P59168 and previous config saved to /var/cache/conftool/dbconfig/20240402-131517-root.json	[production]
13:13	<arnaudb@cumin1002>	dbctl commit (dc=all): 'db1185 (re)pooling @ 10%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P59167 and previous config saved to /var/cache/conftool/dbconfig/20240402-131330-arnaudb.json	[production]
13:05	<dreamyjazz@deploy1002>	dreamyjazz and tchanders: Continuing with sync	[production]
13:04	<dreamyjazz@deploy1002>	dreamyjazz and tchanders: Backport for [[gerrit:1015373|Deploy partial action blocks everywhere (T353496)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug)	[production]
13:04	<marostegui@cumin1002>	dbctl commit (dc=all): 'Depooling db1212 (T356166)', diff saved to https://phabricator.wikimedia.org/P59166 and previous config saved to /var/cache/conftool/dbconfig/20240402-130423-marostegui.json	[production]
13:04	<marostegui@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 16:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance	[production]
13:04	<marostegui@cumin1002>	START - Cookbook sre.hosts.downtime for 16:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance	[production]
13:03	<marostegui@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1212.eqiad.wmnet with reason: Maintenance	[production]
13:03	<marostegui@cumin1002>	START - Cookbook sre.hosts.downtime for 8:00:00 on db1212.eqiad.wmnet with reason: Maintenance	[production]
13:03	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1198 (T356166)', diff saved to https://phabricator.wikimedia.org/P59165 and previous config saved to /var/cache/conftool/dbconfig/20240402-130341-marostegui.json	[production]
13:02	<dreamyjazz@deploy1002>	Started scap: Backport for [[gerrit:1015373|Deploy partial action blocks everywhere (T353496)]]	[production]
13:00	<marostegui@cumin1002>	dbctl commit (dc=all): 'db1188 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P59164 and previous config saved to /var/cache/conftool/dbconfig/20240402-130012-root.json	[production]
12:58	<arnaudb@cumin1002>	dbctl commit (dc=all): 'db1185 (re)pooling @ 5%: Post reimage repool', diff saved to https://phabricator.wikimedia.org/P59163 and previous config saved to /var/cache/conftool/dbconfig/20240402-125825-arnaudb.json	[production]
12:48	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P59162 and previous config saved to /var/cache/conftool/dbconfig/20240402-124834-marostegui.json	[production]
12:45	<marostegui@cumin1002>	dbctl commit (dc=all): 'db1188 (re)pooling @ 1%: Repooling', diff saved to https://phabricator.wikimedia.org/P59161 and previous config saved to /var/cache/conftool/dbconfig/20240402-124506-root.json	[production]
12:44	<marostegui@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1188.eqiad.wmnet with OS bookworm	[production]
12:39	<aborrero@cumin1002>	END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1035.eqiad.wmnet with OS bookworm	[production]
12:33	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P59160 and previous config saved to /var/cache/conftool/dbconfig/20240402-123326-marostegui.json	[production]
12:28	<taavi>	taavi@deploy1002 ~ $ sudo systemctl kill train-presync.service # T361580	[production]
12:22	<marostegui@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1188.eqiad.wmnet with reason: host reimage	[production]
12:20	<hnowlan@deploy2002>	helmfile [eqiad] [main] DONE helmfile.d/services/mw-jobrunner : sync	[production]
12:19	<marostegui@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on db1188.eqiad.wmnet with reason: host reimage	[production]
12:19	<hnowlan@deploy2002>	helmfile [eqiad] [canary] DONE helmfile.d/services/mw-jobrunner : sync	[production]
12:18	<hnowlan@deploy2002>	helmfile [eqiad] [main] START helmfile.d/services/mw-jobrunner : sync	[production]
12:18	<hnowlan@deploy2002>	helmfile [eqiad] [canary] START helmfile.d/services/mw-jobrunner : sync	[production]
12:18	<marostegui@cumin1002>	dbctl commit (dc=all): 'Repooling after maintenance db1198 (T356166)', diff saved to https://phabricator.wikimedia.org/P59159 and previous config saved to /var/cache/conftool/dbconfig/20240402-121819-marostegui.json	[production]
12:13	<moritzm>	installing pillow security updates	[production]
12:13	<hnowlan@deploy2002>	helmfile [codfw] [main] DONE helmfile.d/services/mw-jobrunner : sync	[production]
12:12	<hnowlan@deploy2002>	helmfile [codfw] [canary] DONE helmfile.d/services/mw-jobrunner : sync	[production]
12:11	<hnowlan@deploy2002>	helmfile [codfw] [main] START helmfile.d/services/mw-jobrunner : sync	[production]
12:11	<hnowlan@deploy2002>	helmfile [codfw] [canary] START helmfile.d/services/mw-jobrunner : sync	[production]
12:09	<aborrero@cumin1002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1035.eqiad.wmnet with reason: host reimage	[production]
12:07	<marostegui@cumin1002>	START - Cookbook sre.hosts.reimage for host db1188.eqiad.wmnet with OS bookworm	[production]
12:06	<aborrero@cumin1002>	START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1035.eqiad.wmnet with reason: host reimage	[production]