production SAL

2021-12-07
15:09	<pt1979@cumin2002>	START - Cookbook sre.hosts.reimage for host restbase2026.codfw.wmnet with OS buster	[production]
14:38	<jbond>	renable puppet fleet wide post monitoring refactor 744787	[production]
14:28	<godog>	reboot graphite1004 - T297180	[production]
14:15	<Amir1>	fixing heartbeat grants for wikiuser across the cluster (T296537)	[production]
14:11	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ganeti[2013-2014].codfw.wmnet with reason: Temporarily remove node from Ganeti for reimage	[production]
14:11	<jmm@cumin2002>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on ganeti[2013-2014].codfw.wmnet with reason: Temporarily remove node from Ganeti for reimage	[production]
14:07	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubetcd2006.codfw.wmnet with reason: switch to drbd storage	[production]
14:07	<jmm@cumin2002>	START - Cookbook sre.hosts.downtime for 1:00:00 on kubetcd2006.codfw.wmnet with reason: switch to drbd storage	[production]
13:52	<Amir1>	removing wikiuser@localhost on s6 (T296537)	[production]
13:45	<pt1979@cumin2002>	END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host restbase2026.codfw.wmnet with OS buster	[production]
13:42	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubetcd2004.codfw.wmnet with reason: switch to drbd storage	[production]
13:42	<jmm@cumin2002>	START - Cookbook sre.hosts.downtime for 1:00:00 on kubetcd2004.codfw.wmnet with reason: switch to drbd storage	[production]
13:40	<godog>	reboot graphite2003 - T297180	[production]
13:39	<jbond>	disable puppet fleet wide to rollout 744787	[production]
13:26	<pt1979@cumin2002>	START - Cookbook sre.hosts.reimage for host restbase2026.codfw.wmnet with OS buster	[production]
13:16	<jelto>	update GitLab to 14.4.4-ce.0	[production]
13:07	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ganeti2014.codfw.wmnet with reason: Temporarily remove node from Ganeti for reimage	[production]
13:07	<jmm@cumin2002>	START - Cookbook sre.hosts.downtime for 1:00:00 on ganeti2014.codfw.wmnet with reason: Temporarily remove node from Ganeti for reimage	[production]
12:46	<Lucas_WMDE>	UTC morning backport+config window done	[production]
12:46	<Lucas_WMDE>	deployed [[gerrit:744071|Update termbox to 2021-12-06-171243-production (T297006)]]	[production]
12:44	<lucaswerkmeister-wmde@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'termbox' for release 'production' .	[production]
12:42	<lucaswerkmeister-wmde@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'termbox' for release 'production' .	[production]
12:39	<lucaswerkmeister-wmde@deploy1002>	helmfile [staging] Ran 'sync' command on namespace 'termbox' for release 'test' .	[production]
12:39	<lucaswerkmeister-wmde@deploy1002>	helmfile [staging] Ran 'sync' command on namespace 'termbox' for release 'staging' .	[production]
12:24	<jbond>	merge refactor of monitoring classes 725045	[production]
12:16	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1156 (T277354)', diff saved to https://phabricator.wikimedia.org/P18071 and previous config saved to /var/cache/conftool/dbconfig/20211207-121655-marostegui.json	[production]
12:10	<mwdebug-deploy@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
12:09	<lucaswerkmeister-wmde@deploy1002>	Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:744043|Enable reply tool by default on mediawikiwiki (T296444)]] (duration: 00m 57s)	[production]
12:09	<mwdebug-deploy@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
12:01	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P18070 and previous config saved to /var/cache/conftool/dbconfig/20211207-120150-marostegui.json	[production]
11:51	<moritzm>	draining primary/secondary instances off ganeti2014 T296622	[production]
11:46	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P18069 and previous config saved to /var/cache/conftool/dbconfig/20211207-114645-marostegui.json	[production]
11:38	<cmooney@cumin1001>	END (PASS) - Cookbook sre.hosts.dhcp (exit_code=0) for host cloudvirt1028.eqiad.wmnet	[production]
11:32	<cmooney@cumin1001>	START - Cookbook sre.hosts.dhcp for host cloudvirt1028.eqiad.wmnet	[production]
11:31	<topranks>	removing IP addressing on cloudvirt1028 manually and forcing DHCP to debug reimage failure (T296906)	[production]
11:31	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1156 (T277354)', diff saved to https://phabricator.wikimedia.org/P18068 and previous config saved to /var/cache/conftool/dbconfig/20211207-113140-marostegui.json	[production]
11:30	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depooling db1156 (T277354)', diff saved to https://phabricator.wikimedia.org/P18067 and previous config saved to /var/cache/conftool/dbconfig/20211207-113005-marostegui.json	[production]
11:30	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db[1155-1156].eqiad.wmnet with reason: Maintenance T277354	[production]
11:29	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 4:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db[1155-1156].eqiad.wmnet with reason: Maintenance T277354	[production]
11:27	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1182 (T277354)', diff saved to https://phabricator.wikimedia.org/P18066 and previous config saved to /var/cache/conftool/dbconfig/20211207-112707-marostegui.json	[production]
11:26	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubetcd2004.codfw.wmnet with reason: switch to drbd storage	[production]
11:26	<jmm@cumin2002>	START - Cookbook sre.hosts.downtime for 1:00:00 on kubetcd2004.codfw.wmnet with reason: switch to drbd storage	[production]
11:12	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P18065 and previous config saved to /var/cache/conftool/dbconfig/20211207-111203-marostegui.json	[production]
11:11	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2012.codfw.wmnet	[production]
11:06	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host ganeti2012.codfw.wmnet	[production]
10:56	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P18064 and previous config saved to /var/cache/conftool/dbconfig/20211207-105658-marostegui.json	[production]
10:41	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1182 (T277354)', diff saved to https://phabricator.wikimedia.org/P18063 and previous config saved to /var/cache/conftool/dbconfig/20211207-104153-marostegui.json	[production]
10:40	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depooling db1182 (T277354)', diff saved to https://phabricator.wikimedia.org/P18062 and previous config saved to /var/cache/conftool/dbconfig/20211207-104018-marostegui.json	[production]
10:40	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1182.eqiad.wmnet with reason: Maintenance T277354	[production]
10:40	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime for 4:00:00 on db1182.eqiad.wmnet with reason: Maintenance T277354	[production]