production SAL

2021-11-25
16:49	<jynus@cumin1001>	dbctl commit (dc=all): 'Fully repool db1163', diff saved to https://phabricator.wikimedia.org/P17862 and previous config saved to /var/cache/conftool/dbconfig/20211125-164941-jynus.json	[production]
16:46	<volans@deploy1002>	Finished deploy [netbox/deploy@87a36a7]: Test v2.10.4-wmf6 on netbox-next (duration: 01m 04s)	[production]
16:45	<volans@deploy1002>	Started deploy [netbox/deploy@87a36a7]: Test v2.10.4-wmf6 on netbox-next	[production]
16:41	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'After maintenance db1148 (T296143)', diff saved to https://phabricator.wikimedia.org/P17861 and previous config saved to /var/cache/conftool/dbconfig/20211125-164153-ladsgroup.json	[production]
16:18	<jynus@cumin1001>	dbctl commit (dc=all): 'Slowly repool db1163++', diff saved to https://phabricator.wikimedia.org/P17860 and previous config saved to /var/cache/conftool/dbconfig/20211125-161833-jynus.json	[production]
16:14	<jynus@cumin1001>	dbctl commit (dc=all): 'Slowly repool db1163+', diff saved to https://phabricator.wikimedia.org/P17859 and previous config saved to /var/cache/conftool/dbconfig/20211125-161404-jynus.json	[production]
16:10	<klausman>	restarting pybal on lvs2009 T289835	[production]
15:57	<vgutierrez>	restarting pybal on lvs2010 - T289835	[production]
15:55	<jynus@cumin1001>	dbctl commit (dc=all): 'Slowly repool db1163', diff saved to https://phabricator.wikimedia.org/P17856 and previous config saved to /var/cache/conftool/dbconfig/20211125-155538-jynus.json	[production]
15:47	<jynus>	reenable gtid on db1163	[production]
15:29	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1148 (T296143)', diff saved to https://phabricator.wikimedia.org/P17853 and previous config saved to /var/cache/conftool/dbconfig/20211125-152906-ladsgroup.json	[production]
15:29	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1148.eqiad.wmnet with reason: Maintenance T296143	[production]
15:29	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 4:00:00 on db1148.eqiad.wmnet with reason: Maintenance T296143	[production]
15:28	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'After maintenance db1147 (T296143)', diff saved to https://phabricator.wikimedia.org/P17852 and previous config saved to /var/cache/conftool/dbconfig/20211125-152858-ladsgroup.json	[production]
15:22	<ayounsi@cumin1001>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ping1001.eqiad.wmnet	[production]
15:19	<klausman@cumin1001>	conftool action : set/pooled=yes:weight=1; selector: cluster=ml_serve,service=kubesvc	[production]
15:13	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'After maintenance db1147 (T296143)', diff saved to https://phabricator.wikimedia.org/P17851 and previous config saved to /var/cache/conftool/dbconfig/20211125-151354-ladsgroup.json	[production]
15:13	<ayounsi@cumin1001>	START - Cookbook sre.hosts.decommission for hosts ping1001.eqiad.wmnet	[production]
15:12	<ayounsi@cumin1001>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ping3001.esams.wmnet	[production]
15:05	<ayounsi@cumin1001>	START - Cookbook sre.hosts.decommission for hosts ping3001.esams.wmnet	[production]
15:04	<ayounsi@cumin1001>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ping2001.codfw.wmnet	[production]
14:58	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'After maintenance db1147 (T296143)', diff saved to https://phabricator.wikimedia.org/P17850 and previous config saved to /var/cache/conftool/dbconfig/20211125-145849-ladsgroup.json	[production]
14:54	<ayounsi@cumin1001>	START - Cookbook sre.hosts.decommission for hosts ping2001.codfw.wmnet	[production]
14:43	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'After maintenance db1147 (T296143)', diff saved to https://phabricator.wikimedia.org/P17849 and previous config saved to /var/cache/conftool/dbconfig/20211125-144344-ladsgroup.json	[production]
14:42	<XioNoX>	Update ping redirect to point to new ping VMs - T295767	[production]
14:25	<jayme>	uncordoned kubestage1003.eqiad.wmnet kubestage1004.eqiad.wmnet - T293729	[production]
14:17	<klausman@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .	[production]
14:16	<klausman@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .	[production]
14:12	<klausman@deploy1002>	helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .	[production]
13:40	<ayounsi@cumin1001>	END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ping1002.eqiad.wmnet	[production]
13:32	<ayounsi@cumin1001>	START - Cookbook sre.ganeti.makevm for new host ping1002.eqiad.wmnet	[production]
13:30	<ayounsi@cumin1001>	END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ping2002.codfw.wmnet	[production]
13:28	<Amir1>	killing lingering process from mwmaint to depooled db1147	[production]
13:20	<ayounsi@cumin1001>	START - Cookbook sre.ganeti.makevm for new host ping2002.codfw.wmnet	[production]
13:14	<ayounsi@cumin1001>	END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ping3002.esams.wmnet	[production]
13:05	<ayounsi@cumin1001>	START - Cookbook sre.ganeti.makevm for new host ping3002.esams.wmnet	[production]
12:27	<hnowlan@cumin1001>	END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching restbase202[1-3].codfw.wmnet: Restarting for certificate updates - hnowlan@cumin1001	[production]
12:14	<arturo>	update repo bullseye-wikimedia/thirdparty/ceph-octopus (T296175)	[production]
12:14	<jynus>	disable temp. gtid on db1163	[production]
12:11	<jynus@cumin1001>	dbctl commit (dc=all): 'Temp. depool db1163 fully', diff saved to https://phabricator.wikimedia.org/P17847 and previous config saved to /var/cache/conftool/dbconfig/20211125-121138-jynus.json	[production]
12:04	<jynus@cumin1001>	dbctl commit (dc=all): 'Reduce db1163 load even more', diff saved to https://phabricator.wikimedia.org/P17846 and previous config saved to /var/cache/conftool/dbconfig/20211125-120435-jynus.json	[production]
11:56	<hnowlan@cumin1001>	START - Cookbook sre.cassandra.roll-restart for nodes matching restbase202[1-3].codfw.wmnet: Restarting for certificate updates - hnowlan@cumin1001	[production]
11:56	<jynus@cumin1001>	dbctl commit (dc=all): 'Reduce db1163 load', diff saved to https://phabricator.wikimedia.org/P17845 and previous config saved to /var/cache/conftool/dbconfig/20211125-115602-jynus.json	[production]
11:04	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Depooling db1147 (T296143)', diff saved to https://phabricator.wikimedia.org/P17844 and previous config saved to /var/cache/conftool/dbconfig/20211125-110443-ladsgroup.json	[production]
11:04	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1147.eqiad.wmnet with reason: Maintenance T296143	[production]
11:04	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 4:00:00 on db1147.eqiad.wmnet with reason: Maintenance T296143	[production]
11:04	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'After maintenance db1146:3314 (T296143)', diff saved to https://phabricator.wikimedia.org/P17843 and previous config saved to /var/cache/conftool/dbconfig/20211125-110435-ladsgroup.json	[production]
10:49	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'After maintenance db1146:3314 (T296143)', diff saved to https://phabricator.wikimedia.org/P17842 and previous config saved to /var/cache/conftool/dbconfig/20211125-104930-ladsgroup.json	[production]
10:34	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'After maintenance db1146:3314 (T296143)', diff saved to https://phabricator.wikimedia.org/P17841 and previous config saved to /var/cache/conftool/dbconfig/20211125-103425-ladsgroup.json	[production]
10:25	<vgutierrez>	rolling restart of varnish and HAProxy on cp2042.codfw.wmnet,cp1090.eqiad.wmnet,cp[5012].eqsin.wmnet,cp3065.esams.wmnet,cp[4026,4032].ulsfo.wmnet to disable PROXY protocol - T290005	[production]