production SAL

2020-02-06
16:30	<bblack>	lvs1014 - restart pybal for dual bgp session config - T180069	[production]
16:30	<bblack>	lvs1015 - restart pybal for dual bgp session config - T180069	[production]
16:29	<bblack>	lvs1016 - restart pybal for dual bgp session config - T180069	[production]
16:28	<moritzm>	restarting apache on bromine to pick up SASL security updates	[production]
16:24	<vgutierrez@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
16:22	<vgutierrez@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
16:22	<moritzm>	installing cyrus-sasl2 security updates on jessie	[production]
16:20	<bblack>	lvs2001 - restart pybal for dual bgp session config - T180069	[production]
16:19	<bblack>	lvs2002 - restart pybal for dual bgp session config - T180069	[production]
16:19	<bblack>	lvs2003 - restart pybal for dual bgp session config - T180069	[production]
16:07	<vgutierrez>	depool and reimage ncredir5002 as buster - T243391	[production]
16:07	<bblack>	lvs4005 - restart pybal for dual bgp session config - T180069	[production]
16:06	<bblack>	lvs4006 - restart pybal for dual bgp session config - T180069	[production]
16:06	<bblack>	lvs4007 - restart pybal for dual bgp session config - T180069	[production]
16:03	<vgutierrez>	depool & reimage cp4023 as buster - T242093	[production]
16:03	<vgutierrez>	pooling cp4024 with buster - T242093	[production]
15:59	<akosiaris>	repool eventgate-analytics/eqiad. Experiment proved the failover wouldn't cause (on its own) a problem. Experiment done.	[production]
15:58	<akosiaris@cumin1001>	conftool action : set/pooled=true; selector: name=eqiad,dnsdisc=eventgate-analytics	[production]
15:57	<halfak@deploy1001>	Finished deploy [ores/deploy@50a101a]: T242705 (duration: 04m 35s)	[production]
15:56	<vgutierrez>	pooling ncredir4001 running buster - T243391	[production]
15:55	<moritzm>	installing qemu security updates	[production]
15:54	<bblack>	lvs5001 - restart pybal for dual bgp session config - T180069	[production]
15:53	<bblack>	lvs5002 - restart pybal for dual bgp session config - T180069	[production]
15:53	<halfak@deploy1001>	Started deploy [ores/deploy@50a101a]: T242705	[production]
15:52	<vgutierrez@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
15:52	<bblack>	lvs5003 - restart pybal for dual bgp session config - T180069	[production]
15:50	<moritzm>	installing python-ecdsa security updates	[production]
15:50	<vgutierrez@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
15:41	<moritzm>	installing jsoup security updates	[production]
15:30	<vgutierrez>	depool & reimage ncredir4001 as buster - T243391	[production]
15:29	<vgutierrez>	depool & reimage cp4024 as buster - T242093	[production]
15:28	<vgutierrez>	pooling ncredir4002 running buster - T243391	[production]
15:27	<moritzm>	installing sudo security updates on jessie	[production]
15:23	<vgutierrez>	pooling cp4025 with buster - T242093	[production]
15:14	<ema>	A:mw-api: force puppet run to increase keepalive_requests from 100 to 200 https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/570670/ T241145	[production]
15:09	<vgutierrez@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
15:07	<vgutierrez@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
14:59	<godog>	extend graphite1004 / graphite2003 fs +200G	[production]
14:56	<vgutierrez>	depool and reimage ncredir4002 as buster - T243391	[production]
14:46	<vgutierrez>	depool & reimage cp4025 as buster - T242093	[production]
14:16	<akosiaris>	20mins in with eventgate-analytics/eqiad depooled from discovery, no issues yet.	[production]
14:14	<ema>	run puppet on mw-api-canary to revert nginx keepalive_requests bump T241145	[production]
13:55	<marostegui>	Stop MySQL on es1019, upgrade and poweroff for on-site maintenance - T243963	[production]
13:54	<akosiaris@cumin1001>	conftool action : set/pooled=false; selector: name=eqiad,dnsdisc=eventgate-analytics	[production]
13:53	<akosiaris>	depool eqiad eventgate-analytics for testing purposes. Requests will flow to codfw, monitoring https://grafana.wikimedia.org/d/RIA1lzDZk/application-servers-red-dashboard?orgId=1&from=now-30m&to=now for issues.	[production]
13:51	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool es1019 for onsite maintenance T243963', diff saved to https://phabricator.wikimedia.org/P10321 and previous config saved to /var/cache/conftool/dbconfig/20200206-135157-marostegui.json	[production]
13:45	<XioNoX>	rollback deactivate BGP transits on cr3-knams	[production]
13:34	<elukey>	repool mw1347 with mcrouter running with 10 proxy threads (was: 5)	[production]
13:31	<XioNoX>	reboot cr3-knams	[production]
13:30	<elukey>	depool mw1347 to test some mcrouter settings	[production]