production SAL

2017-02-21 §
13:41	<elukey>	restarting nodejs on aqs1* to pick up openssl security upgrades	[production]
11:02	<elukey>	rolling restart of cassandra-metrics-collector on aqs1* for T157022	[production]
10:55	<elukey>	rolling restart of the analytics jmxtrans daemons for T157022	[production]
2017-02-15 §
13:59	<elukey>	disabled mod_deflate on bohrium (piwik) and disabled puppet. Testing 503 reduction.	[production]
12:56	<elukey>	restart of jmxtrans on all the analytics kafka brokers	[production]
2017-02-14 §
11:57	<elukey@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=mw2221.codfw.wmnet	[production]
2017-02-13 §
12:32	<elukey>	updating Elasticsearch ACLs on cr1/cr2 for the analytics-in4 filter	[production]
11:18	<elukey>	stopped ircecho on einsteinium	[production]
08:10	<elukey>	removed empty log files from elastic1022,1024,2001,1026,1040 to fix logrotate cronspam	[production]
2017-02-11 §
09:53	<elukey>	mw1236 back in production (scap pull executed before pooled=yes) - T156610	[production]
09:52	<elukey@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=mw1236.eqiad.wmnet	[production]
09:35	<elukey>	rebooting mw1236 to make sure that it comes up cleanly - T156610	[production]
2017-02-10 §
12:11	<elukey>	updating firewall rules for analytics on cr1/cr2	[production]
10:37	<elukey>	roll-restart of aqs to pick up new statsd.eqiad.wmnet - T157022	[production]
10:20	<godog>	restart of jmxtrans on analytics by elukey - T157022	[production]
08:41	<elukey>	restarting kafka mirror maker and jmxtrans of kafka[12]00[123] for java security upgrades	[production]
2017-02-09 §
17:55	<elukey>	proactively restarted statsv on hafnium after the kafka broker restarts	[production]
15:49	<elukey@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=aqs1009.eqiad.wmnet	[production]
15:19	<elukey>	restarting all Analytics Kafka brokers for Java security upgrades	[production]
10:49	<elukey>	restarting Java daemons on druid100[123] for security upgrades	[production]
10:03	<elukey>	restore Hadoop master to an1001	[production]
09:57	<elukey>	failover Hadoop masters from an1001 to an1002 to allow Java upgrades	[production]
09:50	<elukey>	restarting oozie and hive on analytics1003 for java security upgrades	[production]
09:07	<elukey>	Executing Cassandra nodetool cleanup on aqs1006-{a,b} (one at a time) and aqs1009-a	[production]
09:01	<elukey>	restarting java daemons on all the Hadoop nodes for security upgrades	[production]
08:59	<gehel>	cleaning empty logs on elastic10(22|24|40) - thanks elukey!	[production]
07:46	<elukey>	Renamed some logs in /var/log (adding _renamed) on aluminum, elastic102[46]/1040 to avoid cronspam and logrotate failures	[production]
2017-02-08 §
17:45	<elukey>	added some annotations to the aqs analytics ACLs on cr1/cr2	[production]
15:28	<elukey>	Eqiad cr1/cr2 - Updated analytics-in4 for new aqs nodes and removed decommed ones	[production]
14:26	<elukey>	restarting nutcracker in all the codfw mw servers to pick up the new shards	[production]
13:46	<elukey>	replacing the codfw memcached/redis shards 12->16	[production]
09:44	<elukey>	bootstrapping aqs1009-b (last new AQS Cassandra instance)	[production]
2017-02-07 §
14:27	<elukey>	restarting hhvm on mw1304 (load very high, no queue, threads locked - /tmp/hhvm.62070.bt.)	[production]
14:19	<elukey>	restarting all the Yarn Node Managers on the Hadoop worker nodes to pick up the new config - T156932	[production]
11:37	<elukey>	restarting hhvm on mw1226 (hhvm dump debug in /tmp/hhvm.33183.bt.)	[production]
09:46	<elukey>	stopped and masked cassandra-{a,b} - T157425	[production]
07:31	<elukey>	manually added "> /dev/null" to carbon's root crontab (rsync job) to avoid cronspam. The change was already merged in https://gerrit.wikimedia.org/r/#/c/336218 but puppet is disabled on carbon.	[production]
2017-02-06 §
15:55	<elukey>	mc2029 shutdown for DC ops	[production]
13:30	<elukey>	applied https://gerrit.wikimedia.org/r/#/c/336203/ manually to analytics1028 (hadoop worker node) as live test - T156932	[production]
08:19	<elukey@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=aqs1008.eqiad.wmnet	[production]
07:38	<elukey>	bootstrapping aqs1009-a (new AQS cassandra instance)	[production]
2017-02-04 §
19:24	<elukey>	Started nodetool-b cleanup on aqs1005 (after 1008-{ab} bootstraps)	[production]
11:44	<elukey>	Started nodetool-a cleanup on aqs1008 (after 1008-{ab} bootstraps)	[production]
09:09	<elukey>	Started nodetool-a cleanup on aqs1005 (after 1008-{ab} bootstraps)	[production]
2017-02-03 §
09:10	<elukey>	Replace Redis/Memcached shards mc2008->2011 with mc2026->mc2029	[production]
08:05	<elukey>	bootstrapping aqs1008-b (AQS Cassandra instance)	[production]
2017-02-02 §
16:24	<elukey>	reboot mc2019->mc2025 to see if they come up cleanly (currently codfw replicas of eqiad redis shards)	[production]
16:13	<elukey>	rebooting mc202[6789] (not serving any traffic) to see if they come up cleanly	[production]
16:00	<elukey>	rebooting mc203[01234] (not serving any traffic) to see if they come up cleanly	[production]
15:11	<elukey>	rebooting mc203[56] (not taking any traffic) to test if they come up cleanly	[production]