production SAL

2017-04-07 §
14:07	<elukey>	restart hadoop-mapreduce-historyserver on an1001 to pick up the new jvm settings	[production]
10:53	<elukey>	increase Redis connection timeout manually (.3s -> .5s) on mw1306 as performance test - T125735	[production]
07:58	<elukey>	added "notifempty" to /etc/logrotate.d/nginx on cp1008, it should remove cronspam for access_pipe.log.1.gz	[production]
2017-04-06 §
16:18	<elukey>	restart hhvm on mw1227 - debug in /tmp/hhvm.30097.bt. - threads stuck in HPHP::Treadmill::getAgeOldestRequest	[production]
14:04	<elukey>	reimage analytics1002 to Debian Jessie (Hadoop Master Node standby)	[production]
08:02	<elukey>	restart hhvm on mw1194 - dump debug in /tmp/hhvm.1692.bt. - threads stuck in HPHP::Treadmill::getAgeOldestRequest	[production]
06:29	<elukey>	restart hhvm on mw1165 (jobrunner) - dump debug in /tmp/hhvm.19449.bt. - threads stuck in HPHP::Treadmill::getAgeOldestRequest	[production]
2017-04-05 §
15:20	<elukey>	playing with hhvm settings on mwdebug1002	[production]
12:57	<elukey>	reimage analytics1035 (journal node) to Debian Jessie	[production]
09:11	<elukey>	reimage analytics1057 to Debian Jessie	[production]
06:36	<elukey>	restart hhvm on mw1288 (hhvm-dump-debug in /tmp/hhvm.92520.bt.)	[production]
06:33	<elukey>	restart hhvm on mw1223 (hhvm-dump-debug in /tmp/hhvm.2164.bt.)	[production]
2017-04-04 §
15:59	<elukey>	reimage analytics1052 (Hadoop Journal node) to Debian Jessie	[production]
14:06	<elukey>	reimage analytics1039 and 1051 to Debian Jessie	[production]
11:53	<elukey>	reimage analytics10[36,37,38] to Debian Jessie	[production]
07:35	<elukey>	reimage analytics103[234] to Debian Jessie	[production]
2017-04-03 §
12:37	<elukey>	reimage analytics10[29,30,31] to Debian Jessie	[production]
07:39	<elukey@puppetmaster1001>	conftool action : set/pooled=no; selector: name=mw1261.eqiad.wmnet	[production]
2017-04-02 §
08:18	<elukey>	powercycle ms-be1016 (stuck in console, answers pings but not ssh)	[production]
2017-04-01 §
19:01	<elukey>	restart hhvm on mw1191 (dump debug in /tmp/hhvm.16619.bt.) - threads stuck in HPHP::Treadmill::getAgeOldestRequest	[production]
2017-03-31 §
20:18	<elukey>	stopping jobrunners on mw116[89] and restarting hhvm after https://gerrit.wikimedia.org/r/345881	[production]
13:23	<elukey>	restart hhvm on mw116[89] after https://gerrit.wikimedia.org/r/345829	[production]
09:59	<elukey@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=mw2244.codfw.wmnet	[production]
09:56	<elukey>	set pooled=yes mw210[56789], mw2260 and mw2213 (and cleaned up old /srv/mediawiki dirs that were causing rsync spam in scap pull)	[production]
09:47	<elukey>	restart hhvm on mw1197 - hhvm dump debug in /tmp/hhvm.14540.bt. - threads stuck in Treadmill::getAgeOldestRequest (HHVM 3.12)	[production]
2017-03-30 §
17:32	<elukey>	shutdown analytics1039 to apply new thermal paste - T132256	[production]
09:06	<elukey@puppetmaster1001>	conftool action : set/pooled=no; selector: name=mw1261.eqiad.wmnet	[production]
09:05	<elukey>	depooling mw1261 (hhvm-dump-debug in /tmp/hhvm.98736.bt.)	[production]
2017-03-29 §
17:11	<elukey>	restarting nginx on eqiad appservers to pick up the new certs	[production]
16:51	<elukey>	upgrading ssl cert appservers.svc.eqiad.wmnet to include the new discovery endpoints	[production]
14:31	<elukey>	upgrading ssl cert api.svc.eqiad.wmnet to include the new discovery endpoints	[production]
13:48	<elukey>	upgrading ssl cert rendering.svc.eqiad.wmnet to include the new discovery endpoints	[production]
12:53	<elukey>	reimage analytics1045 to Debian Jessie	[production]
11:03	<elukey>	upgrading ssl cert appservers.svc.codfw.wmnet to include the new discovery endpoints	[production]
10:11	<elukey>	upgrading ssl cert api.svc.codfw.wmnet to include the new discovery endpoints	[production]
08:29	<elukey>	upgrading ssl cert rendering.svc.codfw.wmnet to include the new discovery endpoints	[production]
2017-03-28 §
14:41	<elukey@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=mw2256.codfw.wmnet	[production]
14:38	<elukey>	ran restart-hhvm on mw1242, hhvm threads stuck (dump debug in /tmp/hhvm.9008.bt.) - HHVM 3.12	[production]
13:44	<elukey>	started hhvm on mw1261 (still depooled) - no hhvm process running	[production]
10:14	<elukey>	Switching hue.w.o's backend (cache misc) from analytics1027 to thorium - T159527	[production]
2017-03-27 §
07:17	<elukey@puppetmaster1001>	conftool action : set/pooled=active; selector: name=mw2256.codfw.wmnet	[production]
2017-03-20 §
14:28	<elukey>	(Correct one) Temporary hack for T160888 - moved /srv/mw-log/archive/api.log-20170224.gz to /srv/mw-log/archive/api_log_backup_elukey/ to avoid rsync timeouts to stat1002 (the file is big and close to being deleted for retention)	[production]
14:27	<elukey>	Temporary hack for T160886 - moved /srv/mw-log/archive/api.log-20170224.gz to /srv/mw-log/archive/api_log_backup_elukey/ to avoid rsync timeouts to stat1002 (the file is big and close to being deleted for retention)	[production]
2017-03-17 §
16:16	<elukey>	reimage restbase-dev1001.eqiad.wmnet	[production]
11:33	<elukey>	reimage analytics1044 (Hadoop Worker node) to Debian Jessie	[production]
2017-03-16 §
16:00	<elukey>	racadm serveraction powerdown on mw2256 for hw maintenance	[production]
15:13	<elukey>	restart hhvm on mw1200, high load and queued requests - hhvm-dump-debug on /tmp/hhvm.27107.bt.	[production]
15:09	<elukey>	restart hhvm on mw1207, high load and queued requests - hhvm-dump-debug on /tmp/hhvm.27441.bt.	[production]
2017-03-14 §
13:18	<elukey>	started redis-cli --bigkeys -i 0.1 on rdb1008 (eqiad jobqueue slave)	[production]
12:41	<elukey>	reimage analytics1043 to Debian Jessie	[production]