2018-04-24 §
12:28 <elukey> cleanup /home/elukey/zookeeper backup files taken before the 3.4.9 migration [production]
12:10 <elukey> reimage analytics106[34] to Debian Stretch [production]
10:50 <elukey> reimage analytics106[56] to Debian Stretch [production]
08:14 <elukey> upload druid_0.10.0-3~jessie1 (collection of druid packages) to jessie-wikimedia - T164008 [production]
06:56 <elukey> restart zookeeper on conf200[123] for openjdk upgrades [production]
2018-04-23 §
19:34 <elukey@tin> Finished deploy [analytics/pivot/deploy@cb9ddee]: Fix 0.10.0 compatibility - T164008 (duration: 00m 17s) [production]
19:34 <elukey@tin> Started deploy [analytics/pivot/deploy@cb9ddee]: Fix 0.10.0 compatibility - T164008 [production]
13:55 <elukey> reimage analytics1067 to Debian Stretch - T192557 [production]
07:35 <elukey> reboot ms-be2034 - stuck in com2 console with "sd 0:1:0:1: rejecting I/O to offline device", not responsive to ssh [production]
2018-04-20 §
11:27 <elukey> reimage analytics1068 to Debian Stretch - T192557 [production]
09:12 <elukey> restart of mw apis showing ~50% cpu utilization as precaution before the weekend - mw[1224,1225,1228,1230,1231,1233-1235,1276-1283,1286,1312,1313,1315,1316,1341,1343,1344,1347,1348]* [production]
08:39 <elukey> restart hhvm on mw[1226,1232].eqiad.wmnet - high load [production]
2018-04-19 §
09:59 <elukey> complete migration of zookeeper on conf100[123] [production]
09:33 <elukey> upgrade zookeeper on conf100[123] from 3.4.5 to 3.4.9 - T182924 [production]
2018-04-18 §
14:00 <elukey> restart kafka on kafka1001 and kafka2001 (jobqueues,eventbus) for openjdk-7 upgrades [production]
08:44 <elukey> execute cumin 'analytics10[28-69]*' 'rm /etc/apt/preferences.d/r_* && apt-get update' to clear jessie backports apt config - T192348 [production]
2018-04-17 §
16:37 <elukey> incremental rollout of the new zookeeper jmx config to druid1* and conf* [production]
13:25 <elukey> completed migration of zookeeper on conf200[123] [production]
13:00 <elukey> upgrade zookeeper on conf200[123] to 3.4.9~jessie - T182924 [production]
08:19 <elukey> restart nrpe-server on kafka2001 (kafka check not defined) [production]
2018-04-16 §
08:04 <elukey> restart hhvm on mw[1228,1234,1281-1287,1289,1290,1312-1314,1317,1339,1343,1345,1346,1348] - more than 50% cpu usage, prevention scheme for current high load [production]
05:59 <elukey> restart hhvm on mw[1221,1233,1280,1347] - high load [production]
05:55 <elukey> repool mw1341 after investigation [production]
05:48 <elukey> restart hhvm on mw1225, 1315, 1316, 1340, 1341, 1342, 1347 - high load [production]
05:36 <elukey> restart hhvm on mw1226,27,32,88 - high load [production]
2018-04-15 §
21:42 <elukey> restart hhvm on mw1286,1317,1339 - high load [production]
20:52 <elukey> restart hhvm on mw13[43,45,46,48] - high load [production]
20:48 <elukey> restart hhvm on mw13[12-14] - high load [production]
20:45 <elukey> restart hhvm on mw[1285,1287,1289-1290] - high load [production]
20:38 <elukey> restart hhvm on mw12[22,79,82] - high load [production]
20:32 <elukey> restart hhvm on mw12[32-35] - high load [production]
20:24 <elukey> restart hhvm on mw1229-31 - high load [production]
20:17 <elukey> restart hhvm on mw122[6-8] - high load [production]
20:05 <elukey> restart hhvm on mw122[3,4] - high load [production]
13:42 <elukey> restart hhvm on mw1227 due to high load (hhvm dump debug in /tmp/hhvm.44071.bt) [production]
10:53 <elukey> powercycle mw1272 - not responsive to ssh, mgmt com2 console showing "[OK" and no tty [production]
2018-04-13 §
13:52 <elukey> roll restart druid + zookeeper daemons on druid100[123] for openjdk-7 updates [production]
13:32 <elukey> restart druid and zookeeper daemons on druid100[456] for openjdk-7 updates [production]
2018-04-12 §
06:08 <elukey> force kill of fuse_dfs (handling /mnt/hdfs) on stat1004, apparently causing a huge load [production]
06:05 <elukey> force kill of fuse_dfs (handling /mnt/hdfs) on stat1005, apparently causing a huge load [production]
2018-04-11 §
16:44 <elukey> restart hadoop hdfs namenodes on analytics100[12] to pick up HDFS Trash settings - T189051 [production]
16:14 <elukey> reboot notebook1001 for kernel updates [production]
13:12 <elukey> restart kafka brokers on kafka1012->23 for openjdk-7 upgrades [production]
06:49 <elukey> restart Yarn Resource Manager daemons on analytics100[12] to pick up the new Prometheus configuration file [production]
2018-04-10 §
13:35 <elukey> restart kafka on kafka-jumbo1001 for openjdk upgrades [production]
2018-04-09 §
12:01 <elukey> upgrading Boost libraries on all mediawiki eqiad API servers with an ICU 57-enabled HHVM build and restart HHVM (T189295) [production]
10:13 <elukey> completed upgrade of mw eqiad api appservers to ICU 57-enabled HHVM [production]
08:45 <elukey> upgrading eqiad api appservers to ICU 57-enabled HHVM build (T189295) [production]
07:09 <elukey> upgrade burrow to 1.0 on kafkamon[12]* - T188719 [production]
06:24 <elukey> upgrade burrow 1.0.0 to stretch/jessie wikimedia [production]