2019-08-09 §
06:19 <elukey> powercycle thumbor2004 (no ssh, serial console showing a frozen OS) [production]
2019-08-08 §
08:45 <elukey> restart hadoop namenodes on an-master100* to pick up new GC settings (CMS -> G1 switch) [production]
2019-08-07 §
13:22 <elukey> roll restart aqs on aqs100[4-9] to pick up new Druid backend settings [production]
2019-08-06 §
12:30 <elukey> roll restart cassandra on aqs for openjdk-8 upgrades [production]
2019-08-05 §
13:28 <elukey@cumin1001> END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) [production]
13:16 <elukey@cumin1001> START - Cookbook sre.kafka.roll-restart-mirror-maker [production]
2019-08-02 §
09:12 <elukey> umount /sys/kernel/debug/tracing on analytics1043 [production]
2019-08-01 §
06:59 <elukey> install python3-docopt manually on lithium to test check_anycast_healthchecker [production]
2019-07-31 §
14:04 <elukey@cumin1001> END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) [production]
13:49 <elukey@cumin1001> START - Cookbook sre.zookeeper.roll-restart-zookeeper [production]
13:37 <elukey@cumin1001> END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) [production]
13:31 <elukey@cumin1001> START - Cookbook sre.zookeeper.roll-restart-zookeeper [production]
13:27 <elukey> roll restart of zookeeper on conf100[4-6] and conf200[1-3] for openjdk upgrades [production]
13:12 <elukey@cumin1001> END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) [production]
13:05 <elukey@cumin1001> START - Cookbook sre.zookeeper.roll-restart-zookeeper [production]
12:59 <elukey@cumin1001> END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) [production]
12:53 <elukey@cumin1001> START - Cookbook sre.zookeeper.roll-restart-zookeeper [production]
10:08 <elukey@cumin1001> END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) [production]
09:56 <elukey@cumin1001> START - Cookbook sre.kafka.roll-restart-mirror-maker [production]
08:37 <elukey> restart Yarn Resource Managers on an-master100[12] to pick up the new openjdk version [production]
08:05 <elukey> restart hadoop Namenodes on an-master100[12] to pick up new heap settings and new openjdk [production]
07:29 <elukey> restart-hhvm on mw1290 [production]
2019-07-30 §
18:38 <elukey@cumin1001> END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) [production]
16:33 <elukey@cumin1001> START - Cookbook sre.kafka.roll-restart-brokers [production]
15:41 <elukey@cumin1001> END (FAIL) - Cookbook sre.kafka.roll-restart-brokers (exit_code=99) [production]
15:21 <elukey@cumin1001> START - Cookbook sre.kafka.roll-restart-brokers [production]
15:13 <elukey> remove snakebite from buster-wikimedia (not needed anymore) [production]
09:49 <elukey> upload python-snakebite to buster-wikimedia (rebuilt for buster from source) [production]
09:27 <elukey> add thirdparty/cloudera to buster-wikimedia and import packages to it (pull from the jessie component) [production]
2019-07-29 §
16:19 <elukey> manually stopped the sre.kafka.roll-restart-brokers cookbook after 4 broker restarts since the sleep interval (10 mins) is too tight. [production]
16:17 <elukey@cumin1001> END (ERROR) - Cookbook sre.kafka.roll-restart-brokers (exit_code=97) [production]
15:34 <elukey@cumin1001> START - Cookbook sre.kafka.roll-restart-brokers [production]
13:30 <elukey@cumin1001> END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) [production]
13:01 <elukey@cumin1001> START - Cookbook sre.druid.roll-restart-workers [production]
09:24 <elukey@cumin1001> END (FAIL) - Cookbook sre.druid.roll-restart-workers (exit_code=99) [production]
09:22 <elukey@cumin1001> START - Cookbook sre.druid.roll-restart-workers [production]
09:21 <elukey@cumin1001> END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) [production]
08:55 <elukey@cumin1001> START - Cookbook sre.druid.roll-restart-workers [production]
08:47 <elukey> set mcrouter async behavior for codfw replication to all mw app/api servers (changes will be picked up when puppet runs on the hosts) - T225642 [production]
08:32 <elukey@cumin1001> END (ERROR) - Cookbook sre.hadoop.roll-restart-workers (exit_code=97) [production]
08:32 <elukey@cumin1001> START - Cookbook sre.hadoop.roll-restart-workers [production]
07:18 <elukey@cumin1001> END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) [production]
06:30 <elukey@cumin1001> START - Cookbook sre.hadoop.roll-restart-workers [production]
2019-07-27 §
06:43 <elukey> powercycle mw1300 - no ssh, serial com2 stuck with no root login available [production]
2019-07-25 §
17:51 <elukey> powercycle stat1007 [production]
17:17 <elukey@cumin1001> END (PASS) - Cookbook sre.hadoop.rolling-restart-workers (exit_code=0) [production]
17:01 <elukey@cumin1001> START - Cookbook sre.hadoop.rolling-restart-workers [production]
06:42 <elukey> restart kafka* on kafka-jumbo1001 to pick up new openjdk-8 version [production]
06:37 <elukey> restart cassandra instances on aqs1004 to pick up new openjdk-8 version [production]
06:34 <elukey> add term eventgate to analytics-in4 on cr1/cr2-eqiad - T228882 [production]