2016-11-09 §
13:58 <elukey> stopping kafka* daemons on kafka1014 to upgrade its fstab with UUID (T147879) [production]
13:46 <elukey> rebooting kafka1012 for kernel and openjdk updates [production]
13:35 <elukey> stopping kafka* daemons on kafka1012 to upgrade its fstab with UUID (T147879) [production]
12:57 <elukey> rebooting kafka1022 for kernel + openjdk updates [production]
10:52 <elukey> restarting kafka* on kafka1013 for openjdk upgrades [production]
10:33 <elukey> rebooting kafka1020 for kernel and openjdk upgrades [production]
09:35 <elukey> rebooting kafka1018 for kernel + openjdk upgrade [production]
2016-11-08 §
08:04 <elukey> rebooting stat1001 for kernel upgrades (will cause brief unavailability for analytics websites) [production]
2016-11-07 §
15:44 <elukey> started kafka-mirror-main-eqiad_to_analytics.service on kafka1012 [production]
15:26 <elukey> rebooting kafka1013 for kernel upgrades [production]
2016-11-06 §
10:13 <elukey> removing logstash.log.1 from logstash100[123] to free some space [production]
2016-11-02 §
08:32 <elukey> restarted cassandra-metrics-collector on aqs100[456] for jvm upgrades [production]
2016-10-31 §
19:17 <elukey> restarted varnishkafka-webrequest on cp2018 and cp3045 (CRITICALs in icinga, librdkafka errors logged for kafka1018.eqiad.wmnet) [production]
11:00 <elukey> restarting cassandra on aqs100[456] for OpenJDK upgrades [production]
07:43 <elukey> powercycled cp2010 (not reachable via ssh, com2 console showed a frozen screen) [production]
2016-10-26 §
08:43 <elukey> increasing the AQS cassandra system_auth keyspace replication from 1 to 6 (and running nodetool-{a,b} repair system_auth on all nodes) [production]
08:29 <elukey> downgraded memcached on mc2009 to the Debian Jessie version (was part of a performance experiment) [production]
2016-10-25 §
14:14 <elukey> removed logstash filter for Apache (https://logstash.wikimedia.org/app/kibana#/dashboard/apache2log) - T144005 [production]
12:24 <elukey> rebooting druid100[123] for kernel upgrades [production]
10:11 <elukey> reimaging mc103[1-6] to Jessie [production]
2016-10-24 §
13:20 <elukey> reimaging mc120[89] and mc1030 [production]
10:47 <elukey> reimaged mc102[56], currently doing mc1027 [production]
08:53 <elukey> reimaging mc1024 [production]
08:20 <elukey> reimaging mc1023.eqiad.wmnet [production]
07:46 <elukey> reimaging mc1022.eqiad.wmnet (T137345) [production]
2016-10-21 §
16:05 <elukey> reimaging mc1021 with wmf-auto-reimage (T137345) [production]
15:28 <elukey> reimaging mc1019 with wmf-auto-reimage (T137345) [production]
14:50 <elukey> reimaging mc1020 with wmf-auto-reimage (T137345) [production]
09:32 <elukey> rebooting kafka100[12] for kernel upgrades (EventBus hosts) [production]
07:20 <elukey> rebooting stat100[234] for kernel upgrades [production]
06:26 <elukey> restarting stat1001 for kernel upgrades (will cause a brief outage for some analytics websites like analytics.w.o and pivot.w.o) [production]
2016-10-20 §
13:10 <elukey> force failover from temporary Hadoop Master node (an1002) to its standby (an1001) to restore the standard configuration [production]
13:05 <elukey> correction: force failover for Hadoop Master node (an1001) to its standby (an1002) and rebooting an1001 for kernel upgrades [production]
12:59 <elukey> force failover for Hadoop Master node (an1002) to its standby (an1002) and rebooting an1001 for kernel upgrades [production]
12:39 <elukey> restarting an1003 for kernel upgrades (oozie/hive master) [production]
11:53 <elukey> rebooting an1027 (camus job launcher) for kernel upgrades [production]
11:17 <elukey> rebooting all the Analytics Hadoop nodes for kernel upgrades [production]
10:50 <elukey> rebooting kafka200[12] for kernel upgrades (Kafka main-codfw non live cluster) [production]
10:05 <elukey> rebooting the Analytics Hadoop cluster for kernel upgrades [production]
08:57 <elukey> rebooting eventlog2001 for kernel upgrades (EL spare host) [production]
08:54 <elukey> rebooting eventlog1001 for kernel upgrades (Eventlogging host) [production]
08:32 <elukey> rebooting aqs100[456] for kernel upgrades (one at a time, de-pool/reboot/pool) [production]
08:31 <elukey> rebooting aqs100[123] for kernel upgrades (one at a time, de-pool/reboot/pool) [production]
2016-10-19 §
17:15 <elukey> depooled mw1239.eqiad.wmnet to allow hw investigation (T148421) (was done today but wasn't logged properly) [production]
2016-10-18 §
12:52 <elukey> mw1169 back in service after reimage (MW Jobrunner) [production]
11:55 <elukey> removed /etc/mysql/conf.d/research-client.cnf from stat1002 (root:root perms, not supposed to be there but only on stat1003) [production]
11:37 <elukey> reimaging mw1169 to Debian Jessie (MW Jobrunner) [production]
10:40 <elukey> mw1168.eqiad.wmnet back in service after reimage (MW Jobrunner) [production]
09:28 <elukey> reimaging mw1168 to Debian Jessie (MW Jobrunner) [production]
09:25 <elukey> varnishkafka restarting in upload/misc/maps with new settings (https://gerrit.wikimedia.org/r/316306) [production]