production SAL

2018-05-16 §
05:54	<elukey>	removed acpi_power_meter manually from conf1004 (blacklisted module in puppet), Acpi errors in dmesg	[production]
2018-05-15 §
16:36	<elukey>	rolling restart of aqs on aqs* nodes to pick up the new druid config	[production]
16:12	<elukey>	roll restart kafka on kafka-jumbo to pick up new zookeeper settings	[production]
15:52	<elukey>	rolling restart of hadoop master daemons to pick up new zookeeper settings	[production]
15:20	<elukey>	roll restart of Kafka Analytics to pick up new zookeeper settings	[production]
14:59	<elukey>	roll restart of kafka daemons on kafka100[1-3] to pick up new zookeeper settings and group.initial.rebalance.delay.ms = 10s	[production]
14:14	<elukey>	swap conf1001 with conf1004 in the zookeeper main eqiad's config + roll restart of the service	[production]
13:50	<elukey>	roll restart of kafka main codfw (kafka200[1-3]) to pick up group.initial.rebalance.delay.ms = 10s	[production]
2018-05-14 §
16:19	<elukey>	umount/remount /mnt/hdfs on stat1005 to pick up new openjdk upgrades	[production]
06:38	<elukey>	rolling restart of cassandra on aqs* for openjdk-8 upgrades	[production]
2018-05-11 §
14:28	<elukey>	restart kafka brokers on kafka10[20,22,23] to pick up openjdk-7 security upgrades	[production]
14:13	<elukey>	restart Hadoop daemons on analytics100[12] for openjdk security upgrades	[production]
10:07	<elukey>	reimage analytics1052 to Debian Stretch (Hadoop Journal node)	[production]
07:59	<elukey>	reimage analytics1035 to Debian Stretch	[production]
2018-05-10 §
15:54	<elukey>	drain and reimage analytics1028 to Debian Jessie (Hadoop Journal node)	[production]
15:00	<elukey>	rolling restart of Hadoop HDFS datanodes on analytics workers to pick up the new openjdk-8 security upgrades	[production]
14:38	<elukey>	rolling restart of Hadoop Yarn nodemanagers on analytics worker nodes for openjdk-8 security upgrades	[production]
14:07	<elukey>	restart hive/oozie Hadoop daemons on analytics1003 for openjdk-8 upgrades	[production]
13:31	<elukey>	rolling restart of kafka on kafka-jumbo1* for openjdk-8 security upgrades	[production]
13:19	<elukey>	reimage analytics1029 to Debian Stretch	[production]
11:46	<elukey>	reimage analytics1030/31 to Debian Stretch	[production]
2018-05-09 §
13:08	<elukey>	reimage analytics103[2,3] to Debian Stretch	[production]
2018-05-08 §
09:20	<elukey>	forced a BBU re-learn cycle on analytics1032	[production]
07:53	<elukey>	second attempt to remove the cassandra-metrics-collector (+ cleanup) from aqs*	[production]
2018-05-07 §
16:57	<elukey>	executed sudo megacli -AdpBbuCmd -BbuLearn -aALL -NoLog on analytics1032 - BBU alerts flapping	[production]
09:52	<elukey>	stop graphite cassandra-metrics-collector on aqs* (touch /etc/cassandra-metrics-collector/disable)	[production]
08:56	<elukey>	drain + reimage analytics103[7,8] to Debian Stretch	[production]
2018-05-03 §
05:57	<elukey>	reimage analytics10[39,40] to Debian Stretch	[production]
2018-05-02 §
13:29	<elukey>	upgrade zookeeper to 3.4.9 on druid100[4-6] (wikistats 2 backend) - T164008	[production]
13:20	<elukey>	restart druid broker on druid100[1-3] to enable the 'druid.sql.enable' feature	[production]
08:11	<elukey>	upgrading Druid to 0.10 on druid100[4-6] (wikistats 2 backend) - T164008	[production]
07:42	<elukey>	remove openjdk-7 related packages from druid100[1-3] after zookeeper upgrade	[production]
07:31	<elukey>	upgrade zookeeper on druid100[1-3] to 3.4.9 - T164008	[production]
2018-04-30 §
14:27	<elukey>	upgrade druid on druid100[1-3] from 0.9.2 to 0.10	[production]
13:11	<elukey>	reimage analytics1049 and 1050 to Debian Stretch	[production]
10:06	<elukey>	restart hdfs namenode on analytics1002 to pick up new heap settings (last step of the maintenance)	[production]
10:00	<elukey>	set analytics1001 as active HDFS Namenode using manual failover	[production]
09:50	<elukey>	restart HDFS Namenode on analytics1001 (current standby) again with Xmx/Xms set to 8g	[production]
09:47	<elukey>	restart HDFS Namenode on analytics1001 (current standby)	[production]
08:38	<elukey>	restart HDFS namenode on analytics1001 (standby master) to pick up new JVM settings - T193257	[production]
08:16	<elukey>	force a manual failover of the HDFS Namenode from analytics1001 to analytics1002 to test new GC Settings - T193257	[production]
07:31	<elukey>	restart HDFS namenode on analytics1002 (standby master) to pick up new JVM settings - T193257	[production]
2018-04-27 §
08:58	<elukey>	reimage analytics10[51,53] to Debian Stretch	[production]
2018-04-25 §
14:37	<elukey>	restart hive-server2 on analytics1003 to pick up settings in https://gerrit.wikimedia.org/r/428919	[production]
09:58	<elukey>	reimage analytics106[1,2] to Debian Stretch	[production]
2018-04-24 §
19:49	<elukey>	re-enable ircecho	[production]
19:36	<elukey>	stop ircecho on einsteinium - icinga shower	[production]
16:30	<elukey>	restart hadoop hdfs journalnode on analytics1035/52 to pick up prometheus jmx settings	[production]
16:11	<elukey>	restart hadoop-hdfs-journalnode on analytics1028 to pick up prometheus monitoring	[production]
14:41	<elukey>	restart hadoop hdfs journalnode on analytics1028 to pick up jmx settings	[production]