2020-02-21 §
11:41 <elukey> restart varnishkafka on cp3057 (stuck in timeouts to kafka, analytics alarms raised) [analytics]
08:19 <fdans> deploying refinery [analytics]
00:11 <joal> Rerun failed wikidata-json_entity-weekly-coord instances after having created the missing hive table [analytics]
2020-02-20 §
16:57 <fdans> refinery source jars updated [analytics]
16:39 <fdans> deploying refinery source 0.0.114 [analytics]
15:16 <fdans> deploying AQS [analytics]
2020-02-19 §
16:58 <ottomata> Deployed refinery using scap, then deployed onto hdfs [analytics]
2020-02-17 §
18:29 <elukey> reboot turnilo and superset's hosts for kernel upgrades [analytics]
18:25 <elukey> restart kafka on kafka-jumbo1001 to pick up new openjdk updates [analytics]
18:22 <elukey> restart cassandra on aqs1004 to pick up new openjdk updates [analytics]
17:59 <elukey> restart druid daemons on druid1003 to pick up new openjdk updates [analytics]
17:58 <elukey> restart cassandra on aqs1004 to pick up new openjdk updates [analytics]
17:56 <elukey> restart hadoop daemons on analytics1042 to pick up new openjdk updates [analytics]
2020-02-15 §
12:07 <elukey> re-run failed pageview druid hour [analytics]
12:05 <elukey> re-run failed virtualpageview hours [analytics]
2020-02-12 §
14:33 <elukey> restart hue on analytics-tool1001 [analytics]
13:36 <joal> Kill-restart webrequest bundle to see if it mitigates the error [analytics]
2020-02-10 §
15:26 <elukey> kill application_1576512674871_246621 (consuming too much memory) [analytics]
14:31 <elukey> kill application_1576512674871_246419 (eating a ton of ram on the cluster) [analytics]
2020-02-08 §
09:35 <elukey> created /wmf/data/raw/wikidata/dumps/all_ttl on hdfs [analytics]
09:35 <elukey> created /wmf/data/raw/wikidata/dumps/all_json on hdfs [analytics]
2020-02-05 §
21:14 <joal> Kill data_quality_stats-hourly-bundle and data_quality_stats-daily-bundle [analytics]
21:11 <joal> Kill-restart mediawiki-history-dumps-coord, drop existing data, and restart at 2019-11 [analytics]
21:06 <joal> Kill-restart mediawiki-wikitext-history-coord and mediawiki-wikitext-current-coord [analytics]
20:51 <joal> Deploy refinery using scap [analytics]
20:29 <joal> Refinery-source released in archiva by jenkins [analytics]
20:20 <joal> Deploy hdfs-tools 0.0.5 using scap [analytics]
2020-02-03 §
11:20 <elukey> restart oozie on an-coord1001 [analytics]
10:11 <elukey> enable all timers on an-coord1001 after spark encryption/auth settings [analytics]
09:32 <elukey> roll restart yarn node managers again to pick up spark encryption/authentication settings [analytics]
08:34 <elukey> stop timers on an-coord1001 to drain the cluster and ease the deploy of spark encryption settings [analytics]
07:58 <elukey> roll restart hadoop yarn node managers to pick up new libcrypto.so link (shouldn't be necessary but just in case) [analytics]
07:24 <elukey> create /usr/lib/x86_64-linux-gnu/libcrypto.so on all the analytics nodes via puppet [analytics]
2020-01-27 §
05:38 <elukey> re-run webrequest text 2020-01-26T20/21 with higher dataloss thresholds (false positives) [analytics]
02:49 <elukey> re-run refine eventlogging manually to clear out refine failed events [analytics]
2020-01-26 §
17:58 <elukey> re-run failed refine job for MobileWebUIActionsTracking 2020-01-26T12 [analytics]
17:32 <elukey> restart varnishkafka on cp3056/cp3064 due to network issues on the hosts [analytics]
2020-01-23 §
17:48 <milimetric> launching a sqoop for imagelinks (will be slow because tuning sess) [analytics]
2020-01-20 §
12:19 <elukey> restart zookeeper on an-conf100X to pick up openjdk-11 updates [analytics]
2020-01-18 §
10:06 <elukey> re-run all entropy job failed via Hue (StopWatch issue) [analytics]
2020-01-16 §
20:52 <mforns> deployed refinery accompanying source v0.0.112 [analytics]
17:00 <mforns> deployed refinery-source v0.0.112 [analytics]
15:17 <elukey> upgrade superset to 0.35.2 [analytics]
15:14 <elukey> stop superset as prep step for upgrade [analytics]
2020-01-15 §
10:44 <elukey> remove flume-ng and spark-python/core packages from an-coord1001,analytics1030,analytics-tool1001,analytics1039 - T242754 [analytics]
10:39 <elukey> remove flume-ng from all stat/notebooks - T242754 [analytics]
10:37 <elukey> remove spark-core flume-ng from all the hadoop workers - T242754 [analytics]
08:44 <elukey> move aqs to the new rsyslog-logstash pipeline [analytics]
2020-01-14 §
20:12 <milimetric> deployed aqs with new service-runner version 2.7.3 [analytics]
2020-01-13 §
21:45 <milimetric> webrequest restarted [analytics]