2019-10-10 §
09:13 <joal> Kill stuck oozie launcher in yarn (application_1569878150519_43184) [analytics]
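Killing a stuck Oozie launcher is normally done through the YARN CLI; a minimal sketch (the sudo wrapper is an assumption about local permissions):
```
# kill the stuck launcher container by its YARN application id
sudo -u hdfs yarn application -kill application_1569878150519_43184
```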
2019-10-09 §
20:52 <milimetric> deploy of refinery and refinery-source 0.0.102 finally seems to have finished [analytics]
19:55 <milimetric> refinery ... probably? deployed with errors like "No such file or directory (2)\nrsync error" [analytics]
17:11 <elukey> restart druid-broker on druid100[5-6] - not serving data correctly [analytics]
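A sketch of the broker restarts, assuming the Druid components run as systemd units named after the service (e.g. druid-broker):
```
for host in druid1005 druid1006; do
  ssh "$host" 'sudo systemctl restart druid-broker && systemctl is-active druid-broker'
done
```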
2019-10-08 §
09:22 <elukey> delete druid old test datasource from the analytics cluster - test_kafka_event_centralnoticeimpression [analytics]
2019-10-07 §
17:46 <ottomata> powercycling stat1007 [analytics]
06:08 <elukey> upgrade python-kafka on eventlog1002 to 1.4.7-1 (manually via dpkg -i) [analytics]
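A manual dpkg upgrade of this kind would look roughly like the following; the .deb filename is illustrative:
```
sudo dpkg -i python-kafka_1.4.7-1_all.deb   # package filename is illustrative
apt-cache policy python-kafka               # confirm the installed version is 1.4.7-1
```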
2019-10-05 §
18:18 <elukey> kill/restart mediawiki-history-reduced oozie coord to pick up the new druid_loader.py version on HDFS [analytics]
06:49 <elukey> force umount/remount of /mnt/hdfs on an-coord1001 - processes stuck in D state, fuser proc consuming a ton of memory [analytics]
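A sketch of the forced remount, assuming /mnt/hdfs is the hadoop-fuse-dfs mount defined in /etc/fstab:
```
sudo umount -f /mnt/hdfs || sudo umount -l /mnt/hdfs   # fall back to a lazy umount if the fuse mount hangs
sudo mount /mnt/hdfs                                   # remount via the existing fstab entry
ls /mnt/hdfs/wmf                                       # quick sanity check that the mount answers again
```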
2019-10-04 §
16:27 <ottomata> manually rsyncing mediawiki_history 2019-08 snapshot to labstore1006 [analytics]
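The manual copy would look something like this; the source path and destination rsync module are placeholders, not the actual ones used:
```
rsync -rt --info=progress2 \
  /mnt/hdfs/wmf/data/archive/mediawiki/history/2019-08/ \
  labstore1006.wikimedia.org::<module>/mediawiki_history/2019-08/
```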
2019-10-03 §
14:17 <elukey> stop the Hadoop test cluster to migrate it to the new kerberos cluster [analytics]
13:26 <elukey> re-run refinery-download-project-namespace-map (modified with recent fixes for encoding and python3) [analytics]
09:48 <elukey> ran apt-get autoremove -y on all Hadoop workers to remove old Python 2 deps [analytics]
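Fleet-wide runs like this usually go through cumin from a cumin master; a sketch, with the host alias as an assumption:
```
sudo cumin 'A:hadoop-worker' 'apt-get autoremove -y'
```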
08:43 <elukey> apply 5% threshold to the HDFS balancer - T231828 [analytics]
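The equivalent manual balancer invocation (in practice the threshold is more likely set on the puppetized balancer job) would be:
```
# only move blocks on nodes more than 5% away from average cluster utilization
sudo -u hdfs hdfs balancer -threshold 5
```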
07:48 <elukey> restart druid-broker on druid1003 (used by superset) [analytics]
07:47 <elukey> restart superset to test if a stale status might cause data not to be shown [analytics]
2019-10-02 §
21:21 <nuria> restarting superset [analytics]
16:18 <elukey> kill duplicate of oozie pageview-druid-hourly coord and start the wrongly killed oozie pageview-hourly-coord (causing jobs to wait for data) [analytics]
13:12 <elukey> remove python-requests from all the hadoop workers (shouldn't be needed anymore) [analytics]
13:08 <elukey> kill/start oozie webrequest druid daily/hourly coords to pick up new druid_loader.py version [analytics]
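The usual kill/start cycle for an Oozie coordinator looks roughly like this; the coordinator id, properties path, and -D override are illustrative, and OOZIE_URL is assumed to be set in the environment:
```
oozie job -kill 0012345-190101123456789-oozie-oozi-C   # id of the running coordinator (illustrative)
oozie job -run \
  -config /srv/deployment/analytics/refinery/oozie/webrequest/druid/hourly/coordinator.properties \
  -D refinery_directory=hdfs://analytics-hadoop/wmf/refinery/current
```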
13:04 <elukey> kill/start oozie virtualpageview druid daily/monthly coords to pick up new druid_loader.py version [analytics]
12:54 <elukey> kill/start oozie unique devices per family druid daily/daily_agg_mon/monthly coords to pick up new druid_loader.py version [analytics]
10:24 <elukey> restart unique dev per domain druid daily_agg_monthly/daily/monthly coords to pick up new hdfs version of druid_loader.py [analytics]
10:15 <elukey> re-run unique devices druid daily 28/09/2019 - failed but possibly no alert was fired to analytics-alerts@ [analytics]
09:48 <elukey> restart pageview druid hourly/daily/monthly coords to pick up new hdfs version of druid_loader.py [analytics]
09:45 <elukey> restart mw geoeditors druid coord to pick up new hdfs version of druid_loader.py [analytics]
09:41 <elukey> restart edit druid hourly coord to pick up new hdfs version of druid_loader.py [analytics]
09:38 <elukey> restart banner activity druid daily/monthly coords to pick up new hdfs version of druid_loader.py [analytics]
08:31 <elukey> kill/restart mw check denormalize with hive2_jdbc parameter [analytics]
2019-09-30 §
21:05 <ottomata> rolling restart of hdfs namenode and hdfs resourcemanager to take presto proxy user settings [analytics]
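A sketch of one step of the rolling restart, assuming CDH-style systemd unit names and restarting the standby master first; the namenode id is a placeholder:
```
# on each Hadoop master, one at a time, with a health check in between
sudo systemctl restart hadoop-hdfs-namenode
sudo systemctl restart hadoop-yarn-resourcemanager
sudo -u hdfs hdfs haadmin -getServiceState <namenode-id>   # verify active/standby state before moving on
```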
05:26 <elukey> re-run manually pageview-druid-hourly 29/09T22:00 [analytics]
2019-09-27 §
06:44 <elukey> clean up files older than 30d in /var/log/{oozie,hive} on an-coord1001 [analytics]
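A sketch of the cleanup:
```
sudo find /var/log/oozie /var/log/hive -type f -mtime +30 -print    # dry run: list what would be removed
sudo find /var/log/oozie /var/log/hive -type f -mtime +30 -delete   # then actually delete files older than 30 days
```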
2019-09-26 §
18:42 <mforns> finished deploying refinery using scap (together with refinery-source 0.0.101) [analytics]
18:27 <mforns> deploying refinery using scap (together with refinery-source 0.0.101) [analytics]
17:33 <elukey> run apt-get autoremove on stat* and notebook* to clean up old python2 deps [analytics]
15:01 <mforns> deploying analytics/aqs using scap [analytics]
13:04 <elukey> removing python2 packages from the analytics hosts (not from eventlog1002) [analytics]
11:13 <mforns> deployed analytics-refinery-source v0.0.101 using Jenkins [analytics]
05:47 <elukey> upload the new version of the pageview whitelist - https://gerrit.wikimedia.org/r/539225 [analytics]
2019-09-25 §
13:37 <elukey> move the Hadoop test cluster to the Analytics Zookeeper cluster [analytics]
08:37 <elukey> add netflow realtime ingestion alert for Druid [analytics]
06:02 <elukey> set python3 for all report updater jobs on stat1006/7 [analytics]
2019-09-24 §
14:46 <ottomata> temporarily disabled camus-mediawiki_analytics_events systemd timer on an-coord1001 - T233718 [analytics]
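A sketch of disabling the timer, assuming the systemd unit is named after the job (puppet would re-enable it unless it is also disabled there):
```
sudo systemctl stop camus-mediawiki_analytics_events.timer
sudo systemctl disable camus-mediawiki_analytics_events.timer
systemctl list-timers | grep camus   # confirm the timer no longer fires
```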
13:18 <joal> Manually repairing wmf.mediawiki_wikitext_history [analytics]
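Assuming the repair is a Hive partition repair, a minimal sketch:
```
# re-sync the table's partition list with what is actually on HDFS
hive -e 'MSCK REPAIR TABLE wmf.mediawiki_wikitext_history;'
```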
06:07 <elukey> update Druid Kafka supervisor for netflow to index new dimensions [analytics]
2019-09-23 §
20:56 <ottomata> created new camus job for high volume mediawiki analytics events: mediawiki_analytics_events [analytics]
16:46 <elukey> deploy refinery again (no hdfs, no source) to deploy the latest python fixes [analytics]
09:25 <elukey> temporarily disable *drop* timers on an-coord1001 to verify refinery python change with the team [analytics]
08:24 <elukey> deploy refinery to apply all the python2 -> python3 fixes [analytics]
07:44 <elukey> restart manually refine_mediawiki_events on an-coord1001 with --since 48 to force the refinement after camus backfilled the missing data [analytics]