2020-08-04 §
07:29 <elukey> stop kafka supervisor for netflow on Druid Analytics (prep step for druid upgrade) [analytics]
07:00 <elukey> suspend all druid-related coordinators in Hue as prep step for upgrade [analytics]
2020-08-03 §
09:53 <elukey> move all druid-related systemd timers to spark client mode - T254493 [analytics]
08:07 <elukey> roll restart aqs on aqs* to pick up new druid settings [analytics]
2020-08-01 §
13:22 <joal> Rerun cassandra-monthly-wf-local_group_default_T_unique_devices-2020-7 to load missing data (email with bug description sent to list) [analytics]
2020-07-31 §
14:46 <mforns> restarted webrequest oozie bundle [analytics]
14:46 <mforns> restarted mediawiki history reduced oozie job [analytics]
09:00 <elukey> SET GLOBAL expire_logs_days=14; on matomo1002's mysql [analytics]
09:00 <elukey> SET GLOBAL expire_logs_days=14; on an-coord1001's mysql [analytics]
06:32 <elukey> roll restart of druid brokers on druid100[4-8] to pick up new changes [analytics]
2020-07-30 §
19:14 <mforns> finished refinery deploy (for v0.0.132) [analytics]
18:48 <mforns> starting refinery deploy (for v0.0.132) [analytics]
18:27 <mforns> deployed refinery-source v0.0.132 [analytics]
2020-07-29 §
14:37 <mforns> quick deployment of pageview white-list [analytics]
2020-07-28 §
17:52 <ottomata> stopped writing eventlogging data log files on eventlog1002 and stopped syncing them to stat100[67] - T259030 [analytics]
14:29 <elukey> stop client-side-events-log.service on eventlog1002 to avoid /srv to fill up [analytics]
09:48 <elukey> re-enable eventlogging file consumers on eventlog1002 [analytics]
09:10 <elukey> temporarily stop eventlogging file consumers on eventlog1002 to copy some data over to stat1005 (/srv partition full) [analytics]
08:03 <elukey> Superset migrated to CAS [analytics]
06:42 <elukey> re-run webrequest-load hour 2020-7-28-3 [analytics]
2020-07-27 §
17:15 <elukey> restart eventlogging on eventlog1002 to update the event whitelist (exclude MobileWebUIClickTracking) [analytics]
08:19 <elukey> reset-failed the monitor_refine_failures for eventlogging on an-launcher1002 [analytics]
06:44 <elukey> truncate big log file on an-launcher1002 that is filling up the /srv partition [analytics]
2020-07-22 §
15:05 <joal> manually drop /user/analytics/.Trash/200714000000/wmf/data/wmf/pageview/actor to free some space [analytics]
15:03 <joal> Manually drop /wmf/data/wmf/mediawiki/wikitext/history/snapshot=2020-03 to free some space [analytics]
15:01 <elukey> hdfs dfs -rm -r -skipTrash /var/log/hadoop-yarn/apps/analytics-privatedata/logs [analytics]
14:49 <elukey> hdfs dfs -rm -r -skipTrash /var/log/hadoop-yarn/apps/analytics/logs/* [analytics]
08:09 <elukey> turnilo.wikimedia.org migrated to CAS [analytics]
2020-07-21 §
18:30 <mforns> finished re-deploying refinery to unbreak unique devices per domain monthly [analytics]
18:05 <mforns> re-deploying refinery to unbreak unique devices per domain monthly [analytics]
17:34 <mforns> restarted unique_devices-per_domain-daily-coord [analytics]
15:09 <elukey> yarn.wikimedia.org migrated earlier on to CAS auth [analytics]
14:58 <ottomata> Refine - reverted change to not merge hive schema + event schema before reading - T255818 [analytics]
13:36 <ottomata> Refine no longer merges with Hive table schema when reading (except for refine_eventlogging_analytics job) - T255818 [analytics]
2020-07-20 §
19:56 <joal> kill-restart cassandra unique-devices loading daily and monthly after deploy (2020-07-20 and 2020-07-01) [analytics]
19:55 <joal> kill-restart mediawiki-history-denormalize after deploy (2020-07-01) [analytics]
19:55 <joal> kill-restart webrequest after deploy (2020-07-20T18:00) [analytics]
19:19 <mforns> finished refinery deployment (for v0.0.131) [analytics]
19:02 <mforns> starting refinery deployment (for v0.0.131) [analytics]
19:02 <mforns> deployed refinery-source v0.0.131 [analytics]
18:16 <joal> Rerun cassandra-daily-coord-local_group_default_T_unique_devices from 2020-07-15 to 2020-07-19 (both included) [analytics]
14:50 <elukey> restart superset to pick up TLS to mysql settings [analytics]
14:18 <elukey> re-enable timers on an-launcher1002 [analytics]
14:01 <elukey> resume pageview-daily_dump-coord via Hue to ease the draining + mariadb restart [analytics]
14:00 <elukey> restart mariadb on an-coord1001 with TLS settings [analytics]
13:43 <elukey> suspend pageview-daily_dump-coord via Hue to ease the draining + mariadb restart [analytics]
12:55 <elukey> stop timers on an-launcher1002 to ease the mariadb restart on an-coord1001 (second attempt) [analytics]
09:10 <elukey> start timers on an-launcher1002 (no mysql restart happened, long jobs not completing, will postpone) [analytics]
07:16 <joal> Restart mobile_apps-session_metrics-wf-7-2020-7-12 after heisenbug kerberos failure [analytics]
06:58 <elukey> stop timers on an-launcher1002 to ease the mariadb restart on an-coord1001 [analytics]