2020-07-28 §
09:10 <elukey> temporarily stop eventlogging file consumers on eventlog1002 to copy some data over to stat1005 (/srv partition full) [analytics]
08:03 <elukey> Superset migrated to CAS [analytics]
06:42 <elukey> re-run webrequest-load hour 2020-7-28-3 [analytics]
2020-07-27 §
17:15 <elukey> restart eventlogging on eventlog1002 to update the event whitelist (exclude MobileWebUIClickTracking) [analytics]
08:19 <elukey> reset-failed the monitor_refine_failures for eventlogging on an-launcher1002 [analytics]
06:44 <elukey> truncate big log file on an-launcher1002 that is filling up the /srv partition [analytics]
2020-07-22 §
15:05 <joal> manually drop /user/analytics/.Trash/200714000000/wmf/data/wmf/pageview/actor to free some space [analytics]
15:03 <joal> Manually drop /wmf/data/wmf/mediawiki/wikitext/history/snapshot=2020-03 to free some space [analytics]
15:01 <elukey> hdfs dfs -rm -r -skipTrash /var/log/hadoop-yarn/apps/analytics-privatedata/logs [analytics]
14:49 <elukey> hdfs dfs -rm -r -skipTrash /var/log/hadoop-yarn/apps/analytics/logs/* [analytics]
08:09 <elukey> turnilo.wikimedia.org migrated to CAS [analytics]
2020-07-21 §
18:30 <mforns> finished re-deploying refinery to unbreak unique devices per domain monthly [analytics]
18:05 <mforns> re-deploying refinery to unbreak unique devices per domain monthly [analytics]
17:34 <mforns> restarted unique_devices-per_domain-daily-coord [analytics]
15:09 <elukey> yarn.wikimedia.org migrated earlier on to CAS auth [analytics]
14:58 <ottomata> Refine - reverted change to not merge hive schema + event schema before reading - T255818 [analytics]
13:36 <ottomata> Refine no longer merges with Hive table schema when reading (except for refine_eventlogging_analytics job) - T255818 [analytics]
2020-07-20 §
19:56 <joal> kill-restart cassandra unique-devices loading daily and monthly after deploy (2020-07-20 and 2020-07-01) [analytics]
19:55 <joal> kill-restart mediawiki-history-denormalize after deploy (2020-07-01) [analytics]
19:55 <joal> kill-restart webrequest after deploy (2020-07-20T18:00) [analytics]
19:19 <mforns> finished refinery deployment (for v0.0.131) [analytics]
19:02 <mforns> starting refinery deployment (for v0.0.131) [analytics]
19:02 <mforns> deployed refinery-source v0.0.131 [analytics]
18:16 <joal> Rerun cassandra-daily-coord-local_group_default_T_unique_devices from 2020-07-15 to 2020-07-19 (both included) [analytics]
14:50 <elukey> restart superset to pick up TLS to mysql settings [analytics]
14:18 <elukey> re-enable timers on an-launcher1002 [analytics]
14:01 <elukey> resume pageview-daily_dump-coord via Hue to ease the draining + mariadb restart [analytics]
14:00 <elukey> restart mariadb on an-coord1001 with TLS settings [analytics]
13:43 <elukey> suspend pageview-daily_dump-coord via Hue to ease the draining + mariadb restart [analytics]
12:55 <elukey> stop timers on an-launcher1002 to ease the mariadb restart on an-coord1001 (second attempt) [analytics]
09:10 <elukey> start timers on an-launcher1002 (no mysql restart happened, long jobs not completing, will postpone) [analytics]
07:16 <joal> Restart mobile_apps-session_metrics-wf-7-2020-7-12 after heisenbug Kerberos failure [analytics]
06:58 <elukey> stop timers on an-launcher1002 to ease the mariadb restart on an-coord1001 [analytics]
2020-07-17 §
12:34 <elukey> deprecate pivot.wikimedia.org (to ease CAS work) [analytics]
2020-07-16 §
17:37 <andrewbogott> adding "java::egd_source: '/dev/random'" to hiera k4 prefix in order to unbreak puppet runs [analytics]
17:36 <andrewbogott> adding "profile::java::hardened_tls: false" to hiera k4 prefix in order to unbreak puppet runs [analytics]
2020-07-15 §
17:58 <joal> Backfill cassandra unique-devices for per-project-family starting 2019-07 [analytics]
08:18 <elukey> move piwik to CAS (idp.wikimedia.org) [analytics]
2020-07-14 §
15:50 <elukey> upgrade spark2 on all stat100x hosts [analytics]
15:07 <elukey> upgrade spark2 to 2.4.4-bin-hadoop2.6-3 on stat1004 [analytics]
14:55 <elukey> re-create jupyterhub's venv on stat1005/8 after https://gerrit.wikimedia.org/r/612484 [analytics]
14:45 <elukey> re-create jupyterhub's base kernel directory on stat1005 (trying to debug some problems) [analytics]
07:27 <joal> Restart forgotten unique-devices per-project-family jobs after yesterday's deploy [analytics]
2020-07-13 §
20:17 <milimetric> deployed weekly train with two oozie job bugfixes and rename to pageview_actor table [analytics]
19:42 <joal> Deploy refinery with scap [analytics]
19:24 <joal> Drop pageview_actor_hourly and replace it by pageview_actor [analytics]
18:26 <joal> Kill pageview_actor_hourly and unique_devices_per_project_family jobs to copy backfilled data [analytics]
12:35 <joal> Start backfilling of wdqs_internal (external had been done, not internal :S) [analytics]
2020-07-10 §
17:10 <nuria> updating the EL whitelist, refinery redeploy (but not source) [analytics]
16:01 <milimetric> deployed, EL whitelist is updated [analytics]