2020-07-20 §
19:56 <joal> kill-restart cassandra unique-devices loading daily and monthly after deploy (2020-07-20 and 2020-07-01) [analytics]
19:55 <joal> kill-restart mediawiki-history-denormalize after deploy (2020-07-01) [analytics]
19:55 <joal> kill-restart webrequest after deploy (2020-07-20T18:00) [analytics]
19:19 <mforns> finished refinery deployment (for v0.0.131) [analytics]
19:02 <mforns> starting refinery deployment (for v0.0.131) [analytics]
19:02 <mforns> deployed refinery-source v0.0.131 [analytics]
18:16 <joal> Rerun cassandra-daily-coord-local_group_default_T_unique_devices from 2020-07-15 to 2020-07-19 (both included) [analytics]
14:50 <elukey> restart superset to pick up TLS to mysql settings [analytics]
14:18 <elukey> re-enable timers on an-launcher1002 [analytics]
14:01 <elukey> resume pageview-daily_dump-coord via Hue to ease the draining + mariadb restart [analytics]
14:00 <elukey> restart mariadb on an-coord1001 with TLS settings [analytics]
13:43 <elukey> suspend pageview-daily_dump-coord via Hue to ease the draining + mariadb restart [analytics]
12:55 <elukey> stop timers on an-launcher1002 to ease the mariadb restart on an-coord1001 (second attempt) [analytics]
09:10 <elukey> start timers on an-launcher1002 (no mysql restart happened, long jobs not completing, will postpone) [analytics]
07:16 <joal> Restart mobile_apps-session_metrics-wf-7-2020-7-12 after heisenbug kerberos failure [analytics]
06:58 <elukey> stop timers on an-launcher1002 to ease the mariadb restart on an-coord1001 [analytics]
2020-07-17 §
12:34 <elukey> deprecate pivot.wikimedia.org (to ease CAS work) [analytics]
2020-07-16 §
17:37 <andrewbogott> adding "java::egd_source: '/dev/random'" to hiera k4 prefix in order to unbreak puppet runs [analytics]
17:36 <andrewbogott> adding "profile::java::hardened_tls: false" to hiera k4 prefix in order to unbreak puppet runs [analytics]
2020-07-15 §
17:58 <joal> Backfill cassandra unique-devices for per-project-family starting 2019-07 [analytics]
08:18 <elukey> move piwik to CAS (idp.wikimedia.org) [analytics]
2020-07-14 §
15:50 <elukey> upgrade spark2 on all stat100x hosts [analytics]
15:07 <elukey> upgrade spark2 to 2.4.4-bin-hadoop2.6-3 on stat1004 [analytics]
14:55 <elukey> re-create jupyterhub's venv on stat1005/8 after https://gerrit.wikimedia.org/r/612484 [analytics]
14:45 <elukey> re-create jupyterhub's base kernel directory on stat1005 (trying to debug some problems) [analytics]
07:27 <joal> Restart forgotten unique-devices per-project-family jobs after yesterday's deploy [analytics]
2020-07-13 §
20:17 <milimetric> deployed weekly train with two oozie job bugfixes and rename to pageview_actor table [analytics]
19:42 <joal> Deploy refinery with scap [analytics]
19:24 <joal> Drop pageview_actor_hourly and replace it by pageview_actor [analytics]
18:26 <joal> Kill pageview_actor_hourly and unique_devices_per_project_family jobs to copy backfilled data [analytics]
12:35 <joal> Start backfilling of wdqs_internal (external had been done, not internal :S) [analytics]
2020-07-10 §
17:10 <nuria> updating the EL whitelist, refinery redeploy (but not source) [analytics]
16:01 <milimetric> deployed, EL whitelist is updated [analytics]
2020-07-09 §
18:52 <elukey> upgrade spark2 to 2.4.4-bin-hadoop2.6-3 on stat1008 [analytics]
2020-07-07 §
10:12 <elukey> decom archiva1001 [analytics]
2020-07-06 §
08:09 <elukey> roll restart aqs on aqs100[4-9] to pick up new druid settings [analytics]
07:51 <elukey> enable binlog on matomo's database on matomo1002 [analytics]
2020-07-04 §
10:52 <joal> Rerun mediawiki-geoeditors-monthly-wf-2020-06 after heisenbug (patch provided for long-term fix) [analytics]
2020-07-03 §
19:20 <joal> restart failed webrequest-load job webrequest-load-wf-text-2020-7-3-17 with higher thresholds - error due to burst of requests in ulsfo [analytics]
19:13 <joal> restart mediawiki-history-denormalize oozie job using 0.0.115 refinery-job jar [analytics]
19:05 <joal> kill manual execution of mediawiki-history to save an-coord1001 (too big of a spark-driver) [analytics]
18:53 <joal> restart webrequest-load-wf-text-2020-7-3-17 after hive server failure [analytics]
18:52 <joal> restart data_quality_stats-wf-event.navigationtiming-useragent_entropy-hourly-2020-7-3-15 after hive server failure [analytics]
18:51 <joal> restart virtualpageview-hourly-wf-2020-7-3-15 after hive-server failure [analytics]
16:41 <joal> Rerun mediawiki-history-check_denormalize-wf-2020-06 after having cleaned up wrong files and restarted a job without deterministic skewed join [analytics]
2020-07-02 §
18:16 <joal> Launch a manual instance of mediawiki-history-denormalize to release data despite oozie failing [analytics]
16:17 <joal> rerun mediawiki-history-denormalize-wf-2020-06 after oozie sharelib bump through manual restart [analytics]
12:41 <joal> retry mediawiki-history-denormalize-wf-2020-06 [analytics]
07:26 <elukey> start a tmux on an-launcher1002 with 'sudo -u analytics /usr/local/bin/kerberos-run-command analytics /usr/local/bin/refinery-sqoop-mediawiki-production' [analytics]
07:20 <elukey> execute systemctl reset-failed refinery-sqoop-whole-mediawiki.service to clear our alarms on launcher1002 [analytics]