2020-07-09 §
18:52 <elukey> upgrade spark2 to 2.4.4-bin-hadoop2.6-3 on stat1008 [analytics]
2020-07-07 §
10:12 <elukey> decom archiva1001 [analytics]
2020-07-06 §
08:09 <elukey> roll restart aqs on aqs100[4-9] to pick up new druid settings [analytics]
07:51 <elukey> enable binlog on matomo's database on matomo1002 [analytics]
2020-07-04 §
10:52 <joal> Rerun mediawiki-geoeditors-monthly-wf-2020-06 after heisenbug (patch provided for long-term fix) [analytics]
2020-07-03 §
19:20 <joal> restart failed webrequest-load job webrequest-load-wf-text-2020-7-3-17 with higher thresholds - error due to burst of requests in ulsfo [analytics]
19:13 <joal> restart mediawiki-history-denormalize oozie job using 0.0.115 refinery-job jar [analytics]
19:05 <joal> kill manual execution of mediawiki-history to save an-coord1001 (too big of a spark-driver) [analytics]
18:53 <joal> restart webrequest-load-wf-text-2020-7-3-17 after hive server failure [analytics]
18:52 <joal> restart data_quality_stats-wf-event.navigationtiming-useragent_entropy-hourly-2020-7-3-15 after hive server failure [analytics]
18:51 <joal> restart virtualpageview-hourly-wf-2020-7-3-15 after hive-server failure [analytics]
16:41 <joal> Rerun mediawiki-history-check_denormalize-wf-2020-06 after having cleaned up wrong files and restarted a job without deterministic skewed join [analytics]
2020-07-02 §
18:16 <joal> Launch a manual instance of mediawiki-history-denormalize to release data despite oozie failing [analytics]
16:17 <joal> rerun mediawiki-history-denormalize-wf-2020-06 after oozie sharelib bump through manual restart [analytics]
12:41 <joal> retry mediawiki-history-denormalize-wf-2020-06 [analytics]
07:26 <elukey> start a tmux on an-launcher1002 with 'sudo -u analytics /usr/local/bin/kerberos-run-command analytics /usr/local/bin/refinery-sqoop-mediawiki-production' [analytics]
07:20 <elukey> execute systemctl reset-failed refinery-sqoop-whole-mediawiki.service to clear our alarms on launcher1002 [analytics]
2020-07-01 §
19:04 <joal> Kill/restart webrequest-load-bundle for mobile-pageview update [analytics]
18:59 <joal> kill/restart pageview-druid jobs (hourly, daily, monthly) for in_content_namespace field update [analytics]
18:57 <joal> kill/restart mediawiki-wikitext-history-coord and mediawiki-wikitext-current-coord for bz2 codec update [analytics]
18:55 <joal> kill/restart mediawiki-history-denormalize-coord after skewed-join strategy update [analytics]
18:52 <joal> Kill/Restart unique_devices-per_project_family-monthly-coord after fix [analytics]
18:41 <joal> deploy refinery to HDFS [analytics]
18:28 <joal> Deploy refinery using scap after hotfix [analytics]
18:20 <joal> Deploy refinery using scap [analytics]
16:58 <joal> trying to release refinery-source 0.0.129 to archiva, version 3 [analytics]
16:51 <elukey> remove /etc/maven/settings.xml from all analytics nodes that have it [analytics]
2020-06-30 §
18:28 <joal> trying to release refinery-source to archiva from jenkins (second time) [analytics]
16:30 <joal> Release refinery-source v0.0.129 using jenkins [analytics]
16:30 <joal> Deploy refinery [analytics]
16:05 <elukey> re-enable timers on an-launcher1002 after archiva maintenance [analytics]
15:23 <elukey> stop timers on an-launcher1002 to ease debugging for refinery deploy [analytics]
13:12 <elukey> restart nodemanager on analytics1068 after GC overhead and OOMs [analytics]
09:32 <joal> Kill/Restart mediawiki-wikitext-history job now that the current month one is done (bz2 fix) [analytics]
2020-06-29 §
13:09 <elukey> archiva.wikimedia.org migrated to archiva1002 [analytics]
2020-06-25 §
17:20 <elukey> move RU jobs/timers from an-launcher1001 to an-launcher1002 [analytics]
16:07 <elukey> move all timers but RU from an-launcher1001 to 1002 (puppet disabled on 1001, all timers completed) [analytics]
12:13 <elukey> reimage notebook1003/4 to debian buster as fresh start [analytics]
09:28 <joal> Kill-restart pageview-hourly to read from pageview_actor [analytics]
09:25 <joal> Kill-restart pageview_actor jobs (current+backfill) after deploy [analytics]
09:14 <joal> Deploy refinery to HDFS [analytics]
08:56 <joal> deploying refinery using scap to fix pageview_actor_hourly [analytics]
08:02 <joal> Start backfilling pageview_actor_hourly job with new patch (expected to solve heisenbug) [analytics]
07:40 <joal> Dropping refinery-camus jars from archiva up to 0.0.115 [analytics]
07:04 <joal> rerun failed pageview_actor_hourly [analytics]
2020-06-24 §
19:36 <joal> Cleaning refinery-spark from archiva (up to 0.0.115) [analytics]
19:28 <joal> Cleaning refinery-tools from archiva (up to 0.0.115) [analytics]
19:16 <joal> Restarting unique-devices jobs to use pageview_actor_hourly instead of webrequest (4 jobs) [analytics]
19:08 <joal> Start pageview_actor_hourly oozie job [analytics]
19:06 <joal> Create pageview_actor_hourly after deploy to start new jobs [analytics]