2017-07-01 §
21:33 <joal> Restart cassandra bundle at beginning of the month [analytics]
2017-06-29 §
11:39 <joal> Update tables and archived data and kill/start jobs for unique-devices per project-family [analytics]
11:34 <joal> Kill and restart druid webrequest sampled oozie jobs after deploy [analytics]
11:18 <joal> Update tables and restart mediawiki_history oozie jobs after deploy [analytics]
10:58 <elukey> deploy refinery to HDFS [analytics]
10:57 <elukey> fixed archiva whitelist in the analytics VLAN (VM changed IP) [analytics]
07:03 <joal> Deploying refinery with scap (after yesterday's failure) [analytics]
2017-06-28 §
18:17 <joal> Deploying refinery with scap [analytics]
16:25 <joal> Building / Deploying refinery-source from jenkins to archiva (v0.0.480 [analytics]
15:42 <elukey> analytics1030 back to the worker nodes after maintenance [analytics]
2017-06-27 §
16:26 <milimetric> quarry Rebooted all the boxes in an attempt to fix performance problems [analytics]
10:05 <elukey> added https://wiki.apache.org/commons/VfsProblems to stat1004 [analytics]
07:14 <joal> Rerun wikidata-articleplaceholder_metrics-wf-2017-6-26 [analytics]
2017-06-24 §
10:31 <elukey> re-run webrequest-load-coord-misc's failed job in hue [analytics]
2017-06-23 §
07:32 <elukey> uploaded new pageview whitelist following https://wikitech.wikimedia.org/wiki/Analytics/Team/Oncall#Find_and_fix_pageview_whitelist_exceptions for kbp.wikipedia [analytics]
2017-06-21 §
20:23 <joal> Disable puppet agent and restart kafka with 48h retention in deployment-kafka01 [analytics]
13:59 <elukey> eventlogging restarted after reboot [analytics]
13:54 <elukey> stop eventlogging and reboot eventlog1001 [analytics]
13:15 <elukey> reboot analytics1003 for kernel update [analytics]
11:08 <elukey> stop camus on an1003 [analytics]
2017-06-20 §
19:24 <ottomata> beginning to consume select eventbus events using eventlogging mysql consumer and inserting into eventlogging analytics mysql db [analytics]
18:01 <joal> Rerun webrequest-load-wf-text-2017-6-20-12 after oozie failure [analytics]
16:23 <joal> Restarted tranquility for banners and netflow on druid1003 [analytics]
16:18 <joal> Rererun pageview-druid-hourly-wf-2017-6-20-14 (failed due to druid reboots) [analytics]
16:04 <elukey> re-run pageview-druid-hourly-wf-2017-6-20-14 (failed due to druid reboots) [analytics]
14:46 <elukey> re-run failed webrequest-load-text/upload jobs due to reboots [analytics]
13:29 <elukey> restart webrequest-load-coord-text and webrequest-load-coord-upload failed jobs due to reboots [analytics]
13:14 <elukey> re-run wikidata-wdqs_extract-wf-2017-6-20-11 (failed for connection issues, likely due to reboots) [analytics]
11:54 <joal> Deleting old unique_devices data (renamed to unique_devices_per_domain) [analytics]
10:27 <elukey> reboot kafka1012, analytics1028, aqs1004 for kernel upgrades (canary hosts) [analytics]
08:51 <elukey> manually added the user 'hdfs' to the 'hive' group to be able to run refinery-drop-webrequest-partitions [analytics]
08:49 <elukey> manually running /srv/deployment/analytics/refinery/bin/refinery-drop-webrequest-partitions on an1003 to free hdfs space [analytics]
2017-06-19 §
12:10 <elukey> disable BBU auto learn on all the hadoop workers [analytics]
2017-06-13 §
10:10 <elukey> merged big zookeeper refactoring https://gerrit.wikimedia.org/r/#/c/354449 - Druid's Hadoop client config now correctly points to conf1* and not druid1* [analytics]
2017-06-12 §
17:21 <joal> Last deploy of the day for uniques patch [analytics]
13:26 <joal> redeploying refinery after bug patch [analytics]
11:32 <joal> Change production last_access_uniques dataset to unique_devices/per_domain [analytics]
11:11 <joal> Deploy refinery onto HDFS [analytics]
11:03 <joal> Regular weekly deploy of refinery (mostly unique_devices patches) [analytics]
10:54 <joal> Refinery-source deployed to archiva [analytics]
2017-06-08 §
16:41 <nuria_> deploying refinery to cluster [analytics]
13:44 <elukey> AQS cluster in beta wiped and re-bootstrapped due to T167222 [analytics]
12:54 <elukey> run megacli -LDSetProp ADRA -LALL -aALL on analytics1047 to set ReadAheadAdaptive on analytics[1042-1046,1048-1057].eqiad.wmnet - T166140 [analytics]
12:16 <elukey> run megacli -LDSetProp ADRA -LALL -aALL on analytics1047 to set ReadAheadAdaptive - T166140 [analytics]
10:35 <elukey> executed megacli -LDSetProp NoCachedBadBBU -LALL -aALL on analytics1049/45 [analytics]
10:28 <elukey> executed megacli -LDSetProp NoCachedBadBBU -LALL -aALL on analytics1032 as test - T166140 [analytics]
07:25 <elukey> kill maps webrequest load coordinator as temporary measure to avoid oozie spamming [analytics]
07:21 <elukey> suspended cache maps as temporary measure to avoid oozie spamming [analytics]
2017-06-07 §
06:50 <elukey> restarted mediacounts-archive-wf-2017-06-06 in Hue (Java OOMs) [analytics]
2017-06-06 §
15:44 <ottomata> restarting eventlogging mysql consumer to allow is_mediawiki events through is_not_bot filter [analytics]