2017-06-20
|
16:04 |
<elukey> |
re-run pageview-druid-hourly-wf-2017-6-20-14 (failed due to druid reboots) |
[analytics] |
14:46 |
<elukey> |
re-run webrequest-load-text/upload jobs that failed due to reboots |
[analytics] |
13:29 |
<elukey> |
restart webrequest-load-coord-text and webrequest-load-coord-upload jobs that failed due to reboots |
[analytics] |
13:14 |
<elukey> |
re-run wikidata-wdqs_extract-wf-2017-6-20-11 (failed with connection issues, likely due to reboots) |
[analytics] |
11:54 |
<joal> |
Deleting old unique_devices data (renamed to unique_devices_per_domain) |
[analytics] |
10:27 |
<elukey> |
reboot kafka1012, analytics1028, aqs1004 for kernel upgrades (canary hosts) |
[analytics] |
08:51 |
<elukey> |
manually added the user 'hdfs' to the 'hive' group to be able to run refinery-drop-webrequest-partitions |
[analytics] |
08:49 |
<elukey> |
manually running /srv/deployment/analytics/refinery/bin/refinery-drop-webrequest-partitions on an1003 to free hdfs space |
[analytics] |
2017-06-08
|
16:41 |
<nuria_> |
deploying refinery to cluster |
[analytics] |
13:44 |
<elukey> |
AQS cluster in beta wiped and re-bootstrapped due to T167222 |
[analytics] |
12:54 |
<elukey> |
run megacli -LDSetProp ADRA -LALL -aALL on analytics1047 to set ReadAheadAdaptive on analytics[1042-1046,1048-1057].eqiad.wmnet - T166140 |
[analytics] |
12:16 |
<elukey> |
run megacli -LDSetProp ADRA -LALL -aALL on analytics1047 to set ReadAheadAdaptive - T166140 |
[analytics] |
10:35 |
<elukey> |
executed megacli -LDSetProp NoCachedBadBBU -LALL -aALL on analytics1049 and analytics1045 |
[analytics] |
10:28 |
<elukey> |
executed megacli -LDSetProp NoCachedBadBBU -LALL -aALL on analytics1032 as test - T166140 |
[analytics] |
07:25 |
<elukey> |
kill maps webrequest load coordinator as a temporary measure to avoid oozie spamming |
[analytics] |
07:21 |
<elukey> |
suspended cache maps as a temporary measure to avoid oozie spamming |
[analytics] |