2020-04-02
§
|
16:12 |
<elukey> |
re-enable timers on an-coord1001 after maintenance |
[analytics] |
15:52 |
<elukey> |
restart hive server2/metastore with G1 settings |
[analytics] |
14:05 |
<elukey> |
temporarily stop timers on an-coord1001 to facilitate hive daemons restarts |
[analytics] |
13:47 |
<hashar> |
test 1 2 3 |
[analytics] |
13:30 |
<joal> |
Releasing refinery-source v0.0.121 using new jenkins-docker :) |
[analytics] |
08:23 |
<elukey> |
kill/restart netflow realtime druid indexation with a new dimension (peer_ip_src) - T246186 |
[analytics] |
2020-04-01
§
|
21:19 |
<joal> |
restart pageview-hourly-wf-2020-4-1-15 |
[analytics] |
18:24 |
<joal> |
Kill learning-features-actor-hourly as new version to come |
[analytics] |
18:23 |
<joal> |
Restart unique_devices-per_project_family-monthly-wf-2020-3 and aqs-hourly-wf-2020-4-1-15 after hive failure |
[analytics] |
18:21 |
<joal> |
restart webrequest-load-wf-upload-2020-4-1-16 and webrequest-load-wf-text-2020-4-1-16 after hive failure |
[analytics] |
18:14 |
<joal> |
Kill groceryheist job taking half the cluster |
[analytics] |
18:06 |
<ottomata> |
restarted hive-server2 |
[analytics] |
10:07 |
<jbond42> |
updating icu packages |
[analytics] |
2020-03-31
§
|
12:57 |
<jbond42> |
updating icu on presto-analytics-canary and hadoop-worker-canary |
[analytics] |
2020-03-30
§
|
07:27 |
<elukey> |
run /usr/local/bin/refine_sanitize_eventlogging_analytics_immediate --ignore_failure_flag=true --since=72 --verbose --table_whitelist_regex="ResourceTiming" refine_sanitize_eventlogging_analytics_immediate to fix _REFINE_FAILED events |
[analytics] |
07:16 |
<elukey> |
run eventlogging refine manually for schemas "EditorActivation|EditorJourney|HomepageVisit|VisualEditorFeatureUse|WikibaseTermboxInteraction|UploadWizardErrorFlowEvent|MobileWikiAppiOSReadingLists|ContentTranslationCTA|QuickSurveysResponses|MobileWikiAppiOSSessions" to fix _REFINE_FAILED events |
[analytics] |
2020-03-29
§
|
08:44 |
<elukey> |
blacklist TwoColConflictExit from Eventlogging Refine to avoid alarm spam |
[analytics] |
2020-03-28
§
|
16:54 |
<elukey> |
restart yarn nodemanager on analytics1071 - network errors in the logs |
[analytics] |
2020-03-27
§
|
08:09 |
<elukey> |
deployed new kernels for https://gerrit.wikimedia.org/r/580083 on stat1004 |
[analytics] |
2020-03-26
§
|
09:09 |
<elukey> |
re-running manually webrequest-load upload 26/03/2020T08 - kerberos failures |
[analytics] |
2020-03-25
§
|
08:14 |
<elukey> |
restart presto-server on an-coord1001 to remove jmx catalog config |
[analytics] |
2020-03-24
§
|
15:46 |
<elukey> |
restart all cron.service processes on stat/notebook (killing long lingering processes) to move the unit under user.slice |
[analytics] |
2020-03-21
§
|
14:17 |
<joal> |
Restart wikidata_item_page_link job with manual fix - review to be confirmed |
[analytics] |
14:06 |
<joal> |
Kill buggy wikidata_item_page_link job |
[analytics] |
2020-03-18
§
|
19:39 |
<fdans> |
refinery deployed |
[analytics] |
18:52 |
<fdans> |
deploying refinery |
[analytics] |
18:51 |
<fdans> |
refinery source 0.0.119 jars generated and symlinked |
[analytics] |
18:17 |
<fdans> |
beginning deploy of refinery-source 0.0.119 |
[analytics] |
2020-03-17
§
|
17:25 |
<elukey> |
deploy superset to enable Presto and Kerberos (PyHive 0.6.2) |
[analytics] |
2020-03-16
§
|
19:43 |
<joal> |
Kill-restart wikidata-articleplaceholder_metrics-coord to fix yarn queue |
[analytics] |
18:30 |
<mforns> |
Deployed refinery using scap, then deployed onto hdfs |
[analytics] |
17:05 |
<elukey> |
roll restart of hadoop namenodes to get the new GC setting (MaxGCPauseMillis 400 -> 1000) |
[analytics] |
2020-03-13
§
|
12:18 |
<joal> |
Restart cassandra-daily-wf-local_group_default_T_pageviews_per_article_flat-2020-3-12 |
[analytics] |
2020-03-12
§
|
22:53 |
<mforns> |
Deployed refinery using scap, then deployed onto hdfs |
[analytics] |
22:22 |
<mforns> |
deployed refinery-source using jenkins |
[analytics] |
11:09 |
<elukey> |
roll restart kerberos kdcs to pick up new ticket lifetime settings (10h -> 48h) |
[analytics] |
08:27 |
<elukey> |
re-running refine eventlogging with --since 12 (very conservative but just in case) |
[analytics] |
2020-03-11
§
|
14:49 |
<elukey> |
add xmldumps mountpoints on stat1004 and stat1005 |
[analytics] |
2020-03-10
§
|
15:20 |
<elukey> |
remove the analytics user keytab from stat100[4,5] |
[analytics] |
15:06 |
<elukey> |
move stat1006 to role::statistics::explorer |
[analytics] |
09:24 |
<elukey> |
removed /etc/mysql/conf.d/stats-research-client.cnf from all stat boxes (file used for RU, now on an-launcher1001) |
[analytics] |
2020-03-09
§
|
07:27 |
<elukey> |
deploy jupyterhub on notebook100[3,4] (manual venv re-creation) to allow the use of the user.slice - T247055 |
[analytics] |
07:26 |
<elukey> |
upgrade nodejs from 6->10 on stat1* and notebook1* |
[analytics] |
2020-03-08
§
|
17:58 |
<elukey> |
restart hadoop-yarn-nodemanager on an-worker1087 |
[analytics] |
2020-03-06
§
|
14:58 |
<joal> |
AQS new druid snapshot released (2020-02) |
[analytics] |
10:06 |
<elukey> |
roll restart Presto daemons for openjdk upgrades |
[analytics] |
09:45 |
<elukey> |
roll restart of cassandra on AQS to pick up new openjdk upgrades |
[analytics] |
2020-03-05
§
|
19:45 |
<elukey> |
deleted dangling 'reports' symlink on stat100[6,7] in /srv/published |
[analytics] |
19:39 |
<elukey> |
mv /srv/reportupdater to /srv/reportupdater-backup05032020 on stat100[6,7] |
[analytics] |
16:34 |
<mforns> |
restart turnilo to refresh deleted datasources |
[analytics] |