2021-01-07
|
18:22 |
<elukey> |
chown -R analytics:analytics-privatedata-users /tmp/analytics (tmp dir for data quality stats tables) |
[analytics] |
18:21 |
<elukey> |
"sudo -u hdfs kerberos-run-command hdfs hdfs dfs -chown -R analytics:analytics-privatedata-users /wmf/data/wmf/data_quality_stats" |
[analytics] |
18:10 |
<elukey> |
temporarily disable hdfs-cleaner.timer to prevent /tmp/DataFrameToDruid from being dropped |
[analytics] |
18:08 |
<elukey> |
chown -R analytics:druid /tmp/DataFrameToDruid (was: analytics:hdfs) on hdfs to temporarily unblock Hive2Druid jobs |
[analytics] |
16:31 |
<elukey> |
remove /etc/mysql/conf.d/research-client.cnf from stat100x nodes |
[analytics] |
15:40 |
<elukey> |
deprecate the 'researchers' posix group for good |
[analytics] |
11:24 |
<elukey> |
execute "sudo -u hdfs kerberos-run-command hdfs hdfs dfs -chmod -R o-rwx /wmf/data/event_sanitized" to fix some file permissions as well |
[analytics] |
10:36 |
<elukey> |
execute "sudo -u hdfs kerberos-run-command hdfs hdfs dfs -chmod -R o-rwx /wmf/data/event" on an-master1001 to fix some file permissions (an-launcher executed timers during the past hours without the new umask) - T270629 |
[analytics] |
09:37 |
<elukey> |
forced re-run of monitor_refine_event_failure_flags.service on an-launcher1002 to clear alerts |
[analytics] |
08:26 |
<joal> |
Rerunning 4 failed refine jobs (mediawiki_cirrussearch_request, day=6/hour=20|21, day=7/hour=0|2) |
[analytics] |
08:14 |
<elukey> |
re-enable puppet on an-launcher1002 to apply new refine memory settings |
[analytics] |
07:59 |
<elukey> |
re-enabling all oozie jobs previously suspended |
[analytics] |
07:54 |
<elukey> |
restart oozie on an-coord1001 |
[analytics] |
2020-12-22
|
19:35 |
<elukey> |
restart hive daemons on an-coord1001 to pick up new settings |
[analytics] |
18:13 |
<elukey> |
failover analytics-hive.eqiad.wmnet to an-coord1002 (to allow maintenance on an-coord1001) |
[analytics] |
18:07 |
<elukey> |
restart hive server on an-coord1002 (current standby - no traffic) to pick up the new config (use the local metastore instead of the one analytics-hive points to) |
[analytics] |
17:00 |
<mforns> |
Deployed refinery as part of weekly train (v0.0.142) |
[analytics] |
16:42 |
<mforns> |
Deployed refinery-source v0.0.142 |
[analytics] |
16:30 |
<mforns> |
Deployed refinery-source v0.0.142 |
[analytics] |
15:00 |
<razzi> |
stopping superset server on analytics-tool1004 |
[analytics] |
10:36 |
<elukey> |
restart presto coordinator to pick up analytics-hive settings |
[analytics] |
10:25 |
<elukey> |
failover analytics-hive.eqiad.wmnet to an-coord1001 |
[analytics] |
09:56 |
<elukey> |
restart hive daemons on an-coord1001 to pick up analytics-hive settings |
[analytics] |
07:27 |
<elukey> |
reboot stat100[4-8] (analytics hadoop clients) for kernel upgrades |
[analytics] |
07:23 |
<elukey> |
move all analytics clients (spark refine, stat100x, hive-site.xml on hdfs, etc.) to analytics-hive.eqiad.wmnet |
[analytics] |