|
2021-02-09
|
| 13:39 |
<elukey> |
restart hdfs-datanode on analytics10[65,69] - failed to bootstrap due to issues reading datanode dirs |
[analytics] |
| 13:38 |
<elukey> |
restart hdfs-datanode on an-worker1080 (test canary - not showing up in block report) |
[analytics] |
| 10:04 |
<elukey> |
stop mysql replication an-coord1001 -> an-coord1002, an-coord1001 -> db1108 |
[analytics] |
| 08:29 |
<elukey> |
leave hdfs safemode to let distcp do its job |
[analytics] |
| 08:25 |
<elukey> |
set hdfs safemode on for the Analytics cluster |
[analytics] |
| 08:19 |
<elukey> |
umount /mnt/hdfs from all nodes using it |
[analytics] |
| 08:16 |
<joal> |
Kill flink yarn app |
[analytics] |
| 08:08 |
<elukey> |
stop jupyterhub on stat100x |
[analytics] |
| 08:07 |
<elukey> |
stop hive on an-coord100[1,2] - prep step for bigtop upgrade |
[analytics] |
| 08:05 |
<elukey> |
stop oozie an-coord1001 - prep step for bigtop upgrade |
[analytics] |
| 08:03 |
<elukey> |
stop presto-server on an-presto100x and an-coord1001 - prep step for bigtop upgrade |
[analytics] |
| 07:28 |
<elukey> |
roll out new apt bigtop changes across all hadoop-related nodes |
[analytics] |
| 07:19 |
<joal> |
Killing yarn users' applications |
[analytics] |
| 07:12 |
<elukey> |
stop airflow on an-airflow1001 (prep step for bigtop) |
[analytics] |
| 07:09 |
<elukey> |
stop namenode on an-worker1124 (backup cluster), create two new partitions for backup and namenode, restart namenode |
[analytics] |
| 06:14 |
<elukey> |
disable timers on labstore nodes (prep step for bigtop) |
[analytics] |
| 06:11 |
<elukey> |
disable systemd timers on an-launcher1002 (prep step for bigtop) |
[analytics] |
|
2021-02-03
|
| 21:37 |
<razzi> |
rebalance kafka partitions for eventlogging_MobileWikiAppLinkPreview |
[analytics] |
| 20:04 |
<razzi> |
rebalance kafka partitions for eqiad.mediawiki.job.RecordLintJob |
[analytics] |
| 20:03 |
<razzi> |
rebalance kafka partitions for codfw.mediawiki.job.RecordLintJob |
[analytics] |
| 18:28 |
<razzi> |
rebalance kafka partitions for eqiad.mediawiki.job.refreshLinks |
[analytics] |
| 18:28 |
<razzi> |
rebalance kafka partitions for codfw.mediawiki.job.refreshLinks |
[analytics] |
| 17:52 |
<razzi> |
rebalance kafka partitions for eqiad.wdqs-internal.sparql-query |
[analytics] |
| 17:50 |
<razzi> |
rebalance kafka partitions for codfw.wdqs-internal.sparql-query |
[analytics] |
| 14:48 |
<elukey> |
sudo -u hdfs kerberos-run-command hdfs hdfs dfs -chmod -R o+rx /wmf/data/wmf/mediawiki/history_reduced |
[analytics] |
| 14:45 |
<elukey> |
sudo -u hdfs kerberos-run-command hdfs hdfs dfs -chmod o+rx /wmf/data/wmf/mediawiki |
[analytics] |
| 14:40 |
<elukey> |
kill + restart webrequest-druid-{hourly,daily} to pick up new changes after refinery deployment |
[analytics] |
| 14:30 |
<elukey> |
kill + relaunch webrequest_load to pick up new changes after refinery deployment |
[analytics] |
| 14:28 |
<elukey> |
relaunch edit-hourly-druid-coord 02-2021 after chmods |
[analytics] |
| 14:25 |
<elukey> |
sudo -u hdfs kerberos-run-command hdfs hdfs dfs -chmod -R o+rx /wmf/data/wmf/edit |
[analytics] |
| 14:24 |
<elukey> |
sudo -u hdfs kerberos-run-command hdfs hdfs dfs -chmod o+rx /wmf/data/wmf |
[analytics] |
| 10:57 |
<elukey> |
deploy refinery to hdfs |
[analytics] |