2020-10-08
|
18:08 |
<razzi> |
restart oozie server on an-coord1001 for reverting T262660 |
[analytics] |
17:42 |
<razzi> |
restart oozie server on an-coord1001 for T262660 |
[analytics] |
17:19 |
<elukey> |
removed /var/lib/puppet/clientbucket/6/f/a/c/d/9/8/d/6facd98d16886787ab9656eef07d631e/content on an-launcher1002 (29G, last modified Aug 4th) |
[analytics] |
15:45 |
<elukey> |
executed git pull on /srv/jupyterhub/deploy and re-ran create_virtualenv.sh on stat1007 (pyspark kernels may not run correctly due to a missing feature) |
[analytics] |
15:43 |
<elukey> |
executed git pull on /srv/jupyterhub/deploy and re-ran create_virtualenv.sh on stat1006 (pyspark kernels were not running due to a missing feature) |
[analytics] |
13:13 |
<elukey> |
roll restart of druid overlords and coordinators on druid public to pick up new TLS settings |
[analytics] |
12:51 |
<elukey> |
roll restart of druid overlords and coordinators on druid analytics to pick up new TLS settings |
[analytics] |
10:35 |
<elukey> |
force the re-creation of default jupyterhub venvs on stat1006 after reimage |
[analytics] |
08:47 |
<klausman> |
Starting re-image of stat1006 to Buster |
[analytics] |
07:14 |
<elukey> |
decom analytics1043 from the Hadoop cluster |
[analytics] |
06:46 |
<elukey> |
move the hdfs balancer from an-coord1001 to an-launcher1002 |
[analytics] |
2020-10-05
|
19:14 |
<mforns> |
restarted oozie coord unique_devices-per_domain-monthly after deployment |
[analytics] |
19:05 |
<mforns> |
finished deploying refinery to unblock deletion of raw mediawiki_job and raw netflow data |
[analytics] |
18:45 |
<mforns> |
deploying refinery to unblock deletion of raw mediawiki_job and raw netflow data |
[analytics] |
18:20 |
<elukey> |
manual creation of /opt/rocm -> /opt/rocm-3.3.0 on stat1008 to avoid failures in finding the lib dir |
[analytics] |
17:11 |
<elukey> |
bootstrap an-worker[1115-1117] as hadoop workers |
[analytics] |
14:52 |
<milimetric> |
disabling drop-el-unsanitized-events timer until https://gerrit.wikimedia.org/r/c/analytics/refinery/+/631804/ is deployed |
[analytics] |
14:41 |
<elukey> |
shutdown stat1005 and stat1008 for ram expansion (1005 again) |
[analytics] |
14:25 |
<elukey> |
shutdown an-master1001 for ram expansion |
[analytics] |
13:54 |
<elukey> |
shutdown stat1005 for ram upgrade |
[analytics] |
13:31 |
<elukey> |
shutdown an-master1002 for ram expansion (64 -> 128G) |
[analytics] |
12:35 |
<elukey> |
execute "PURGE BINARY LOGS BEFORE '2020-09-28 00:00:00';" on an-coord1001's mysql to free space - T264081 |
[analytics] |
10:31 |
<elukey> |
bootstrap an-worker111[0,2] as hadoop workers |
[analytics] |
06:33 |
<elukey> |
reboot stat1005 to resolve weird GPU state (scheduled last week) |
[analytics] |