2023-11-07
§
|
20:48 |
<xcollazo> |
Ran 'kerberos-run-command hdfs hdfs dfs -chmod -R g+w /wmf/data/wmf_dumps/wikitext_raw_rc2' to ease experimentation on this release candidate table. |
[analytics] |
15:52 |
<btullis> |
restart airflow-scheduler and airflow-webserver services on an-test-client1002 |
[analytics] |
15:50 |
<btullis> |
restart mariadb service on an-test-coord1001 |
[analytics] |
15:50 |
<btullis> |
restart mariadb service on an-test-coord100 |
[analytics] |
15:49 |
<btullis> |
restart presto-server service on an-test-coord1001 and an-test-presto1001 to pick up new puppet 7 CA settings |
[analytics] |
15:48 |
<btullis> |
restart hive-server2 and hive-metastore services on an-test-coord1001 to pick up new puppet 7 CA settings. |
[analytics] |
15:35 |
<btullis> |
roll-restarting hadoop workers in test, to test new puppet 7 CA settings. |
[analytics] |
14:52 |
<btullis> |
roll-restarting hadoop masters on the test cluster, after upgrading to puppet 7 |
[analytics] |
12:05 |
<btullis> |
deploying datahub to prod for the pki certificates. |
[analytics] |
11:36 |
<btullis> |
deploying datahub to staging to start using pki certificates - https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/969345/ |
[analytics] |
10:40 |
<btullis> |
re-running the kafka_jumbo_ingestion in analytics airflow |
[analytics] |
2023-10-19
§
|
19:58 |
<xcollazo> |
ran "sudo -u hdfs hdfs dfs -cp /user/xcollazo/artifacts/spark-3.3.2-assembly.zip /user/spark/share/lib/" and "sudo -u hdfs hdfs dfs -chmod o+r /user/spark/share/lib/spark-3.3.2-assembly.zip" to make the Spark 3.3.2 assembly available for other folks. |
[analytics] |
19:54 |
<xcollazo> |
ran "sudo -u hdfs hdfs dfs -rm /user/spark/share/lib/spark-3.1.2-assembly.jar.backup" to remove old spark assembly backup from May 25 2023. |
[analytics] |
19:52 |
<xcollazo> |
ran "sudo -u hdfs hdfs dfs -rm /user/spark/share/lib/spark-3.1.2-assembly.jar.bak" to remove old spark assembly backup from Jun 13 2023. |
[analytics] |
15:22 |
<brouberol> |
The kafka service has been stopped on kafka-jumbo100[1-6] - T336044 |
[analytics] |
15:04 |
<brouberol> |
sudo cumin --batch-size 1 --batch-sleep 60 'kafka-jumbo100[1-6].eqiad.wmnet' 'sudo systemctl stop kafka.service' - T336044 |
[analytics] |
15:02 |
<brouberol> |
disabling puppet on kafka-jumbo100[1-6] to make sure kafka isn't restarted - T336044 |
[analytics] |
12:13 |
<brouberol> |
disabling puppet on kafka-jumbo nodes so we can merge https://gerrit.wikimedia.org/r/c/operations/puppet/+/966497 |
[analytics] |
09:42 |
<btullis> |
re-running airflow jobs for missing webrequest data on hadoop-test |
[analytics] |