2024-01-15
17:01 | <btullis> | roll-restarting analytics druid cluster | [analytics]
16:55 | <joal> | Clearing analytics failed airflow tasks after fix | [analytics]
16:47 | <btullis> | restarted the hive-server2 and hive-metastore services on an-coord100[3-4] which had been accidentally omitted earlier for T332573 | [analytics]
12:00 | <btullis> | removing all downtime for hadoop-all for T332573 | [analytics]
11:57 | <btullis> | un-pausing all previously paused DAGs on all airflow instances for T332573 | [analytics]
11:55 | <btullis> | re-enabling gobblin jobs | [analytics]
11:38 | <brouberol> | redeploying the Spark History Server to pick up the new HDFS namenodes - T332573 | [analytics]
11:29 | <btullis> | puppet runs cleanly on an-master1003 and it is the active namenode - running puppet on an-master1004. | [analytics]
11:20 | <btullis> | running puppet on an-master1003 to set it to active for T332573 | [analytics]
11:16 | <btullis> | running puppet on journal nodes first for T332573 | [analytics]
11:03 | <btullis> | stopping all hadoop services | [analytics]
10:59 | <btullis> | disabling puppet on all hadoop nodes | [analytics]
10:54 | <btullis> | putting HDFS into safe mode for T332573 | [analytics]
2024-01-09
21:28 | <aqu> | airflow-dags/analytics(_test) are both deployed | [analytics]
21:18 | <aqu> | analytics/refinery not deployed fully on test cluster. Ticket for the bug here: https://phabricator.wikimedia.org/T354703 | [analytics]
21:07 | <aqu> | Deployed refinery using scap, then deployed onto hdfs | [analytics]
20:48 | <aqu> | about to deploy analytics/refinery - weekly train | [analytics]
12:57 | <stevemunene> | roll restart analytics hadoop masters to pick up new net_topology script and new JRE T254480 | [analytics]
11:48 | <stevemunene> | roll restarting hadoop test masters to pick up new net_topology script and new JRE | [analytics]
11:36 | <stevemunene> | disable puppet on hadoop masters, both test and production, to test/implement new net_topology script | [analytics]
10:39 | <btullis> | roll-restarting kafka-jumbo to pick up new JRE | [analytics]
2023-12-20
22:45 | <mforns> | re-ran Airflow DAG druid_load_unique_devices_per_project_family_monthly for 2023-11 | [analytics]
22:40 | <mforns> | re-ran Airflow DAG druid_load_unique_devices_per_project_family_daily_aggregated_monthly for 2023-11 | [analytics]
22:35 | <mforns> | re-ran Airflow DAG druid_load_unique_devices_per_domain_monthly for 2023-11 | [analytics]
22:28 | <mforns> | re-ran Airflow DAG druid_load_unique_devices_per_domain_daily_aggregated_monthly for 2023-11 | [analytics]
21:34 | <mforns> | re-ran Airflow DAG cassandra_load_unique_devices_monthly for 2023-11 | [analytics]
20:56 | <mforns> | re-ran Airflow DAG cassandra_load_unique_devices_daily for 2023-11-08 | [analytics]
20:27 | <mforns> | re-ran Airflow DAG unique_devices_per_project_family_daily for 2023-11-08 | [analytics]
20:26 | <mforns> | re-ran Airflow DAG unique_devices_per_domain_daily for 2023-11-08 | [analytics]
18:43 | <mforns> | re-ran Airflow DAG unique_devices_per_domain_monthly for 2023-11 | [analytics]