2023-06-22
|
15:18 |
<btullis> |
cleared status for aqs_hourly.wait_for_webrequest run 13:00 and the downstream task on an-test-client1001. |
[analytics] |
15:07 |
<btullis> |
clearing task for refine_webrequest_hourly_test_text hour 13:00 |
[analytics] |
14:36 |
<btullis> |
restarted airflow-webserver and airflow-scheduler on an-test-client1001 with version 2.6.1. |
[analytics] |
14:11 |
<btullis> |
redeploying datahub to staging to try to get upgrade to 0.10.0 working. |
[analytics] |
14:02 |
<stevemunene> |
running sre.hadoop.roll-restart-masters to restart the Namenodes and completely remove any reference to analytics106[1-3] - T317861
[analytics] |
13:47 |
<stevemunene> |
run puppet on hadoop-masters |
[analytics] |
13:43 |
<stevemunene> |
Remove analytics106[1-3] from the HDFS topology |
[analytics] |
13:16 |
<elukey> |
move varnishkafka instances in eqiad to PKI - T337825
[analytics] |
13:14 |
<btullis> |
deploying the new eventgate-wikimedia container to eventgate-main |
[analytics] |
08:57 |
<btullis> |
cleared airflow task for `projectview_geo.move_data_to_archive` |
[analytics] |
2023-06-15
|
19:27 |
<btullis> |
restarting aqs service on A:aqs in batches of 2, 10 seconds apart |
[analytics] |
17:02 |
<joal> |
Deploying airflow (again) to fix memory issues |
[analytics] |
15:58 |
<joal> |
Rerun druid indexation for mediawiki_history_reduced |
[analytics] |
15:56 |
<joal> |
Deploy airflow to fix druid loading jobs using snapshot |
[analytics] |
15:53 |
<milimetric> |
refinery-source 0.2.17 deployed, refinery updated and synced to hdfs |
[analytics] |
12:47 |
<stevemunene> |
running sre.hadoop.roll-restart-masters to completely remove any reference to analytics1058-1060 for T317861
[analytics] |
12:34 |
<joal> |
Deploy analytics-airflow to patch mediawiki_history_reduced druid loading
[analytics] |
09:05 |
<elukey> |
move varnishkafka instances in ulsfo to PKI |
[analytics] |
2023-06-13
|
19:27 |
<btullis> |
restarting the hive-server2 and hive-metastore services on an-coord1001 |
[analytics] |
19:03 |
<btullis> |
freeing up space in /srv on an-launcher1002 with `btullis@an-launcher1002:/srv/airflow-analytics/logs/scheduler$ find -maxdepth 1 -type d -mtime +15 -print0 | xargs -0 sudo rm -rf` for T339002 |
[analytics] |
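Note on the 19:03 cleanup: a minimal sketch of the same `find | xargs rm` pattern with a dry-run step added, assuming GNU findutils. The function name `prune_old_dirs` and its arguments are hypothetical and were not part of the logged command. `-mindepth 1` is an added safeguard so the starting directory itself can never match.

```shell
# Hypothetical helper (not from the log): list, or delete, top-level
# directories under DIR whose mtime is more than DAYS days old.
# Usage: prune_old_dirs DIR DAYS [delete]
prune_old_dirs() {
    dir=$1
    days=$2
    mode=${3:-dry-run}
    if [ "$mode" = "delete" ]; then
        # -print0 / -0 keep names with spaces or newlines intact;
        # -r (GNU xargs) skips running rm when nothing matches.
        find "$dir" -mindepth 1 -maxdepth 1 -type d -mtime +"$days" -print0 \
            | xargs -0 -r rm -rf --
    else
        # Dry run: only print what would be removed.
        find "$dir" -mindepth 1 -maxdepth 1 -type d -mtime +"$days" -print
    fi
}
```

Running it first without the third argument prints the candidate directories; adding `delete` removes them. Prefix the call with whatever privilege escalation the host requires.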
16:41 |
<ottomata> |
deploying refinery for weekly train |
[analytics] |
15:45 |
<SandraEbele> |
Deployed refinery-source using jenkins |
[analytics] |
15:19 |
<ottomata> |
drop event.mediawiki_page_outlink_topic_prediction_change table and data - T337395 |
[analytics] |
15:13 |
<SandraEbele> |
deploying refinery source |
[analytics] |
15:05 |
<ottomata> |
dropping hive table event.mediawiki_page_change_v1 to pick up backwards incompatible schema change - T337395 |
[analytics] |
15:03 |
<btullis> |
failing over the analytics-hive cname to an-coord1002 |
[analytics] |
13:45 |
<elukey> |
fixed broken graphs in the varnishkafka's dashboard |
[analytics] |
13:37 |
<btullis> |
restarting hive-server2 and hive-metastore on an-coord1002 prior to failover. |
[analytics] |