2023-06-22 §
14:36 <btullis> restarted airflow-webserver and airflow-scheduler on an-test-client1001 with version 2.6.1. [analytics]
14:11 <btullis> redeploying datahub to staging to try to get upgrade to 0.10.0 working. [analytics]
14:02 <stevemunene> running sre.hadoop.roll-restart-masters to restart the Namenodes and completely remove any reference to analytics106[1-3] T317861 [analytics]
13:47 <stevemunene> run puppet on hadoop-masters [analytics]
13:43 <stevemunene> Remove analytics106[1-3] from the HDFS topology [analytics]
13:16 <elukey> move varnishkafka instances in eqiad to PKI - T337825 [analytics]
13:14 <btullis> deploying the new eventgate-wikimedia container to eventgate-main [analytics]
08:57 <btullis> cleared airflow task for `projectview_geo.move_data_to_archive` [analytics]
2023-06-21 §
16:46 <joal> Rerun cassandra-load tasks for pageview-per-project daily and hourly for 2023-06-20 hour 4 [analytics]
16:46 <joal> rerun browser_general_daily for 2023-06-20 [analytics]
16:40 <joal> Rerun projectview-hourly DAG for hour: 2023-06-20T04:00 [analytics]
15:44 <mforns> deployed airflow analytics to remove deprecated dag for mobile_apps [analytics]
12:51 <elukey> move varnishkafka instances in codfw to PKI - T337825 [analytics]
2023-06-20 §
21:28 <aqu> Manual edit of `/srv/airflow-analytics/connections.yml` following changes in https://gerrit.wikimedia.org/r/c/operations/puppet/+/931690 to avoid alerts Airflow analytics aqs_hourly [analytics]
20:59 <aqu> Manually marked as success `wikidata_dump_to_hive_weekly` iteration `2023-02-13` in Airflow analytics [analytics]
19:55 <btullis> clearing the first failed emit_lineage_to_datahub_for_hive_wmf_aqs_hourly task https://usercontent.irccloud-cdn.com/file/vW6YdEof/image.png [analytics]
19:51 <btullis> merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/931683 to fix the aqs_hourly datahub lineage failure [analytics]
18:13 <mforns> deployed airflow analytics to fix webrequest job [analytics]
17:52 <joal> deploy Refinery to unbreak webrequest [analytics]
2023-06-19 §
14:04 <elukey> move varnishkafka instances in eqsin to PKI - T337825 [analytics]
11:28 <stevemunene> decommission host analytics1060.eqiad.wmnet -t T338409 [analytics]
10:47 <stevemunene> decommission host analytics1059.eqiad.wmnet -t T338408 [analytics]
09:13 <stevemunene> Decommissioning analytics1058.eqiad.wmnet -t T338227 [analytics]
2023-06-16 §
12:18 <btullis> restarting the remaining monitor_refine_event_sanitized_analytics_immediate.service monitor_refine_event_sanitized_main_delayed.service monitor_refine_event_sanitized_main_immediate.service services on an-launcher1002 [analytics]
12:11 <btullis> restarting refine_event_sanitized_main_delayed.service on an-launcher1002 [analytics]
12:03 <btullis> restarting refine_event_sanitized_analytics_delayed.service on an-launcher1002 [analytics]
11:14 <btullis> rebooting an-test-worker1002 for T335358 and stuck gobblin [analytics]
10:13 <joal> rerun druid_load_edit_hourly to reload full snapshot [analytics]
2023-06-15 §
19:27 <btullis> restarting aqs service on A:aqs in batches of 2, 10 seconds apart [analytics]
17:02 <joal> Deploying airflow (again) to fix memory issues [analytics]
15:58 <joal> Rerun druid indexation for mediawiki_history_reduced [analytics]
15:56 <joal> Deploy airflow to fix druid loading jobs using snapshot [analytics]
15:53 <milimetric> refinery-source 0.2.17 deployed, refinery updated and synced to hdfs [analytics]
12:47 <stevemunene> running sre.hadoop.roll-restart-masters to completely remove any reference to analytics1058-1060 for T317861 [analytics]
12:34 <joal> Deploy analytics-airflow to patch mediawiki_history_reduced druid loading [analytics]
09:05 <elukey> move varnishkafka instances in ulsfo to PKI [analytics]
2023-06-14 §
20:18 <milimetric> reran mediawiki_history_reduced druid load task after deploying Joseph's fix [analytics]
13:15 <stevemunene> running puppet on an-master100[1-2] to remove analytics58_60 from the HDFS topology T317861 [analytics]
2023-06-13 §
19:27 <btullis> restarting the hive-server2 and hive-metastore services on an-coord1001 [analytics]
19:03 <btullis> freeing up space in /srv on an-launcher1002 with `btullis@an-launcher1002:/srv/airflow-analytics/logs/scheduler$ find -maxdepth 1 -type d -mtime +15 -print0 | xargs -0 sudo rm -rf` for T339002 [analytics]
16:41 <ottomata> deploying refinery for weekly train [analytics]
15:45 <SandraEbele> Deployed refinery-source using jenkins [analytics]
15:19 <ottomata> drop event.mediawiki_page_outlink_topic_prediction_change table and data - T337395 [analytics]
15:13 <SandraEbele> deploying refinery source [analytics]
15:05 <ottomata> dropping hive table event.mediawiki_page_change_v1 to pick up backwards incompatible schema change - T337395 [analytics]
15:03 <btullis> failing over the analytics-hive cname to an-coord1002 [analytics]
13:45 <elukey> fixed broken graphs in the varnishkafka's dashboard [analytics]
13:37 <btullis> restarting hive-server2 and hive-metastore on an-coord1002 prior to failover. [analytics]
13:00 <btullis> rolled out conda-analytics 0.0.18 to analytics-airflow and hadoop-coordinator [analytics]
12:25 <btullis> beginning rollout of conda-analytics 0.0.18 to hadoop-workers [analytics]