2023-06-06 §
11:13 <btullis> restart airflow-scheduler service on an-test-client1001 for analytics_test instance [analytics]
11:12 <btullis> restart airflow-scheduler service on an-airflow1006 for product_analytics instance [analytics]
11:12 <btullis> restart airflow-scheduler service on an-airflow1005 for search instance [analytics]
11:08 <btullis> restart airflow-scheduler service on an-airflow1002 for research instance [analytics]
11:07 <btullis> (correction) that should have read an-airflow1004 for platform_eng instance [analytics]
11:06 <btullis> restart airflow-scheduler service on an-launcher1004 for postgresql restart [analytics]
11:05 <btullis> restart airflow-scheduler service on an-launcher1002 for postgresql restart [analytics]
05:41 <stevemunene> hadoop-yarn-resourcemanager restart for T317861 [analytics]
2023-06-05 §
18:20 <btullis> restarted haproxy service on dbproxy1018 for T338172 [analytics]
16:21 <btullis> depooling service=wikireplicas-a,name=dbproxy1018.eqiad.wmnet [analytics]
16:20 <btullis> pooling service=wikireplicas-a,name=dbproxy1019.eqiad.wmnet to allow us to depool the analytics wikireplica servers [analytics]
15:19 <mforns> deployed airflow analytics to fix edit_hourly DAG [analytics]
11:43 <btullis> sudo -u hdfs /usr/bin/hdfs haadmin -failover an-master1002-eqiad-wmnet an-master1001-eqiad-wmnet [analytics]
09:52 <btullis> powered up an-worker1125 [analytics]
2023-06-01 §
19:09 <mforns> deploy airflow analytics to bump up cassandra load monthly for top articles [analytics]
17:50 <mforns> deployed airflow analytics to unbreak monthly cassandra loading DAGs [analytics]
14:13 <mforns> deployed airflow analytics to fix anomaly detection ooms [analytics]
2023-05-31 §
20:41 <mforns> finished refinery deployment [analytics]
20:20 <mforns> starting refinery deployment [analytics]
07:29 <elukey> set "loadByPeriod(P8D+future), dropForever" for webrequest_sampled_live in druid-analytics - T337460 [analytics]
2023-05-30 §
15:52 <xcollazo> created HDFS folder `/wmf/data/wmf_traffic` (T335305 and T337562) [analytics]
2023-05-26 §
06:42 <elukey> `apt-get clean` on stat1008 to clean up some space in the root partition [analytics]
06:36 <elukey> `truncate /var/log/kerberos/krb5kdc.log -s 10g` on krb1001 to keep the root partition from filling up [analytics]
2023-05-25 §
13:42 <joal> rerun webrequest-refine job for 2023-05-20T00 - we're missing data [analytics]
12:31 <elukey> set "loadByPeriod(P3D+future), dropForever" for webrequest_sampled_live in druid-analytics - T337460 [analytics]
08:37 <joal> rerun druid_load_webrequest_sampled_128_daily 2023-05-20 to reload missing hour (T337088) [analytics]
08:37 <joal> rerun druid_load_webrequest_sampled_128_daily [analytics]
2023-05-24 §
16:19 <aqu> Deployed refinery using scap, then deployed onto hdfs [analytics]
16:05 <elukey> move kafka mirror on kafka main brokers to PKI - T337248 [analytics]
15:56 <elukey> move kafka mirror on kafka jumbo brokers to PKI - T337248 [analytics]
15:48 <elukey> run `kafka acls --add --allow-principal User:CN=kafka_mirror_maker --producer --topic '*'` on kafka test - T337248 [analytics]
15:18 <aqu> analytics-refinery, about to deploy [analytics]
12:21 <joal> rerun failed druid_load_pageviews_hourly_aggregated_daily 2023-05-17 [analytics]
12:21 <joal> rerun failed druid_load_pageviews_hourly_aggregated_daily [analytics]
2023-05-23 §
10:01 <stevemunene> reboot an-test-master1001.eqiad.wmnet December 2022 Buster reboots T325132 [analytics]
09:33 <stevemunene> reboot an-test-coord1001.eqiad.wmnet December 2022 Buster reboots T325132 [analytics]
08:22 <btullis> installing conda-analytics-0.0.17.dev_amd64.deb to an-test-worker1001 for T332765 [analytics]
2023-05-22 §
22:12 <btullis> installing conda-analytics-0.0.17.dev_amd64.deb to an-test-client1001 for T332765 [analytics]
2023-05-19 §
13:23 <btullis> restart monitor_refine_eventlogging_analytics.service on an-launcher1002 [analytics]
2023-05-18 §
16:54 <btullis> systemctl reset-failed services on stat1008 [analytics]
16:53 <btullis> installing conda-analytics 0.0.15 to an-test-worker1001 for T332765 [analytics]
15:49 <mforns> deployed airflow analytics_test [analytics]
14:22 <btullis> systemctl reset-failed user manager services on stat1004 [analytics]
12:46 <elukey> clean up old jupyterhub.service references (crash looping) on stat* nodes that had it [analytics]
10:31 <btullis> cold booting an-worker1110 to troubleshoot drive failure T336929 [analytics]
2023-05-17 §
17:58 <ottomata> Deployed refinery-source using jenkins [analytics]
13:22 <btullis> roll-rebooting dse-k8s-workers via cookbook [analytics]
13:16 <btullis> roll-rebooting an-worker1[096-101] for T335835 [analytics]
2023-05-16 §
17:59 <joal> rerun druid_load_pageviews_daily_aggregated_monthly [analytics]
17:34 <joal> Stop, delete then restart airflow druid_load_banner_activity jobs [analytics]