2022-04-27 §
19:37 <ottomata> restarting airflow services on all airflow instances after installing updated airflow debian package [analytics]
2022-04-26 §
19:02 <aqu> About to deploy analytics/refinery: Weekly deployment train + Artifacts to 0.1.27 [analytics]
12:02 <joal> Rerun cassandra-daily-wf-local_group_default_T_mediarequest_per_file-2022-4-23 [analytics]
2022-04-25 §
20:09 <ottomata> dropping event.ios_notification_interaction hive table and data for backwards incompatible schema change in T290920 [analytics]
11:51 <btullis> failing back hdfs active role to an-master1001 [analytics]
11:49 <btullis> restarted hadoop-yarn-resourcemanager on an-master1002 to force the active role back to an-master1001 [analytics]
11:01 <btullis> rebooting an-master1001 [analytics]
10:25 <btullis> restarting the `check_webrequest_partitions` service on an-launcher1002 [analytics]
09:39 <btullis> failover to an-master1002 successful at 3rd attempt [analytics]
09:30 <btullis> 2nd attempt to switch HDFS services to an-master1002 [analytics]
09:13 <btullis> switching HDFS services to an-master1002 [analytics]
08:53 <btullis> rebooting an-master1002 - T304938 [analytics]
2022-04-23 §
09:38 <elukey> `apt-get clean` on an-airflow1001 to free some space [analytics]
2022-04-21 §
22:26 <mforns> killed browser_general oozie job and started corresponding airflow job [analytics]
2022-04-13 §
16:40 <razzi> reboot an-launcher1002 for security updates [analytics]
2022-04-12 §
22:12 <milimetric> deployed and synced refinery-source 0.1.26 to hdfs [analytics]
2022-04-11 §
12:35 <aqu> About to deploy analytics/refinery "Migrate mediarequest hourly from Oozie to Airflow" (replace previous msg) [analytics]
12:35 <aqu> About to deploy refinery/source "Migrate mediarequest hourly from Oozie to Airflow" [analytics]
2022-04-06 §
20:53 <razzi> roll restart aqs to deploy new mediawiki history snapshot [analytics]
15:51 <mforns> deployed airflow to analytics (big refactor) [analytics]
15:23 <mforns> deployed Airflow to analytics_test (big refactor) [analytics]
09:18 <btullis> restarted eventlogging_to_druid_netflow_hourly on an-launcher1002 [analytics]
2022-04-05 §
20:41 <razzi> deploying refinery for https://gerrit.wikimedia.org/r/c/analytics/refinery/+/776269/ [analytics]
15:54 <razzi> razzi@cumin1001:~$ sudo cookbook sre.hosts.reimage --os bullseye -t T299481 dbstore1005 [analytics]
15:10 <razzi> razzi@cumin1001:~$ sudo cookbook sre.hosts.reimage --os bullseye -t T299481 dbstore1003 [analytics]
15:02 <razzi> set dbstore1003.eqiad.wmnet to downtime for upgrade T299481 [analytics]
15:01 <razzi> set dbstore1003.eqiad.wmnet to downtime for upgrade [analytics]
2022-04-01 §
09:05 <btullis> restarted varnishkafka-eventlogging.service on cp3050 T300246 [analytics]
2022-03-29 §
20:08 <joal> rerun cassandra editors_bycountry_monthly for month 2022-02 [analytics]
20:08 <mforns> restarted webrequest bundle [analytics]
19:57 <mforns> restarted mediawiki-geoeditors-public_monthly-coord [analytics]
19:56 <mforns> finished refinery deployment (regular weekly train) scap and hdfs [analytics]
19:53 <joal> Add new columns to wmf.webrequest (high entropy CH-UA) [analytics]
19:16 <joal> Drop/recreate wmf_raw.webrequest for schema change (high-entropy CH-UA) [analytics]
19:13 <mforns> starting refinery deployment (regular weekly train) [analytics]
19:11 <joal> kill webrequest-load oozie bundle for webrequest schema change [analytics]
17:13 <razzi> razzi@cumin1001:~$ sudo cookbook sre.hosts.downtime an-tool1005.eqiad.wmnet -D 1 -r 'Testing deploy of superset 1.4.2 to staging' [analytics]
15:38 <ntsako> Stopped geoeditor Airflow DAGs to check on data quality [analytics]
14:13 <btullis> correction: restarted hadoop-yarn-nodemanager.service on an-worker1128 [analytics]
14:13 <btullis> restarted hadoop-yarn-nodemanager.service on an-worker1238 [analytics]
2022-03-24 §
11:15 <btullis> roll-restarting kafka-jumbo brokers T300626 [analytics]
2022-03-21 §
18:10 <razzi> sudo systemctl restart jupyter-bearloga-singleuser on stat1008 [analytics]
2022-03-17 §
17:10 <ottomata> restart webrequest and pageview_actor data purge - https://gerrit.wikimedia.org/r/c/operations/puppet/+/771389 [analytics]
14:07 <btullis> shutdown analytics1063 and analytics1067 with 120 minutes of downtime T303151 [analytics]
06:46 <elukey> kill remaining hanging processes for ppche*lko and accra*ze on an-test-client1001 to allow user offboarding (puppet broken) [analytics]
2022-03-16 §
19:14 <ottomata> deploying refinery to hadoop-test cluster with new gobblin-wmf-core jar [analytics]
18:00 <razzi> sudo cookbook sre.hosts.downtime -D 3 -r 'Setting up karapace for the first time' karapace1001.eqiad.wmnet [analytics]
17:57 <btullis> restarted mediawiki-history-drop-snapshot service on an-launcher1002 [analytics]
16:03 <aqu> analytics/refinery - scap deploy "Migrate session_length/daily from Oozie to Airflow" [analytics]
10:26 <btullis> rerunning failed mediawiki_structured_task_article_link_suggestion_interaction refine job [analytics]