2022-04-06 §
20:53 <razzi> roll restart aqs to deploy new mediawiki history snapshot [analytics]
15:51 <mforns> deployed Airflow to analytics (big refactor) [analytics]
15:23 <mforns> deployed Airflow to analytics_test (big refactor) [analytics]
09:18 <btullis> restarted eventlogging_to_druid_netflow_hourly on an-launcher1002 [analytics]
2022-04-05 §
20:41 <razzi> deploying refinery for https://gerrit.wikimedia.org/r/c/analytics/refinery/+/776269/ [analytics]
15:54 <razzi> razzi@cumin1001:~$ sudo cookbook sre.hosts.reimage --os bullseye -t T299481 dbstore1005 [analytics]
15:10 <razzi> razzi@cumin1001:~$ sudo cookbook sre.hosts.reimage --os bullseye -t T299481 dbstore1003 [analytics]
15:02 <razzi> set dbstore1003.eqiad.wmnet to downtime for upgrade T299481 [analytics]
15:01 <razzi> set dbstore1003.eqiad.wmnet to downtime for upgrade [analytics]
2022-04-01 §
09:05 <btullis> restarted varnishkafka-eventlogging.service on cp3050 T300246 [analytics]
2022-03-29 §
20:08 <joal> rerun cassandra editors_bycountry_monthly for month 2022-02 [analytics]
20:08 <mforns> restarted webrequest bundle [analytics]
19:57 <mforns> restarted mediawiki-geoeditors-public_monthly-coord [analytics]
19:56 <mforns> finished refinery deployment (regular weekly train) scap and hdfs [analytics]
19:53 <joal> Add new columns to wmf.webrequest (high entropy CH-UA) [analytics]
19:16 <joal> Drop/recreate wmf_raw.webrequest for schema change (high-entropy CH-UA) [analytics]
19:13 <mforns> starting refinery deployment (regular weekly train) [analytics]
19:11 <joal> kill webrequest-load oozie bundle for webrequest schema change [analytics]
17:13 <razzi> razzi@cumin1001:~$ sudo cookbook sre.hosts.downtime an-tool1005.eqiad.wmnet -D 1 -r 'Testing deploy of superset 1.4.2 to staging' [analytics]
15:38 <ntsako> Stopped geoeditor Airflow DAGs to check on data quality [analytics]
14:13 <btullis> correction: restarted hadoop-yarn-nodemanager.service on an-worker1128 [analytics]
14:13 <btullis> restarted hadoop-yarn-nodemanager.service on an-worker1238 [analytics]
2022-03-24 §
11:15 <btullis> roll-restarting kafka-jumbo brokers T300626 [analytics]
2022-03-21 §
18:10 <razzi> sudo systemctl restart jupyter-bearloga-singleuser on stat1008 [analytics]
2022-03-17 §
17:10 <ottomata> restart webrequest and pageview_actor data purge - https://gerrit.wikimedia.org/r/c/operations/puppet/+/771389 [analytics]
14:07 <btullis> shutdown analytics1063 and analytics1067 with 120 minutes of downtime T303151 [analytics]
06:46 <elukey> kill remaining hanging processes for ppche*lko and accra*ze on an-test-client1001 to allow users offboard (puppet broken) [analytics]
2022-03-16 §
19:14 <ottomata> deploying refinery to hadoop-test cluster with new gobblin-wmf-core jar [analytics]
18:00 <razzi> sudo cookbook sre.hosts.downtime -D 3 -r 'Setting up karapace for the first time' karapace1001.eqiad.wmnet [analytics]
17:57 <btullis> restarted mediawiki-history-drop-snapshot service on an-launcher1002 [analytics]
16:03 <aqu> analytics/refinery - scap deploy "Migrate session_length/daily from Oozie to Airflow" [analytics]
10:26 <btullis> rerunning failed mediawiki_structured_task_article_link_suggestion_interaction refine job [analytics]
2022-03-15 §
22:16 <razzi> upload karapace_2.1.3-py3.7-1_amd64.deb to apt.wikimedia.org [analytics]
19:58 <razzi> upload karapace_2.1.3-py3.7-0_amd64.deb to apt.wikimedia.org [analytics]
17:24 <ottomata> also change stats uid and gid to 918 on an-web1001 - T291384 [analytics]
14:35 <ottomata> change stats uid and gid on all stat boxes to 918 - T291384 [analytics]
13:59 <ottomata> roll restarting kafka jumbo brokers to set max.incremental.fetch.session.cache.slots=2000 - T303324 [analytics]
2022-03-14 §
21:05 <razzi> `sudo kill -9 15674` to stop unresponsive hive query [analytics]
2022-03-09 §
21:05 <ottomata> fix group ownership of cchen.db/new_editors/cohort=2021-12 after reverting T291664 - sudo -u hdfs kerberos-run-command hdfs hdfs dfs -chgrp -R analytics-privatedata-users /user/hive/warehouse/cchen.db/new_editors/cohort=2021-12 [analytics]
18:33 <ottomata> fix group ownership of wmf_product.db/new_editors/cohort=2021-12 after reverting T291664 - sudo -u hdfs kerberos-run-command hdfs hdfs dfs -chgrp -R analytics-privatedata-users /user/hive/warehouse/wmf_product.db/new_editors/cohort=2021-12 [analytics]
18:32 <ottomata> fix group ownership of wmf_product.db/global_markets_pageviews/year=2022/month=2 after reverting T291664 - sudo -u hdfs kerberos-run-command hdfs hdfs dfs -chgrp -R analytics-privatedata-users /user/hive/warehouse/wmf_product.db/global_markets_pageviews/year=2022/month=2 [analytics]
18:19 <btullis> btullis@ganeti1024:~$ sudo gnt-instance start karapace1001.eqiad.wmnet (T301562) [analytics]
16:16 <ottomata> fix group ownership of wmf_product.db/pageviews_corrected/year=2022/month=2 after reverting T291664 - sudo -u hdfs kerberos-run-command hdfs hdfs dfs -chgrp -R analytics-privatedata-users /user/hive/warehouse/wmf_product.db/pageviews_corrected/year=2022/month=2 [analytics]
2022-03-08 §
13:31 <ottomata> restarted webrequest-load oozie bundle as 0073173-220113112502223-oozie-oozi-B starting at 2022-03-08T12:00Z [analytics]
13:09 <ottomata> killing and rerunning webrequest-load-text-wf for webrequest_source=text/year=2022/month=3/day=7/hour=17, it was stuck in add_partition task as SUSPENDED, not sure why. [analytics]
12:47 <btullis> roll-restarting druid-analytics T300626 [analytics]
12:08 <btullis> roll-restarting druid-public. T300626 [analytics]
11:21 <btullis> roll-restarting druid-test T300626 [analytics]
11:00 <btullis> roll-restarting aqs T300626 [analytics]
10:57 <btullis> restarted archiva T300626 [analytics]