2023-09-27 §
08:11 <elukey> start kafka mirror on jumbo1002 [analytics]
08:08 <elukey> stop all mirror maker on jumbo, start only one on jumbo1001 [analytics]
07:47 <elukey> roll restart mirror maker instances on kafka jumbo [analytics]
2023-09-26 §
10:43 <btullis> deploying conda-analytics v0.0.23 to stats servers for T337258 [analytics]
10:36 <btullis> deploying conda-analytics v0.0.23 to analytics-airflow for T337258 [analytics]
10:34 <btullis> deploying conda-analytics v0.0.23 to hadoop-all for T337258 [analytics]
10:28 <btullis> upgrading outdated bigtop packages on stat1009 with `dpkg -l |egrep "\-deb11"|awk '{print $2}'|xargs sudo apt install` for T337465 [analytics]
10:11 <btullis> running `dpkg -l |egrep "\-deb11"|awk '{print $2}'|xargs sudo apt install` on an-test-client1002 for T337465 [analytics]
09:24 <btullis> pushing out build 0.0.23 of conda-analytics to hadoop-test. [analytics]
2023-09-25 §
08:53 <btullis> `root@archiva1002:/var/cache/archiva# sudo rm -rf temp*` [analytics]
2023-09-24 §
18:35 <btullis> restarted archiva to see if it clears some temp files. [analytics]
2023-09-21 §
17:59 <xcollazo> Deploy latest DAGs to analytics Airflow instance [analytics]
15:02 <milimetric> deployed aqs 1.0 to enable etags on all endpoints - so far everything looks ok [analytics]
08:56 <joal> Rerun edit-hourly druid indexation to fix corrupted data file [analytics]
08:10 <brouberol> redeploying eventgate-analytics in staging T336041 [analytics]
2023-09-19 §
14:19 <jennifer_ebe> airflow analytics deployment with scap successful [analytics]
13:57 <btullis> pushing out https://gerrit.wikimedia.org/r/c/operations/puppet/+/955893 for new refinery job jar files [analytics]
13:43 <jennifer_ebe> deploying airflow analytics dag [analytics]
13:32 <jennifer_ebe> deployment successful [analytics]
13:07 <jennifer_ebe> redeploying refinery from deployment.eqiad.wmnet using scap [analytics]
12:02 <jennifer_ebe> deploying refinery from deployment.eqiad.wmnet [analytics]
09:40 <btullis> commencing rolling restart of all brokers in kafka-jumbo [analytics]
09:27 <btullis> deploying change to kafka-jumbo settings for T344688 [analytics]
08:17 <brouberol> redeploying eventstream-analytics in eqiad T336041 [analytics]
08:05 <brouberol> redeploying eventstream-internal in staging T336041 [analytics]
08:02 <brouberol> redeploying eventgate-analytics-external in staging T336041 [analytics]
07:59 <brouberol> redeploying eventgate-analytics in staging T336041 [analytics]
2023-09-18 §
15:38 <btullis> deploying Superset 2.1.1 to an-tool1005 for superset-next.wikimedia.org [analytics]
13:14 <brouberol> Puppet run successfully on kafka-jumbo1010.eqiad.wmnet. The kafka service is running. T336041 [analytics]
10:45 <stevemunene> deploy datahub in eqiad to pick up new changes T305874 [analytics]
10:42 <stevemunene> deploy datahub in codfw to pick up new changes T305874 [analytics]
09:51 <stevemunene> disable auth_jaas and native login to datahub then enable oidc authentication to production in eqiad T305874 [analytics]
09:43 <stevemunene> disable auth_jaas and native login to datahub then enable oidc authentication to production in codfw T305874 [analytics]
2023-09-14 §
21:40 <btullis> executed apt-get clean on hadoop-test [analytics]
21:31 <btullis> deploying conda-analytics version 0.0.21 to hadoop-test for T337258 [analytics]
18:28 <xcollazo> Deployed latest DAGs to analytics Airflow instance T340861 [analytics]
14:13 <stevemunene> powercycle an-worker1138, investigating failures related to reimage T332570 [analytics]
11:42 <btullis> deploying conda-analytics version 0.0.20 to the test cluster for T337258 [analytics]
2023-09-12 §
14:59 <btullis> successfully failed back the HDFS namenode services to an-master1001 [analytics]
11:21 <btullis> demonstrated the use of SAL for T343762 [analytics]
09:54 <btullis> btullis@an-master1001:~$ sudo -u hdfs /usr/bin/hdfs haadmin -failover an-master1002-eqiad-wmnet an-master1001-eqiad-wmnet [analytics]
2023-09-07 §
16:55 <btullis> restarting the aqs service on all aqs* servers in batches to pick up new MW_history snapshot. [analytics]
13:43 <mforns> (actual timestamp: 2023-09-06, 19:10:29 UTC) cleared airflow task mediawiki_history_reduced.check_mediawiki_history_reduced_error_folder (and subsequent tasks) for snapshot=2023-08. This was due to false positive errors having been generated by the checker. [analytics]
2023-09-05 §
14:26 <btullis> completed eventstreams and eventstreams-internal deployments. [analytics]
14:23 <btullis> deploying eventstreams for T344688 [analytics]
14:15 <btullis> deploying eventstreams-internal for T344688 [analytics]
12:35 <stevemunene> power cycle an-worker1132. Host is stuck on debian install after a failed reimage. [analytics]
10:35 <joal> Rerun cassandra_load_pageview_top_articles_monthly [analytics]
10:35 <joal> Clear airflow false-failed tasks for pageview_hourly (log-aggregation issue) [analytics]
2023-09-01 §
07:43 <stevemunene> powercycle an-worker1145.eqiad.wmnet, host CPUs in soft lockup T345413 [analytics]