2021-01-05 §
21:32 <ottomata> bumped mediawiki history snapshot version in AQS [analytics]
20:45 <ottomata> Refine changes: event tables now have is_wmf_domain, canary events are removed, and corrupt records will result in a better monitoring email [analytics]
20:43 <razzi> deploy aqs as part of train [analytics]
19:17 <razzi> deploying refinery for weekly train [analytics]
09:29 <joal> Manually reload unique-devices monthly in cassandra to fix T271170 [analytics]
2021-01-04 §
22:22 <razzi> reboot an-test-coord1001 to upgrade kernel [analytics]
14:24 <elukey> deprecate the analytics-users group [analytics]
2021-01-03 §
14:11 <milimetric> reset-failed refinery-sqoop-whole-mediawiki.service [analytics]
14:10 <milimetric> manual sqoop finished, logs on an-launcher1002 at /var/log/refinery/sqoop-mediawiki.log and /var/log/refinery/sqoop-mediawiki-production.log [analytics]
2021-01-01 §
14:54 <milimetric> deployed refinery hotfix for sqoop problem, after testing on three small wikis [analytics]
2020-12-29 §
09:18 <elukey> restart hue to pick up analytics-hive endpoint settings [analytics]
2020-12-23 §
15:53 <ottomata> point analytics-hive.eqiad.wmnet back at an-coord1001 - T268028 T270768 [analytics]
2020-12-22 §
19:35 <elukey> restart hive daemons on an-coord1001 to pick up new settings [analytics]
18:13 <elukey> failover analytics-hive.eqiad.wmnet to an-coord1002 (to allow maintenance on an-coord1001) [analytics]
18:07 <elukey> restart hive server on an-coord1002 (current standby - no traffic) to pick up the new config (use the local metastore instead of the one analytics-hive points to) [analytics]
17:00 <mforns> Deployed refinery as part of weekly train (v0.0.142) [analytics]
16:42 <mforns> Deployed refinery-source v0.0.142 [analytics]
16:30 <mforns> Deployed refinery-source v0.0.142 [analytics]
15:00 <razzi> stopping superset server on analytics-tool1004 [analytics]
10:36 <elukey> restart presto coordinator to pick up analytics-hive settings [analytics]
10:25 <elukey> failover analytics-hive.eqiad.wmnet to an-coord1001 [analytics]
09:56 <elukey> restart hive daemons on an-coord1001 to pick up analytics-hive settings [analytics]
07:27 <elukey> reboot stat100[4-8] (analytics hadoop clients) for kernel upgrades [analytics]
07:23 <elukey> move all analytics clients (spark refine, stat100x, hive-site.xml on hdfs, etc.) to analytics-hive.eqiad.wmnet [analytics]
2020-12-18 §
14:10 <elukey> restore stat1004 to its previous settings for kerberos credential cache [analytics]
2020-12-17 §
14:54 <klausman> Updated all stat100x machines to now sport kafkacat 1.6.0, backported from Bullseye [analytics]
11:04 <elukey> wipe/reimage the hadoop test cluster to start clean for CDH (and then test the upgrade to bigtop 1.5) [analytics]
2020-12-16 §
21:06 <joal> Kill-restart virtualpageview-hourly-coord and projectview-geo-coord with manually updated jar versions (old versions in conf) [analytics]
19:35 <joal> Kill-restart all oozie jobs belonging to analytics except mediawiki-wikitext-history-coord [analytics]
18:52 <joal> Kill-restart cassandra loading oozie jobs [analytics]
18:37 <joal> Kill-restart wikidata-entity, wikidata-item_page_link and mobile_apps-session_metrics oozie jobs [analytics]
18:31 <joal> Kill-rerun data-quality bundles [analytics]
16:17 <razzi> dropping and re-creating superset staging database [analytics]
08:13 <joal> Manually push updated pageview whitelist to HDFS [analytics]
2020-12-15 §
20:24 <joal> Kill-restart webrequest_load oozie job after deploy [analytics]
19:43 <joal> Deploy refinery onto HDFS [analytics]
19:14 <joal> Scap deploy refinery [analytics]
18:26 <joal> Release refinery-source v0.0.141 [analytics]
2020-12-14 §
19:09 <razzi> restart hadoop-yarn-resourcemanager on an-master1002 to promote an-master1001 to active again [analytics]
19:08 <razzi> restarted hadoop-yarn-resourcemanager on an-master1001 again by mistake [analytics]
19:02 <razzi> restart hadoop-yarn-resourcemanager on an-master1002 [analytics]
18:54 <razzi> restart hadoop-yarn-resourcemanager on an-master1001 [analytics]
18:43 <razzi> applying yarn config change via `sudo cumin "A:hadoop-worker" "systemctl restart hadoop-yarn-nodemanager" -b 10` [analytics]
14:58 <elukey> stat1004's krb credential cache moved under /run (shared between notebooks and ssh/bash) - T255262 [analytics]
07:55 <elukey> roll restart yarn daemons to pick up https://gerrit.wikimedia.org/r/c/operations/puppet/+/649126 [analytics]
2020-12-11 §
19:30 <ottomata> now ingesting Growth EventLogging schemas using event platform refine job; they are exclude-listed from eventlogging-processor. - T267333 [analytics]
07:04 <elukey> roll restart presto cluster to pick up new jvm xmx settings [analytics]
06:57 <elukey> restart presto on an-presto1003 since all the memory on the host was occupied, and puppet failed to run [analytics]
2020-12-10 §
12:29 <joal> Drop-Recreate-Repair wmf_raw.mediawiki_image table [analytics]
2020-12-09 §
20:34 <elukey> execute on mysql:an-coord1002 "set GLOBAL replicate_wild_ignore_table='superset_staging.%'" to avoid replication for superset_staging from an-coord1002 [analytics]