analytics SAL

2020-12-22 §
09:56	<elukey>	restart hive daemons on an-coord1001 to pick up analytics-hive settings	[analytics]
07:27	<elukey>	reboot stat100[4-8] (analytics hadoop clients) for kernel upgrades	[analytics]
07:23	<elukey>	move all analytics clients (spark refine, stat100x, hive-site.xml on hdfs, etc.) to analytics-hive.eqiad.wmnet	[analytics]
2020-12-18 §
14:10	<elukey>	restore stat1004 to its previous settings for kerberos credential cache	[analytics]
2020-12-17 §
14:54	<klausman>	Updated all stat100x machines to now sport kafkacat 1.6.0, backported from Bullseye	[analytics]
11:04	<elukey>	wipe/reimage the hadoop test cluster to start clean for CDH (and then test the upgrade to bigtop 1.5)	[analytics]
2020-12-16 §
21:06	<joal>	Kill-restart virtualpageview-hourly-coord and projectview-geo-coord with manually updated jar versions (old versions in conf)	[analytics]
19:35	<joal>	Kill-restart all oozie jobs belonging to analytics except mediawiki-wikitext-history-coord	[analytics]
18:52	<joal>	Kill-restart cassandra loading oozie jobs	[analytics]
18:37	<joal>	Kill-restart wikidata-entity, wikidata-item_page_link and mobile_apps-session_metrics oozie jobs	[analytics]
18:31	<joal>	Kill-rerun data-quality bundles	[analytics]
16:17	<razzi>	dropping and re-creating superset staging database	[analytics]
08:13	<joal>	Manually push updated pageview whitelist to HDFS	[analytics]
2020-12-15 §
20:24	<joal>	Kill-restart webrequest_load oozie job after deploy	[analytics]
19:43	<joal>	Deploy refinery onto HDFS	[analytics]
19:14	<joal>	Scap deploy refinery	[analytics]
18:26	<joal>	Release refinery-source v0.0.141	[analytics]
2020-12-14 §
19:09	<razzi>	restart hadoop-yarn-resourcemanager on an-master1002 to promote an-master1001 to active again	[analytics]
19:08	<razzi>	restarted hadoop-yarn-resourcemanager on an-master1001 again by mistake	[analytics]
19:02	<razzi>	restart hadoop-yarn-resourcemanager on an-master1002	[analytics]
18:54	<razzi>	restart hadoop-yarn-resourcemanager on an-master1001	[analytics]
18:43	<razzi>	applying yarn config change via `sudo cumin "A:hadoop-worker" "systemctl restart hadoop-yarn-nodemanager" -b 10`	[analytics]
14:58	<elukey>	stat1004's krb credential cache moved under /run (shared between notebooks and ssh/bash) - T255262	[analytics]
07:55	<elukey>	roll restart yarn daemons to pick up https://gerrit.wikimedia.org/r/c/operations/puppet/+/649126	[analytics]
2020-12-11 §
19:30	<ottomata>	now ingesting Growth EventLogging schemas using event platform refine job; they are exclude-listed from eventlogging-processor. - T267333	[analytics]
07:04	<elukey>	roll restart presto cluster to pick up new jvm xmx settings	[analytics]
06:57	<elukey>	restart presto on an-presto1003 since all the memory on the host was occupied, and puppet failed to run	[analytics]
2020-12-10 §
12:29	<joal>	Drop-Recreate-Repair wmf_raw.mediawiki_image table	[analytics]
2020-12-09 §
20:34	<elukey>	execute on mysql:an-coord1002 "set GLOBAL replicate_wild_ignore_table='superset_staging.%'" to avoid replication for superset_staging from an-coord1002	[analytics]
07:12	<elukey>	re-enable timers after maintenance	[analytics]
07:07	<elukey>	restart hive-server2 on an-coord1002 for consistency	[analytics]
07:05	<elukey>	restart hive metastore and server2 on an-coord1001 to pick up settings for DBTokenStore	[analytics]
06:50	<elukey>	stop timers on an-launcher1002 as prep step to restart hive	[analytics]
2020-12-07 §
18:51	<joal>	Test mediawiki-wikitext-history new sizing settings	[analytics]
18:43	<razzi>	kill testing flink job: sudo -u hdfs yarn application -kill application_1605880843685_61049	[analytics]
18:42	<razzi>	truncate /var/lib/hadoop/data/h/yarn/logs/application_1605880843685_61049/container_e27_1605880843685_61049_01_000002/taskmanager.log on an-worker1011	[analytics]
2020-12-03 §
22:34	<milimetric>	updated mw history snapshot on AQS	[analytics]
07:09	<elukey>	manual reset-failed refinery-sqoop-whole-mediawiki.service on an-launcher1002 (job launched manually)	[analytics]
2020-12-02 §
21:37	<joal>	Manually create _SUCCESS flags for banner history monthly jobs to kick off (they'll be deleted by the purge tomorrow morning)	[analytics]
21:16	<joal>	Rerun timed out jobs after oozie config got updated (mediawiki-geoeditors-yearly-coord and banner_activity-druid-monthly-coord)	[analytics]
20:49	<ottomata>	deployed eventgate-analytics-external with refactored stream config, hopefully this will work around the canary events alarm bug - T266573	[analytics]
18:20	<mforns>	finished netflow migration wmf->event	[analytics]
17:50	<mforns>	starting netflow migration wmf->event	[analytics]
17:50	<joal>	Manually start refinery-sqoop-production on an-launcher1002 to cover for couped runs failure	[analytics]
16:50	<mforns>	restarted turnilo to clear deleted datasource	[analytics]
16:47	<milimetric>	faked _SUCCESS flag for image table to allow daisy-chained mediawiki history load dependent coordinators to keep running	[analytics]
07:49	<elukey>	restart oozie to pick up new settings for T264358	[analytics]
2020-12-01 §
19:43	<razzi>	deploy refinery with refinery-source v0.0.140	[analytics]
10:50	<elukey>	restart oozie to pick up new logging settings	[analytics]
09:03	<elukey>	clean up old hive metastore/server logs on an-coord1001 to free space	[analytics]