analytics SAL

2020-12-14 §
19:09	<razzi>	restart hadoop-yarn-resourcemanager on an-master1002 to promote an-master1001 to active again	[analytics]
19:08	<razzi>	restarted hadoop-yarn-resourcemanager on an-master1001 again by mistake	[analytics]
19:02	<razzi>	restart hadoop-yarn-resourcemanager on an-master1002	[analytics]
18:54	<razzi>	restart hadoop-yarn-resourcemanager on an-master1001	[analytics]
18:43	<razzi>	applying yarn config change via `sudo cumin "A:hadoop-worker" "systemctl restart hadoop-yarn-nodemanager" -b 10`	[analytics]
14:58	<elukey>	stat1004's krb credential cache moved under /run (shared between notebooks and ssh/bash) - T255262	[analytics]
07:55	<elukey>	roll restart yarn daemons to pick up https://gerrit.wikimedia.org/r/c/operations/puppet/+/649126	[analytics]
2020-12-11 §
19:30	<ottomata>	now ingesting Growth EventLogging schemas using event platform refine job; they are exclude-listed from eventlogging-processor. - T267333	[analytics]
07:04	<elukey>	roll restart presto cluster to pick up new jvm xmx settings	[analytics]
06:57	<elukey>	restart presto on an-presto1003 since all the memory on the host was occupied, and puppet failed to run	[analytics]
2020-12-10 §
12:29	<joal>	Drop-Recreate-Repair wmf_raw.mediawiki_image table	[analytics]
2020-12-09 §
20:34	<elukey>	execute on mysql:an-coord1002 "set GLOBAL replicate_wild_ignore_table='superset_staging.%'" to avoid replication for superset_staging from an-coord1002	[analytics]
07:12	<elukey>	re-enable timers after maintenance	[analytics]
07:07	<elukey>	restart hive-server2 on an-coord1002 for consistency	[analytics]
07:05	<elukey>	restart hive metastore and server2 on an-coord1001 to pick up settings for DBTokenStore	[analytics]
06:50	<elukey>	stop timers on an-launcher1002 as prep step to restart hive	[analytics]
2020-12-07 §
18:51	<joal>	Test mediawiki-wikitext-history new sizing settings	[analytics]
18:43	<razzi>	kill testing flink job: sudo -u hdfs yarn application -kill application_1605880843685_61049	[analytics]
18:42	<razzi>	truncate /var/lib/hadoop/data/h/yarn/logs/application_1605880843685_61049/container_e27_1605880843685_61049_01_000002/taskmanager.log on an-worker1011	[analytics]
2020-12-03 §
22:34	<milimetric>	updated mw history snapshot on AQS	[analytics]
07:09	<elukey>	manual reset-failed refinery-sqoop-whole-mediawiki.service on an-launcher1002 (job launched manually)	[analytics]
2020-12-02 §
21:37	<joal>	Manually create _SUCCESS flags for banner history monthly jobs to kick off (they'll be deleted by the purge tomorrow morning)	[analytics]
21:16	<joal>	Rerun timed out jobs after oozie config got updated (mediawiki-geoeditors-yearly-coord and banner_activity-druid-monthly-coord)	[analytics]
20:49	<ottomata>	deployed eventgate-analytics-external with refactored stream config, hopefully this will work around the canary events alarm bug - T266573	[analytics]
18:20	<mforns>	finished netflow migration wmf->event	[analytics]
17:50	<mforns>	starting netflow migration wmf->event	[analytics]
17:50	<joal>	Manually start refinery-sqoop-production on an-launcher1002 to cover for couped runs failure	[analytics]
16:50	<mforns>	restarted turnilo to clear deleted datasource	[analytics]
16:47	<milimetric>	faked _SUCCESS flag for image table to allow daisy-chained mediawiki history load dependent coordinators to keep running	[analytics]
07:49	<elukey>	restart oozie to pick up new settings for T264358	[analytics]
2020-12-01 §
19:43	<razzi>	deploy refinery with refinery-source v0.0.140	[analytics]
10:50	<elukey>	restart oozie to pick up new logging settings	[analytics]
09:03	<elukey>	clean up old hive metastore/server logs on an-coord1001 to free space	[analytics]
2020-11-30 §
17:51	<joal>	Deploy refinery onto hdfs	[analytics]
17:49	<joal>	Kill-restart mediawiki-history-load job after refactor (1 coordinator per table) and tables addition	[analytics]
17:32	<joal>	Kill-restart mediawiki-history-reduced job for druid-public datasource number of shards update	[analytics]
17:32	<joal>	Deploy refinery using scap for naming hotfix	[analytics]
15:29	<ottomata>	migrated EventLogging schemas SpecialMuteSubmit and SpecialInvestigate to EventGate - T268517	[analytics]
14:56	<joal>	Deploying refinery onto hdfs	[analytics]
14:49	<joal>	Create new hive tables for newly sqooped data	[analytics]
14:45	<joal>	Deploy refinery using scap	[analytics]
09:08	<elukey>	force execution of refinery-drop-pageview-actor-hourly-partitions on an-launcher1002 (after args fixup from Joseph)	[analytics]
2020-11-27 §
14:51	<elukey>	roll restart zookeeper on druid* nodes for openjdk upgrades	[analytics]
10:29	<elukey>	restart eventlogging_to_druid_editattemptstep_hourly on an-launcher1002 (failed) to see if the hive metastore works	[analytics]
10:27	<elukey>	restart oozie and presto-server on an-coord1001 for openjdk upgrades	[analytics]
10:27	<elukey>	restart hive server and metastore on an-coord1001 - openjdk upgrades + problem with high GC caused by a job	[analytics]
08:05	<elukey>	roll restart druid public cluster for openjdk upgrades	[analytics]
2020-11-26 §
13:52	<elukey>	roll restart druid daemons on druid analytics to pick up new openjdk upgrades	[analytics]
13:08	<elukey>	force umount/mount of all /mnt/hdfs mountpoints to pick up openjdk upgrades	[analytics]
09:07	<elukey>	force purging https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/en.wikipedia/all-access/user/Diego_Maradona/daily/2020110500/2020112500 from caches	[analytics]