analytics SAL

2021-04-29 §
15:55	<razzi>	restart hadoop-yarn-nodemanager and hadoop-hdfs-datanode on an-worker1100 for hadoop to recognize new disk /dev/sdl	[analytics]
15:38	<ottomata>	enabling event_sanitized_main jobs - T273789	[analytics]
14:57	<elukey>	run mysql_upgrade on an-coord1001 to complete the buster upgrade - T278424	[analytics]
14:44	<hnowlan>	restored all eventlogging jobs to eventlog1003	[analytics]
14:21	<hnowlan>	bump eventlog1003 CPUs to 6	[analytics]
13:53	<joal>	Rerun failed pageview-hourly-wf-2021-4-29-11 and pageview-hourly-wf-2021-4-29-12	[analytics]
13:09	<joal>	Rerun failed pageview-hourly-wf-2021-4-29-11	[analytics]
12:35	<hnowlan>	restarting 2 processors on eventlog1002	[analytics]
12:02	<hnowlan>	stopping processors on eventlog1002 to migrate to eventlog1003	[analytics]
11:50	<elukey>	manual stop of one of the eventlog processors on eventlog1002 to see if 1003 takes it over	[analytics]
02:59	<milimetric>	deployed hotfix for referrer job	[analytics]
2021-04-28 §
17:46	<hnowlan>	eventlog1003 joined to groups successfully	[analytics]
17:36	<razzi>	sudo mkdir /srv/log/eventlogging and sudo chown eventlogging:eventlogging /srv/log/eventlogging to work around missing directory puppet error (to be puppetized later)	[analytics]
17:31	<razzi>	remove deployment cache on eventlog1003: sudo rm -fr /srv/deployment/eventlogging/analytics-cache/	[analytics]
17:26	<razzi>	manually change /srv/deployment/eventlogging/analytics/.git/DEPLOY_HEAD to deployment1002 on deployment1002 to fix puppet scap error	[analytics]
16:53	<hnowlan>	stopping deployment-eventlog05 in deployment-prep	[analytics]
14:42	<milimetric>	deployed refinery with 0.1.9 jars and synced to hdfs	[analytics]
14:30	<elukey>	chown -R analytics-deploy:analytics-deploy /srv/deployment/analytics on an-coord1001	[analytics]
12:50	<ottomata>	applied data_purge jobs in analytics test cluster; old data will now be dropped there - T273789	[analytics]
2021-04-27 §
08:33	<elukey>	run mysql_upgrade for analytics-meta on an-coord1002 (should be part of the upgrade process) - T278424	[analytics]
07:11	<elukey>	restart yarn resource managers to pick up yarn label settings	[analytics]
2021-04-26 §
08:01	<elukey>	restart hadoop-mapreduce-historyserver on an-master1001 after changes to the yarn ui user	[analytics]
07:36	<elukey>	re-enable timers after setting the capacity scheduler	[analytics]
07:31	<elukey>	restart hadoop RM on an-master* to pick up capacity scheduler changes	[analytics]
06:44	<elukey>	stop timers on an-launcher1002 again as prep step for capacity scheduler changes	[analytics]
06:32	<elukey>	roll restart of hadoop-yarn-nodemanagers to pick up new log4j settings - T276906	[analytics]
06:25	<elukey>	re-enable timers	[analytics]
06:20	<elukey>	reboot an-coord1001 to pick up kernel security settings	[analytics]
05:57	<elukey>	stop timers on an-launcher1002 to allow a reboot of an-coord1001	[analytics]
2021-04-24 §
08:03	<joal>	Rerun failed webrequest-druid-hourly-wf-2021-4-23-13	[analytics]
2021-04-23 §
14:23	<elukey>	roll restart an-master100[1,2] daemons to pick up new log4j settings - T276906	[analytics]
10:30	<elukey>	restart hadoop daemons (NM, DN, JN) on an-worker1080 to further test the new log4j config - T276906	[analytics]
09:12	<elukey>	change default log4j hadoop config to include rolling gzip appender	[analytics]
2021-04-21 §
21:30	<ottomata>	temporarily disabling sanitize_eventlogging_analytics_delayed jobs until T280813 is completed (probably tomorrow)	[analytics]
20:04	<ottomata>	renaming event_sanitized hive table directories to lower case and repairing table partition paths - T280813	[analytics]
09:28	<elukey>	roll restart druid-overlord on druid* after an-coord1001 maintenance	[analytics]
09:08	<elukey>	upgrade hue on an-tool1009 to 4.9.0-2	[analytics]
08:31	<elukey>	re-enable timers on an-launcher1002 and airflow on an-airflow1001 after maintenance on an-coord1001	[analytics]
07:08	<elukey>	reimage an-coord1001 after partition reshape (/var/lib/mysql folded in /srv)	[analytics]
06:51	<elukey>	stop airflow on an-airflow1001	[analytics]
06:49	<elukey>	stop all services on an-coord1001 as prep step for reimage	[analytics]
06:45	<elukey>	PURGE BINARY LOGS BEFORE '2021-04-14 00:00:00'; on an-coord1001 to free some space before the reimage	[analytics]
06:00	<elukey>	stop timers on an-launcher1002 as prep step for an-coord1001 reimage	[analytics]
2021-04-20 §
15:51	<elukey>	move analytics-hive.eqiad.wmnet back to an-coord1001 (test on an-coord1002 successful)	[analytics]
15:38	<ottomata>	deployed refinery to hdfs	[analytics]
13:59	<ottomata>	deploying refinery and refinery source 0.1.6 for weekly train	[analytics]
13:37	<ottomata>	deployed aqs	[analytics]
13:16	<elukey>	failover analytics-hive to an-coord1002 to test the host (running on buster)	[analytics]
12:40	<elukey>	PURGE BINARY LOGS BEFORE '2021-04-12 00:00:00'; on an-coord1001 - T280367	[analytics]
2021-04-19 §
16:45	<ottomata>	make RefineMonitor use analytics keytab - this should be a no-op	[analytics]