2020-05-22 §
08:15 <elukey> superset down for maintenance [analytics]
07:09 <elukey> add druid100[7,8] to the LVS druid-public-brokers service (serving AQS's traffic) [analytics]
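(Note: pooling a host behind LVS is normally done with conftool; a minimal sketch, assuming a druid-public-broker service name, which is an illustration and not taken from this log:)
    # Pool druid1007 into the (assumed) druid-public-broker LVS service
    sudo confctl select 'name=druid1007.eqiad.wmnet,service=druid-public-broker' set/pooled=yes
    # Check the resulting pooled state
    sudo confctl select 'name=druid1007.eqiad.wmnet' get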
2020-05-21 §
17:24 <elukey> add druid100[7,8] to the druid public cluster (not serving load balancer traffic for the moment, only joining the cluster) - T252771 [analytics]
16:44 <elukey> roll restart druid historical nodes on druid100[4-6] (public cluster) to pick up new settings - T252771 [analytics]
14:02 <elukey> restart druid kafka supervisor for wmf_netflow after maintenance [analytics]
13:53 <elukey> restart druid-historical on an-druid100[1,2] to pick up new settings [analytics]
13:17 <elukey> kill wmf_netflow druid supervisor for maintenance [analytics]
13:13 <elukey> stop druid-daemons on druid100[1-3] (one at a time) to move the druid partition from /srv/druid to /srv (didn't think about it before) - T252771 [analytics]
09:16 <elukey> move Druid Analytics SQL in Superset to druid://an-druid1001.eqiad.wmnet:8082/druid/v2/sql/ [analytics]
09:05 <elukey> move turnilo to an-druid1001 (beefier host) [analytics]
08:15 <elukey> roll restart of all druid historicals in the analytics cluster to pick up new settings [analytics]
2020-05-20 §
13:55 <milimetric> deployed refinery with refinery-source v0.0.125 [analytics]
2020-05-19 §
15:28 <elukey> restart hadoop master daemons on an-master100[1,2] for openjdk upgrades [analytics]
06:29 <elukey> roll restart zookeeper on druid100[4-6] for openjdk upgrades [analytics]
06:18 <elukey> roll restart zookeeper on druid100[1-3] for openjdk upgrades [analytics]
2020-05-18 §
14:02 <elukey> roll restart of hadoop daemons on the prod cluster for openjdk upgrades [analytics]
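(Note: roll restarts like the two above are typically driven host-by-host with cumin; a minimal sketch, assuming a hadoop-worker host alias and the stock Hadoop systemd unit names:)
    # Restart YARN NodeManagers two hosts at a time, pausing 30s between batches
    sudo cumin -b 2 -s 30 'A:hadoop-worker' 'systemctl restart hadoop-yarn-nodemanager'
    # Same pattern for HDFS DataNodes
    sudo cumin -b 2 -s 30 'A:hadoop-worker' 'systemctl restart hadoop-hdfs-datanode'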
13:30 <elukey> roll restart hadoop daemons on the test cluster for openjdk upgrades [analytics]
10:33 <elukey> add an-druid100[1,2] to the Druid Analytics cluster [analytics]
2020-05-15 §
13:23 <elukey> roll restart of the Druid analytics cluster to pick up new openjdk + /srv completed [analytics]
13:15 <elukey> turnilo back to druid1001 [analytics]
13:03 <elukey> move turnilo config to druid1002 to ease druid maintenance [analytics]
12:31 <elukey> move superset config to druid1002 (was druid1003) to ease maintenance [analytics]
09:08 <elukey> restart druid brokers on Analytics Public [analytics]
2020-05-14 §
18:41 <ottomata> fixed TLS authentication for Kafka mirror maker on jumbo - T250250 [analytics]
12:49 <joal> Release 2020-04 mediawiki_history_reduced to public druid for AQS (elukey did it :-P) [analytics]
09:53 <elukey> upgrade matomo to 3.13.3 [analytics]
09:50 <elukey> set matomo in maintenance mode as prep step for upgrade [analytics]
2020-05-13 §
21:36 <elukey> powercycle analytics1055 [analytics]
13:46 <elukey> upgrade spark2 on all stat100x hosts - T250161 [analytics]
06:47 <elukey> upgrade spark2 on stat1004 - canary host - T250161 [analytics]
2020-05-11 §
10:17 <elukey> re-run webrequest-load-wf-text-2020-5-11-9 [analytics]
06:06 <elukey> restart wikimedia-discovery-golden on stat1007 - apparently killed by no memory left to allocate on the system [analytics]
05:14 <elukey> force re-run of monitor_refine_event_failure_flags after fixing a refine failed hour [analytics]
2020-05-10 §
07:44 <joal> Rerun webrequest-load-wf-upload-2020-5-10-1 [analytics]
2020-05-08 §
21:06 <ottomata> running preferred replica election for kafka-jumbo to get preferred leaders back after reboot of broker earlier today - T252203 [analytics]
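(Note: the election above maps to Kafka's stock CLI; a minimal sketch, assuming the upstream tool shipped with Kafka 2.x and a placeholder ZooKeeper address:)
    # Trigger a preferred replica election for all partitions of the cluster
    kafka-preferred-replica-election.sh --zookeeper ZOOKEEPER_HOST:2181/KAFKA_CHROOT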
15:36 <ottomata> starting kafka broker on kafka-jumbo1006, same issue on other brokers when they are leaders of offending partitions - T252203 [analytics]
15:27 <ottomata> stopping kafka broker on kafka-jumbo1006 to investigate camus import failures - T252203 [analytics]
15:16 <ottomata> restarted turnilo after applying nuria and mforns changes [analytics]
2020-05-07 §
17:39 <ottomata> deploying fix to refinery bin/camus CamusPartitionChecker when using dynamic stream configs [analytics]
16:49 <joal> Restart and babysit mediawiki-history-denormalize-wf-2020-04 [analytics]
16:37 <elukey> roll restart of all the nodemanagers on the hadoop cluster to pick up new jvm settings [analytics]
13:53 <elukey> move stat1007 to role::statistics::explorer (adding jupyterhub) [analytics]
11:00 <joal> Moving application_1583418280867_334532 to the nice queue [analytics]
10:58 <joal> Rerun wikidata-articleplaceholder_metrics-wf-2020-5-6 [analytics]
07:45 <elukey> re-run mediawiki-history-denormalize [analytics]
07:43 <elukey> kill application_1583418280867_333560 after a chat with David, the job is consuming ~2TB of RAM [analytics]
07:32 <elukey> re-run mediawiki history load [analytics]
07:18 <elukey> execute yarn application -movetoqueue application_1583418280867_332862 -queue root.nice [analytics]
07:06 <elukey> restart mediawiki-history-load via hue [analytics]
06:41 <elukey> restart oozie on an-coord1001 [analytics]
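(Note: most of the re-runs above (webrequest-load, mediawiki-history-*) are Oozie coordinator actions; a minimal sketch of the stock Oozie CLI for re-running one failed action, with placeholder job/action IDs since none appear in the log:)
    # Re-run a single coordinator action against the Oozie server on an-coord1001 (11000 is the Oozie default port)
    oozie job -oozie http://an-coord1001.eqiad.wmnet:11000/oozie -rerun COORD_JOB_ID -action ACTION_NUMBER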