analytics SAL

2018-01-18 §
08:58	<elukey>	temporarily set druid1002 in superset's druid cluster config (via UI)	[analytics]
08:53	<elukey>	temporarily point pivot's configuration to druid1002 (druid1001 needs to be rebooted)	[analytics]
08:52	<elukey>	disable druid1001's middlemanager as prep step for reboot	[analytics]
07:11	<elukey>	re-run webrequest-load-wf-misc-2018-1-18-3 via Hue	[analytics]
2018-01-17 §
17:33	<elukey>	killed the banner impression spark job (application_1515441536446_27293) again to force it to respawn (real time indexers not present)	[analytics]
17:29	<elukey>	restarted all druid overlords on druid100[123] (weird race condition messages about who was the leader for some task)	[analytics]
16:24	<elukey>	re-run all the pageview-druid-hourly failed jobs via Hue	[analytics]
14:42	<elukey>	restart druid middlemanager on druid1003 as attempt to unblock realtime streaming	[analytics]
14:21	<elukey>	forced kill of banner impression data streaming job to get it restarted	[analytics]
11:44	<elukey>	re-run pageview-druid-hourly-wf-2018-1-17-9 and pageview-druid-hourly-wf-2018-1-17-8 (failed due to druid1002's middlemanager being in a weird state after reboot)	[analytics]
11:44	<elukey>	restart druid middlemanager on druid1002	[analytics]
10:38	<elukey>	stopped all crons on hadoop-coordinator-1	[analytics]
10:37	<elukey>	re-run webrequest-druid-hourly-wf-2018-1-17-8 (failed due to druid1002's reboot)	[analytics]
10:22	<elukey>	reboot druid1002 for kernel upgrades	[analytics]
09:53	<elukey>	disable druid middlemanager on druid1002 as prep step for reboot	[analytics]
09:46	<elukey>	rebooted analytics1003	[analytics]
09:46	<elukey>	removed upstart config for brrd on eventlog1001 (failing and spamming syslog, old leftover?)	[analytics]
08:53	<elukey>	disabled camus as prep step for analytics1003 reboot	[analytics]
2018-01-15 §
13:39	<elukey>	stop eventlogging and reboot eventlog1001 for kernel updates	[analytics]
09:58	<elukey>	rolling reboots of aqs hosts (1005->1009) for kernel updates	[analytics]
09:11	<elukey>	reboot aqs1004 for kernel updates	[analytics]
2018-01-12 §
13:03	<joal>	Rerun webrequest-load-wf-text-2018-1-12-9	[analytics]
13:02	<joal>	Rerun webrequest-load-wf-upload-2018-1-12-9	[analytics]
10:33	<elukey>	reboot analytics1066->69 for kernel updates	[analytics]
09:07	<elukey>	reboot analytics1063->65 for kernel updates	[analytics]
2018-01-11 §
22:35	<ottomata>	restarting kafka-jumbo brokers to apply https://gerrit.wikimedia.org/r/403774	[analytics]
22:04	<ottomata>	restarting kafka-jumbo brokers to apply https://gerrit.wikimedia.org/r/#/c/403762/	[analytics]
20:57	<ottomata>	restarting kafka-jumbo brokers to apply https://gerrit.wikimedia.org/r/#/c/403753/	[analytics]
17:37	<joal>	Kill manual banner-streaming job to see it restarted by cron	[analytics]
17:11	<ottomata>	restart kafka on kafka-jumbo1003	[analytics]
17:08	<ottomata>	restart kafka on kafka-jumbo1001...something is not right with my certpath change yesterday	[analytics]
14:46	<joal>	Deploy refinery onto HDFS	[analytics]
14:33	<joal>	Deploy refinery with Scap	[analytics]
14:07	<joal>	Manually restarting banner streaming job to prevent alerting	[analytics]
13:23	<joal>	Killing banner-streaming job to have it auto-restarted from cron	[analytics]
11:45	<elukey>	re-run webrequest-load-wf-text-2018-1-11-8 (failed due to reboots)	[analytics]
11:39	<joal>	rerun mediacounts-load-wf-2018-1-11-8	[analytics]
10:48	<joal>	Restarting banner-streaming job after hadoop nodes reboot	[analytics]
10:01	<elukey>	reboot analytics1059-61 for kernel updates	[analytics]
09:34	<elukey>	reboot analytics1055->1058 for kernel updates	[analytics]
09:04	<elukey>	reboot analytics1051->1054 for kernel updates	[analytics]
2018-01-10 §
16:56	<elukey>	reboot analytics1048->50 for kernel updates	[analytics]
16:23	<ottomata>	restarting kafka jumbo brokers to apply java.security certpath restrictions	[analytics]
11:51	<elukey>	re-run webrequest-load-wf-upload-2018-1-10-10 (failed due to reboots)	[analytics]
11:27	<elukey>	re-run webrequest-load-wf-text-2018-1-10-10 (failed due to reboots)	[analytics]
11:26	<elukey>	reboot analytics1044->47 for kernel updates	[analytics]
11:03	<elukey>	reboot analytics1040->43 for kernel updates	[analytics]
2018-01-09 §
16:53	<joal>	Rerun pageview-druid-hourly-wf-2018-1-9-13	[analytics]
15:33	<elukey>	stop mysql on dbstore1002 as prep step for shutdown (stop all slaves, mysql stop)	[analytics]
15:10	<elukey>	reboot analytics1028 (hadoop worker and hdfs journal node) for kernel updates	[analytics]