analytics SAL

2018-03-06 §
09:41	<elukey>	stop eventlogging's mysql consumers for db1107 (el master) kernel updates	[analytics]
2018-03-05 §
18:22	<elukey>	restart webrequest-load-wf-upload-2018-3-5-16 via Hue (failed due to reboots)	[analytics]
18:21	<elukey>	restart webrequest-load-wf-text-2018-3-5-16 via Hue (failed due to reboots)	[analytics]
15:00	<mforns>	rerun mediacounts-load-wf-2018-3-5-9	[analytics]
10:57	<joal>	Relaunch Mediawiki-history job manually from spark2 to see if the new version helps	[analytics]
10:57	<joal>	Killing failing Mediawiki-History job for 2018-03	[analytics]
2018-03-02 §
15:33	<mforns>	rerun webrequest-load-wf-text-2018-3-2-12	[analytics]
2018-03-01 §
14:59	<elukey>	shutdown deployment-eventlog02 in favor of deployment-eventlog05 in deployment-prep (Ubuntu -> Debian EL migration)	[analytics]
09:45	<elukey>	rerun webrequest-load-wf-text-2018-3-1-6 manually, failed due to analytics1030's reboot	[analytics]
2018-02-28 §
22:09	<milimetric>	re-deployed refinery for a small docs fix in the sqoop script	[analytics]
17:55	<milimetric>	Refinery synced to HDFS, deploy completed	[analytics]
17:40	<milimetric>	deploying Refinery	[analytics]
08:38	<joal>	rerun cassandra-hourly-wf-local_group_default_T_pageviews_per_project_v2-2018-2-27-15	[analytics]
2018-02-27 §
19:12	<ottomata>	updating spark2-* CLIs to spark 2.2.1: T185581	[analytics]
2018-02-21 §
20:48	<ottomata>	now running 2 camus webrequest jobs, one consuming from jumbo (no data yet), the other from analytics. these should be fine to run in parallel.	[analytics]
07:21	<elukey>	reboot db1108 (analytics-slave.eqiad.wmnet) for mariadb+kernel updates	[analytics]
2018-02-19 §
17:14	<elukey>	deployed eventlogging - https://gerrit.wikimedia.org/r/#/c/405687/	[analytics]
07:35	<elukey>	re-run wikidata-specialentitydata_metrics-wf-2018-2-17 via Hue	[analytics]
2018-02-16 §
15:41	<elukey>	add analytics1057 back in the Hadoop worker pool after disk swap	[analytics]
10:55	<elukey>	increased topic partitions for netflow to 3	[analytics]
2018-02-15 §
13:54	<milimetric>	deployment of refinery and refinery-source done	[analytics]
12:52	<joal>	Killing webrequest-load bundle (next restart should be at hour 12:00)	[analytics]
08:18	<elukey>	removed jmxtrans and java 7 from analytics1003 and re-launched refinery-drop-mediawiki-snapshots	[analytics]
07:51	<elukey>	removed default-java packages from analytics1003 and re-launched refinery-drop-mediawiki-snapshots	[analytics]
2018-02-14 §
13:44	<elukey>	rollback java 8 upgrade for archiva - issues with Analytics builds	[analytics]
13:35	<elukey>	installed openjdk-8 on meitnerium, manually upgraded java-update-alternatives to java8, restarted archiva	[analytics]
13:14	<elukey>	removed java 7 packages from analytics100[12]	[analytics]
12:43	<elukey>	jmxtrans removed from all the Hadoop workers	[analytics]
12:43	<elukey>	openjdk-7-* packages removed from all the Hadoop workers	[analytics]
2018-02-13 §
11:42	<elukey>	force kill of yarn nodemanager + other containers on analytics1057 (node failed, unit masked, processes still around)	[analytics]
2018-02-12 §
23:16	<elukey>	re-run webrequest-load-wf-upload-2018-2-12-21 via Hue (node managers failure)	[analytics]
23:13	<elukey>	manual restart of Yarn Node Managers on analytics1058/31	[analytics]
23:09	<elukey>	cleaned up tmp files on all analytics hadoop worker nodes, job filling up tmp	[analytics]
17:18	<elukey>	home dirs on stat1004 moved to /srv/home (/home symlinks to it)	[analytics]
17:15	<ottomata>	restarting eventlogging-processors to blacklist Print schema in eventlogging-valid-mixed (MySQL)	[analytics]
14:46	<ottomata>	deploying eventlogging for T186833 with EventCapsule in code and IP NO_DB_PROPERTIES	[analytics]
2018-02-09 §
12:19	<joal>	Rerun wikidata-articleplaceholder_metrics-wf-2018-2-8	[analytics]
2018-02-08 §
16:23	<elukey>	stop archiva on meitnerium to swap /var/lib/archiva from the root partition to a new separate one	[analytics]
2018-02-07 §
13:55	<joal>	Manually restarted druid indexation after weird failure of mediawiki-history-reduced-wf-2018-01	[analytics]
13:49	<elukey>	restart overlord/middlemanager on druid1005	[analytics]
2018-02-06 §
19:40	<joal>	Manually restarted druid indexation after weird failure of mediawiki-history-reduced-wf-2018-01	[analytics]
15:36	<elukey>	drain + shutdown of analytics1038 to replace faulty BBU	[analytics]
09:58	<elukey>	applied https://gerrit.wikimedia.org/r/c/405687/ manually on deployment-eventlog02 for testing	[analytics]
2018-02-05 §
15:51	<elukey>	live hacked deployment-eventlog02's /srv/deployment/eventlogging/analytics/eventlogging/handlers.py to add poll(0) to the confluent kafka producer - T185291	[analytics]
11:03	<elukey>	restart eventlogging/forwarder legacy-zmq on eventlog1001 due to slow memory leak over time (cached memory down to zero)	[analytics]
2018-02-02 §
17:09	<joal>	Webrequest upload 2018-02-02 hours 9 and 11 dataloss warnings have been checked - they are false positives	[analytics]
09:56	<joal>	Rerun unique_devices-per_project_family-monthly-wf-2018-1 after failure	[analytics]
2018-02-01 §
17:00	<ottomata>	killing stuck JsonRefine eventlogging analytics job application_1515441536446_52892, not sure why this is stuck.	[analytics]
14:06	<joal>	Dataloss alerts for upload 2018-02-01 hours 1, 2, 3 and 5 were false positives	[analytics]
12:17	<joal>	Restart cassandra monthly bundle after January deploy	[analytics]