2021-04-14 §
14:05 <elukey> run build/env/bin/hue migrate on an-tool1009 after the hue upgrade [analytics]
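The migrate step above runs Hue's bundled Django schema migrations from its virtualenv. A minimal sketch, assuming the stock /usr/lib/hue install prefix, a hue service user, and a hue systemd unit (all assumptions, not taken from the log):

    # run Hue's Django schema migrations after the package upgrade
    cd /usr/lib/hue
    sudo -u hue build/env/bin/hue migrate
    # restart the service so it serves the migrated schema (unit name is an assumption)
    sudo systemctl restart hue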
13:10 <elukey> rollback hue-next to 4.8 - issues not present in staging [analytics]
13:00 <elukey> upgrade Hue to 4.9 on an-tool1009 - hue-next.wikimedia.org [analytics]
10:02 <elukey> roll restart yarn nodemanagers on hadoop prod (graceful restart, to check whether they had entered a weird state) [analytics]
09:54 <elukey> kill long running mediawiki-job refine erroring out application_1615988861843_166906 [analytics]
09:46 <elukey> kill application_1615988861843_163186 for the same reason [analytics]
09:43 <elukey> kill application_1615988861843_164387 to see whether socket consumption improves [analytics]
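The three kills above and the 10:02 NodeManager roll restart use standard YARN and systemd tooling; a sketch, with the systemd unit name assumed from the usual Hadoop packaging:

    # inspect what is running, then kill a misbehaving application by ID (ID taken from the log)
    yarn application -list -appStates RUNNING
    yarn application -kill application_1615988861843_164387
    # graceful restart of one NodeManager; repeated host by host this becomes a roll restart
    sudo systemctl restart hadoop-yarn-nodemanager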
09:14 <elukey> run "sudo kill `pgrep -f sqoop`" on an-launcher1002 to clean up old test processes still running [analytics]
2021-04-13 §
16:17 <razzi> rebalance kafka partitions for webrequest_text partitions 19, 20 [analytics]
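These webrequest_text rebalances (repeated on later days for other partition pairs) correspond to Kafka's stock partition-reassignment tool; WMF normally drives it through a site-specific kafka wrapper not shown here. A sketch with placeholder broker IDs and connection string:

    # reassignment plan for the two partitions (replica broker IDs are placeholders)
    echo '{"version":1,"partitions":[
      {"topic":"webrequest_text","partition":19,"replicas":[1001,1002,1003]},
      {"topic":"webrequest_text","partition":20,"replicas":[1002,1003,1004]}]}' > reassign-19-20.json
    # execute, then verify; newer Kafka takes --bootstrap-server, older releases --zookeeper
    kafka-reassign-partitions.sh --bootstrap-server BROKER:9092 \
      --reassignment-json-file reassign-19-20.json --execute
    kafka-reassign-partitions.sh --bootstrap-server BROKER:9092 \
      --reassignment-json-file reassign-19-20.json --verify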
13:18 <ottomata> Refine now uses refinery-job 0.1.4; RefineFailuresChecker has been removed and its function rolled into RefineMonitor - [analytics]
10:23 <hnowlan> deploying aqs with updated cassandra libraries to aqs1004 while depooled [analytics]
06:17 <elukey> kill application application_1615988861843_158645 to free space on analytics1070 [analytics]
06:10 <elukey> kill application_1615988861843_158592 on analytics1061 to allow space to recover (the truncate was, of course, stuck in D state) [analytics]
06:05 <elukey> truncate logs for application_1615988861843_158592 on analytics1061 - one partition full [analytics]
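Truncating instead of deleting is deliberate: the container still holds the log file open, so rm would not return the space until the process exits, while truncate -s 0 frees it immediately. A sketch, with the NodeManager log directory as an assumption:

    # locate the oversized container logs for the application (log root is an assumption)
    sudo find /var/log/hadoop-yarn -path '*application_1615988861843_158592*' -size +1G -exec ls -lh {} \;
    # zero them in place so the open file handle stops pinning disk space
    sudo find /var/log/hadoop-yarn -path '*application_1615988861843_158592*' -size +1G -exec truncate -s 0 {} \;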
2021-04-12 §
14:21 <ottomata> stop using http proxies for produce_canary_events_job - T274951 [analytics]
2021-04-08 §
16:33 <elukey> reboot an-worker1100 again to check if all the disks come up correctly [analytics]
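A sketch of the post-reboot check implied here, confirming the data disks came back and scanning the kernel log for controller/BBU complaints; the mount-point pattern and match strings are assumptions:

    # confirm every data disk is present and mounted
    lsblk -o NAME,SIZE,TYPE,MOUNTPOINT
    df -h | grep -c '/var/lib/hadoop/data'    # expect the usual number of datanode mounts
    # look for RAID controller or battery complaints since boot
    sudo dmesg -T | grep -iE 'megaraid|bbu|error'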
15:43 <razzi> rebalance kafka partitions for webrequest_text partitions 17, 18 [analytics]
15:35 <elukey> reboot an-worker1100 to see if it helps with the strange BBU behavior in T279475 [analytics]
14:07 <elukey> drop /var/spool/rsyslog from stat1008 - queue files corrupted by the full root partition were causing rsyslog to SEGV [analytics]
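Resetting the rsyslog disk-queue spool after a root-partition fill is a standard recovery; a minimal sketch, assuming the corrupted queue files sit directly under /var/spool/rsyslog:

    sudo systemctl stop rsyslog
    # remove the corrupted disk-queue files that made rsyslogd segfault
    sudo rm -rf /var/spool/rsyslog/*
    sudo systemctl start rsyslog
    # confirm it stays up
    sudo journalctl -u rsyslog --since '10 minutes ago' | tail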
11:14 <hnowlan> created aqs user and loaded full schemas into analytics wmcs cassandra [analytics]
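A sketch of what creating the aqs role and loading the schemas typically looks like with cqlsh; host, superuser credentials, role password, and schema file path are all placeholders:

    # create the application role (placeholder password)
    cqlsh CASSANDRA_HOST -u cassandra -p PLACEHOLDER \
      -e "CREATE ROLE aqs WITH PASSWORD = 'CHANGEME' AND LOGIN = true;"
    # load the AQS keyspaces and tables from a schema dump (path is a placeholder)
    cqlsh CASSANDRA_HOST -u cassandra -p PLACEHOLDER -f aqs_schema.cql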
08:35 <elukey> apt-get clean on stat1008 to free some space [analytics]
07:44 <elukey> restart hadoop hdfs masters on an-master100[1,2] to apply the new log4j settings for the audit log [analytics]
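Restarting the HDFS masters to apply a log4j change is done one NameNode at a time so an active NameNode is always up; a sketch, with the HA service IDs as placeholders (the real ones live in hdfs-site.xml):

    # check which NameNode is active and which is standby (NN_ID_1/NN_ID_2 are placeholders)
    sudo -u hdfs hdfs haadmin -getServiceState NN_ID_1
    sudo -u hdfs hdfs haadmin -getServiceState NN_ID_2
    # restart the standby first, wait for it to settle, then restart the (former) active
    sudo systemctl restart hadoop-hdfs-namenode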
06:44 <elukey> re-deployed refinery to hadoop-test after fixing permissions on an-test-coord1001 [analytics]
2021-04-07 §
23:03 <ottomata> installing anaconda-wmf-2020.02~wmf5 on remaining nodes - T279480 [analytics]
22:51 <ottomata> installing anaconda-wmf-2020.02~wmf5 on stat boxes - T279480 [analytics]
22:47 <mforns> finished refinery deployment up to 1dbbd3dfa996d2e970eb1cbc0a63d53040d4e3a3 [analytics]
22:39 <mforns> deployment of refinery via scap to hadoop-test failed with Permission denied: '/srv/deployment/analytics/refinery-cache/.config' (deployment to production went fine) [analytics]
21:44 <mforns> starting refinery deploy up to 1dbbd3dfa996d2e970eb1cbc0a63d53040d4e3a3 [analytics]
21:26 <mforns> deployed refinery-source v0.1.4 [analytics]
21:25 <razzi> sudo apt-get install --reinstall anaconda-wmf on stat1008 [analytics]
20:15 <razzi> rebalance kafka partitions for webrequest_text partitions 15, 16 [analytics]
19:53 <ottomata> upgrade anaconda-wmf everywhere to 2020.02~wmf4 with fixes for T279480 [analytics]
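Fleet-wide package upgrades like this one are usually pushed from a Cumin host in small batches; the host alias below is a placeholder, not a real WMF alias:

    # upgrade the package on all target hosts, five at a time (alias is a placeholder)
    sudo cumin -b 5 'A:ANACONDA_HOSTS' 'apt-get install -y anaconda-wmf'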
14:03 <hnowlan> setting profile::aqs::git_deploy: true in aqs-test1001 hiera config [analytics]
2021-04-06 §
22:34 <razzi> rebalance kafka partitions for webrequest_text partitions 13, 14 [analytics]
09:37 <elukey> reimage an-coord1002 to Debian Buster [analytics]
2021-04-05 §
16:07 <razzi> remove old hive logs on an-coord1001: sudo rm /var/log/hive/hive-*.log.2021-02-* [analytics]
14:54 <razzi> remove empty /var/log/sqoop on an-launcher1002 (logs go in /var/log/refinery); sudo rmdir /var/log/sqoop [analytics]
14:51 <razzi> rebalance kafka partitions for webrequest_text partitions 11, 12 [analytics]
2021-04-02 §
16:28 <razzi> rebalance kafka partitions for webrequest_text partitions 9,10 [analytics]
16:19 <elukey> the whole Hadoop test cluster is now on Debian Buster [analytics]
07:28 <elukey> manual fix for an-worker1080's interface in netbox (xe-4/0/11), moved by mistake to public-1b [analytics]
2021-04-01 §
20:27 <razzi> restore superset_production from backup superset_production_1617306805.sql [analytics]
20:14 <razzi> manually run bash /srv/deployment/analytics/superset/deploy/create_virtualenv.sh as analytics_deploy on an-tool1010, since somehow it didn't run with scap [analytics]
20:01 <razzi> sudo chown -R analytics_deploy:analytics_deploy /srv/deployment/analytics/superset/venv since it's owned by root and needs to be removed upon deployment [analytics]
19:54 <razzi> dump superset production to an-coord1001.eqiad.wmnet:/home/razzi/superset_production_1617306805.sql just in case [analytics]
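The 19:54 dump and the 20:27 restore above bracket the Superset deploy; a sketch of the pair, assuming the metadata database is MariaDB/MySQL named superset_production and that admin socket auth works via sudo (assumptions):

    # take a timestamped dump of the Superset metadata DB before touching the deploy
    ts=$(date +%s)
    sudo mysqldump superset_production > "superset_production_${ts}.sql"
    # roll back to the pre-deploy state if the upgrade misbehaves (filename taken from the log)
    sudo mysql superset_production < superset_production_1617306805.sql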
16:50 <razzi> rebalance kafka partitions for webrequest_text partitions 7 and 8 [analytics]
2021-03-31 §
14:18 <hnowlan> starting copy of large tables from aqs1007 to aqs1011 [analytics]
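The log does not say how the copy was done; purely as a hedged illustration, one common way to move large Cassandra tables between hosts is snapshot, rsync, then sstableloader. Keyspace/table names and paths are placeholders:

    # on aqs1007: snapshot the table so its SSTables are immutable while copying
    nodetool snapshot -t migrate -cf TABLE KEYSPACE
    # ship the snapshot to the destination, keeping a keyspace/table directory layout
    rsync -av /srv/cassandra/data/KEYSPACE/TABLE-*/snapshots/migrate/ \
      aqs1011.eqiad.wmnet:/tmp/load/KEYSPACE/TABLE/
    # on aqs1011: stream the SSTables into the live cluster
    sstableloader -d aqs1011.eqiad.wmnet /tmp/load/KEYSPACE/TABLE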
2021-03-30 §
20:25 <joal> Kill-Restart data_quality_stats-hourly-bundle after deploy [analytics]
20:19 <joal> Deploying refinery onto HDFS [analytics]
19:57 <joal> Deploying refinery using scap [analytics]
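Read bottom-up, the three entries above are the usual refinery train: scap deploy, sync to HDFS, then kill and resubmit the affected Oozie bundle. A sketch, with the invoking user, bundle job ID, and properties path as placeholders:

    # 1. deploy the refinery working copy to the analytics hosts
    cd /srv/deployment/analytics/refinery && scap deploy 'Regular analytics deploy'
    # 2. sync the deployed artifacts to HDFS with refinery's own helper (user is an assumption)
    sudo -u hdfs /srv/deployment/analytics/refinery/bin/refinery-deploy-to-hdfs --verbose --no-dry-run
    # 3. kill the running data_quality_stats-hourly-bundle and resubmit it with the new code
    oozie job -kill BUNDLE_ID
    oozie job -submit -config /path/to/data_quality_stats/bundle.properties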