analytics SAL

2022-06-02 §
07:26	<joal>	Deploy refinery using scap	[analytics]
2022-06-01 §
21:04	<milimetric>	trying to rerun sqoop from a screen on an-launcher	[analytics]
20:09	<SandraEbele>	Successfully deployed refinery using scap, then deployed onto hdfs.	[analytics]
18:51	<SandraEbele>	About to deploy analytics/refinery (regular weekly train)	[analytics]
08:39	<elukey>	powercycle an-worker1094 - OEM event registered in `racadm getsel`, host frozen	[analytics]
2022-05-31 §
18:48	<ottomata>	sudo -u hdfs hdfs dfsadmin -safemode leave on an-master1001	[analytics]
18:12	<ottomata>	sudo service hadoop-hdfs-namenode start on an-master1002	[analytics]
18:10	<ottomata>	sudo -u hdfs hdfs dfsadmin -safemode enter	[analytics]
17:47	<btullis>	starting namenode services on an-master1001	[analytics]
17:44	<btullis>	restarting the datanodes on all five of the affected hadoop workers.	[analytics]
17:43	<btullis>	restarting journalnode service on each of the five hadoop workers with journals.	[analytics]
17:41	<btullis>	resizing each journalnode with resize2fs	[analytics]
17:38	<btullis>	sudo lvresize -L+20G analytics1069-vg/journalnode	[analytics]
17:38	<btullis>	increasing each of the hadoop journalnodes by 20 GB	[analytics]
17:33	<ottomata>	stop journalnodes and datanodes on 5 hadoop journalnode hosts	[analytics]
17:30	<btullis>	stopped the hdfs-namenode service on an-master100[1-2]	[analytics]
15:36	<milimetric>	dropped razzi databases and deleted HDFS directories (in trash)	[analytics]
06:26	<elukey>	`elukey@an-master1001:~$ sudo systemctl reset-failed hadoop-clean-fairscheduler-event-logs.service`	[analytics]
2022-05-30 §
20:19	<SandraEbele>	Restarted oozie job pageview-druid-daily-coord	[analytics]
11:28	<joal>	deploy airflow spark3 aqs_hourly	[analytics]
2022-05-25 §
21:09	<joal>	Resume aqs_hourly job in airflow test	[analytics]
20:33	<joal>	Pausing aqs_hourly job in airflow test until we fix the spark3 issue	[analytics]
06:20	<elukey>	`elukey@an-tool1011:~$ sudo systemctl reset-failed ifup@ens13.service` - T273026	[analytics]
2022-05-24 §
19:54	<SandraEbele>	Deployed refinery using scap, then deployed onto hdfs successfully.	[analytics]
18:34	<SandraEbele>	Deploying refinery, regular weekly deployment	[analytics]
13:18	<joal>	Release refinery-source v0.2.0 to archiva	[analytics]
10:21	<btullis>	restarted hadoop-yarn-nodemanager on an-worker1139	[analytics]
2022-05-23 §
18:27	<mforns>	killed mobile_apps-session_metrics-coord (Airflow job is taking over)	[analytics]
2022-05-21 §
15:52	<joal>	Kill yarn app application_1651744501826_83884 in order to prevent the HDFS alerts	[analytics]
2022-05-19 §
16:59	<ottomata>	deploying airflow-dags analytics with new artifact names, first clearing artifacts cache dir - T307115	[analytics]
2022-05-18 §
10:57	<btullis>	upgrading datahub to version 0.8.34	[analytics]
2022-05-17 §
21:32	<razzi>	sudo systemctl reset-failed ifup@ens13.service on an-tool1007	[analytics]
08:54	<btullis>	booted an-tool1007 from network to begin buster upgrade	[analytics]
2022-05-12 §
14:49	<razzi>	undo the 2 previous confctl changes to repool dbproxy1019 to wikireplicas-b only	[analytics]
14:35	<razzi>	razzi@cumin1001:~$ sudo confctl select service=wikireplicas-a,name=dbproxy1019.eqiad.wmnet set/pooled=yes # for T298940	[analytics]
2022-05-11 §
18:20	<razzi>	disregard the above log; wrote out the command but then saw there was a warning for cr2-eqiad	[analytics]
18:15	<razzi>	razzi@lvs1019:~$ systemctl stop pybal.service to apply change https://gerrit.wikimedia.org/r/c/operations/puppet/+/779915	[analytics]
18:06	<razzi>	razzi@lvs1020:~$ systemctl stop pybal.service to apply change https://gerrit.wikimedia.org/r/c/operations/puppet/+/779915	[analytics]
13:29	<mforns>	restarted oozie jobs after deployment: mediarequest_top_files, pageview_top_articles, unique_devices_per_domain_monthly, unique_devices_per_project_family_monthly	[analytics]
2022-05-10 §
20:32	<mforns>	finished refinery deploy (regular weekly train)	[analytics]
19:34	<mforns>	starting refinery deploy (regular weekly train)	[analytics]
2022-05-09 §
15:06	<SandraEbele>	killed ‘apis-coord’ oozie job and started corresponding airflow job ‘apis_metrics_to_graphite’	[analytics]
2022-05-06 §
09:11	<joal>	kill cassandra-monthly-wf-local_group_default_T_mediarequest_top_files-2022-4 again	[analytics]
08:44	<joal>	Rerun cassandra-monthly-wf-local_group_default_T_mediarequest_top_files-2022-4 with SRE watching network	[analytics]
08:29	<joal>	kill cassandra-monthly-wf-local_group_default_T_mediarequest_top_files-2022-4 as it was probably saturating network	[analytics]
2022-05-05 §
18:53	<btullis>	restarting airflow-scheduler@platform_eng.service on an-airflow1003	[analytics]
18:53	<btullis>	restarted airflow-scheduler@research.service on an-airflow1002	[analytics]
18:49	<btullis>	restarting airflow-scheduler@analytics service on an-launcher1002	[analytics]
12:26	<aqu>	Regular analytics weekly train [analytics/refinery@cc4b2bd]	[analytics]
09:53	<btullis>	roll-restarting hadoop masters to pick up new heap size	[analytics]