analytics SAL


2022-05-31 §
17:47	<btullis>	starting namenode services on an-master1001	[analytics]
17:44	<btullis>	restarting the datanodes on all five of the affected hadoop workers.	[analytics]
17:43	<btullis>	restarting journalnode service on each of the five hadoop workers with journals.	[analytics]
17:41	<btullis>	resizing each journalnode with resize2fs	[analytics]
17:38	<btullis>	sudo lvresize -L+20G analytics1069-vg/journalnode	[analytics]
17:38	<btullis>	increasing each of the hadoop journalnodes by 20 GB	[analytics]
17:33	<ottomata>	stop journalnodes and datanodes on 5 hadoop journalnode hosts	[analytics]
17:30	<btullis>	stopped the hdfs-namenode service on an-master100[1-2]	[analytics]
15:36	<milimetric>	dropped razzi databases and deleted HDFS directories (in trash)	[analytics]
06:26	<elukey>	`elukey@an-master1001:~$ sudo systemctl reset-failed hadoop-clean-fairscheduler-event-logs.service`	[analytics]
2022-05-30 §
20:19	<SandraEbele>	Restarted oozie job pageview-druid-daily-coord	[analytics]
11:28	<joal>	deploy airflow spark3 aqs_hourly	[analytics]
2022-05-25 §
21:09	<joal>	Resume aqs_hourly job in airflow test	[analytics]
20:33	<joal>	Pausing aqs_hourly job in airflow test until we fix the spark3 issue	[analytics]
06:20	<elukey>	`elukey@an-tool1011:~$ sudo systemctl reset-failed ifup@ens13.service` - T273026	[analytics]
2022-05-24 §
19:54	<SandraEbele>	Deployed refinery using scap, then deployed onto hdfs successfully.	[analytics]
18:34	<SandraEbele>	Deploying refinery, regular weekly deployment	[analytics]
13:18	<joal>	Release refinery-source v0.2.0 to archiva	[analytics]
10:21	<btullis>	restarted hadoop-yarn-nodemanager on an-worker1139	[analytics]
2022-05-23 §
18:27	<mforns>	killed mobile_apps-session_metrics-coord (Airflow job is taking over)	[analytics]
2022-05-21 §
15:52	<joal>	Kill yarn app application_1651744501826_83884 in order to prevent the HDFS alerts	[analytics]
2022-05-19 §
16:59	<ottomata>	deploying airflow-dags analytics with new artifact names, first clearing artifacts cache dir - T307115	[analytics]
2022-05-18 §
10:57	<btullis>	upgrading datahub to version 0.8.34	[analytics]
2022-05-17 §
21:32	<razzi>	sudo systemctl reset-failed ifup@ens13.service on an-tool1007	[analytics]
08:54	<btullis>	booted an-tool1007 from network to begin buster upgrade	[analytics]
2022-05-12 §
14:49	<razzi>	undo the 2 previous confctl changes to repool dbproxy1019 to wikireplicas-b only	[analytics]
14:35	<razzi>	razzi@cumin1001:~$ sudo confctl select service=wikireplicas-a,name=dbproxy1019.eqiad.wmnet set/pooled=yes # for T298940	[analytics]
2022-05-11 §
18:20	<razzi>	disregard the above log; wrote out the command but then saw there was a warning for cr2-eqiad	[analytics]
18:15	<razzi>	razzi@lvs1019:~$ systemctl stop pybal.service to apply change https://gerrit.wikimedia.org/r/c/operations/puppet/+/779915	[analytics]
18:06	<razzi>	razzi@lvs1020:~$ systemctl stop pybal.service to apply change https://gerrit.wikimedia.org/r/c/operations/puppet/+/779915	[analytics]
13:29	<mforns>	restarted oozie jobs after deployment: mediarequest_top_files, pageview_top_articles, unique_devices_per_domain_monthly, unique_devices_per_project_family_monthly	[analytics]
2022-05-10 §
20:32	<mforns>	finished refinery deploy (regular weekly train)	[analytics]
19:34	<mforns>	starting refinery deploy (regular weekly train)	[analytics]
2022-05-09 §
15:06	<SandraEbele>	killed 'apis-coord' oozie job and started corresponding airflow job 'apis_metrics_to_graphite'	[analytics]
2022-05-06 §
09:11	<joal>	kill cassandra-monthly-wf-local_group_default_T_mediarequest_top_files-2022-4 again	[analytics]
08:44	<joal>	Rerun cassandra-monthly-wf-local_group_default_T_mediarequest_top_files-2022-4 with SRE watching network	[analytics]
08:29	<joal>	kill cassandra-monthly-wf-local_group_default_T_mediarequest_top_files-2022-4 as it was probably saturating network	[analytics]
2022-05-05 §
18:53	<btullis>	restarting airflow-scheduler@platform_eng.service on an-airflow1003	[analytics]
18:53	<btullis>	restarted airflow-scheduler@research.service on an-airflow1002	[analytics]
18:49	<btullis>	restarting airflow-scheduler@analytics service on an-launcher1002	[analytics]
12:26	<aqu>	Regular analytics weekly train [analytics/refinery@cc4b2bd]	[analytics]
09:53	<btullis>	roll-restarting hadoop masters to pick up new heap size	[analytics]
09:16	<btullis>	re-enabling gobblin jobs now	[analytics]
09:15	<btullis>	restarting failed eventlogging_to_druid_ services on an-launcher1002	[analytics]
09:00	<btullis>	restarting an-coord1001	[analytics]
08:53	<btullis>	stopping oozie on an-coord1001	[analytics]
2022-05-04 §
08:47	<btullis>	rebooting an-coord1002 to pick up new kernel	[analytics]
2022-05-03 §
18:24	<razzi>	remove /etc/apache2/sites-available/50-superset-wikimedia-org.conf from an-tool1005 (superset staging) since it was removed from puppet but has no ensure: absent	[analytics]
2022-04-27 §
19:37	<ottomata>	restarting airflow services on all airflow instances after installing updated airflow debian package	[analytics]
2022-04-26 §
19:02	<aqu>	About to deploy analytics/refinery: Weekly deployment train + Artifacts to 0.1.27	[analytics]