analytics SAL


2023-08-09 §
11:08	<btullis>	starting hadoop-hdfs-namenode.service on an-master1002	[analytics]
11:02	<btullis>	failing over namenode services to an-master1002 so that I can reboot an-master1001	[analytics]
09:49	<btullis>	restarted systemd-timedate service on an-worker1086	[analytics]
2023-08-07 §
17:09	<btullis>	deploying new mediawiki_history snapshot to AQS	[analytics]
2023-08-02 §
20:42	<xcollazo>	deployed latest for Airflow analytics instance.	[analytics]
19:30	<xcollazo>	deploying refinery to try and fix https://lists.wikimedia.org/hyperkitty/list/data-engineering-alerts@lists.wikimedia.org/thread/QKXYMYKMWXGRNYZ77CENA5F2EGA66QQ2/	[analytics]
12:42	<xcollazo>	Redeploy of analytics_product Airflow instance to see if it clears a Spark issue	[analytics]
2023-08-01 §
11:37	<btullis>	ran apt clean on an-tool1009 to free up disk space	[analytics]
06:24	<elukey>	roll restart kafka jumbo brokers to apply new threads settings	[analytics]
2023-07-31 §
19:03	<xcollazo>	Deployed https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/471 for analytics Airflow instance	[analytics]
12:25	<btullis>	upgrading airflow on an-launcher1002 to 2.6.3	[analytics]
2023-07-28 §
19:38	<xcollazo>	Deployed T342926 and https://gitlab.wikimedia.org/repos/data-engineering/airflow-dags/-/merge_requests/469 to analytics Airflow instance	[analytics]
14:34	<milimetric>	deployed a fix for a sqoop typo	[analytics]
2023-07-27 §
18:48	<milimetric>	done deploying some simple stuff to refinery (static files and script comment updates)	[analytics]
2023-07-25 §
09:42	<stevemunene>	powercycle wdqs1013.eqiad.wmnet	[analytics]
2023-07-19 §
16:35	<joal>	Deploy airflow fix for cassandra loading jobs	[analytics]
13:44	<btullis>	restarting hive-server2 and hive-metastore services on an-coord1001 (currently standby)	[analytics]
12:38	<joal>	deploy Airflow analytics dags - Full revamp of cassandra loading jobs	[analytics]
11:22	<jennifer_ebe>	deploying refinery to hdfs	[analytics]
10:57	<jennifer_ebe>	deploying refinery using scap	[analytics]
10:54	<btullis>	migrating hive services to an-coord1002 via DNS for T329716 (to permit restart of hive services on an-coord1001).	[analytics]
10:15	<btullis>	restarting oozie service on an-coord1001 for T329716	[analytics]
10:14	<btullis>	restarting presto-service on an-coord1001 for T329716	[analytics]
10:06	<btullis>	restarting java services on an-test-coord1001 for JVM update	[analytics]
09:13	<btullis>	correction: to an-test-client1002	[analytics]
09:13	<btullis>	deploying airflow-dags for analytics_test to an-test-client1001	[analytics]
2023-07-18 §
13:20	<stevemunene>	deploy airflow-dags to an-test-client1002 T341700	[analytics]
2023-07-17 §
13:34	<elukey>	`kill `pgrep -u appledora`` and `kill `pgrep -u akhatun`` on stat1008 to unblock puppet (offboarded users deletion)	[analytics]
13:32	<btullis>	proceeding to reimage analytics1072 (journalnode, in addition to datanode)	[analytics]
09:31	<btullis>	restarted airflow services on an-test-client1002 in order to pick up new versions	[analytics]
09:19	<btullis>	upgrading airflow on an-test-client1002 to version 2.6.3	[analytics]
2023-07-13 §
20:38	<xcollazo>	deployed Airflow DAGs for analytics instance to pickup T335860	[analytics]
2023-07-12 §
16:26	<btullis>	`sudo cumin A:wikireplicas-all 'maintain-views --replace-all --all-databases --table revision'` for T339037	[analytics]
14:11	<btullis>	roll-restarting zookeeper on druid-public for new JVM version	[analytics]
2023-07-11 §
11:00	<btullis>	Proceeding to upgrade datahub in production	[analytics]
08:59	<btullis>	rebooting kafkamon1003	[analytics]
08:54	<btullis>	`systemctl start burrow-jumbo-eqiad.service` on kafkamon1003 for T341551	[analytics]
2023-07-10 §
14:04	<btullis>	powered on an-worker1145	[analytics]
14:02	<btullis>	powered off an-worker1145 for T341481	[analytics]
10:55	<btullis>	`sudo -u hdfs /usr/bin/hdfs haadmin -failover an-master1002-eqiad-wmnet an-master1001-eqiad-wmnet` on an-master1001	[analytics]
2023-07-07 §
09:56	<btullis>	`sudo systemctl start hadoop-hdfs-namenode.service` on an-master1001	[analytics]
09:28	<stevemunene>	running sre.hadoop.roll-restart-masters to restart the masters and completely remove any reference to analytics[1058-1069] T317861	[analytics]
09:15	<stevemunene>	run puppet on hadoop masters to pick up changes from recently decommissioned hosts	[analytics]
08:12	<elukey>	wipe kafka-test cluster (data + zookeeper config) to start clean after the issue that happened yesterday	[analytics]
2023-07-06 §
14:51	<elukey>	upgraded zookeeper-test1002 to bookworm, but its metadata got re-initialized as well (my bad for this)	[analytics]
14:30	<stevemunene>	decommission analytics1069.eqiad.wmnet T341209	[analytics]
14:19	<stevemunene>	decommission analytics1068.eqiad.wmnet T341208	[analytics]
14:06	<stevemunene>	decommission analytics1067.eqiad.wmnet T341207	[analytics]
13:13	<stevemunene>	decommission analytics1066.eqiad.wmnet T341206	[analytics]
13:02	<stevemunene>	decommission analytics1065.eqiad.wmnet T341205	[analytics]