2021-07-12 §
18:36 <joal> Delete /year=2021/month=07/day=12/hour=14 of gobblin imported events [analytics]
18:17 <ottomata> stopped puppet and refines and imports for event data on an-launcher1002 in prep for gobblin finalization for event_default job [analytics]
12:31 <joal> Rerun failed webrequest hour after having checked that loss was entirely false-positive [analytics]
2021-07-09 §
03:21 <joal> Rerun webrequest descendant jobs for 2021-07-08T10:00 problem [analytics]
2021-07-08 §
17:22 <joal> Deploy refinery to HDFS [analytics]
16:57 <joal> Kill-restart webrequest oozie job after gobblin time-format change [analytics]
16:44 <joal> Deploying refinery to an-launcher and hadoop-test [analytics]
16:05 <joal> Manually add /wmf/data/raw/webrequest/webrequest_text/year=2021/month=7/day=8/hour=9/_IMPORTED [analytics]
2021-07-07 §
17:03 <joal> Deploy refinery to HDFS [analytics]
16:52 <joal> Deploy refinery to an-launcher1002 [analytics]
16:05 <joal> Deploy refinery to test-cluster [analytics]
13:30 <joal> kill-restart webrequest using gobblin data [analytics]
13:12 <ottomata> deploying refinery to an-launcher1002 for webrequest gobblin migration [analytics]
13:09 <joal> Move data for webrequest camus-gobblin migration [analytics]
13:03 <ottomata> disabled camus-webrequest and gobblin-webrequest timer on an-launcher1002 in prep for migration [analytics]
2021-07-06 §
17:33 <joal> Deploy refinery onto HDFS [analytics]
16:41 <joal> Deploy refinery for gobblin [analytics]
16:03 <joal> Kill webrequest_test oozie job [analytics]
15:55 <joal> Drop and recreate wmf_raw.webrequest table [analytics]
15:52 <joal> Moved camus and gobblin data for webrequest on analytics-test-hadoop [analytics]
15:48 <ottomata> deploying refinery to test cluster for webrequest_test gobblin job [analytics]
14:16 <ottomata> restarted aqs for july mw history snapshot deploy [analytics]
13:29 <joal> Run first manual empty job for webrequest_test on analytics-test-hadoop [analytics]
13:29 <joal> Clean gobblin state_store and data before starting webrequest_test on analytics-test-hadoop [analytics]
2021-07-03 §
19:57 <joal> rerun learning-features-actor-hourly-wf-2021-7-2-11 [analytics]
2021-07-02 §
13:47 <joal> Reset failed timer refinery-sqoop-mediawiki-private.service [analytics]
12:21 <joal> Replacing failed data with successful data generated when testing https://gerrit.wikimedia.org/r/702877 - wmf_raw.mediawiki_private_cu_changes [analytics]
00:04 <razzi> razzi@an-coord1002:~$ sudo mount -a [analytics]
00:04 <razzi> razzi@an-coord1002:~$ sudo umount /mnt/hdfs [analytics]
00:03 <razzi> razzi@an-coord1002:~$ sudo systemctl restart hive-metastore.service [analytics]
00:02 <razzi> razzi@an-coord1002:~$ sudo systemctl restart hive-server2.service [analytics]
2021-07-01 §
18:56 <razzi> razzi@authdns1001:~$ sudo authdns-update [analytics]
18:19 <razzi> razzi@an-coord1001:~$ sudo mount -a [analytics]
18:18 <razzi> razzi@an-coord1001:~$ sudo umount /mnt/hdfs [analytics]
18:17 <razzi> razzi@an-coord1001:~$ sudo systemctl restart presto-server.service [analytics]
18:16 <razzi> razzi@an-coord1001:~$ sudo systemctl restart hive-metastore.service [analytics]
18:16 <razzi> sudo systemctl restart hive-server2.service [analytics]
18:15 <razzi> sudo systemctl restart oozie on an-coord1001 for https://phabricator.wikimedia.org/T283067 [analytics]
16:38 <razzi> sudo authdns-update on ns0.wikimedia.org to apply https://gerrit.wikimedia.org/r/c/operations/dns/+/702689 [analytics]
2021-06-30 §
18:19 <razzi> unmount and remount /mnt/hdfs on an-test-client1001 for java security update [analytics]
2021-06-29 §
22:55 <razzi> sudo systemctl restart hive-server2 on an-test-coord1001.eqiad.wmnet for T283067 [analytics]
22:53 <razzi> sudo systemctl restart hive-metastore on an-test-coord1001.eqiad.wmnet for T283067 [analytics]
22:52 <razzi> sudo systemctl restart presto-server on an-test-coord1001.eqiad.wmnet for T283067 [analytics]
22:51 <razzi> sudo systemctl restart oozie on an-test-coord1001.eqiad.wmnet for T283067 [analytics]
13:31 <ottomata> deploying refinery for weekly train [analytics]
2021-06-28 §
17:00 <elukey> apt-get reinstall llvm-gpu on stat100[5-8] - T285495 [analytics]
2021-06-25 §
08:01 <elukey> reboot an-worker1101 to unblock stuck GPU [analytics]
07:57 <elukey> execute "sudo /opt/rocm/bin/rocm-smi --gpureset -d 1" on an-worker1101 as an attempt to unblock the GPU [analytics]
2021-06-24 §
06:38 <elukey> drop hieradata/role/common/analytics_cluster/superset.yaml from puppet private repo (unused config, all the values duplicated in the new hiera config) [analytics]
06:34 <elukey> rename superset hiera role configs in puppet private repo (to match the role change done recently) + superset restart [analytics]