2021-07-14 §
17:39 <razzi> sudo cookbook sre.druid.roll-restart-workers public for https://phabricator.wikimedia.org/T283067 [analytics]
00:34 <razzi> razzi@an-test-druid1001:~$ sudo systemctl restart zookeeper [analytics]
00:33 <razzi> razzi@an-test-druid1001:~$ sudo systemctl restart druid-coordinator [analytics]
00:33 <razzi> razzi@an-test-druid1001:~$ sudo systemctl restart druid-broker [analytics]
00:28 <razzi> razzi@an-test-druid1001:~$ sudo systemctl restart druid-middlemanager [analytics]
00:24 <razzi> razzi@an-test-druid1001:~$ sudo systemctl restart druid-overlord [analytics]
00:24 <razzi> razzi@an-test-druid1001:~$ sudo systemctl restart druid-historical [analytics]
2021-07-13 §
19:29 <joal> move /wmf/data/raw/eventlogging --> /wmf/data/raw/eventlogging_camus and drop /wmf/data/raw/eventlogging_legacy/*/year=2021/month=07/day=13/hour=14 [analytics]
19:02 <razzi> razzi@cumin1001:~$ sudo cookbook sre.hadoop.roll-restart-workers analytics [analytics]
13:03 <joal> remove /wmf/gobblin/locks/event_default.lock to unlock gobblin event job [analytics]
2021-07-12 §
18:37 <joal> Move /wmf/data/raw/event to /wmf/data/raw/event_camus and /wmf/data/raw/event_gobblin to /wmf/data/raw/event [analytics]
18:36 <joal> Delete /year=2021/month=07/day=12/hour=14 of gobblin imported events [analytics]
18:17 <ottomata> stopped puppet and refines and imports for event data on an-launcher1002 in prep for gobblin finalization for event_default job [analytics]
12:31 <joal> Rerun failed webrequest hour after having checked that loss was entirely false-positive [analytics]
2021-07-09 §
03:21 <joal> Rerun webrequest descendant jobs for 2021-07-08T10:00 problem [analytics]
2021-07-08 §
17:22 <joal> Deploy refinery to HDFS [analytics]
16:57 <joal> Kill-restart webrequest oozie job after gobblin time-format change [analytics]
16:44 <joal> Deploying refinery to an-launcher and hadoop-test [analytics]
16:05 <joal> Manually add /wmf/data/raw/webrequest/webrequest_text/year=2021/month=7/day=8/hour=9/_IMPORTED [analytics]
2021-07-07 §
17:03 <joal> Deploy refinery to HDFS [analytics]
16:52 <joal> Deploy refinery to an-launcher1002 [analytics]
16:05 <joal> Deploy refinery to test-cluster [analytics]
13:30 <joal> kill-restart webrequest using gobblin data [analytics]
13:12 <ottomata> deploying refinery to an-launcher1002 for webrequest gobblin migration [analytics]
13:09 <joal> Move data for webrequest camus-gobblin migration [analytics]
13:03 <ottomata> disabled camus-webrequest and gobblin-webrequest timer on an-launcher1002 in prep for migration [analytics]
2021-07-06 §
17:33 <joal> Deploy refinery onto HDFS [analytics]
16:41 <joal> Deploy refinery for gobblin [analytics]
16:03 <joal> Kill webrequest_test oozie job [analytics]
15:55 <joal> Drop and recreate wmf_raw.webrequest table [analytics]
15:52 <joal> Moved camus and gobblin data for webrequest on analytics-test-hadoop [analytics]
15:48 <ottomata> deploying refinery to test cluster for webrequest_test gobblin job [analytics]
14:16 <ottomata> restarted aqs for July mw history snapshot deploy [analytics]
13:29 <joal> Run first manual empty job for webrequest_test on analytics-test-hadoop [analytics]
13:29 <joal> Clean gobblin state_store and data before starting webrequest_test on analytics-test-hadoop [analytics]
2021-07-03 §
19:57 <joal> rerun learning-features-actor-hourly-wf-2021-7-2-11 [analytics]
2021-07-02 §
13:47 <joal> Reset failed timer refinery-sqoop-mediawiki-private.service [analytics]
12:21 <joal> Replacing failed data with successful data generated when testing https://gerrit.wikimedia.org/r/702877 - wmf_raw.mediawiki_private_cu_changes [analytics]
00:04 <razzi> razzi@an-coord1002:~$ sudo mount -a [analytics]
00:04 <razzi> razzi@an-coord1002:~$ sudo umount /mnt/hdfs [analytics]
00:03 <razzi> razzi@an-coord1002:~$ sudo systemctl restart hive-metastore.service [analytics]
00:02 <razzi> razzi@an-coord1002:~$ sudo systemctl restart hive-server2.service [analytics]
2021-07-01 §
18:56 <razzi> razzi@authdns1001:~$ sudo authdns-update [analytics]
18:19 <razzi> razzi@an-coord1001:~$ sudo mount -a [analytics]
18:18 <razzi> razzi@an-coord1001:~$ sudo umount /mnt/hdfs [analytics]
18:17 <razzi> razzi@an-coord1001:~$ sudo systemctl restart presto-server.service [analytics]
18:16 <razzi> razzi@an-coord1001:~$ sudo systemctl restart hive-metastore.service [analytics]
18:16 <razzi> sudo systemctl restart hive-server2.service [analytics]
18:15 <razzi> sudo systemctl restart oozie on an-coord1001 for https://phabricator.wikimedia.org/T283067 [analytics]
16:38 <razzi> sudo authdns-update on ns0.wikimedia.org to apply https://gerrit.wikimedia.org/r/c/operations/dns/+/702689 [analytics]