2021-07-20 §
15:57 <razzi> sudo -u hdfs kerberos-run-command hdfs hdfs dfsadmin -saveNamespace [analytics]
15:52 <razzi> sudo -u hdfs kerberos-run-command hdfs hdfs dfsadmin -safemode enter [analytics]
15:37 <razzi> kill yarn applications: for jobId in $(yarn application -list | awk 'NR > 2 { print $1 }'); do yarn application -kill $jobId; done [analytics]
15:08 <razzi> sudo -u yarn kerberos-run-command yarn yarn rmadmin -refreshQueues [analytics]
14:52 <razzi> sudo systemctl stop 'gobblin-*.timer' [analytics]
14:51 <razzi> sudo systemctl stop analytics-reportupdater-logs-rsync.timer [analytics]
14:47 <razzi> Disable jobs on an-launcher1002 (see https://phabricator.wikimedia.org/T278423#7190372) [analytics]
14:46 <razzi> razzi@an-launcher1002:~$ sudo puppet agent --disable 'razzi: upgrade hadoop masters to debian buster' [analytics]
08:32 <mforns> restarted webrequest bundle (messed up a coord when trying to rerun some failed hours) [analytics]
2021-07-17 §
08:54 <elukey> run 'sudo find -type f -name '*.log*' -mtime +30 -delete' on an-coord1001:/var/log/hive to free space (root partition almost filled up) - T279304 [analytics]
2021-07-15 §
16:44 <ottomata> deploying refinery and refinery-source 0.1.15 for refine job fixes - T271232 [analytics]
13:39 <joal> Kill refine_event application_1623774792907_154469 to let manual run finish [analytics]
13:35 <joal> Kill currently running refine job (application_1623774792907_154014) [analytics]
11:20 <joal> Kill stuck refine application [analytics]
2021-07-14 §
17:39 <razzi> sudo cookbook sre.druid.roll-restart-workers public for https://phabricator.wikimedia.org/T283067 [analytics]
00:34 <razzi> razzi@an-test-druid1001:~$ sudo systemctl restart zookeeper [analytics]
00:33 <razzi> razzi@an-test-druid1001:~$ sudo systemctl restart druid-coordinator [analytics]
00:33 <razzi> razzi@an-test-druid1001:~$ sudo systemctl restart druid-broker [analytics]
00:28 <razzi> razzi@an-test-druid1001:~$ sudo systemctl restart druid-middlemanager [analytics]
00:24 <razzi> razzi@an-test-druid1001:~$ sudo systemctl restart druid-overlord [analytics]
00:24 <razzi> razzi@an-test-druid1001:~$ sudo systemctl restart druid-historical [analytics]
2021-07-13 §
19:29 <joal> move /wmf/data/raw/eventlogging --> /wmf/data/raw/eventlogging_camus and drop /wmf/data/raw/eventlogging_legacy/*/year=2021/month=07/day=13/hour=14 [analytics]
19:02 <razzi> razzi@cumin1001:~$ sudo cookbook sre.hadoop.roll-restart-workers analytics [analytics]
13:03 <joal> remove /wmf/gobblin/locks/event_default.lock to unlock gobblin event job [analytics]
2021-07-12 §
18:37 <joal> Move /wmf/data/raw/event to /wmf/data/raw/event_camus and /wmf/data/raw/event_gobblin to /wmf/data/raw/event [analytics]
18:36 <joal> Delete /year=2021/month=07/day=12/hour=14 of gobblin imported events [analytics]
18:17 <ottomata> stopped puppet and refines and imports for event data on an-launcher1002 in prep for gobblin finalization for event_default job [analytics]
12:31 <joal> Rerun failed webrequest hour after having checked that loss was entirely false-positive [analytics]
2021-07-09 §
03:21 <joal> Rerun webrequest descendant jobs for 2021-07-08T10:00 problem [analytics]
2021-07-08 §
17:22 <joal> Deploy refinery to HDFS [analytics]
16:57 <joal> Kill-restart webrequest oozie job after gobblin time-format change [analytics]
16:44 <joal> Deploying refinery to an-launcher and hadoop-test [analytics]
16:05 <joal> Manually add /wmf/data/raw/webrequest/webrequest_text/year=2021/month=7/day=8/hour=9/_IMPORTED [analytics]
2021-07-07 §
17:03 <joal> Deploy refinery to HDFS [analytics]
16:52 <joal> Deploy refinery to an-launcher1002 [analytics]
16:05 <joal> Deploy refinery to test-cluster [analytics]
13:30 <joal> kill-restart webrequest using gobblin data [analytics]
13:12 <ottomata> deploying refinery to an-launcher1002 for webrequest gobblin migration [analytics]
13:09 <joal> Move data for webrequest camus-gobblin migration [analytics]
13:03 <ottomata> disabled camus-webrequest and gobblin-webrequest timer on an-launcher1002 in prep for migration [analytics]
2021-07-06 §
17:33 <joal> Deploy refinery onto HDFS [analytics]
16:41 <joal> Deploy refinery for gobblin [analytics]
16:03 <joal> Kill webrequest_test oozie job [analytics]
15:55 <joal> Drop and recreate wmf_raw.webrequest table [analytics]
15:52 <joal> Moved camus and gobblin data for webrequest on analytics-test-hadoop [analytics]
15:48 <ottomata> deploying refinery to test cluster for webrequest_test gobblin job [analytics]
14:16 <ottomata> restarted aqs for July mw history snapshot deploy [analytics]
13:29 <joal> Run first manual empty job for webrequest_test on analytics-test-hadoop [analytics]
13:29 <joal> Clean gobblin state_store and data before starting webrequest_test on analytics-test-hadoop [analytics]
2021-07-03 §
19:57 <joal> rerun learning-features-actor-hourly-wf-2021-7-2-11 [analytics]