2020-05-19 §
06:29 <elukey> roll restart zookeeper on druid100[4-6] for openjdk upgrades [analytics]
06:18 <elukey> roll restart zookeeper on druid100[1-3] for openjdk upgrades [analytics]
2020-05-18 §
14:02 <elukey> roll restart of hadoop daemons on the prod cluster for openjdk upgrades [analytics]
13:30 <elukey> roll restart hadoop daemons on the test cluster for openjdk upgrades [analytics]
10:33 <elukey> add an-druid100[1,2] to the Druid Analytics cluster [analytics]
2020-05-15 §
13:23 <elukey> roll restart of the Druid analytics cluster to pick up new openjdk + /srv completed [analytics]
13:15 <elukey> turnilo back to druid1001 [analytics]
13:03 <elukey> move turnilo config to druid1002 to ease druid maintenance [analytics]
12:31 <elukey> move superset config to druid1002 (was druid1003) to ease maintenance [analytics]
09:08 <elukey> restart druid brokers on Analytics Public [analytics]
2020-05-14 §
18:41 <ottomata> fixed TLS authentication for Kafka mirror maker on jumbo - T250250 [analytics]
12:49 <joal> Release 2020-04 mediawiki_history_reduced to public druid for AQS (elukey did it :-P) [analytics]
09:53 <elukey> upgrade matomo to 3.13.3 [analytics]
09:50 <elukey> set matomo in maintenance mode as prep step for upgrade [analytics]
2020-05-13 §
21:36 <elukey> powercycle analytics1055 [analytics]
13:46 <elukey> upgrade spark2 on all stat100x hosts - T250161 [analytics]
06:47 <elukey> upgrade spark2 on stat1004 - canary host - T250161 [analytics]
2020-05-11 §
10:17 <elukey> re-run webrequest-load-wf-text-2020-5-11-9 [analytics]
06:06 <elukey> restart wikimedia-discovery-golden on stat1007 - apparently killed because no memory was left to allocate on the system [analytics]
05:14 <elukey> force re-run of monitor_refine_event_failure_flags after fixing a refine failed hour [analytics]
2020-05-10 §
07:44 <joal> Rerun webrequest-load-wf-upload-2020-5-10-1 [analytics]
2020-05-08 §
21:06 <ottomata> running preferred replica election for kafka-jumbo to get preferred leaders back after reboot of broker earlier today - T252203 [analytics]
15:36 <ottomata> starting kafka broker on kafka-jumbo1006, same issue on other brokers when they are leaders of offending partitions - T252203 [analytics]
15:27 <ottomata> stopping kafka broker on kafka-jumbo1006 to investigate camus import failures - T252203 [analytics]
15:16 <ottomata> restarted turnilo after applying nuria and mforns changes [analytics]
2020-05-07 §
17:39 <ottomata> deploying fix to refinery bin/camus CamusPartitionChecker when using dynamic stream configs [analytics]
16:49 <joal> Restart and babysit mediawiki-history-denormalize-wf-2020-04 [analytics]
16:37 <elukey> roll restart of all the nodemanagers on the hadoop cluster to pick up new jvm settings [analytics]
13:53 <elukey> move stat1007 to role::statistics::explorer (adding jupyterhub) [analytics]
11:00 <joal> Moving application_1583418280867_334532 to the nice queue [analytics]
10:58 <joal> Rerun wikidata-articleplaceholder_metrics-wf-2020-5-6 [analytics]
07:45 <elukey> re-run mediawiki-history-denormalize [analytics]
07:43 <elukey> kill application_1583418280867_333560 after a chat with David, the job is consuming ~2TB of RAM [analytics]
07:32 <elukey> re-run mediawiki history load [analytics]
07:18 <elukey> execute yarn application -movetoqueue application_1583418280867_332862 -queue root.nice [analytics]
07:06 <elukey> restart mediawiki-history-load via hue [analytics]
06:41 <elukey> restart oozie on an-coord1001 [analytics]
05:46 <elukey> re-run mediarequest-hourly-wf-2020-5-6-19 [analytics]
05:35 <elukey> re-run two failed hours for webrequest load text (07/05T05) and upload (06/05T23) [analytics]
05:33 <elukey> restart hadoop yarn nodemanager on analytics1071 [analytics]
2020-05-06 §
12:49 <elukey> restart oozie on an-coord1001 to pick up the new shlib retention changes [analytics]
12:28 <mforns> re-run pageview-druid-hourly-coord for 2020-05-06T06:00:00 after oozie shared lib update [analytics]
11:30 <elukey> use /run/user as kerberos credential cache for stat1005 [analytics]
09:25 <elukey> re-run projectview coordinator for 2020-5-6-5 after oozie shared lib update [analytics]
09:24 <elukey> re-run virtualpageview coordinator for 2020-5-6-5 after oozie shared lib update [analytics]
09:13 <elukey> re-run apis coordinator for 2020-5-6-7 after oozie shared lib update [analytics]
09:11 <elukey> re-run learning features actor coordinator for 2020-5-6-7 after oozie shared lib update [analytics]
09:10 <elukey> re-run aqs-hourly coordinator for 2020-5-6-7 after oozie shared lib update [analytics]
09:09 <elukey> re-run mediacounts coordinator for 2020-5-6-7 after oozie shared lib update [analytics]
09:08 <elukey> re-run mediarequest coordinator for 2020-5-6-7 after oozie shared lib update [analytics]