          
  
    | 2020-09-15
      
    
  | 12:30 | <elukey> | stop timers on an-launcher1002 to drain the cluster and restart an-coord1001's daemons (hive/oozie/presto) | [analytics] | 
            
  | 06:48 | <elukey> | run systemctl reset-failed monitor_refine_eventlogging_legacy_failure_flags.service on an-launcher1002 | [analytics] | 
            
  
    | 2020-09-14
      
    
  | 14:36 | <milimetric> | deployed eventstreams with new KafkaSSE version on staging, eqiad, codfw | [analytics] | 
            
  
    | 2020-09-11
      
    
  | 15:41 | <milimetric> | restarted data quality stats bundles | [analytics] | 
            
  | 01:32 | <milimetric> | deployed small fix for hql of editors_bycountry load job | [analytics] | 
            
  | 00:46 | <milimetric> | deployed refinery source 0.0.136, refinery, and synced to HDFS | [analytics] | 
            
  
    | 2020-09-09
      
    
  | 10:11 | <klausman> | Rebooting stat1005 for clearing GPU status and testing new DKMS driver (T260442) | [analytics] | 
            
  | 07:25 | <elukey> | restart varnishkafka-webrequest on cp5010 and cp5012, delivery reports errors happening since yesterday's network outage | [analytics] | 
            
  
    | 2020-09-04
      
    
  | 18:11 | <milimetric> | aqs deploy went well!  Geoeditors endpoint is live internally, data load job was successful, will submit pull request for public endpoint. | [analytics] | 
            
  | 06:54 | <joal> | Manually restart mediawiki-history-drop-snapshot after hive-partitions/hdfs-folders mismatch fix | [analytics] | 
            
  | 06:08 | <elukey> | reset-failed mediawiki-history-drop-snapshot on an-launcher1002 to clear icinga errors | [analytics] | 
            
  | 01:52 | <milimetric> | aborted aqs deploy due to cassandra error | [analytics] | 
            
  
    | 2020-09-03
      
    
  | 19:15 | <milimetric> | finished deploying refinery and refinery-source, restarting jobs now | [analytics] | 
            
  | 13:59 | <milimetric> | edit-hourly-druid-wf-2020-08 fails consistently | [analytics] | 
            
  | 13:56 | <joal> | Kill-restart mediawiki-history-reduced oozie job into production queue | [analytics] | 
            
  | 13:56 | <joal> | rerun edit-hourly-druid-wf-2020-08 after failed attempt | [analytics] | 
            
  
    | 2020-09-02
      
    
  | 18:24 | <milimetric> | restarting mediawiki history denormalize coordinator in production queue, due to failed 2020-08 run | [analytics] | 
            
  | 08:37 | <elukey> | run kafka preferred-replica-election on jumbo after jumbo1003's reimage to buster | [analytics] | 
            
  
    | 2020-08-31
      
    
  | 13:43 | <elukey> | run kafka preferred-replica-election on Jumbo after jumbo1001's reimage | [analytics] | 
            
  | 07:13 | <elukey> | run kafka preferred-replica-election on Jumbo after jumbo1005's reimage | [analytics] | 
            
  
    | 2020-08-28
      
    
  | 14:25 | <mforns> | deployed pageview whitelist with new wiki: ja.wikivoyage | [analytics] | 
            
  | 14:18 | <elukey> | run kafka preferred-replica-election on jumbo after the reimage of jumbo1006 | [analytics] | 
            
  | 07:21 | <joal> | Manually add ja.wikivoyage to pageview allowlist to prevent alerts | [analytics] | 
            
  
    | 2020-08-27
      
    
  | 19:05 | <mforns> | finished refinery deploy (ref v0.0.134) | [analytics] | 
            
  | 18:41 | <mforns> | starting refinery deploy (ref v0.0.134) | [analytics] | 
            
  | 18:30 | <mforns> | deployed refinery-source v0.0.134 | [analytics] | 
            
  | 13:29 | <elukey> | restart jvm daemons on analytics1042, aqs1004, kafka-jumbo1001 to pick up new openjdk upgrades (canaries) | [analytics] | 
            
  
    | 2020-08-25
      
    
  | 15:47 | <elukey> | restart mariadb@analytics_meta on db1108 to apply a replication filter (exclude superset_staging database from replication) | [analytics] | 
            
  | 06:35 | <elukey> | restart mediawiki-history-drop-snapshot on an-launcher1002 to check that it works | [analytics] | 
            
  
    | 2020-08-24
      
    
  | 06:50 | <joal> | Dropping wikitext-history snapshots 2020-04 and 2020-05 keeping two (2020-06 and 2020-07) to free space in hdfs | [analytics] | 
            
  
    | 2020-08-23
      
    
  | 19:34 | <nuria> | deleted 1.2 TB from hdfs://analytics-hadoop/user/analytics/.Trash/200811000000 | [analytics] | 
            
  | 19:31 | <nuria> | deleted 1.2 TB from hdfs://analytics-hadoop/user/nuria/.Trash/* | [analytics] | 
            
  | 19:26 | <nuria> | deleted 300G from hdfs://analytics-hadoop/user/analytics/.Trash/200814000000 | [analytics] | 
            
  | 19:25 | <nuria> | deleted 1.2 TB from hdfs://analytics-hadoop/user/analytics/.Trash/200808000000 | [analytics] | 
            
  
    | 2020-08-20
      
    
  | 16:49 | <joal> | Kill restart webrequest-load bundle to move it to production queue | [analytics] | 
            
  
    | 2020-08-14
      
    
  | 09:13 | <fdans> | restarting refine to apply T257860 | [analytics] | 
            
  
    | 2020-08-13
      
    
  | 16:13 | <fdans> | restarting webrequest bundle | [analytics] | 
            
  | 14:44 | <fdans> | deploying refinery | [analytics] | 
            
  | 14:13 | <fdans> | updating refinery source symlinks | [analytics] | 
            
  
    | 2020-08-11
      
    
  | 17:36 | <ottomata> | refine with refinery-source 0.0.132 and merge_with_hive_schema_before_read=true - T255818 | [analytics] | 
            
  | 14:52 | <ottomata> | scap deploy refinery to an-launcher1002 to get camus wrapper script changes | [analytics] | 
            
  
    | 2020-08-06
      
    
  | 14:47 | <fdans> | deploying refinery | [analytics] | 
            
  | 08:07 | <elukey> | roll restart druid-brokers (on both clusters) to pick up new changes for monitoring | [analytics] | 
            
  
    | 2020-08-05
      
    
  | 13:04 | <elukey> | restart yarn resource managers on an-master100[12] to pick up new Yarn settings - https://gerrit.wikimedia.org/r/c/operations/puppet/+/618529 | [analytics] | 
            
  | 13:03 | <elukey> | set yarn_scheduler_minimum_allocation_mb = 1 (was zero) in Hadoop to work around a Flink 1.1 issue (namely, it doesn't work if the value is <= 0) | [analytics] | 
            
  | 09:32 | <elukey> | set ticket max renewable lifetime to 7d on all kerberos clients (was zero, the default) | [analytics] | 
            
  
    | 2020-08-04
      
    
  | 08:30 | <elukey> | resume druid-related oozie coordinator jobs via Hue (after druid upgrade) | [analytics] | 
            
  | 08:28 | <elukey> | started netflow kafka supervisor on Druid Analytics (after upgrade) | [analytics] | 
            
  | 08:19 | <elukey> | restore systemd timers for druid jobs on an-launcher1002 (after druid upgrade) | [analytics] | 
            
  | 07:33 | <elukey> | stop systemd timers related to druid on an-launcher1002 | [analytics] |