          
  
    | 2021-12-14
      
      § | 
    
  | 14:25 | <btullis> | btullis@aqs1011:$ sudo systemctl start cassandra-b.service | [analytics] | 
            
  | 12:44 | <joal> | Rerun failed cassandra-hourly-wf-local_group_default_T_pageviews_per_project_v2-2021-12-14-10 | [analytics] | 
            
  | 12:42 | <joal> | Kill late spark cassandra loading job | [analytics] | 
            
  
    | 2021-12-11
      
      § | 
    
  | 10:06 | <elukey> | kill process 2560 on stat1005 to allow puppet to clean up the related user (offboarded) | [analytics] | 
            
  | 10:04 | <elukey> | kill process 2831 on stat1008 to allow puppet to clean up the related user (offboarded) | [analytics] | 
            
  
    | 2021-12-09
      
      § | 
    
  | 11:08 | <btullis> | roll restarting druid historical daemons on analytics cluster T297148 | [analytics] | 
            
  | 10:46 | <btullis> | roll restarting druid brokers on analytics cluster | [analytics] | 
            
  
    | 2021-12-07
      
      § | 
    
  | 20:09 | <ottomata> | deploy wikistats2 with doc updates | [analytics] | 
            
  
    | 2021-12-03
      
      § | 
    
  | 17:36 | <razzi> | restart aqs-next to pick up new mediawiki snapshot: `razzi@cumin1001:~$ sudo cumin A:aqs-next 'systemctl restart aqs'` | [analytics] | 
            
  | 17:36 | <razzi> | restart aqs to pick up new mediawiki snapshot: `razzi@cumin1001:~$ sudo cookbook sre.aqs.roll-restart aqs` | [analytics] | 
            
  | 07:33 | <elukey> | move kafka-test to fixed uid/gid | [analytics] | 
            
  
    | 2021-12-02
      
      § | 
    
  | 20:05 | <ottomata> | restarting pageview-druid-daily-coord (killing 0062888-210701181527401-oozie-oozi-C) - I can't seem to rerun a particular hour, so just starting again from that hour. | [analytics] | 
            
  | 17:57 | <elukey> | drop "EventLogging MySQL" datasource from Superset (not valid anymore) | [analytics] | 
            
  | 17:26 | <joal> | Kill paragon job to prevent more nodemanagers from OOMing | [analytics] | 
            
  
    | 2021-12-01
      
      § | 
    
  | 20:40 | <razzi> | deploy refinery for T296089 patch https://gerrit.wikimedia.org/r/c/analytics/refinery/+/742672 | [analytics] | 
            
  
    | 2021-11-27
      
      § | 
    
  | 09:56 | <elukey> | powercycle analytics1071, soft lockup stacktraces in the tty | [analytics] | 
            
  
    | 2021-11-24
      
      § | 
    
  | 17:30 | <mforns> | Deployed refinery using scap, then deployed onto hdfs | [analytics] | 
            
  | 12:31 | <btullis> | btullis@an-launcher1002:~$ sudo systemctl reset-failed monitor_refine_event_sanitized_analytics_delayed.service | [analytics] | 
            
  | 07:09 | <elukey> | drop /tmp/blockmgr-20fe4b2b-31fb-4a85-b5b1-bebe254120f8 on stat1006 to free space on the root partition | [analytics] | 
            
  
    | 2021-11-23
      
      § | 
    
  | 11:56 | <btullis> | roll-restarting the cassandra services on the aqs cluster. (Not the aqs_next cluster) | [analytics] | 
            
  | 11:49 | <btullis> | btullis@an-coord1001:~$ sudo systemctl restart presto-server.service | [analytics] | 
            
  | 11:49 | <btullis> | btullis@an-coord1001:~$ sudo systemctl restart oozie.service | [analytics] | 
            
  
    | 2021-11-22
      
      § | 
    
  | 12:18 | <btullis> | failed back the hive services to an-coord1001 via CNAME change | [analytics] | 
            
  | 11:36 | <btullis> | btullis@an-coord1001:~$ sudo systemctl restart hive-server2 hive-metastore | [analytics] | 
            
  | 10:44 | <btullis> | deploying DNS change to switch hive to the standby server. | [analytics] | 
            
  | 10:18 | <btullis> | btullis@an-coord1002:~$ sudo systemctl restart hive-server2 hive-metastore | [analytics] | 
            
  
    | 2021-11-18
      
      § | 
    
  | 17:26 | <elukey> | varnishkafka-webrequest on cp3050 is running with /etc/ssl/localcerts/wmf_trusted_root_CAs.pem | [analytics] | 
            
  | 10:03 | <elukey> | restart prometheus-druid-exporter on Druid Analytics to clear unnecessary metrics | [analytics] | 
            
  | 07:32 | <elukey> | restart prometheus-druid-exporter on Druid Public to see metrics difference | [analytics] | 
            
  
    | 2021-11-17
      
      § | 
    
  | 16:01 | <btullis> | roll-restarting kafka-test brokers | [analytics] | 
            
  | 12:12 | <btullis> | roll-restarting the presto analytics workers | [analytics] | 
            
  | 11:44 | <btullis> | btullis@archiva1002:~$ sudo systemctl restart archiva.service | [analytics] | 
            
  | 07:29 | <elukey> | `apt-get clean` on an-tool1005 to free space in the root partition | [analytics] | 
            
  | 07:28 | <elukey> | `sudo pkill -U jmixter` on stat100[5,8] to allow puppet to run and remove the offboarded user | [analytics] | 
            
  
    | 2021-11-16
      
      § | 
    
  | 19:40 | <joal> | Deploying refinery to HDFS | [analytics] | 
            
  | 19:15 | <joal> | Deploying refinery with scap | [analytics] | 
            
  | 18:23 | <joal> | Releasing refinery-source v0.1.21 | [analytics] | 
            
  | 11:32 | <btullis> | btullis@cumin1001:~$ sudo cookbook sre.druid.roll-restart-workers public | [analytics] | 
            
  | 10:20 | <btullis> | roll-restarting hadoop masters | [analytics] | 
            
  
    | 2021-11-15
      
      § | 
    
  | 16:37 | <joal> | Rerun failed mediawiki-wikitext-history-wf-2021-10 | [analytics] | 
            
  
    | 2021-11-11
      
      § | 
    
  | 06:56 | <elukey> | `systemctl start prometheus-mysqld-exporter@analytics_meta` on db1108 | [analytics] | 
            
  
    | 2021-11-10
      
      § | 
    
  | 18:20 | <btullis> | btullis@an-launcher1002:~$ sudo systemctl reset-failed monitor_refine_event_sanitized_analytics_delayed.service | [analytics] | 
            
  | 10:19 | <btullis> | btullis@an-launcher1002:~$ sudo systemctl reset-failed monitor_refine_event_sanitized_analytics_delayed | [analytics] | 
            
  
    | 2021-11-09
      
      § | 
    
  | 16:52 | <razzi> | restart presto server on an-coord1001 to apply change for T292087 | [analytics] | 
            
  | 16:30 | <razzi> | set superset presto version to 0.246 in ui | [analytics] | 
            
  | 16:30 | <razzi> | set superset presto timeout to 170s: {"connect_args":{"session_props":{"query_max_run_time":"170s"}}} for T294771 | [analytics] | 
            
  | 12:23 | <btullis> | btullis@an-launcher1002:~$ sudo systemctl reset-failed monitor_refine_event_sanitized_analytics_delayed | [analytics] | 
            
  | 07:23 | <elukey> | `apt-get clean` on stat1006 to free some space (root partition full) | [analytics] | 
            
  
    | 2021-11-08
      
      § | 
    
  | 19:51 | <ottomata> | an-coord1002: drop user 'admin'@'localhost'; start slave; to fix broken replication  - T284150 | [analytics] | 
            
  | 19:44 | <razzi> | create admin user on an-coord1001 for T284150 | [analytics] |