| 2020-09-04 |
    
  | 20:30 | <cmjohnson@cumin1001> | START - Cookbook sre.hosts.downtime | [production] | 
            
  | 20:05 | <cmjohnson@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | [production] | 
            
  | 20:04 | <cmjohnson@cumin1001> | END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) | [production] | 
            
  | 20:03 | <cmjohnson@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | [production] | 
            
  | 20:01 | <cmjohnson@cumin1001> | START - Cookbook sre.hosts.downtime | [production] | 
            
  | 20:01 | <cmjohnson@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | [production] | 
            
  | 20:00 | <cmjohnson@cumin1001> | START - Cookbook sre.hosts.downtime | [production] | 
            
  | 19:59 | <cmjohnson@cumin1001> | START - Cookbook sre.hosts.downtime | [production] | 
            
  | 19:59 | <cmjohnson@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | [production] | 
            
  | 19:57 | <cmjohnson@cumin1001> | START - Cookbook sre.hosts.downtime | [production] | 
            
  | 19:57 | <cmjohnson@cumin1001> | START - Cookbook sre.hosts.downtime | [production] | 
            
  | 19:22 | <mutante> | Icinga - ACKing with sticky - alerts on test and dev hosts | [production] | 
            
  | 18:10 | <milimetric@deploy1001> | Finished deploy [analytics/aqs/deploy@95d6432]: AQS: new editors by country endpoint, low risk so trying on a Friday with SRE blessing (duration: 07m 35s) | [production] | 
            
  | 18:02 | <milimetric@deploy1001> | Started deploy [analytics/aqs/deploy@95d6432]: AQS: new editors by country endpoint, low risk so trying on a Friday with SRE blessing | [production] | 
            
  | 10:31 | <elukey@cumin1001> | END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) | [production] | 
            
  | 10:29 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Depool db1087 for MCR schema change', diff saved to https://phabricator.wikimedia.org/P12492 and previous config saved to /var/cache/conftool/dbconfig/20200904-102955-marostegui.json | [production] | 
            
  | 10:28 | <marostegui> | Deploy MCR schema change on db1087 (sanitarium master), this will generate lag (probably a few days) on s8 labsdb hosts T238966 | [production] | 
            
  | 09:48 | <marostegui> | Restart prometheus-mysqld-exporter on db2125 | [production] | 
            
  | 09:11 | <elukey@cumin1001> | START - Cookbook sre.hadoop.roll-restart-workers | [production] | 
            
  | 08:58 | <elukey@cumin1001> | END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) | [production] | 
            
  | 08:31 | <elukey@cumin1001> | START - Cookbook sre.hadoop.roll-restart-workers | [production] | 
            
  | 08:29 | <elukey> | roll restart of the hadoop workers (test and analytics cluster) for openjdk upgrades | [production] | 
            
  | 08:08 | <moritzm> | installing 4.19.132 kernel on buster systems (only installing the deb, reboots separately) | [production] | 
            
  | 07:30 | <moritzm> | installing 4.9.228 kernel on stretch systems (only installing the deb, reboots separately) | [production] | 
            
  | 05:13 | <marostegui> | Deploy MCR schema change on s4 eqiad master T238966 | [production] | 
            
  | 01:51 | <milimetric@deploy1001> | Finished deploy [analytics/aqs/deploy@95d6432]: AQS: Deploying new geoeditors endpoints (duration: 63m 18s) | [production] | 
            
  | 01:35 | <pt1979@cumin2001> | END (PASS) - Cookbook sre.dns.netbox (exit_code=0) | [production] | 
            
  | 01:30 | <pt1979@cumin2001> | START - Cookbook sre.dns.netbox | [production] | 
            
  | 01:23 | <ryankemper> | (Following the restart of blazegraph, service has been restored to `wdqs2003`. See https://grafana.wikimedia.org/d/000000489/wikidata-query-service?orgId=1&var-cluster_name=wdqs&from=1599182219699&to=1599182547699) | [production] | 
            
  | 01:16 | <ryankemper> | Glancing at https://grafana.wikimedia.org/d/000000489/wikidata-query-service?orgId=1&var-cluster_name=wdqs&from=1599170628749&to=1599182011243, looks like `wdqs2003`'s blazegraph isn't happy based off the null data entries. Restarting blazegraph: `ryankemper@wdqs2003:~$ sudo systemctl restart wdqs-blazegraph` | [production] | 
            
  | 00:48 | <milimetric@deploy1001> | Started deploy [analytics/aqs/deploy@95d6432]: AQS: Deploying new geoeditors endpoints | [production] | 
            
  
    | 2020-09-03 |
    
  | 23:31 | <urbanecm@deploy1001> | Synchronized wmf-config/InitialiseSettings.php: 93947391e97be11a9cd7eb4713b274b05d5b371a: Start logging log-ins on select wikis (T253802) (duration: 00m 56s) | [production] | 
            
  | 21:18 | <cmjohnson@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | [production] | 
            
  | 21:15 | <cmjohnson@cumin1001> | START - Cookbook sre.hosts.downtime | [production] | 
            
  | 19:55 | <milimetric@deploy1001> | deploy aborted: AQS: Deploying new geoeditors endpoints (duration: 00m 13s) | [production] | 
            
  | 19:54 | <milimetric@deploy1001> | Started deploy [analytics/aqs/deploy@95d6432]: AQS: Deploying new geoeditors endpoints | [production] | 
            
  | 19:07 | <milimetric@deploy1001> | Finished deploy [analytics/refinery@e4d5149] (thin): Regular analytics weekly train THIN [analytics/refinery@e4d5149] (duration: 00m 08s) | [production] | 
            
  | 19:07 | <milimetric@deploy1001> | Started deploy [analytics/refinery@e4d5149] (thin): Regular analytics weekly train THIN [analytics/refinery@e4d5149] | [production] | 
            
  | 19:06 | <milimetric@deploy1001> | Finished deploy [analytics/refinery@e4d5149]: Regular analytics weekly train [analytics/refinery@e4d5149] (duration: 09m 06s) | [production] | 
            
  | 18:57 | <milimetric@deploy1001> | Started deploy [analytics/refinery@e4d5149]: Regular analytics weekly train [analytics/refinery@e4d5149] | [production] | 
            
  | 17:50 | <cmjohnson@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | [production] | 
            
  | 17:48 | <cmjohnson@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | [production] | 
            
  | 17:47 | <cmjohnson@cumin1001> | START - Cookbook sre.hosts.downtime | [production] | 
            
  | 17:46 | <cmjohnson@cumin1001> | END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) | [production] | 
            
  | 17:46 | <cmjohnson@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | [production] | 
            
  | 17:45 | <cmjohnson@cumin1001> | START - Cookbook sre.hosts.downtime | [production] | 
            
  | 17:44 | <cmjohnson@cumin1001> | START - Cookbook sre.hosts.downtime | [production] | 
            
  | 17:43 | <cmjohnson@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) | [production] | 
            
  | 17:43 | <cmjohnson@cumin1001> | START - Cookbook sre.hosts.downtime | [production] | 
            
  | 17:41 | <cmjohnson@cumin1001> | START - Cookbook sre.hosts.downtime | [production] |