| 2021-04-28 |
    
  | 04:14 | <ryankemper> | T280382 `sudo -i cookbook sre.wdqs.data-transfer --source wdqs2001.codfw.wmnet --dest wdqs2007.codfw.wmnet --reason "transferring fresh wikidata journal following reimage" --blazegraph_instance blazegraph` on `ryankemper@cumin1001` tmux session `reimage` | [production] | 
            
  | 04:14 | <ryankemper@cumin1001> | START - Cookbook sre.wdqs.data-transfer | [production] | 
            
  | 04:13 | <ryankemper@cumin1001> | END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) | [production] | 
            
  | 04:08 | <ryankemper> | T280382 `sudo -i cookbook sre.wdqs.data-transfer --source wdqs2001.codfw.wmnet --dest wdqs2007.codfw.wmnet --reason "transferring fresh categories journal following reimage" --blazegraph_instance categories` on `ryankemper@cumin1001` tmux session `reimage` | [production] | 
            
  | 04:08 | <marostegui> | Start replication changes, connect everything to db1163 T278214 | [production] | 
            
  | 04:08 | <ryankemper@cumin1001> | START - Cookbook sre.wdqs.data-transfer | [production] | 
            
  | 04:07 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Set db1163 with weight 0 before the switchover T278214', diff saved to https://phabricator.wikimedia.org/P15598 and previous config saved to /var/cache/conftool/dbconfig/20210428-040718-marostegui.json | [production] | 
            
  | 03:53 | <ryankemper@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2007.codfw.wmnet with reason: REIMAGE | [production] | 
            
  | 03:51 | <ryankemper@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2007.codfw.wmnet with reason: REIMAGE | [production] | 
            
  | 03:49 | <ryankemper@puppetmaster1001> | conftool action : set/pooled=no; selector: name=wdqs2007.codfw.wmnet | [production] | 
            
  | 03:48 | <ryankemper@puppetmaster1001> | conftool action : set/pooled=no; selector: name=wdqs1013.eqiad.wmnet | [production] | 
            
  | 03:33 | <ryankemper> | `sudo systemctl restart wdqs-blazegraph` on `wdqs1012` to clear the `WDQS SPARQL` warning | [production] | 
            
  | 03:32 | <ryankemper> | T280382 `sudo -i wmf-auto-reimage-host -p T280382 wdqs2007.codfw.wmnet` on `ryankemper@cumin1001` tmux session `reimage` | [production] | 
            
  | 03:32 | <ryankemper> | T280382 `sudo -i wmf-auto-reimage-host -p T280382 wdqs1013.eqiad.wmnet` on `ryankemper@cumin1001` tmux session `reimage` | [production] | 
            
  | 02:33 | <robh@cumin1001> | END (PASS) - Cookbook sre.dns.netbox (exit_code=0) | [production] | 
            
  | 02:28 | <robh@cumin1001> | START - Cookbook sre.dns.netbox | [production] | 
            
  | 01:06 | <robh@cumin1001> | END (PASS) - Cookbook sre.dns.netbox (exit_code=0) | [production] | 
            
  | 01:00 | <robh@cumin1001> | START - Cookbook sre.dns.netbox | [production] | 
            
  | 00:03 | <robh@cumin1001> | END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on snapshot1015.eqiad.wmnet with reason: REIMAGE | [production] | 
            
  | 00:01 | <robh@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1014.eqiad.wmnet with reason: REIMAGE | [production] | 
            
  
    | 2021-04-27 |
    
  | 23:58 | <robh@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1015.eqiad.wmnet with reason: REIMAGE | [production] | 
            
  | 23:57 | <robh@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1013.eqiad.wmnet with reason: REIMAGE | [production] | 
            
  | 23:57 | <robh@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1014.eqiad.wmnet with reason: REIMAGE | [production] | 
            
  | 23:55 | <robh@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1012.eqiad.wmnet with reason: REIMAGE | [production] | 
            
  | 23:54 | <robh@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1013.eqiad.wmnet with reason: REIMAGE | [production] | 
            
  | 23:53 | <robh@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on snapshot1011.eqiad.wmnet with reason: REIMAGE | [production] | 
            
  | 23:52 | <robh@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1012.eqiad.wmnet with reason: REIMAGE | [production] | 
            
  | 23:51 | <robh@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on snapshot1011.eqiad.wmnet with reason: REIMAGE | [production] | 
            
  | 21:07 | <legoktm@cumin1001> | END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb[2005-2006].codfw.wmnet | [production] | 
            
  | 20:55 | <legoktm@cumin1001> | START - Cookbook sre.hosts.decommission for hosts rdb[2005-2006].codfw.wmnet | [production] | 
            
  | 20:54 | <legoktm@cumin1001> | END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts rdb[2003-2004].codfw.wmnet | [production] | 
            
  | 20:42 | <legoktm@cumin1001> | START - Cookbook sre.hosts.decommission for hosts rdb[2003-2004].codfw.wmnet | [production] | 
            
  | 20:32 | <bblack> | re-pooling codfw public traffic - T279457 | [production] | 
            
  | 20:11 | <jhuneidi@deploy1002> | Synchronized php-1.37.0-wmf.3/includes/rcfeed/IRCColourfulRCFeedFormatter.php: Backport rcfeed: Remove reference assignment (T281226) to 1.37.0-wmf.3 (duration: 01m 12s) | [production] | 
            
  | 20:08 | <herron@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2005.codfw.wmnet with reason: REIMAGE | [production] | 
            
  | 20:06 | <herron@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2005.codfw.wmnet with reason: REIMAGE | [production] | 
            
  | 19:44 | <dzahn@cumin1001> | END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host people1003.eqiad.wmnet | [production] | 
            
  | 19:37 | <herron@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main2004.codfw.wmnet with reason: REIMAGE | [production] | 
            
  | 19:35 | <papaul> | powerdown ms-backup2001 for maintenance | [production] | 
            
  | 19:35 | <herron@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main2004.codfw.wmnet with reason: REIMAGE | [production] | 
            
  | 19:07 | <papaul> | powerdown logstash2035 for maintenance | [production] | 
            
  | 19:03 | <dzahn@cumin1001> | START - Cookbook sre.ganeti.makevm for new host people1003.eqiad.wmnet | [production] | 
            
  | 19:00 | <dzahn@cumin1001> | END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts people1003.eqiad.wmnet | [production] | 
            
  | 18:50 | <mutante> | people1003 - destroying VM and recreating again from scratch to test if issue of no console and no access is repeatable | [production] | 
            
  | 18:50 | <dzahn@cumin1001> | START - Cookbook sre.hosts.decommission for hosts people1003.eqiad.wmnet | [production] | 
            
  | 18:37 | <herron@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-main1005.eqiad.wmnet with reason: REIMAGE | [production] | 
            
  | 18:35 | <herron@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-main1005.eqiad.wmnet with reason: REIMAGE | [production] | 
            
  | 18:33 | <mutante> | people1003 - rebooting, trying to get new VM to work | [production] | 
            
  | 18:33 | <Urbanecm> | Morning B&C window done | [production] | 
            
  | 18:32 | <urbanecm@deploy1002> | Synchronized wmf-config/InitialiseSettings.php: 91a85f2: ac770bf: Enable language in header for office and testwiki users (T280526) (duration: 01m 19s) | [production] |