| 2021-01-08 |
    
  | 15:02 | <andrew@deploy1001> | Started deploy [horizon/deploy@f6c50db]: minor django package upgrades | [production] | 
            
  | 14:51 | <marostegui@cumin1001> | dbctl commit (dc=all): 'db1082 (re)pooling @ 66%: After schema change', diff saved to https://phabricator.wikimedia.org/P13696 and previous config saved to /var/cache/conftool/dbconfig/20210108-145113-root.json | [production] | 
            
  | 14:36 | <marostegui@cumin1001> | dbctl commit (dc=all): 'db1082 (re)pooling @ 33%: After schema change', diff saved to https://phabricator.wikimedia.org/P13695 and previous config saved to /var/cache/conftool/dbconfig/20210108-143610-root.json | [production] | 
            
  | 13:42 | <klausman@cumin2001> | END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on ml-serve2004.codfw.wmnet with reason: REIMAGE | [production] | 
            
  | 13:41 | <klausman@cumin2001> | END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on ml-serve2002.codfw.wmnet with reason: REIMAGE | [production] | 
            
  | 13:39 | <klausman@cumin2001> | END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on ml-serve2003.codfw.wmnet with reason: REIMAGE | [production] | 
            
  | 13:37 | <klausman@cumin2001> | START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve2004.codfw.wmnet with reason: REIMAGE | [production] | 
            
  | 13:37 | <klausman@cumin2001> | START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve2002.codfw.wmnet with reason: REIMAGE | [production] | 
            
  | 13:37 | <klausman@cumin2001> | START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve2003.codfw.wmnet with reason: REIMAGE | [production] | 
            
  | 12:52 | <klausman@cumin2001> | END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on ml-serve2001.codfw.wmnet with reason: REIMAGE | [production] | 
            
  | 12:49 | <klausman@cumin2001> | START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve2001.codfw.wmnet with reason: REIMAGE | [production] | 
            
  | 12:04 | <marostegui@cumin1001> | dbctl commit (dc=all): 'db1141 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P13694 and previous config saved to /var/cache/conftool/dbconfig/20210108-120415-root.json | [production] | 
            
  | 11:49 | <marostegui@cumin1001> | dbctl commit (dc=all): 'db1141 (re)pooling @ 75%: After schema change', diff saved to https://phabricator.wikimedia.org/P13693 and previous config saved to /var/cache/conftool/dbconfig/20210108-114912-root.json | [production] | 
            
  | 11:34 | <marostegui@cumin1001> | dbctl commit (dc=all): 'db1141 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P13692 and previous config saved to /var/cache/conftool/dbconfig/20210108-113408-root.json | [production] | 
            
  | 11:19 | <marostegui@cumin1001> | dbctl commit (dc=all): 'db1141 (re)pooling @ 25%: After schema change', diff saved to https://phabricator.wikimedia.org/P13691 and previous config saved to /var/cache/conftool/dbconfig/20210108-111905-root.json | [production] | 
            
  | 11:17 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Depool db1141', diff saved to https://phabricator.wikimedia.org/P13690 and previous config saved to /var/cache/conftool/dbconfig/20210108-111733-marostegui.json | [production] | 
            
  | 11:13 | <marostegui@cumin1001> | dbctl commit (dc=all): 'db1138 (re)pooling @ 100%: After schema change', diff saved to https://phabricator.wikimedia.org/P13689 and previous config saved to /var/cache/conftool/dbconfig/20210108-111345-root.json | [production] | 
            
  | 10:58 | <marostegui@cumin1001> | dbctl commit (dc=all): 'db1138 (re)pooling @ 75%: After schema change', diff saved to https://phabricator.wikimedia.org/P13688 and previous config saved to /var/cache/conftool/dbconfig/20210108-105842-root.json | [production] | 
            
  | 10:43 | <marostegui@cumin1001> | dbctl commit (dc=all): 'db1138 (re)pooling @ 50%: After schema change', diff saved to https://phabricator.wikimedia.org/P13676 and previous config saved to /var/cache/conftool/dbconfig/20210108-104338-root.json | [production] | 
            
  | 10:38 | <urbanecm@deploy1001> | Synchronized private/PrivateSettings.php: Update T250887 mitigations (duration: 01m 10s) | [production] | 
            
  | 10:28 | <marostegui@cumin1001> | dbctl commit (dc=all): 'db1138 (re)pooling @ 25%: After schema change', diff saved to https://phabricator.wikimedia.org/P13675 and previous config saved to /var/cache/conftool/dbconfig/20210108-102835-root.json | [production] | 
            
  | 10:26 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Depool db1138', diff saved to https://phabricator.wikimedia.org/P13674 and previous config saved to /var/cache/conftool/dbconfig/20210108-102606-marostegui.json | [production] | 
            
  | 10:01 | <elukey> | restart varnishkafka-webrequest on cp5001 - timeouts to kafka-jumbo1001, librdkafka does not seem to be recovering well | [production] | 
            
  | 10:00 | <marostegui@cumin1001> | dbctl commit (dc=all): 'db1085 (re)pooling @ 100%: After cloning db1155:3316', diff saved to https://phabricator.wikimedia.org/P13673 and previous config saved to /var/cache/conftool/dbconfig/20210108-100040-root.json | [production] | 
            
  | 09:45 | <marostegui@cumin1001> | dbctl commit (dc=all): 'db1085 (re)pooling @ 75%: After cloning db1155:3316', diff saved to https://phabricator.wikimedia.org/P13672 and previous config saved to /var/cache/conftool/dbconfig/20210108-094535-root.json | [production] | 
            
  | 09:30 | <marostegui@cumin1001> | dbctl commit (dc=all): 'db1085 (re)pooling @ 50%: After cloning db1155:3316', diff saved to https://phabricator.wikimedia.org/P13671 and previous config saved to /var/cache/conftool/dbconfig/20210108-093032-root.json | [production] | 
            
  | 09:30 | <marostegui> | Restart mysql on db1115 (tendril/dbtree) | [production] | 
            
  | 09:15 | <marostegui@cumin1001> | dbctl commit (dc=all): 'db1085 (re)pooling @ 25%: After cloning db1155:3316', diff saved to https://phabricator.wikimedia.org/P13670 and previous config saved to /var/cache/conftool/dbconfig/20210108-091528-root.json | [production] | 
            
  | 09:08 | <moritzm> | installing libxstream-java security updates on Buster | [production] | 
            
  | 09:01 | <godog> | swift codfw-prod: more weight to ms-be20[58-61] - T269337 | [production] | 
            
  | 08:12 | <marostegui> | Deploy schema change on s4 codfw master - T270187 | [production] | 
            
  | 07:57 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Depool db1082', diff saved to https://phabricator.wikimedia.org/P13669 and previous config saved to /var/cache/conftool/dbconfig/20210108-075714-marostegui.json | [production] | 
            
  | 07:23 | <marostegui> | Deploy schema change on s5 codfw master - T270187 | [production] | 
            
  | 06:33 | <marostegui@cumin1001> | dbctl commit (dc=all): 'Depool db1085 to clone db1155:3316 T268742 ', diff saved to https://phabricator.wikimedia.org/P13666 and previous config saved to /var/cache/conftool/dbconfig/20210108-063301-marostegui.json | [production] | 
            
  | 06:18 | <marostegui> | Deploy schema change on s2 codfw master - T270187 | [production] | 
            
  | 04:59 | <mutante> | mw1266 - restart-php7.2-fpm | [production] | 
            
  | 03:04 | <ryankemper> | [wdqs deploy] Deploy complete, service is healthy. This is done. | [production] | 
            
  | 02:35 | <ryankemper> | [wdqs deploy] Restarting `wdqs-categories` across load-balanced instances, one host at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 45 && systemctl restart wdqs-categories && sleep 45 && pool'` | [production] | 
            
  | 02:35 | <ryankemper> | [wdqs deploy] Restarted `wdqs-categories` across test instances: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'` | [production] | 
            
  | 02:34 | <ryankemper> | [wdqs deploy] Restarted `wdqs-updater` across all instances: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'` | [production] | 
            
  | 02:27 | <ryankemper@deploy1001> | Finished deploy [wdqs/wdqs@b15fc5c]: 0.3.58 (duration: 18m 04s) | [production] | 
            
  | 02:15 | <ryankemper> | [wdqs deploy] Nevermind - the UI failure I mentioned above is transient. Restarting my ssh tunnel seemed to make the problem go away. Proceeding with deploy | [production] | 
            
  | 02:12 | <ryankemper> | [wdqs deploy] While queries run fine, it looks like there might be a UI glitch in this version. Digging in to see if it's transient, but I'll likely be aborting this deploy | [production] | 
            
  | 02:09 | <ryankemper@deploy1001> | Started deploy [wdqs/wdqs@b15fc5c]: 0.3.58 | [production] | 
            
  | 02:09 | <ryankemper> | [wdqs deploy] Tests passing on canary before beginning wdqs deploy, proceeding | [production] | 
            
  | 01:29 | <dzahn@cumin1001> | conftool action : set/pooled=yes; selector: name=mw1267.eqiad.wmnet | [production] | 
            
  | 01:28 | <mutante> | mw1276, mw1277 - first API appservers on buster, now serving traffic, free to depool if any issues | [production] | 
            
  | 01:28 | <dzahn@cumin1001> | conftool action : set/pooled=yes; selector: name=mw1277.eqiad.wmnet | [production] | 
            
  | 01:28 | <dzahn@cumin1001> | conftool action : set/pooled=yes; selector: name=mw1276.eqiad.wmnet | [production] | 
            
  | 01:24 | <mutante> | mw1266 - another buster appserver now serving traffic | [production] |
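
Each entry above follows the same row layout: time, `<user>` or `<user@host>`, message, and `[project]`. A minimal sketch of splitting one row into those fields might look like the following. The names `ROW`, `Entry` and `parse_row` are hypothetical and not part of any Wikimedia tooling, and the pattern assumes the message itself contains no `| [` sequence.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern for one log row: | HH:MM | <user[@host]> | message | [project] |
ROW = re.compile(
    r"^\s*\|\s*(?P<time>\d{2}:\d{2})\s*\|\s*<(?P<who>[^>]+)>\s*\|"
    r"\s*(?P<msg>.*?)\s*\|\s*\[(?P<project>[^\]]+)\]\s*\|?\s*$"
)


class Entry(NamedTuple):
    time: str
    user: str
    host: Optional[str]  # None when the entry was logged without a host
    message: str
    project: str


def parse_row(line: str) -> Optional[Entry]:
    """Return the parsed fields, or None for date headers and blank lines."""
    m = ROW.match(line)
    if m is None:
        return None
    # "user@host" splits into both parts; a bare "user" leaves host empty.
    user, _, host = m.group("who").partition("@")
    return Entry(m.group("time"), user, host or None,
                 m.group("msg"), m.group("project"))
```

Rows logged from a host (for example `<dzahn@cumin1001>`) yield both `user` and `host`, while manual entries such as `<mutante>` yield `host=None`.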