| 2023-08-28 |
    
  | 05:57 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'db1178 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P51446 and previous config saved to /var/cache/conftool/dbconfig/20230828-055751-ladsgroup.json | [production] | 
            
  | 05:57 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P51445 and previous config saved to /var/cache/conftool/dbconfig/20230828-055727-ladsgroup.json | [production] | 
            
  | 05:56 | <ladsgroup@deploy1002> | Finished scap: Backport for [[gerrit:952202|Stop writing to old extlinks columns in s4 (T342683)]] (duration: 15m 36s) | [production] | 
            
  | 05:55 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P51444 and previous config saved to /var/cache/conftool/dbconfig/20230828-055539-ladsgroup.json | [production] | 
            
  | 05:50 | <ladsgroup@deploy1002> | ladsgroup: Continuing with sync | [production] | 
            
  | 05:49 | <ladsgroup@deploy1002> | ladsgroup: Backport for [[gerrit:952202|Stop writing to old extlinks columns in s4 (T342683)]] synced to the testservers mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, and mw-debug kubernetes deployment (accessible via k8s-experimental XWD option) | [production] | 
            
  | 05:46 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P51443 and previous config saved to /var/cache/conftool/dbconfig/20230828-054615-ladsgroup.json | [production] | 
            
  | 05:42 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'db1178 (re)pooling @ 75%: Maint over', diff saved to https://phabricator.wikimedia.org/P51442 and previous config saved to /var/cache/conftool/dbconfig/20230828-054247-ladsgroup.json | [production] | 
            
  | 05:42 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Repooling after maintenance db2106 (T344589)', diff saved to https://phabricator.wikimedia.org/P51441 and previous config saved to /var/cache/conftool/dbconfig/20230828-054221-ladsgroup.json | [production] | 
            
  | 05:41 | <marostegui> | failover m5-master to dbproxy1021 | [production] | 
            
  | 05:41 | <ladsgroup@deploy1002> | Started scap: Backport for [[gerrit:952202|Stop writing to old extlinks columns in s4 (T342683)]] | [production] | 
            
  | 05:40 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Repooling after maintenance db1141 (T344589)', diff saved to https://phabricator.wikimedia.org/P51440 and previous config saved to /var/cache/conftool/dbconfig/20230828-054033-ladsgroup.json | [production] | 
            
  | 05:34 | <elukey> | powercycle restbase1027 - stopped publishing metrics days ago, no root tty available in mgmt console | [production] | 
            
  | 05:31 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Repooling after maintenance db1192 (T344589)', diff saved to https://phabricator.wikimedia.org/P51439 and previous config saved to /var/cache/conftool/dbconfig/20230828-053108-ladsgroup.json | [production] | 
            
  | 05:30 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Depooling db2106 (T344589)', diff saved to https://phabricator.wikimedia.org/P51438 and previous config saved to /var/cache/conftool/dbconfig/20230828-053045-ladsgroup.json | [production] | 
            
  | 05:30 | <ladsgroup@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2106.codfw.wmnet with reason: Maintenance | [production] | 
            
  | 05:30 | <elukey> | depool restbase1027 - a lot of ping down events registered, a check up is needed | [production] | 
            
  | 05:30 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2106.codfw.wmnet with reason: Maintenance | [production] | 
            
  | 05:27 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'db1178 (re)pooling @ 25%: Maint over', diff saved to https://phabricator.wikimedia.org/P51437 and previous config saved to /var/cache/conftool/dbconfig/20230828-052742-ladsgroup.json | [production] | 
            
  | 05:27 | <ladsgroup@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance | [production] | 
            
  | 05:27 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance | [production] | 
            
  | 05:26 | <ladsgroup@cumin1001> | END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance | [production] | 
            
  | 05:26 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Depooling db1141 (T344589)', diff saved to https://phabricator.wikimedia.org/P51436 and previous config saved to /var/cache/conftool/dbconfig/20230828-052610-ladsgroup.json | [production] | 
            
  | 05:26 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance | [production] | 
            
  | 05:26 | <ladsgroup@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1141.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 05:25 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1141.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 05:22 | <ladsgroup@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2102.codfw.wmnet with reason: Maintenance | [production] | 
            
  | 05:22 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2102.codfw.wmnet with reason: Maintenance | [production] | 
            
  | 05:13 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Depooling db1192 (T344589)', diff saved to https://phabricator.wikimedia.org/P51435 and previous config saved to /var/cache/conftool/dbconfig/20230828-051349-ladsgroup.json | [production] | 
            
  | 05:13 | <ladsgroup@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1192.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 05:13 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1192.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 05:12 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'db1178 (re)pooling @ 10%: Maint over', diff saved to https://phabricator.wikimedia.org/P51434 and previous config saved to /var/cache/conftool/dbconfig/20230828-051237-ladsgroup.json | [production] | 
            
  
    | 2023-08-25 |
    
  | 21:03 | <inflatador> | bking@cumin1001 shutting off wdqs1005 in preparation for decommission T344198 | [production] | 
            
  | 21:02 | <bking@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on wdqs1005.eqiad.wmnet with reason: to be decommissioned soon | [production] | 
            
  | 21:02 | <bking@cumin1001> | START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on wdqs1005.eqiad.wmnet with reason: to be decommissioned soon | [production] | 
            
  | 19:48 | <jhancock@cumin2002> | END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host moss-be2003.codfw.wmnet with OS bullseye | [production] | 
            
  | 19:39 | <bking@cumin1001> | END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) | [production] | 
            
  | 18:55 | <jhancock@cumin2002> | START - Cookbook sre.hosts.reimage for host moss-be2003.codfw.wmnet with OS bullseye | [production] | 
            
  | 18:48 | <jforrester@deploy1002> | helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply | [production] | 
            
  | 18:47 | <jforrester@deploy1002> | helmfile [eqiad] START helmfile.d/services/wikifunctions: apply | [production] | 
            
  | 18:47 | <jforrester@deploy1002> | helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply | [production] | 
            
  | 18:46 | <jforrester@deploy1002> | helmfile [codfw] START helmfile.d/services/wikifunctions: apply | [production] | 
            
  | 18:45 | <jhancock@cumin2002> | END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host moss-be2003.mgmt.codfw.wmnet with reboot policy FORCED | [production] | 
            
  | 18:45 | <jhancock@cumin2002> | START - Cookbook sre.hosts.provision for host moss-be2003.mgmt.codfw.wmnet with reboot policy FORCED | [production] | 
            
  | 18:45 | <jforrester@deploy1002> | helmfile [staging] DONE helmfile.d/services/wikifunctions: apply | [production] | 
            
  | 18:45 | <jforrester@deploy1002> | helmfile [staging] START helmfile.d/services/wikifunctions: apply | [production] | 
            
  | 18:26 | <jforrester@deploy1002> | helmfile [staging] DONE helmfile.d/services/wikifunctions: apply | [production] | 
            
  | 18:25 | <jforrester@deploy1002> | helmfile [staging] START helmfile.d/services/wikifunctions: apply | [production] |