| 2021-11-27
      
  | 19:55 | <andrew@deploy1002> | Finished deploy [horizon/deploy@6115b3b]: network UI updates for T296548 (duration: 04m 14s) | [production] | 
            
  | 19:51 | <andrew@deploy1002> | Started deploy [horizon/deploy@6115b3b]: network UI updates for T296548 | [production] | 
            
  | 19:47 | <andrew@deploy1002> | Finished deploy [horizon/deploy@6115b3b]: network UI tests in codfw1dev (duration: 02m 01s) | [production] | 
            
  | 19:45 | <andrew@deploy1002> | Started deploy [horizon/deploy@6115b3b]: network UI tests in codfw1dev | [production] | 
            
  | 12:22 | <elukey> | drop /var/tmp/core files from ores100[2,4], root partition full | [production] | 
            
  | 12:10 | <elukey> | drop /var/tmp/core files from ores1009, root partition full | [production] | 
            
  | 11:55 | <elukey> | disable coredumps for ORES celery units (will cause a roll restart of all celeries) - T296563 | [production] | 
            
  | 11:46 | <elukey> | drop ores coredumps from ores1008 | [production] | 
            
  | 09:56 | <elukey> | powercycle analytics1071, soft lockup stacktraces in the tty | [production] | 
            
  | 09:51 | <elukey> | move ores coredump files from /var/cache/tmp to /srv/coredumps on ores100[6,7,8] and ores2003 to free space on the root partition | [production] | 
            
  
| 2021-11-26
      
  | 16:11 | <arnoldokoth> | drain kubestage1002 node in prep for decommissioning | [production] | 
            
  | 16:05 | <arnoldokoth> | drain kubestage1001 node in prep for decommissioning | [production] | 
            
  | 15:46 | <elukey> | move /var/tmp/core/* to /srv/coredumps on ores1008 to free root space | [production] | 
            
  | 14:30 | <jelto@deploy1002> | helmfile [eqiad] Ran 'sync' command on namespace 'miscweb' for release 'main' . | [production] | 
            
  | 14:25 | <jelto@deploy1002> | helmfile [codfw] Ran 'sync' command on namespace 'miscweb' for release 'main' . | [production] | 
            
  | 14:21 | <jelto@deploy1002> | helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' . | [production] | 
            
  | 13:48 | <jelto@deploy1002> | helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' . | [production] | 
            
  | 13:46 | <jelto@deploy1002> | helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' . | [production] | 
            
  | 13:25 | <akosiaris@deploy1002> | helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. | [production] | 
            
  | 13:25 | <akosiaris@deploy1002> | helmfile [staging-codfw] START helmfile.d/admin 'apply'. | [production] | 
            
  | 12:21 | <vgutierrez> | restarting HAProxy on O:cache::upload_haproxy - T290005 | [production] | 
            
  | 11:41 | <akosiaris> | T296303 cleanup weird state of calico-codfw cluster | [production] | 
            
  | 11:41 | <akosiaris@deploy1002> | helmfile [staging-codfw] DONE helmfile.d/admin 'sync'. | [production] | 
            
  | 11:41 | <akosiaris@deploy1002> | helmfile [staging-codfw] START helmfile.d/admin 'sync'. | [production] | 
            
  | 11:39 | <akosiaris@deploy1002> | helmfile [staging-codfw] START helmfile.d/admin 'sync'. | [production] | 
            
  | 11:25 | <vgutierrez> | restarting HAProxy on O:cache::(text|upload)_haproxy - T290005 | [production] | 
            
  | 10:23 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Repool after fixing users T296274', diff saved to https://phabricator.wikimedia.org/P17880 and previous config saved to /var/cache/conftool/dbconfig/20211126-102340-ladsgroup.json | [production] | 
            
  | 10:17 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Depooling db1111 (T296274)', diff saved to https://phabricator.wikimedia.org/P17879 and previous config saved to /var/cache/conftool/dbconfig/20211126-101714-ladsgroup.json | [production] | 
            
  | 10:17 | <ladsgroup@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1111.eqiad.wmnet with reason: Maintenance T296274 | [production] | 
            
  | 10:17 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 4:00:00 on db1111.eqiad.wmnet with reason: Maintenance T296274 | [production] | 
            
  | 10:14 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Repool after fixing users T296274', diff saved to https://phabricator.wikimedia.org/P17878 and previous config saved to /var/cache/conftool/dbconfig/20211126-101423-ladsgroup.json | [production] | 
            
  | 10:05 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Depooling db1177 (T296274)', diff saved to https://phabricator.wikimedia.org/P17877 and previous config saved to /var/cache/conftool/dbconfig/20211126-100547-ladsgroup.json | [production] | 
            
  | 10:05 | <ladsgroup@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1177.eqiad.wmnet with reason: Maintenance T296274 | [production] | 
            
  | 10:05 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 4:00:00 on db1177.eqiad.wmnet with reason: Maintenance T296274 | [production] | 
            
  | 10:04 | <ladsgroup@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance T296143 | [production] | 
            
  | 10:04 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 4:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance T296143 | [production] | 
            
  | 08:28 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'After maintenance db1160 (T296143)', diff saved to https://phabricator.wikimedia.org/P17876 and previous config saved to /var/cache/conftool/dbconfig/20211126-082834-ladsgroup.json | [production] | 
            
  | 08:13 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'After maintenance db1160 (T296143)', diff saved to https://phabricator.wikimedia.org/P17875 and previous config saved to /var/cache/conftool/dbconfig/20211126-081329-ladsgroup.json | [production] | 
            
  | 07:58 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'After maintenance db1160 (T296143)', diff saved to https://phabricator.wikimedia.org/P17874 and previous config saved to /var/cache/conftool/dbconfig/20211126-075824-ladsgroup.json | [production] | 
            
  | 07:43 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'After maintenance db1160 (T296143)', diff saved to https://phabricator.wikimedia.org/P17873 and previous config saved to /var/cache/conftool/dbconfig/20211126-074320-ladsgroup.json | [production] | 
            
  | 06:28 | <Amir1> | killing extensions/MachineVision/maintenance/fetchSuggestions.php in mwmaint | [production] | 
            
  | 06:19 | <Amir1> | killing lingering process from mwmaint to depooled db (db1160) that was depooled nine hours ago | [production] |