| 
      
        2023-06-30
      
     | 
  
    
  | 20:59 | 
  <bking@cumin1001> | 
  END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) | 
  [production] | 
            
  | 20:59 | 
  <bking@cumin1001> | 
  START - Cookbook sre.wdqs.data-transfer | 
  [production] | 
            
  | 20:57 | 
  <bking@cumin1001> | 
  END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) | 
  [production] | 
            
  | 20:29 | 
  <jforrester@deploy1002> | 
  helmfile [staging] DONE helmfile.d/services/wikifunctions: apply | 
  [production] | 
            
  | 20:29 | 
  <jforrester@deploy1002> | 
  helmfile [staging] START helmfile.d/services/wikifunctions: apply | 
  [production] | 
            
  | 19:55 | 
  <mutante> | 
  please hold code changes and deploys if using gitlab - upgrade in progress | 
  [production] | 
            
  | 19:53 | 
  <dzahn@cumin1001> | 
  START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: security release | 
  [production] | 
            
  | 19:26 | 
  <bking@cumin1001> | 
  START - Cookbook sre.wdqs.data-transfer | 
  [production] | 
            
  | 19:25 | 
  <bking@cumin1001> | 
  END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) | 
  [production] | 
            
  | 19:25 | 
  <dzahn@cumin1001> | 
  END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: security release | 
  [production] | 
            
  | 19:25 | 
  <bking@cumin1001> | 
  START - Cookbook sre.wdqs.data-transfer | 
  [production] | 
            
  | 18:25 | 
  <brennen@deploy1002> | 
  Finished scap: Backport for [[gerrit:934476|Fix bug in opening dialog (T340816)]] (duration: 08m 37s) | 
  [production] | 
            
  | 18:20 | 
  <mutante> | 
  upgrading gitlab on gitlab-replica.wikimedia.org | 
  [production] | 
            
  | 18:19 | 
  <dzahn@cumin1001> | 
  START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: security release | 
  [production] | 
            
  | 18:18 | 
  <brennen@deploy1002> | 
  brennen and jforrester: Backport for [[gerrit:934476|Fix bug in opening dialog (T340816)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet | 
  [production] | 
            
  | 18:16 | 
  <brennen@deploy1002> | 
  Started scap: Backport for [[gerrit:934476|Fix bug in opening dialog (T340816)]] | 
  [production] | 
            
  | 18:06 | 
  <dzahn@cumin1001> | 
  END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: security release | 
  [production] | 
            
  | 16:59 | 
  <dzahn@cumin1001> | 
  START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: security release | 
  [production] | 
            
  | 16:27 | 
  <jayme@deploy1002> | 
  helmfile [eqiad] DONE helmfile.d/services/mathoid: apply | 
  [production] | 
            
  | 16:27 | 
  <jayme@deploy1002> | 
  helmfile [eqiad] START helmfile.d/services/mathoid: apply | 
  [production] | 
            
  | 16:26 | 
  <jayme@deploy1002> | 
  helmfile [codfw] DONE helmfile.d/services/mathoid: apply | 
  [production] | 
            
  | 16:25 | 
  <jayme@deploy1002> | 
  helmfile [codfw] START helmfile.d/services/mathoid: apply | 
  [production] | 
            
  | 16:25 | 
  <jayme@deploy1002> | 
  helmfile [staging] DONE helmfile.d/services/mathoid: apply | 
  [production] | 
            
  | 16:25 | 
  <jayme@deploy1002> | 
  helmfile [staging] START helmfile.d/services/mathoid: apply | 
  [production] | 
            
  | 16:09 | 
  <aikochou@deploy1002> | 
  helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . | 
  [production] | 
            
  | 15:50 | 
  <jhancock@cumin2002> | 
  END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1149.eqiad.wmnet with OS bullseye | 
  [production] | 
            
  | 15:35 | 
  <elukey@deploy1002> | 
  helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' . | 
  [production] | 
            
  | 15:35 | 
  <elukey@deploy1002> | 
  helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' . | 
  [production] | 
            
  | 15:21 | 
  <isaranto@deploy1002> | 
  helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . | 
  [production] | 
            
  | 14:43 | 
  <jiji@cumin1001> | 
  conftool action : get; selector: service=kube-apiserver | 
  [production] | 
            
  | 14:42 | 
  <sbassett> | 
  Deployed updated mitigation for T337593 | 
  [production] | 
            
  | 14:30 | 
  <jhancock@cumin2002> | 
  START - Cookbook sre.hosts.reimage for host an-worker1149.eqiad.wmnet with OS bullseye | 
  [production] | 
            
  | 14:14 | 
  <isaranto@deploy1002> | 
  helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . | 
  [production] | 
            
  | 13:23 | 
  <jayme@deploy1002> | 
  helmfile [staging] DONE helmfile.d/services/mathoid: apply | 
  [production] | 
            
  | 13:23 | 
  <jayme@deploy1002> | 
  helmfile [staging] START helmfile.d/services/mathoid: apply | 
  [production] | 
            
  | 12:39 | 
  <kharlan@deploy1002> | 
  helmfile [eqiad] START helmfile.d/services/ipoid: apply | 
  [production] | 
            
  | 12:30 | 
  <bking@cumin1001> | 
  END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs2021.codfw.wmnet with OS bullseye | 
  [production] | 
            
  | 12:22 | 
  <jbond@cumin1001> | 
  END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['sretest1003'] | 
  [production] | 
            
  | 12:20 | 
  <jiji@cumin1001> | 
  END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kubestagemaster2002.codfw.wmnet with OS bullseye | 
  [production] | 
            
  | 12:17 | 
  <jbond@cumin1001> | 
  START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest1003'] | 
  [production] | 
            
  | 12:17 | 
  <jbond@cumin1001> | 
  END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['sretest1003'] | 
  [production] | 
            
  | 12:16 | 
  <jbond@cumin1001> | 
  START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest1003'] | 
  [production] | 
            
  | 12:10 | 
  <kharlan@deploy1002> | 
  helmfile [staging] DONE helmfile.d/services/ipoid: apply | 
  [production] | 
            
  | 12:09 | 
  <kharlan@deploy1002> | 
  helmfile [staging] START helmfile.d/services/ipoid: apply | 
  [production] | 
            
  | 12:03 | 
  <kharlan@deploy1002> | 
  helmfile [staging] START helmfile.d/services/ipoid: apply | 
  [production] | 
            
  | 11:59 | 
  <jiji@cumin1001> | 
  END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestagemaster1002.eqiad.wmnet with OS bullseye | 
  [production] | 
            
  | 11:54 | 
  <jiji@cumin1001> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestagemaster2002.codfw.wmnet with reason: host reimage | 
  [production] | 
            
  | 11:51 | 
  <jiji@cumin1001> | 
  START - Cookbook sre.hosts.downtime for 2:00:00 on kubestagemaster2002.codfw.wmnet with reason: host reimage | 
  [production] | 
            
  | 11:39 | 
  <jiji@cumin1001> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestagemaster1002.eqiad.wmnet with reason: host reimage | 
  [production] | 
            
  | 11:38 | 
  <jbond@cumin1001> | 
  END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['sretest1003'] | 
  [production] |