| 2023-10-02
      
      § | 
    
  | 17:24 | <sukhe> | sudo cumin "A:dns-rec" "disable-puppet 'merging CR 962648'" | [production] | 
            
  | 17:18 | <elukey@deploy2002> | helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. | [production] | 
            
  | 17:18 | <elukey@deploy2002> | helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. | [production] | 
            
  | 17:17 | <elukey@deploy2002> | helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. | [production] | 
            
  | 17:17 | <elukey@deploy2002> | helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. | [production] | 
            
  | 17:17 | <elukey@deploy2002> | helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. | [production] | 
            
  | 17:17 | <elukey@deploy2002> | helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. | [production] | 
            
  | 17:12 | <eevans@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase1022.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 17:09 | <eevans@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on restbase1022.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 17:00 | <fabfur> | upgrade purged package to version 0.21+deb12u1 cp4052 (bookworm) (T347837) | [production] | 
            
  | 16:56 | <eevans@cumin1001> | START - Cookbook sre.hosts.reimage for host restbase1022.eqiad.wmnet with OS bullseye | [production] | 
            
  | 16:55 | <eevans@cumin1001> | END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase1031.eqiad.wmnet with OS bullseye | [production] | 
            
  | 16:39 | <ryankemper@cumin1001> | END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T347624, testing new cookbook changes) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet, repooling both afterwards w/ encryption | [production] | 
            
  | 16:30 | <ryankemper@cumin1001> | START - Cookbook sre.wdqs.data-transfer (T347624, testing new cookbook changes) xfer categories from wdqs2024.codfw.wmnet -> wdqs2025.codfw.wmnet, repooling both afterwards w/ encryption | [production] | 
            
  | 16:29 | <eevans@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase1031.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 16:26 | <eevans@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on restbase1031.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 16:13 | <eevans@cumin1001> | START - Cookbook sre.hosts.reimage for host restbase1031.eqiad.wmnet with OS bullseye | [production] | 
            
  | 16:08 | <eevans@cumin1001> | END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase1028.eqiad.wmnet with OS bullseye | [production] | 
            
  | 16:06 | <fabfur> | importing into bookworm-wikimedia package purged_0.21+deb12u1_amd64 (T347837) | [production] | 
            
  | 15:44 | <jgiannelos@deploy2002> | helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply | [production] | 
            
  | 15:43 | <jgiannelos@deploy2002> | helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply | [production] | 
            
  | 15:43 | <eevans@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase1028.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 15:40 | <eevans@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on restbase1028.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 15:29 | <sukhe> | enable puppet on A:dns-rec and force agent run | [production] | 
            
  | 15:28 | <joal@deploy2002> | helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply | [production] | 
            
  | 15:28 | <joal@deploy2002> | helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply | [production] | 
            
  | 15:27 | <eevans@cumin1001> | START - Cookbook sre.hosts.reimage for host restbase1028.eqiad.wmnet with OS bullseye | [production] | 
            
  | 15:27 | <eevans@cumin1001> | END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for restbase1028.eqiad.wmnet | [production] | 
            
  | 15:27 | <eevans@cumin1001> | START - Cookbook sre.hosts.remove-downtime for restbase1028.eqiad.wmnet | [production] | 
            
  | 15:24 | <eevans@cumin1001> | END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for restbase1021.eqiad.wmnet | [production] | 
            
  | 15:24 | <eevans@cumin1001> | START - Cookbook sre.hosts.remove-downtime for restbase1021.eqiad.wmnet | [production] | 
            
  | 15:23 | <jelto@cumin1001> | END (PASS) - Cookbook sre.gitlab.failover (exit_code=0) Failover of gitlab from gitlab1003.wikimedia.org to gitlab2002.wikimedia.org | [production] | 
            
  | 15:20 | <jelto@cumin1001> | END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) https://gitlab-replica.wikimedia.org/ https://gitlab-replica-old.wikimedia.org/ on all recursors | [production] | 
            
  | 15:20 | <jelto@cumin1001> | START - Cookbook sre.dns.wipe-cache https://gitlab-replica.wikimedia.org/ https://gitlab-replica-old.wikimedia.org/ on all recursors | [production] | 
            
  | 15:02 | <eevans@cumin1001> | END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host restbase1021.eqiad.wmnet with OS bullseye | [production] | 
            
  | 15:00 | <jhancock@cumin2002> | END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1229.eqiad.wmnet with OS bullseye | [production] | 
            
  | 14:55 | <elukey> | restart kubelet on ml-serve1001 (high latencies registered) | [production] | 
            
  | 14:51 | <fabfur> | upgrade purged package to version 0.21+deb11u1 on all cp hosts (T347837) | [production] | 
            
  | 14:48 | <jhancock@cumin2002> | END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti-test2004.mgmt.codfw.wmnet with reboot policy FORCED | [production] | 
            
  | 14:48 | <jhancock@cumin2002> | START - Cookbook sre.hosts.provision for host ganeti-test2004.mgmt.codfw.wmnet with reboot policy FORCED | [production] | 
            
  | 14:47 | <jhancock@cumin2002> | END (PASS) - Cookbook sre.dns.netbox (exit_code=0) | [production] | 
            
  | 14:47 | <jhancock@cumin2002> | END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding new host ganeti-test2004 - jhancock@cumin2002" | [production] | 
            
  | 14:46 | <jhancock@cumin2002> | START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding new host ganeti-test2004 - jhancock@cumin2002" | [production] | 
            
  | 14:44 | <jhancock@cumin2002> | START - Cookbook sre.dns.netbox | [production] | 
            
  | 14:40 | <stevemunene@cumin1001> | END (FAIL) - Cookbook sre.druid.roll-restart-workers (exit_code=99) for Druid public cluster: Roll restart of Druid jvm daemons. | [production] | 
            
  | 14:37 | <eevans@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on restbase1021.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 14:34 | <eevans@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on restbase1021.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 14:23 | <fabfur> | importing into bullseye-wikimedia package purged_0.21+deb11u1_amd64 (T347837) | [production] | 
            
  | 14:20 | <eevans@cumin1001> | START - Cookbook sre.hosts.reimage for host restbase1021.eqiad.wmnet with OS bullseye | [production] | 
            
  | 14:19 | <isaranto@deploy2002> | helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' . | [production] |