| 2023-06-23 |
  
    
  | 14:12 | 
  <vgutierrez@cumin1001> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on parse1002.eqiad.wmnet with reason: HW issues | 
  [production] | 
            
  | 14:12 | 
  <vgutierrez@cumin1001> | 
  START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on parse1002.eqiad.wmnet with reason: HW issues | 
  [production] | 
            
  | 13:35 | 
  <Emperor> | 
  update private wiki container ACLs in eqiad-swift | 
  [production] | 
            
  | 13:30 | 
  <Emperor> | 
  update private wiki container ACLs in codfw-swift | 
  [production] | 
            
  | 13:29 | 
  <godog> | 
  add 200G to prometheus/k8s in eqiad | 
  [production] | 
            
  | 12:40 | 
  <elukey> | 
  move varnishkafka drmrs instances to pki | 
  [production] | 
            
  | 12:10 | 
  <Emperor> | 
  updating ACLs on wikipedia-office containers T340189 T338765 | 
  [production] | 
            
  | 11:24 | 
  <btullis@deploy1002> | 
  helmfile [staging] DONE helmfile.d/services/datahub: sync on main | 
  [production] | 
            
  | 11:13 | 
  <btullis@deploy1002> | 
  helmfile [staging] START helmfile.d/services/datahub: apply on main | 
  [production] | 
            
  | 11:12 | 
  <btullis@deploy1002> | 
  helmfile [staging] DONE helmfile.d/services/datahub: sync on main | 
  [production] | 
            
  | 11:02 | 
  <btullis@deploy1002> | 
  helmfile [staging] START helmfile.d/services/datahub: apply on main | 
  [production] | 
            
  | 10:27 | 
  <btullis@cumin1001> | 
  END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1110.eqiad.wmnet | 
  [production] | 
            
  | 10:20 | 
  <btullis@cumin1001> | 
  START - Cookbook sre.hosts.reboot-single for host an-worker1110.eqiad.wmnet | 
  [production] | 
            
  | 10:12 | 
  <moritzm> | 
  installing vim security updates | 
  [production] | 
            
  | 09:26 | 
  <moritzm> | 
  uploaded openjdk-8 8u372-ga-1~deb10u1 to component/jdk8 (forward port of Java 8 for Buster) | 
  [production] | 
            
  | 09:20 | 
  <btullis@cumin1001> | 
  END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1110.eqiad.wmnet | 
  [production] | 
            
  | 08:48 | 
  <elukey@cumin1001> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on ml-cache1001.eqiad.wmnet with reason: Working on pki | 
  [production] | 
            
  | 08:48 | 
  <elukey@cumin1001> | 
  START - Cookbook sre.hosts.downtime for 0:30:00 on ml-cache1001.eqiad.wmnet with reason: Working on pki | 
  [production] | 
            
  | 08:37 | 
  <btullis@cumin1001> | 
  START - Cookbook sre.hosts.reboot-single for host an-worker1110.eqiad.wmnet | 
  [production] | 
            
  | 05:52 | 
  <ayounsi@cumin1001> | 
  END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 14860 | 
  [production] | 
            
  | 05:49 | 
  <ayounsi@cumin1001> | 
  START - Cookbook sre.network.peering with action 'configure' for AS: 14860 | 
  [production] | 
            
  | 04:57 | 
  <marostegui@cumin1001> | 
  dbctl commit (dc=all): 'Depool db1118', diff saved to https://phabricator.wikimedia.org/P49472 and previous config saved to /var/cache/conftool/dbconfig/20230623-045758-root.json | 
  [production] | 
            
  | 01:19 | 
  <bking@cumin1001> | 
  END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) | 
  [production] | 
            
  | 01:15 | 
  <bking@cumin1001> | 
  START - Cookbook sre.wdqs.data-transfer | 
  [production] | 
            
  
| 2023-06-22 |
  
    
  | 21:00 | 
  <bking@cumin1001> | 
  END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) | 
  [production] | 
            
  | 19:41 | 
  <dzahn@cumin1001> | 
  END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host phab-test1001.eqiad.wmnet | 
  [production] | 
            
  | 19:41 | 
  <dzahn@cumin1001> | 
  END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host phab-test1001.eqiad.wmnet with OS buster | 
  [production] | 
            
  | 19:29 | 
  <dzahn@cumin1001> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on phab-test1001.eqiad.wmnet with reason: host reimage | 
  [production] | 
            
  | 19:26 | 
  <dzahn@cumin1001> | 
  START - Cookbook sre.hosts.downtime for 2:00:00 on phab-test1001.eqiad.wmnet with reason: host reimage | 
  [production] | 
            
  | 19:25 | 
  <bking@cumin1001> | 
  START - Cookbook sre.wdqs.data-transfer | 
  [production] | 
            
  | 19:14 | 
  <dzahn@cumin1001> | 
  START - Cookbook sre.hosts.reimage for host phab-test1001.eqiad.wmnet with OS buster | 
  [production] | 
            
  | 19:13 | 
  <dzahn@cumin1001> | 
  END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM phab-test1001.eqiad.wmnet - dzahn@cumin1001" | 
  [production] | 
            
  | 19:12 | 
  <dzahn@cumin1001> | 
  START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM phab-test1001.eqiad.wmnet - dzahn@cumin1001" | 
  [production] | 
            
  | 19:11 | 
  <dzahn@cumin1001> | 
  END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) phab-test1001.eqiad.wmnet on all recursors | 
  [production] | 
            
  | 19:11 | 
  <dzahn@cumin1001> | 
  START - Cookbook sre.dns.wipe-cache phab-test1001.eqiad.wmnet on all recursors | 
  [production] | 
            
  | 19:11 | 
  <dzahn@cumin1001> | 
  END (PASS) - Cookbook sre.dns.netbox (exit_code=0) | 
  [production] | 
            
  | 19:11 | 
  <dzahn@cumin1001> | 
  END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM phab-test1001.eqiad.wmnet - dzahn@cumin1001" | 
  [production] | 
            
  | 19:11 | 
  <dzahn@cumin1001> | 
  START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM phab-test1001.eqiad.wmnet - dzahn@cumin1001" | 
  [production] | 
            
  | 19:09 | 
  <dzahn@cumin1001> | 
  START - Cookbook sre.dns.netbox | 
  [production] | 
            
  | 17:32 | 
  <brett@cumin2002> | 
  END (ERROR) - Cookbook sre.cdn.roll-upgrade-ats (exit_code=97) Rolling upgrade/restart of Apache Traffic Server on P{cp1082*} and (A:cp-eqiad or A:cp-text_eqiad or A:cp-upload_eqiad or A:cp-codfw or A:cp-text_codfw or A:cp-upload_codfw or A:cp-esams or A:cp-text_esams or A:cp-upload_esams or A:cp-ulsfo or A:cp-text_ulsfo or A:cp-upload_ulsfo or A:cp-eqsin or A:cp-text_eqsin or A:cp-upload_eqsin or A:c | 
  [production] | 
            
  | 17:32 | 
  <brett@cumin2002> | 
  START - Cookbook sre.cdn.roll-upgrade-ats Rolling upgrade/restart of Apache Traffic Server on P{cp1082*} and (A:cp-eqiad or A:cp-text_eqiad or A:cp-upload_eqiad or A:cp-codfw or A:cp-text_codfw or A:cp-upload_codfw or A:cp-esams or A:cp-text_esams or A:cp-upload_esams or A:cp-ulsfo or A:cp-text_ulsfo or A:cp-upload_ulsfo or A:cp-eqsin or A:cp-text_eqsin or A:cp-upload_eqsin or A:cp-drmrs or A:cp-text_ | 
  [production] | 
            
  | 17:04 | 
  <btullis@deploy1002> | 
  helmfile [staging] DONE helmfile.d/services/datahub: sync on main | 
  [production] | 
            
  | 17:03 | 
  <brett@cumin2002> | 
  END (PASS) - Cookbook sre.dns.roll-restart-wikimedia-dns (exit_code=0) rolling restart_daemons on P{doh6001*} and A:wikidough | 
  [production] | 
            
  | 17:03 | 
  <brett@cumin2002> | 
  START - Cookbook sre.dns.roll-restart-wikimedia-dns rolling restart_daemons on P{doh6001*} and A:wikidough | 
  [production] | 
            
  | 16:54 | 
  <btullis@deploy1002> | 
  helmfile [staging] START helmfile.d/services/datahub: apply on main | 
  [production] | 
            
  | 16:27 | 
  <eevans@cumin2002> | 
  END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for sessionstore2001.codfw.wmnet: Renew puppet certificate - eevans@cumin2002 | 
  [production] | 
            
  | 16:26 | 
  <eevans@cumin2002> | 
  START - Cookbook sre.puppet.renew-cert for sessionstore2001.codfw.wmnet: Renew puppet certificate - eevans@cumin2002 | 
  [production] | 
            
  | 16:24 | 
  <eevans@cumin2002> | 
  END (FAIL) - Cookbook sre.puppet.renew-cert (exit_code=99) for sessionstore2001.codfw.wmnet: Renew puppet certificate - eevans@cumin2002 | 
  [production] | 
            
  | 16:24 | 
  <eevans@cumin2002> | 
  START - Cookbook sre.puppet.renew-cert for sessionstore2001.codfw.wmnet: Renew puppet certificate - eevans@cumin2002 | 
  [production] | 
            
  | 16:23 | 
  <eevans@cumin2002> | 
  END (FAIL) - Cookbook sre.puppet.renew-cert (exit_code=99) for sessionstore2001.codfw.wmnet: Renew puppet certificate - eevans@cumin2002 | 
  [production] |