| 
      
        2023-05-16
      
     | 
  
    
  | 16:06 | 
  <mutante> | 
  gitlab-runner2003 - installed rsync client to debug an issue with rsync from inside containers, comparing against rsync from outside the container | 
  [production] | 
            
  | 15:49 | 
  <sukhe> | 
  run authdns-update for CR 920314 | 
  [production] | 
            
  | 15:41 | 
  <joal@deploy1002> | 
  Finished deploy [airflow-dags/analytics@7fa2dcd]: Regular analytics weekly train [airflow-dags@7fa2dcd] (duration: 00m 10s) | 
  [production] | 
            
  | 15:41 | 
  <joal@deploy1002> | 
  Started deploy [airflow-dags/analytics@7fa2dcd]: Regular analytics weekly train [airflow-dags@7fa2dcd] | 
  [production] | 
            
  | 15:36 | 
  <hashar> | 
  Some CI jobs started failing after an upgrade of some Jenkins plugins. I have upgraded a couple more and it seems to work now T336775 | 
  [production] | 
            
  | 15:33 | 
  <sukhe> | 
  set routing-options static route 208.80.153.231/32 next-hop [ 208.80.153.10 208.80.153.48 208.80.153.74 ]: T326688 | 
  [production] | 
            
  | 15:33 | 
  <sukhe> | 
  set routing-options static route 208.80.153.231/32 next-hop [ 208.80.153.10 208.80.153.48 208.80.153.74 ] | 
  [production] | 
            
  | 15:32 | 
  <hnowlan@deploy1002> | 
  helmfile [staging] DONE helmfile.d/services/rest-gateway: apply | 
  [production] | 
            
  | 15:32 | 
  <hnowlan@deploy1002> | 
  helmfile [staging] START helmfile.d/services/rest-gateway: apply | 
  [production] | 
            
  | 15:27 | 
  <hashar> | 
  Restarting CI Jenkins | 
  [production] | 
            
  | 15:26 | 
  <Emperor> | 
  rebalance codfw swift rings T335280 | 
  [production] | 
            
  | 15:18 | 
  <hashar> | 
  CI Jenkins jobs are stalled following the plugins upgrade :/ | 
  [production] | 
            
  | 15:07 | 
  <hnowlan@deploy1002> | 
  helmfile [eqiad] DONE helmfile.d/services/thumbor: apply | 
  [production] | 
            
  | 15:04 | 
  <hnowlan@deploy1002> | 
  helmfile [eqiad] START helmfile.d/services/thumbor: apply | 
  [production] | 
            
  | 15:03 | 
  <hnowlan@deploy1002> | 
  helmfile [eqiad] DONE helmfile.d/services/thumbor: apply | 
  [production] | 
            
  | 14:59 | 
  <jhancock@cumin2002> | 
  END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudswift1001.eqiad.wmnet with OS bullseye | 
  [production] | 
            
  | 14:55 | 
  <bking@deploy1002> | 
  helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply | 
  [production] | 
            
  | 14:49 | 
  <moritzm> | 
  installing libxml2 security updates on buster | 
  [production] | 
            
  | 14:48 | 
  <sukhe> | 
  [done] "cr*-codfw*" commit "Gerrit: 919876 add new DNS host dns2005": T326688 | 
  [production] | 
            
  | 14:47 | 
  <bking@deploy1002> | 
  helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply | 
  [production] | 
            
  | 14:46 | 
  <hnowlan@deploy1002> | 
  helmfile [eqiad] START helmfile.d/services/thumbor: apply | 
  [production] | 
            
  | 14:43 | 
  <hashar> | 
  Restarting CI Jenkins | 
  [production] | 
            
  | 14:42 | 
  <hnowlan@deploy1002> | 
  helmfile [eqiad] DONE helmfile.d/services/thumbor: apply | 
  [production] | 
            
  | 14:42 | 
  <sukhe> | 
  "cr*-codfw*" commit "Gerrit: 919876 add new DNS host dns2005": T326688 | 
  [production] | 
            
  | 14:36 | 
  <bking@deploy1002> | 
  helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply | 
  [production] | 
            
  | 14:32 | 
  <hnowlan@deploy1002> | 
  helmfile [eqiad] START helmfile.d/services/thumbor: apply | 
  [production] | 
            
  | 14:32 | 
  <hnowlan@deploy1002> | 
  helmfile [codfw] DONE helmfile.d/services/thumbor: apply | 
  [production] | 
            
  | 14:32 | 
  <hnowlan@deploy1002> | 
  helmfile [codfw] START helmfile.d/services/thumbor: apply | 
  [production] | 
            
  | 14:31 | 
  <hnowlan@deploy1002> | 
  helmfile [eqiad] DONE helmfile.d/admin 'apply'. | 
  [production] | 
            
  | 14:31 | 
  <hnowlan@deploy1002> | 
  helmfile [eqiad] START helmfile.d/admin 'apply'. | 
  [production] | 
            
  | 14:30 | 
  <hnowlan@deploy1002> | 
  helmfile [codfw] DONE helmfile.d/services/thumbor: sync | 
  [production] | 
            
  | 14:30 | 
  <hnowlan@deploy1002> | 
  helmfile [codfw] START helmfile.d/services/thumbor: sync | 
  [production] | 
            
  | 14:27 | 
  <hnowlan@deploy1002> | 
  helmfile [codfw] DONE helmfile.d/services/thumbor: apply | 
  [production] | 
            
  | 14:27 | 
  <hnowlan@deploy1002> | 
  helmfile [codfw] START helmfile.d/services/thumbor: apply | 
  [production] | 
            
  | 14:26 | 
  <hnowlan@deploy1002> | 
  helmfile [staging] DONE helmfile.d/services/thumbor: apply | 
  [production] | 
            
  | 14:26 | 
  <bking@deploy1002> | 
  helmfile [eqiad] START helmfile.d/services/rdf-streaming-updater: apply | 
  [production] | 
            
  | 14:26 | 
  <hnowlan@deploy1002> | 
  helmfile [staging] START helmfile.d/services/thumbor: apply | 
  [production] | 
            
  | 14:26 | 
  <sukhe@cumin2002> | 
  END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns2005.wikimedia.org with OS bullseye | 
  [production] | 
            
  | 14:18 | 
  <jnuche@deploy1002> | 
  Finished deploy [releng/jenkins-deploy@0c82f2d] (releasing): (no justification provided) (duration: 00m 45s) | 
  [production] | 
            
  | 14:17 | 
  <jnuche@deploy1002> | 
  Started deploy [releng/jenkins-deploy@0c82f2d] (releasing): (no justification provided) | 
  [production] | 
            
  | 14:10 | 
  <akosiaris@cumin1001> | 
  END (FAIL) - Cookbook sre.discovery.datacenter (exit_code=93) pool all active/active services in codfw: codfw row D switches upgrade done - T335042 | 
  [production] | 
            
  | 14:10 | 
  <sukhe@cumin2002> | 
  END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dns2005.wikimedia.org with reason: host reimage | 
  [production] | 
            
  | 14:06 | 
  <sukhe@cumin2002> | 
  START - Cookbook sre.hosts.downtime for 2:00:00 on dns2005.wikimedia.org with reason: host reimage | 
  [production] | 
            
  | 13:54 | 
  <akosiaris@cumin1001> | 
  START - Cookbook sre.discovery.datacenter pool all active/active services in codfw: codfw row D switches upgrade done - T335042 | 
  [production] | 
            
  | 13:53 | 
  <sukhe@cumin2002> | 
  START - Cookbook sre.hosts.reimage for host dns2005.wikimedia.org with OS bullseye | 
  [production] | 
            
  | 13:49 | 
  <oblivian@cumin1001> | 
  END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on A:lvs-eqiad | 
  [production] | 
            
  | 13:46 | 
  <jhancock@cumin2002> | 
  START - Cookbook sre.hosts.reimage for host cloudswift1001.eqiad.wmnet with OS bullseye | 
  [production] | 
            
  | 13:46 | 
  <Emperor> | 
  repool ms-fe2012 T335042 | 
  [production] | 
            
  | 13:45 | 
  <oblivian@cumin1001> | 
  START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on A:lvs-eqiad | 
  [production] | 
            
  | 13:39 | 
  <btullis@puppetmaster1001> | 
  conftool action : set/pooled=yes; selector: cluster=eventschemas,dc=codfw,name=schema2004.codfw.wmnet | 
  [production] |