| 2022-08-09 |
  | 16:53 | <bking@cumin1001> | START - Cookbook sre.elasticsearch.force-shard-allocation | [production] | 
            
  | 16:26 | <bking@deploy1002> | helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply | [production] | 
            
  | 16:26 | <bking@deploy1002> | helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply | [production] | 
            
  | 16:01 | <bking@cumin1001> | END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1069.eqiad.wmnet with OS bullseye | [production] | 
            
  | 15:45 | <bking@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1069.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 15:42 | <bking@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1069.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 15:32 | <mwdebug-deploy@deploy1002> | helmfile [codfw] DONE helmfile.d/services/mwdebug: apply | [production] | 
            
  | 15:31 | <mwdebug-deploy@deploy1002> | helmfile [codfw] START helmfile.d/services/mwdebug: apply | [production] | 
            
  | 15:31 | <mwdebug-deploy@deploy1002> | helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply | [production] | 
            
  | 15:30 | <bking@cumin1001> | START - Cookbook sre.hosts.reimage for host elastic1069.eqiad.wmnet with OS bullseye | [production] | 
            
  | 15:28 | <mwdebug-deploy@deploy1002> | helmfile [eqiad] START helmfile.d/services/mwdebug: apply | [production] | 
            
  | 15:27 | <bking@cumin1001> | END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic1058.eqiad.wmnet with OS bullseye | [production] | 
            
  | 15:08 | <bking@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic1058.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 15:05 | <bking@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on elastic1058.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 14:59 | <elukey@deploy1002> | helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' . | [production] | 
            
  | 14:59 | <elukey@deploy1002> | helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' . | [production] | 
            
  | 14:57 | <denisse|m> | finished running 'homer "status:active" commit "netmon: Add the netmon1003 host as a syslog destination"' in the cumin1001 host. Homer reported no errors. | [production] | 
            
  | 14:54 | <elukey@deploy1002> | helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' . | [production] | 
            
  | 14:50 | <bking@cumin1001> | START - Cookbook sre.hosts.reimage for host elastic1058.eqiad.wmnet with OS bullseye | [production] | 
            
  | 14:28 | <bking@cumin1001> | conftool action : set/pooled=false; selector: dnsdisc=wdqs,name=codfw | [production] | 
            
  | 13:57 | <kevinbazira@deploy1002> | helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' . | [production] | 
            
  | 13:57 | <kevinbazira@deploy1002> | helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' . | [production] | 
            
  | 13:54 | <denisse|m> | Add the new netmon1003 host as a syslog destination in homer templates/common/system.conf https://gerrit.wikimedia.org/r/c/operations/homer/public/+/819124 | [production] | 
            
  | 13:52 | <denisse|m> | Successfully ran '# run-puppet-merge' in the netmon1002 and netmon1003 hosts. | [production] | 
            
  | 13:51 | <denisse|m> | Running '# run-puppet-agent' in the netmon1003 host | [production] | 
            
  | 13:50 | <denisse|m> | Running '# run-puppet-agent' in the netmon1002 host | [production] | 
            
  | 13:47 | <ryankemper@cumin1001> | END (PASS) - Cookbook sre.elasticsearch.force-shard-allocation (exit_code=0) | [production] | 
            
  | 13:46 | <ryankemper@cumin1001> | START - Cookbook sre.elasticsearch.force-shard-allocation | [production] | 
            
  | 13:45 | <denisse|m> | puppet-merge on puppetmaster2004.codfw.wmnet for patch 819179 succeeded | [production] | 
            
  | 13:43 | <denisse|m> | Set netmon1003 as netmon_server and netmon1002 as a netmon_servers_failover in the Puppet repository https://gerrit.wikimedia.org/r/c/operations/puppet/+/819179 | [production] | 
            
  | 13:42 | <denisse|m> | authdns updated successfully | [production] | 
            
  | 13:42 | <denisse|m> | Had to revert https://gerrit.wikimedia.org/r/c/operations/dns/+/819177 because I rebased my changes incorrectly, sent the new patch in https://gerrit.wikimedia.org/r/c/operations/dns/+/821746 | [production] | 
            
  | 13:32 | <denisse|m> | running '# authdns-update' in ns0.wikimedia.org | [production] | 
            
  | 13:29 | <denisse|m> | Flip DNS for LibreNMS and Smokeping from netmon1002 to netmon1003 https://gerrit.wikimedia.org/r/c/operations/dns/+/819177 | [production] | 
            
  | 13:23 | <jynus> | stop replication on db1117:m1 T309074 | [production] | 
            
  | 13:21 | <denisse|m> | netmon1002 to netmon1003 failover | [production] | 
            
  | 13:17 | <elukey@deploy1002> | helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' . | [production] | 
            
  | 13:16 | <elukey@deploy1002> | helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' . | [production] | 
            
  | 10:58 | <elukey@deploy1002> | helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' . | [production] | 
            
  | 09:53 | <vgutierrez> | rolling restart of pybal in eqsin - T310070 | [production] | 
            
  | 09:25 | <elukey@deploy1002> | helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' . | [production] | 
            
  | 09:24 | <elukey@deploy1002> | helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' . | [production] | 
            
  | 09:24 | <elukey@deploy1002> | helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' . | [production] | 
            
  | 09:12 | <vgutierrez> | rolling restart of pybal in codfw - T310070 | [production] | 
            
  | 08:47 | <elukey@deploy1002> | helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' . | [production] | 
            
  | 08:30 | <elukey@deploy1002> | helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' . | [production] | 
            
  | 08:28 | <elukey@deploy1002> | helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' . | [production] | 
            
  | 08:27 | <elukey@deploy1002> | helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. | [production] | 
            
  | 08:27 | <elukey@deploy1002> | helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. | [production] | 
            
  | 08:27 | <elukey@deploy1002> | helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. | [production] |