| 2023-01-25 |
    
  | 15:43 | <sukhe@cumin2002> | END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2031.codfw.wmnet with OS bullseye | [production] | 
            
  | 15:38 | <papaul> | ongoing maintenance on fasw-c-eqiad | [production] | 
            
  | 15:33 | <btullis@cumin1001> | END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-conf1001.eqiad.wmnet | [production] | 
            
  | 15:33 | <sukhe@cumin2002> | START - Cookbook sre.hosts.reimage for host cp2031.codfw.wmnet with OS bullseye | [production] | 
            
  | 15:33 | <sukhe@cumin2002> | END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2031.codfw.wmnet with OS bullseye | [production] | 
            
  | 15:29 | <btullis@cumin1001> | START - Cookbook sre.hosts.reboot-single for host an-conf1001.eqiad.wmnet | [production] | 
            
  | 15:23 | <btullis@cumin1001> | END (FAIL) - Cookbook sre.hadoop.reboot-workers (exit_code=99) for Hadoop analytics cluster | [production] | 
            
  | 15:21 | <sukhe@cumin2002> | START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS bullseye | [production] | 
            
  | 15:19 | <btullis@cumin1001> | END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-conf1003.eqiad.wmnet | [production] | 
            
  | 15:17 | <sukhe@puppetmaster1001> | conftool action : set/pooled=yes; selector: name=cp4045.ulsfo.wmnet,service=ats-be | [production] | 
            
  | 15:17 | <sukhe@puppetmaster1001> | conftool action : set/pooled=yes; selector: name=cp4045.ulsfo.wmnet,service=cdn | [production] | 
            
  | 15:14 | <sukhe@cumin2002> | END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4045.ulsfo.wmnet with OS bullseye | [production] | 
            
  | 15:13 | <btullis@cumin1001> | START - Cookbook sre.hosts.reboot-single for host an-conf1003.eqiad.wmnet | [production] | 
            
  | 15:13 | <btullis@cumin1001> | END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99) | [production] | 
            
  | 15:13 | <btullis@cumin1001> | START - Cookbook sre.hosts.reboot-cluster | [production] | 
            
  | 15:12 | <urbanecm@deploy1002> | Finished scap: triggering i18n refresh for T327824 (duration: 07m 57s) | [production] | 
            
  | 15:07 | <sukhe@cumin2002> | START - Cookbook sre.hosts.reimage for host cp2031.codfw.wmnet with OS bullseye | [production] | 
            
  | 15:04 | <urbanecm@deploy1002> | Started scap: triggering i18n refresh for T327824 | [production] | 
            
  | 15:04 | <urbanecm@deploy1002> | Finished scap: Backport for [[gerrit:882615|Enable the Wikibase REST API on Wikidata (T324999)]] (duration: 08m 43s) | [production] | 
            
  | 15:02 | <sukhe@puppetmaster1001> | conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet,service=ats-be | [production] | 
            
  | 15:02 | <sukhe@puppetmaster1001> | conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet,service=cdn | [production] | 
            
  | 15:01 | <urbanecm> | Overrunning B&C window | [production] | 
            
  | 14:57 | <urbanecm@deploy1002> | urbanecm and migr: Backport for [[gerrit:882615|Enable the Wikibase REST API on Wikidata (T324999)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet | [production] | 
            
  | 14:57 | <sukhe@cumin2002> | END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4037.ulsfo.wmnet with OS bullseye | [production] | 
            
  | 14:55 | <urbanecm@deploy1002> | Started scap: Backport for [[gerrit:882615|Enable the Wikibase REST API on Wikidata (T324999)]] | [production] | 
            
  | 14:53 | <btullis@cumin1001> | START - Cookbook sre.hadoop.reboot-workers for Hadoop analytics cluster | [production] | 
            
  | 14:53 | <urbanecm@deploy1002> | Finished scap: Backport for [[gerrit:883224|REST: Use error log level for unexpected errors (T327490)]], [[gerrit:883547|User impact: amend incorrect parameter for the single day streak text (T327824)]] (duration: 32m 21s) | [production] | 
            
  | 14:53 | <sukhe@cumin2002> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4045.ulsfo.wmnet with reason: host reimage | [production] | 
            
  | 14:50 | <sukhe@cumin2002> | START - Cookbook sre.hosts.downtime for 2:00:00 on cp4045.ulsfo.wmnet with reason: host reimage | [production] | 
            
  | 14:45 | <jmm@cumin2002> | END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host install6002.wikimedia.org | [production] | 
            
  | 14:39 | <urbanecm@deploy1002> | jakob and sgimeno and urbanecm: Backport for [[gerrit:883224|REST: Use error log level for unexpected errors (T327490)]], [[gerrit:883547|User impact: amend incorrect parameter for the single day streak text (T327824)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet | [production] | 
            
  | 14:32 | <sukhe@cumin2002> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4037.ulsfo.wmnet with reason: host reimage | [production] | 
            
  | 14:30 | <jmm@cumin2002> | END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install6002.wikimedia.org on all recursors | [production] | 
            
  | 14:30 | <jmm@cumin2002> | START - Cookbook sre.dns.wipe-cache install6002.wikimedia.org on all recursors | [production] | 
            
  | 14:30 | <jmm@cumin2002> | END (PASS) - Cookbook sre.dns.netbox (exit_code=0) | [production] | 
            
  | 14:30 | <jmm@cumin2002> | END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install6002.wikimedia.org - jmm@cumin2002" | [production] | 
            
  | 14:30 | <sukhe@cumin2002> | START - Cookbook sre.hosts.reimage for host cp4045.ulsfo.wmnet with OS bullseye | [production] | 
            
  | 14:29 | <isaranto@deploy1002> | helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' . | [production] | 
            
  | 14:29 | <jmm@cumin2002> | START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install6002.wikimedia.org - jmm@cumin2002" | [production] | 
            
  | 14:29 | <isaranto@deploy1002> | helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' . | [production] | 
            
  | 14:29 | <isaranto@deploy1002> | helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' . | [production] | 
            
  | 14:29 | <isaranto@deploy1002> | helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' . | [production] | 
            
  | 14:29 | <isaranto@deploy1002> | helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' . | [production] | 
            
  | 14:29 | <isaranto@deploy1002> | helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' . | [production] | 
            
  | 14:29 | <isaranto@deploy1002> | helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' . | [production] | 
            
  | 14:28 | <sukhe@cumin2002> | START - Cookbook sre.hosts.downtime for 2:00:00 on cp4037.ulsfo.wmnet with reason: host reimage | [production] | 
            
  | 14:25 | <jmm@cumin2002> | START - Cookbook sre.dns.netbox | [production] | 
            
  | 14:25 | <jmm@cumin2002> | START - Cookbook sre.ganeti.makevm for new host install6002.wikimedia.org | [production] | 
            
  | 14:23 | <jmm@cumin2002> | END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host install5002.wikimedia.org | [production] | 
            
  | 14:21 | <urbanecm@deploy1002> | Started scap: Backport for [[gerrit:883224|REST: Use error log level for unexpected errors (T327490)]], [[gerrit:883547|User impact: amend incorrect parameter for the single day streak text (T327824)]] | [production] |