| 2023-10-26 |
    
  | 13:21 | <jmm@cumin2002> | START - Cookbook sre.maps.roll-restart-reboot rolling restart_daemons on A:maps-replica-codfw | [production] | 
            
  | 13:20 | <lucaswerkmeister-wmde@deploy2002> | Finished scap: Backport for [[gerrit:968713|Enable block feature for AbuseFilter on srwiki (T349727)]] (duration: 10m 23s) | [production] | 
            
  | 13:20 | <bking@cumin1001> | START - Cookbook sre.wdqs.data-transfer | [production] | 
            
  | 13:20 | <bking@cumin1001> | END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97) | [production] | 
            
  | 13:15 | <lucaswerkmeister-wmde@deploy2002> | zoranzoki21 and lucaswerkmeister-wmde: Continuing with sync | [production] | 
            
  | 13:15 | <moritzm> | installing poppler security updates | [production] | 
            
  | 13:11 | <lucaswerkmeister-wmde@deploy2002> | zoranzoki21 and lucaswerkmeister-wmde: Backport for [[gerrit:968713|Enable block feature for AbuseFilter on srwiki (T349727)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) | [production] | 
            
  | 13:10 | <lucaswerkmeister-wmde@deploy2002> | Started scap: Backport for [[gerrit:968713|Enable block feature for AbuseFilter on srwiki (T349727)]] | [production] | 
            
  | 13:04 | <bking@cumin1001> | START - Cookbook sre.wdqs.data-transfer | [production] | 
            
  | 12:27 | <stevemunene@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-airflow1007.eqiad.wmnet with reason: Downtime as we setup the new WMDE Airflow instance | [production] | 
            
  | 12:26 | <stevemunene@cumin1001> | START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on an-airflow1007.eqiad.wmnet with reason: Downtime as we setup the new WMDE Airflow instance | [production] | 
            
  | 11:04 | <kevinbazira@deploy2002> | helmfile [ml-serve-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . | [production] | 
            
  | 11:03 | <kevinbazira@deploy2002> | helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . | [production] | 
            
  | 10:58 | <kevinbazira@deploy2002> | helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . | [production] | 
            
  | 10:51 | <isaranto@deploy2002> | helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' . | [production] | 
            
  | 10:51 | <isaranto@deploy2002> | helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . | [production] | 
            
  | 10:51 | <isaranto@deploy2002> | helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . | [production] | 
            
  | 10:40 | <elukey@deploy2002> | helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . | [production] | 
            
  | 10:30 | <mvolz@deploy2002> | helmfile [eqiad] DONE helmfile.d/services/citoid: apply | [production] | 
            
  | 10:29 | <mvolz@deploy2002> | helmfile [eqiad] START helmfile.d/services/citoid: apply | [production] | 
            
  | 10:25 | <mvolz@deploy2002> | helmfile [codfw] DONE helmfile.d/services/citoid: apply | [production] | 
            
  | 10:25 | <mvolz@deploy2002> | helmfile [codfw] START helmfile.d/services/citoid: apply | [production] | 
            
  | 10:20 | <mvolz@deploy2002> | helmfile [staging] DONE helmfile.d/services/citoid: apply | [production] | 
            
  | 10:20 | <mvolz@deploy2002> | helmfile [staging] START helmfile.d/services/citoid: apply | [production] | 
            
  | 10:10 | <mvolz@deploy2002> | helmfile [staging] DONE helmfile.d/services/citoid: apply | [production] | 
            
  | 10:10 | <mvolz@deploy2002> | helmfile [staging] START helmfile.d/services/citoid: apply | [production] | 
            
  | 09:29 | <dcausse> | erratum (replace wdqs1009 with wdqs2009 in the above msg): depooling and restarting blazegraph on wdqs2009 (stuck since 2023-10-12) | [production] | 
            
  | 09:28 | <dcausse> | depooling and restarting blazegraph on wdqs1009 (stuck since 2023-10-12) | [production] | 
            
  | 09:23 | <brouberol@cumin1001> | END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1009.eqiad.wmnet with OS bullseye | [production] | 
            
  | 09:14 | <ayounsi@cumin1001> | END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox | [production] | 
            
  | 09:14 | <ayounsi@cumin1001> | START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox | [production] | 
            
  | 09:06 | <brouberol@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1009.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 09:03 | <brouberol@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1009.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 08:50 | <brouberol@cumin1001> | START - Cookbook sre.hosts.reimage for host kafka-jumbo1009.eqiad.wmnet with OS bullseye | [production] | 
            
  | 08:49 | <urbanecm> | mwmaint2002: `foreachwikiindblist /srv/mediawiki/dblists/growthexperiments.dblist extensions/GrowthExperiments/maintenance/refreshUserImpactData.php --registeredWithin=1year --editedWithin=2week --hasEditsAtLeast=3 --ignoreIfUpdatedWithin=1second --verbose --use-job-queue` (testing T344428; after enabling backend on all Wikipedias) | [production] | 
            
  | 08:48 | <urbanecm@deploy2002> | Finished scap: Backport for [[gerrit:949034|Growth: Enable new Impact backend everywhere (T344143)]] (duration: 09m 29s) | [production] | 
            
  | 08:43 | <urbanecm@deploy2002> | urbanecm: Continuing with sync | [production] | 
            
  | 08:40 | <urbanecm@deploy2002> | urbanecm: Backport for [[gerrit:949034|Growth: Enable new Impact backend everywhere (T344143)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) | [production] | 
            
  | 08:40 | <kevinbazira@deploy2002> | helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . | [production] | 
            
  | 08:40 | <brouberol@cumin1001> | END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1008.eqiad.wmnet with OS bullseye | [production] | 
            
  | 08:39 | <urbanecm@deploy2002> | Started scap: Backport for [[gerrit:949034|Growth: Enable new Impact backend everywhere (T344143)]] | [production] | 
            
  | 08:32 | <kevinbazira@deploy2002> | helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' . | [production] | 
            
  | 08:32 | <urbanecm@deploy2002> | helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply | [production] | 
            
  | 08:31 | <urbanecm@deploy2002> | helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply | [production] | 
            
  | 08:29 | <urbanecm@deploy2002> | helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: apply | [production] | 
            
  | 08:28 | <urbanecm@deploy2002> | helmfile [eqiad] START helmfile.d/services/changeprop-jobqueue: apply | [production] | 
            
  | 08:28 | <urbanecm@deploy2002> | helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply | [production] | 
            
  | 08:27 | <urbanecm@deploy2002> | helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply | [production] | 
            
  | 08:24 | <brouberol@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-jumbo1008.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 08:21 | <brouberol@cumin1001> | START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-jumbo1008.eqiad.wmnet with reason: host reimage | [production] |