| 2024-05-13
      
      § | 
    
  | 18:03 | <swfrench@deploy1002> | helmfile [staging] START helmfile.d/services/blubberoid: apply | [production] | 
            
  | 17:40 | <cmooney@cumin1002> | END (PASS) - Cookbook sre.dns.netbox (exit_code=0) | [production] | 
            
  | 17:40 | <cmooney@cumin1002> | END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add records for new linknets on codfw spines - cmooney@cumin1002" | [production] | 
            
  | 17:39 | <cmooney@cumin1002> | START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add records for new linknets on codfw spines - cmooney@cumin1002" | [production] | 
            
  | 17:38 | <ebernhardson@deploy1002> | helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply | [production] | 
            
  | 17:38 | <ebernhardson@deploy1002> | helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply | [production] | 
            
  | 17:37 | <cmooney@cumin1002> | START - Cookbook sre.dns.netbox | [production] | 
            
  | 17:27 | <ryankemper> | T363973 [Kafka] Restarting `jumbo-eqiad` brokers, followed by mirror maker | [production] | 
            
  | 17:27 | <ryankemper@cumin2002> | START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-jumbo-eqiad | [production] | 
            
  | 17:12 | <dancy> | Ran `docker buildx prune --keep-storage 20000000000` on integration-agent-docker-1054 to free up ~19GB in /var/lib/docker. | [releng] | 
            
  | 17:05 | <dancy> | Updating jenkins jobs for https://gerrit.wikimedia.org/r/c/integration/config/+/1029621 | [releng] | 
            
  | 17:05 | <jayme@cumin1002> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestagemaster2005.codfw.wmnet with reason: host reimage | [production] | 
            
  | 17:02 | <jayme@cumin1002> | START - Cookbook sre.hosts.downtime for 2:00:00 on kubestagemaster2005.codfw.wmnet with reason: host reimage | [production] | 
            
  | 16:50 | <jayme@cumin1002> | START - Cookbook sre.hosts.reimage for host kubestagemaster2005.codfw.wmnet with OS bullseye | [production] | 
            
  | 16:49 | <jayme@cumin1002> | END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2005.codfw.wmnet to plain | [production] | 
            
  | 16:47 | <jayme@cumin1002> | START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2005.codfw.wmnet to plain | [production] | 
            
  | 16:47 | <jayme@cumin1002> | END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of kubestagemaster2004.codfw.wmnet to plain | [production] | 
            
  | 16:46 | <jayme@cumin1002> | START - Cookbook sre.ganeti.changedisk for changing disk type of kubestagemaster2004.codfw.wmnet to plain | [production] | 
            
  | 16:46 | <ejegg> | fundraising civicrm upgraded from c0d2fa95 to 4f55a7cf | [production] | 
            
  | 16:46 | <jayme@cumin1002> | END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=97) for new host kubestagemaster2005.codfw.wmnet | [production] | 
            
  | 16:46 | <jayme@cumin1002> | END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kubestagemaster2005.codfw.wmnet with OS bullseye | [production] | 
            
  | 16:34 | <brouberol@cumin2002> | END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: JVM restart - brouberol@cumin2002 - T363975 | [production] | 
            
  | 16:16 | <jdrewniak@deploy1002> | Synchronized portals: Wikimedia Portals Update: [[gerrit:1031006| Bumping portals to master (T128546)]] (duration: 13m 47s) | [production] | 
            
  | 16:13 | <ejegg> | restarted fundraising scheduled jobs | [production] | 
            
  | 16:11 | <ejegg> | fundraising civicrm rolled back from 3fef5849 to c0d2fa95 | [production] | 
            
  | 16:02 | <jdrewniak@deploy1002> | Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:1031006| Bumping portals to master (T128546)]] (duration: 14m 23s) | [production] | 
            
  | 15:59 | <ladsgroup@cumin1002> | dbctl commit (dc=all): 'Depooling db1170 (T352010)', diff saved to https://phabricator.wikimedia.org/P62370 and previous config saved to /var/cache/conftool/dbconfig/20240513-155911-ladsgroup.json | [production] | 
            
  | 15:59 | <ladsgroup@cumin1002> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 15:58 | <ladsgroup@cumin1002> | START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1170.eqiad.wmnet with reason: Maintenance | [production] | 
            
  | 15:58 | <ladsgroup@cumin1002> | dbctl commit (dc=all): 'Repooling after maintenance db1158 (T352010)', diff saved to https://phabricator.wikimedia.org/P62369 and previous config saved to /var/cache/conftool/dbconfig/20240513-155849-ladsgroup.json | [production] | 
            
  | 15:55 | <ejegg> | fundraising civicrm upgraded from c0d2fa95 to 3fef5849 | [production] | 
            
  | 15:54 | <ejegg> | disabled fundraising scheduled jobs for CiviCRM deploy | [production] | 
            
  | 15:49 | <herron@cumin1002> | END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-logging-codfw | [production] | 
            
  | 15:43 | <ladsgroup@cumin1002> | dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P62368 and previous config saved to /var/cache/conftool/dbconfig/20240513-154341-ladsgroup.json | [production] | 
            
  | 15:28 | <ladsgroup@cumin1002> | dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P62367 and previous config saved to /var/cache/conftool/dbconfig/20240513-152833-ladsgroup.json | [production] | 
            
  | 15:27 | <elukey@deploy1002> | helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . | [production] | 
            
  | 15:25 | <herron@cumin1002> | START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-logging-codfw | [production] | 
            
  | 15:19 | <marostegui@cumin1002> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance | [production] | 
            
  | 15:19 | <marostegui@cumin1002> | START - Cookbook sre.hosts.downtime for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance | [production] | 
            
  | 15:18 | <hashar> | deployment-prep: deleted security rule for 208.80.154.17 ssh and port 2 (sic) and allow 208.80.154.132 / contint1002 port 22 instead # T334517 | [releng] | 
            
  | 15:18 | <brouberol@cumin2002> | START - Cookbook sre.elasticsearch.rolling-operation Operation.RESTART (3 nodes at a time) for ElasticSearch cluster search_codfw: JVM restart - brouberol@cumin2002 - T363975 | [production] | 
            
  | 15:13 | <ladsgroup@cumin1002> | dbctl commit (dc=all): 'Repooling after maintenance db1158 (T352010)', diff saved to https://phabricator.wikimedia.org/P62366 and previous config saved to /var/cache/conftool/dbconfig/20240513-151325-ladsgroup.json | [production] | 
            
  | 14:55 | <Lucas_WMDE> | UTC afternoon backport+config window done | [production] | 
            
  | 14:50 | <logmsgbot> | lucaswerkmeister-wmde@deploy1002 Finished scap: Backport for [[gerrit:1025391|Include mw-jobrunner port in host header check]] (duration: 16m 04s) | [production] | 
            
  | 14:49 | <jayme@cumin1002> | END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host kubestagemaster2004.codfw.wmnet | [production] | 
            
  | 14:49 | <jayme@cumin1002> | END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubestagemaster2004.codfw.wmnet with OS bullseye | [production] | 
            
  | 14:46 | <James_F> | Zuul: [mediawiki/extensions/DiscordRCFeed] Add Flow as a phan dep too | [releng] | 
            
  | 14:42 | <jayme@cumin1002> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestagemaster2005.codfw.wmnet with reason: host reimage | [production] | 
            
  | 14:39 | <jayme@cumin1002> | START - Cookbook sre.hosts.downtime for 2:00:00 on kubestagemaster2005.codfw.wmnet with reason: host reimage | [production] | 
            
  | 14:37 | <logmsgbot> | lucaswerkmeister-wmde@deploy1002 lucaswerkmeister-wmde and hnowlan: Continuing with sync | [production] |