| 2025-08-05 |

  | 01:13 | <vriley@cumin1002> | START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1002" | [production] | 
            
  | 01:11 | <mwpresync@deploy1003> | Finished scap build-images: Publishing wmf/next image (duration: 10m 57s) | [production] | 
            
  | 01:03 | <jhancock@cumin1003> | START - Cookbook sre.hosts.reimage for host dbprov2007.codfw.wmnet with OS bookworm | [production] | 
            
  | 01:00 | <mwpresync@deploy1003> | Started scap build-images: Publishing wmf/next image | [production] | 
            
  | 01:00 | <jhancock@cumin1003> | END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['dbprov2007'] | [production] | 
            
  | 00:59 | <jhancock@cumin1003> | START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbprov2007'] | [production] | 
            
  | 00:53 | <vriley@cumin1002> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1043.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 00:51 | <jhancock@cumin1003> | END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbprov2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED | [production] | 
            
  | 00:47 | <vriley@cumin1002> | START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1043.eqiad.wmnet with reason: host reimage | [production] | 
            
  | 00:38 | <jhancock@cumin1003> | START - Cookbook sre.hosts.provision for host dbprov2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED | [production] | 
            
  | 00:29 | <vriley@cumin1002> | END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudcephosd1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED | [production] | 
            
  | 00:28 | <vriley@cumin1002> | START - Cookbook sre.hosts.reimage for host cloudcephosd1043.eqiad.wmnet with OS bookworm | [production] | 
            
  | 00:25 | <jhancock@cumin1003> | END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbprov2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED | [production] | 
            
  | 00:22 | <vriley@cumin1002> | END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1043.eqiad.wmnet with OS bookworm | [production] | 
            
  | 00:16 | <vriley@cumin1002> | START - Cookbook sre.hosts.provision for host cloudcephosd1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED | [production] | 
            
  | 00:10 | <jhancock@cumin1003> | START - Cookbook sre.hosts.provision for host dbprov2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED | [production] | 
            
  | 00:10 | <jhancock@cumin1003> | END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dbprov2007 | [production] | 
            
  | 00:10 | <jhancock@cumin1003> | START - Cookbook sre.network.configure-switch-interfaces for host dbprov2007 | [production] | 
            
  | 00:09 | <jhancock@cumin1003> | END (PASS) - Cookbook sre.dns.netbox (exit_code=0) | [production] | 
            
  | 00:09 | <jhancock@cumin1003> | END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dbprov2007 to codfw - jhancock@cumin1003" | [production] | 
            
  | 00:09 | <jhancock@cumin1003> | START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dbprov2007 to codfw - jhancock@cumin1003" | [production] | 
            
  | 00:08 | <vriley@cumin1002> | START - Cookbook sre.hosts.reimage for host cloudcephosd1043.eqiad.wmnet with OS bookworm | [production] | 
            
  | 00:08 | <vriley@cumin1002> | END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1043.eqiad.wmnet with OS bookworm | [production] | 
            
  | 00:06 | <jhancock@cumin1003> | START - Cookbook sre.dns.netbox | [production] | 
            
  
  | 2025-08-04 |

  | 23:42 | <vriley@cumin1002> | START - Cookbook sre.hosts.reimage for host cloudcephosd1043.eqiad.wmnet with OS bookworm | [production] | 
            
  | 21:46 | <ladsgroup@cumin1002> | dbctl commit (dc=all): 'Repooling after maintenance db2222 (T400854)', diff saved to https://phabricator.wikimedia.org/P80782 and previous config saved to /var/cache/conftool/dbconfig/20250804-214644-ladsgroup.json | [production] | 
            
  | 21:39 | <kemayo@deploy1003> | Finished scap sync-world: Backport for [[gerrit:1175575|Change search teardown focus to not use an over-broad route (T401090)]] (duration: 08m 08s) | [production] | 
            
  | 21:33 | <kemayo@deploy1003> | kemayo: Continuing with sync | [production] | 
            
  | 21:32 | <kemayo@deploy1003> | kemayo: Backport for [[gerrit:1175575|Change search teardown focus to not use an over-broad route (T401090)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. | [production] | 
            
  | 21:31 | <ladsgroup@cumin1002> | dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P80781 and previous config saved to /var/cache/conftool/dbconfig/20250804-213136-ladsgroup.json | [production] | 
            
  | 21:31 | <kemayo@deploy1003> | Started scap sync-world: Backport for [[gerrit:1175575|Change search teardown focus to not use an over-broad route (T401090)]] | [production] | 
            
  | 21:16 | <ebernhardson@deploy1003> | Finished scap sync-world: Backport for [[gerrit:1175562|Revert "cirrus: Start AB test of completion suggester fuzziness" (T397732)]], [[gerrit:1175566|Clean up CirrusSearch settings on ex-wikipedia special wikis (T400062)]] (duration: 08m 06s) | [production] | 
            
  | 21:16 | <ladsgroup@cumin1002> | dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P80780 and previous config saved to /var/cache/conftool/dbconfig/20250804-211628-ladsgroup.json | [production] | 
            
  | 21:14 | <vriley@cumin1002> | END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1043.eqiad.wmnet with OS bullseye | [production] | 
            
  | 21:11 | <ebernhardson@deploy1003> | ebernhardson: Continuing with sync | [production] | 
            
  | 21:10 | <ebernhardson@deploy1003> | ebernhardson: Backport for [[gerrit:1175562|Revert "cirrus: Start AB test of completion suggester fuzziness" (T397732)]], [[gerrit:1175566|Clean up CirrusSearch settings on ex-wikipedia special wikis (T400062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. | [production] | 
            
  | 21:08 | <ebernhardson@deploy1003> | Started scap sync-world: Backport for [[gerrit:1175562|Revert "cirrus: Start AB test of completion suggester fuzziness" (T397732)]], [[gerrit:1175566|Clean up CirrusSearch settings on ex-wikipedia special wikis (T400062)]] | [production] | 
            
  | 21:03 | <cjming@deploy1003> | Finished scap sync-world: Backport for [[gerrit:1175511|Clear edit count when unattaching local users for rename (T313900)]], [[gerrit:1175512|fixStuckGlobalRename: Fix using actor_id from the wrong wiki (T398177)]], [[gerrit:1175574|SessionManager: Add $sessionWriteReason to shutdown and when saves are triggered from the destructor (T400249)]] (duration: 07m 36s) | [production] | 
            
  | 21:01 | <ladsgroup@cumin1002> | dbctl commit (dc=all): 'Repooling after maintenance db2222 (T400854)', diff saved to https://phabricator.wikimedia.org/P80779 and previous config saved to /var/cache/conftool/dbconfig/20250804-210119-ladsgroup.json | [production] | 
            
  | 20:58 | <ladsgroup@cumin1002> | dbctl commit (dc=all): 'Depooling db2222 (T400854)', diff saved to https://phabricator.wikimedia.org/P80778 and previous config saved to /var/cache/conftool/dbconfig/20250804-205837-ladsgroup.json | [production] | 
            
  | 20:58 | <ladsgroup@cumin1002> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2222.codfw.wmnet with reason: Maintenance | [production] | 
            
  | 20:58 | <ladsgroup@cumin1002> | dbctl commit (dc=all): 'Repooling after maintenance db2221 (T400854)', diff saved to https://phabricator.wikimedia.org/P80777 and previous config saved to /var/cache/conftool/dbconfig/20250804-205813-ladsgroup.json | [production] | 
            
  | 20:57 | <cjming@deploy1003> | matmarex, cjming: Continuing with sync | [production] | 
            
  | 20:57 | <cjming@deploy1003> | matmarex, cjming: Backport for [[gerrit:1175511|Clear edit count when unattaching local users for rename (T313900)]], [[gerrit:1175512|fixStuckGlobalRename: Fix using actor_id from the wrong wiki (T398177)]], [[gerrit:1175574|SessionManager: Add $sessionWriteReason to shutdown and when saves are triggered from the destructor (T400249)]] synced to the testservers (see https://wikitech.wikimedia.org/w | [production] | 
            
  | 20:55 | <cjming@deploy1003> | Started scap sync-world: Backport for [[gerrit:1175511|Clear edit count when unattaching local users for rename (T313900)]], [[gerrit:1175512|fixStuckGlobalRename: Fix using actor_id from the wrong wiki (T398177)]], [[gerrit:1175574|SessionManager: Add $sessionWriteReason to shutdown and when saves are triggered from the destructor (T400249)]] | [production] | 
            
  | 20:45 | <ottomata> | eventgate-analytics in eqiad cannot be deployed due to stuck helm STATUS: pending-upgrade.  This needs to be deployed to rollback to a version that doesn't cause logspam.  cc cwhite, rzl - T376026 | [production] | 
            
  | 20:43 | <ladsgroup@cumin1002> | dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P80776 and previous config saved to /var/cache/conftool/dbconfig/20250804-204305-ladsgroup.json | [production] | 
            
  | 20:39 | <Daimona> | mwscript-k8s --comment="T397270" -f --file /srv/mediawiki/php-1.45.0-wmf.12/extensions/CampaignEvents/maintenance/countryExceptionMappings.csv -- CampaignEvents:UpdateCountriesColumn --wiki metawiki --exceptions countryExceptionMappings.csv --commit | [production] | 
            
  | 20:37 | <Daimona> | mwscript-k8s --comment="T397270" -f --file /srv/mediawiki/php-1.45.0-wmf.12/extensions/CampaignEvents/maintenance/countryExceptionMappings.csv -- CampaignEvents:UpdateCountriesColumn --wiki officewiki --exceptions countryExceptionMappings.csv --commit | [production] | 
            
  | 20:36 | <otto@deploy1003> | helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply | [production] |