2026-06-04
08:00 <marostegui@cumin1003> START - Cookbook sre.hosts.downtime for 2:00:00 on db1224.eqiad.wmnet with reason: host reimage [production]
07:53 <marostegui> Install mariadb 10.11.17 on db2249 T427345 [production]
07:43 <marostegui@cumin1003> START - Cookbook sre.hosts.reimage for host db1224.eqiad.wmnet with OS trixie [production]
07:42 <marostegui@cumin1003> END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1224: Upgrading db1224.eqiad.wmnet [production]
07:41 <marostegui@cumin1003> START - Cookbook sre.mysql.depool depool db1224: Upgrading db1224.eqiad.wmnet [production]
07:41 <marostegui@cumin1003> START - Cookbook sre.mysql.major-upgrade [production]
07:39 <cwilliams@cumin1003> END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0) [production]
07:39 <cwilliams@cumin1003> END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1255: Migration of db1255.eqiad.wmnet completed [production]
07:34 <kharlan@deploy1003> Finished scap sync-world: Backport for [[gerrit:1297536|hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943)]], [[gerrit:1297200|hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929)]], [[gerrit:1297173|hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629)]] (duration: 08m 56s) [production]
07:29 <kharlan@deploy1003> kharlan, harroyo-wmf: Continuing with deployment [production]
07:27 <kharlan@deploy1003> kharlan, harroyo-wmf: Backport for [[gerrit:1297536|hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943)]], [[gerrit:1297200|hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929)]], [[gerrit:1297173|hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwd [production]
07:25 <kharlan@deploy1003> Started scap sync-world: Backport for [[gerrit:1297536|hCaptcha risk scores: VE plugin to collect risk scores for block notices (T426943)]], [[gerrit:1297200|hCaptcha: Render a fresh mobile widget for each captcha attempt (T425929)]], [[gerrit:1297173|hCaptcha: Enable risk-score collection for users blocked by IP blocks (T424629)]] [production]
07:24 <marostegui@cumin1003> END (PASS) - Cookbook sre.mysql.major-upgrade (exit_code=0) [production]
07:24 <marostegui@cumin1003> END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2191: Migration of db2191.codfw.wmnet completed [production]
07:12 <kharlan@deploy1003> Finished scap sync-world: Backport for [[gerrit:1297550|Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion"]] (duration: 06m 45s) [production]
07:08 <kharlan@deploy1003> kharlan: Continuing with deployment [production]
07:08 <kharlan@deploy1003> kharlan: Backport for [[gerrit:1297550|Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. [production]
07:06 <kharlan@deploy1003> Started scap sync-world: Backport for [[gerrit:1297550|Revert "EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion"]] [production]
07:04 <otto@deploy1003> Finished scap sync-world: Backport for [[gerrit:1297260|EventStreamConfig - webrequest.dumps.dev0 - enable canary events for hive ingestion (T425087)]] (duration: 399m 30s) [production]
07:03 <otto@deploy1003> otto: Rolling back deployment [production]
06:53 <cwilliams@cumin1003> START - Cookbook sre.mysql.pool pool db1255: Migration of db1255.eqiad.wmnet completed [production]
06:51 <cwilliams@cumin1003> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1255.eqiad.wmnet with OS trixie [production]
06:38 <marostegui@cumin1003> START - Cookbook sre.mysql.pool pool db2191: Migration of db2191.codfw.wmnet completed [production]
06:35 <cwilliams@cumin1003> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1255.eqiad.wmnet with reason: host reimage [production]
06:32 <marostegui@cumin1003> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2191.codfw.wmnet with OS trixie [production]
06:31 <cwilliams@cumin1003> START - Cookbook sre.hosts.downtime for 2:00:00 on db1255.eqiad.wmnet with reason: host reimage [production]
06:16 <cwilliams@cumin1003> START - Cookbook sre.hosts.reimage for host db1255.eqiad.wmnet with OS trixie [production]
06:15 <marostegui@cumin1003> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2191.codfw.wmnet with reason: host reimage [production]
06:13 <cwilliams@cumin1003> END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1255: Upgrading db1255.eqiad.wmnet [production]
06:12 <cwilliams@cumin1003> START - Cookbook sre.mysql.depool depool db1255: Upgrading db1255.eqiad.wmnet [production]
06:12 <cwilliams@cumin1003> START - Cookbook sre.mysql.major-upgrade [production]
06:11 <marostegui@cumin1003> START - Cookbook sre.hosts.downtime for 2:00:00 on db2191.codfw.wmnet with reason: host reimage [production]
06:04 <cwilliams@cumin1003> dbctl commit (dc=all): 'Depool db1255 T427895', diff saved to https://phabricator.wikimedia.org/P93836 and previous config saved to /var/cache/conftool/dbconfig/20260604-060428-cwilliams.json [production]
06:03 <cwilliams@dns1004> END - running authdns-update [production]
06:02 <cwilliams@dns1004> START - running authdns-update [production]
05:54 <cwilliams@cumin1003> dbctl commit (dc=all): 'Promote db1258 to x3 primary and set section read-write T427895', diff saved to https://phabricator.wikimedia.org/P93835 and previous config saved to /var/cache/conftool/dbconfig/20260604-055429-cwilliams.json [production]
05:53 <cwilliams@cumin1003> dbctl commit (dc=all): 'Set x3 eqiad as read-only for maintenance - T427895', diff saved to https://phabricator.wikimedia.org/P93834 and previous config saved to /var/cache/conftool/dbconfig/20260604-055346-cwilliams.json [production]
05:53 <cezmunsta> Starting x3 eqiad failover from db1255 to db1258 - T427895 [production]
05:52 <marostegui@cumin1003> START - Cookbook sre.hosts.reimage for host db2191.codfw.wmnet with OS trixie [production]
05:50 <marostegui@cumin1003> END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2191: Upgrading db2191.codfw.wmnet [production]
05:50 <marostegui@cumin1003> START - Cookbook sre.mysql.depool depool db2191: Upgrading db2191.codfw.wmnet [production]
05:50 <cwilliams@cumin1003> dbctl commit (dc=all): 'Set db1258 with weight 0 T427895', diff saved to https://phabricator.wikimedia.org/P93833 and previous config saved to /var/cache/conftool/dbconfig/20260604-055021-cwilliams.json [production]
05:50 <marostegui@cumin1003> START - Cookbook sre.mysql.major-upgrade [production]
05:50 <cwilliams@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 18 hosts with reason: Primary switchover x3 T427895 [production]
05:48 <kevinbazira@deploy1003> helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' . [production]
05:46 <marostegui@cumin1003> dbctl commit (dc=all): 'Depool db2191 T428120', diff saved to https://phabricator.wikimedia.org/P93832 and previous config saved to /var/cache/conftool/dbconfig/20260604-054614-marostegui.json [production]
05:45 <marostegui@cumin1003> dbctl commit (dc=all): 'Promote db2215 to x1 primary T428120', diff saved to https://phabricator.wikimedia.org/P93831 and previous config saved to /var/cache/conftool/dbconfig/20260604-054528-marostegui.json [production]
05:44 <marostegui> Starting x1 codfw failover from db2191 to db2215 - T428120 [production]
05:27 <marostegui@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 16 hosts with reason: Primary switchover x1 T428120 [production]
05:27 <marostegui@cumin1003> dbctl commit (dc=all): 'Set db2215 with weight 0 T428120', diff saved to https://phabricator.wikimedia.org/P93830 and previous config saved to /var/cache/conftool/dbconfig/20260604-052722-marostegui.json [production]