|
2026-04-29
§
|
| 09:20 |
<marostegui@cumin1003> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1229.eqiad.wmnet with reason: Reimage to Trixie |
[production] |
| 09:19 |
<marostegui@cumin1003> |
START - Cookbook sre.mysql.depool depool db2175: Reimage to Trixie |
[production] |
| 09:19 |
<marostegui@cumin1003> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2175.codfw.wmnet with reason: Reimage to Trixie |
[production] |
| 09:15 |
<fceratto@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P91862 and previous config saved to /var/cache/conftool/dbconfig/20260429-091542-fceratto.json |
[production] |
| 09:13 |
<marostegui@cumin1003> |
END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2189: after reimage to trixie |
[production] |
| 09:10 |
<marostegui@cumin1003> |
END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1233: after reimage to trixie |
[production] |
| 09:05 |
<fceratto@cumin1003> |
dbctl commit (dc=all): 'Repooling after maintenance db1174 (T419961)', diff saved to https://phabricator.wikimedia.org/P91857 and previous config saved to /var/cache/conftool/dbconfig/20260429-090534-fceratto.json |
[production] |
| 09:01 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.reimage for host ganeti5005.eqsin.wmnet with OS bookworm |
[production] |
| 08:56 |
<fceratto@cumin1003> |
dbctl commit (dc=all): 'Depooling db1174 (T419961)', diff saved to https://phabricator.wikimedia.org/P91854 and previous config saved to /var/cache/conftool/dbconfig/20260429-085654-fceratto.json |
[production] |
| 08:56 |
<fceratto@cumin1003> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1174.eqiad.wmnet with reason: Maintenance |
[production] |
| 08:54 |
<marostegui@cumin1003> |
START - Cookbook sre.mysql.pool pool db2194: after reimage to trixie |
[production] |
| 08:51 |
<dpogorzelski@deploy1003> |
helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' . |
[production] |
| 08:51 |
<fceratto@cumin1003> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1171.eqiad.wmnet with reason: Maintenance |
[production] |
| 08:48 |
<marostegui@cumin1003> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2194.codfw.wmnet with OS trixie |
[production] |
| 08:45 |
<marostegui@cumin1003> |
START - Cookbook sre.mysql.pool pool db1175: after reimage to trixie |
[production] |
| 08:42 |
<ryankemper@cumin2002> |
END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudelastic1007.eqiad.wmnet with OS trixie |
[production] |
| 08:40 |
<marostegui@cumin1003> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1175.eqiad.wmnet with OS trixie |
[production] |
| 08:38 |
<urbanecm@deploy1003> |
mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki Wikimedia_Apps/Team/Android/TriviaGame 'Wikimedia Apps/Team/Android/"Which came first?" Game' 'Martin Urbanec (WMF)' '--reason=per [[:phab:T423845]]' # T423845 |
[production] |
| 08:38 |
<urbanecm@deploy1003> |
mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki Wikimedia_Apps/Team/Android/TriviaGame 'Wikimedia Apps/Team/Android/"Which came first?" Game' 'Martin Urbanec (WMF)' '--reason=per [[:phab:T423845]]' # T423845 |
[production] |
| 08:37 |
<urbanecm@deploy1003> |
mwscript-k8s job started: extensions/Translate/scripts/moveTranslatableBundle.php --wiki=mediawikiwiki Wikimedia_Apps/Team/Android/TriviaGame 'Wikimedia Apps/Team/Android/Which' came 'first? Game' 'Martin Urbanec (WMF)' '--reason=per [[:phab:T423845]]' # T423845 |
[production] |
| 08:29 |
<elukey@deploy1003> |
helmfile [staging] DONE helmfile.d/services/wikifunctions: sync |
[production] |
| 08:29 |
<elukey@deploy1003> |
helmfile [staging] START helmfile.d/services/wikifunctions: sync |
[production] |
| 08:28 |
<marostegui@cumin1003> |
START - Cookbook sre.mysql.pool pool db2189: after reimage to trixie |
[production] |
| 08:24 |
<marostegui@cumin1003> |
START - Cookbook sre.mysql.pool pool db1233: after reimage to trixie |
[production] |
| 08:24 |
<marostegui@cumin1003> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2194.codfw.wmnet with reason: host reimage |
[production] |
| 08:24 |
<marostegui@cumin1003> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2189.codfw.wmnet with OS trixie |
[production] |
| 08:21 |
<marostegui@cumin1003> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1233.eqiad.wmnet with OS trixie |
[production] |
| 08:21 |
<marostegui@cumin1003> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db2194.codfw.wmnet with reason: host reimage |
[production] |
| 08:18 |
<marostegui@cumin1003> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1175.eqiad.wmnet with reason: host reimage |
[production] |
| 08:18 |
<Emperor> |
re-enable puppet in apus/codfw for TLS key rollover T424674 (no change, incident took over) |
[production] |
| 08:16 |
<Emperor> |
disable puppet in apus/codfw for TLS key rollover T424674 |
[production] |
| 08:14 |
<marostegui@cumin1003> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db1175.eqiad.wmnet with reason: host reimage |
[production] |
| 08:09 |
<dpogorzelski@deploy1003> |
helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' . |
[production] |
| 08:08 |
<a-pizzata@deploy1003> |
Finished deploy [analytics/refinery@d6a17a0] (thin): Regular analytics weekly train THIN [analytics/refinery@d6a17a0a] (duration: 01m 54s) |
[production] |
| 08:06 |
<a-pizzata@deploy1003> |
Started deploy [analytics/refinery@d6a17a0] (thin): Regular analytics weekly train THIN [analytics/refinery@d6a17a0a] |
[production] |
| 08:02 |
<marostegui@cumin1003> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2189.codfw.wmnet with reason: host reimage |
[production] |
| 07:59 |
<a-pizzata@deploy1003> |
Finished deploy [analytics/refinery@d6a17a0]: Regular analytics weekly train [analytics/refinery@d6a17a0a] (duration: 04m 12s) |
[production] |
| 07:59 |
<marostegui@cumin1003> |
START - Cookbook sre.hosts.reimage for host db2194.codfw.wmnet with OS trixie |
[production] |
| 07:59 |
<marostegui@cumin1003> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1233.eqiad.wmnet with reason: host reimage |
[production] |
| 07:58 |
<marostegui@cumin1003> |
START - Cookbook sre.hosts.reimage for host db1175.eqiad.wmnet with OS trixie |
[production] |
| 07:57 |
<marostegui@cumin1003> |
END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db2194: Reimage to Trixie |
[production] |
| 07:57 |
<marostegui@cumin1003> |
START - Cookbook sre.mysql.depool depool db2194: Reimage to Trixie |
[production] |
| 07:57 |
<marostegui@cumin1003> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2194.codfw.wmnet with reason: Reimage to Trixie |
[production] |
| 07:56 |
<marostegui@cumin1003> |
END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db2227: after reimage to trixie |
[production] |
| 07:56 |
<marostegui@cumin1003> |
END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1175: Reimage to Trixie |
[production] |
| 07:56 |
<marostegui@cumin1003> |
START - Cookbook sre.mysql.depool depool db1175: Reimage to Trixie |
[production] |
| 07:55 |
<marostegui@cumin1003> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1175.eqiad.wmnet with reason: Reimage to Trixie |
[production] |
| 07:55 |
<a-pizzata@deploy1003> |
Started deploy [analytics/refinery@d6a17a0]: Regular analytics weekly train [analytics/refinery@d6a17a0a] |
[production] |
| 07:55 |
<a-pizzata@deploy1003> |
Finished deploy [analytics/refinery@d6a17a0] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@d6a17a0a] (duration: 01m 57s) |
[production] |
| 07:53 |
<marostegui@cumin1003> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db2189.codfw.wmnet with reason: host reimage |
[production] |