2024-08-14
15:43 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'es1029 (re)pooling @ 25%: broken disk replaced, slow repooling', diff saved to https://phabricator.wikimedia.org/P67315 and previous config saved to /var/cache/conftool/dbconfig/20240814-154338-arnaudb.json |
[production] |
15:40 |
<dani@deploy1003> |
helmfile [codfw] DONE helmfile.d/services/miscweb: apply |
[production] |
15:39 |
<dani@deploy1003> |
helmfile [codfw] START helmfile.d/services/miscweb: apply |
[production] |
15:39 |
<dani@deploy1003> |
helmfile [eqiad] DONE helmfile.d/services/miscweb: apply |
[production] |
15:39 |
<dani@deploy1003> |
helmfile [eqiad] START helmfile.d/services/miscweb: apply |
[production] |
15:39 |
<dani@deploy1003> |
helmfile [staging] DONE helmfile.d/services/miscweb: apply |
[production] |
15:39 |
<dani@deploy1003> |
helmfile [staging] START helmfile.d/services/miscweb: apply |
[production] |
15:34 |
<jayme@cumin1002> |
START - Cookbook sre.hosts.reimage for host kafka-main2010.codfw.wmnet with OS bullseye |
[production] |
15:28 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'es1029 (re)pooling @ 16%: broken disk replaced, slow repooling', diff saved to https://phabricator.wikimedia.org/P67314 and previous config saved to /var/cache/conftool/dbconfig/20240814-152833-arnaudb.json |
[production] |
15:13 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'es1029 (re)pooling @ 8%: broken disk replaced, slow repooling', diff saved to https://phabricator.wikimedia.org/P67312 and previous config saved to /var/cache/conftool/dbconfig/20240814-151328-arnaudb.json |
[production] |
14:59 |
<klausman@cumin2002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve2010.codfw.wmnet |
[production] |
14:58 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'es1029 (re)pooling @ 4%: broken disk replaced, slow repooling', diff saved to https://phabricator.wikimedia.org/P67307 and previous config saved to /var/cache/conftool/dbconfig/20240814-145819-arnaudb.json |
[production] |
14:53 |
<klausman@cumin2002> |
START - Cookbook sre.hosts.reboot-single for host ml-serve2010.codfw.wmnet |
[production] |
14:49 |
<jayme@cumin1002> |
END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host kafka-main2010.codfw.wmnet with OS bookworm |
[production] |
14:43 |
<jayme@cumin1002> |
START - Cookbook sre.hosts.reimage for host kafka-main2010.codfw.wmnet with OS bookworm |
[production] |
14:43 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'es1029 (re)pooling @ 2%: broken disk replaced, slow repooling', diff saved to https://phabricator.wikimedia.org/P67305 and previous config saved to /var/cache/conftool/dbconfig/20240814-144314-arnaudb.json |
[production] |
14:32 |
<elukey@deploy1003> |
helmfile [eqiad] DONE helmfile.d/services/thumbor: sync |
[production] |
14:28 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'es1029 (re)pooling @ 1%: broken disk replaced, slow repooling', diff saved to https://phabricator.wikimedia.org/P67304 and previous config saved to /var/cache/conftool/dbconfig/20240814-142808-arnaudb.json |
[production] |
14:27 |
<elukey@deploy1003> |
helmfile [eqiad] START helmfile.d/services/thumbor: sync |
[production] |
14:22 |
<elukey@deploy1003> |
helmfile [codfw] DONE helmfile.d/services/thumbor: sync |
[production] |
14:21 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'es1 es1029 depooling for hdd hotswap', diff saved to https://phabricator.wikimedia.org/P67299 and previous config saved to /var/cache/conftool/dbconfig/20240814-142147-arnaudb.json |
[production] |
14:21 |
<ebernhardson@deploy1003> |
Synchronized private/PrivateSettings.php: Update NetworkSession users list for T341332 (duration: 12m 33s) |
[production] |
14:17 |
<elukey@deploy1003> |
helmfile [codfw] START helmfile.d/services/thumbor: sync |
[production] |
13:55 |
<elukey@deploy1003> |
helmfile [staging] DONE helmfile.d/services/thumbor: sync |
[production] |
13:55 |
<elukey@deploy1003> |
helmfile [staging] START helmfile.d/services/thumbor: sync |
[production] |
13:52 |
<hnowlan@deploy1003> |
helmfile [codfw] DONE helmfile.d/services/thumbor: sync |
[production] |
13:50 |
<hnowlan@deploy1003> |
helmfile [codfw] START helmfile.d/services/thumbor: sync |
[production] |
13:33 |
<kartik@deploy1003> |
Finished scap sync-world: Backport for [[gerrit:1062696|Use the updated recommendation API from liftwing (T371465)]] (duration: 07m 51s) |
[production] |
13:32 |
<jayme@cumin1002> |
END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-main2010.codfw.wmnet'] |
[production] |
13:29 |
<kartik@deploy1003> |
kartik: Continuing with sync |
[production] |
13:28 |
<kartik@deploy1003> |
kartik: Backport for [[gerrit:1062696|Use the updated recommendation API from liftwing (T371465)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
13:25 |
<kartik@deploy1003> |
Started scap sync-world: Backport for [[gerrit:1062696|Use the updated recommendation API from liftwing (T371465)]] |
[production] |
13:25 |
<kartik@deploy1003> |
Finished scap sync-world: Backport for [[gerrit:1062697|Use the updated recommendation API from liftwing (T371465)]] (duration: 08m 37s) |
[production] |
13:22 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'db2189 (re)pooling @ 100%: corrupted index fixed', diff saved to https://phabricator.wikimedia.org/P67296 and previous config saved to /var/cache/conftool/dbconfig/20240814-132256-arnaudb.json |
[production] |
13:22 |
<jayme@cumin1002> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-main2010.codfw.wmnet'] |
[production] |
13:20 |
<kartik@deploy1003> |
kartik: Continuing with sync |
[production] |
13:18 |
<kartik@deploy1003> |
kartik: Backport for [[gerrit:1062697|Use the updated recommendation API from liftwing (T371465)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
13:18 |
<ebernhardson@deploy1003> |
helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
13:18 |
<ebernhardson@deploy1003> |
helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
13:16 |
<kartik@deploy1003> |
Started scap sync-world: Backport for [[gerrit:1062697|Use the updated recommendation API from liftwing (T371465)]] |
[production] |
13:14 |
<ebernhardson@deploy1003> |
helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
13:14 |
<ebernhardson@deploy1003> |
helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
13:11 |
<ebernhardson@deploy1003> |
helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
13:11 |
<ebernhardson@deploy1003> |
helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
13:07 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'db2189 (re)pooling @ 75%: corrupted index fixed', diff saved to https://phabricator.wikimedia.org/P67295 and previous config saved to /var/cache/conftool/dbconfig/20240814-130750-arnaudb.json |
[production] |
12:52 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'db2189 (re)pooling @ 50%: corrupted index fixed', diff saved to https://phabricator.wikimedia.org/P67293 and previous config saved to /var/cache/conftool/dbconfig/20240814-125245-arnaudb.json |
[production] |
12:49 |
<arnaudb@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on 9 hosts with reason: replication table exclusion deployment |
[production] |
12:49 |
<arnaudb@cumin1002> |
START - Cookbook sre.hosts.downtime for 0:20:00 on 9 hosts with reason: replication table exclusion deployment |
[production] |
12:37 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'db2189 (re)pooling @ 25%: corrupted index fixed', diff saved to https://phabricator.wikimedia.org/P67292 and previous config saved to /var/cache/conftool/dbconfig/20240814-123739-arnaudb.json |
[production] |
12:22 |
<arnaudb@cumin1002> |
dbctl commit (dc=all): 'db2189 (re)pooling @ 16%: corrupted index fixed', diff saved to https://phabricator.wikimedia.org/P67291 and previous config saved to /var/cache/conftool/dbconfig/20240814-122234-arnaudb.json |
[production] |