2024-08-14
14:49 <jayme@cumin1002> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host kafka-main2010.codfw.wmnet with OS bookworm [production]
14:43 <jayme@cumin1002> START - Cookbook sre.hosts.reimage for host kafka-main2010.codfw.wmnet with OS bookworm [production]
14:43 <arnaudb@cumin1002> dbctl commit (dc=all): 'es1029 (re)pooling @ 2%: broken disk replaced, slow repooling', diff saved to https://phabricator.wikimedia.org/P67305 and previous config saved to /var/cache/conftool/dbconfig/20240814-144314-arnaudb.json [production]
14:32 <elukey@deploy1003> helmfile [eqiad] DONE helmfile.d/services/thumbor: sync [production]
14:28 <arnaudb@cumin1002> dbctl commit (dc=all): 'es1029 (re)pooling @ 1%: broken disk replaced, slow repooling', diff saved to https://phabricator.wikimedia.org/P67304 and previous config saved to /var/cache/conftool/dbconfig/20240814-142808-arnaudb.json [production]
14:27 <elukey@deploy1003> helmfile [eqiad] START helmfile.d/services/thumbor: sync [production]
14:22 <elukey@deploy1003> helmfile [codfw] DONE helmfile.d/services/thumbor: sync [production]
14:21 <arnaudb@cumin1002> dbctl commit (dc=all): 'es1 es1029 depooling for hdd hotswap', diff saved to https://phabricator.wikimedia.org/P67299 and previous config saved to /var/cache/conftool/dbconfig/20240814-142147-arnaudb.json [production]
14:21 <ebernhardson@deploy1003> Synchronized private/PrivateSettings.php: Update NetworkSession users list for T341332 (duration: 12m 33s) [production]
14:17 <elukey@deploy1003> helmfile [codfw] START helmfile.d/services/thumbor: sync [production]
13:55 <elukey@deploy1003> helmfile [staging] DONE helmfile.d/services/thumbor: sync [production]
13:55 <elukey@deploy1003> helmfile [staging] START helmfile.d/services/thumbor: sync [production]
13:52 <hnowlan@deploy1003> helmfile [codfw] DONE helmfile.d/services/thumbor: sync [production]
13:50 <hnowlan@deploy1003> helmfile [codfw] START helmfile.d/services/thumbor: sync [production]
13:33 <kartik@deploy1003> Finished scap sync-world: Backport for [[gerrit:1062696|Use the updated recommendation API from liftwing (T371465)]] (duration: 07m 51s) [production]
13:32 <jayme@cumin1002> END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-main2010.codfw.wmnet'] [production]
13:29 <kartik@deploy1003> kartik: Continuing with sync [production]
13:28 <kartik@deploy1003> kartik: Backport for [[gerrit:1062696|Use the updated recommendation API from liftwing (T371465)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
13:25 <kartik@deploy1003> Started scap sync-world: Backport for [[gerrit:1062696|Use the updated recommendation API from liftwing (T371465)]] [production]
13:25 <kartik@deploy1003> Finished scap sync-world: Backport for [[gerrit:1062697|Use the updated recommendation API from liftwing (T371465)]] (duration: 08m 37s) [production]
13:22 <arnaudb@cumin1002> dbctl commit (dc=all): 'db2189 (re)pooling @ 100%: corrupted index fixed', diff saved to https://phabricator.wikimedia.org/P67296 and previous config saved to /var/cache/conftool/dbconfig/20240814-132256-arnaudb.json [production]
13:22 <jayme@cumin1002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-main2010.codfw.wmnet'] [production]
13:20 <kartik@deploy1003> kartik: Continuing with sync [production]
13:18 <kartik@deploy1003> kartik: Backport for [[gerrit:1062697|Use the updated recommendation API from liftwing (T371465)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
13:18 <ebernhardson@deploy1003> helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply [production]
13:18 <ebernhardson@deploy1003> helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply [production]
13:16 <kartik@deploy1003> Started scap sync-world: Backport for [[gerrit:1062697|Use the updated recommendation API from liftwing (T371465)]] [production]
13:14 <ebernhardson@deploy1003> helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply [production]
13:14 <ebernhardson@deploy1003> helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply [production]
13:11 <ebernhardson@deploy1003> helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply [production]
13:11 <ebernhardson@deploy1003> helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply [production]
13:07 <arnaudb@cumin1002> dbctl commit (dc=all): 'db2189 (re)pooling @ 75%: corrupted index fixed', diff saved to https://phabricator.wikimedia.org/P67295 and previous config saved to /var/cache/conftool/dbconfig/20240814-130750-arnaudb.json [production]
12:52 <arnaudb@cumin1002> dbctl commit (dc=all): 'db2189 (re)pooling @ 50%: corrupted index fixed', diff saved to https://phabricator.wikimedia.org/P67293 and previous config saved to /var/cache/conftool/dbconfig/20240814-125245-arnaudb.json [production]
12:49 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on 9 hosts with reason: replication table exclusion deployment [production]
12:49 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 0:20:00 on 9 hosts with reason: replication table exclusion deployment [production]
12:37 <arnaudb@cumin1002> dbctl commit (dc=all): 'db2189 (re)pooling @ 25%: corrupted index fixed', diff saved to https://phabricator.wikimedia.org/P67292 and previous config saved to /var/cache/conftool/dbconfig/20240814-123739-arnaudb.json [production]
12:22 <arnaudb@cumin1002> dbctl commit (dc=all): 'db2189 (re)pooling @ 16%: corrupted index fixed', diff saved to https://phabricator.wikimedia.org/P67291 and previous config saved to /var/cache/conftool/dbconfig/20240814-122234-arnaudb.json [production]
12:07 <arnaudb@cumin1002> dbctl commit (dc=all): 'db2189 (re)pooling @ 8%: corrupted index fixed', diff saved to https://phabricator.wikimedia.org/P67290 and previous config saved to /var/cache/conftool/dbconfig/20240814-120729-arnaudb.json [production]
11:52 <arnaudb@cumin1002> dbctl commit (dc=all): 'db2189 (re)pooling @ 4%: corrupted index fixed', diff saved to https://phabricator.wikimedia.org/P67289 and previous config saved to /var/cache/conftool/dbconfig/20240814-115223-arnaudb.json [production]
11:37 <arnaudb@cumin1002> dbctl commit (dc=all): 'db2189 (re)pooling @ 2%: corrupted index fixed', diff saved to https://phabricator.wikimedia.org/P67288 and previous config saved to /var/cache/conftool/dbconfig/20240814-113718-arnaudb.json [production]
11:23 <mvolz@deploy1003> helmfile [eqiad] DONE helmfile.d/services/citoid: apply [production]
11:23 <mvolz@deploy1003> helmfile [eqiad] START helmfile.d/services/citoid: apply [production]
11:22 <arnaudb@cumin1002> dbctl commit (dc=all): 'db2189 (re)pooling @ 1%: corrupted index fixed', diff saved to https://phabricator.wikimedia.org/P67287 and previous config saved to /var/cache/conftool/dbconfig/20240814-112212-arnaudb.json [production]
11:20 <mvolz@deploy1003> helmfile [codfw] DONE helmfile.d/services/citoid: apply [production]
11:19 <mvolz@deploy1003> helmfile [codfw] START helmfile.d/services/citoid: apply [production]
11:19 <mvolz@deploy1003> helmfile [staging] DONE helmfile.d/services/citoid: apply [production]
11:18 <mvolz@deploy1003> helmfile [staging] START helmfile.d/services/citoid: apply [production]
09:56 <fnegri@cumin1002> conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet,service=s1 [production]
09:26 <klausman@deploy1003> helmfile [codfw] DONE helmfile.d/services/api-gateway: apply [production]
09:26 <klausman@deploy1003> helmfile [codfw] START helmfile.d/services/api-gateway: apply [production]