2024-08-14
12:07 <arnaudb@cumin1002> dbctl commit (dc=all): 'db2189 (re)pooling @ 8%: corrupted index fixed', diff saved to https://phabricator.wikimedia.org/P67290 and previous config saved to /var/cache/conftool/dbconfig/20240814-120729-arnaudb.json [production]
11:52 <arnaudb@cumin1002> dbctl commit (dc=all): 'db2189 (re)pooling @ 4%: corrupted index fixed', diff saved to https://phabricator.wikimedia.org/P67289 and previous config saved to /var/cache/conftool/dbconfig/20240814-115223-arnaudb.json [production]
11:37 <arnaudb@cumin1002> dbctl commit (dc=all): 'db2189 (re)pooling @ 2%: corrupted index fixed', diff saved to https://phabricator.wikimedia.org/P67288 and previous config saved to /var/cache/conftool/dbconfig/20240814-113718-arnaudb.json [production]
11:23 <mvolz@deploy1003> helmfile [eqiad] DONE helmfile.d/services/citoid: apply [production]
11:23 <mvolz@deploy1003> helmfile [eqiad] START helmfile.d/services/citoid: apply [production]
11:22 <arnaudb@cumin1002> dbctl commit (dc=all): 'db2189 (re)pooling @ 1%: corrupted index fixed', diff saved to https://phabricator.wikimedia.org/P67287 and previous config saved to /var/cache/conftool/dbconfig/20240814-112212-arnaudb.json [production]
11:20 <mvolz@deploy1003> helmfile [codfw] DONE helmfile.d/services/citoid: apply [production]
11:19 <mvolz@deploy1003> helmfile [codfw] START helmfile.d/services/citoid: apply [production]
11:19 <mvolz@deploy1003> helmfile [staging] DONE helmfile.d/services/citoid: apply [production]
11:18 <mvolz@deploy1003> helmfile [staging] START helmfile.d/services/citoid: apply [production]
09:56 <fnegri@cumin1002> conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet,service=s1 [production]
09:26 <klausman@deploy1003> helmfile [codfw] DONE helmfile.d/services/api-gateway: apply [production]
09:26 <klausman@deploy1003> helmfile [codfw] START helmfile.d/services/api-gateway: apply [production]
09:23 <klausman@deploy1003> helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply [production]
09:23 <klausman@deploy1003> helmfile [eqiad] START helmfile.d/services/api-gateway: apply [production]
09:17 <klausman@deploy1003> helmfile [staging] DONE helmfile.d/services/api-gateway: apply [production]
09:16 <klausman@deploy1003> helmfile [staging] START helmfile.d/services/api-gateway: apply [production]
09:11 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2189.codfw.wmnet with reason: replication still catching up [production]
09:11 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 4:00:00 on db2189.codfw.wmnet with reason: replication still catching up [production]
08:53 <jayme@cumin1002> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host kafka-main2010.codfw.wmnet with OS bullseye [production]
08:46 <jayme@cumin1002> START - Cookbook sre.hosts.reimage for host kafka-main2010.codfw.wmnet with OS bullseye [production]
07:45 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db2189.codfw.wmnet with reason: index corruption [production]
07:45 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 4:00:00 on db2189.codfw.wmnet with reason: index corruption [production]
00:54 <eileen> config revision changed from d6f17100 to f569b590 [production]
00:41 <eileen> civicrm upgraded from dd54b9ae to eecbba5d [production]
00:11 <eileen> civicrm upgraded from 686c7c5f to dd54b9ae [production]
00:04 <eileen> config revision changed from e8cc0ed6 to d6f17100 [production]
2024-08-13
23:08 <ejegg> payments-wiki upgraded from 2d48f432 to 3eb3be67 [production]
21:56 <inflatador> bking@cumin2002 reboot wdqs101[3-5],1018,1020 from DRAC due to unresponsiveness T372442 [production]
21:16 <ebernhardson@deploy1003> helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply [production]
21:16 <ebernhardson@deploy1003> helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply [production]
21:15 <ebernhardson@deploy1003> helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply [production]
21:15 <ebernhardson@deploy1003> helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply [production]
21:09 <ebernhardson@deploy1003> helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply [production]
21:09 <ebernhardson@deploy1003> helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply [production]
21:07 <ebernhardson@deploy1003> helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply [production]
21:07 <ebernhardson@deploy1003> helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply [production]
20:51 <ryankemper@cumin2002> END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T370754, transfer fresh wdqs-main journal to codfw host) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs2021.codfw.wmnet w/ force delete existing files, repooling neither afterwards [production]
20:22 <brett> Update ncmonitor to 1.2.0 via apt1002 [production]
19:57 <ryankemper@cumin2002> START - Cookbook sre.wdqs.data-transfer (T370754, transfer fresh wdqs-main journal to codfw host) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs2021.codfw.wmnet w/ force delete existing files, repooling neither afterwards [production]
19:44 <ebernhardson@deploy1003> helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply [production]
19:43 <ebernhardson@deploy1003> helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply [production]
19:32 <bking@cumin2002> END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (2 nodes at a time) for ElasticSearch cluster search_eqiad: security update - bking@cumin2002 - T371874 [production]
19:29 <ryankemper@cumin2002> END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T370754, transfer fresh wdqs-main journal to codfw host) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs2021.codfw.wmnet, repooling neither afterwards [production]
19:27 <ryankemper@cumin2002> START - Cookbook sre.wdqs.data-transfer (T370754, transfer fresh wdqs-main journal to codfw host) xfer wikidata_main from wdqs1021.eqiad.wmnet -> wdqs2021.codfw.wmnet, repooling neither afterwards [production]
19:25 <ebernhardson@deploy1003> helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply [production]
19:25 <ebernhardson@deploy1003> helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply [production]
19:25 <ebernhardson@deploy1003> helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply [production]
19:25 <ebernhardson@deploy1003> helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply [production]
19:24 <ebernhardson@deploy1003> helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply [production]