2022-08-02
04:15 <ryankemper> [Elastic] Blew away red index like so: `ryankemper@cumin1001:~$ curl -XDELETE https://search.svc.codfw.wmnet:9243/be_x_oldwiki_titlesuggest_1659407912`. Cluster is back to `green` status. [production]
04:07 <ryankemper> [Elastic] Per `curl -s https://search.svc.codfw.wmnet:9243/_cat/aliases | grep -i be_x` I see `be_x_oldwiki_titlesuggest` alias points to `be_x_oldwiki_titlesuggest_1658396688`. I think this means the red index is an old index from an in-progress reindex operation. I likely just need to delete `be_x_oldwiki_titlesuggest_1659407912` but doing some quick digging first [production]
04:04 <ryankemper> [Elastic] Red cluster status in main codfw elasticsearch cluster (`https://search.svc.codfw.wmnet:9243`); culprit appears to be index `be_x_oldwiki_titlesuggest_1659407912`. Confusingly it has 2 replicas set so it's not clear to me how we got into this state starting from green (in the past we've gone into red status from indices that erroneously had 0 replicas in production) [production]
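The three [Elastic] entries above (04:04 to 04:15) follow one line of reasoning: a red index that no alias points to is probably a leftover from a reindex and is a candidate for deletion. A minimal sketch of that alias check follows, assuming the default plain-text `_cat/aliases` output with the alias name in the first column and the index name in the second. The function names `parse_cat_aliases` and `orphaned_red_indices` are hypothetical and are not part of any tooling mentioned in the log.

```python
def parse_cat_aliases(text):
    """Map alias name -> set of index names.

    Assumes plain-text `_cat/aliases` output without a header row, where the
    first two whitespace-separated columns are the alias and the index.
    """
    aliases = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        alias, index = fields[0], fields[1]
        aliases.setdefault(alias, set()).add(index)
    return aliases


def orphaned_red_indices(red_indices, aliases):
    """Return the red indices that no alias currently points to."""
    live = set().union(*aliases.values())
    return [name for name in red_indices if name not in live]


# Sample input built from the index names in the log entries; the trailing
# "-" columns are placeholders and are ignored by the parser.
sample = "be_x_oldwiki_titlesuggest be_x_oldwiki_titlesuggest_1658396688 - - - -\n"
aliases = parse_cat_aliases(sample)
candidates = orphaned_red_indices(["be_x_oldwiki_titlesuggest_1659407912"], aliases)
```

An index turning up in `candidates` only means that no alias references it. As the 04:07 entry notes, it is still worth some digging before issuing a `DELETE`.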
03:47 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
03:46 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
03:46 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
03:45 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
03:40 <krinkle@deploy1002> Synchronized multiversion/: I0802db272695 (duration: 03m 10s) [production]
03:40 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
03:39 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
03:39 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
03:38 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
03:34 <krinkle@deploy1002> Synchronized wmf-config/: I9b89c0ff5c2 (duration: 03m 32s) [production]
03:33 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
03:32 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
03:32 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
03:31 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
03:27 <krinkle@deploy1002> Synchronized multiversion/: I6e97d39a3, Ib843ebced31 (duration: 03m 30s) [production]
03:26 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
03:25 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
03:25 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
03:24 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
03:22 <krinkle@mwmaint1002> pull aborted: (duration: 00m 11s) [production]
03:21 <krinkle@deploy1002> Synchronized wmf-config/CommonSettings.php: I39a2b86065 (duration: 03m 19s) [production]
03:20 <ryankemper@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host elastic2059.codfw.wmnet with OS bullseye [production]
03:15 <krinkle@deploy1002> Synchronized multiversion/: Ieaea60a991e5611 (duration: 03m 03s) [production]
03:14 <krinkle@mwmaint2002> pull aborted: (duration: 01m 36s) [production]
03:14 <krinkle@mwmaint1002> pull aborted: (duration: 01m 31s) [production]
03:13 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
03:12 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
03:12 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
03:11 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
02:58 <ryankemper@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2059.codfw.wmnet with reason: host reimage [production]
02:54 <ryankemper> [WDQS] `ryankemper@wdqs1012:~$ sudo systemctl restart wdqs-blazegraph.service` to clear `Query Service HTTP Port` && `WDQS SPARQL` alerts [production]
02:53 <ryankemper@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on elastic2059.codfw.wmnet with reason: host reimage [production]
02:36 <ryankemper@cumin1001> START - Cookbook sre.hosts.reimage for host elastic2059.codfw.wmnet with OS bullseye [production]
02:31 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
02:30 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
02:30 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
02:29 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
02:09 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
02:08 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
02:08 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
02:07 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
00:41 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
00:40 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
00:40 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
00:39 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
00:35 <krinkle@deploy1002> Synchronized wmf-config/CommonSettings.php: Ieaea60a991e5 (duration: 03m 10s) [production]
00:29 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]