2024-02-03
|
13:30 |
<eevans@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase2017.codfw.wmnet with reason: Decommissioning — T352469 |
[production] |
13:30 |
<eevans@cumin1002> |
START - Cookbook sre.hosts.downtime for 30 days, 0:00:00 on restbase2017.codfw.wmnet with reason: Decommissioning — T352469 |
[production] |
08:19 |
<ryankemper> |
[cloudelastic] Replica shards have re-initialized; cluster is back to green. Will probably see a wall of `ElasticSearch unassigned shard check - 9400` resolve messages soon, fingers crossed |
[production] |
08:15 |
<ryankemper> |
[cloudelastic] Re-enabled replica allocation on `cloudelastic-omega-eqiad` => `curl -H 'Content-Type: application/json' -XPUT https://cloudelastic.wikimedia.org:9443/_cluster/settings -d '{"transient":{"cluster.routing.allocation":{"enable": "all"}}}'` |
[production] |
08:10 |
<ryankemper> |
[cloudelastic] Seeing `replica allocations are forbidden due to cluster setting [cluster.routing.allocation.enable=primaries]`; that likely explains the many unassigned shards of cloudelastic.wikimedia.org:9400 ... feels like a previous cookbook run didn't back out successfully, leaving replica allocation disabled |
[production] |
08:09 |
<ryankemper> |
[cloudelastic] current state: `{"cluster_name":"cloudelastic-omega-eqiad","status":"yellow","number_of_nodes":10,"number_of_data_nodes":10,"active_primary_shards":798,"active_shards":1438,"relocating_shards":0,"initializing_shards":0,"unassigned_shards":160,"delayed_unassigned_shards":0,"active_shards_percent_as_number":89.98748435544431}` |
[production] |
01:13 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance |
[production] |
01:13 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance |
[production] |
01:13 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1249 (T355609)', diff saved to https://phabricator.wikimedia.org/P56168 and previous config saved to /var/cache/conftool/dbconfig/20240203-011337-marostegui.json |
[production] |
00:58 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P56167 and previous config saved to /var/cache/conftool/dbconfig/20240203-005830-marostegui.json |
[production] |
00:43 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P56166 and previous config saved to /var/cache/conftool/dbconfig/20240203-004324-marostegui.json |
[production] |
00:28 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1249 (T355609)', diff saved to https://phabricator.wikimedia.org/P56165 and previous config saved to /var/cache/conftool/dbconfig/20240203-002817-marostegui.json |
[production] |
00:03 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depooling db1249 (T355609)', diff saved to https://phabricator.wikimedia.org/P56164 and previous config saved to /var/cache/conftool/dbconfig/20240203-000314-marostegui.json |
[production] |
00:03 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1249.eqiad.wmnet with reason: Maintenance |
[production] |
00:03 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1249.eqiad.wmnet with reason: Maintenance |
[production] |
00:02 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1248 (T355609)', diff saved to https://phabricator.wikimedia.org/P56163 and previous config saved to /var/cache/conftool/dbconfig/20240203-000252-marostegui.json |
[production] |
2024-02-02
|
23:47 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P56162 and previous config saved to /var/cache/conftool/dbconfig/20240202-234745-marostegui.json |
[production] |
23:32 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P56161 and previous config saved to /var/cache/conftool/dbconfig/20240202-233239-marostegui.json |
[production] |
23:17 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1248 (T355609)', diff saved to https://phabricator.wikimedia.org/P56160 and previous config saved to /var/cache/conftool/dbconfig/20240202-231732-marostegui.json |
[production] |
22:44 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depooling db1248 (T355609)', diff saved to https://phabricator.wikimedia.org/P56159 and previous config saved to /var/cache/conftool/dbconfig/20240202-224357-marostegui.json |
[production] |
22:44 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1248.eqiad.wmnet with reason: Maintenance |
[production] |
22:43 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1248.eqiad.wmnet with reason: Maintenance |
[production] |
22:43 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1247 (T355609)', diff saved to https://phabricator.wikimedia.org/P56158 and previous config saved to /var/cache/conftool/dbconfig/20240202-224334-marostegui.json |
[production] |
22:28 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P56157 and previous config saved to /var/cache/conftool/dbconfig/20240202-222828-marostegui.json |
[production] |
22:13 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P56156 and previous config saved to /var/cache/conftool/dbconfig/20240202-221321-marostegui.json |
[production] |
21:58 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1247 (T355609)', diff saved to https://phabricator.wikimedia.org/P56155 and previous config saved to /var/cache/conftool/dbconfig/20240202-215815-marostegui.json |
[production] |
21:35 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depooling db1247 (T355609)', diff saved to https://phabricator.wikimedia.org/P56154 and previous config saved to /var/cache/conftool/dbconfig/20240202-213504-marostegui.json |
[production] |
21:34 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1247.eqiad.wmnet with reason: Maintenance |
[production] |
21:34 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1247.eqiad.wmnet with reason: Maintenance |
[production] |
21:16 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1246.eqiad.wmnet with reason: Maintenance |
[production] |
21:15 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1246.eqiad.wmnet with reason: Maintenance |
[production] |
20:57 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1244.eqiad.wmnet with reason: Maintenance |
[production] |
20:57 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1244.eqiad.wmnet with reason: Maintenance |
[production] |
20:57 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1243 (T355609)', diff saved to https://phabricator.wikimedia.org/P56153 and previous config saved to /var/cache/conftool/dbconfig/20240202-205722-marostegui.json |
[production] |
20:42 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P56152 and previous config saved to /var/cache/conftool/dbconfig/20240202-204215-marostegui.json |
[production] |
20:27 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P56151 and previous config saved to /var/cache/conftool/dbconfig/20240202-202709-marostegui.json |
[production] |
20:12 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1243 (T355609)', diff saved to https://phabricator.wikimedia.org/P56150 and previous config saved to /var/cache/conftool/dbconfig/20240202-201202-marostegui.json |
[production] |
19:44 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depooling db1243 (T355609)', diff saved to https://phabricator.wikimedia.org/P56149 and previous config saved to /var/cache/conftool/dbconfig/20240202-194359-marostegui.json |
[production] |
19:44 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1243.eqiad.wmnet with reason: Maintenance |
[production] |
19:43 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1243.eqiad.wmnet with reason: Maintenance |
[production] |
19:43 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1242 (T355609)', diff saved to https://phabricator.wikimedia.org/P56148 and previous config saved to /var/cache/conftool/dbconfig/20240202-194338-marostegui.json |
[production] |
19:28 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P56147 and previous config saved to /var/cache/conftool/dbconfig/20240202-192831-marostegui.json |
[production] |
19:13 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P56146 and previous config saved to /var/cache/conftool/dbconfig/20240202-191325-marostegui.json |
[production] |
18:58 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1242 (T355609)', diff saved to https://phabricator.wikimedia.org/P56145 and previous config saved to /var/cache/conftool/dbconfig/20240202-185818-marostegui.json |
[production] |
18:35 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depooling db1242 (T355609)', diff saved to https://phabricator.wikimedia.org/P56144 and previous config saved to /var/cache/conftool/dbconfig/20240202-183510-marostegui.json |
[production] |
18:35 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1242.eqiad.wmnet with reason: Maintenance |
[production] |
18:35 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1242.eqiad.wmnet with reason: Maintenance |
[production] |
18:34 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1241 (T355609)', diff saved to https://phabricator.wikimedia.org/P56143 and previous config saved to /var/cache/conftool/dbconfig/20240202-183448-marostegui.json |
[production] |