2021-11-22
|
20:01 |
<ryankemper@cumin1001> |
END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) restart with plugin upgrade (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw plugin upgrade + restart - ryankemper@cumin1001 - T295705 |
[production] |
19:49 |
<ryankemper@cumin1001> |
START - Cookbook sre.elasticsearch.rolling-operation restart with plugin upgrade (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw plugin upgrade + restart - ryankemper@cumin1001 - T295705 |
[production] |
19:49 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
19:46 |
<urbanecm> |
Evening B&C window completed |
[production] |
19:45 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
19:44 |
<urbanecm@deploy1002> |
Synchronized php-1.38.0-wmf.9/extensions/ProofreadPage/: 10b8440069ac71434274462c545c6b2b2c9182d9: Use the WikiEditor ready hook instead of using() the lib (T296033) (duration: 00m 56s) |
[production] |
19:34 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
19:30 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
19:24 |
<urbanecm@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: b6b05e30b3c9b4007fd31ab0698507d7a48d1caf: kswiki: set wgTranslateNumerals to false (T296055) (duration: 00m 55s) |
[production] |
19:20 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
19:18 |
<urbanecm@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: 4aa8d5bf465bfc3fee2ec547718af0c779f88ef4: Enable SandboxLink on lawiki (T296073) (duration: 00m 56s) |
[production] |
19:16 |
<urbanecm@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: 1c082bec4c74c156b26af4349488835902c5bacd: Enable mapframe on the Indonesian Wikipedia (T295571) (duration: 00m 56s) |
[production] |
19:15 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
19:11 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
19:05 |
<pt1979@cumin2002> |
START - Cookbook sre.dns.netbox |
[production] |
19:01 |
<vgutierrez> |
pool cp4032 (text) using HAProxy as TLS terminator - T290005 |
[production] |
18:20 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
18:14 |
<pt1979@cumin2002> |
START - Cookbook sre.dns.netbox |
[production] |
18:04 |
<ryankemper@cumin1001> |
END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) restart with plugin upgrade (3 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic plugin upgrade + restart - ryankemper@cumin1001 |
[production] |
17:50 |
<ryankemper@cumin1001> |
END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) restart with plugin upgrade (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw plugin upgrade + restart - ryankemper@cumin1001 - T295705 |
[production] |
17:48 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
17:48 |
<XioNoX> |
repool codfw |
[production] |
17:46 |
<vgutierrez@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4032.ulsfo.wmnet with OS buster |
[production] |
17:46 |
<ejegg> |
updated fundraising python tools from d90f4c91 -> d1d7b100 |
[production] |
17:43 |
<pt1979@cumin2002> |
START - Cookbook sre.dns.netbox |
[production] |
17:32 |
<ebernhardson> |
restart both elasticsearch instances on elastic2044, reporting `connection refused` (after a brief period of `no route to host`) to masters even though the connection works outside elastic |
[production] |
17:01 |
<ryankemper> |
T295705 Beginning rolling restart w/ plugin upgrade of `cloudelastic`: `ryankemper@cumin1001:~$ sudo cookbook sre.elasticsearch.rolling-operation cloudelastic "cloudelastic plugin upgrade + restart" --upgrade --nodes-per-run 3 --start-datetime 2021-11-22T16:59:38 --task-id T295705` on tmux `rolling_restarts_cloudelastic` |
[production] |
17:00 |
<ryankemper@cumin1001> |
START - Cookbook sre.elasticsearch.rolling-operation restart with plugin upgrade (3 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic plugin upgrade + restart - ryankemper@cumin1001 |
[production] |
16:58 |
<ryankemper> |
[Elastic] T295705 Rolling restart w/ plugin upgrade of `relforge` is complete |
[production] |
16:55 |
<ryankemper> |
[Elastic] T295705 Restarting second and final relforge host: `ryankemper@relforge1003:~$ sudo systemctl restart elasticsearch_6@relforge-eqiad.service elasticsearch_6@relforge-eqiad-small-alpha.service logstash.service` |
[production] |
16:55 |
<vgutierrez@cumin1001> |
START - Cookbook sre.hosts.reimage for host cp4032.ulsfo.wmnet with OS buster |
[production] |
16:52 |
<ryankemper> |
[Elastic] T295705 Restarting first relforge host: `ryankemper@relforge1004:~$ sudo systemctl restart elasticsearch_6@relforge-eqiad.service elasticsearch_6@relforge-eqiad-small-alpha.service logstash.service` |
[production] |
16:51 |
<jayme> |
fleet wide updated wmf-certificates to 0~20211122-1 |
[production] |
16:50 |
<vgutierrez> |
depool cp4032 to be reimaged as cache::text_haproxy - T290005 |
[production] |
16:49 |
<ryankemper> |
[Elastic] T295705 Downtimed relforge* for 2 hours in order to perform a manual rolling restart of the two hosts `relforge1003` and `relforge1004` |
[production] |
16:44 |
<ryankemper> |
T295705 Upgrading `relforge` elasticsearch packages: `ryankemper@cumin1001:~$ sudo cumin -b 2 'relforge*' 'DEBIAN_FRONTEND=noninteractive sudo apt-get -y -o Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confold" install elasticsearch-oss wmf-elasticsearch-search-plugins'` |
[production] |
16:39 |
<ryankemper@cumin1001> |
START - Cookbook sre.elasticsearch.rolling-operation restart with plugin upgrade (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw plugin upgrade + restart - ryankemper@cumin1001 - T295705 |
[production] |
16:15 |
<urbanecm> |
Password reset for Miraki@arbcom_dewiki per private request |
[production] |
16:15 |
<moritzm> |
installing postgresql-13 security updates on bullseye |
[production] |
15:56 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
15:55 |
<XioNoX> |
Telia DDoS auto-mitigation enabled on all circuits - T288926 |
[production] |
15:51 |
<pt1979@cumin2002> |
START - Cookbook sre.dns.netbox |
[production] |
15:28 |
<Amir1> |
revoking DROP for wikiadmin from db1100 (T249683) |
[production] |
15:27 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host prometheus2006.codfw.wmnet with OS bullseye |
[production] |
15:17 |
<moritzm> |
set kvm:machine_version=pc-i440fx-2.8 for Ganeti cluster in codfw T294119 |
[production] |
15:16 |
<jayme> |
imported wmf-certificates 0~20211122-1 to stretch-wikimedia,buster-wikimedia,bullseye-wikimedia |
[production] |
15:13 |
<_joe_> |
restarting pybal low-traffic in codfw, eqiad |
[production] |
15:07 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
15:03 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
14:58 |
<jelto@cumin1001> |
END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host gitlab-runner1001.wikimedia.org |
[production] |