2020-07-22
§
|
10:24 |
<jbond42> |
upload prometheus-swagger-exporter_0.3-1+deb10u1 to apt1001 buster repo |
[production] |
10:24 |
<jayme@deploy1001> |
helmfile [CODFW] Ran 'sync' command on namespace 'cxserver' for release 'production' . |
[production] |
10:22 |
<jayme@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'cxserver' for release 'staging' . |
[production] |
10:19 |
<akosiaris@deploy2001> |
helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' . |
[production] |
10:19 |
<akosiaris@deploy2001> |
helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'production' . |
[production] |
10:12 |
<akosiaris@cumin1001> |
conftool action : set/pooled=yes; selector: dc=codfw,service=mobileapps,name=scb.* |
[production] |
10:08 |
<akosiaris@cumin1001> |
conftool action : set/pooled=no; selector: dc=codfw,service=mobileapps,name=scb.* |
[production] |
10:04 |
<jayme@deploy1001> |
helmfile [EQIAD] Ran 'sync' command on namespace 'citoid' for release 'production' . |
[production] |
10:01 |
<jayme@deploy1001> |
helmfile [CODFW] Ran 'sync' command on namespace 'citoid' for release 'production' . |
[production] |
09:58 |
<marostegui> |
Deploy MCR schema change on s4 codfw master (lag will appear on codfw) - T238966 |
[production] |
09:55 |
<akosiaris> |
bump memory in codfw mobileapps another 20% T218733 |
[production] |
09:55 |
<akosiaris@deploy2001> |
helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'production' . |
[production] |
09:55 |
<akosiaris@deploy2001> |
helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' . |
[production] |
09:52 |
<godog> |
centrallog1001 lvextend /srv by 130G |
[production] |
09:51 |
<jayme@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'citoid' for release 'staging' . |
[production] |
09:46 |
<akosiaris> |
codfw mobileapps kubernetes traffic back to 96% T218733 again. scb pooled again. |
[production] |
09:46 |
<akosiaris@cumin1001> |
conftool action : set/pooled=yes; selector: dc=codfw,service=mobileapps,name=scb.* |
[production] |
09:43 |
<jayme@deploy1001> |
helmfile [EQIAD] Ran 'sync' command on namespace 'blubberoid' for release 'production' . |
[production] |
09:43 |
<akosiaris@deploy1001> |
helmfile [EQIAD] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' . |
[production] |
09:43 |
<akosiaris@deploy1001> |
helmfile [EQIAD] Ran 'sync' command on namespace 'mobileapps' for release 'production' . |
[production] |
09:40 |
<jayme@deploy1001> |
helmfile [CODFW] Ran 'sync' command on namespace 'blubberoid' for release 'production' . |
[production] |
09:40 |
<akosiaris> |
increase codfw mobileapps kubernetes traffic to 100% T218733 |
[production] |
09:40 |
<akosiaris@cumin1001> |
conftool action : set/pooled=no; selector: dc=codfw,service=mobileapps,name=scb.* |
[production] |
09:34 |
<jayme@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'blubberoid' for release 'staging' . |
[production] |
09:27 |
<akosiaris@deploy2001> |
helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'production' . |
[production] |
09:27 |
<akosiaris@deploy2001> |
helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'nontls' . |
[production] |
09:25 |
<akosiaris> |
bump memory limits for mobileapps by 25% T218733 |
[production] |
09:25 |
<akosiaris@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'mobileapps' for release 'staging' . |
[production] |
09:10 |
<jayme> |
updated docker-report to 0.0.7-1 on deneb |
[production] |
09:09 |
<jayme> |
import docker-report 0.0.7-1 to buster-wikimedia |
[production] |
09:06 |
<gehel> |
restarting blazegraph on all wdqs nodes - new vocabulary |
[production] |
08:48 |
<dcausse> |
restarting blazegraph on wdqs1010 (testing new vocab) |
[production] |
08:46 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Fully repool db1126', diff saved to https://phabricator.wikimedia.org/P12017 and previous config saved to /var/cache/conftool/dbconfig/20200722-084613-marostegui.json |
[production] |
08:41 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'Increase es1020 to 100% pooled in es4, reduce es1021 to weight 0 T257284', diff saved to https://phabricator.wikimedia.org/P12016 and previous config saved to /var/cache/conftool/dbconfig/20200722-084159-kormat.json |
[production] |
08:39 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Slowly repool db1126', diff saved to https://phabricator.wikimedia.org/P12015 and previous config saved to /var/cache/conftool/dbconfig/20200722-083926-marostegui.json |
[production] |
08:35 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Fully repool db1084 and db1107', diff saved to https://phabricator.wikimedia.org/P12014 and previous config saved to /var/cache/conftool/dbconfig/20200722-083535-marostegui.json |
[production] |
08:31 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Slowly repool db1126', diff saved to https://phabricator.wikimedia.org/P12013 and previous config saved to /var/cache/conftool/dbconfig/20200722-083140-marostegui.json |
[production] |
08:30 |
<kart_> |
Updated cxserver to 2020-07-20-200559-production (T257674) |
[production] |
08:28 |
<kartik@deploy1001> |
helmfile [EQIAD] Ran 'sync' command on namespace 'cxserver' for release 'production' . |
[production] |
08:25 |
<kartik@deploy1001> |
helmfile [CODFW] Ran 'sync' command on namespace 'cxserver' for release 'production' . |
[production] |
08:25 |
<volans@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
08:23 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Slowly repool db1084 and db1107', diff saved to https://phabricator.wikimedia.org/P12012 and previous config saved to /var/cache/conftool/dbconfig/20200722-082309-marostegui.json |
[production] |
08:22 |
<kartik@deploy1001> |
helmfile [STAGING] Ran 'sync' command on namespace 'cxserver' for release 'staging' . |
[production] |
08:20 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Slowly repool db1126', diff saved to https://phabricator.wikimedia.org/P12010 and previous config saved to /var/cache/conftool/dbconfig/20200722-082023-marostegui.json |
[production] |
08:19 |
<volans@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
08:16 |
<akosiaris> |
increase codfw mobileapps kubernetes traffic to 96% T218733. Take #2. Let's see if I can reproduce the weird increases in p99 latencies and figure out their cause |
[production] |
08:15 |
<akosiaris@cumin1001> |
conftool action : set/weight=1; selector: dc=codfw,service=mobileapps,name=scb.* |
[production] |
08:14 |
<kormat@cumin1001> |
dbctl commit (dc=all): 'Increase es1020 to 75% pooled in es4, reduce es1021 to weight 25 T257284', diff saved to https://phabricator.wikimedia.org/P12009 and previous config saved to /var/cache/conftool/dbconfig/20200722-081457-kormat.json |
[production] |
08:13 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Slowly repool db1084 and db1107', diff saved to https://phabricator.wikimedia.org/P12008 and previous config saved to /var/cache/conftool/dbconfig/20200722-081330-marostegui.json |
[production] |
08:12 |
<moritzm> |
Turnilo switched to CAS |
[production] |