2020-03-05
|
17:32 |
<elukey> |
run homer on cumin1001 to apply https://gerrit.wikimedia.org/r/576873 on cr1/cr2-eqiad |
[production] |
17:27 |
<otto@deploy1001> |
helmfile [STAGING] Ran 'apply' command on namespace 'eventstreams' for release 'canary' . |
[production] |
17:27 |
<otto@deploy1001> |
helmfile [STAGING] Ran 'apply' command on namespace 'eventstreams' for release 'production' . |
[production] |
17:24 |
<otto@deploy1001> |
helmfile [CODFW] Ran 'apply' command on namespace 'eventstreams' for release 'canary' . |
[production] |
17:24 |
<otto@deploy1001> |
helmfile [CODFW] Ran 'apply' command on namespace 'eventstreams' for release 'production' . |
[production] |
17:19 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
17:15 |
<pt1979@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
17:14 |
<otto@deploy1001> |
helmfile [CODFW] Ran 'apply' command on namespace 'eventstreams' for release 'canary' . |
[production] |
17:14 |
<otto@deploy1001> |
helmfile [CODFW] Ran 'apply' command on namespace 'eventstreams' for release 'production' . |
[production] |
17:11 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
17:09 |
<pt1979@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
16:58 |
<otto@deploy1001> |
helmfile [CODFW] Ran 'apply' command on namespace 'eventstreams' for release 'canary' . |
[production] |
16:58 |
<otto@deploy1001> |
helmfile [CODFW] Ran 'apply' command on namespace 'eventstreams' for release 'production' . |
[production] |
16:55 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Fully repool db1078 after reimage to buster T246604', diff saved to https://phabricator.wikimedia.org/P10631 and previous config saved to /var/cache/conftool/dbconfig/20200305-165555-marostegui.json |
[production] |
16:55 |
<otto@deploy1001> |
helmfile [STAGING] Ran 'apply' command on namespace 'eventstreams' for release 'canary' . |
[production] |
16:54 |
<otto@deploy1001> |
helmfile [STAGING] Ran 'apply' command on namespace 'eventstreams' for release 'production' . |
[production] |
16:50 |
<krinkle@deploy1001> |
Synchronized dblists/: I22a3c2a82f7be4a (duration: 00m 57s) |
[production] |
16:43 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Slowly repool db1078 after reimage to buster T246604', diff saved to https://phabricator.wikimedia.org/P10630 and previous config saved to /var/cache/conftool/dbconfig/20200305-164319-marostegui.json |
[production] |
16:22 |
<marostegui> |
Restart tendril/dbtree database |
[production] |
16:18 |
<_joe_> |
repooling mw1394 |
[production] |
16:12 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Slowly repool db1078 after reimage to buster T246604', diff saved to https://phabricator.wikimedia.org/P10629 and previous config saved to /var/cache/conftool/dbconfig/20200305-161222-marostegui.json |
[production] |
16:01 |
<elukey> |
depool mw1394 |
[production] |
16:01 |
<Krinkle> |
mw1394 (api_appserver) is fatalling search-related api requests due to "Elastic down?" |
[production] |
15:28 |
<otto@deploy1001> |
helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'canary' . |
[production] |
15:28 |
<otto@deploy1001> |
helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'production' . |
[production] |
15:26 |
<otto@deploy1001> |
helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'canary' . |
[production] |
15:25 |
<otto@deploy1001> |
helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'production' . |
[production] |
15:24 |
<otto@deploy1001> |
helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'canary' . |
[production] |
15:24 |
<otto@deploy1001> |
helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'production' . |
[production] |
15:18 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Slowly repool db1078 after reimage to buster T246604', diff saved to https://phabricator.wikimedia.org/P10627 and previous config saved to /var/cache/conftool/dbconfig/20200305-151858-marostegui.json |
[production] |
15:18 |
<_joe_> |
fixing the envoy installation on mw1394-1404, running scap pull |
[production] |
15:15 |
<XioNoX> |
add SNMP community to Juniper devices |
[production] |
15:01 |
<otto@deploy1001> |
helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'canary' . |
[production] |
15:01 |
<otto@deploy1001> |
helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'production' . |
[production] |
14:55 |
<otto@deploy1001> |
helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'canary' . |
[production] |
14:55 |
<otto@deploy1001> |
helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'production' . |
[production] |
14:52 |
<moritzm> |
copied hpssacli to thirdparty/hwraid for buster-wikimedia (current Gen 10 releases are named ssaducli now, but retain the old package (which only uses libc anyway) for backwards compat with gen9 on Buster) |
[production] |
14:45 |
<moritzm> |
copied hpssaducli to thirdparty/hwraid for buster-wikimedia (current releases are named ssaducli now, but retain the old package (which only uses libc anyway) for backwards compat |
[production] |
14:45 |
<otto@deploy1001> |
helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'canary' . |
[production] |
14:45 |
<otto@deploy1001> |
helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'production' . |
[production] |
14:25 |
<XioNoX> |
push BGP to Cloud on cr2-codfw - T245606 |
[production] |
14:13 |
<Urbanecm> |
Password reset for SUL User:Yezi Brook (T246988) |
[production] |
14:09 |
<XioNoX> |
push BGP to Cloud on cr1-codfw - T245606 |
[production] |
14:05 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
14:05 |
<liw@deploy1001> |
rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.22 |
[production] |
14:03 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
14:03 |
<XioNoX> |
set all eqiad/codfw PDUs, cord W thresholds to 3440 - T245655 |
[production] |
13:54 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
13:51 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
13:50 |
<pt1979@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |