2020-03-05
§
|
16:01 |
<elukey> |
depool mw1394 |
[production] |
16:01 |
<Krinkle> |
mw1394 (api_appserver) is fatalling search-related api requests due to "Elastic down?" |
[production] |
15:28 |
<otto@deploy1001> |
helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'canary' . |
[production] |
15:28 |
<otto@deploy1001> |
helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'production' . |
[production] |
15:26 |
<otto@deploy1001> |
helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'canary' . |
[production] |
15:25 |
<otto@deploy1001> |
helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'production' . |
[production] |
15:24 |
<otto@deploy1001> |
helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'canary' . |
[production] |
15:24 |
<otto@deploy1001> |
helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'production' . |
[production] |
15:18 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Slowly repool db1078 after reimage to buster T246604', diff saved to https://phabricator.wikimedia.org/P10627 and previous config saved to /var/cache/conftool/dbconfig/20200305-151858-marostegui.json |
[production] |
15:18 |
<_joe_> |
fixing the envoy installation on mw1394-1404, running scap pull |
[production] |
15:15 |
<XioNoX> |
add SNMP community to Juniper devices |
[production] |
15:01 |
<otto@deploy1001> |
helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'canary' . |
[production] |
15:01 |
<otto@deploy1001> |
helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'production' . |
[production] |
14:55 |
<otto@deploy1001> |
helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'canary' . |
[production] |
14:55 |
<otto@deploy1001> |
helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'production' . |
[production] |
14:52 |
<moritzm> |
copied hpssacli to thirdparty/hwraid for buster-wikimedia (current Gen 10 releases are named ssaducli now, but retain the old package (which only uses libc anyway) for backwards compat with gen9 on Buster) |
[production] |
14:45 |
<moritzm> |
copied hpssaducli to thirdparty/hwraid for buster-wikimedia (current releases are named ssaducli now, but retain the old package (which only uses libc anyway) for backwards compat |
[production] |
14:45 |
<otto@deploy1001> |
helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'canary' . |
[production] |
14:45 |
<otto@deploy1001> |
helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics-external' for release 'production' . |
[production] |
14:25 |
<XioNoX> |
push BGP to Cloud on cr2-codfw - T245606 |
[production] |
14:13 |
<Urbanecm> |
Password reset for SUL User:Yezi Brook (T246988) |
[production] |
14:09 |
<XioNoX> |
push BGP to Cloud on cr1-codfw - T245606 |
[production] |
14:05 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
14:05 |
<liw@deploy1001> |
rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.22 |
[production] |
14:03 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
14:03 |
<XioNoX> |
set all eqiad/codfw PDUs, cord W thresholds to 3440 - T245655 |
[production] |
13:54 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
13:51 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
13:50 |
<pt1979@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
13:49 |
<pt1979@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
13:48 |
<marostegui> |
Stop MySQL on db1078 for reimage - T246604 |
[production] |
13:47 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1078 for reimage to buster - T246604', diff saved to https://phabricator.wikimedia.org/P10623 and previous config saved to /var/cache/conftool/dbconfig/20200305-134701-marostegui.json |
[production] |
13:26 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
13:24 |
<pt1979@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
12:56 |
<addshore> |
stop that cache warming .... |
[production] |
12:52 |
<addshore> |
START warm cache for db1111 & db1126 for Q30-32 million (100k batch selects, 30s sleep) T219123 (pass 1) |
[production] |
12:06 |
<Amir1> |
the property terms removal is finished. 312K rows deleted (T225054) |
[production] |
11:53 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repool db2109 after reimage to buster - T246604', diff saved to https://phabricator.wikimedia.org/P10622 and previous config saved to /var/cache/conftool/dbconfig/20200305-115322-marostegui.json |
[production] |
11:45 |
<Amir1> |
deleting property terms from wb_terms in wikidatawiki (T225054) |
[production] |
11:43 |
<ladsgroup@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: [[gerrit:577218|Stop writing to the old term store for properties (T219301 T225054)]], take II (duration: 01m 04s) |
[production] |
11:42 |
<ladsgroup@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: [[gerrit:577218|Stop writing to the old term store for properties (T219301 T225054)]] (duration: 01m 04s) |
[production] |
11:29 |
<ladsgroup@deploy1001> |
Synchronized php-1.35.0-wmf.22/extensions/Wikibase: [[gerrit:576963|Schedule 1 CleanTermsIfUnusedJob per ID to clean (T244115 T246898)]] (duration: 01m 08s) |
[production] |
11:25 |
<ladsgroup@deploy1001> |
Synchronized php-1.35.0-wmf.22/extensions/Cognate: [[gerrit:576876|Exit undelete hook early if revision not found (T245869)]] (duration: 01m 04s) |
[production] |
11:20 |
<addshore@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: Write to new term store up to Q87 million, was 86 (T219123) cache bust (duration: 01m 03s) |
[production] |
11:19 |
<addshore@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: Write to new term store up to Q87 million, was 86 (T219123) (duration: 01m 04s) |
[production] |
11:10 |
<vgutierrez> |
Disable parent proxies on ats-tls in ulsfo - T244464 |
[production] |
11:06 |
<addshore@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: Reading up to Q30M for the new term store everywhere (was Q25M) + warm db1126 & db1111 caches (T219123) cache bust (duration: 01m 04s) |
[production] |
11:04 |
<addshore@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: Reading up to Q30M for the new term store everywhere (was Q25M) + warm db1126 & db1111 caches (T219123) (duration: 01m 05s) |
[production] |
11:04 |
<jbond42> |
small update to PCC https://gerrit.wikimedia.org/r/c/operations/software/puppet-compiler/+/576663 |
[production] |
10:50 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |