2020-04-28
§
|
09:12 |
<elukey@cumin1001> |
START - Cookbook sre.presto.roll-restart-workers |
[production] |
09:12 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.presto.roll-restart-workers (exit_code=99) |
[production] |
09:12 |
<elukey@cumin1001> |
START - Cookbook sre.presto.roll-restart-workers |
[production] |
08:55 |
<XioNoX> |
re-set lost licenses on asw2-a/b-eqiad |
[production] |
08:40 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Fully repool db1105:3311 and 3312 after reimage', diff saved to https://phabricator.wikimedia.org/P11060 and previous config saved to /var/cache/conftool/dbconfig/20200428-084041-marostegui.json |
[production] |
08:36 |
<dcausse> |
deleting wikidatawiki_content_1587076410 from cloudelastic |
[production] |
08:30 |
<_joe_> |
restarting php-fpm on mw1407 and mw1409 again, then running traffic on them for 1 hour. |
[production] |
08:24 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Slowly repool db1105:3311 and 3312 after reimage', diff saved to https://phabricator.wikimedia.org/P11059 and previous config saved to /var/cache/conftool/dbconfig/20200428-082420-marostegui.json |
[production] |
08:21 |
<dcausse> |
restarting blazegraph on wdqs1007 (T242453) |
[production] |
08:20 |
<jynus@cumin2001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
08:17 |
<jynus@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
08:13 |
<kormat> |
reimaging db2124 to buster T250666 |
[production] |
08:13 |
<mutante> |
rsyncing transparency-report-private files from bromine to miscweb1002/2002. git-cloning was removed about a year ago but site still exists. need to figure out if it should be deleted (T188362 T247650) |
[production] |
08:09 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Slowly repool db1105:3311 and 3312 after reimage', diff saved to https://phabricator.wikimedia.org/P11058 and previous config saved to /var/cache/conftool/dbconfig/20200428-080920-marostegui.json |
[production] |
08:06 |
<moritzm> |
installing qemu security updates |
[production] |
07:52 |
<_joe_> |
running benchmarks on mw1407 (LCStoreStaticArray) and mw1409 (LCStoreCDB) for T99740: restart php-fpm, pool for 5 minutes to warmup caches, then depool both servers. |
[production] |
07:49 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
07:44 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
07:26 |
<marostegui> |
Reimage db1105 |
[production] |
07:24 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1105:3311 and 3312 for reimage', diff saved to https://phabricator.wikimedia.org/P11057 and previous config saved to /var/cache/conftool/dbconfig/20200428-072416-marostegui.json |
[production] |
06:35 |
<marostegui> |
Deploy schema change on s3 master with replication for the wikis at T250071#6051598 - T250071 |
[production] |
06:06 |
<marostegui> |
Deploy schema change on s4 codfw, this will generate lag on codfw - T250055 |
[production] |
05:57 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repool db1112', diff saved to https://phabricator.wikimedia.org/P11056 and previous config saved to /var/cache/conftool/dbconfig/20200428-055719-marostegui.json |
[production] |
05:52 |
<marostegui> |
Reclone labsdb1011 from labsdb1012 - T249188 |
[production] |
05:42 |
<marostegui> |
Restart labsdb1011 with innodb_purge_threads set to 10 - T249188 |
[production] |
05:35 |
<marostegui> |
Deploy schema change on db1112 |
[production] |
05:34 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1112 for schema change', diff saved to https://phabricator.wikimedia.org/P11054 and previous config saved to /var/cache/conftool/dbconfig/20200428-053453-marostegui.json |
[production] |
04:59 |
<vgutierrez> |
depool and powercycle cp5012 |
[production] |
04:37 |
<kart_> |
Updated cxserver to 2020-04-27-061703-production (T249852) |
[production] |
04:34 |
<kartik@deploy1001> |
helmfile [CODFW] Ran 'apply' command on namespace 'cxserver' for release 'production' . |
[production] |
04:22 |
<kartik@deploy1001> |
helmfile [EQIAD] Ran 'apply' command on namespace 'cxserver' for release 'production' . |
[production] |
04:18 |
<kartik@deploy1001> |
helmfile [STAGING] Ran 'apply' command on namespace 'cxserver' for release 'staging' . |
[production] |
2020-04-27
§
|
23:25 |
<catrope@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: Update logos for tiwiki and tiwiktionary (T150618, T249451) (duration: 00m 57s) |
[production] |
23:20 |
<catrope@deploy1001> |
Synchronized static/images/project-logos/: Update logos for tiwiki and tiwiktionary (T150618, T249451) (duration: 00m 58s) |
[production] |
23:18 |
<catrope@deploy1001> |
Synchronized dblists/visualeditor-nondefault.dblist: Enable VisualEditor by default on srwiki (T250878) (duration: 00m 57s) |
[production] |
23:16 |
<catrope@deploy1001> |
Synchronized wmf-config/config/srwiki.yaml: Enable VisualEditor by default on srwiki (T250878) (duration: 00m 58s) |
[production] |
20:58 |
<bearND> |
mobileapps deploy on canary failed due to timeouts, rolled back. |
[production] |
20:56 |
<bsitzmann@deploy1001> |
Finished deploy [mobileapps/deploy@99c350c]: Update mobileapps to 09cb7c2e (duration: 00m 52s) |
[production] |
20:55 |
<hknust> |
holger@mwmaint1002 END (enwiki=success, frwiki=fail) uppercaseTitlesForUnicodeTransition.php as part of T219279 |
[production] |
20:55 |
<bsitzmann@deploy1001> |
Started deploy [mobileapps/deploy@99c350c]: Update mobileapps to 09cb7c2e |
[production] |
20:43 |
<bearND> |
mobileapps deploy failed due to timeouts, rolled back. |
[production] |
20:42 |
<bsitzmann@deploy1001> |
Finished deploy [mobileapps/deploy@99c350c]: Update mobileapps to 09cb7c2e (duration: 06m 24s) |
[production] |
20:35 |
<bsitzmann@deploy1001> |
Started deploy [mobileapps/deploy@99c350c]: Update mobileapps to 09cb7c2e |
[production] |
20:28 |
<hknust> |
holger@mwmaint1002 Restarting uppercaseTitlesForUnicodeTransition.php as part of T219279 for 2 wikis |
[production] |
19:26 |
<cmjohnson@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
19:25 |
<cmjohnson@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
19:23 |
<ppchelko@deploy1001> |
Finished deploy [changeprop/deploy@ecca66b]: Switch off rules moved to k8s T248677 (duration: 01m 22s) |
[production] |
19:22 |
<cmjohnson@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
19:21 |
<ppchelko@deploy1001> |
Started deploy [changeprop/deploy@ecca66b]: Switch off rules moved to k8s T248677 |
[production] |
19:21 |
<cmjohnson@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |