2016-04-21
|
14:08 |
<subbu> |
syncing parsoid code |
[production] |
14:07 |
<volans> |
[switchover #5/#5] Switch parsercache RO/RW |
[production] |
14:07 |
<_joe_> |
[switchover #5/1] switching redis replication manually |
[production] |
14:07 |
<ori@tin> |
Synchronized wmf-config/CommonSettings.php: [switchover #4/#2] I0e85c3d20: Switch wmfMasterDatacenter to eqiad (duration: 00m 26s) |
[production] |
14:06 |
<_joe_> |
[switchover #4/1] puppet merged |
[production] |
14:04 |
<Krinkle> |
krinkle@tin: bin/apache-fast-test wiki-urls-warmup1000.txt eqiad |
[production] |
14:03 |
<volans> |
[switchover #3/#1] Set active site's databases (masters) in read-only mode except parsercache ones. |
[production] |
14:03 |
<_joe_> |
[switchover #3/2] wipe memcacheds |
[production] |
14:02 |
<paravoid> |
wikis now in planned read-only mode, cf. http://blog.wikimedia.org/2016/04/18/wikimedia-server-switch/ |
[production] |
14:01 |
<ori@tin> |
Synchronized wmf-config/db-codfw.php: [switchover #2/#1] Id8b2e7a05: Set codfw databases to read-only mode (duration: 00m 24s) |
[production] |
14:00 |
<jynus> |
disabled all db lag alerts |
[production] |
13:55 |
<volans> |
[switchover #1/#6] Switch pt-heartbeat from active site (codfw) to new site (eqiad) masters |
[production] |
13:54 |
<ori> |
[switchover #1/2] stopping jobrunners in codfw |
[production] |
13:54 |
<_joe_> |
[switchover #1/3] stopping crons on wasat |
[production] |
13:52 |
<volans> |
[switchover #1/#5] Set final $master status for databases in advance |
[production] |
13:50 |
<volans> |
[switchover #1/#4] Disable puppet on all eqiad and codfw databases masters |
[production] |
13:50 |
<paravoid> |
commencing codfw->eqiad datacenter switchover |
[production] |
13:39 |
<ori@tin> |
Synchronized wmf-config/InitialiseSettings.php: I2171f6b1: Enable MessageCacheError log channel (duration: 00m 25s) |
[production] |
13:37 |
<bblack> |
[traffic codfw switch revert #3] - DNS TTL done, bulk of end-user traffic rebalanced, graphs starting to level off at new normals, as done as it gets from our end |
[production] |
13:31 |
<bblack> |
[traffic codfw switch revert #4] - done & confirmed |
[production] |
13:28 |
<bblack> |
[traffic codfw switch revert #4] - merge -> start salted puppet |
[production] |
13:27 |
<bblack> |
[traffic codfw switch revert #2] - done & confirmed |
[production] |
13:25 |
<bblack> |
[traffic codfw switch revert #3] - merge -> authdns-update |
[production] |
13:24 |
<bblack> |
[traffic codfw switch revert #2] - merge -> start salted puppet |
[production] |
13:23 |
<bblack> |
[traffic codfw switch revert #1] - done & confirmed |
[production] |
13:23 |
<bblack> |
[traffic codfw switch revert #1] - merge -> start salted puppet (@13:20, late log) |
[production] |
13:21 |
<ori@tin> |
Synchronized php-1.27.0-wmf.21/includes: Ie9799f5ea: Make MessageCache handle lock timeouts better (duration: 01m 18s) |
[production] |
13:12 |
<jynus@tin> |
Synchronized wmf-config/db-eqiad.php: Temporarily increase es1* master weight to add connection capacity (duration: 00m 37s) |
[production] |
09:57 |
<elukey> |
removed apache2 logrotate config manually from argon as temp patch to remove cronspam from root@ (T132896) |
[production] |
08:36 |
<jynus> |
restarting db1031 to apply new mysql config |
[production] |
02:31 |
<l10nupdate@tin> |
ResourceLoader cache refresh completed at Thu Apr 21 02:31:04 UTC 2016 (duration 8m 37s) |
[production] |
02:22 |
<mwdeploy@tin> |
sync-l10n completed (1.27.0-wmf.21) (duration: 09m 48s) |
[production] |
01:49 |
<mutante> |
git pull on strontium, ops/puppet |
[production] |
01:48 |
<mutante> |
belated log: restarted slapd on seaborgium |
[production] |
01:29 |
<ori> |
installed python-progressbar on terbium for warmup script, will be puppetized later |
[production] |
2016-04-20
|
22:18 |
<mutante> |
creating ganeti VM install1001 on eqiad cluster |
[production] |
19:03 |
<AaronSchulz> |
Cleared out 'enqueue' job queues to see if corruption comes back |
[production] |
18:17 |
<jynus@tin> |
Synchronized wmf-config/db-eqiad.php: Promote db1031 as the new x1 eqiad local master (duration: 00m 28s) |
[production] |
18:16 |
<ori@tin> |
Synchronized php-1.27.0-wmf.21/extensions/Translate/messagegroups/WikiPageMessageGroup.php: I331bd93b: Avoid more master queries on page views (duration: 00m 31s) |
[production] |
18:16 |
<ori@tin> |
Synchronized php-1.27.0-wmf.21/includes/jobqueue/JobQueueGroup.php: Ie9799f5ea: Catch errors in pushLazyJobs() and log them (duration: 00m 36s) |
[production] |
17:59 |
<jynus> |
changing database topology to set db1031 as the master of x1 on eqiad |
[production] |
17:58 |
<volans> |
Upgrading db1065 and fixing overheating problems T132515 |
[production] |
17:30 |
<volans> |
Upgrading db1070 and fixing overheating problems T132515 |
[production] |
17:19 |
<aaron@tin> |
Synchronized php-1.27.0-wmf.21/includes/jobqueue/JobQueueRedis.php: 86d185a4bbf52d (duration: 00m 39s) |
[production] |
17:15 |
<volans> |
Upgrading db1071 and fixing overheating problems T132515 |
[production] |
17:03 |
<akosiaris> |
aptitude purge php5-xhprof on uranium |
[production] |
16:54 |
<elukey> |
replaced "#" with ";" manually in uranium's /etc/php5/cli/conf.d/20-xhprof.ini and /etc/php5/apache2/php.ini to avoid cronspam (didn't find puppet/package trails) |
[production] |
15:43 |
<ebernhardson> |
delete apifeatureusage-2016.01.20 from codfw elasticsearch cluster. Index should never have existed in this cluster (and is beyond retention). |
[production] |
15:42 |
<ebernhardson> |
delete apifeatureusage-2016-01-(02,09,10) from eqiad elasticsearch cluster. We only keep 30 days of apifeatureusage logs |
[production] |
15:37 |
<jynus@tin> |
Synchronized wmf-config/db-codfw.php: Tweak DB weights for better latency, avoiding peaks on QPS (duration: 00m 32s) |
[production] |