2018-09-11
21:03 |
<mutante> |
restarted apache on mwdebug1002, running puppet |
[production] |
19:38 |
<ema> |
switch all services to codfw only |
[production] |
19:31 |
<ema> |
switch restbase to active/active |
[production] |
19:20 |
<ema> |
depool eqiad from edge traffic |
[production] |
19:06 |
<ema> |
route esams via codfw |
[production] |
17:10 |
<XioNoX> |
delete BGP sessions with old AS10089 router on cr1-eqsin |
[production] |
16:53 |
<godog> |
repair sdd on ms-be1043 - T199198 |
[production] |
16:27 |
<mutante> |
added gtirloni to acl*sre-team on Phabricator (T203489) |
[production] |
16:17 |
<godog> |
correction, sdk1 on ms-be1041 - T199198 |
[production] |
16:16 |
<godog> |
repair sdd1 on ms-be1043 - T199198 |
[production] |
15:06 |
<godog> |
serve switch originals and thumbs from codfw only |
[production] |
15:00 |
<godog> |
begin switching swift to codfw |
[production] |
14:40 |
<END> |
(PASS) - Cookbook sre.switchdc.services.02-restore-ttl (exit_code=0) (akosiaris@sarin) |
[production] |
14:40 |
<START> |
- Cookbook sre.switchdc.services.02-restore-ttl (akosiaris@sarin) |
[production] |
14:38 |
<END> |
(PASS) - Cookbook sre.switchdc.services.01-switch-dc (exit_code=0) (akosiaris@sarin) |
[production] |
14:38 |
<Switching> |
services parsoid, restbase, restbase-async, mobileapps, apertium, citoid, cxserver, eventstreams, graphoid, mathoid, proton, pdfrender, recommendation-api, zotero, eventbus, ores, wdqs, wdqs-internal: eqiad => codfw (akosiaris@sarin) |
[production] |
14:38 |
<START> |
- Cookbook sre.switchdc.services.01-switch-dc (akosiaris@sarin) |
[production] |
14:38 |
<END> |
(PASS) - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (exit_code=0) (akosiaris@sarin) |
[production] |
14:32 |
<START> |
- Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (akosiaris@sarin) |
[production] |
14:31 |
<END> |
(FAIL) - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (exit_code=99) (akosiaris@sarin) |
[production] |
14:31 |
<START> |
- Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (akosiaris@sarin) |
[production] |
14:31 |
<END> |
(FAIL) - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (exit_code=99) (akosiaris@sarin) |
[production] |
14:31 |
<START> |
- Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (akosiaris@sarin) |
[production] |
13:21 |
<END> |
(PASS) - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki (exit_code=0) (volans@sarin) |
[production] |
13:21 |
<START> |
- Cookbook sre.switchdc.mediawiki.04-switch-mediawiki (volans@sarin) |
[production] |
13:14 |
<END> |
(PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0) (volans@sarin) |
[production] |
13:14 |
<START> |
- Cookbook sre.switchdc.mediawiki.01-stop-maintenance (volans@sarin) |
[production] |
13:12 |
<END> |
(PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0) (volans@sarin) |
[production] |
13:12 |
<START> |
- Cookbook sre.switchdc.mediawiki.00-reduce-ttl (volans@sarin) |
[production] |
13:08 |
<volans> |
performing some additional switchdc live test |
[production] |
13:02 |
<volans> |
upgraded spicerack to version 0.0.8 on sarin/neodymium - T199079 |
[production] |
12:28 |
<gehel> |
restarting tilerator on maps1* (eqiad) - heap memory exceeded |
[production] |
12:09 |
<moritzm> |
installing jq security updates on trusty |
[production] |
12:01 |
<dereckson@deploy1001> |
Synchronized wmf-config/throttle.php: Update Informatika SZŠ Chomutov throttle rule (T203909) (duration: 00m 50s) |
[production] |
12:00 |
<dereckson@deploy1001> |
sync-file aborted: Update Informatika SZŠ Chomutov throttle rule (duration: 00m 04s) |
[production] |
11:49 |
<volans> |
uploaded spicerack_0.0.8-1{,+deb9u1} to apt.wikimedia.org {jessie,stretch}-wikimedia - T199079 |
[production] |
11:37 |
<moritzm> |
restarting hhvm on mw1261-mw1265 to pick up curl security updates |
[production] |
11:25 |
<zfilipin@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:455804|Set category collation to uca-et-u-kn on Estonian-language wikis (T202977)]] (duration: 00m 50s) |
[production] |
10:37 |
<marostegui> |
Disable GTID on all codfw masters (sX, x1, esX) (not in db2040 as it is not enabled there) T189107 |
[production] |
10:36 |
<marostegui@deploy1001> |
Synchronized wmf-config/db-eqiad.php: Repool db1096:3315, db1100 (duration: 00m 49s) |
[production] |
10:30 |
<tgr@deploy1001> |
Finished scap: T204018 update i18n on fixcopyrightwiki (duration: 31m 01s) |
[production] |
10:27 |
<marostegui> |
db1096:3315 and db1100 were test pages - NO MORE TEST PAGES ARE EXPECTED FROM NOW ON - T200509 |
[production] |
10:16 |
<marostegui> |
Stop replication on db2075 to test the paging (should not page) |
[production] |
10:14 |
<marostegui> |
Stop replication on db1100 to test the paging |
[production] |
10:03 |
<marostegui> |
Stop replication on db2084:3315 for alert testing |
[production] |
09:59 |
<tgr@deploy1001> |
Started scap: T204018 update i18n on fixcopyrightwiki |
[production] |
09:54 |
<marostegui> |
Stop replication on db1096:3315 for paging testing |
[production] |
09:25 |
<moritzm> |
installing curl security updates |
[production] |
08:39 |
<godog> |
repair xfs on sdh/sdc on ms-be2040 - T199198 |
[production] |
08:27 |
<marostegui> |
Stop replication on db1100 for new alert testing (this should generate a page) T200509 |
[production] |