2017-04-19
§
|
15:48 |
<jynus@tin> |
Synchronized wmf-config/db-codfw.php: Failing over x1-master (duration: 00m 41s) |
[production] |
15:46 |
<gehel@puppetmaster1001> |
conftool action : set/pooled=no; selector: name=elastic2020.codfw.wmnet |
[production] |
15:42 |
<jynus@tin> |
Synchronized wmf-config/InitialiseSettings.php: Disable cx_translation- it is causing an outage on x1 (duration: 02m 44s) |
[production] |
15:40 |
<dzahn@puppetmaster2001> |
conftool action : set/pooled=no; selector: name=mw2256.codfw.wmnet |
[production] |
15:32 |
<mutante> |
mw2256 went down and showed " PANIC: double fault, error_code: 0x0" |
[production] |
15:16 |
<jynus@tin> |
Synchronized wmf-config/db-codfw.php: Pool db2055 as an additional API server (duration: 01m 02s) |
[production] |
15:11 |
<_joe_> |
ran cumin 'R:class = role::mediawiki::jobrunner and *.eqiad.wmnet' 'systemctl reset-failed' manually |
[production] |
15:07 |
<godog> |
start swiftrepl on ms-fe1005 for codfw switchover |
[production] |
15:04 |
<switchdc> |
(volans@sarin) END TASK - switchdc.stages.t09_restart_parsoid(eqiad, codfw) Successfully completed |
[production] |
14:53 |
<akosiaris@puppetmaster1001> |
conftool action : set/pooled=yes; selector: name=mw2256.codfw.wmnet,service=apache2 |
[production] |
14:53 |
<akosiaris@puppetmaster1001> |
conftool action : set/pooled=yes; selector: name=mw2256.codfw.wmnet,service=nginx |
[production] |
14:48 |
<akosiaris@puppetmaster1001> |
conftool action : set/pooled=no; selector: name=mw2256.codfw.wmnet,service=nginx |
[production] |
14:48 |
<akosiaris@puppetmaster1001> |
conftool action : set/pooled=no; selector: name=mw2256.codfw.wmnet,service=apache2 |
[production] |
14:46 |
<gehel> |
banning elastic2020 from codfw cluster - T149006 |
[production] |
14:46 |
<switchdc> |
(volans@sarin) START TASK - switchdc.stages.t09_restart_parsoid(eqiad, codfw) Rolling restart parsoid in eqiad and codfw |
[production] |
14:44 |
<oblivian@tin> |
Synchronized wmf-config/ProductionServices.php: Fix redis locks (duration: 02m 24s) |
[production] |
14:41 |
<akosiaris> |
powercycle mw2256 |
[production] |
14:33 |
<switchdc> |
(volans@sarin) END TASK - switchdc.stages.t09_tendril(eqiad, codfw) Successfully completed |
[production] |
14:33 |
<switchdc> |
(volans@sarin) START TASK - switchdc.stages.t09_tendril(eqiad, codfw) Update Tendril configuration for the new masters |
[production] |
14:33 |
<switchdc> |
(volans@sarin) END TASK - switchdc.stages.t09_start_maintenance(eqiad, codfw) Successfully completed |
[production] |
14:31 |
<switchdc> |
(volans@sarin) START TASK - switchdc.stages.t09_start_maintenance(eqiad, codfw) Start MediaWiki maintenance in the new master DC |
[production] |
14:31 |
<switchdc> |
(volans@sarin) END TASK - switchdc.stages.t09_restore_ttl(eqiad, codfw) Successfully completed |
[production] |
14:31 |
<switchdc> |
(volans@sarin) START TASK - switchdc.stages.t09_restore_ttl(eqiad, codfw) Restore the TTL of all the MediaWiki discovery records |
[production] |
14:30 |
<switchdc> |
(volans@sarin) END TASK - switchdc.stages.t08_stop_mediawiki_readonly(eqiad, codfw) Successfully completed |
[production] |
14:30 |
<switchdc> |
(volans@sarin) MediaWiki read-only period ends at: 2017-04-19 14:30:05.678665 |
[production] |
14:30 |
<root@tin> |
Synchronized wmf-config/db-codfw.php: Set MediaWiki in read-write mode in datacenter codfw (duration: 00m 18s) |
[production] |
14:29 |
<switchdc> |
(volans@sarin) START TASK - switchdc.stages.t08_stop_mediawiki_readonly(eqiad, codfw) Set MediaWiki in read-write mode (db_to config already merged and git pulled) |
[production] |
14:28 |
<switchdc> |
(volans@sarin) END TASK - switchdc.stages.t07_coredb_masters_readwrite(eqiad, codfw) Successfully completed |
[production] |
14:28 |
<switchdc> |
(volans@sarin) START TASK - switchdc.stages.t07_coredb_masters_readwrite(eqiad, codfw) set core DB masters in read-write mode |
[production] |
14:25 |
<switchdc> |
(volans@sarin) END TASK - switchdc.stages.t06_redis(eqiad, codfw) Successfully completed |
[production] |
14:25 |
<switchdc> |
(volans@sarin) START TASK - switchdc.stages.t06_redis(eqiad, codfw) Switch the Redis replication |
[production] |
14:25 |
<switchdc> |
(volans@sarin) END TASK - switchdc.stages.t05_switch_traffic(eqiad, codfw) Successfully completed |
[production] |
14:22 |
<switchdc> |
(volans@sarin) START TASK - switchdc.stages.t05_switch_traffic(eqiad, codfw) Switch traffic flow to the appservers in the new datacenter |
[production] |
14:22 |
<switchdc> |
(volans@sarin) END TASK - switchdc.stages.t05_switch_datacenter(eqiad, codfw) Successfully completed |
[production] |
14:22 |
<root@tin> |
Synchronized wmf-config/CommonSettings.php: Switch MediaWiki active datacenter to codfw (duration: 00m 19s) |
[production] |
14:21 |
<switchdc> |
(volans@sarin) START TASK - switchdc.stages.t05_switch_datacenter(eqiad, codfw) Switch MediaWiki configuration to the new datacenter |
[production] |
14:21 |
<switchdc> |
(volans@sarin) END TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) Successfully completed |
[production] |
14:15 |
<switchdc> |
(volans@sarin) START TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) wipe and warmup caches |
[production] |
14:15 |
<switchdc> |
(volans@sarin) END TASK - switchdc.stages.t03_coredb_masters_readonly(eqiad, codfw) Successfully completed |
[production] |
14:15 |
<switchdc> |
(volans@sarin) START TASK - switchdc.stages.t03_coredb_masters_readonly(eqiad, codfw) set core DB masters in read-only mode |
[production] |
14:14 |
<switchdc> |
(volans@sarin) END TASK - switchdc.stages.t02_start_mediawiki_readonly(eqiad, codfw) Successfully completed |
[production] |
14:14 |
<root@tin> |
Synchronized wmf-config/db-eqiad.php: Set MediaWiki in read-only mode in datacenter eqiad (duration: 01m 29s) |
[production] |
14:13 |
<switchdc> |
(volans@sarin) MediaWiki read-only period starts at: 2017-04-19 14:12:54.007017 |
[production] |
14:12 |
<switchdc> |
(volans@sarin) START TASK - switchdc.stages.t02_start_mediawiki_readonly(eqiad, codfw) Set MediaWiki in read-only mode (db_from config already merged and git pulled) |
[production] |
14:09 |
<switchdc> |
(volans@sarin) END TASK - switchdc.stages.t01_stop_maintenance(eqiad, codfw) Successfully completed |
[production] |
14:07 |
<switchdc> |
(volans@sarin) START TASK - switchdc.stages.t01_stop_maintenance(eqiad, codfw) Stop MediaWiki maintenance in the old master DC |
[production] |
14:06 |
<godog> |
stop swiftrepl on ms-fe1005 for codfw switchover |
[production] |
14:06 |
<switchdc> |
(volans@sarin) END TASK - switchdc.stages.t00_reduce_ttl(eqiad, codfw) Successfully completed |
[production] |
14:06 |
<switchdc> |
(volans@sarin) START TASK - switchdc.stages.t00_reduce_ttl(eqiad, codfw) Reduce the TTL of all the MediaWiki discovery records |
[production] |
14:06 |
<switchdc> |
(volans@sarin) END TASK - switchdc.stages.t00_disable_puppet(eqiad, codfw) Successfully completed |
[production] |