2017-04-19
|
14:14 |
<root@tin> |
Synchronized wmf-config/db-eqiad.php: Set MediaWiki in read-only mode in datacenter eqiad (duration: 01m 29s) |
[production] |
14:13 |
<switchdc> |
(volans@sarin) MediaWiki read-only period starts at: 2017-04-19 14:12:54.007017 |
[production] |
14:12 |
<switchdc> |
(volans@sarin) START TASK - switchdc.stages.t02_start_mediawiki_readonly(eqiad, codfw) Set MediaWiki in read-only mode (db_from config already merged and git pulled) |
[production] |
14:09 |
<switchdc> |
(volans@sarin) END TASK - switchdc.stages.t01_stop_maintenance(eqiad, codfw) Successfully completed |
[production] |
14:07 |
<switchdc> |
(volans@sarin) START TASK - switchdc.stages.t01_stop_maintenance(eqiad, codfw) Stop MediaWiki maintenance in the old master DC |
[production] |
14:06 |
<godog> |
stop swiftrepl on ms-fe1005 for codfw switchover |
[production] |
14:06 |
<switchdc> |
(volans@sarin) END TASK - switchdc.stages.t00_reduce_ttl(eqiad, codfw) Successfully completed |
[production] |
14:06 |
<switchdc> |
(volans@sarin) START TASK - switchdc.stages.t00_reduce_ttl(eqiad, codfw) Reduce the TTL of all the MediaWiki discovery records |
[production] |
14:06 |
<switchdc> |
(volans@sarin) END TASK - switchdc.stages.t00_disable_puppet(eqiad, codfw) Successfully completed |
[production] |
14:05 |
<switchdc> |
(volans@sarin) START TASK - switchdc.stages.t00_disable_puppet(eqiad, codfw) Disabling puppet on selected hosts |
[production] |
14:00 |
<bblack@neodymium> |
conftool action : set/pooled=yes; selector: name=cp2014.codfw.wmnet,service=varnish-be |
[production] |
13:42 |
<bblack@neodymium> |
conftool action : set/pooled=no; selector: name=cp2014.codfw.wmnet,service=varnish-be |
[production] |
13:28 |
<urandom> |
cqlsh -f /etc/cassandra/adduser.cql, recreating user/perms (as-needed) |
[production] |
12:38 |
<urandom> |
T163292: Starting removal of Cassandra instance restbase1018-c.eqiad.wmnet |
[production] |
11:36 |
<oblivian:> |
Setting swift-rw in eqiad DOWN |
[production] |
11:36 |
<oblivian:> |
Setting swift-rw in codfw UP |
[production] |
11:36 |
<ema> |
repool varnish-be on cp3044 |
[production] |
11:23 |
<godog> |
add naos to git-deploy term on common-infrastructure4 - T162900 |
[production] |
11:03 |
<switchdc> |
(oblivian@sarin) END TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) Successfully completed |
[production] |
10:57 |
<switchdc> |
(oblivian@sarin) START TASK - switchdc.stages.t04_cache_wipe(eqiad, codfw) wipe and warmup caches |
[production] |
10:56 |
<_joe_> |
running the warmup stage in codfw for final testing |
[production] |
10:41 |
<ema> |
depool varnish-be on cp3044 because of mailbox lag issues |
[production] |
09:34 |
<moritzm> |
installing dbus security updates |
[production] |
09:11 |
<elukey> |
cleaning up ocg1003's /srv/deployment/ocg/postmortem dir (root partition filled up) |
[production] |
07:26 |
<hoo> |
Updated the sites and site_identifiers tables on all Wikidata clients for T149522. |
[production] |
06:57 |
<switchdc> |
(oblivian@sarin) END TASK - switchdc.stages.t06_redis(codfw, eqiad) Successfully completed |
[production] |
06:56 |
<switchdc> |
(oblivian@sarin) START TASK - switchdc.stages.t06_redis(codfw, eqiad) Switch the Redis replication |
[production] |
06:52 |
<_joe_> |
artificially stopping slave replication on rdb2001 for a final test of the switchover redis stage |
[production] |
03:53 |
<urandom> |
T163292: Starting removal of Cassandra instance restbase1018-b.eqiad.wmnet |
[production] |
03:49 |
<mobrovac@tin> |
Started restart [restbase/deploy@1bfada4]: (no justification provided) |
[production] |
03:40 |
<mobrovac@tin> |
Started restart [restbase/deploy@1bfada4]: Kick RB to pick up that restbase1018 instances are gone |
[production] |
03:32 |
<mobrovac@tin> |
Finished deploy [changeprop/deploy@a19ebf8]: Temp: Decrease the transclusion update from 400 to 200 for T163292 (duration: 00m 53s) |
[production] |
03:31 |
<mobrovac@tin> |
Started deploy [changeprop/deploy@a19ebf8]: Temp: Decrease the transclusion update from 400 to 200 for T163292 |
[production] |
01:58 |
<mutante> |
naos: rsyncd is of course legitimately running on a deployment server separate from this (unlike in other cases where we used it for syncing during migration), so this was just the one config fragment for /home and not removing the service or anything |
[production] |
01:56 |
<mutante> |
naos: manually deleting rsyncd config remnants (puppet wouldn't know to clean up after itself) |
[production] |
01:47 |
<mutante> |
rsyncing /home from mira to naos (T162900) |
[production] |
01:21 |
<urandom> |
T163292: Starting removal of Cassandra instance restbase1018-a.eqiad.wmnet |
[production] |
2017-04-18
|
23:04 |
<dzahn@puppetmaster1001> |
conftool action : set/pooled=no; selector: name=restbase1018.eqiad.wmnet |
[production] |
23:02 |
<mutante> |
ms1001 - deleting old GlobalCert SSL cert for dumps.wm that was about to expire and is replaced by Letsencrypt |
[production] |
22:30 |
<mutante> |
ocg1003 gzipping ocg.log for disk space |
[production] |
21:12 |
<bblack@neodymium> |
conftool action : set/pooled=yes; selector: name=cp2002.codfw.wmnet,service=varnish-be |
[production] |
20:36 |
<bblack@neodymium> |
conftool action : set/pooled=no; selector: name=cp2002.codfw.wmnet,service=varnish-be |
[production] |
17:26 |
<mobrovac@tin> |
Finished deploy [restbase/deploy@1bfada4]: Blacklist all user pages on commons (duration: 07m 12s) |
[production] |
17:26 |
<ssastry@tin> |
Finished deploy [parsoid/deploy@b067328]: Deploying Parsoid to bump heap limits to 900m (from 600m) (duration: 06m 25s) |
[production] |
17:19 |
<ssastry@tin> |
Started deploy [parsoid/deploy@b067328]: Deploying Parsoid to bump heap limits to 900m (from 600m) |
[production] |
17:19 |
<mobrovac@tin> |
Started deploy [restbase/deploy@1bfada4]: Blacklist all user pages on commons |
[production] |
17:12 |
<XenoRyet> |
updated tools from a8b8d7242799b61dd2a48ef4e804164cd1818bc9 to a1e9342e093a85032255fc1d9904db7df13680b7 |
[production] |
17:09 |
<elukey> |
restart nutcracker in codfw (profile::mediawiki::nutcracker) to make sure that all the daemons are running with the latest config |
[production] |
16:26 |
<bblack> |
completed Traffic-layer portions of codfw switchover ( https://wikitech.wikimedia.org/wiki/Switch_Datacenter#Switchover_2 ) |
[production] |
16:21 |
<bblack> |
starting Traffic-layer portions of codfw switchover ( https://wikitech.wikimedia.org/wiki/Switch_Datacenter#Switchover_2 ) |
[production] |