2017-04-19 §
06:56 <switchdc> (oblivian@sarin) START TASK - switchdc.stages.t06_redis(codfw, eqiad) Switch the Redis replication [production]
06:52 <_joe_> artificially stopping slave replication on rdb2001 for a final test of the switchover redis stage [production]
03:53 <urandom> T163292: Starting removal of Cassandra instance restbase1018-b.eqiad.wmnet [production]
03:49 <mobrovac@tin> Started restart [restbase/deploy@1bfada4]: (no justification provided) [production]
03:40 <mobrovac@tin> Started restart [restbase/deploy@1bfada4]: Kick RB to pick up restbase1018 instances are gone [production]
03:32 <mobrovac@tin> Finished deploy [changeprop/deploy@a19ebf8]: Temp: Decrease the transclusion update from 400 to 200 for T163292 (duration: 00m 53s) [production]
03:31 <mobrovac@tin> Started deploy [changeprop/deploy@a19ebf8]: Temp: Decrease the transclusion update from 400 to 200 for T163292 [production]
01:58 <mutante> naos: rsyncd is of course legitimately running on a deployment server separate from this (unlike in other cases where we used it for syncing during migration), so this was just the one config fragment for /home and not removing the service or anything [production]
01:56 <mutante> naos: manually deleting rsyncd config remnants (puppet wouldn't know to clean up after itself) [production]
01:47 <mutante> rsyncing /home from mira to naos (T162900) [production]
01:21 <urandom> T163292: Starting removal of Cassandra instance restbase1018-a.eqiad.wmnet [production]
2017-04-18 §
23:04 <dzahn@puppetmaster1001> conftool action : set/pooled=no; selector: name=restbase1018.eqiad.wmnet [production]
23:02 <mutante> ms1001 - deleting old GlobalCert SSL cert for dumps.wm that was about to expire and is replaced by Letsencrypt [production]
22:30 <mutante> ocg1003 gzipping ocg.log for disk space [production]
21:12 <bblack@neodymium> conftool action : set/pooled=yes; selector: name=cp2002.codfw.wmnet,service=varnish-be [production]
20:36 <bblack@neodymium> conftool action : set/pooled=no; selector: name=cp2002.codfw.wmnet,service=varnish-be [production]
17:26 <mobrovac@tin> Finished deploy [restbase/deploy@1bfada4]: Blacklist all user pages on commons (duration: 07m 12s) [production]
17:26 <ssastry@tin> Finished deploy [parsoid/deploy@b067328]: Deploying Parsoid to bump heap limits to 900m (from 600m) (duration: 06m 25s) [production]
17:19 <ssastry@tin> Started deploy [parsoid/deploy@b067328]: Deploying Parsoid to bump heap limits to 900m (from 600m) [production]
17:19 <mobrovac@tin> Started deploy [restbase/deploy@1bfada4]: Blacklist all user pages on commons [production]
17:12 <XenoRyet> updated tools from a8b8d7242799b61dd2a48ef4e804164cd1818bc9 to a1e9342e093a85032255fc1d9904db7df13680b7 [production]
17:09 <elukey> restart nutcracker in codfw (profile::mediawiki::nutcracker) to make sure that all the daemons are running with the latest config [production]
16:26 <bblack> completed Traffic-layer portions of codfw switchover ( https://wikitech.wikimedia.org/wiki/Switch_Datacenter#Switchover_2 ) [production]
16:21 <bblack> starting Traffic-layer portions of codfw switchover ( https://wikitech.wikimedia.org/wiki/Switch_Datacenter#Switchover_2 ) [production]
16:15 <jynus> reimporting some rows to dbstore1002 on jawiki and ruwiki T160509 [production]
16:12 <godog> reboot tin to fix cpu mhz issue and check bios settings - T163158 [production]
16:09 <mobrovac@tin> Finished deploy [restbase/deploy@960b468]: Blacklist an enwiki and a commons page (duration: 08m 16s) [production]
16:01 <mobrovac@tin> Started deploy [restbase/deploy@960b468]: Blacklist an enwiki and a commons page [production]
16:00 <mobrovac@tin> Finished deploy [restbase/deploy@960b468]: Dev Cluster: Blacklist an enwiki and a commons page (duration: 01m 42s) [production]
15:58 <mobrovac@tin> Started deploy [restbase/deploy@960b468]: Dev Cluster: Blacklist an enwiki and a commons page [production]
15:20 <elukey> restored default output-buffer config for rdb2005:6479 [production]
15:08 <godog> puppet-run on cache_upload in codfw/eqiad to pick up swift a/p changes [production]
15:02 <godog> puppet-run on cache_upload in codfw/eqiad to pick up switch a/a changes [production]
15:02 <gehel> upgrading elastic2020 to elasticsearch 5.1.2 [production]
14:55 <_joe_> switchover of services, misc things done [production]
14:54 <oblivian:> Setting restbase-async in codfw DOWN [production]
14:54 <oblivian:> Setting restbase-async in eqiad UP [production]
14:43 <_joe_> switching traffic for all a/a services plus maps and restbase to codfw-only [production]
14:38 <_joe_> forcing puppet run on caches for catching up with the a/a setting of maps and restbase [production]
14:33 <oblivian:> Setting restbase in eqiad DOWN [production]
14:33 <_joe_> starting switchover of services eqiad => codfw; external traffic will be switched over, as well as internal traffic to restbase [production]
14:25 <gehel> un-ban elastic2020 to get ready for real-life test during switchover - T149006 [production]
14:22 <elukey> executed config set client-output-buffer-limit "normal 0 0 0 slave 2147483648 2147483648 300 pubsub 33554432 8388608 60" on rdb2005:6479 as attempt to solve slave lagging - T159850 [production]
14:21 <oblivian:> Setting mobileapps in eqiad UP [production]
14:14 <oblivian:> Setting mobileapps in eqiad DOWN [production]
14:11 <elukey> executed CONFIG SET appendfsync everysec (default) to restore defaults on rdb2005:6479- T159850 [production]
14:08 <switchdc> (oblivian@sarin) END TASK - switchdc.stages.t09_restart_parsoid(codfw, eqiad) Successfully completed [production]
14:04 <elukey> executed CONFIG SET appendfsync no on rdb2005:6479 to test if fsync stalls affect replication - T159850 [production]
13:50 <switchdc> (oblivian@sarin) START TASK - switchdc.stages.t09_restart_parsoid(codfw, eqiad) Rolling restart parsoid in eqiad and codfw [production]
13:35 <switchdc> (oblivian@sarin) END TASK - switchdc.stages.t01_stop_maintenance(codfw, eqiad) Failed to execute [production]