2010-08-13
13:55 <mark> /dev/sda on sq57 is busted [production]
13:54 <RobH> removed search17 from search_pool_3 [production]
13:50 <mark> Set idleconnection.timeout = 300 (NOT idlecommand.timeout) on all LVS services on lvs3, restarting pybal [production]
13:44 <mark> powercycled sq57, which was stuck in [16538652.048532] BUG: soft lockup - CPU#3 stuck for 61s! [gmond:15746] [production]
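(Note: with the kernel wedged in a soft lockup, a power cycle like this is normally issued through the host's out-of-band management interface rather than the OS. A minimal sketch, assuming IPMI access via ipmitool; the management hostname and credentials are hypothetical placeholders:
  # Power-cycle a wedged host over its out-of-band (IPMI) interface.
  # "sq57-mgmt" and the credentials are illustrative, not real values.
  ipmitool -I lanplus -H sq57-mgmt -U admin -P secret chassis power cycle )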
13:42 <mark> sq58 was down for a long long time. Brought it back up and synced it [production]
13:37 <RobH> added search7 back into search_pool_3, kept search17 in as well [production]
13:27 <RobH> changed search_pool_3 back from search7 to search17 since it failed [production]
13:25 <robh> synchronized php-1.5/wmf-config/lucene.php 'Re-enabling LucenePrefixSearch - pushed changes on lvs3 to put search back to normal use' [production]
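(Note: the "synchronized <file> '<comment>'" entries in this log are written automatically when a configuration change is pushed from the deployment host to the application servers. A minimal sketch of the kind of invocation that produces them, assuming a sync-file style deployment helper; the script name and calling convention are an assumption:
  # Push one updated config file to all apaches and log the deployment.
  sync-file php-1.5/wmf-config/lucene.php 'Re-enabling LucenePrefixSearch' )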
12:45 <mark> API squid cluster is too flaky for my taste. Converting sq33 into an API backend squid as well [production]
12:40 <mark> Shutdown puppet and backend squid on sq32 [production]
11:41 <mark> Corrected the changed hostname for api.svc.pmtpa.wmnet in the text squid config files [production]
11:37 <mark> Temporarily rejecting requests to sq31 backend to give it some breathing room while it's reading its COSS dirs [production]
11:32 <mark> Reinstalled sq31 with Lucid [production]
10:25 <mark> Shutting down backend squid on sq31 to see the load impact [production]
10:18 <mark> Setup backend request statistics for the API on torrus [production]
09:15 <rainman-sr> bringing up search1-12 and doing some initial index warmups [production]
01:54 <RobH> searchidx1, search1-search12 relocated and online, not in cluster until Robert can fix in the morning. The other half will have to move on a different day; 12 hours in the datacenter is long enough. [production]
01:40 <RobH> finished moving searchidx1 and search1-12, bringing them back up now [production]
2010-08-12
23:10 <RobH> shutting down searchidx1, search1-12 for move [production]
22:40 <robh> synchronized php-1.5/wmf-config/lucene.php 'swapped search13 and search18 for migration' [production]
22:37 <robh> synchronized php-1.5/wmf-config/lucene.php 'reverting so search13 and search18 can change roles' [production]
22:22 <robh> synchronized php-1.5/wmf-config/lucene.php 'changes back in place to migrate searchidx1 and search1-10' [production]
22:19 <RobH> puppet updated on all search servers, confirmed all have all three lvs ip addresses [production]
21:55 <mark> Configured puppet to bind all LVS service IPs to all search servers [production]
21:54 <RobH> reverted search_pool changes on lvs [production]
21:54 <robh> synchronized php-1.5/wmf-config/lucene.php 'rolling it back' [production]
21:48 <robh> synchronized php-1.5/wmf-config/lucene.php 'changing settings for migration of searchidx1 and search1-search12' [production]
21:43 <RobH> changing lvs3 search pool settings for server relocations [production]
20:33 <robh> synchronized php-1.5/wmf-config/lucene.php 'commented out wgEnableLucenePrefixSearch for search server relocation' [production]
19:30 <RobH> srv281 reinstall done but not online as puppet has multiple package issues, leaving out of lvs [production]
19:09 <RobH> srv230 is on, but set to false in lvs. Do not push back into rotation until after new memory arrives and is installed tomorrow (rt#69) [production]
18:59 <robh> synchronized php-1.5/wmf-config/mc.php 'updating without srv230' [production]
18:53 <RobH> srv230 coming down for memory testing [production]
18:49 <RobH> set srv230 to false in lvs, need to test memory [production]
18:04 <RobH> reinstalling srv281 [production]
17:59 <RobH> nix that, srv125 was ex-es, leaving those for now. [production]
17:58 <RobH> pulling srv103 & srv125 for wipe (pulling stuff with temp warnings first) [production]
17:53 <robh> synchronized php-1.5/wmf-config/mc.php 'removed srv103, replacing it with srv244' [production]
17:47 <RobH> pulling srv95 for wipe [production]
17:38 <RobH> srv110 removed from lvs3 config [production]
17:36 <mark> Removed all apaches up to srv150 from the appserver LVS pool on lvs3 [production]
17:21 <Fred> restarting apache on webservers (220,221,222,224) [production]
16:45 <RobH> wipe running on adler and amane, and they have been removed from puppet and dsh node groups [production]
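(Note: pulling a host from puppet and dsh like this usually means revoking its Puppet certificate on the puppetmaster and dropping it from the dsh node group files used for fleet-wide commands. A minimal sketch; the FQDN and the group file name are hypothetical examples:
  # Revoke the decommissioned host's Puppet certificate (run on the puppetmaster).
  puppetca --clean amane.wikimedia.org
  # Remove it from a dsh node group; the "apaches" group file is illustrative.
  sed -i '/^amane\./d' /etc/dsh/group/apaches )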
16:12 <jeluf> synchronized docroot/bits/index.html [production]
15:41 <mark> Setup ports ge-2/0/0 to ge-2/0/20 for search servers on asw-b-sdtpa [production]
15:03 <mark> Shutdown BGP session to AS1257 130.244.6.249 on port 2/5 of br1-knams, preparing for cable move [production]
13:08 <mark> Recovered backend squid on knsq11 [production]
12:53 <mark> Reassembling RAID arrays md0 and md1 on knsq11 [production]
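(Note: after a drive swap, the arrays are typically reassembled with mdadm from the members' existing superblocks. A minimal sketch; the member partition names below are illustrative, not the actual layout of knsq11:
  # Reassemble both RAID arrays from their existing superblocks.
  mdadm --assemble /dev/md0 /dev/sda1 /dev/sdb1
  mdadm --assemble /dev/md1 /dev/sda2 /dev/sdb2
  # Or scan for and assemble all arrays listed in mdadm.conf:
  mdadm --assemble --scan )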
12:40 <mark> Running apt-get upgrade && reboot on amssq31 [production]
11:17 <mark> Shutdown knsq1 and knsq11 for swapping drives [production]