2010-08-14
09:39 <mark> START SLAVE on ms1, catching up with ms3 [production]
09:38 <mark> RESET SLAVE on db5 [production]
09:37 <mark> STOP SLAVE on db5 [production]
09:35 <mark> Stopped apparmor on ms1 [production]
08:41 <Andrew> Leaving as-is for now, hoping somebody with appropriate permissions can fix it later. [production]
08:40 <Andrew> STOP SLAVE on db5 gives me ERROR 1045 (00000): Access denied for user: 'wikiadmin@208.80.152.%' (Using password: NO) [production]
08:34 <Andrew> Slave is supposedly still running on db5. Assuming Roan didn't stop it when he switched masters a few days ago. Going to text somebody to confirm that stopping it is the correct course of action. [production]
08:24 <Andrew> db5 can't be lagged, it's the master ;-). Obviously something wrong with wfWaitForSlaves. [production]
08:19 <Andrew> db5 lagged 217904 seconds [production]
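[Note: the 08:19-09:39 entries above are a replication cleanup: db5, now a master, still had a stale slave configuration, while ms1 needed to resume replicating from ms3. A minimal sketch of that sequence, assuming plain mysql CLI access with sufficient privileges (credentials omitted); the exact statements run are not recorded in the log:]
    # confirm whether a (stale) slave thread is still configured on db5
    mysql -h db5 -e "SHOW SLAVE STATUS\G"
    # db5 is a master now, so stop and discard its leftover slave config
    mysql -h db5 -e "STOP SLAVE; RESET SLAVE;"
    # resume replication on ms1 and watch it catch up with ms3
    mysql -h ms1 -e "START SLAVE;"
    mysql -h ms1 -e "SHOW SLAVE STATUS\G" | grep -E "Slave_(IO|SQL)_Running|Seconds_Behind_Master"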
05:09 <Andrew> Ran thread_pending_relationship and thread_reaction schema changes on all LiquidThreads wikis [production]
05:06 <andrew> synchronizing Wikimedia installation... Revision: 70933 [production]
05:04 <Andrew> About to update LiquidThreads production version to the alpha. [production]
2010-08-13
22:03 <mark> API logins on commons (only) are reported broken [production]
21:45 <mark> Set correct $cluster variable for reinstalled knsq* squids [production]
21:03 <mark> Increased cache_mem from 1000 to 2500 on sq33, like the other API backend squids [production]
20:58 <mark> Stopping backend squid on sq33 [production]
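[Note: the 21:03 entry above corresponds to squid's cache_mem directive. A rough sketch of that change, assuming a stock /etc/squid/squid.conf path and that the value is given in MB; how the config was actually edited and deployed is not recorded here:]
    grep '^cache_mem' /etc/squid/squid.conf             # was: cache_mem 1000 MB
    sed -i 's/^cache_mem 1000 MB$/cache_mem 2500 MB/' /etc/squid/squid.conf
    squid -k reconfigure                                 # pick up the change without restarting the backend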
20:50 <jeluf> synchronized php-1.5/wmf-config/InitialiseSettings.php '24769 - Import source addition for tpi.wikipedia.org' [production]
17:46 <Fred> and srv100 [production]
17:45 <Fred> restarted apache on srv219 and srv222 [production]
15:57 <mark> synchronized php-1.5/wmf-config/mc.php 'Remove some to-be-decommissioned from the down list' [production]
15:56 <mark> synchronized php-1.5/wmf-config/mc.php 'Remove some to-be-decommissioned hosts from the down list' [production]
15:53 <RobH> srv146 removed from puppet and nodelists, slated for wipe, decommissioned. [production]
15:47 <mark> Sent srv146 to death using echo b > /proc/sysrq-trigger. It had a read-only filesystem and is therefore decommissioned. [production]
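[Note: on the 15:47 entry: writing "b" to /proc/sysrq-trigger makes the kernel reboot immediately, with no sync and no clean unmount, which is why it still works on a box whose filesystem has gone read-only. Sketch, assuming magic SysRq is enabled on the host:]
    echo 1 > /proc/sys/kernel/sysrq    # ensure magic SysRq is enabled (assumption: not locked down)
    echo b > /proc/sysrq-trigger       # immediate reboot: no sync, no clean unmount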
15:38 <mark> Restarted backend squid on sq33 [production]
15:36 <mark> synchronized php-1.5/wmf-config/mc.php 'Remove some to-be-decommissioned hosts from the down list' [production]
15:25 <mark> Reinstalled sq32 with Lucid [production]
15:01 <mark> Removed sq86 and sq87 from API LVS pool [production]
14:55 <mark> sq80 had been down for a long time. Brought it back up and synced it [production]
14:54 <rainman-sr> all of the search cluster restored to pre-relocation configuration [production]
14:34 <robh> synchronized php-1.5/wmf-config/lucene.php 'reverting search13 to search11' [production]
13:55 <mark> /dev/sda on sq57 is busted [production]
13:54 <RobH> removed search17 from search_pool_3 [production]
13:50 <mark> Set idleconnection.timeout = 300 (NOT idlecommand.timeout) on all LVS services on lvs3, restarting pybal [production]
13:44 <mark> powercycled sq57, which was stuck in [16538652.048532] BUG: soft lockup - CPU#3 stuck for 61s! [gmond:15746] [production]
13:42 <mark> sq58 was down for a long long time. Brought it back up and synced it [production]
13:37 <RobH> added search7 back into search_pool_3, kept search17 in as well [production]
13:27 <RobH> changed search_pool_3 back from search7 to search17 since it failed [production]
13:25 <robh> synchronized php-1.5/wmf-config/lucene.php 'Re-enabling LucenePrefixSearch - pushed changes on lvs3 to put search back to normal use' [production]
12:45 <mark> API squid cluster is too flaky for my taste. Converting sq33 into an API backend squid as well [production]
12:40 <mark> Shutdown puppet and backend squid on sq32 [production]
11:41 <mark> Corrected the changed hostname for api.svc.pmtpa.wmnet in the text squid config files [production]
11:37 <mark> Temporarily rejecting requests to sq31 backend to give it some breathing room while it's reading its COSS dirs [production]
11:32 <mark> Reinstalled sq31 with Lucid [production]
10:25 <mark> Shutting down backend squid on sq31 to see the load impact [production]
10:18 <mark> Setup backend request statistics for the API on torrus [production]
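[Note: the 11:37 entry above does not say how requests to the sq31 backend were rejected while it re-read its COSS cache dirs. One plausible way, assuming the backend squid listens on port 3128 and iptables is available, is a temporary TCP reject; this is an illustration, not the recorded procedure:]
    # refuse new connections to the backend port while the store rebuild runs
    iptables -I INPUT -p tcp --dport 3128 -j REJECT --reject-with tcp-reset
    # once cache.log shows the rebuild is complete, remove the rule again
    iptables -D INPUT -p tcp --dport 3128 -j REJECT --reject-with tcp-reset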
09:15 <rainman-sr> bringing up search1-12 and doing some initial index warmups [production]
01:54 <RobH> searchidx1, search1-search12 relocated and online, not in cluster until Robert can fix in the morning. The other half will have to move on a different day, 12 hours in the datacenter is long enough. [production]
01:40 <RobH> finished moving searchidx1 and search1-12, bringing them back up now [production]
2010-08-12
23:10 <RobH> shutting down searchidx1, search1-12 for move [production]
22:40 <robh> synchronized php-1.5/wmf-config/lucene.php 'swapped search13 and search18 for migration' [production]
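[Note: the 23:10 entry above is the start of the physical move. A trivial sketch of scripting that shutdown, assuming ssh and sudo access to the hosts; the actual procedure used is not recorded:]
    for h in searchidx1 search{1..12}; do
        ssh "$h" 'sudo shutdown -h now'
    done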