production SAL

2012-02-13
14:28	<reedy>	synchronized s3.dblist 'Fix double bewikimedia'	[production]
14:28	<reedy>	synchronized pmtpa.dblist 'Fix double bewikimedia'	[production]
14:28	<reedy>	synchronized all.dblist 'Fix double bewikimedia'	[production]
13:44	<reedy>	synchronized php-1.19/extensions/MobileFrontend	[production]
02:32	<Tim>	on kaulen: increased MaxClients to 500 to better deal with the connection flood	[production]
02:23	<Tim>	bugzilla is mostly working now, although it's very slow. The DDoS requests are blocked after connection setup using <Location>	[production]
02:21	<Tim>	on kaulen: restored MaxClients	[production]
02:17	<LocalisationUpdate>	completed (1.18) at Mon Feb 13 02:17:50 UTC 2012	[production]
01:46	<Tim>	temporarily moved bugzilla to port 444 until the connection flood (~1k req/s) subsides	[production]
01:15	<Tim>	started apache with MaxClients=30	[production]
00:59	<Tim>	after kaulen came back up, it was immediately overloaded with jsonrpc.cgi. Stopped apache.	[production]
00:54	<Tim>	kaulen is not responding on ssh, web down, rebooting	[production]
2012-02-12
12:09	<mark>	Killed lsearchd processes on search8, restarted	[production]
12:07	<mark>	Rebalanced mw API app servers from load 120 to 150 in pybal list	[production]
10:08	<mark>	Increased MaxClients to 100 on API apaches in Puppet	[production]
09:45	<mark>	Restricted only opensearch API requests to the API squids	[production]
09:43	<mark>	Restricted only opensearch API requests to the API backend apaches, other API requests now hit the main mediawiki cluster	[production]
08:44	<mark>	maximum_forwards change deployed to all squids	[production]
08:42	<mark>	Set maximum_forwards 2 in squid.conf, deployed to the API squids only so far, rest is pending	[production]
07:52	<binasher>	restarted lsearchd on search{3,4,9}	[production]
02:19	<LocalisationUpdate>	completed (1.18) at Sun Feb 12 02:19:17 UTC 2012	[production]
2012-02-11
20:31	<apergos>	restarted lighttpd on dataset2	[production]
17:28	<RobH>	manual test of each affected service complete, db9 fully online.	[production]
17:26	<RobH>	db9 moved, all systems online	[production]
17:08	<RobH>	db9 shutting down to move racks, offline during this includes: blogs, bugzilla, racktables, rt, survey, etherpad, observium	[production]
02:18	<LocalisationUpdate>	completed (1.18) at Sat Feb 11 02:18:36 UTC 2012	[production]
00:17	<reedy>	synchronizing Wikimedia installation... :	[production]
2012-02-10
22:17	<LeslieCarr>	fixing the labs apache2 puppet groups	[production]
21:48	<RobH>	memory in cp1017 wasn't properly seated as far as I can tell, if it doesn't mess up again it should be ok.	[production]
21:41	<RobH>	cp1017 being tested for bad memory	[production]
21:36	<RobH>	powercycling msw-a2-eqiad resolves all mgmt issues in rack	[production]
21:34	<RobH>	powercycling msw-a1-eqiad.	[production]
21:29	<RobH>	db1001 rebooting, locked up	[production]
20:53	<RobH>	updating dns for new db hosts	[production]
19:59	<Reedy>	Checking out 1.19wmf1 to /tmp on fenari	[production]
19:12	<RobH>	oxygen setup and installed per rt2343, still needs puppet runs and full deployment per rt 2430	[production]
17:58	<RobH>	updating dns for oxygen internal ip	[production]
17:21	<mutante>	labs logging is broken	[production]
17:14	<RobH>	oxygen offline for hard disk upgrade to replace locke	[production]
16:50	<mutante>	running sync-apache, trying to redirect office.wm to https	[production]
16:07	<mark>	Rebalanced appserver load balancing by giving the new mw* pmtpa app servers weight 150 in the pybal server list	[production]
15:17	<mark>	Turned on KeepAlive on apaches for better miss service times from eqiad	[production]
13:42	<mark>	Configured cp1001 and cp1020 to contact backend servers directly instead of via pmtpa squids	[production]
12:02	<mark>	Decommissioning sq38, sq46 and sq47 in squid configurator	[production]
11:50	<mark>	Making cp1001-1005 API squids	[production]
05:08	<maplebed>	deployed squid config to uploads to send 100% of thumbnail traffic to swift	[production]
02:49	<maplebed>	deploying fix for & bug with swift (files with an & in the name wouldn't load properly)	[production]
02:18	<LocalisationUpdate>	completed (1.18) at Fri Feb 10 02:18:37 UTC 2012	[production]
00:22	<LeslieCarr>	increased nagios max concurrent checks on spence and lowered the interval between processing them	[production]
00:20	<maplebed>	deployed squid config to upload squids rolling thumbnails back to 75% handled by swift to test the & bug	[production]