2009-01-15

  | 21:16 | <brion> | seems magically better now | [production] |
  | 20:48 | <brion> | ok webserver7 started | [production] |
  | 20:43 | <brion> | per mark's recommendation, retrying webserver7 now that we've reduced hit rate and are past peak... | [production] |
  | 20:28 | <brion> | bumping styles back to apaches | [production] |
  | 20:25 | <brion> | restarted w/ some old server config bits commented out | [production] |
  | 20:24 | <brion> | tom recompiled lighty w/ the solaris bug patch. may or may not be workin' better, but still not throwing a lot of reqs through. checking config... | [production] |
  | 19:48 | <brion> | trying webserver7 again to see if it's still doing the funk and if we can measure something useful | [production] |
  | 19:47 | <brion> | we're gonna poke around http://redmine.lighttpd.net/issues/show/673 but we're really not sure what the original problem was to begin with yet | [production] |
  | 19:39 | <brion> | turning lighty back on, gonna poke it some more | [production] |
  | 19:31 | <brion> | stopping lighty again. not sure what the hell is going on, but it seems not to respond to most requests | [production] |
  | 19:27 | <brion> | image scalers are still doing wayyy under what they're supposed to, but they are churning some stuff out. not overloaded that i can see... | [production] |
  | 19:20 | <brion> | seems to spawn its php-cgi's ok | [production] |
  | 19:19 | <brion> | trying to stop lighty to poke at fastcgi again | [production] |
  | 19:15 | <brion> | looks like ms1+lighty is successfully serving images, but failing to hit the scaling backends. possible fastcgi buggage | [production] |
  | 19:12 | <brion> | started lighty on ms1 a bit ago. not really sure if it's configured right | [production] |
  | 19:00 | <brion> | stopping it again. confirmed load spike still going on | [production] |
  | 18:58 | <brion> | restarting webserver on ms1, see what happens | [production] |
  | 18:56 | <brion> | apache load seems to have dropped back to normal | [production] |
  | 18:48 | <brion> | switching stylepath back to upload (should be cached), seeing if that affects apache load | [production] |
  | 18:40 | <brion> | switching $wgStylePath to apaches for the moment | [production] |
  | 18:39 | <brion> | load dropping on ms1; ping time stabilizing also | [production] |
  | 18:38 | <RobH> | sq14, sq15, sq16 back up and serving requests | [production] |
  | 18:38 | <brion> | trying stopping/starting webserver on ms1 | [production] |
  | 18:27 | <brion> | nfs upload5 is not happy :( | [production] |
  | 18:27 | <brion> | some sort of issues w/ media fileserver, we think, perhaps pressure due to some upload squid cache clearing? | [production] |
  | 18:23 | <RobH> | sq14-sq16 offline, rebooting and cleaning cache | [production] |
  | 18:16 | <RobH> | sq2, sq4, and sq10 were unresponsive and down. Restarted, cleaned cache, and brought back online. | [production] |
  | 04:32 | <Tim> | increased squid max post size from 75MB to 110MB so that people can actually upload 100MB files as advertised in the media | [production] | 
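
Note on the 04:32 entry: raising the squid POST limit is a small config change. A minimal sketch, assuming the limit is enforced via squid.conf's request_body_max_size directive; the directive placement and value format here are an illustration, not a copy of the production config:

    # squid.conf -- illustrative sketch only
    # Allow POST bodies somewhat above the advertised 100 MB upload limit
    # (was 75 MB, which rejected large uploads before they reached MediaWiki).
    request_body_max_size 110 MB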
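
Note on the 18:40/18:48 entries: $wgStylePath is the MediaWiki setting that controls where skin CSS/JS is loaded from, so pointing it at the apaches (uncached) versus the upload cluster (squid-cached) is a one-line settings change. A minimal sketch, assuming the usual LocalSettings/CommonSettings mechanism; the paths and hostname below are hypothetical, not the values actually used:

    // hypothetical settings snippet, for illustration only
    // 18:40 -- serve skin assets straight from the apaches for the moment:
    $wgStylePath = '/skins-1.5';
    // 18:48 -- switch back to the upload host, which should be squid-cached:
    // $wgStylePath = 'http://upload.wikimedia.org/skins-1.5';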
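
Note on the 19:12-20:25 entries: the 19:15/19:20 lines suggest lighttpd ("lighty") on ms1 serves stored images directly and reaches the thumbnailing scalers through mod_fastcgi-managed php-cgi processes. A minimal sketch of what such a fastcgi.server block can look like; whether the backends are local or remote, plus the URL prefix, socket path and process counts, are assumptions rather than the real ms1 config:

    # lighttpd.conf -- illustrative sketch only
    server.modules += ( "mod_fastcgi" )

    # Hand thumbnail requests to spawned php-cgi workers.
    fastcgi.server = ( "/w/thumb.php" =>
        (( "socket"          => "/tmp/php-scaler.sock",
           "bin-path"        => "/usr/bin/php-cgi",
           "max-procs"       => 4,
           "bin-environment" => ( "PHP_FCGI_CHILDREN" => "8" ) ))
    )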
            
  
2009-01-13

  | 23:32 | <Tim> | fixed NRPE on db29 | [production] |
  | 22:56 | <Tim> | cleaned up binlogs on db1 and ixia | [production] |
  | 22:54 | <brion> | poking WP alias on frwiki [[bugzilla:16887]] | [production] |
  | 21:11 | <RobH> | setup ganglia on erzurumi | [production] |
  | 20:42 | <brion> | setting all pdf generators to use the new server | [production] |
  | 20:40 | <brion> | testing pdf gen on erzurumi on testwiki | [production] |
  | 20:35 | <RobH> | setup erzurumi for dev testing | [production] |
  | 20:35 | <RobH> | some random updates on [[server roles]] to clean it up | [production] |
  | 19:37 | <mark> | Restored normal situation, with 14907 -> 43821 traffic downpreffed to HGTN to avoid peering network congestion | [production] |
  | 18:40 | <mark> | Retracted outbound announcement to all AMS-IX peers, 16265 and 13030 to force inbound via 1299 | [production] |
  | 18:25 | <mark> | Undid any routing changes as they were not having the desired effect | [production] |
  | 18:14 | <mark> | Prepended 43821 twice on outgoing announcements to 16265 to make pmtpa-esams path via nycx less attractive | [production] |
  | 11:38 | <Tim> | reducing innodb_buffer_pool_size on db19, db21, db22, db29 | [production] |
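
Note on the 11:38 entry: on the MySQL versions in use at the time, innodb_buffer_pool_size cannot be changed at runtime, so reducing it means editing the server config and restarting mysqld on each of db19, db21, db22 and db29. A minimal sketch; the file path and the new size are assumptions, not the values actually chosen:

    # /etc/my.cnf -- illustrative value only
    [mysqld]
    # Reduced to leave more headroom for the OS, connection buffers, etc.
    innodb_buffer_pool_size = 8G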
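
Note on the 22:56 entry: clearing old binary logs on a master is typically done with PURGE MASTER LOGS, after confirming via SHOW SLAVE STATUS on each replica that no slave still needs the files being removed. A minimal sketch; the log file name is made up:

    -- illustrative only; run against db1 / ixia as appropriate
    SHOW MASTER LOGS;
    PURGE MASTER LOGS TO 'db1-bin.000123';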
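
Note on the 18:14 entry: AS-path prepending pads an announcement with extra copies of your own AS number so the remote network (here AS 16265) sees the path as longer and prefers another route. A Cisco-style sketch of prepending AS 43821 twice on announcements toward 16265; the platform, neighbor address and route-map name are all assumptions:

    ! illustrative configuration, not the actual router config
    route-map TO-AS16265-OUT permit 10
     set as-path prepend 43821 43821
    !
    router bgp 43821
     neighbor 192.0.2.1 remote-as 16265
     neighbor 192.0.2.1 route-map TO-AS16265-OUT out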