2011-04-03 §
14:26 <RoanKattouw> Running sync-common-all to deploy r85256 [production]
13:03 <apergos> shot rsyncs on ms5, setting 777 dir perms on all thumbnail dirs (eg e/ef/blablah.jpg) so scalers can write into them [production]
12:53 <apergos> did same for rest of projects and subdirs (777 on hash dirs) [production]
12:47 <apergos> chmod 777 on commons/thumb/*/* on ms5 so that scalers can create directories in there (mismatch of uid apache vs www-data) [production]
11:12 <mark> Raised per-squid connection limit to ms5 from 200 to 400 connections [production]
11:05 <mark> Raised per-squid connection limit to ms5 from 100 to 200 connections [production]
10:55 <mark> Fixed squid loop, the pmtpa.upload squids were using the esams squids as "CARP parents for distant content" [production]
10:29 <mark> Fixed puppet on sq42/43 [production]
09:44 <mark> Lowered FCGI thumb handlers from 90 to 60 again, to reduce concurrency [production]
08:08 <mark> Started 4 more rsyncs (8 total now) [production]
07:49 <mark> Removed mlocate from ms5, puppetising [production]
07:42 <mark> Started 4 rsyncs from ms4 to ms5 (--ignore-existing) [production]
07:32 <mark> increased thumb handler count from 60 to 90 [production]
07:11 <mark> Doubled the amount of fcgi thumb handlers [production]
07:08 <mark> Turned off logging of 404s to nginx error.log [production]
06:50 <mark> Restarted Apache on the image scalers [production]
06:49 <mark> Reconfigured ms5 to use the 404 thumb handler [production]
06:48 <Ryan_Lane> disabling nfs on ms4 [production]
06:33 <mark> Running puppet on all apaches to fix fstab and mount ms5.pmtpa.wmnet:/export/thumbs [production]
06:32 <mark> Unmounting /mnt/thumbs on all mediawiki-installation servers [production]
06:30 <mark> Remounted NFS /mnt/thumbs on the scalers to ms5 [production]
06:28 <Ryan_Lane> bring nfs back up [production]
06:28 <Ryan_Lane> brought ms4 back up. stopping the web server service and nfs [production]
06:20 <mark> Set up NFS kernel server on ms5 [production]
06:18 <Ryan_Lane> powercycling ms4 [production]
05:29 <Ryan_Lane> rebooting ms4 with -d to get a coredump [production]
05:14 <apergos> re-enabling webserver on ms4 for testing [production]
04:45 <apergos> stopping web service on ms4 for the moment [production]
04:29 <apergos> shot webserver again [production]
04:26 <apergos> turned off hourly snaps on ms4, turned back on webserver and nfs [production]
04:09 <apergos> rebooted ms4, shut down webserver and nfsd temporarily for testing [production]
02:58 <apergos> still looking at kernel memory issues, still rebooting, ryan should be here in a few minutes to help out [production]
02:03 <apergos> a solaris advisor... also have zfs arc cache max to 2g which is ridiculously low but wtf right? [production]
02:02 <apergos> set tcp_time_wait_interval to 10000 at suggestion of [production]
01:37 <apergos> lowered zfs arc max to 2g (someone should reset this later)... will take effect on next reboot [production]
00:29 <apergos> rebooting with the new zfs arc cache max value, which will reduce the min value as well... dunno if this will give us enough breathing room or not [production]
00:24 <apergos> set zfs arc cache to ridiculously low value of 4gb, since when it's healthy it's using much less than that (1gb), this will take effect on reboot [production]
00:22 <Reedy> Still experiencing MS4 issues, thumb service is likely to be problematic for most users [production]
2011-04-02 §
23:47 <apergos> rebooting ms4 from serial console, out to lunch and took the renderers down too [production]
18:42 <catrope> synchronized php-1.17/wmf-config/CommonSettings.php 'Per NeilK, change Category:Uploaded_by_UploadWizard to Category:Uploaded_with_UploadWizard' [production]
17:59 <mark> Upgrading varnish to 2.1.5 [production]
17:14 <demon> synchronized php-1.17/includes/filerepo/LocalFile.php 'r85200' [production]
14:19 <mark> Implemented CARP weights for distant CARP parents on squid configurator (used to be all equal before) [production]
11:36 <mark> Created btrfs filesystem on ms6, striped (raid10 style) over 46 devices - very experimental [production]
09:50 <mark> Reinstalling ms6 with Ubuntu 10.04 [production]
09:50 <mark> Fixed torrus again [production]
06:02 <mark> !wikipedia The image thumbnail servers appear stable now [production]
04:59 <mark> Increased nginx worker processes from 1 to 4, set file limit to 30k [production]
04:40 <mark> !wikipedia Image Thumbnail server outage, it's being worked on [production]
04:34 <mark> Power cycling ms4 again [production]