2011-04-03
06:30 <mark> Remounted NFS /mnt/thumbs on the scalers to ms5 [production]
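    The export path is not given in the log; a plausible shape of that remount on a scaler, assuming ms5 exports a /export/thumbs tree, would be:
        umount -l /mnt/thumbs
        mount -t nfs ms5.pmtpa.wmnet:/export/thumbs /mnt/thumbs   # export path is an assumption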
06:28 <Ryan_Lane> bringing nfs back up [production]
06:28 <Ryan_Lane> brought ms4 back up. stopping the web server service and nfs [production]
06:20 <mark> Setup NFS kernel server on ms5 [production]
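    A minimal sketch of such a setup on Ubuntu, with the export path and allowed client range as assumptions rather than values from the log:
        apt-get install nfs-kernel-server
        echo '/export/thumbs 10.0.0.0/16(rw,async,no_subtree_check)' >> /etc/exports
        exportfs -ra    # re-export without restarting the server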
06:18 <Ryan_Lane> powercycling ms4 [production]
05:29 <Ryan_Lane> rebooting ms4 with -d to get a coredump [production]
05:14 <apergos> re-enabling webserver on ms4 for testing [production]
04:45 <apergos> stopping web service on ms4 for the moment [production]
04:29 <apergos> shot webserver again [production]
04:26 <apergos> turned off hourly snaps on ms4, turned back on webserver and nfs [production]
04:09 <apergos> rebooted ms4, shut down webserver and nfsd temporarily for testing [production]
02:58 <apergos> still looking at kernel memory issues, still rebooting, ryan should be here in a few minutes to help out [production]
02:03 <apergos> a solaris advisor... also have zfs arc cache max set to 2g which is ridiculously low but wtf right? [production]
02:02 <apergos> set tcp_time_wait_interval to 10000 at suggestion of [production]
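    On Solaris this tunable is normally set with ndd (value in milliseconds, so 10000 = 10s); a sketch, assuming the standard TCP parameter is meant:
        ndd -set /dev/tcp tcp_time_wait_interval 10000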
01:37 <apergos> lowered zfs arc max to 2g (someone should reset this later)... will take effect on next reboot [production]
00:29 <apergos> rebooting with the new zfs arc cache max value, which will reduce the min value as well... dunno if this will give us enough breathing room or not [production]
00:24 <apergos> set zfs arc cache to ridiculously low value of 4gb, since when it's healthy it's using much less than that (1gb), this will take effect on reboot [production]
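    On Solaris the ARC ceiling is usually capped in /etc/system and only read at boot, which matches the "takes effect on reboot" wording in these entries; a sketch using the 4 GB figure from this entry:
        * /etc/system: cap the ZFS ARC at 4 GB (0x100000000 bytes), applied at next boot
        set zfs:zfs_arc_max = 0x100000000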
00:22 <Reedy> Still experiencing MS4 issues, thumb service is likely to be problematic for most users [production]
2011-04-02
23:47 <apergos> rebooting ms4 from serial console, out to lunch and took the renderers down too [production]
18:42 <catrope> synchronized php-1.17/wmf-config/CommonSettings.php 'Per NeilK, change Category:Uploaded_by_UploadWizard to Category:Uploaded_with_UploadWizard' [production]
17:59 <mark> Upgrading varnish to 2.1.5 [production]
17:14 <demon> synchronized php-1.17/includes/filerepo/LocalFile.php 'r85200' [production]
14:19 <mark> Implemented CARP weights for distant CARP parents on squid configurator (used to be all equal before) [production]
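    In Squid, CARP parents and their relative weights live on cache_peer lines; a hedged sketch of what the generated config could contain, with hostnames and weights purely illustrative:
        cache_peer sq41.wikimedia.org parent 3128 0 carp weight=2 no-query
        cache_peer sq42.wikimedia.org parent 3128 0 carp weight=1 no-query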
11:36 <mark> Created btrfs filesystem on ms6, striped (raid10 style) over 46 devices - very experimental [production]
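    mkfs.btrfs can stripe and mirror data and metadata across many devices at creation time; a sketch of the kind of invocation meant here, with the device list abbreviated and the raid10 profile for both data and metadata assumed:
        mkfs.btrfs -d raid10 -m raid10 /dev/sdb /dev/sdc /dev/sdd ...   # 46 devices in total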
09:50 <mark> Reinstalling ms6 with Ubuntu 10.04 [production]
09:50 <mark> Fixed torrus again [production]
06:02 <mark> !wikipedia The image thumbnail servers appear stable now [production]
04:59 <mark> Increased nginx worker processes from 1 to 4, set file limit to 30k [production]
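    In nginx.conf those two knobs are worker_processes and worker_rlimit_nofile; only the numbers come from this entry:
        worker_processes 4;
        worker_rlimit_nofile 30000;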
04:40 <mark> !wikipedia Image Thumbnail server outage, it's being worked on [production]
04:34 <mark> Power cycling ms4 again [production]
04:06 <mark> Power cycled ms4 again [production]
04:02 <mark> Removed ms4 from pmtpa.upload config, sending all thumbs to ms5 [production]
03:47 <mark> Restarted rsyncs ms4->ms5 [production]
03:25 <Ryan_Lane> powercycling ms4 again [production]
02:59 <Ryan_Lane> rebooting ms4 [production]
02:46 <Ryan_Lane> seems ms4 is totally dead, powercycling it [production]
01:09 <Ryan_Lane> installing python-pyinotify on spence for an updated ircecho [production]
2011-04-01
21:35 <Ryan_Lane> purging some binlogs on db9 to free up space [production]
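    On MySQL this is normally done from the SQL prompt so the binlog index stays consistent; a sketch, with the cutoff date as a placeholder rather than a value from the log:
        mysql -e "SHOW MASTER LOGS;"
        mysql -e "PURGE BINARY LOGS BEFORE '2011-03-25 00:00:00';"   # cutoff is illustrative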
21:35 <RobH> bugzilla now version 4 [production]
21:31 <RobH> taking down bugzilla for a quick upgrade [production]
18:48 <Ryan_Lane> added ctwoo, brion, py, and reedy to the engineering alias [production]
18:36 <mark> Deployed ms5.pmtpa.wmnet as a special 'apache' for pmtpa squid uploads... now serving a small portion of commons thumbs [production]
18:11 <RobH> bugzilla back online, CRproxy was affected, and repaired [production]
17:30 <RobH> bugzilla.wikimedia.org going offline for database backup and upgrade [production]
17:13 <RobH> beginning upgrade process for bugzilla, its availability will be in question during this time [production]
16:59 <mark> Turned off Etag in the webserver7 configuration (/opt/webserver7/https-ms4/config/obj.conf) on ms4 [production]
16:50 <notpeter> rm-ing old binlogs on db9 after confirming that there is no slave lag on db10 [production]
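    The lag check referred to here is typically a SHOW SLAVE STATUS on the replica; a sketch of that confirmation step, with only the host name taken from the entry:
        mysql -h db10 -e "SHOW SLAVE STATUS\G" | grep Seconds_Behind_Master   # expect 0 before purging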
15:53 <mark> Puppetised nginx and htcp purger setup on ms5 [production]
11:36 <apergos> restarted lighty on dataset2 (but why did it die?) [production]
00:06 <tstarling> synchronized php-1.17/includes/specials/SpecialImport.php 'r85099' [production]