2010-01-16
14:55 <mark> Started backend squid on sq50 [production]
14:26 <mark> sq19 has bad drive /dev/sdc [production]
14:22 <mark> dist-upgrade & reboot on sq19 [production]
14:14 <mark> Shutdown sq20, bad disk /dev/sda [production]
14:13 <mark> Reenabled sq24 frontend in PyBal [production]
14:03 <Tim-away> disk space critical on srv167, cleaned up /tmp [production]
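A /tmp cleanup like the ones above can be sketched as an age-based sweep; the one-day threshold and file pattern are assumptions, since the log does not record what was actually removed. Shown against a scratch directory rather than a live /tmp:

```shell
# Scratch directory standing in for /tmp (hypothetical threshold: >1 day old)
d=$(mktemp -d)
touch -d '2 days ago' "$d/stale.tmp"   # GNU touch: backdate the mtime
touch "$d/fresh.tmp"
# delete regular files not modified within the last day, keep the rest
find "$d" -type f -mtime +1 -delete
ls "$d"
```

Only `fresh.tmp` survives the sweep.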
06:07 <apergos> "recovery", my *ss... cleaned up /tmp on srv181 to get some space back [production]
2010-01-15
23:53 <tomaszf> starting test dewiki snapshot on snapshot2 [production]
17:17 <Fred> modified gmetad config on zwinger and spence to reflect new apache 4cpu aggregator [production]
17:16 <Fred> added srv149 as a gmond aggregator in puppet. [production]
2010-01-14
17:28 <domas> load-tested and fixed db19 to handle full bits workload (~22k/s), now again serving just tampa part. [production]
01:32 <apergos> moved bits to .2 again, all nameservers seem to be up and reflect the change (bits unresponsive again) [production]
2010-01-13
18:30 <Andrew> [andrew@zwinger ~]$ sync-file wmf-config/CommonSettings.php 'Disable GIF scaling again, due to issues reported on village pump, bug 22041' [production]
17:05 <Rob> updated InitialiseSettings for bug 21174 and 21077 [production]
16:24 <Rob> updated InitialiseSettings.php for Bug 20508: Please enable Extension:NewUserMessage on en.Wikinews [production]
11:50 <domas> started overflow-watchdog ( http://p.defau.lt/?4_Y_Lrl9tVKS6fEYwkswKA ) on db19, sent bits load at it again [production]
01:23 <atglenn> using .2 for bits.pmtpa, did authdns-update, let's see what happens (bits was failing to respond) [production]
2010-01-12
19:52 <rainman-sr> could someone please put search3,9 into search_pool 1 (with search1,4) on lvs3 [production]
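PyBal server pools are plain files with one Python-style dict per host, so the requested change amounts to appending two lines. The file below is a hypothetical sketch of search_pool 1 after the change; the path and field names are assumptions, not the real lvs3 config:

```shell
# Hypothetical PyBal-style pool file for search_pool 1 (scratch copy)
pool=$(mktemp)
cat > "$pool" <<'EOF'
{ 'host': 'search1', 'weight': 10, 'enabled': True }
{ 'host': 'search4', 'weight': 10, 'enabled': True }
{ 'host': 'search3', 'weight': 10, 'enabled': True }
{ 'host': 'search9', 'weight': 10, 'enabled': True }
EOF
# count hosts PyBal would consider enabled
grep -c "'enabled': True" "$pool"
```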
19:37 <Fred> Puppet: set spence as a ganglia aggregator for Misc tree. [production]
19:23 <Rob> usability/prototype linode had crashed, had to reboot it [production]
17:39 <Fred> set wgDefaultSkin back to monobook on wikitech since vector is not operational. [production]
16:08 <mark> Arcor clients appear to have problems reaching our sites, traffic to Arcor over AMS-IX has been low since midnight UTC [production]
15:52 <Rob> srv222 & srv223 power reestablished, booting. [production]
15:47 <Rob> shutting down srv223 & srv222 to change out power cords. [production]
15:46 <Rob> db10 moved and back online. [production]
15:36 <Rob> db10 moved to sdtpa a2, powering up. [production]
14:50 <Rob> taking down db10 to relocate from pmtpa-b1 to sdtpa-a2 [production]
14:50 <Rob> fixed issues with transcode2 and transcode3, completing base installation. [production]
05:48 <tstarling> synchronized php-1.5/includes/HTMLCacheUpdate.php [production]
05:48 <tstarling> synchronized php-1.5/includes/BacklinkCache.php [production]
05:47 <Tim> deploying r60962 [production]
01:17 <Tim> on streber: removed a corrupt torrus DB file so it could be rebuilt, torrus should be working now [production]
00:57 <Tim> killed frozen torrus cron jobs and ran "torrus compile --tree=Network --force" [production]
00:51 <Tim> maybe torrus collector is still broken, trying /etc/init.d/torrus-common force-reload [production]
00:46 <Tim> with mpm-prefork managed to debug it fairly easily. Moved away permanently locked DB file render_cache.db, torrus.wikimedia.org is now fixed [production]
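Tim's torrus recovery in the entries above can be sketched as a dry run; apart from the compile and force-reload commands quoted in the log, the DB path and the pkill step are assumptions:

```shell
# Dry run: print each recovery command instead of executing it
run() { echo "+ $*"; }
run pkill -f torrus                                                   # clear frozen cron jobs
run mv /var/lib/torrus/render_cache.db /var/lib/torrus/render_cache.db.locked  # hypothetical path for the locked DB
run torrus compile --tree=Network --force                             # rebuild the Network tree
run /etc/init.d/torrus-common force-reload                            # restart the collector
```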
00:39 <Fred> restarting pdns on ns1 [production]
00:38 <Tim> switching streber to apache2-mpm-prefork, can't work out why it's not working [production]
00:22 <Tim> trying "apache2 -X" on streber [production]
00:00 <Tim> restarting apache on streber [production]
2010-01-11
23:38 <domas> logging the fact that we had cache layer meltdown at some point in time during the day [production]
22:30 <domas> leaving bits.pmtpa on db19's varnish, in case of troubles - uncomment bits.pmtpa .2 record in /etc/powerdns/templates/wikimedia.org and run authdns-update [production]
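The rollback domas describes is a one-line edit plus a DNS push: uncomment the bits.pmtpa ".2" record and run authdns-update. A sketch against a scratch copy of the zone template — the record text below is invented for illustration, and the address placeholder is deliberately not filled in:

```shell
# Scratch file standing in for /etc/powerdns/templates/wikimedia.org
zone=$(mktemp)
cat > "$zone" <<'EOF'
; bits.pmtpa  IN A  <the .2 address>
EOF
# uncomment the record (strip the leading "; ")
sed -i 's/^; *//' "$zone"
cat "$zone"
# on the real host: edit /etc/powerdns/templates/wikimedia.org, then run authdns-update
```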
19:43 <fvassard> synchronized php-1.5/wmf-config/mc.php 'Swapped memcached from srv125 to srv232' [production]
19:06 <Rob> new apaches srv255, srv257 deployed. Updated node groups and synced nagios [production]
19:03 <Rob> new apache server srv254 deployed [production]
18:24 <atglenn> copy backlog of image data from ms1 to ms7 (running in screen as root on both boxes) [production]
14:43 <mark> Rebooting fuchsia, locked up again [production]
14:24 <mark> Increased load on knsq16-22 by upping lvs weight from 10 to 15 [production]
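Raising the weight of LVS real servers is an `ipvsadm -e` (edit real server) per backend. A dry-run sketch of the change above, printing the commands rather than running them; the virtual service address and port are assumptions not recorded in the log:

```shell
VIP='<bits-vip>'   # hypothetical: the real service address is not in the log
for n in $(seq 16 22); do
  # -e edits an existing real server entry, -w sets its new weight
  echo ipvsadm -e -t "${VIP}:80" -r "knsq${n}:80" -w 15
done
```

Dropping the `echo` would apply the change for knsq16 through knsq22.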
2010-01-10
23:02 <midom> synchronized php-1.5/wmf-config/lucene.php 'rainman asked, rainman guilty, hehehe' [production]
23:01 <midom> synchronized php-1.5/wmf-config/secure.php [production]
17:36 <rainman-sr> search limit raised to 500 again, interwiki search re-enabled for "other" wikis [production]