production SAL

2012-01-04
07:40	<Tim>	fixed puppet by re-running the post-merge hook with key forwarding enabled, and then started puppet on ms6	[production]
07:32	<Tim>	on ms6.esams: fixed proxy IP address and stopped puppet while I figure out how to fix it	[production]
03:25	<Tim>	experimentally raised max_concurrent_checks to 128	[production]
03:17	<Tim>	on spence in nagios.cfg, reduced service_reaper_frequency from 10 to 1, to avoid having a massive process count spike every 10 seconds as checks are started. Locally only as a test.	[production]
02:27	<Ryan_Lane>	I should clarify that I removed 10.2.1.13 from /etc/network/interfaces, it's still properly bound to lo	[production]
02:24	<Tim>	on spence: setting up logrotate for nagios.log and removing nagios-bloated-log.log	[production]
02:22	<Ryan_Lane>	removing manually added 10.2.1.13 address from lvs4	[production]
02:01	<LocalisationUpdate>	completed (1.18) at Wed Jan 4 02:04:57 UTC 2012	[production]
01:43	<Nemo_bis>	Last week slowness: job queue backlog now cleared on !Wikimedia Commons and (almost) English !Wikipedia http://ur1.ca/77q9b	[production]
01:02	<reedy>	synchronized php-1.18/includes/ '[[rev:107978|r107978]]'	[production]
00:45	<reedy>	synchronized php-1.18/extensions '[[rev:107977|r107977]], [[rev:107976|r107976]]'	[production]
00:39	<Tim>	running purgeParserCache.php on hume, deleting objects older than 3 months	[production]
00:38	<reedy>	synchronized php-1.18/includes/specials/ '[[rev:107975|r107975]]'	[production]
00:29	<tstarling>	synchronizing Wikimedia installation... :	[production]
00:27	<reedy>	synchronized php-1.18/extensions/Nuke/ '[[rev:107974|r107974]]'	[production]
00:25	<reedy>	synchronized php-1.18/extensions/ '[[rev:107970|r107970]]'	[production]
2012-01-03
23:00	<Tim>	on spence: restarting gmetad	[production]
22:58	<reedy>	synchronizing Wikimedia installation... : Pushing [[rev:107953|r107953]], [[rev:107955|r107955]], [[rev:107956|r107956]], [[rev:107957|r107957]]	[production]
22:47	<LeslieCarr>	stopping and then starting apache2 on spence to try and lower load	[production]
22:29	<RobH>	added in the lo address to lvs4, now it's working and generating thumbnails	[production]
22:09	<reedy>	synchronizing Wikimedia installation... : Push [[rev:107938|r107938]] [[rev:107948|r107948]]	[production]
21:45	<RobH>	ganglia graphs will have missing data for past 30 to 40 minutes	[production]
21:45	<RobH>	spence back online, ganglia and nagios confirmed operational	[production]
21:38	<RobH>	resetting spence and dropping to serial to try to fix it	[production]
21:25	<RobH>	nagios and ganglia down due to spence reboot, system still coming back online	[production]
21:21	<RobH>	spence is unresponsive to ssh and serial console, rebooting	[production]
21:14	<LeslieCarr>	resetting DRAC 5 on spence for management connectivity	[production]
21:05	<binasher>	that fixed it. but how did that happen?	[production]
21:05	<binasher>	ran ip addr add 10.2.1.22/32 label "lo:LVS" dev lo on lvs4	[production]
19:36	<reedy>	synchronized php-1.18/skins/common/images/ '[[rev:107930|r107930]]'	[production]
17:36	<mutante>	killing more runJobs.php / nextJobDB.php processes on a bunch of servers (/home/catrope/badjobrunners)	[production]
17:26	<RoanKattouw>	Stopping job runners on the following DECOMMISSIONED servers: srv151 srv152 srv153 srv158 srv160 srv164 srv165 srv166 srv167 srv168 srv170 srv176 srv177 srv178 srv181 srv184 srv185	[production]
15:55	<RobH>	torrus back, took forever to recompile	[production]
15:53	<reedy>	synchronized wmf-config/InitialiseSettings.php 'Bug 33485 - Enable WikiLove in si.wikipedia'	[production]
15:52	<Reedy>	Created wikilove tables on siwiki	[production]
15:46	<RobH>	torrus deadlocked, kicking	[production]
14:00	<RoanKattouw>	Restarting job runners on srv242 and mw25, those are the last ones that are stuck	[production]
13:57	<RoanKattouw>	Restarting all job runners that are stuck	[production]
13:48	<RoanKattouw>	Restarting job runner on srv236, seems to be stuck	[production]
02:02	<LocalisationUpdate>	completed (1.18) at Tue Jan 3 02:05:21 UTC 2012	[production]
2012-01-02
23:36	<Reedy>	Seems to potentially be an issue with job runners, enwiki backed up to over 90k over the last week or so. Needs investigating	[production]
23:18	<tstarling>	synchronized php-1.18/includes/parser/Parser.php '[[rev:107856|r107856]]'	[production]
22:58	<tstarling>	synchronizing Wikimedia installation... :	[production]
18:08	<nikerabbit>	synchronized wmf-config/InitialiseSettings.php 'Bug 33368: WebFonts on bpywiki'	[production]
18:05	<nikerabbit>	synchronized php-1.18/languages/messages/ 'i18ndeploy [[rev:107843|r107843]]'	[production]
18:04	<nikerabbit>	synchronized php-1.18/extensions/WebFonts/WebFonts.i18n.php 'i18ndeploy [[rev:107843|r107843]]'	[production]
16:58	<mutante>	installed SiteMap extension on Bugzilla - soon bugs should be googleable	[production]
16:33	<mutante>	upgraded Bugzilla from 4.0.2 to 4.0.3 (http://www.bugzilla.org/releases/4.0.3/release-notes.html#v40_point) (RT #2194)	[production]
14:47	<mutante>	cleaned out gammu spool to stop sms bomb - sorry. daemon runs again now though..	[production]
14:36	<mutante>	fixed gammu-smsd on spence per wikitech "Nagios#Fixing_the_USB_dongle" (sending out queued SMS now )	[production]