production SAL

2012-04-30 §
19:59	<notpeter>	restarting nagios to get rid of some old checks	[production]
19:57	<Jeff_Green>	payments cluster gets kernel updates and reboots	[production]
19:55	<logmsgbot_>	reedy synchronizing Wikimedia installation... : Rebuild l10n for 1.20wmf2	[production]
19:49	<logmsgbot_>	reedy synchronized wmf-config/ExtensionMessages-1.20wmf2.php 'Syncing file'	[production]
19:49	<logmsgbot_>	reedy synchronized php-1.20wmf2/LocalSettings.php 'Pushing LocalSettings.php'	[production]
19:48	<paravoid>	upgraded & rebooted ssl3001, ssl3002, ssl3003	[production]
19:46	<logmsgbot_>	reedy synchronizing Wikimedia installation... : Pushing out new symlinks etc, moving test2wiki to 1.20wmf2	[production]
19:30	<logmsgbot_>	reedy synchronized php-1.20wmf2 'Syncing 1.20wmf2 live hack revisions'	[production]
19:28	<logmsgbot_>	reedy synchronized php-1.20wmf2 'Syncing 1.20wmf1 live hack revisions'	[production]
19:26	<logmsgbot_>	reedy synchronized php-1.20wmf2 'Syncing 1.20wmf2 for deployment'	[production]
19:18	<Reedy>	Syncing php-1.20wmf2 files from NFS to apaches. Likely to upset NFS (or the uplink for the switch nfs is on) for a little while...	[production]
19:14	<paravoid>	rebooting ssl1004	[production]
19:06	<paravoid>	rebooting ssl1003	[production]
19:00	<paravoid>	rebooting ssl1002	[production]
18:59	<notpeter>	starting innobackupex from db1034 to db57 for new s2 slave	[production]
18:50	<paravoid>	rebooting ssl1001	[production]
18:42	<Jeff_Green>	grosley gets new kernel + reboot	[production]
18:35	<Jeff_Green>	aluminium gets kernel update, yayyyyyyy!	[production]
18:34	<paravoid>	pooled back ssl1; depooling ssl3 and rebooting	[production]
18:29	<binasher>	rebooting mw45 for kernel upgrade	[production]
18:27	<Jeff_Green>	power cycling aluminium which faceplanted	[production]
18:22	<binasher>	rebooting mw45	[production]
18:21	<notpeter>	rebuilding db57 again, this time with more correct raid level!	[production]
18:19	<logmsgbot_>	asher synchronized wmf-config/db.php 'adding db59,60 to s1 with low weights'	[production]
18:16	<paravoid>	depooled & rebooting ssl1	[production]
18:09	<logmsgbot_>	aaron rebuilt wikiversions.cdb and synchronized wikiversions files: Sanity run after script changes.	[production]
18:00	<logmsgbot_>	aaron synchronized multiversion	[production]
17:58	<logmsgbot_>	reedy synchronized php-1.20wmf1/includes/MagicWord.php 'https://gerrit.wikimedia.org/r/6135'	[production]
17:44	<logmsgbot_>	aaron synchronized wikiversions.cdb	[production]
17:43	<AaronSchulz>	updating multiversion code	[production]
08:34	<mutante>	reinstalling srv266	[production]
08:08	<mutante>	upgraded mw1,mw2,mw35	[production]
07:59	<mutante>	reinstalling srv206	[production]
07:50	<mutante>	upgrading mw36	[production]
07:37	<apergos>	powercycling srv266, had this message on mgmt console: Severity: Non Recoverable, SEL:CPU Machine Chk: Processor sensor, transition to non-recoverable was asserted	[production]
07:22	<mutante>	installing upgrades on srv212	[production]
07:19	<apergos>	reinstalled srv284, seems to be up now	[production]
07:17	<mutante>	powercycled mw8	[production]
02:14	<logmsgbot_>	LocalisationUpdate completed (1.20wmf1) at Mon Apr 30 02:13:59 UTC 2012	[production]
2012-04-29 §
20:13	<apergos>	srv206 won't run puppet, see syslog, clearing out the yaml file didn't help, since it's not urgent I'm leaving it for tomorrow	[production]
19:51	<Ryan_Lane>	depooling ssl3004	[production]
19:51	<Ryan_Lane>	removed the ipv6 addresses from maerlant and added them to ssl3001, then restarted nginx	[production]
19:50	<Ryan_Lane>	repooling ssl3001	[production]
19:46	<apergos>	powercycled mw60, same reason as the rest	[production]
19:13	<apergos>	power cycled mw48 and mw52 (hung just like the others)	[production]
18:05	<apergos>	ssl3002 and 3003 were rebooted and are the entire ssl esams pool right now	[production]
16:34	<apergos>	powercycling the ssl300x.esams hosts. 212 days of uptime... (and 3001 had gone out to lunch)	[production]
12:34	<mutante>	and finally mw1, so just leaving mw1102 and mw60 for having other issues for a while (->Nagios)	[production]
12:22	<mutante>	check_all_memcached recovered, but still same treatment for mw10 and 11 (8 and 15h ago)	[production]
12:07	<mutante>	powercycling mw30	[production]