2009-02-17 §
21:04 <Rob> pulling srv208 and srv209 for quick reboots, their drac ips are wrong. [production]
21:04 <Rob> racked srv217-223 (also racked srv224/225 but no power yet) [production]
18:30 <brion> starting a batch run of update-special-pages-small just to ensure it actually works [production]
18:25 <brion> fixed hardcoded /usr/local path for PHP and use of obsolete /etc/cluster in update-special-pages and update-special-pages-small; removing misleading log files ([[bugzilla:17534]]) [production]
01:49 <Tim> deleting all enotif jobs from the job queue, there is still a huge backlog [production]
2009-02-16 §
16:46 <mark> Did emergency rollback of squid 2.7.6 to squid 2.6.21 because of incompatible HTTP Host: header [production]
16:21 <Rob> stopped upgrades, sq36 completed before stop [production]
16:17 <Rob> performing upgrades to sq35-sq38 (not depooling in pybal, letting pybal handle that automatically) [production]
16:16 <Rob> performed dist-upgrade on sq31-34 [production]
15:35 <Rob> depooled sq31-sq34 for upgrade [production]
08:12 <Tim> patched in r47309, Article.php tweak [production]
05:00 <Tim> made runJobs.php log to UDP instead of via stdout and NFS [production]
04:53 <Tim> fixed incorrect host keys in /etc/ssh/ssh_known_hosts for srv38, srv39 and srv77 [production]
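The stale-host-key repair above can be sketched as follows, demonstrated against a scratch known_hosts file with freshly generated dummy keys (in production the file is /etc/ssh/ssh_known_hosts and replacement keys would come from ssh-keyscan against the reinstalled hosts — that part is an assumption):

```shell
# Build a scratch known_hosts with two entries, then remove the stale one.
tmp=$(mktemp -d)
ssh-keygen -q -t ed25519 -N '' -f "$tmp/k1"    # dummy key standing in for srv38's
ssh-keygen -q -t ed25519 -N '' -f "$tmp/k2"    # dummy key standing in for srv77's
kh="$tmp/ssh_known_hosts"
printf 'srv38 %s\n' "$(cut -d' ' -f1-2 "$tmp/k1.pub")" >  "$kh"
printf 'srv77 %s\n' "$(cut -d' ' -f1-2 "$tmp/k2.pub")" >> "$kh"
ssh-keygen -R srv38 -f "$kh"   # drop the stale srv38 entry (backup kept as $kh.old)
cat "$kh"                      # only the srv77 line remains
```

On the real file the removed entry would then be replaced, e.g. by appending the output of `ssh-keyscan srv38`.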
04:13 <Tim> removing all refreshLinks2 jobs from the job queue, duplicate removal is broken so to clear the backlog it's better to just run maintenance/refreshLinks.php [production]
2009-02-15 §
21:59 <mark> Experimentally blocked non-GET/HEAD HTTP methods on sq3 frontend squid [production]
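For reference, the method block above maps onto a short squid 2.x ACL fragment along these lines (the ACL name is made up, and where the deny sits among the existing http_access rules is an assumption, since rule order matters in squid):

```
acl safe_methods method GET HEAD
http_access deny !safe_methods
```

This denies any request whose method is not GET or HEAD before the usual allow rules are reached; a sketch, not the exact rule deployed.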
16:15 <mark> Upgraded PyBal on lvs2 - others will follow [production]
13:11 <domas> db23 has multiple MCEs for same dimm logged: http://p.defau.lt/?IarKD4gbFhe5RmaV0RB_Xg [production]
12:38 <domas> in wikistats, moved files older than 10 days into ./archive/yyyy/mm/ - maybe will make flack crash less :)) [production]
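The archiving step above amounts to an mtime-keyed move; a minimal sketch, demonstrated on a scratch directory (GNU find/date assumed; the actual wikistats path and file layout are assumptions):

```shell
# Create a scratch dir with one stale and one fresh file.
dir=$(mktemp -d)
cd "$dir"
touch -d '20 days ago' stats-old.txt   # simulated stale stats file
touch stats-new.txt                    # recent file, should stay put
# Move anything older than 10 days into archive/yyyy/mm/ keyed on mtime.
find . -maxdepth 1 -type f -mtime +10 | while read -r f; do
  d=$(date -r "$f" +%Y/%m)             # yyyy/mm taken from the file's mtime
  mkdir -p "archive/$d"
  mv "$f" "archive/$d/"
done
```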
11:56 <mark> Doing Squid memleak searching on sq1 with valgrind, pooled with weight 1 in LVS [production]
03:09 <Andrew> CentralNotice still not working properly, and when we tried to set it to testwiki-only, it never came up. Left it on testwiki only for the time being, until somebody who knows CentralNotice can take a look at it. [production]
02:21 <Tim> fixed permissions on the rest of the logs in /home/wikipedia/logs/norotate (fixes centralnotice) [production]
2009-02-14 §
19:19 <Az1568_> re-enabled CentralNotice on testwiki to try and find the problem (we've had this before, but fixed it somehow...possibly with a regen? See November 16th log.) [production]
18:34 <domas> filed a bug at https://bugs.launchpad.net/ubuntu/+source/apparmor/+bug/329489 - could use some Canonical escalation too [production]
18:26 <domas> same affected srv47 - this is related to switching locking to fcntl() - this drives apparmor crazy [production]
17:47 <domas> srv178 kernel memleaked a few gigs. blame: apparmor [production]
14:34 <domas> srv215 very much dead, doesn't show vitality signs even after serveractionhardreset [production]
14:28 <domas> correction, srv208.mgmt is pointing to uninstalled box [production]
14:27 <domas> DRAC serial on all new boxes is ttyS1 which is not in securetty [production]
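The observation above implies a one-line fix: list ttyS1 in securetty so root can log in over the DRAC serial console. A sketch against a scratch copy (in production the target is /etc/securetty, and rolling it out fleet-wide would need whatever config distribution is in place — both assumptions):

```shell
f=$(mktemp)
printf 'console\nttyS0\n' > "$f"                 # typical default contents
grep -qx 'ttyS1' "$f" || echo 'ttyS1' >> "$f"    # idempotent append
cat "$f"
```

The grep guard makes the append safe to re-run on boxes already fixed.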
14:24 <domas> srv209.mgmt is actually srv208's SP, and srv208.mgmt is pointing to dead box [production]
14:15 <domas> srv209,215 down? [production]
13:43 <domas> installing php5-apc-3.0.19-1wm2 (no more futexes) on all ubuntu appservers. [production]
11:01 <Andrew> test [production]
2009-02-13 §
22:10 <mark> esams squid upgrade complete [production]
21:05 <RobH> deployed srv207-srv216 in apaches cluster [production]
20:34 <RobH> added new servers to nagios and restarted it [production]
20:15 <RobH> set up all node groups, ganglia, apache, and so on for srv199-srv206 and added into rotation [production]
19:38 <mark> Upgrading esams squids to 2.7.6 [production]
18:36 <mark> Upgraded squid on sq1 to 2.7.6 and rebooted the box [production]
18:03 <mark> Memory leak issues on the upload frontend squids, which started in November [production]
18:01 <RobH> sq13 back online, seems there is a memory leak, go mark for finding =] [production]
17:54 <RobH> lomaria install done for domas [production]
17:49 <RobH> rebooting sq13 due to it failing out in ganglia, OOM error evident. [production]
17:48 <RobH> reinstalling lomaria per domas request [production]
17:37 <RobH> sq8 was out of memory and locked up, rebooted, cleaned cache, and bringing back online [production]
17:34 <RobH> srv38 and srv39 back in rotation [production]
17:23 <RobH> srv38 and srv39 reinstalled, installing packages now [production]
16:57 <RobH> reinstalling srv38/srv39 [production]
16:57 <RobH> srv80 reinstalled as ubuntu apache and back in rotation [production]
16:31 <RobH> srv79 back in rotation [production]
16:21 <RobH> srv79 reinstalled, installing packages and ganglia [production]