2010-11-05 §
17:42 <RobH> srv206 fixed, pushed back into lvs [production]
17:25 <RobH> working on srv206, disregard any errors it throws [production]
16:40 <RobH> issue with the new api servers is fixed and they are now back in service [production]
16:04 <RobH> some new api servers are not working right, depooled until they are fixed [production]
15:58 <mark> Removed ibis IPs from Squid ACLs; invalid requests issue has been resolved [production]
15:57 <mark> Fixed NFS mounts on apaches that had them missing since the wikimedia-task-appserver upgrade [production]
15:26 <RobH> working on sq57, disregard flapping [production]
15:24 <RobH> new api apaches srv290-srv301 are online, except srv298 which needs drac correction before installation [production]
15:22 <RobH> dropping old entry for tenwiki in apache config and resyncing/restarting apaches to eliminate error message [production]
15:18 <RobH> pushing srv291-srv301 into lvs [production]
15:11 <RobH> doing puppet runs on srv292-srv301 before pushing them into service [production]
14:57 <mark> Hacked out the 'remotemount' lines in /var/lib/dpkg/info/wikimedia-task-appserver.postrm files to prevent apaches from being without NFS mounts during/between puppet runs and package upgrades [production]
14:23 <mark> Deploying new package wikimedia-task-appserver 1.46 across the cluster, which removes configuration files (now handled by Puppet) [production]
11:59 <catrope> synchronized php-1.5/includes/api/ApiLogin.php 'Revert r76078' [production]
11:49 <catrope> synchronized php-1.5/includes/api/ApiLogin.php 'r76078' [production]
05:57 <apergos> failure booting into be3 on ms4, had to back out. so, no progress, we are back to where we were before the reboots. [production]
05:40 <apergos> cleared up luactivate error, shutdown ms4 again, trying to boot into alt boot environment [production]
05:16 <apergos> used shutdown on ms4, be3 showed as "active on reboot" but it booted into be0 (old boot environment) nonetheless. *grumble* [production]
05:06 <apergos> rebooted ms4 into alt boot environment with current patches applied [production]
00:18 <RobH> new api servers are not copying down the data correctly and not reflecting config changes in puppet, so they fail, srv290+ not online yet [production]
2010-11-04 §
23:06 <RobH> running puppet across the new api servers srv290-srv301 then will push them in service later when i figure out why they are not doing what I want ;P [production]
20:13 <RobH> sq51 hates me [production]
20:11 <RobH> new api servers srv290-301 are installed and showing in ganglia, having issues getting the first couple to pool into lvs before i push the rest into service [production]
20:09 <RobH> fixed sq51 [production]
19:29 <RoanKattouw> Strike that, have backed out changes [production]
19:06 <RoanKattouw> Until Mark's made sure they're good, that is [production]
19:06 <RoanKattouw> Changing some files in wmf-deployment/includes/media . DO NOT RUN SCAP or otherwise deploy these changes! [production]
18:36 <RobH> added dns entries for payments [production]
17:59 <RobH> doing puppet runs and final setup for srv290-srv301 [production]
16:56 <rfaulk> Added numpy Python package to grosley.wikimedia.org with apt-get ... For use in the 2010/11 fundraiser to facilitate stats gathering by providing scientific computing functionality in Python [production]
16:43 <rfaulk> Added MySQLdb Python package on grosley.wikimedia.org with apt-get ... This package will be used to access fundraising databases to facilitate the gathering and synthesis of relevant statistics for the 2010/11 Wikimedia fundraiser [production]
16:23 <mark> Set storage1 (varnish) as upload backend on sq41-50, instead of ms4 [production]
16:14 <RobH> sq59 is being bitchy and won't clean the cache, possible hdd issue? will investigate later [production]
15:42 <RobH> sq35 back in rotation [production]
15:34 <mark> Added storage1 (varnish->ms4) as an HTTP backend to sq45's squid config [production]
15:34 <RobH> commenting out sq35, trying to make it work again in pybal [production]
15:16 <RobH> poking at sq59 [production]
15:06 <RobH> sq35 back online, pushed into lvs, partially up - may need to wait up to 5 for idleconnect timer [production]
14:46 <RobH> pushed dns updates for new payments boxes and correcting owadb1/2 to db31/32 [production]
14:28 <RobH> sq35 set to false in pybal until i determine what's wrong with it [production]
14:09 <mark> Reduced CARP weight of sq41-50 from 10 to 5 [production]
13:37 <RobH> sq35 may flag, disregard [production]
13:30 <RoanKattouw> Removed uploadwizard test wiki on prototype, gonna set it up on the Commons prototype instead [production]
04:17 <atglenn> ganglia 3.1 now running on ms4 and ms5 [production]
01:44 <RobH> srv217 back in cluster [production]
00:36 <RobH> torrus back online [production]
00:29 <RobH> fixing torrus deadlock, no touchy [production]
00:18 <tomaszf> upped open fd's on loudon to 4096 [production]
00:17 <RobH> kicking srv217 for reinstall [production]
2010-11-03 §
21:22 <RobH> updated puppet to properly remove memcached from memcached::false entries and removed the host memcached check for servers no longer running memcached, hup'd nagios to take the change [production]