2010-11-02
|
19:44 |
<RobH> |
following procedure on wikitech to fix torrus |
[production] |
16:46 |
<RobH> |
sq42 & sq44 behaving normally now, cleaning cache on sq48 and killing squid for restart as it is flapping and at high load, due to earlier nfs issue |
[production] |
16:38 |
<RobH> |
restarting and cleaning backend squid on sq44 and sq42 which were complaining in lvs |
[production] |
16:35 |
<RobH> |
sq43 was flapping since the nfs mount on ms4 was borked. restarted it |
[production] |
16:07 |
<apergos> |
NFSD_SERVERS=2048 in /etc/default on ms4 |
[production] |
16:06 |
<apergos> |
note that the variable rpcmod:cotsmaxdupreqs has been changed to 2048 in /etc/system, and |
[production] |
15:54 |
<apergos> |
hard reset on ms4, reboot was not getting the job done |
[production] |
15:47 |
<apergos> |
rebooting ms4, nfsd hung and couldn't be restarted or killed. |
[production] |
14:04 |
<RobH> |
restarted pdns on linne due to crash from authdns update |
[production] |
14:02 |
<RobH> |
updated dns with new mgmt entries for payments, owasrvs, and owadbs |
[production] |
03:45 |
<domas> |
added srv193 back to apaches pool on lvs |
[production] |
2010-10-29
|
23:21 |
<domas> |
lol repaired myisam tables on db9, call if data has been lost, hehe |
[production] |
22:58 |
<domas> |
resynced srv154, was running with months old configuration/code. |
[production] |
22:58 |
<domas> |
was db22 disabled silently by someone? or not reenabled? :) reenabled now... |
[production] |
22:55 |
<midom> |
synchronized php-1.5/wmf-config/db.php |
[production] |
18:33 |
<apergos> |
restarted torrus on streber, after reports that it was not responding |
[production] |
17:46 |
<apergos> |
domas ran "reset-mysql-slave db18" (from fenari) which clears out *all* old relay logs, and restarts the slaves. |
[production] |
17:34 |
<apergos> |
removed some old relay logs from /a/sqldata on db18 to get space back, it was at 95% |
[production] |
15:22 |
<RoanKattouw> |
Followers on Twitter: view missing entries between Sep 2 and today at http://identi.ca/wikimediatech |
[production] |
15:22 |
<RoanKattouw> |
Re-established identi.ca->Twitter bridge for wikimediatech, broken since September 2 |
[production] |
15:21 |
<RobH> |
repaired the sessions table, rt is now happy |
[production] |
15:09 |
<RobH> |
rt is being odd, looking into it |
[production] |
14:43 |
<phuzion> |
test |
[production] |
2010-10-28
|
21:34 |
<RobH> |
powercycled sq69, ran puppet, it's back online |
[production] |
21:24 |
<RobH> |
sq69 is borked, powercycling |
[production] |
17:51 |
<Ryan_Lane> |
running checksetup.pl on kaulen for bugzilla |
[production] |
17:50 |
<Ryan_Lane> |
running mysqlcheck --auto-repair on bugzilla database on db9 for the bug_fulltext table |
[production] |
15:23 |
<atglenn> |
reenabled logging for fundraising on locke |
[production] |
14:50 |
<atglenn> |
I see a lot of ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: NO) after reboot of db9... not awake enough to try to look at it; services seem to be running ok |
[production] |
14:46 |
<atglenn> |
powercycled db9, it was unreachable by ssh, ganglia showed load and wait_cpu through the roof |
[production] |
14:46 |
<RoanKattouw> |
db9 back after having been powercycled by Ariel |
[production] |
14:18 |
<RoanKattouw> |
db9 down. Responds to ping but doesn't respond to anything else |
[production] |