2009-10-25
|
15:23 |
<domas> |
converting usability initiative tables to InnoDB... |
[production] |
13:23 |
<domas> |
set up snapshot rotation on db10 |
[production] |
12:36 |
<hcatlin> |
mobile1: created init.d/cluster to correct USR1 sig problem, fully updated sys ops on wikitech |
[production] |
12:03 |
<domas> |
Mark, I'm sure you'll like that! ;-p~ |
[production] |
12:02 |
<domas> |
started sq43 without /dev/sdd COSS store (manual conf hack) |
[production] |
11:54 |
<domas> |
removed ns3 from nagios, added ns1 |
[production] |
11:45 |
<domas> |
bounced ns1 too; it was affected by the selective-answer leak (same number as ns0, btw, 507!) ages ago, just not noticed by nagios. This seems to resolve some slowness I noticed a few times. |
[production] |
11:41 |
<domas> |
bounced pdns on ns0, was affected by selective-answer leak |
[production] |
2009-10-23
|
23:37 |
<tstarling> |
synchronized wmf-deployment/cache/trusted-xff.cdb |
[production] |
23:31 |
<tstarling> |
synchronized wmf-deployment/cache/trusted-xff.cdb |
[production] |
23:24 |
<Tim> |
updating TrustedXFF (bolt browser) |
[production] |
22:36 |
<domas> |
db28 has multiple fan failures (LOM is finally able to do something :) - still needs datacenter ops |
[production] |
22:20 |
<domas> |
db28 is toast, needs cold restart by datacenter ops, LOM not able to do anything |
[production] |
22:20 |
<midom> |
synchronized php-1.5/wmf-config/db.php 'db28 dead' |
[production] |
11:17 |
<domas> |
Fixed skip-list of cached query pages, was broken for the past two months :) |
[production] |
10:54 |
<midom> |
synchronized php-1.5/thumb.php 'removing livehack' |
[production] |
10:52 |
<domas> |
rotating logs becomes difficult when they become too big, so they continue to grow indefinitely! db20 / nearly full, loooots of /var/log/remote ;-) |
[production] |
10:39 |
<domas> |
who watches the watchers? :) rrdtool process on spence was using 8G of memory. :-)))) |
[production] |
10:24 |
<domas> |
semaphore leaks made some apaches fail; a failed apache in the rendering farm was not depooled, so the 404 handler served plenty of "can't connect to host" broken thumbs. |
[production] |
10:13 |
<domas> |
apparently there're intermittent connection failures from ms4 to scalers |
[production] |
09:56 |
<midom> |
synchronized php-1.5/thumb.php 'error header livehack' |
[production] |
04:04 |
<domas> |
noticed intermittent network failure inside pmtpa |
[production] |
04:01 |
<domas> |
switched jobs table on db22 with an empty one; the old one had just a few noop entries and five million invalidated rows... hit an interesting (but probably easy to fix) performance problem in the mtr_memo_release/mtr_commit code inside MySQL :) |
[production] |
03:17 |
<Fred> |
restarted powerdns on ns2 to kill some zombies with a double tap :p |
[production] |