2009-01-27
|
13:44 |
<mark> |
CARP weight redistribution caused a large load spike in upload backend requests, causing ms1 overload, probably causing issues on apaches via NFS, etc. |
[production] |
13:29 |
<mark> |
Lowered CARP weight from 10 to 5 for sq1-10.wikimedia.org, from 15 to 10 for sq11-15 |
[production] |
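Note on the two CARP entries above: changing per-server weights remaps some URLs to a different cache, and each remapped URL is a cold miss that falls through to the backend. The sketch below illustrates the effect with a simplified weighted rendezvous hash standing in for CARP. It is not Squid's actual CARP hash function, and the URL paths and helper names (`score`, `pick`, `make_weights`) are hypothetical. Only the server names and weights come from the log.

```python
import hashlib
import math


def score(url: str, server: str, weight: float) -> float:
    """Weighted rendezvous score (stand-in for CARP, not Squid's real hash)."""
    digest = hashlib.sha256(f"{server}|{url}".encode()).digest()
    # Keep 52 bits so u is exactly representable and strictly inside (0, 1).
    x = int.from_bytes(digest[:8], "big") >> 12
    u = (x + 0.5) / 2**52
    # log(u) is negative, so the score is positive and grows with weight.
    return -weight / math.log(u)


def pick(url: str, weights: dict) -> str:
    """Return the server with the highest score for this URL."""
    return max(weights, key=lambda s: score(url, s, weights[s]))


def make_weights(w_low: float, w_high: float) -> dict:
    """sq1-10 get w_low, sq11-15 get w_high (names taken from the log entry)."""
    weights = {f"sq{i}": w_low for i in range(1, 11)}
    weights.update({f"sq{i}": w_high for i in range(11, 16)})
    return weights


before = make_weights(10, 15)  # weights before the 13:29 change
after = make_weights(5, 10)    # weights after the 13:29 change

# Hypothetical URL paths, used only to sample the hash space.
urls = [f"/hypothetical/path/{i}.jpg" for i in range(20000)]
moves = [(pick(u, before), pick(u, after)) for u in urls]
moved = [(old, new) for old, new in moves if old != new]
moved_fraction = len(moved) / len(urls)

print(f"{moved_fraction:.1%} of sampled URLs changed cache server")
```

Under this scheme the combined share of sq11-15 rises from 75/175 to 50/100, so roughly 7% of URLs would be expected to move, all of them from sq1-10 to sq11-15. How much the real CARP implementation remapped is not recorded in the log.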
06:03 |
<bot broken> |
retrolog |
[production] |
02:13 |
<brion> |
updating extensions/AbuseFilter/Views/AbuseFilterViewList.php (mysql 4 compat issue) |
[production] |
02:04 |
<brion> |
installed release versions of mwlib on erzurumi and restarted. these should have updated localizations |
[production] |
01:48 |
<brion> |
turning AbuseFilter on on test.... having some mysql 4.0 compat issues. poking |
[production] |
01:47 |
<brion> |
srv31 seems very sad; slow/borked login? |
[production] |
01:39 |
<brion> |
scapping to update AbuseFilter to current |
[production] |
01:27 |
<brion> |
prepping testing of AbuseFilter on test.wikipedia |
[production] |
00:46 |
<brion> |
enabling Collection also for de.wikisource per frank's req passed on from community |
[production] |
00:36 |
<brion> |
adding NS_HELP to $wgCollectionArticleNamespaces |
[production] |
00:12 |
<brion> |
Collection extension being enabled on dewiki |
[production] |
2009-01-20
|
21:45 |
<RobH> |
killed some runaway processes on db9 that were killing bugzilla |
[production] |
21:44 |
<brion> |
stuck long queries on bz again. got rob poking em |
[production] |
20:31 |
<brion> |
putting $wgEnotifUseJobQ back for now. change postdates some of the spikes i'm seeing, but it'll be easier to not have to consider it |
[production] |
20:19 |
<mark> |
Upgraded kernel to 2.6.24-22 on sq22 |
[production] |
19:57 |
<brion> |
disabling $wgEnotifUseJobQ since the lag is ungodly |
[production] |
17:58 |
<JeLuF> |
db2 overloaded, error messages about unreachable DB server have been reported. Nearly all connections on db2 are in status "Sleep" |
[production] |
17:21 |
<JeLuF> |
srv154 is reachable again, current load average is 25, no obvious CPU consuming processes visible |
[production] |
17:10 |
<JeLuF> |
srv154 went down. Replaced its memcached by srv144's memcached |
[production] |
03:02 |
<brion> |
syncing InitialiseSettings -- reenabling CentralNotice which we'd taken temporarily out during the upload breakage |
[production] |
01:50 |
<Tim> |
exim4 on lily died while I examined reports of breakage, restarted it |
[production] |