2009-01-29
17:24 <brion> erzurumi appears to have been victim to a massive memory leak. seeing if we can reboot it [production]
17:17 <brion> poking at mw-serve on erzurumi; not responding [production]
16:15 <domas> livehacked out 'patrol' link on article views %) [production]
04:02 <Tim> added DNS entry for OTRS test [production]
01:31 <Tim> fixed srv76 and the wikimedia-task-appserver package [production]
01:31 <brion-busy> syncing r46513 -- fix for categoryfinder, update to fix for Collection [production]
01:14 <brion-busy> updating Collection ext -- compat issue with changed category [production]
00:56 <brion-busy> stopped apache on srv76 for the moment [production]
00:55 <brion-busy> srv76 doesn't have upload5 mounted [production]
00:41 <brion> live-hacking out a broken check in getDupeWarning() [production]
00:22 <Tim> synced nagios config [production]
2009-01-28
23:40 <mark> s/knams/esams/ in DNS geobackend files [production]
23:25 <mark> Deployed fix for a bug in /lib/lsb/init-functions on sanger, mchenry, williams and lily which caused (amongst other things) Exim reloads (-HUP) to be turned into a kill -TERM (Debian bug #434756) [production]
23:16 <mark> Set up basic mail system for OTRS on williams. Still incomplete and needs fine-tuning and testing; spam checking is not yet implemented, amongst other things. [production]
22:30 <mark> Restarted Exim on sanger; it had disappeared mysteriously [production]
21:50 <mark> Raised Dovecot max login process count from 128 to 1024 [production]
21:04 <brion> merging reupload fixes: r46479, r46483, r46487 [production]
20:49 <mark> Base OS install finished on williams.wikimedia.org [production]
20:02 <brion> merging r46472 (FlaggedRevs autopromote fix), r46464-46476 (feed RTL style fix, re-upload disabled field fix) [production]
18:05 <RobH> setup mail relay for wikimedia.cz for Danny and Co ;] [production]
08:43 <domas> s3 replication switched from db1-bin.325:437169827 to db11-bin.026:79 [production]
08:35 <domas> s2 rep switched from ixia-bin.150:119337662 to db13-bin.004:79 [production]
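
(For reference: repointing a slave to a new master binlog position, as in the two replication entries above, is typically done with CHANGE MASTER TO on the slave. A minimal sketch only -- the master host name is an assumption and replication credentials are assumed to be configured already; the coordinates are the ones recorded in the 08:43 entry:)

    mysql -e "STOP SLAVE;
              CHANGE MASTER TO
                MASTER_LOG_FILE='db11-bin.026',  -- file from the log entry
                MASTER_LOG_POS=79;               -- position from the log entry
              START SLAVE;"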
06:15 <Tim> creating backup of db10 on storage2 [production]
04:29 <brion> svn up'ing and scapping to r46424 consistently [production]
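
(The entry above is the usual update-and-push cycle: update the shared MediaWiki checkout to the target revision, then run scap to sync it out to the apaches. A rough sketch; the checkout path and the argument-less scap invocation are assumptions, r46424 is the revision from the entry:)

    cd /home/wikipedia/common/php   # assumed location of the live checkout
    svn up -r 46424                 # update working copy to the logged revision
    scap                            # push the updated code to the apaches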
04:22 <brion> updating FlaggedRevs to r46422 [production]
04:17 <brion> merging r46419, r46421 -- search display fixlets [production]
03:51 <brion> attempting scap again; tweaking DataCenter.ui.php since the scap syntax checks are whinging about the abstract static method o_O [production]
03:40 <brion> scapping to r46413 [production]
01:35 <brion> svn up'ing to r46413 on test... [production]
2009-01-27
19:28 <brion> syncing updates to Collection [production]
19:04 <brion> scapping update to AbuseFilter for test. updated its schema... [production]
18:44 <brion> db16 lagged 2188s [production]
18:44 <brion> restarting slave thread on db16. it got stopped with a lock wait timeout on a page_touched update (wtf?!) [production]
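
(A minimal sketch of checking why the slave thread stopped and restarting it, as in the entry above; running the client from a management host with -h is an assumption:)

    mysql -h db16 -e 'SHOW SLAVE STATUS\G' | grep -E 'Slave_SQL_Running|Last_Error|Seconds_Behind_Master'
    mysql -h db16 -e 'START SLAVE'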
18:43 <brion> slave stopped on db16 [production]
17:41 <mark> knsq1 up and serving requests with squid 2.7.5 [production]
17:25 <mark> Trying squid 2.7.5 on knsq1 - might be unstable in the meantime [production]
17:22 <mark> Reduced cache_mem on backend esams text squids from 3000 to 2500 [production]
16:23 <RobH> srv76 had a failed hdd, replaced, reinstalled, and bringing back into rotation [production]
16:18 <RobH> srv146 was powered down (heat issue?), powered back up, synced and now in rotation. [production]
16:09 <RobH> srv139 didn't have apache running, synced and started [production]
16:01 <RobH> srv129 didn't have apache running, synced and started [production]
15:59 <RobH> sq11 back online, cleaned [production]
15:40 <RobH> srv126 back online. possible bad disk; if it crashes again, the disk needs replacement. (it went read-only before, which seems to sometimes happen even when the disks are not bad.) [production]
15:25 <RobH> srv76 won't boot up, reinstalling. [production]
15:12 <RobH> srv130 coming back online, updated fstab, synced, putting it back in rotation. [production]
15:05 <RobH> moved ts-array4 to its dedicated ports, now it's kate's problem ;] [production]
14:49 <Tim> restarted recompressTracked.php [production]
14:33 <Tim> henbane's disk has been full for 8 days due to donate-campaign.log, starting cleanup [production]
14:18 <Tim> killed recompressTracked.php [production]
13:44 <mark> CARP weight redistribution caused a large load spike in upload backend requests, causing ms1 overload, probably causing issues on apaches via NFS, etc etc... [production]