2009-02-15
|
21:59 |
<mark> |
Experimentally blocked non-GET/HEAD HTTP methods on the sq3 frontend squid |
[production] |
16:15 |
<mark> |
Upgraded PyBal on lvs2 - others will follow |
[production] |
13:11 |
<domas> |
db23 has multiple MCEs logged for the same DIMM: http://p.defau.lt/?IarKD4gbFhe5RmaV0RB_Xg |
[production] |
12:38 |
<domas> |
in wikistats, moved files older than 10 days into ./archive/yyyy/mm/ - maybe this will make flack crash less :)) |
[production] |
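Note on the 12:38 entry: the archiving step could be sketched roughly as below. This is a minimal Python sketch, not the script actually run. It assumes file age is judged by modification time and that yyyy/mm comes from that same timestamp in UTC; the function name `archive_old_files` and its parameters are hypothetical.

```python
import os
import shutil
import time


def archive_old_files(src_dir, max_age_days=10, now=None):
    """Move regular files older than max_age_days into src_dir/archive/yyyy/mm/.

    Assumption: "older" means the file's mtime, and yyyy/mm is taken from
    that mtime in UTC. Returns the names of the files that were moved.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    moved = []
    for name in sorted(os.listdir(src_dir)):
        path = os.path.join(src_dir, name)
        if not os.path.isfile(path):
            continue  # skips the archive directory itself and other non-files
        mtime = os.path.getmtime(path)
        if mtime >= cutoff:
            continue  # recent enough, leave in place
        t = time.gmtime(mtime)
        dest_dir = os.path.join(
            src_dir, "archive", "%04d" % t.tm_year, "%02d" % t.tm_mon
        )
        os.makedirs(dest_dir, exist_ok=True)
        shutil.move(path, os.path.join(dest_dir, name))
        moved.append(name)
    return moved
```

Keeping the top-level directory small is the point here: a consumer that lists or scans it only sees the last 10 days of files.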
11:56 |
<mark> |
Doing Squid memleak searching on sq1 with valgrind, pooled with weight 1 in LVS |
[production] |
03:09 |
<Andrew> |
CentralNotice still not working properly, and when we tried to set it to testwiki-only, it never came up. Left it on testwiki only for the time being, until somebody who knows CentralNotice can take a look at it. |
[production] |
02:21 |
<Tim> |
fixed permissions on the rest of the logs in /home/wikipedia/logs/norotate (fixes centralnotice) |
[production] |
2009-02-14
|
19:19 |
<Az1568_> |
re-enabled CentralNotice on testwiki to try and find the problem (we've had this before, but fixed it somehow...possibly with a regen? See November 16th log.) |
[production] |
18:34 |
<domas> |
filed a bug at https://bugs.launchpad.net/ubuntu/+source/apparmor/+bug/329489 - could use some Canonical escalation too |
[production] |
18:26 |
<domas> |
same affected srv47 - this is related to switching locking to fcntl() - this drives apparmor crazy |
[production] |
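Note on the 18:26 entry: for readers unfamiliar with fcntl()-style locking, the sketch below shows the general technique in Python. It illustrates advisory file locking only and is not APC's actual implementation; the helper name `file_lock` and the lock-file path are hypothetical.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def file_lock(path):
    """Hold an exclusive advisory lock on `path` for the duration of the block.

    Assumption: cooperating processes all lock the same file. Processes that
    never take the lock are not stopped by it.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)  # blocks until the lock is granted
        try:
            yield fd
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN)  # release before closing
    finally:
        os.close(fd)
```

Usage would look like `with file_lock("/tmp/example.lock"): ...`. Every acquire and release is a system call on a file descriptor, so this kind of lock passes through the kernel each time, unlike an uncontended futex.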
17:47 |
<domas> |
srv178 kernel leaked a few gigs of memory. blame: apparmor |
[production] |
14:34 |
<domas> |
srv215 very much dead, shows no vital signs even after serveractionhardreset |
[production] |
14:28 |
<domas> |
correction, srv208.mgmt is pointing to an uninstalled box |
[production] |
14:27 |
<domas> |
DRAC serial on all new boxes is ttyS1 which is not in securetty |
[production] |
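Note on the 14:27 entry: root logins are refused on a terminal that is not listed in securetty, which is why ttyS1 matters for the DRAC serial console. A minimal Python sketch of that check follows. It assumes the usual file layout of one device name per line, with blank lines and lines starting with `#` ignored; the function names and the sample contents are hypothetical.

```python
def allowed_ttys(securetty_text):
    """Return the terminal names listed in securetty-style text."""
    ttys = []
    for line in securetty_text.splitlines():
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue  # blank lines and comment lines carry no entry
        ttys.append(entry)
    return ttys


def root_login_allowed(securetty_text, tty):
    """True if `tty` (e.g. "ttyS1", without /dev/) is listed."""
    return tty in allowed_ttys(securetty_text)


# Hypothetical sample contents, not taken from any real host.
SAMPLE = "# consoles\nconsole\ntty1\nttyS0\n"
```

With the sample above, `root_login_allowed(SAMPLE, "ttyS1")` is false until a `ttyS1` line is appended.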
14:24 |
<domas> |
srv209.mgmt is actually srv208's SP, and srv208.mgmt is pointing to dead box |
[production] |
14:15 |
<domas> |
srv209,215 down? |
[production] |
13:43 |
<domas> |
installing php5-apc-3.0.19-1wm2 (no more futexes) on all ubuntu appservers. |
[production] |
11:01 |
<Andrew> |
test |
[production] |
2009-02-13
|
22:10 |
<mark> |
esams squid upgrade complete |
[production] |
21:05 |
<RobH> |
deployed srv207-srv216 in apaches cluster |
[production] |
20:34 |
<RobH> |
added new servers to nagios and restarted it |
[production] |
20:15 |
<RobH> |
set up all node groups, ganglia, apache, and so on for srv199-srv206 and added them into rotation |
[production] |
19:38 |
<mark> |
Upgrading esams squids to 2.7.6 |
[production] |
18:36 |
<mark> |
Upgraded squid on sq1 to 2.7.6 and rebooted the box |
[production] |
18:03 |
<mark> |
Memory leak issues on the upload frontend squids, which started in November |
[production] |
18:01 |
<RobH> |
sq13 back online, seems there is a memory leak, go mark for finding =] |
[production] |
17:54 |
<RobH> |
lomaria install done for domas |
[production] |
17:49 |
<RobH> |
rebooting sq13 due to it failing out in ganglia, OOM error evident. |
[production] |
17:48 |
<RobH> |
reinstalling lomaria per domas's request |
[production] |
17:37 |
<RobH> |
sq8 was out of memory and locked up, rebooted, cleaned cache, and bringing back online |
[production] |
17:34 |
<RobH> |
srv38 and srv39 back in rotation |
[production] |
17:23 |
<RobH> |
srv38 and srv39 reinstalled, installing packages now |
[production] |
16:57 |
<RobH> |
reinstalling srv38/srv39 |
[production] |
16:57 |
<RobH> |
srv80 reinstalled as ubuntu apache and back in rotation |
[production] |
16:31 |
<RobH> |
srv79 back in rotation |
[production] |
16:21 |
<RobH> |
srv79 reinstalled, installing packages and ganglia |
[production] |
16:12 |
<RobH> |
reinstalling srv79 |
[production] |
16:00 |
<RobH> |
ganglia installed on srv77, back in rotation |
[production] |
15:55 |
<RobH> |
srv77 redeployed as ubuntu apache server |
[production] |
15:48 |
<RobH> |
reinstalling srv77 to ubuntu |
[production] |