2009-02-11 §
18:37 <brion> stopping apache on those bad machines for the moment [production]
18:35 <brion> srv38, 39, 77, 79, and 80 appear to have been prematurely put into apaches pool, running old version of PHP. need to be halted and upgraded [production]
17:26 <domas> restarted apache on srv154 after the deadlock in apc [production]
16:04 <Tim> disabled checkers.php hack, using mwsuggest.js hack instead [production]
15:52 <Tim> emergency optimisation: disabled search suggest via checkers.php [production]
15:41 <domas> srv159 restarted as proper apache, not -DSCALER [production]
09:02 <domas> moved morebots to ~morebots@wikitech.wikimedia, startup line in rc.local :) [production]
09:00 <domas> tests [production]
07:06 <Tim> running maintenance/fixBug17442.php [production]
06:56 <Tim> restarted job runners [production]
04:31 <Tim> upgraded bugzilla to 3.0.8 with cvs up, and copied in the docs directory from the 3.0.8 tarball [production]
03:31 <Tim> gave myself an account on isidore, cleaned up some crap in /srv/org/wikimedia to /srv/org/wikimedia/backup [production]
02:58 <Tim> apt-get upgrade on isidore [production]
2009-02-10 §
23:47 <mark> Moved upload esams LVS from mint to hawthorn [production]
23:41 <mark> Installed a specially compiled LVS Feisty kernel on hawthorn (running Hardy) & rebooted [production]
22:33 <RobH> updated mwlib on erzurumi per brion [production]
22:25 <RobH> some resets and such on searchidx1 to get ssh working. system is very sluggish. [production]
19:28 <brion> wikitech server crashed; CPU pegged and OOM. rob rebooted it, yay [production]
02:46 <Tim> running maintenance/fixBug17300.php to create missing redirect table entries [production]
01:18 <Tim> reverted PP caching patch [production]
01:14 <Tim> re-enabled search suggestions [production]
2009-02-09 §
23:13 <domas> grunt session finished [production]
23:10 <domas> brought up srv80 from hibernation and made it work. [production]
22:53 <domas> added srv61 too [production]
22:23 <domas> added srv144 and srv147 to duty, added ganglia stuff too [production]
22:01 <domas> started appserver work on srv77,srv79 [production]
21:54 <domas> started srv35,38,49 as appservers, restarted deadlocked srv49 processes [production]
16:14 <mark> Moved upload LVS back from hawthorn to mint - even an optimized 2.6.24 kernel is not fast enough to serve upload LVS [production]
16:03 <Tim> disabled search suggest as an emergency optimisation measure [production]
16:02 <mark> Rebooted hawthorn with an LVS optimized kernel, moved upload LVS back to it [production]
15:53 <mark> Moved upload esams LVS back to mint [production]
15:37 <mark> Moved upload.esams LVS from mint to hawthorn [production]
15:28 <mark> Reinstalled server hawthorn with Hardy 8.04 [production]
13:55 <domas> fixed ganglia group for srv159 (it is scaler, not appserv) [production]
13:51 <domas> brought srv182 up [production]
13:32 <domas> repooled srv104 and srv105, after a few months of vacation [production]
13:20 <domas> killed a few orphaned tidy processes that were very very busy since Feb 1 [production]
13:13 <domas> heeheee, extorted this: [15:11] <rainman-sr> so, srv77,79,80, rose, coronelli and maurus could be converted to apaches [production]
12:36 <Tim> trying apc.localcache=1 on srv176 [production]
04:27 <Tim> patching in r46936 [production]
03:48 <Tim> attempting to reproduce APC lock contention on srv188 [production]
2009-02-08 §
22:43 <brion> may or may not have fixed that -- log file was unwritable. hard to test the command since 'su' bitches about apache not being loginable on hume :P [production]
22:39 <brion> investigating why centralnotice update is still broken. getting fatal php errors wtf? [production]
20:17 <domas> we were hitting APC lock contention after some CPU peak. Dear Ops Team, please upgrade to APC with localcache support. :))))) [production]
2009-02-07 §
22:49 <domas> db17 came up, but it crashed with different symptoms than other boxes, and it was running 2.6.28.1 kernel. might be previous hardware problems resurfacing [production]
21:23 <domas> db17 down [production]
2009-02-06 §
12:33 <brion> stopped that process since it was taking a while and just saved it as an hourly cronjob. :) log to /opt/mwlib/var/log/cache-cleaning [production]
12:28 <brion> running mw-serve cache cleanup for files older than 24h [production]
2009-02-05 §
18:19 <brion> put ulimit back with -v 1024000 that's better :D [production]
18:18 <brion> removed the ulimit; was unable to reach server with it in place [production]