2009-02-13
|
17:49 |
<RobH> |
rebooting sq13 due to it failing out in ganglia, OOM error evident. |
[production] |
17:48 |
<RobH> |
reinstalling lomaria per domas' request |
[production] |
17:37 |
<RobH> |
sq8 was out of memory and locked up, rebooted, cleaned cache, and bringing back online |
[production] |
17:34 |
<RobH> |
srv38 and srv39 back in rotation |
[production] |
17:23 |
<RobH> |
srv38 and srv39 reinstalled, installing packages now |
[production] |
16:57 |
<RobH> |
reinstalling srv38/srv39 |
[production] |
16:57 |
<RobH> |
srv80 reinstalled as ubuntu apache and back in rotation |
[production] |
16:31 |
<RobH> |
srv79 back in rotation |
[production] |
16:21 |
<RobH> |
srv79 reinstalled, installing packages and ganglia |
[production] |
16:12 |
<RobH> |
reinstalling srv79 |
[production] |
16:00 |
<RobH> |
ganglia installed on srv77, back in rotation |
[production] |
15:55 |
<RobH> |
srv77 redeployed as ubuntu apache server |
[production] |
15:48 |
<RobH> |
reinstalling srv77 to ubuntu |
[production] |
2009-02-12
|
23:59 |
<brion> |
adding 'helppage' to ui-content messages on commons per [[bugzilla:5925]] |
[production] |
23:01 |
<RobH> |
racked and set up drac for srv208-srv216 |
[production] |
21:20 |
<mark> |
Killed blocked apache processes on srv180, and restarted apache |
[production] |
21:19 |
<mark> |
Killed blocked apache processes on srv172, and restarted apache |
[production] |
21:07 |
<brion> |
fixed ownership on log files for updateSpecialPages cronjob, which likely is what broke it |
[production] |
20:28 |
<mark> |
Upgraded experimental squid 2.7.5 on knsq1 to squid 2.7.6 |
[production] |
20:00 |
<brion> |
fixed typo which broke access to revision deletion log for oversighters. tx to aaron for the spot :D |
[production] |
19:45 |
<mark> |
Replaced "2 cpu apaches" group aggregator srv32 by srv35 |
[production] |
18:55 |
<RobH> |
racked, wired, and remote management setup for srv199-srv207 |
[production] |
09:51 |
<domas> |
added srv190-srv198 to apaches dsh group, as they seem to be alive and kicking |
[production] |
09:48 |
<domas> |
changed weights for srv190-srv198 80->100 (to account for 1.85->2.5 GHz CPU step) |
[production] |
00:29 |
<brion> |
running updateRestrictions on wikis to clean up remaining funky restrictions entries per [[bugzilla:16846]] |
[production] |
00:22 |
<Tim> |
restarted apache on srv172 |
[production] |
2009-02-11
|
23:23 |
<mark> |
Pooled srv190-198 |
[production] |
23:23 |
<Tim> |
re-enabling search suggestions |
[production] |
23:19 |
<mark> |
Installed Ganglia on srv190-198 |
[production] |
23:17 |
<mark> |
Installed MediaWiki application server packages on srv190-198 |
[production] |
23:02 |
<mark> |
Added srv190-198 to mediawiki_installation node_group (not any others) |
[production] |
22:55 |
<mark> |
Ran dist-upgrade && reboot on srv190-198 |
[production] |
22:46 |
<mark> |
OS installed on srv190-198 |
[production] |
22:19 |
<RobH> |
racked and set up drac on srv195-srv198 |
[production] |
22:11 |
<RobH> |
racked and set up drac on srv192, srv193, srv194 |
[production] |
22:00 |
<RobH> |
racked and set up drac on srv190, srv191 |
[production] |
21:24 |
<brion> |
putting ixia back in rotation, it's caught up |
[production] |
20:05 |
<brion> |
depooling ixia while it catches up |
[production] |
20:05 |
<brion> |
ixia lagged 8810 secs |
[production] |
20:00 |
<brion> |
ixia replication is broken -- causing contribs lag on itwiki |
[production] |
19:19 |
<RobH> |
set up msw-a5-sdtpa like 30 minutes ago, oops ;] |
[production] |
19:00 |
<mark> |
Added srv190-225 to DNS & DHCP |
[production] |
18:55 |
<mark> |
set up RANCID for asw-a4-sdtpa and asw-a5-sdtpa |
[production] |
18:54 |
<brion> |
disabled srv38,39,77,79,80 in lvs3 pybal config to ensure they don't go back into service accidentally until fixed up |
[production] |
18:37 |
<brion> |
stopping apache on those bad machines for the moment |
[production] |
18:35 |
<brion> |
srv38, 39, 77, 79, and 80 appear to have been prematurely put into apaches pool, running old version of PHP. need to be halted and upgraded |
[production] |
17:26 |
<domas> |
restarted apache on srv154 after the deadlock in apc |
[production] |
16:04 |
<Tim> |
disabled checkers.php hack, using mwsuggest.js hack instead |
[production] |
15:52 |
<Tim> |
emergency optimisation: disabled search suggest via checkers.php |
[production] |
15:41 |
<domas> |
srv159 restarted as proper apache, not -DSCALER |
[production] |