2010-09-10
|
23:51 |
<RobH> |
bringing back online srv154, it should start up, run puppet, and put itself back into service |
[production] |
23:50 |
<RobH> |
hey look, the nagios-wm change worked, it logs to both channels, all is well. |
[production] |
23:46 |
<RobH> |
srv154 may flap, it's intentional, I assure you |
[production] |
23:43 |
<RobH> |
nagios-wm reports in multiple channels, woot |
[production] |
23:21 |
<RobH> |
tired of messing with nagios bot, it's back online |
[production] |
23:13 |
<RobH> |
trying to bounce the nagios bot to make it report in different channels. |
[production] |
20:23 |
<Ryan_Lane> |
manually modified /etc/gmond.conf on srv218 temporarily to use an actually existing directory for ganglia plugin configuration |
[production] |
20:22 |
<Ryan_Lane> |
manually added memcache ganglia plugin to srv218 temporarily for testing before pushing to all memcache servers |
[production] |
20:21 |
<Ryan_Lane> |
added patched python-memcache package to srv218 for memcache ganglia plugin testing |
[production] |
20:21 |
<Ryan_Lane> |
created test-payments.tesla.usability.wikimedia.org for payment processing testing (no public IP) |
[production] |
19:06 |
<RobH> |
mobile2 back online and in lvs pool, mobile pool optimal (as optimal as a 3 server cluster can be) |
[production] |
19:01 |
<RobH> |
mobile2 unresponsive to ssh, shows partially up in lvs4, won't respond to serial console, rebooting it |
[production] |
18:59 |
<RobH> |
mobile3 back online. |
[production] |
18:59 |
<RobH> |
restarted apache/memcached on mobile3 |
[production] |
16:17 |
<RobH> |
db20 back online, but has one power supply while other is replaced |
[production] |
15:38 |
<RobH> |
db20 all kinds of messed up, ignore nagios flaps. |
[production] |
15:38 |
<RobH> |
db16 running memory tests until Monday, leave it alone |
[production] |
2010-09-09
|
21:32 |
<RobH> |
knsq21, knsq22 online, rt#5 resolved |
[production] |
20:31 |
<tfinc> |
synchronized php-1.5/wmf-config/CommonSettings.php 'Adding thursday banners to whitelist for contrib stats' |
[production] |
20:17 |
<RobH> |
knsq21 & knsq22 coming down for reinstallation, set to false in pybal |
[production] |
20:14 |
<RobH> |
knsq17-knsq20 reinstalled and back in service |
[production] |
19:53 |
<jeluf> |
synchronized php-1.5/wmf-config/InitialiseSettings.php '23081 - Remove "makesysop" right from bureaucrats and stewards groups' |
[production] |
19:50 |
<jeluf> |
synchronized php-1.5/wmf-config/InitialiseSettings.php '25011 - Need Namespace name change in tamil wikinews' |
[production] |
19:43 |
<jeluf> |
synchronized php-1.5/wmf-config/InitialiseSettings.php '24865 - [[special' |
[production] |
19:35 |
<jeluf> |
synchronized php-1.5/wmf-config/InitialiseSettings.php '23186 - Enable upload in portuguese Wikipedia' |
[production] |
19:32 |
<jeluf> |
synchronized php-1.5/wmf-config/InitialiseSettings.php '24852 - New namespace for WikiProject on jawiki' |
[production] |
19:16 |
<jeluf> |
synchronized php-1.5/wmf-config/InitialiseSettings.php '25022 - Add NewUserMessage extension on ru.wikiversity' |
[production] |
18:47 |
<RobH> |
test test |
[production] |
17:44 |
<RobH> |
knsq14, knsq15 both back online and in lvs pool, still working on knsq16, knsq17 |
[production] |
17:00 |
<RobH> |
knsq14-knsq17 reinstalled, getting added and updated by puppet, then they will be put back into service |
[production] |
16:24 |
<RobH> |
knsq12 back in lvs pool, set knsq14-17 to false for reinstallation |
[production] |
16:11 |
<hcatlin> |
deployed XSS fix to m.wiki. thanks to gcouprie and tstarling. |
[production] |
15:52 |
<RobH> |
holding off on that, knsq12 is not up and online, fixing. |
[production] |
15:51 |
<RobH> |
knsq8 in cluster, setting knsq14-knsq17 to false, then taking them down for reinstallation |
[production] |
15:49 |
<RobH> |
knsq8 fixed. setting to true in lvs config |
[production] |
15:45 |
<RobH> |
knsq8 being fixed and pushed into service, then resuming reinstallations for rt#5 knsq14-knsq22 |
[production] |