2010-09-17
|
20:35 |
<RobH> |
locke no longer under warranty, replaced failed 73gb 10k with 146gb 10k drive on site. raid rebuilding, closing rt#114 |
[production] |
19:36 |
<RobH> |
db28 coming down for fanboard replacement (not hot swappable since it's the controller) |
[production] |
19:30 |
<RobH> |
all SP back online for search1-search12 reporting error free. |
[production] |
19:25 |
<RobH> |
resetting the SP on search1-search12 to make them forget the spare power supply they really do not have. |
[production] |
19:16 |
<RobH> |
full power reset cleared drac errors, noted for followup, system back online. |
[production] |
19:07 |
<RobH> |
srv206 has all kinds of errors, working on it, ignore nagios flaps. |
[production] |
18:49 |
<RobH> |
db34 online and ready to be set up in database deployments. |
[production] |
00:15 |
<tfinc> |
synchronized php-1.5/wmf-config/CommonSettings.php 'Fixing placement of VariablePage extension' |
[production] |
2010-09-16
|
21:39 |
<tfinc> |
synchronized php-1.5/wmf-config/InitialiseSettings.php 'adding enwiki to variable page extension' |
[production] |
21:38 |
<Ryan_Lane> |
fixed puppet issue with ganglia on memcache servers (a bad puppet file had previously been pushed) |
[production] |
21:35 |
<tfinc> |
ran sync-common-all |
[production] |
21:27 |
<tfinc> |
synchronized php-1.5/extensions/VariablePage/VariablePage.i18n.php 'Adding variable page extension for donate link' |
[production] |
17:36 |
<Ryan_Lane> |
took sq33 out of lvs rotation |
[production] |
16:02 |
<catrope> |
synchronized php-1.5/wmf-config/InitialiseSettings.php 'bug 25142: Redisable anon page creation, there was no consensus' |
[production] |
14:23 |
<RobH> |
sq60,sq73,sq75 coming back into service |
[production] |
14:13 |
<RobH> |
fixed sq33, working on backend on sq60,sq73,sq75, all having cache cleaned |
[production] |
13:54 |
<RobH> |
taking another look at sq33, pulled from service |
[production] |
13:49 |
<RobH> |
working on srv206 |
[production] |
13:46 |
<RobH> |
sq73 & sq75 up |
[production] |
13:35 |
<RobH> |
sq38 & sq60 back in service |
[production] |
13:29 |
<RobH> |
rebooted sq38 & sq60 to bring them back online |
[production] |
12:45 |
<mark> |
Powercycled amssq52 |
[production] |
12:37 |
<mark> |
knsq3 declared dead |
[production] |
11:43 |
<mark> |
Inserted extra switch fabric and spare 4x 10G line card modules into csw1-esams |
[production] |
2010-09-15
|
21:16 |
<Ryan_Lane> |
doing initial repo copies from mayflower to formey |
[production] |
21:16 |
<Ryan_Lane> |
added svn users to formey.wikimedia.org |
[production] |
18:43 |
<mark> |
Moving back bits.esams traffic |
[production] |
18:21 |
<mark> |
Moving traffic for bits.esams to bits.pmtpa |
[production] |
18:08 |
<mark> |
Restarted varnish on knsq5 |
[production] |
18:07 |
<mark> |
Changed LVS scheduler from 'wlc' to 'wrr' for bits on amslvs1 |
[production] |
18:02 |
<mark> |
Restarting pybal on amslvs1, with proxyfetch disabled for varnish/bits |
[production] |
17:48 |
<mark> |
Restarted varnish servers on knsq2, 4 and 5 |
[production] |
17:37 |
<mark> |
temporarily disabled the proxyfetch monitor on amslvs1 to stabilize |
[production] |
17:06 |
<mark> |
Restarted pdns instances on dobson |
[production] |
16:59 |
<mark> |
Shut down powerdns (auth) and pdns-recursor on dobson, preparing for reinstall |
[production] |
16:19 |
<mark> |
Temporarily making ns1 the DNS master, to rebuild dobson |
[production] |
15:52 |
<mark> |
Fixed regular expressions in puppet site.pp |
[production] |
14:48 |
<mark> |
ms2 upgrade to Lucid is complete |
[production] |
14:47 |
<mark> |
Replaced apparmor profile on ms2 by newer one from ms1, restarted replication |
[production] |
13:25 |
<mark> |
Starting ms2 upgrade to Lucid |
[production] |
01:09 |
<Ryan_Lane> |
finished rebuilding ersch |
[production] |
00:58 |
<Ryan_Lane> |
rebuilding ersch |
[production] |
00:57 |
<Ryan_Lane> |
finished rebuilding alsted |
[production] |
00:39 |
<Ryan_Lane> |
rebuilding alsted |
[production] |