|
2009-09-15
|
| 21:09 |
<domas> |
where is my attribution ;-D |
[production] |
| 21:08 |
<Rob> |
test.wikipedia.org fixed by mounting nfs |
[production] |
| 20:55 |
<Rob> |
set up new private wiki, added to dns as well as configuration files |
[production] |
| 20:38 |
<robh> |
ran sync-common-all |
[production] |
| 19:58 |
<Rob> |
servers running wipe were burdening the logging host. added drop rules to iptables on db20 to refuse those servers access since ssh doesn't work with wipe destroying things |
[production] |
| 19:24 |
<Rob> |
depooled srv124 to use as test.wikipedia.org, then updated squid config and pushed |
[production] |
| 19:09 |
<midom> |
synchronized php-1.5/wmf-config/db.php 'oh well' |
[production] |
| 17:14 |
<Rob> |
db12 is back online with mysql running |
[production] |
| 16:41 |
<Rob> |
installed python-pyfribidi on pdf1 |
[production] |
| 16:06 |
<aZaFred_> |
snapshot[1..3] wikimedia-task-appserver install completed. Added hosts to dsh nodegroup for mediawiki-installation so common updates get pushed to them |
[production] |
| 16:04 |
<Rob> |
srv245 bad power supply, swapped with on-site spare |
[production] |
| 15:59 |
<Rob> |
rebooting srv245 to fix its drac access |
[production] |
| 15:33 |
<Rob> |
some odd invalid entries in dsh nodes, removed. |
[production] |
| 15:24 |
<Rob> |
running wipe on srv35, srv52, srv54, srv56 |
[production] |
| 15:20 |
<Rob> |
srv66 running wipe |
[production] |
| 15:12 |
<mark> |
Started MySQL on ms2 and restarted replication |
[production] |
| 15:11 |
<mark> |
Redistributed the spare drives on ms2 back into the spare pool (/dev/md1) |
[production] |
| 15:08 |
<Rob> |
shutting down db12 for raid battery swap |
[production] |
| 15:04 |
<mark> |
Swapped drive c3t6d0 in ms2, readded it to /dev/md14 and moved back the spare /dev/sdao into the spare pool (/dev/md1) |
[production] |
| 14:42 |
<mark> |
Shutting down MySQL on ms2 |
[production] |
| 14:33 |
<Rob> |
removed a number of decommissioned servers from nagios |
[production] |
| 14:30 |
<Rob> |
wipe running on srv44, srv45, srv47 |
[production] |
| 14:26 |
<Rob> |
srv31, srv32, srv33 running wipe in screen sessions, do not try to use them ;] |
[production] |
| 14:22 |
<Rob> |
srv30-srv80 to be decommissioned, wiping the drives with them in rack now. mark already depooled from apache and memcached |
[production] |
| 14:19 |
<Rob> |
srv145 is back up and ok |
[production] |
| 14:07 |
<Rob> |
srv145 coming back up, my bad |
[production] |
| 13:53 |
<Rob> |
srv52 hdd died. |
[production] |
| 13:51 |
<domas> |
cron jobs work way better, if one figures how to set permissions right (like, executable? :) |
[production] |
| 00:03 |
<atglenn> |
grrr.. that would be ms4. |
[production] |
| 00:03 |
<atglenn> |
testing mailiferr (workaround for no MAILTO on solaris) on ms5 for hourly snaps |
[production] |
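The 00:03 entry mentions mailiferr as a workaround for cron on Solaris not honoring MAILTO. The script itself is not shown in this log, so the following is only a minimal sketch of the general idea (run a job, and build a mail only when it fails), not the actual tool. The function names, the default sender address, and the use of a local SMTP relay are all assumptions made for illustration.

```python
import smtplib
import subprocess
import sys
from email.message import EmailMessage


def build_failure_mail(cmd, recipient, sender="cron@localhost"):
    """Run cmd; return an EmailMessage describing the failure, or None on success.

    Hypothetical helper: the name, arguments and default sender are assumptions.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode == 0:
        return None  # success: stay quiet
    msg = EmailMessage()
    msg["Subject"] = f"job failed (exit {result.returncode}): {' '.join(cmd)}"
    msg["From"] = sender
    msg["To"] = recipient
    # Include whatever the job printed so the recipient can diagnose it.
    msg.set_content(result.stdout + result.stderr)
    return msg


def run_and_mail(cmd, recipient, smtp_host="localhost"):
    """Send the failure mail through an SMTP relay (assumed to listen on smtp_host)."""
    msg = build_failure_mail(cmd, recipient)
    if msg is not None:
        with smtplib.SMTP(smtp_host) as smtp:
            smtp.send_message(msg)
    return msg is None


if __name__ == "__main__":
    # Example crontab usage (assumed): wrapper.py root@localhost /path/to/job args...
    ok = run_and_mail(sys.argv[2:], sys.argv[1])
    sys.exit(0 if ok else 1)
```

A wrapper like this also has to be executable itself to run from cron, which is the point of the 13:51 entry.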
|
2009-09-14
|
| 23:26 |
<atglenn> |
rerunning the rsync list of changed files again on ms6 (last run was borked). it's in screen as root |
[production] |
| 23:23 |
<aZaFred_> |
added nagios/ganglia monitoring for snapshot[1..3] |
[production] |
| 22:32 |
<atglenn> |
created a directory /export/upload/wikipedia/common/thumb with no write perms on ms1 so that static html dump ext doesn't try to write in it |
[production] |
| 21:39 |
<aZaFred_> |
upgrading Ubuntu on Sage to 8.04 LTS |
[production] |
| 21:12 |
<tomaszf> |
running timings for en static html snapshot from hume |
[production] |
| 19:50 |
<robh> |
synchronized php-1.5/wmf-config/InitialiseSettings.php |
[production] |
| 19:47 |
<Rob> |
left out the virtual host, resyncing and restarting apaches |
[production] |
| 19:44 |
<robh> |
ran sync-common-all |
[production] |
| 18:21 |
<Rob> |
added volunteer.wikimedia.org to dns and pushed authdns-update |
[production] |
| 18:03 |
<Rob> |
added flagrevs stats update to hume crontab |
[production] |
| 16:24 |
<brion> |
restarted apache on wikitech |
[production] |
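The 22:32 entry describes pre-creating a thumb directory with no write permissions so that another process cannot write into it. A minimal sketch of that technique follows, assuming a plain POSIX filesystem; the function name and the 0555 mode are illustrative choices, and the log does not say which mode or ownership was actually used on ms1. Note that a process running as root bypasses these permission bits, so this only helps when the writer runs as an unprivileged user.

```python
import os
import stat


def make_unwritable_dir(path):
    """Create path if needed and strip all write bits (mode 0555).

    Hypothetical helper; returns the resulting permission bits.
    """
    os.makedirs(path, exist_ok=True)
    os.chmod(path, 0o555)  # read and traverse for everyone, write for no one
    return stat.S_IMODE(os.stat(path).st_mode)
```

Calling `make_unwritable_dir` on a scratch path and then attempting to create a file inside it as a non-root user should fail with a permission error.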