2010-02-22
|
23:40 |
<Tim> |
restarted replication on ms2 |
[production] |
23:39 |
<Tim> |
ms1 is up with a full copy of ms3 data, now replicating cluster22/rc1 |
[production] |
21:09 |
<mark> |
synchronized php-1.5/wmf-config/InitialiseSettings.php 'Revert' |
[production] |
20:30 |
<root> |
synchronized php-1.5/wmf-config/InitialiseSettings.php 'Bug 22522 - Allow bureaucrats to remove sysop flag on Simple English Wiktionary' |
[production] |
20:09 |
<Rob> |
finished the setup of srv217 (just needed puppet fixed and files updated) |
[production] |
20:02 |
<root> |
ran sync-common-all |
[production] |
20:01 |
<mark> |
synchronized php-1.5/wmf-config/InitialiseSettings.php 'Set vector as default skin on nlwikimedia' |
[production] |
19:44 |
<root> |
ran sync-common-all |
[production] |
18:10 |
<Fred> |
ganglia is going to hiccup a little for the next few minutes while I restore data... "Don't panic" |
[production] |
07:53 |
<Tim> |
rebooted srv206 via drac |
[production] |
07:48 |
<tstarling> |
synchronized php-1.5/wmf-config/mc.php 'removing srv206, is down' |
[production] |
06:19 |
<Tim> |
increased multicast TTL from 1 to 3 in gmond_template.erb, to fix ganglia on ms* |
[production] |
05:54 |
<Tim> |
started ms2 -> ms1 copy using nc+tarpipe |
[production] |
05:30 |
<Tim> |
shutting down mysql on ms2 for copy to ms1 |
[production] |
05:09 |
<tstarling> |
synchronized php-1.5/wmf-config/db.php 'depooling ms2 for copy to ms1' |
[production] |
2010-02-16
|
23:31 |
<Fred> |
upgrading gmond to 3.1.2 everywhere. However, due to the newish module structure, ganglia may hiccup while puppet does its job... |
[production] |
22:03 |
<logmsgbot_> |
mark synchronized php-1.5/wmf-config/checkers.php 'Exceptions' |
[production] |
20:30 |
<Rob> |
srv127 is online, but not in LVS. Its cert was accepted on sockpuppet, but puppetd --test results in a cert failure on retrieval from sockpuppet. |
[production] |
20:17 |
<Rob> |
rebooting srv127 |
[production] |
20:15 |
<mark> |
Increased membufs to 40 per COSS dir on the pmtpa upload backend squids |
[production] |
19:51 |
<mark> |
Increased membufs per COSS dir from 10 to 20 on the new pmtpa squids |
[production] |
18:10 |
<apergos> |
but documentation can save people precious time when things are on fire |
[production] |
18:02 |
<mark> |
Documentation is not a substitute for thinking |
[production] |
16:44 |
<mark> |
Fixed puppet on most servers |
[production] |
16:23 |
<domas> |
does anyone know why mysqldump on snapshot3 is locking tables? maybe --single-transaction could work better?!!!? |
[production] |
16:22 |
<RoanKattouw> |
Strike my last, I hear it'll fix itself in a day |
[production] |
16:21 |
<RoanKattouw> |
Oops, meant to type srv187-189 |
[production] |
16:21 |
<RoanKattouw> |
Ganglia not picking up data for srv1987-189, 193, 214-218, 250-253, 257 even though the boxes are up and gmond is running; been like this for 3 days |
[production] |
15:21 |
<mark> |
Removing all puppet certs/private keys on all machines |
[production] |