2012-05-07

21:44 <binasher> moved default resolution for upload from eqiad to pmtpa [production]
21:29 <cmjohnson1> shutting down storage3 for troubleshooting [production]
20:37 <binasher> attempting a live online schema change for zuwiktionary.recentchanges on the prod master [production]
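
The log doesn't record how the change was applied; a common tool for live schema changes on a busy master in this era was Percona Toolkit's pt-online-schema-change. A minimal sketch assuming that tool (the ALTER itself and the host are illustrative, since the entry doesn't say what was changed):

    # Hedged sketch: pt-online-schema-change copies the table in the
    # background, keeps the copy current with triggers, then atomically
    # renames it into place, avoiding a long lock on recentchanges.
    pt-online-schema-change \
      --alter "ADD INDEX rc_illustrative (rc_namespace, rc_timestamp)" \
      --execute h=localhost,D=zuwiktionary,t=recentchanges
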
20:22 <LeslieCarr> (above) restarted nagios-wm on spence [production]
20:20 <LeslieCarr> restarted irc bot [production]
20:15 <binasher> rebooting db45 [production]
20:11 <binasher> rebooting db1019 [production]
18:46 <reedy> synchronized php-1.20wmf1/extensions/Collection/Collection.session.php 'head' [production]
18:45 <reedy> synchronized php-1.20wmf2/extensions/Collection/Collection.session.php 'head' [production]
18:25 <reedy> synchronized php-1.20wmf2/extensions/GlobalBlocking/GlobalBlocking.class.php [production]
18:24 <reedy> synchronized php-1.20wmf1/extensions/GlobalBlocking/GlobalBlocking.class.php [production]
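
The "synchronized ..." entries above are logged automatically by the deployment tooling when a file is pushed from the deployment host to the apaches. A minimal sketch assuming the sync-file wrapper that produced this style of entry (treat the script name as an assumption):

    # Hedged sketch: push one file from the staged MediaWiki tree to all
    # web servers; the trailing comment ends up in this log.
    sync-file php-1.20wmf1/extensions/Collection/Collection.session.php 'head'
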
18:07 <reedy> rebuilt wikiversions.cdb and synchronized wikiversions files: enwiki to 1.20wmf2 [production]
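
Switching a wiki's MediaWiki branch was a two-step edit-and-push; a minimal sketch assuming the wikiversions.dat / sync-wikiversions convention of the time (exact file locations are assumptions):

    # Hedged sketch: wikiversions.dat maps each wiki to a branch directory;
    # sync-wikiversions rebuilds the binary lookup file wikiversions.cdb
    # from it and pushes both out to the apaches.
    sed -i 's|^enwiki php-1.20wmf1$|enwiki php-1.20wmf2|' wikiversions.dat
    sync-wikiversions 'enwiki to 1.20wmf2'
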
16:16 <cmjohnson1> shutting down storage3 to reseat the RAID card [production]
15:58 <cmjohnson1> going to power cycle storage3 several times to troubleshoot a hardware issue [production]
15:15 <RobH> updating firmware on storage3 [production]
14:20 <Jeff_Green> stopped cron jobs on storage3 because of RAID failure [production]
12:49 <mutante> pushing out virtual host for the wikimania2013 wiki; sync / apache-graceful-all [production]
11:18 <mutante> continuing with upgrades/reboots on amssq* on the side during the day [production]
11:09 <mutante> squids: all sq* done, all running the latest kernel with 0 pending upgrades [production]
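
A hedged sketch of the per-host check behind "0 pending upgrades" (standard apt commands; nothing site-specific):

    # Hedged sketch: refresh package lists, then simulate an upgrade and
    # count packages that would be installed; 0 means fully patched.
    apt-get update
    apt-get -s dist-upgrade | grep -c '^Inst'
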
09:27 <mutante> rebooting bits varnish sq68-70 one by one [production]
08:01 <mutante> upgrading/rebooting the last couple of sq* servers [production]
07:20 <binasher> power cycled db45 (crashed dewiki slave) [production]
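
A crashed slave like db45 has to be power cycled out-of-band through its management controller; a minimal sketch with standard ipmitool syntax (management hostname and credentials are assumptions):

    # Hedged sketch: hard power cycle over the management network, then
    # attach to the serial console to watch it boot.
    ipmitool -I lanplus -H db45-mgmt -U root -P "$IPMI_PASS" chassis power cycle
    ipmitool -I lanplus -H db45-mgmt -U root -P "$IPMI_PASS" sol activate
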
07:05 <asher> synchronized wmf-config/db.php 'db45 is down' [production]
02:25 <Tim> on locke: introduced 1/100 sampling for banner impressions, changed filename to bannerImpressions-sampled100.log [production]
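
udp2log filters take a per-output sampling factor, which is how the 1/100 sampling above is expressed; a sketch of what the changed filter line plausibly looked like (the directory and the "before" state are assumptions; only the new filename is from the log):

    # Hedged sketch of a udp2log filter entry. The second field is the
    # sampling factor: "file 100 ..." writes 1 of every 100 lines.
    # before (assumed): file 1 /a/squid/fundraising/logs/bannerImpressions.log
    file 100 /a/squid/fundraising/logs/bannerImpressions-sampled100.log
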
02:12 <Tim> on locke: moved fundraising logs back where they were [production]
02:00 <LocalisationUpdate> failed: git pull of extensions failed [production]
01:38 <Tim> on locke: compressing bannerImpressions.log [production]
01:35 <Tim> on locke: moved bannerImpressions.log to archive and restarted udp2log [production]
01:26 <Tim> on locke: moved fundraising logs from /a/squid/fundraising/logs to /a/squid so that they will be processed by logrotate [production]
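
The move works because logrotate only touches paths its config globs match; a minimal sketch under that assumption (the stanza is illustrative, not the actual config):

    # Hedged sketch: put the logs under the directory an existing
    # logrotate glob already covers.
    mv /a/squid/fundraising/logs/*.log /a/squid/
    # illustrative stanza that would then pick them up:
    #   /a/squid/*.log {
    #       daily
    #       rotate 7
    #       compress
    #   }
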
2012-05-06

07:03 <apergos> manually rotated udplogs on locke, copying destined_for_storage3 off to hume:/archive/emergencyfromlocke/ (jeff, this note's for you in particular) [production]
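
A minimal sketch of that manual rotation and off-host copy, assuming the usual rename-then-reopen pattern (flags and the rename scheme are illustrative):

    # Hedged sketch: rename the live logs, have udp2log reopen its output
    # files, then copy the rotated logs to hume while storage3 is down.
    mv /a/squid/example.log /a/squid/example.log-20120506   # illustrative name
    /etc/init.d/udp2log restart
    rsync -av /a/squid/*.log-20120506 hume:/archive/emergencyfromlocke/
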
06:36 <apergos> bringing up storage3 with neither /a nor /archive mounted; saw "The disk drive for /archive is not ready yet or not present" etc. on boot, waited a long time, finally skipped them [production]
06:12 <apergos> and power cycling the box instead. grrrr [production]
06:05 <apergos> rebooting storage3: we have messages like May 6 05:45:12 storage3 kernel: [465081.410025] Filesystem "dm-0": xfs_log_force: error 5 returned in the log, the RAID is inaccessible, and megacli doesn't run either [production]
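
xfs_log_force error 5 is EIO, i.e. the filesystem is getting I/O errors from the block device underneath, which points at the RAID layer; a sketch of the standard MegaCli health checks (binary path varies by install; on storage3 the tool itself wouldn't run, which was itself diagnostic):

    # Hedged sketch: query adapter, logical drive, and physical disk state.
    /opt/MegaRAID/MegaCli/MegaCli64 -AdpAllInfo -aALL
    /opt/MegaRAID/MegaCli/MegaCli64 -LDInfo -Lall -aALL
    /opt/MegaRAID/MegaCli/MegaCli64 -PDList -aALL
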
02:00 <LocalisationUpdate> failed: git pull of extensions failed [production]