2012-05-07
|
14:20 |
<Jeff_Green> |
stopped cron jobs on storage3 because of RAID failure |
[production] |
12:49 |
<mutante> |
pushing out virtual host for wikimania2013 wiki. sync / apache-graceful/all |
[production] |
11:18 |
<mutante> |
continuing with upgrades/reboots in amssq* on the side during the day |
[production] |
11:09 |
<mutante> |
squids - sq* done. all latest kernel and 0 pending upgrades. |
[production] |
09:27 |
<mutante> |
rebooting bits varnish sq68-70 one by one.. |
[production] |
08:01 |
<mutante> |
upgrading/rebooting the last couple sq* servers |
[production] |
07:20 |
<binasher> |
power cycled db45 (crashed dewiki slave) |
[production] |
07:05 |
<asher> |
synchronized wmf-config/db.php 'db45 is down' |
[production] |
02:25 |
<Tim> |
on locke: introduced 1/100 sampling for banner impressions, changed filename to bannerImpressions-sampled100.log |
[production] |
02:12 |
<Tim> |
on locke: moved fundraising logs back where they were |
[production] |
02:00 |
<LocalisationUpdate> |
failed: git pull of extensions failed |
[production] |
01:38 |
<Tim> |
on locke: compressing bannerImpressions.log |
[production] |
01:35 |
<Tim> |
on locke: moved bannerImpressions.log to archive and restarted udp2log |
[production] |
01:26 |
<Tim> |
on locke: moved fundraising logs from /a/squid/fundraising/logs to /a/squid so that they will be processed by logrotate |
[production] |
2012-05-06
|
07:03 |
<apergos> |
manually rotates udplogs on locke, copying destined_for_storage3 off to hume:/archive/emergencyfromlocke/ (jeff, this note's for you in particular) |
[production] |
06:36 |
<apergos> |
bringing up storage3 with neither /a nor /archive mounted, saw "The disk drive for /archive is not ready yet or not present" etc on boot, waited a long time, finally skipped them |
[production] |
06:12 |
<apergos> |
and powercycling the box instead. grrrr |
[production] |
06:05 |
<apergos> |
rebooting storage3: we have messages like May 6 05:45:12 storage3 kernel: [465081.410025] Filesystem "dm-0": xfs_log_force: error 5 returned. in the log, and the raid is inaccessible, megacli doesn't run either |
[production] |
02:00 |
<LocalisationUpdate> |
failed: git pull of extensions failed |
[production] |
2012-05-04
|
23:46 |
<reedy> |
synchronized php-1.20wmf2/extensions/GlobalBlocking/GlobalBlocking.class.php |
[production] |
23:45 |
<reedy> |
synchronized php-1.20wmf1/extensions/GlobalBlocking/GlobalBlocking.class.php |
[production] |
22:35 |
<aaron> |
synchronized php-1.20wmf2/includes/filerepo/backend/FSFileBackend.php 'deployed a807624' |
[production] |
22:34 |
<LeslieCarr> |
clearing varnish cache and reloading varnish on mobile |
[production] |
21:14 |
<reedy> |
synchronized wmf-config/InitialiseSettings.php |
[production] |
21:13 |
<reedy> |
ran sync-common-all |
[production] |
20:18 |
<catrope> |
synchronized wmf-config/InitialiseSettings.php 'Fix typo (cswikquote vs cswikiquote)' |
[production] |
20:06 |
<asher> |
synchronized wmf-config/db.php 'setting s2 writable' |
[production] |
20:05 |
<binasher> |
performing mysql replication steps for s2 master switch to db52 |
[production] |
20:04 |
<asher> |
synchronized wmf-config/db.php 'setting s2 read-only, db52 (still ro) as master, db13 removed' |
[production] |
19:49 |
<asher> |
synchronized wmf-config/db.php 'setting db52 weight to 0 in prep for making new s2 master' |
[production] |
19:32 |
<binasher> |
powering off db24 |
[production] |
18:08 |
<LeslieCarr> |
reloaded mobile varnish caches and purged them |
[production] |
18:02 |
<Ryan_Lane> |
gerrit upgrade is done |
[production] |
17:55 |
<Ryan_Lane> |
starting gerrit |
[production] |
17:32 |
<Ryan_Lane> |
installing gerrit package on manganese |
[production] |
17:28 |
<Ryan_Lane> |
adding gerrit 2.3 package to the repo |
[production] |
17:25 |
<Ryan_Lane> |
shutting down gerrit so that everything can be backed up |
[production] |
16:45 |
<apergos> |
lighty on dataset2 is running under gdb in screen session as root, if it dies please leave that alone (or look at it if you want to investigate) |
[production] |
16:26 |
<notpeter> |
turning off db30 (former s2 db, still on hardy, will ask asher what to do with it) to test noise in DC |
[production] |
15:50 |
<mutante> |
rebooting sq67 (bits) |
[production] |
15:42 |
<mutante> |
going through sq7x servers (text), full upgrades |
[production] |
15:32 |
<notpeter> |
removing srv281 from rendering pool until we figure out what's going on with it |
[production] |