2009-04-23
06:27 <Tim> restarted all job runners, ES connection errors weren't killing them [production]
05:43 <Tim> shutting down mysql on all fedora ES servers. Will update documentation and node lists to indicate that this is permanent. [production]
05:37 <Tim> srv217 did not come up from a soft reboot, but power cycle worked. Before reboot, observed apache2 hanging indefinitely on nanosleep(), but couldn't reproduce a timer issue in other processes. An NFS mount was hanging on stat. [production]
05:13 <Tim> rebooting srv217 [production]
04:41 <Tim> srv217 is hanging on various operations, investigating. Trying to shut down its apache. [production]
04:35 <tstarling> synchronized php-1.5/db.php [production]
04:31 <Tim> copy done, started cluster18 mysql instance on ms3 using srv104 snapshot, repooled it [production]
02:07 <tstarling> synchronized php-1.5/InitialiseSettings.php [production]
01:57 <Tim> relaxed wgAccountCreationThrottle on frwiki, presumably the 2006 vandal emergency is over. Disabled it on idwiki for workshop event. [production]
01:45 <Tim> copying srv104's data from ms3 to ms2 [production]
01:11 <Tim> started mysql on srv104 [production]
2009-04-22
21:44 <tomaszf> db9 is back up. excessive tmpfs file systems removed [production]
21:39 <tomaszf> taking outage on db9 to remove tmpfs file systems [production]
11:34 <JeLuF> initiated reboot of srv137. dmesg shows no usable information any more. [production]
11:30 <JeLuF> srv137 has read-only filesystem. Stopped Apache. [production]
06:03 <andrew> synchronized php-1.5/includes/specials/SpecialBlockip.php 'Live-merged r49730, typo causing failures in user hiding' [production]
06:02 <Andrew> srv137 still seems read-only, srv137: rsync: mkstemp "/apache/common/php-1.5/includes/specials/.SpecialBlockip.php.1QkrKX" failed: Read-only file system (30) [production]
03:14 <Tim> copying ES data from srv104 to ms3 using nc tarpipe [production]
03:10 <tstarling> synchronized php-1.5/db.php 'depooling srv104 ES' [production]
03:03 <Tim> corruption found on cluster18, the copy source server (srv106) is missing lots of rows. Switched back to srv105/104. [production]
03:02 <tstarling> synchronized php-1.5/db.php [production]
02:50 <tstarling> synchronized php-1.5/includes/Revision.php 'reverted profiling and logging hacks' [production]
02:40 <Tim> depooled ms2 ex-fedora instances and shut them down, it can be a backup for now [production]
02:38 <tstarling> synchronized php-1.5/db.php [production]
02:33 <Tim> deployed the new ms2/ms3 ex-fedora ES configuration [production]
02:32 <tstarling> synchronized php-1.5/db.php [production]
02:01 <Tim> set up ex-fedora mysql instances on both ms2 and ms3, controlled with /etc/init.d/mysql-ex-fedora [production]
01:04 <Tim> changed the main mysql instance on ms3 (rc1) to bind to a single IP address instead of * [production]
2009-04-21
19:41 <mark> Added grosley.wikimedia.org to local_domains list on grosley's exim.conf, and added appropriate aliases in /etc/aliases [production]
16:35 <Andrew> Re-ran rebuildTemplates.php, all seems well now [production]
16:30 <robh> synchronized php-1.5/mc-pmtpa.php 'syncing for fred' [production]
16:30 <root> synchronized php-1.5/mc-pmtpa.php 'swapping out srv88 for srv159 and srv90 for srv198' [production]
16:29 <andrew> synchronized php-1.5/mc-pmtpa.php 'Switched srv88 for srv159, srv90 for srv198 to fix down memcache nodes' [production]
16:18 <azafred> restarted memcached on srv96. Now responding. [production]
16:14 <Rob> Fred needs to start logging in as Fred and not as root, bad fred (see, it wasn't me this time, bwahahahahahaa) [production]
16:11 <Andrew> Fred fixed up some memcached nodes, but no joy with rebuildTemplates [production]
16:10 <root> synchronized php-1.5/mc-pmtpa.php 'swapping out down servers for active ones' [production]
16:09 <root> synchronized php-1.5/mc-pmtpa.php 'swapping out down servers for active ones' [production]
16:01 <Rob> srv137 read only, depooled in pybal for apache and rebooting. [production]
15:57 <root> synchronized php-1.5/mc-pmtpa.php 'swapping out down servers for active ones' [production]
14:34 <Andrew> rebuildTemplates.php appeared not to help, same problem as before (stopped after a few wikis). Possibly a dodgy memcache node. [production]
14:32 <Andrew> ran rebuildTemplates.php metawiki due to reports of <messagename> appearing in place of the central notice. [production]
05:04 <Andrew> Live-merged r49685, fix for unsuppression of usernames on unblock -- some usernames were left stuck suppressed if they were unblocked when the block suppressed their username [production]
05:03 <andrew> synchronized php-1.5/includes/specials/SpecialBlockip.php [production]
05:03 <andrew> synchronized php-1.5/includes/specials/SpecialIpblocklist.php [production]
01:34 <azafred> Made some improvements to spam handling. Bayes is in play and can learn from everybody what is spam and what is ham. Documentation to follow. [production]
2009-04-20
19:59 <Rob> Powering down srv67, srv85, srv88, srv90 due to temp warnings and bad fans. [production]
19:36 <Rob> updated mc-pmtpa.php to reflect the status of down or spare for the memcached servers. (lots more spares now) [production]
17:35 <azafred> restarted apache on srv217 [production]
17:34 <azafred> srv125 reinstall completed. [production]