2009-12-06
01:20 <mark> Disabled xinetd and extdist crontab on zwinger [production]
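For reference, disabling a service and a root cron entry of this kind on a Debian-era host is typically a two-step affair; a minimal sketch, with the init-script name and the crontab-editing method being assumptions rather than details from the log:

    # stop xinetd now and keep it out of the boot sequence
    /etc/init.d/xinetd stop
    update-rc.d -f xinetd remove
    # comment out the extdist line in root's crontab
    crontab -l | sed 's/^\(.*extdist.*\)$/#\1/' | crontab -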
00:40 <mark> synchronized php-1.5/wmf-config/CommonSettings.php 'Moved svn-invoker (ExtensionDistributor) from zwinger to fenari' [production]
00:27 <mark> sq27 is flooding syslog; placed temporary firewall entry for syslog packets on nfs1 [production]
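A temporary block of syslog traffic from one noisy host is usually a single iptables rule; a sketch in which the source address (a placeholder below) and the UDP/514 syslog port are assumptions:

    SQ27_IP=192.0.2.27   # placeholder; sq27's real address is not in the log
    # drop syslog packets from sq27 at the top of the INPUT chain on nfs1
    iptables -I INPUT -s "$SQ27_IP" -p udp --dport 514 -j DROP
    # remove the same rule once sq27 is quiet again
    iptables -D INPUT -s "$SQ27_IP" -p udp --dport 514 -j DROP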
2009-12-05
03:26 <tfinc> synchronized php-1.5/extensions/ContributionReporting/ContributionStatistics_body.php 'picking up bugfix from r59753' [production]
00:46 <tfinc> synchronized php-1.5/wmf-config/CommonSettings.php 'adding CN Notice 22' [production]
00:44 <atglenn> started transfer of incremental via zfs send (~600GB?) from ms1 to a file on ms4, in prep for nc to ms7 later; running in screen as root on ms1 [production]
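The zfs-send-to-file step mentioned above follows a standard pattern; in this sketch the dataset, snapshot names, and target path on ms4 are all assumptions:

    # incremental stream between two snapshots, written to a file on ms4 over ssh
    zfs send -i export/upload@snap-prev export/upload@snap-today | \
        ssh ms4 'cat > /export/ms1-incremental.zfs'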
00:14 <Fred> synchronized php-1.5/wmf-config/InitialiseSettings.php 'changed logo for usabilitywiki.' [production]
00:11 <Fred> synchronized php-1.5/wmf-config/InitialiseSettings.php 'changed logo for usabilitywiki.' [production]
2009-12-04
23:30 <atglenn> started netcat of the bulk of the data from ms5 to ms7. running in screen as root on both hosts. [production]
23:21 <atglenn> started ncat of (small piece of) image data from ms5 to ms7, running in screen as root on both hosts [production]
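The usual netcat pattern for a bulk copy like the two entries above, with the port, directory paths, and tar framing all being assumptions (the log only says nc/ncat in a screen session was used):

    # on ms7 (receiver), inside screen; -p is traditional-netcat syntax
    nc -l -p 9000 | tar -C /export/upload -xpf -
    # on ms5 (sender), inside screen
    tar -C /export/upload -cpf - . | nc ms7 9000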
20:47 <Rob> which doesn't work, damn. [production]
20:47 <Rob> got sick of racktables.wikimedia.org not redirecting correctly, put in a rewrite for non-SSL connections to SSL [production]
20:24 <Fred> fixed nrpe on db20 and db7 [production]
20:13 <root> ran sync-common-all [production]
20:12 <Rob> running sync-common-all to update configuration for support of flaggedrevs on plwiktionary [production]
19:20 <Rob> srv144 removed from node groups & pybal, nagios resynced. [production]
19:19 <Rob> srv144 is out of warranty and rebooting randomly, decommissioning. [production]
19:05 <Fred> finished setup of srv245. [production]
19:02 <Rob> srv126 removed from node groups and lvs. nagios restarted to exclude it. [production]
19:01 <Rob> srv126 refuses to even POST when benched; out of warranty, slating for immediate decommissioning [production]
19:00 <Rob> srv144 reinstalling with a single hard disk, no more raid1 [production]
18:50 <Rob> swapped primary srv144 drive with old decommissioned spare. reinstalling OS, will reinstall packages and get online later. [production]
18:45 <Rob> sq22 back online, all drives nominal, rebuilding cache and ensuring it is in rotation [production]
18:41 <Rob> rebooted sq22 [production]
18:38 <Rob> rebooted srv144 and srv126 [production]
18:36 <Rob> srv245 package install failed. I do not have time to tinker with it while in the DC, I have other things that require my physical access to the machines. Leaving it alone for now to work on remotely. [production]
18:28 <Rob> srv245 OS installed, setting up wikimedia-task-appserver [production]
18:06 <Rob> srv245 was sitting idle with no OS, depooled from apaches. reinstalling system. [production]
17:57 <Rob> rebooted srv83 per fred [production]
17:35 <Fred> removed srv83 from the nodelist since it was causing ddsh to never finish executing. [production]
17:26 <Fred> fixed broken apache. Seems like there is a machine down that is preventing normal sync-file from finishing... Looking into it. [production]
16:50 <rainman-sr> stopped logging of search queries on searchidx1 until someone sets up proper log archiving to a different machine [production]
16:48 <rainman-sr> searchidx1 had a full disk; freed some 100GB of space by deleting logs and other stuff lying around [production]
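Freeing space in a hurry is usually a du/find exercise; a sketch in which the search log directory, file name pattern, and 30-day retention are assumptions:

    # largest consumers under the search tree, in KB
    du -xsk /a/search/* 2>/dev/null | sort -rn | head -20
    # delete old query logs
    find /a/search/log -name '*.log*' -mtime +30 -delete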
16:14 <Rob> srv245 down and unresponsive, rebooting [production]
16:12 <Rob> sq43's replacement disk is also bad (talk about bad luck), placing an RMA with Dell. System will remain powered down for now. [production]
15:55 <Rob> sq43 isn't seeing a replaced disk, rebooting and troubleshooting [production]
15:33 <domas> 'arcconf setcache 1 logicaldrive 0 roff ' - disabling any read caching on db11-db30 RAIDs [production]
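Applied across the whole range, that presumably amounts to a loop over ssh; a sketch assuming root ssh access and the same controller number (1) on every db host:

    for i in $(seq 11 30); do
        # 'roff' turns read caching off for logical drive 0 on controller 1
        ssh root@db$i 'arcconf setcache 1 logicaldrive 0 roff'
    done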
15:13 <Rob> after tinkering with it with domas, it appears rebuild is indeed automatic. db21 rebuilding raid array [production]
15:09 <Rob> db21 bad disk swapped out, rebuild should be automatic [production]
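Rebuild progress on these Adaptec controllers can be checked with arcconf; controller number 1 is an assumption carried over from the setcache command logged above:

    # logical-drive state should go from Degraded to Optimal once the rebuild finishes
    arcconf getconfig 1 ld
    # shows any running task (e.g. rebuild) and its percentage
    arcconf getstatus 1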
14:57 <Rob> sq14 back up, rebuilding its cache [production]
14:54 <Rob> sq13 primary disk dead, out of warranty [production]
14:53 <Rob> swapping sdc in sq13 and sq14 to bring sq14 back online [production]
14:53 <Rob> sq14 disk sdc dead, out of warranty. [production]
05:18 <Tim> on fenari: running all pending renameUser jobs from enwiki [production]
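Pending jobs of a single type are normally flushed with MediaWiki's runJobs.php maintenance script; a sketch in which the checkout path on fenari, the wiki-selection flag, and the 'renameUser' job type name are assumptions:

    cd /home/wikipedia/common/php-1.5
    php maintenance/runJobs.php --wiki=enwiki --type=renameUser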
03:37 <Tim> Around 03:12, accidentally renamed enwiki's job table, then renamed it back a second later. This caused all slaves to stop due to a replication bug. Fixed now. [production]
03:25 <Tim> testing fixJobQueueExplosion.php on commonswiki [production]
02:46 <Tim> srv156 not responding to ssh, trying reboot [production]
01:13 <Tim> restarting job runners [production]
01:13 <tstarling> synchronized php-1.5/includes/HTMLCacheUpdate.php 'patching out all category backlink updates, major bug causing job queue to stall' [production]
00:12 <Tim> granted access to root@fenari on all servers in the mysql node group [production]