2009-12-04
18:38 <Rob> rebooted srv144 and srv126 [production]
18:36 <Rob> srv245 package install failed. I do not have time to tinker with it while in the DC, I have other things that require my physical access to the machines. Leaving it alone for now to work on remotely. [production]
18:28 <Rob> srv245 OS installed, setting up wikimedia-task-appserver [production]
18:06 <Rob> srv245 was sitting idle with no OS, depooled from apaches. reinstalling system. [production]
17:57 <Rob> rebooted srv83 per fred [production]
17:35 <Fred> removed srv83 from the nodelist since it was causing ddsh to never finish executing. [production]
17:26 <Fred> fixed broken apache. Seems like there is a machine down that is preventing normal sync-file from finishing... Looking into it. [production]
16:50 <rainman-sr> stopped logging of search queries on searchidx1 until someone sets up proper log archiving to a different machine [production]
16:48 <rainman-sr> searchidx1 had full disk, freed some 100gb of space by deleting logs and stuff lying around [production]
16:14 <Rob> srv245 down and unresponsive, rebooting [production]
16:12 <Rob> sq43's replacement disk is also bad (talk about bad luck), placing rma with dell. system will remain powered down for now. [production]
15:55 <Rob> sq43 isn't seeing a replaced disk, rebooting and troubleshooting [production]
15:33 <domas> 'arcconf setcache 1 logicaldrive 0 roff ' - disabling any read caching on db11-db30 RAIDs [production]
15:13 <Rob> after tinkering with it with domas, it appears rebuild is indeed automatic. db21 rebuilding raid array [production]
15:09 <Rob> db21 bad disk swapped out, rebuild should be automatic [production]
14:57 <Rob> sq14 back up, rebuilding its cache [production]
14:54 <Rob> sq13 primary disk dead, out of warranty [production]
14:53 <Rob> swapping sdc in sq13 and sq14 to bring sq14 back online [production]
14:53 <Rob> sq14 disk sdc dead, out of warranty. [production]
05:18 <Tim> on fenari: running all pending renameUser jobs from enwiki [production]
03:37 <Tim> Around 03:12, accidentally renamed enwiki's job table and so renamed it back a second later. This caused all slaves to stop due to a replication bug. Fixed now. [production]
03:25 <Tim> testing fixJobQueueExplosion.php on commonswiki [production]
02:46 <Tim> srv156 not responding to ssh, trying reboot [production]
01:13 <Tim> restarting job runners [production]
01:13 <tstarling> synchronized php-1.5/includes/HTMLCacheUpdate.php 'patching out all category backlink updates, major bug causing job queue to stall' [production]
00:12 <Tim> granted access to root@fenari on all servers in the mysql node group [production]
2009-12-03
23:46 <catrope> synchronized php-1.5/wmf-config/InitialiseSettings.php 'Allow bcrats to add and remove new arbcom group on nlwiki' [production]
23:40 <RoanKattouw> Synced InitialiseSettings.php: allow bcrats to add and remove new arbcom group on nlwiki [production]
22:49 <RoanKattouw> Importing 365 images into Commons as User:GeographBot, requested by Multichill [production]
22:39 <RoanKattouw> Synced InitialiseSettings.php for bug 21238: self-removal of flood flag on plwiki [production]
22:33 <RoanKattouw> Synced InitialiseSettings.php for bugs 20775 and 21719. sync-file is stalling on what seems to be an unresponsive server [production]
21:35 <RoanKattouw> Running namespaceDupes on usabilitywiki for bug 21753 [production]
21:35 <RoanKattouw> catrope synchronized php-1.5/wmf-config/InitialiseSettings.php 'bug 21753 Fix Multimedia talk NS on usabilitywiki' [production]
04:20 <tfinc> synchronized php-1.5/extensions/ContributionReporting/ContributionTrackingStatistics_body.php 'fixing conversion rate bugs' [production]
2009-12-02
23:28 <midom> synchronized php-1.5/wmf-config/db.php 'reenabling db18 and db25, also, attempting to overwrite stale db.php copies' [production]
23:25 <Fred> massaged mc.php to retrieve working spare, and remove broken memcached nodes. all is now good in the land of memcache [production]
22:13 <mark> Recovered torrus from deadlock [production]
21:00 <Fred> rebooted srv194 (hung) [production]
20:48 <Rob> removed bayle and khaldun from dsh, both are in rack running wipe with network pulled [production]
20:38 <Fred> bart removed from nagios (well that sounds funny) [production]
20:36 <Rob> khaldun is down forever! decommissioned and running wipe in rack with the network pulled [production]
20:35 <Rob> isidore rebooted by accident due to power cable issues [production]
20:21 <Rob> srv136 crashed with temp warnings, going to decommission it, rebooting to wipe and remove network [production]
20:15 <Rob> bart decommissioned, unracked, wipe running on testbench with usbcdrom [production]
19:49 <Rob> decommissioned, unracked srv66, srv51, srv81, srv118 (previously removed from pybal) [production]
19:39 <Rob> decommissioned srv130, unracked [production]
19:20 <Rob> srv122 decommissioned, wiped, unracked [production]
18:19 <Rob> ms7/ms8 racked in sdtpa a2, network wired, dns setup, racktables updated, & LOM online [production]
18:18 <Rob> serial connection to ps1-a4-sdtpa returned to normal [production]
18:05 <Rob> ps1-a4-sdtpa temp losing its serial connection, stealing adapter to setup ms7/8 [production]