2012-04-30 §
18:27 <Jeff_Green> power cycling aluminium which faceplanted [production]
18:22 <binasher> rebooting mw45 [production]
18:21 <notpeter> rebuilding db57 again, this time with more correct raid level! [production]
18:19 <logmsgbot_> asher synchronized wmf-config/db.php 'adding db59,60 to s1 with low weights' [production]
18:16 <paravoid> depooled & rebooting ssl1 [production]
18:09 <logmsgbot_> aaron rebuilt wikiversions.cdb and synchronized wikiversions files: Sanity run after script changes. [production]
18:00 <logmsgbot_> aaron synchronized multiversion [production]
17:58 <logmsgbot_> reedy synchronized php-1.20wmf1/includes/MagicWord.php 'https://gerrit.wikimedia.org/r/6135' [production]
17:44 <logmsgbot_> aaron synchronized wikiversions.cdb [production]
17:43 <AaronSchulz> updating multiversion code [production]
08:34 <mutante> reinstalling srv266 [production]
08:08 <mutante> upgraded mw1,mw2,mw35 [production]
07:59 <mutante> reinstalling srv206 [production]
07:50 <mutante> upgrading mw36 [production]
07:37 <apergos> powercycling srv266, had this message on mgmt console: Severity: Non Recoverable, SEL:CPU Machine Chk: Processor sensor, transition to non-recoverable was asserted [production]
07:22 <mutante> installing upgrades on srv212 [production]
07:19 <apergos> reinstalled srv284, seems to be up now [production]
07:17 <mutante> powercycled mw8 [production]
02:14 <logmsgbot_> LocalisationUpdate completed (1.20wmf1) at Mon Apr 30 02:13:59 UTC 2012 [production]
2012-04-29 §
20:13 <apergos> srv206 won't run puppet, see syslog, clearing out the yaml file didn't help, since it's not urgent I'm leaving it for tomorrow [production]
19:51 <Ryan_Lane> depooling ssl3004 [production]
19:51 <Ryan_Lane> removed the ipv6 addresses from maerlant and added them to ssl3001, then restarted nginx [production]
19:50 <Ryan_Lane> repooling ssl3001 [production]
19:46 <apergos> powercycled mw60, same reason as the rest [production]
19:13 <apergos> power cycled mw48 and mw52 (hung just like the others) [production]
18:05 <apergos> ssl3002 and 3003 were rebooted and are the entire ssl esams pool right now [production]
16:34 <apergos> powercycling the ssl300x.esams hosts. 212 days of uptime... (and 3001 had gone out to lunch) [production]
12:34 <mutante> and finally mw1, so just leaving mw1102 and mw60 for having other issues for a while (->Nagios) [production]
12:22 <mutante> check_all_memcached recovered, but still same treatment for mw10 and 11 (8 and 15h ago) [production]
12:07 <mutante> powercycling mw30 [production]
02:56 <paravoid> rebooting ssl2 (has 214 days uptime) [production]
02:47 <paravoid> powercycled ssl3 [production]
02:14 <logmsgbot_> LocalisationUpdate completed (1.20wmf1) at Sun Apr 29 02:13:58 UTC 2012 [production]
2012-04-28 §
22:53 <Reedy> Job queue logs on gdash seem to have stopped on the 26th... [production]
22:29 <logmsgbot_> reedy synchronized php-1.20wmf1/includes/EditPage.php 'https://gerrit.wikimedia.org/r/6088' [production]
21:52 <logmsgbot_> reedy synchronized wmf-config/CommonSettings.php [production]
21:51 <logmsgbot_> reedy synchronized php-1.20wmf1/extensions/cldr/LanguageNames.body.php [production]
21:12 <logmsgbot_> reedy synchronized php-1.20wmf1/extensions/cldr/LanguageNames.body.php [production]
21:10 <logmsgbot_> reedy synchronized php-1.20wmf1/extensions/cldr/LanguageNames.body.php [production]
21:09 <logmsgbot_> reedy synchronized common/php-1.20wmf1/extensions/cldr/LanguageNames.body.php 'more debugging' [production]
20:51 <logmsgbot_> reedy synchronized php-1.20wmf1/extensions/cldr/LanguageNames.body.php 'Add debugging' [production]
20:49 <logmsgbot_> reedy synchronized wmf-config/CommonSettings.php 'Add debuglog group for language code not being a string' [production]
19:04 <logmsgbot_> reedy synchronized php-1.20wmf1/includes/ExternalEdit.php 'https://gerrit.wikimedia.org/r/6077' [production]
19:03 <logmsgbot_> reedy synchronized php-1.20wmf1/includes/api/ApiParse.php 'https://gerrit.wikimedia.org/r/6076' [production]
02:24 <Ryan_Lane> all mediawiki boxes that have uptimes affected by the bug are being rebooted at 8 minute intervals [production]
02:14 <logmsgbot_> LocalisationUpdate completed (1.20wmf1) at Sat Apr 28 02:14:14 UTC 2012 [production]
01:33 <paravoid> powercycled mw29 [production]
01:21 <paravoid> powercycled mw38 [production]
00:17 <notpeter> db12 is sooooo sloooooow, starting innobackupex from db1017 to db60 for new s1 slave [production]
2012-04-27 §
22:15 <paravoid> upgraded ssl4 to nginx 0.7.65-5wmf1 and added it back to the pool [production]