2012-04-29
|
20:13 |
<apergos> |
srv206 won't run puppet (see syslog); clearing out the yaml file didn't help. Since it's not urgent, I'm leaving it for tomorrow |
[production] |
19:51 |
<Ryan_Lane> |
depooling ssl3004 |
[production] |
19:51 |
<Ryan_Lane> |
removed the ipv6 addresses from maerlant and added them to ssl3001, then restarted nginx |
[production] |
19:50 |
<Ryan_Lane> |
repooling ssl3001 |
[production] |
19:46 |
<apergos> |
powercycled mw60, same reason as the rest |
[production] |
19:13 |
<apergos> |
power cycled mw48 and mw52 (hung just like the others) |
[production] |
18:05 |
<apergos> |
ssl3002 and 3003 were rebooted and are the entire ssl esams pool right now |
[production] |
16:34 |
<apergos> |
powercycling the ssl300x.esams hosts. 212 days of uptime... (and 3001 had gone out to lunch) |
[production] |
12:34 |
<mutante> |
and finally mw1, so just leaving mw1102 and mw60 for having other issues for a while (->Nagios) |
[production] |
12:22 |
<mutante> |
check_all_memcached recovered, but still same treatment for mw10 and 11 (8 and 15h ago) |
[production] |
12:07 |
<mutante> |
powercycling mw30 |
[production] |
02:56 |
<paravoid> |
rebooting ssl2 (has 214 days uptime) |
[production] |
02:47 |
<paravoid> |
powercycled ssl3 |
[production] |
02:14 |
<logmsgbot_> |
LocalisationUpdate completed (1.20wmf1) at Sun Apr 29 02:13:58 UTC 2012 |
[production] |
2012-04-28
|
22:53 |
<Reedy> |
Job queue logs on gdash seem to have stopped on the 26th... |
[production] |
22:29 |
<logmsgbot_> |
reedy synchronized php-1.20wmf1/includes/EditPage.php 'https://gerrit.wikimedia.org/r/6088' |
[production] |
21:52 |
<logmsgbot_> |
reedy synchronized wmf-config/CommonSettings.php |
[production] |
21:51 |
<logmsgbot_> |
reedy synchronized php-1.20wmf1/extensions/cldr/LanguageNames.body.php |
[production] |
21:12 |
<logmsgbot_> |
reedy synchronized php-1.20wmf1/extensions/cldr/LanguageNames.body.php |
[production] |
21:10 |
<logmsgbot_> |
reedy synchronized php-1.20wmf1/extensions/cldr/LanguageNames.body.php |
[production] |
21:09 |
<logmsgbot_> |
reedy synchronized common/php-1.20wmf1/extensions/cldr/LanguageNames.body.php 'more debugging' |
[production] |
20:51 |
<logmsgbot_> |
reedy synchronized php-1.20wmf1/extensions/cldr/LanguageNames.body.php 'Add debugging' |
[production] |
20:49 |
<logmsgbot_> |
reedy synchronized wmf-config/CommonSettings.php 'Add debuglog group for language code not being a string' |
[production] |
19:04 |
<logmsgbot_> |
reedy synchronized php-1.20wmf1/includes/ExternalEdit.php 'https://gerrit.wikimedia.org/r/6077' |
[production] |
19:03 |
<logmsgbot_> |
reedy synchronized php-1.20wmf1/includes/api/ApiParse.php 'https://gerrit.wikimedia.org/r/6076' |
[production] |
02:24 |
<Ryan_Lane> |
all mediawiki boxes that have uptimes affected by the bug are being rebooted at 8 minute intervals |
[production] |
02:14 |
<logmsgbot_> |
LocalisationUpdate completed (1.20wmf1) at Sat Apr 28 02:14:14 UTC 2012 |
[production] |
01:33 |
<paravoid> |
powercycled mw29 |
[production] |
01:21 |
<paravoid> |
powercycled mw38 |
[production] |
00:17 |
<notpeter> |
db12 is sooooo sloooooow, starting innobackupex from db1017 to db60 for new s1 slave |
[production] |
2012-04-27
|
22:15 |
<paravoid> |
upgraded ssl4 to nginx 0.7.65-5wmf1 and added it back to the pool |
[production] |
21:45 |
<paravoid> |
rebooting ssl4 after upgrading (incl. a kernel update) |
[production] |
20:00 |
<notpeter> |
starting innobackupex from db1040 to db1022 for new eqiad s6 snapshot slave, again |
[production] |
19:59 |
<notpeter> |
starting innobackupex from db12 to db60 for new s1 slave, again |
[production] |
19:58 |
<notpeter> |
starting innobackupex from db1017 to db59 for new s1 slave, again |
[production] |
19:49 |
<paravoid> |
de-pooling ssl4 |
[production] |
19:30 |
<mutante> |
test - added new gerrit interwiki prefix for SAL/wikitech - [[gerrit:6002]] |
[production] |
19:14 |
<logmsgbot_> |
catrope synchronized wmf-config/CommonSettings.php 'Fix rights for afttest and afttest-hide groups' |
[production] |
18:25 |
<logmsgbot_> |
reedy synchronized wmf-config/CommonSettings.php 'Cleanup enotif related settings' |
[production] |
18:24 |
<logmsgbot_> |
reedy synchronized wmf-config/InitialiseSettings.php 'Set wgEnotifWatchlist to true for all wikis. Leaving wgShowUpdatedMarker set to false for all the big wikis' |
[production] |
16:50 |
<logmsgbot_> |
reedy synchronized wmf-config/CommonSettings.php 'Simplify enotif code' |
[production] |
16:45 |
<notpeter> |
starting innobackupex from db1040 to db1022 for new eqiad s6 snapshot slave |
[production] |
16:45 |
<logmsgbot_> |
reedy synchronized wmf-config/InitialiseSettings.php 'wgEnotifWatchlist defaulting to true. Big wikis explicitly set to false' |
[production] |
12:25 |
<mutante> |
fixing integration.mw testswarm and applying fixed erb template by hashar |
[production] |
04:35 |
<Tim> |
added an account for myself on observium |
[production] |
04:22 |
<logmsgbot_> |
tstarling synchronized wmf-config/mc.php 'increased wgMemCachedTimeout from 500ms to 3000ms for [[bugzilla:35900|bug 35900]]' |
[production] |
02:13 |
<logmsgbot_> |
LocalisationUpdate completed (1.20wmf1) at Fri Apr 27 02:13:51 UTC 2012 |
[production] |
00:12 |
<Ryan_Lane> |
upgrading gluster on all instances |
[production] |
00:09 |
<Ryan_Lane> |
upgrading gluster on labstore1-4 |
[production] |