2012-01-06
|
18:43 |
<maplebed> |
s4 database rotation complete. outage duration 36 minutes. |
[production] |
18:37 |
<maplebed> |
pushed out new db.php setting s4 to read-write |
[production] |
18:37 |
<ben> |
synchronized wmf-config/db.php |
[production] |
18:35 |
<maplebed> |
db31 made read-write as the new master for s4 |
[production] |
18:31 |
<maplebed> |
old master for s4 log file db22-bin.000106 log pos 631618956 |
[production] |
18:30 |
<maplebed> |
new master for s4: db31, log file db31-bin.000213 log pos is 205612709 |
[production] |
18:24 |
<asher> |
synchronized wmf-config/db.php 'setting s4 to read only, preparing to make db31 master' |
[production] |
18:22 |
<Reedy> |
Commons having db issues, db22 (s4 master) has a disk issue |
[production] |
16:02 |
<apergos> |
restarted lighttpd on dataset2 |
[production] |
16:01 |
<Reedy> |
HTTP server (lighttpd?) seems to be down on dataset2 |
[production] |
15:46 |
<RoanKattouw> |
Removing gs_* files in /tmp on srv220 that are >30 min old |
[production] |
15:44 |
<reedy> |
synchronized wmf-config/InitialiseSettings.php 'Bug 33556 - ArticleFeedback settings on Chinese wikipedia' |
[production] |
15:43 |
<RoanKattouw> |
Removed /tmp/mw-cache-1.17 and /tmp/mw-cache-1.17-test on srv220 |
[production] |
15:41 |
<Reedy> |
srv220 / is at 100% usage |
[production] |
15:41 |
<reedy> |
synchronized wmf-config/InitialiseSettings.php 'Bug 33556 - ArticleFeedback settings on Chinese wikipedia' |
[production] |
14:34 |
<mutante> |
saw the log about cp1043/44 being deliberately left broken, but requirement in varnish.pp also broke others, fixed on sq67,68,69 (gerrit change 1802) |
[production] |
02:01 |
<LocalisationUpdate> |
completed (1.18) at Fri Jan 6 02:05:01 UTC 2012 |
[production] |
01:25 |
<binasher> |
puppet is being deliberately left broken on cp1043 and 1044 until tomorrow |
[production] |
01:23 |
<binasher> |
backend varnish instance on cp1042 running 3.0.2 is in production for 1/3 of mobile requests |
[production] |
2012-01-05
|
22:15 |
<preilly> |
small fix for iPhone vary support |
[production] |
22:15 |
<preilly> |
synchronized php-1.18/extensions/MobileFrontend/MobileFrontend.php |
[production] |
21:39 |
<Ryan_Lane> |
rebooting virt1 |
[production] |
21:01 |
<reedy> |
synchronized wmf-config/CommonSettings.php 'wmgShortUrlPrefix' |
[production] |
21:01 |
<reedy> |
synchronized wmf-config/InitialiseSettings.php 'wmgShortUrlPrefix' |
[production] |
20:08 |
<Reedy> |
Created ShortUrl tables on testwiki |
[production] |
20:07 |
<reedy> |
synchronizing Wikimedia installation... : Update extensionmessages |
[production] |
20:05 |
<reedy> |
synchronized wmf-config/CommonSettings.php 'wmgUseShortUrl' |
[production] |
20:04 |
<reedy> |
synchronized wmf-config/InitialiseSettings.php 'wmgUseShortUrl' |
[production] |
20:02 |
<reedy> |
synchronized php-1.18/extensions/ShortUrl 'Pushing ShortUrl files out' |
[production] |
19:08 |
<notpeter> |
restarting dhcpd on brewster |
[production] |
18:45 |
<preilly> |
pushing fix for js error on production |
[production] |
18:45 |
<preilly> |
synchronized php-1.18/extensions/MobileFrontend/ApplicationTemplate.php |
[production] |
18:45 |
<preilly> |
synchronized php-1.18/extensions/MobileFrontend/javascripts/application.js |
[production] |
18:00 |
<mutante> |
tarin - added "#includedir /etc/sudoers.d" to sudo config, needs to read /etc/sudoers.d/nrpe for Nagios RAID check |
[production] |
17:49 |
<logmsgbot_> |
hashar: gallium: cleaned /tmp . Our test suites leak a large amount of files :D |
[production] |
17:49 |
<^demon> |
removed chuck norris plugin from jenkins, restarted |
[production] |
16:48 |
<mutante> |
payments4 - 25 running nginx procs cause a warning - is that normal, so we just raise the limit? |
[production] |
16:15 |
<mutante> |
people claim it was "completely resolved with 2.6.38-10 backport from PPA" (add-apt-repository ppa:kernel-ppa/ppa ...). wanna try that? (or just reboot ms1002 pls) |
[production] |
15:45 |
<mutante> |
ms1002 - kswapd 100% CPU - but no swap used and free memory left - this looks like https://bugs.launchpad.net/ubuntu/+bug/721896 again |
[production] |
15:39 |
<mutante> |
Nagios check_ntp does stuff like: overall average offset: 0 -> NTP OK: Offset unknown| -> NTP CRITICAL: Offset unknown (even though this bug was supposed to be fixed in a version before the one we use)..sigh |
[production] |
15:14 |
<mutante> |
lvs1004 - puppet hadn't run for 12 hours, looked stuck, "already in progress" on every run. rm /var/lib/puppet/state/puppetdlock, restarted puppet agent, finished fine in a few seconds. maybe puppet bug 2888, 5246 or related |
[production] |
14:57 |
<mutante> |
magnesium - memcached runs on default port 11211, but we run all the others on 11000, this causes Nagios CRIT. Is it supposed to run here? (was also on -l 127.0.0.1 only, but init script starts it on all) |
[production] |
14:55 |
<Jeff_Green> |
searchidx1 /a reached 100%, did the "space issues" maintenance procedure from wikitech search documentation |
[production] |
14:39 |
<mutante> |
same on srv193 |
[production] |
14:35 |
<mutante> |
srv290 - before restart memcached was running with -m 64 and -l 127.0.0.1 for some reason, causing Nagios CRIT, now it looks like others and recovered |
[production] |
14:32 |
<mutante> |
restarting memcached on srv290 |
[production] |
02:01 |
<LocalisationUpdate> |
completed (1.18) at Thu Jan 5 02:05:03 UTC 2012 |
[production] |
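
The 2012-01-06 15:46 entry describes removing gs_* files in /tmp on srv220 that were more than 30 minutes old. The log does not record the command that was used, so the following is only a minimal sketch of that kind of cleanup. The function name `remove_stale_files` and its parameters are hypothetical, and it assumes file age is judged by modification time.

```python
import time
from pathlib import Path


def remove_stale_files(directory, pattern="gs_*", max_age_minutes=30,
                       now=None, dry_run=False):
    """Delete regular files in `directory` that match `pattern` and whose
    modification time is older than `max_age_minutes`.

    Hypothetical helper, not the command from the log. Returns the sorted
    names of the files that were (or, with dry_run, would be) removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_minutes * 60
    removed = []
    # glob() here is non-recursive: only direct children of `directory`.
    for path in sorted(Path(directory).glob(pattern)):
        # Skip directories and symlinks; only plain files are candidates.
        if path.is_symlink() or not path.is_file():
            continue
        if path.stat().st_mtime < cutoff:
            if not dry_run:
                path.unlink()
            removed.append(path.name)
    return removed
```

Calling `remove_stale_files("/tmp", dry_run=True)` first lists the candidates without deleting anything, which is a cheap safety check on a host whose root filesystem is already full.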