2011-12-17
|
22:49 |
<RobH> |
Anytime db9 hits 98 or 99% someone needs to remove binlogs to bring it back down to 94 or 95% |
[production] |
22:48 |
<RobH> |
removed older binlogs on db9 again to kick it back to a bit more free space to last the weekend. |
[production] |
17:53 |
<catrope> |
synchronized wmf-config/CommonSettings.php 'Remove SVN dir setting, this is now passed in on the command line' |
[production] |
16:43 |
<RoanKattouw> |
Found out why LocalisationUpdate was failing. Would have been fixed already if puppet had been running on fenari, but it's throwing errors. See [[rev:1617|r1617]] and my comment on [[rev:1558|r1558]] |
[production] |
14:32 |
<apergos> |
thumb cleaner to bed for the night... about 2 days left I think |
[production] |
07:25 |
<apergos> |
thumb cleaner started up for the day |
[production] |
01:57 |
<LocalisationUpdate> |
failed (1.18) at Sat Dec 17 02:00:18 UTC 2011 |
[production] |
2011-12-16
|
22:30 |
<RobH> |
reclaimed space on db9, restarted mysql, services seem to be recovering |
[production] |
22:24 |
<maplebed> |
restarting mysql on db9; brief downtime for a number of apps (bugzilla, blog, etc.) expected. |
[production] |
22:03 |
<RobH> |
db9 space reclaimed back to 94% full, related services should start recovering |
[production] |
21:57 |
<RobH> |
db9 disk full, related services are messing up, fixing |
[production] |
21:56 |
<RobH> |
kicking apache for bz related issues on kaulen |
[production] |
19:14 |
<catrope> |
synchronized php-1.18/resources/startup.js 'touch' |
[production] |
19:07 |
<catrope> |
synchronized wmf-config/InitialiseSettings.php 'Set AFTv4 lottery odds to 100% on en_labswikimedia' |
[production] |
18:48 |
<LeslieCarr> |
removed the ssl* yaml logs on stafford to fix the puppet not running error |
[production] |
16:13 |
<apergos> |
thumb cleaner to bed for the night. definitely need an alarm clock for this... good thing it's only got about 4 days of backlog left |
[production] |
15:41 |
<RobH> |
es1002 being actively worked on for hdd controller testing |
[production] |
15:39 |
<RobH> |
lvs1003 disk dead per RT 1549, will troubleshoot on site later today or Monday |
[production] |
15:32 |
<RobH> |
lvs1003 unresponsive to serial console, rebooting |
[production] |
15:18 |
<RobH> |
reinstalling dataset1 |
[production] |
14:45 |
<mutante> |
puppet was broken on all servers including "nrpe" due to a package conflict with nagios-plugins-basic, which I added to base; revert+fix |
[production] |
13:29 |
<RoanKattouw> |
Dropping and recreating AFTv5 tables on en_labswikimedia and enwiki |
[production] |
13:26 |
<catrope> |
synchronized php-1.18/extensions/ArticleFeedbackv5/ 'Updating to trunk state' |
[production] |
13:25 |
<mutante> |
tweaked Nagios earlier today: external command_check_interval & event_broker_options (see comments in gerrit Id3b4a458) |
[production] |
13:01 |
<mark> |
Found lvs5 and lvs6 with offload-gro enabled, even though it's set disabled in /etc/network/interfaces... corrected |
[production] |
09:21 |
<apergos> |
restarted lighttpd on ds2, it had stopped (and why didn't nagios tell us?) |
[production] |
08:38 |
<mutante> |
spence - had killed additional notifications.cgi and history.cgi procs, waited 5 minutes, load went down a lot, restarting nagios |
[production] |
08:23 |
<mutante> |
spence - almost unusable, Nagios notifications.cgi and history.cgi use a lot of memory, stopping Nagios, watching swap |
[production] |
08:15 |
<mutante> |
spence slow again, side-note: tried to use "sar" to investigate but "Please check if data collecting is enabled in /etc/default/sysstat" (want to?) |
[production] |
07:54 |
<nikerabbit> |
synchronized php-1.18/extensions/WebFonts/resources/ext.webfonts.js 'JS fix [[rev:106418|r106418]]' |
[production] |
07:09 |
<apergos> |
thumbs cleaner awake for the day |
[production] |
01:57 |
<LocalisationUpdate> |
failed (1.18) at Fri Dec 16 02:00:14 UTC 2011 |
[production] |
2011-12-15
|
23:19 |
<LeslieCarr> |
pushing rule to planet.wikimedia.org which should redirect all https to http |
[production] |
23:00 |
<LeslieCarr> |
puppetized planet.wikimedia.org on singer |
[production] |
22:41 |
<LeslieCarr> |
removing https support from planet.wikimedia.org |
[production] |
21:43 |
<awjrichards> |
synchronized php/extensions/LandingCheck/SpecialLandingCheck.php '[[rev:106377|r106377]]' |
[production] |
21:42 |
<awjrichards> |
synchronized php/extensions/LandingCheck/LandingCheck.php '[[rev:106377|r106377]]' |
[production] |
21:34 |
<awjrichards> |
synchronized php/extensions/ContributionTracking/ContributionTracking_body.php '[[rev:106375|r106375]]' |
[production] |
21:34 |
<awjrichards> |
synchronized php/extensions/ContributionTracking/ContributionTracking.processor.php '[[rev:106375|r106375]]' |
[production] |
21:33 |
<awjrichards> |
synchronized php/extensions/ContributionTracking/ContributionTracking.php '[[rev:106375|r106375]]' |
[production] |
20:21 |
<K4-713> |
synchronized payments cluster to [[rev:106360|r106360]] |
[production] |