2010-06-04
|
09:50 |
<tstarling> |
synchronizing Wikimedia installation... Revision: 66620 |
[production] |
09:50 |
<Tim> |
pushing out WikimediaMobile (r67331) in preparation for deployment on testwiki |
[production] |
08:44 |
<domas> |
decreased keepalivetimeout and timeout on mobile1 |
[production] |
08:35 |
<Tim> |
on mobile1: reduced max passenger pool size to 200. Domas and I think it's about right: it shouldn't exceed allowable memory and should give us close to 100% CPU. |
[production] |
08:26 |
<Tim> |
on mobile1: domas fixed file limit, now 50k |
[production] |
08:10 |
<Tim> |
increasing MaxClients on mobile1 to 1500 |
[production] |
05:01 |
<Fred> |
Added apache2.conf, memcached.conf to puppet recipe for mobile. |
[production] |
03:43 |
<jeluf> |
synchronized php-1.5/wmf-config/InitialiseSettings.php '23784 - Modify add/remove rights for bureaucrats on officewiki' |
[production] |
02:46 |
<Tim> |
mobile1: increased ServerLimit to 1500 and reduced MaxClients to 500 |
[production] |
02:35 |
<Tim> |
on mobile1: increased memcached memory limit from 64M to 5000M |
[production] |
02:15 |
<Tim> |
switched mobile1 over from apache2-mpm-worker to apache2-mpm-prefork (via puppet) |
[production] |
01:03 |
<Tim> |
set ganglia host_dmax to 1 day |
[production] |
2010-06-03
|
21:57 |
<Fred> |
mobile1 re-imaged and puppetized. Changed subnet for mobile1. Changed DNS for mobile1. m pointing to newly imaged mobile1 (until transition is completed) |
[production] |
20:55 |
<jeluf> |
synchronized php-1.5/wmf-config/InitialiseSettings.php '23689 - Enable Collection extension on Thai Wikipedia' |
[production] |
20:22 |
<AaronSchulz> |
deployed r67296 FlaggedRevs_alpha |
[production] |
20:21 |
<aaron> |
synchronizing Wikimedia installation... Revision: 66620 |
[production] |
19:39 |
<mark> |
Moved mobile1 switchport from vlan 101 to 100 |
[production] |
19:36 |
<mark> |
Reverted DNS change of mobile1, back to .157 |
[production] |
17:21 |
<Fred> |
mobile1 going to be unreachable while re-ip'ing |
[production] |
14:05 |
<midom> |
synchronized php-1.5/wmf-config/InitialiseSettings.php 'timezone change for bat-smg' |
[production] |
11:53 |
<mark> |
Made m.wikipedia.org CNAME m.wikipedia.org, m.wikipedia.org A to mobile1/2 in RR |
[production] |
10:57 |
<hcatlin> |
mobile2 has been rebuilt and is featuring the new apache/mobile stack taking 40% of all mobile traffic. pls help monitor on ganglia. |
[production] |
09:04 |
<Tim> |
cleaning COSS on sq45, resynced its configuration, will start squid when done |
[production] |
08:58 |
<Tim> |
kernel reports degraded RAID on sq33, sq34, sq35, sq37, sq38, sq40 |
[production] |
08:39 |
<Tim> |
checked all serial consoles, all nonresponsive, rebooted all |
[production] |
08:23 |
<Tim> |
sq33, sq34, sq35, sq37, sq38, sq40, sq45 have been down for 16-28 days, apparently for no good reason, can't find any log or DT entries. Will try restarts. |
[production] |
07:56 |
<Tim> |
added new squids to nagios |
[production] |
06:36 |
<Tim> |
cleaning cache directories on sq56 to avoid resurrection of expired content |
[production] |
06:35 |
<Tim> |
adding monitoring for rather important service IPs: upload.esams and text.esams |
[production] |
06:22 |
<Tim> |
sq56 not responding to ping or serial console (for 4 days), nothing in racadm getsel, rebooting |
[production] |
06:07 |
<tstarling> |
synchronized php-1.5/wmf-config/InitialiseSettings.php 'disabling ClickTracking due to CR r58099' |
[production] |
05:24 |
<Tim> |
started apache on srv216, was stopped for some reason |
[production] |
03:57 |
<Fred> |
shutting down mailman on list for a few minutes while exim and spamd catch up |
[production] |
01:42 |
<Tim> |
adding forward and reverse DNS for mobile.tesla.usability.wikimedia.org, 208.80.152.245 |
[production] |
2010-06-02
|
23:48 |
<Fred> |
re-imaging mobile2 again ;p |
[production] |
21:12 |
<robh> |
ran sync-common-all |
[production] |
20:57 |
<Rob> |
srv134 has temp errors and multiple bad fans, out of warranty, decommissioning and removing from nagios/lvs/dsh |
[production] |
20:52 |
<jeluf> |
synchronized php-1.5/wmf-config/InitialiseSettings.php '23756 - Enable "Rollbacker" group on itwiki' |
[production] |
20:44 |
<Rob> |
how many ops does it take to bring srv281 back online? one + a volunteer to point out he did it wrong. |
[production] |
20:42 |
<robh> |
ran sync-common-all |
[production] |
20:36 |
<robh> |
ran sync-common-all |
[production] |
20:26 |
<jeluf> |
synchronizing Wikimedia installation... Revision: 66620 |
[production] |
20:15 |
<Rob> |
srv281 back online and apache files updated |
[production] |
20:08 |
<Rob> |
lomaria is down, because it was throwing crap around and generally broken, so it's shut down now. |
[production] |
19:59 |
<Rob> |
had to swap power cable on msw-b5-sdtpa |
[production] |
14:02 |
<Rob> |
stay off db28, fixing its firmware. |
[production] |
13:44 |
<andrew> |
synchronized php-1.5/extensions/StrategyWiki/ActiveStrategy/ActiveStrategy_body.php |
[production] |
13:39 |
<andrew> |
synchronized php-1.5/extensions/StrategyWiki/ActiveStrategy/ActiveStrategy_body.php |
[production] |
13:38 |
<Tim> |
ArchRunner on lily was stuck in a busy loop of some kind, had to restart it with kill -9 |
[production] |
13:36 |
<andrew> |
synchronized php-1.5/extensions/StrategyWiki/ActiveStrategy/ActiveStrategy_body.php |
[production] |