2011-04-03
|
04:26 |
<apergos> |
turned off hourly snaps on ms4, turned back on webserver and nfs |
[production] |
04:09 |
<apergos> |
rebooted ms4, shut down webserver and nfsd temporarily for testing |
[production] |
02:58 |
<apergos> |
still looking at kernel memory issues, still rebooting, ryan should be here in a few minutes to help out |
[production] |
02:03 |
<apergos> |
a solaris advisor... also have zfs arc cache max to 2g which is ridiculously low but wtf right? |
[production] |
02:02 |
<apergos> |
set tcp_time_wait_interval to 10000 at suggestion of |
[production] |
01:37 |
<apergos> |
lowered zfs arc max to 2g (someone should reset this later)... will take effect on next reboot |
[production] |
00:29 |
<apergos> |
rebooting with the new zfs arc cache max value, which will reduce the min value as well... dunno if this will give us enough breathing room or not |
[production] |
00:24 |
<apergos> |
set zfs arc cache to ridiculously low value of 4gb, since when it's healthy it's using much less than that (1gb), this will take effect on reboot |
[production] |
00:22 |
<Reedy> |
Still experiencing MS4 issues, thumb service is likely to be problematic for most users |
[production] |
2011-04-02
|
23:47 |
<apergos> |
rebooting ms4 from serial console; it was out to lunch and took the renderers down too |
[production] |
18:42 |
<catrope> |
synchronized php-1.17/wmf-config/CommonSettings.php 'Per NeilK, change Category:Uploaded_by_UploadWizard to Category:Uploaded_with_UploadWizard' |
[production] |
17:59 |
<mark> |
Upgrading varnish to 2.1.5 |
[production] |
17:14 |
<demon> |
synchronized php-1.17/includes/filerepo/LocalFile.php 'r85200' |
[production] |
14:19 |
<mark> |
Implemented CARP weights for distant CARP parents on squid configurator (used to be all equal before) |
[production] |
11:36 |
<mark> |
Created btrfs filesystem on ms6, striped (raid10 style) over 46 devices - very experimental |
[production] |
09:50 |
<mark> |
Reinstalling ms6 with Ubuntu 10.04 |
[production] |
09:50 |
<mark> |
Fixed torrus again |
[production] |
06:02 |
<mark> |
!wikipedia The image thumbnail servers appear stable now |
[production] |
04:59 |
<mark> |
Increased nginx worker processes from 1 to 4, set file limit to 30k |
[production] |
04:40 |
<mark> |
!wikipedia Image Thumbnail server outage, it's being worked on |
[production] |
04:34 |
<mark> |
Power cycling ms4 again |
[production] |
04:06 |
<mark> |
Power cycled ms4 again |
[production] |
04:02 |
<mark> |
Removed ms4 from pmtpa.upload config, sending all thumbs to ms5 |
[production] |
03:47 |
<mark> |
Restarted rsyncs ms4->ms5 |
[production] |
03:25 |
<Ryan_Lane> |
powercycling ms4 again |
[production] |
02:59 |
<Ryan_Lane> |
rebooting ms4 |
[production] |
02:46 |
<Ryan_Lane> |
seems ms4 is totally dead, powercycling it |
[production] |
01:09 |
<Ryan_Lane> |
installing python-pyinotify on spence for an updated ircecho |
[production] |
2011-04-01
|
21:35 |
<Ryan_Lane> |
purging some binlogs on db9 to free up space |
[production] |
21:35 |
<RobH> |
bugzilla now version 4 |
[production] |
21:31 |
<RobH> |
taking down bugzilla for a quick upgrade |
[production] |
18:48 |
<Ryan_Lane> |
added ctwoo, brion, py, and reedy to the engineering alias |
[production] |
18:36 |
<mark> |
Deployed ms5.pmtpa.wmnet as a special 'apache' for pmtpa squid uploads... now serving a small portion of commons thumbs |
[production] |
18:11 |
<RobH> |
bugzilla back online, CRproxy was affected, and repaired |
[production] |
17:30 |
<RobH> |
bugzilla.wikimedia.org going offline for database backup and upgrade |
[production] |
17:13 |
<RobH> |
beginning upgrade process for bugzilla, its availability will be in question during this time |
[production] |
16:59 |
<mark> |
Turned off Etag in the webserver7 configuration (/opt/webserver7/https-ms4/config/obj.conf) on ms4 |
[production] |
16:50 |
<notpeter> |
rm-ing old binlogs on db9 after confirming that there is no slave lag on db10 |
[production] |
15:53 |
<mark> |
Puppetised nginx and htcp purger setup on ms5 |
[production] |
11:36 |
<apergos> |
restarted lighty on dataset2 (but why did it die?) |
[production] |
00:06 |
<tstarling> |
synchronized php-1.17/includes/specials/SpecialImport.php 'r85099' |
[production] |
2011-03-31
|
20:52 |
<notpeter> |
also, restarting t3h nagioz |
[production] |
20:51 |
<notpeter> |
added self to /etc/nagios/contacts.cfg |
[production] |
20:32 |
<awjrichards> |
synchronized php-1.17/extensions/UserDailyContribs/api/ApiUserDailyContribs.php 'r85088 updates to UserDailyContribs API to retrieve past year edit count' |
[production] |
18:58 |
<catrope> |
synchronized php-1.17/includes/specials/SpecialImport.php 'r85078' |
[production] |
14:58 |
<aaron> |
synchronized php-1.17/wmf-config/flaggedrevs.php |
[production] |
14:58 |
<aaron> |
synchronized php-1.17/wmf-config/InitialiseSettings.php 'same cleanup; ptwikinews (just did sqwiki)' |
[production] |
14:49 |
<aaron> |
synchronized php-1.17/wmf-config/flaggedrevs.php |
[production] |
14:48 |
<aaron> |
synchronized php-1.17/wmf-config/InitialiseSettings.php 'moved add/removegroups settings from flaggedrevs.php to here' |
[production] |
14:44 |
<mark> |
power cycled singer |
[production] |