2011-07-06
|
22:36 |
<RobH> |
wmf_ops: you can disregard the humidity alarms for eqiad that are spamming alerts to email. eq confirms no humidity issue on site and I will investigate the actual sensors this Friday |
[production] |
22:21 |
<Ryan_Lane> |
repooling srv169, it was missing the wikimedia-lvs-realserver package. fixed in puppet |
[production] |
22:04 |
<Ryan_Lane> |
rebooting srv169 |
[production] |
21:59 |
<Ryan_Lane> |
depooling srv169 |
[production] |
21:59 |
<Ryan_Lane> |
sync'd srv169, repooling to test |
[production] |
21:54 |
<Ryan_Lane> |
depooling srv169 |
[production] |
21:51 |
<Ryan_Lane> |
restarted apache on singer |
[production] |
21:46 |
<laner> |
synchronized php-1.17/wmf-config/db.php 'depooling srv154, repooling srv178' |
[production] |
21:30 |
<pdhanda> |
ran sync-common-all 'Synced to r91606 for ArticleFeedback' |
[production] |
21:27 |
<Ryan_Lane> |
setting proxy setting for secure back to original setting. removed ~ files from sites-enabled |
[production] |
21:17 |
<Ryan_Lane> |
putting retry=3 back, and adding a timeout of 15 seconds to secure |
[production] |
21:16 |
<Ryan_Lane> |
removed retry=3 from ProxyPass directive for secure. 3 seconds really isn't enough for this service... |
[production] |
21:06 |
<RobH> |
running puppet on spence, this is going to take forever. |
[production] |
21:05 |
<Ryan_Lane> |
restarting apache on singer |
[production] |
19:37 |
<mark> |
Added DNS entries for cr1-sdtpa and cr2-pmtpa |
[production] |
19:25 |
<hashar:> |
hexmode raised a user issue with blocking. It is a lock wait timeout happening from time to time on enwiki. 30 occurrences in dberror.log for Block::purgeExpired. Could not reproduce it so I am assuming it was a temporary issue. |
[production] |
19:15 |
<hashar:> |
srv154 seems unreachable. dberror.log is spammed with "Error connecting to <srv154 IP>" |
[production] |
19:13 |
<RobH> |
added webmaster@ to other top level domain mail routing to forward to the wikimedia.org webmaster for google securebrowsing stuff per RT#1122 |
[production] |
18:08 |
<pdhanda> |
running maintenance/cleanupTitles.php on commonswiki |
[production] |
17:51 |
<pdhanda> |
Running maintenance/namespaceDupesWT.php on commonswiki |
[production] |
17:12 |
<RobH> |
srv169 successfully back in service, tests fine and has all updated files, lvs3 updated to include it in pool |
[production] |
17:11 |
<RobH> |
returning srv169 into service |
[production] |
15:37 |
<mark> |
Removed ms5:/etc/cron.d/mdadm |
[production] |
15:37 |
<mark> |
Stopped MD raid resync on ms5 |
[production] |
15:28 |
<RobH> |
search18 booted back up successfully |
[production] |
15:25 |
<RobH> |
api lag issues known due to search server failure, being worked presently |
[production] |
15:24 |
<RobH> |
search18 SAS configuration BIOS confirms both disks are still in a non-degraded (according to it) mirror |
[production] |
15:23 |
<RobH> |
search18 randomly rebooted after checking disks before the login prompt |
[production] |
15:19 |
<RobH> |
rebooting search18 |
[production] |
15:14 |
<RobH> |
search18 appears to be completely offline, investigating lom logs before rebooting. |
[production] |
15:12 |
<RobH> |
search18 offline, logging into mgmt to check it out |
[production] |
15:01 |
<RobH> |
eqiad humidity levels ticket dispatched for fulfillment |
[production] |
14:37 |
<mark> |
Paused rsyncs on ms5 |
[production] |
14:04 |
<mark> |
Powercycled sq36 |
[production] |
13:18 |
<^demon|away> |
fixed permissions on svn c/o on ci.tesla, ran svn cleanup. cruise control still not pleased and yelling about locks |
[production] |
13:16 |
<mark> |
Upgrading firmware of scs-a1-sdtpa |
[production] |
12:51 |
<mark> |
csw5-pmtpa crashed and reloaded |
[production] |
11:53 |
<mark> |
Upgrading firmware of scs-c1-pmtpa |
[production] |
2011-07-05
|
23:53 |
<^demon> |
scratch that....ssh just seems to have been rather slow in getting its act together. ci.tesla is just fine now |
[production] |
23:50 |
<^demon> |
well now I've locked myself out of ci.tesla. Seems it doesn't start ssh on reboot...what a silly thing to do |
[production] |
23:47 |
<^demon> |
rebooting ci.tesla since it was horribly hung up on the latest build--was it really stuck for the past 24hrs? |
[production] |
23:35 |
<reedy> |
synchronized php-1.17/resources/mediawiki/mediawiki.js 'r91505' |
[production] |
21:28 |
<pdhanda> |
ran sync-common-all 'Synced to r91494 for WikiLove' |
[production] |
21:10 |
<pdhanda> |
synchronized php-1.17/resources/jquery.ui/themes/vector/jquery.ui.button.css 'Synced to r 91493.' |
[production] |
21:00 |
<pdhanda> |
synchronized php-1.17/resources/jquery.ui/themes/vector/jquery.ui.button.css 'Synced to r 91490.' |
[production] |
19:53 |
<reedy> |
synchronized php-1.17/includes/api/ApiQuery.php 'r91479' |
[production] |
19:07 |
<catrope> |
synchronized php-1.17/extensions/Vector/modules/ext.vector.collapsibleTabs.js 'r91476' |
[production] |
19:07 |
<catrope> |
synchronized php-1.17/extensions/Vector/modules/ext.vector.simpleSearch.js 'r91476' |
[production] |
16:49 |
<RoanKattouw> |
Short CPU spike on the Apaches, approx 16:45-16:50 UTC. Things seem to be recovering now |
[production] |
16:04 |
<mark> |
Put OSPF/OSPFv3 on csw1-sdtpa:e14/1 in active mode again; appears stable so far |
[production] |