|
2011-07-06
|
| 23:42 |
<Ryan_Lane> |
powercycling srv154 |
[production] |
| 23:40 |
<Ryan_Lane> |
powercycling srv276, it's dead |
[production] |
| 22:56 |
<mark> |
Setup cr1-sdtpa with initial config; connected to csw5-pmtpa (via L2 csw1-sdtpa); OSPF up |
[production] |
| 22:36 |
<RobH> |
wmf_ops: you can disregard the humidity alarms for eqiad that are spamming alerts to email. eq confirms no humidity issue on site and I will investigate the actual sensors this Friday |
[production] |
| 22:21 |
<Ryan_Lane> |
repooling srv169, it was missing the wikimedia-lvs-realserver package. fixed in puppet |
[production] |
| 22:04 |
<Ryan_Lane> |
rebooting srv169 |
[production] |
| 21:59 |
<Ryan_Lane> |
depooling srv169 |
[production] |
| 21:59 |
<Ryan_Lane> |
sync'd srv169, repooling to test |
[production] |
| 21:54 |
<Ryan_Lane> |
depooling srv169 |
[production] |
| 21:51 |
<Ryan_Lane> |
restarted apache on singer |
[production] |
| 21:46 |
<laner> |
synchronized php-1.17/wmf-config/db.php 'depooling srv154, repooling srv178' |
[production] |
| 21:30 |
<pdhanda> |
ran sync-common-all 'Synced to r91606 for ArticleFeedback' |
[production] |
| 21:27 |
<Ryan_Lane> |
setting proxy setting for secure back to original setting. removed ~ files from sites-enabled |
[production] |
| 21:17 |
<Ryan_Lane> |
putting retry=3 back, and adding a timeout of 15 seconds to secure |
[production] |
| 21:16 |
<Ryan_Lane> |
removed retry=3 from ProxyPass directive for secure. 3 seconds really isn't enough for this service... |
[production] |
| 21:06 |
<RobH> |
running puppet on spence, this is going to take forever. |
[production] |
| 21:05 |
<Ryan_Lane> |
restarting apache on singer |
[production] |
| 19:37 |
<mark> |
Added DNS entries for cr1-sdtpa and cr2-pmtpa |
[production] |
| 19:25 |
<hashar:> |
hexmode raised a user issue with blocking. It is a lock wait timeout happening from time to time on enwiki. 30 occurrences in dberror.log for Block::purgeExpired. Could not reproduce it so I am assuming it was a temporary issue. |
[production] |
| 19:15 |
<hashar:> |
srv154 seems unreachable. dberror.log is spammed with "Error connecting to <srv154 IP>" |
[production] |
| 19:13 |
<RobH> |
added webmaster@ to other top level domain mail routing to forward to the wikimedia.org webmaster for google securebrowsing stuff per RT#1122 |
[production] |
| 18:08 |
<pdhanda> |
running maintenance/cleanupTitles.php on commonswiki |
[production] |
| 17:51 |
<pdhanda> |
Running maintenances/namespaceDupesWT.php on commonswiki |
[production] |
| 17:12 |
<RobH> |
srv169 successfully back in service, tests fine and has all updated files, lvs3 updated to include it in pool |
[production] |
| 17:11 |
<RobH> |
returning srv169 into service |
[production] |
| 15:37 |
<mark> |
Removed ms5:/etc/cron.d/mdadm |
[production] |
| 15:37 |
<mark> |
Stopped MD raid resync on ms5 |
[production] |
| 15:28 |
<RobH> |
search18 booted back up successfully |
[production] |
| 15:25 |
<RobH> |
api lag issues known due to search server failure, being worked presently |
[production] |
| 15:24 |
<RobH> |
search18 sas configuration bios confirms both disks are still in a non-degraded (according to it) mirror |
[production] |
| 15:23 |
<RobH> |
search18 randomly rebooted after checking disks before the login prompt |
[production] |
| 15:19 |
<RobH> |
rebooting search18 |
[production] |
| 15:14 |
<RobH> |
search18 appears to be completely offline, investigating lom logs before rebooting. |
[production] |
| 15:12 |
<RobH> |
search18 offline, logging into mgmt to check it out |
[production] |
| 15:01 |
<RobH> |
eqiad humidity levels ticket dispatched for fulfillment |
[production] |
| 14:37 |
<mark> |
Paused rsyncs on ms5 |
[production] |
| 14:04 |
<mark> |
Powercycled sq36 |
[production] |
| 13:18 |
<^demon|away> |
fixed permissions on svn c/o on ci.tesla, ran svn cleanup. cruise control still not pleased and yelling about locks |
[production] |
| 13:16 |
<mark> |
Upgrading firmware of scs-a1-sdtpa |
[production] |
| 12:51 |
<mark> |
csw5-pmtpa crashed and reloaded |
[production] |
| 11:53 |
<mark> |
Upgrading firmware of scs-c1-pmtpa |
[production] |
|
2011-07-05
|
| 23:53 |
<^demon> |
scratch that....ssh just seems to have been rather slow in getting its act together. ci.tesla is just fine now |
[production] |
| 23:50 |
<^demon> |
well now I've locked myself out of ci.tesla. Seems it doesn't start ssh on reboot...what a silly thing to do |
[production] |
| 23:47 |
<^demon> |
rebooting ci.tesla since it was horribly hung up on the latest build--was it really stuck for the past 24hrs? |
[production] |
| 23:35 |
<reedy> |
synchronized php-1.17/resources/mediawiki/mediawiki.js 'r91505' |
[production] |
| 21:28 |
<pdhanda> |
ran sync-common-all 'Synced to r91494 for WikiLove' |
[production] |
| 21:10 |
<pdhanda> |
synchronized php-1.17/resources/jquery.ui/themes/vector/jquery.ui.button.css 'Synced to r 91493.' |
[production] |
| 21:00 |
<pdhanda> |
synchronized php-1.17/resources/jquery.ui/themes/vector/jquery.ui.button.css 'Synced to r 91490.' |
[production] |
| 19:53 |
<reedy> |
synchronized php-1.17/includes/api/ApiQuery.php 'r91479' |
[production] |
| 19:07 |
<catrope> |
synchronized php-1.17/extensions/Vector/modules/ext.vector.collapsibleTabs.js 'r91476' |
[production] |