2011-07-06
|
23:45 |
<Ryan_Lane> |
powercycling srv281 |
[production] |
23:44 |
<Ryan_Lane> |
powercycling srv266 |
[production] |
23:44 |
<Ryan_Lane> |
powercycling srv217 |
[production] |
23:43 |
<Ryan_Lane> |
powercycling srv206 |
[production] |
23:42 |
<Ryan_Lane> |
powercycling srv154 |
[production] |
23:40 |
<Ryan_Lane> |
powercycling srv276, it's dead |
[production] |
22:56 |
<mark> |
Setup cr1-sdtpa with initial config; connected to csw5-pmtpa (via L2 csw1-sdtpa); OSPF up |
[production] |
22:36 |
<RobH> |
wmf_ops: you can disregard the humidity alarms for eqiad that are spamming alerts to email. eq confirms no humidity issue on site and I will investigate the actual sensors this friday |
[production] |
22:21 |
<Ryan_Lane> |
repooling srv169, it was missing the wikimedia-lvs-realserver package. fixed in puppet |
[production] |
22:04 |
<Ryan_Lane> |
rebooting srv169 |
[production] |
21:59 |
<Ryan_Lane> |
depooling srv169 |
[production] |
21:59 |
<Ryan_Lane> |
sync'd srv169, repooling to test |
[production] |
21:54 |
<Ryan_Lane> |
depooling srv169 |
[production] |
21:51 |
<Ryan_Lane> |
restarted apache on singer |
[production] |
21:46 |
<laner> |
synchronized php-1.17/wmf-config/db.php 'depooling srv154, repooling srv178' |
[production] |
21:30 |
<pdhanda> |
ran sync-common-all 'Synced to r91606 for ArticleFeedback' |
[production] |
21:27 |
<Ryan_Lane> |
setting proxy setting for secure back to original setting. removed ~ files from sites-enabled |
[production] |
21:17 |
<Ryan_Lane> |
putting retry=3 back, and adding a timeout of 15 seconds to secure |
[production] |
21:16 |
<Ryan_Lane> |
removed retry=3 from ProxyPass directive for secure. 3 seconds really isn't enough for this service... |
[production] |
21:06 |
<RobH> |
running puppet on spence, this is going to take forever. |
[production] |
21:05 |
<Ryan_Lane> |
restarting apache on singer |
[production] |
19:37 |
<mark> |
Added DNS entries for cr1-sdtpa and cr2-pmtpa |
[production] |
19:25 |
<hashar:> |
hexmode raised a user issue with blocking. It is a lock wait timeout happening from time to time on enwiki. 30 occurrences in dberror.log for Block::purgeExpired. Could not reproduce it, so I am assuming it was a temporary issue. |
[production] |
19:15 |
<hashar:> |
srv154 seems unreachable. dberror.log is spammed with "Error connecting to <srv154 IP>" |
[production] |
19:13 |
<RobH> |
added webmaster@ to other top level domain mail routing to forward to the wikimedia.org webmaster for google securebrowsing stuff per RT#1122 |
[production] |
18:08 |
<pdhanda> |
running maintenance/cleanupTitles.php on commonswiki |
[production] |
17:51 |
<pdhanda> |
Running maintenance/namespaceDupesWT.php on commonswiki |
[production] |
17:12 |
<RobH> |
srv169 successfully back in service, tests fine and has all updated files, lvs3 updated to include it in pool |
[production] |
17:11 |
<RobH> |
returning srv169 into service |
[production] |
15:37 |
<mark> |
Removed ms5:/etc/cron.d/mdadm |
[production] |
15:37 |
<mark> |
Stopped MD raid resync on ms5 |
[production] |
15:28 |
<RobH> |
search18 booted back up successfully |
[production] |
15:25 |
<RobH> |
api lag issues known due to search server failure, being worked presently |
[production] |
15:24 |
<RobH> |
search18 sas configuration bios confirms both disks are still in a non-degraded (according to it) mirror |
[production] |
15:23 |
<RobH> |
search18 randomly rebooted after checking disks before the login prompt |
[production] |
15:19 |
<RobH> |
rebooting search18 |
[production] |
15:14 |
<RobH> |
search18 appears to be completely offline, investigating lom logs before rebooting. |
[production] |
15:12 |
<RobH> |
search18 offline, logging into mgmt to check it out |
[production] |
15:01 |
<RobH> |
eqiad humidity levels ticket dispatched for fulfillment |
[production] |
14:37 |
<mark> |
Paused rsyncs on ms5 |
[production] |
14:04 |
<mark> |
Powercycled sq36 |
[production] |
13:18 |
<^demon|away> |
fixed permissions on svn c/o on ci.tesla, ran svn cleanup. cruise control still not pleased and yelling about locks |
[production] |
13:16 |
<mark> |
Upgrading firmware of scs-a1-sdtpa |
[production] |
12:51 |
<mark> |
csw5-pmtpa crashed and reloaded |
[production] |
11:53 |
<mark> |
Upgrading firmware of scs-c1-pmtpa |
[production] |