2010-07-05
|
23:20 |
<Tim> |
moving codereview-proxy to kaulen to replace isidore (which is down) |
[production] |
22:53 |
<Tim> |
on srv124: remounted /home to fix test.wikipedia.org |
[production] |
18:56 |
<jeluf> |
synchronized php-1.5/wmf-config/flaggedrevs.php '24010 - id.wikipedia requesting FlaggedRevs' |
[production] |
17:01 |
<mark> |
Did BGP soft clear outbound on all AMS-IX sessions; no prefixes were being announced as of two weeks ago |
[production] |
16:01 |
<mark> |
Made puppet ensure apache is running on the app servers; running "sync-common" upon start |
[production] |
15:38 |
<mark> |
Fixed puppet on srv145 |
[production] |
11:39 |
<mark> |
Remounted /home on hume |
[production] |
11:30 |
<root> |
synchronizing Wikimedia installation... Revision: 68850 |
[production] |
11:29 |
<mark> |
synchronized php-1.5/wmf-config/CommonSettings.php 'CommonSettings.php out of sync on a few apaches' |
[production] |
09:17 |
<jeluf> |
synchronized php-1.5/wmf-config/InitialiseSettings.php '24264 - Create a namespace aliases on zhwiki' |
[production] |
06:26 |
<ronabop> |
Kudos to the team who rebuilt a multi-hundred node system under extreme pressure. |
[production] |
06:24 |
<Tim> |
fixed broken ircd auth configuration, irc.wikimedia.org now working again |
[production] |
05:13 |
<Tim> |
on browne: reinstalled udprec to fix IRC server |
[production] |
04:16 |
<Tim> |
switched nagios monitoring for search to less flappy TCP connection check instead of HTTP |
[production] |
03:58 |
<domas> |
s3 pos: db17-bin.321:0 |
[production] |
03:58 |
<midom> |
synchronized php-1.5/wmf-config/db.php 's3 rw' |
[production] |
03:54 |
<Tim> |
fixed search monitoring in nagios |
[production] |
03:47 |
<Tim> |
started lsearchd on search1-20 |
[production] |
03:44 |
<midom> |
synchronized php-1.5/wmf-config/db.php 'rw s4 s4' |
[production] |
03:42 |
<Tim> |
fixed search1: just needed /home remounted |
[production] |
03:38 |
<midom> |
synchronized php-1.5/wmf-config/db.php |
[production] |
03:32 |
<domas> |
new repl positions, s2: db30-bin.000015:1227, s4: db16-bin.019:0 |
[production] |
03:17 |
<midom> |
synchronized php-1.5/wmf-config/db.php |
[production] |
03:13 |
<midom> |
synchronized php-1.5/wmf-config/db.php |
[production] |
03:04 |
<tstarling> |
synchronized php-1.5/wmf-config/db.php 's3 fake master r/o' |
[production] |
02:54 |
<Tim> |
mysql status: s2 and s4 have replication broken with "Client requested master to start replication from impossible position". s3: still waiting for innodb recovery on master. Other clusters good. |
[production] |
02:49 |
<tstarling> |
synchronized php-1.5/wmf-config/db.php |
[production] |
02:48 |
<Tim> |
on db8: read_only=1 again and setting wiki to r/o |
[production] |
02:46 |
<Tim> |
on db8: read_only=0, started up r/o (s4) |
[production] |
02:42 |
<Tim> |
putting s2 into read-only mode due to replication issues |
[production] |
02:35 |
<RobH_> |
search13-search20 default to sitting at the grub screen on boot; will fix later. For now they are booting back up. |
[production] |
02:30 |
<Tim> |
fixed m.wikipedia.org on lvs4 |
[production] |
02:26 |
<RobH_> |
search13 back up, working on the others |
[production] |
02:24 |
<mark> |
Moved bits.pmtpa to point to Text squids in DNS |
[production] |
02:08 |
<Tim> |
starting mysqld on a lot of DB servers |
[production] |
02:08 |
<RobH_> |
seems like a power outage, not an AC issue. |
[production] |
02:07 |
<RobH_> |
email back online |
[production] |
01:52 |
<Tim> |
on nfs1: river fixed the filesystem with fsck |
[production] |
01:33 |
<Tim> |
(about 5 minutes ago) started mysqld on db17 |
[production] |
01:26 |
<Tim> |
on lvs4: removed dead squids from text list |
[production] |
01:16 |
<Tim> |
started mysql on db8 |
[production] |
01:15 |
<Tim> |
started mysqld on db5 |
[production] |
01:11 |
<Tim> |
power went off briefly again, lvs4 came back up properly this time, starting pybal on it again |
[production] |
00:59 |
<Tim> |
got LVS set up and working on lvs4 |
[production] |
00:56 |
<Tim> |
s/nfs4/lvs4 |
[production] |
00:55 |
<Tim> |
got nfs4 back online |
[production] |