2009-02-17 §
21:04 <Rob> pulling srv208 and srv209 for quick reboots, their drac ips are wrong. [production]
21:04 <Rob> racked srv217-223 (also racked srv224/225 but no power yet) [production]
18:30 <brion> starting a batch run of update-special-pages-small just to ensure it actually works [production]
18:25 <brion> fixed hardcoded /usr/local path for PHP and use of obsolete /etc/cluster in update-special-pages and update-special-pages-small; removing misleading log files ([[bugzilla:17534]]) [production]
01:49 <Tim> deleting all enotif jobs from the job queue, there is still a huge backlog [production]
2009-02-16 §
16:46 <mark> Did emergency rollback of squid 2.7.6 to squid 2.6.21 because of incompatible HTTP Host: header [production]
16:21 <Rob> stopped upgrades, sq36 completed before stop [production]
16:17 <Rob> performing upgrades to sq35-sq38 (not depooling in pybal, letting pybal handle that automatically) [production]
16:16 <Rob> performed dist-upgrade on sq31-34 [production]
15:35 <Rob> depooled sq31-sq34 for upgrade [production]
08:12 <Tim> patched in r47309, Article.php tweak [production]
05:00 <Tim> made runJobs.php log to UDP instead of via stdout and NFS [production]
04:53 <Tim> fixed incorrect host keys in /etc/ssh/ssh_known_hosts for srv38, srv39 and srv77 [production]
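The stale-host-key repair above can be sketched as follows, demonstrated against a scratch known_hosts file with freshly generated dummy keys (in production the file is /etc/ssh/ssh_known_hosts and replacement keys would come from ssh-keyscan against the reinstalled hosts — that part is an assumption):

```shell
# Build a scratch known_hosts with two entries, then remove the stale one.
tmp=$(mktemp -d)
ssh-keygen -q -t ed25519 -N '' -f "$tmp/k1"    # dummy key standing in for srv38's
ssh-keygen -q -t ed25519 -N '' -f "$tmp/k2"    # dummy key standing in for srv77's
kh="$tmp/ssh_known_hosts"
printf 'srv38 %s\n' "$(cut -d' ' -f1-2 "$tmp/k1.pub")" >  "$kh"
printf 'srv77 %s\n' "$(cut -d' ' -f1-2 "$tmp/k2.pub")" >> "$kh"
ssh-keygen -R srv38 -f "$kh"   # drop the stale srv38 entry (backup kept as $kh.old)
cat "$kh"                      # only the srv77 line remains
```

On the real file the removed entry would then be replaced, e.g. by appending the output of `ssh-keyscan srv38`.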
04:13 <Tim> removing all refreshLinks2 jobs from the job queue, duplicate removal is broken so to clear the backlog it's better to just run maintenance/refreshLinks.php [production]
2009-02-15 §
21:59 <mark> Experimentally blocked non-GET/HEAD HTTP methods on sq3 frontend squid [production]
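For reference, the method block above maps onto a short squid 2.x ACL fragment along these lines (the ACL name is made up, and where the deny sits among the existing http_access rules is an assumption, since rule order matters in squid):

```
acl safe_methods method GET HEAD
http_access deny !safe_methods
```

This denies any request whose method is not GET or HEAD before the usual allow rules are reached; a sketch, not the exact rule deployed.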
16:15 <mark> Upgraded PyBal on lvs2 - others will follow [production]
13:11 <domas> db23 has multiple MCEs for same dimm logged: http://p.defau.lt/?IarKD4gbFhe5RmaV0RB_Xg [production]
12:38 <domas> in wikistats, moved files older than 10 days into ./archive/yyyy/mm/ - maybe will make flack crash less :)) [production]
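The archiving step above amounts to an mtime-keyed move; a minimal sketch, demonstrated on a scratch directory (GNU find/date assumed; the actual wikistats path and file layout are assumptions):

```shell
# Create a scratch dir with one stale and one fresh file.
dir=$(mktemp -d)
cd "$dir"
touch -d '20 days ago' stats-old.txt   # simulated stale stats file
touch stats-new.txt                    # recent file, should stay put
# Move anything older than 10 days into archive/yyyy/mm/ keyed on mtime.
find . -maxdepth 1 -type f -mtime +10 | while read -r f; do
  d=$(date -r "$f" +%Y/%m)             # yyyy/mm taken from the file's mtime
  mkdir -p "archive/$d"
  mv "$f" "archive/$d/"
done
```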
11:56 <mark> Doing Squid memleak searching on sq1 with valgrind, pooled with weight 1 in LVS [production]
03:09 <Andrew> CentralNotice still not working properly, and when we tried to set it to testwiki-only, it never came up. Left it on testwiki only for the time being, until somebody who knows CentralNotice can take a look at it. [production]
02:21 <Tim> fixed permissions on the rest of the logs in /home/wikipedia/logs/norotate (fixes centralnotice) [production]
2009-02-14 §
19:19 <Az1568_> re-enabled CentralNotice on testwiki to try and find the problem (we've had this before, but fixed it somehow...possibly with a regen? See November 16th log.) [production]
18:34 <domas> filed a bug at https://bugs.launchpad.net/ubuntu/+source/apparmor/+bug/329489 - could use some Canonical escalation too [production]
18:26 <domas> same affected srv47 - this is related to switching locking to fcntl() - this drives apparmor crazy [production]
17:47 <domas> srv178 kernel memleaked a few gigs. blame: apparmor [production]
14:34 <domas> srv215 very much dead, doesn't show vitality signs even after serveractionhardreset [production]
14:28 <domas> correction, srv208.mgmt is pointing to uninstalled box [production]
14:27 <domas> DRAC serial on all new boxes is ttyS1 which is not in securetty [production]
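The observation above implies a one-line fix: list ttyS1 in securetty so root can log in over the DRAC serial console. A sketch against a scratch copy (in production the target is /etc/securetty, and rolling it out fleet-wide would need whatever config distribution is in place — both assumptions):

```shell
f=$(mktemp)
printf 'console\nttyS0\n' > "$f"                 # typical default contents
grep -qx 'ttyS1' "$f" || echo 'ttyS1' >> "$f"    # idempotent append
cat "$f"
```

The grep guard makes the append safe to re-run on boxes already fixed.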
14:24 <domas> srv209.mgmt is actually srv208's SP, and srv208.mgmt is pointing to dead box [production]
14:15 <domas> srv209,215 down? [production]
13:43 <domas> installing php5-apc-3.0.19-1wm2 (no more futexes) on all ubuntu appservers. [production]
11:01 <Andrew> test [production]
2009-02-13 §
22:10 <mark> esams squid upgrade complete [production]
21:05 <RobH> deployed srv207-srv216 in apaches cluster [production]
20:34 <RobH> added new servers to nagios and restarted it [production]
20:15 <RobH> set up all node groups, ganglia, apache, and so on for srv199-srv206 and added into rotation [production]
19:38 <mark> Upgrading esams squids to 2.7.6 [production]
18:36 <mark> Upgraded squid on sq1 to 2.7.6 and rebooted the box [production]
18:03 <mark> Memory leak issues on the upload frontend squids, which started in November [production]
18:01 <RobH> sq13 back online, seems there is a memory leak, go mark for finding =] [production]
17:54 <RobH> lomaria install done for domas [production]
17:49 <RobH> rebooting sq13 due to it failing out in ganglia, OOM error evident. [production]
17:48 <RobH> reinstalling lomaria per domas request [production]
17:37 <RobH> sq8 was out of memory and locked up, rebooted, cleaned cache, and bringing back online [production]
17:34 <RobH> srv38 and srv39 back in rotation [production]
17:23 <RobH> srv38 and srv39 reinstalled, installing packages now [production]
16:57 <RobH> reinstalling srv38/srv39 [production]
16:57 <RobH> srv80 reinstalled as ubuntu apache and back in rotation [production]
16:31 <RobH> srv79 back in rotation [production]
16:21 <RobH> srv79 reinstalled, installing packages and ganglia [production]