How Much Do We Want A Decent Version of Git for Windows? T-H-I-S M-U-C-H.

We want it bad. Reeeeeal bad.

Bonanzle is still running on Subversion (via TortoiseSVN), because comparing its UI to the UI of the de facto git standard (msysgit) is like comparing Rails to ASP on Visual Basic. Yes, the difference is that big. Msysgit is ugly as Pauly Shore, most of its windows can't be resized, it crashes regularly, and trying to decipher the intent of its UI is like reading the Dead Sea Scrolls.

Yes, yes, I know: if you don't like a piece of open source software, you should shut up and fix it. Unfortunately, I'm sort of single-handedly maintaining this website, which has grown to about 150k uniques this month from about 10k uniques two months ago. I can't end world hunger and rescue my cat from a tree at the same time.

But if there is any intelligent life out there with spare programming cycles and the desire to make a huge difference in the world, this is my personal plea that they consider giving some love to the woebegone msysgit… or maybe just start their own Windows git client. I can't imagine it would take a real hacker more than a week or two to match the feature set of the existing msysgit.

I'd really like to move Savage Beast to GitHub, and I'd really like to collaborate on the other projects happening there, but it just doesn't make sense to go from a slick, error-free, decipherable UI like TortoiseSVN's to the meager helpings of msysgit.

I'd happily donate to a project for a better Windows git client.

Preemptive note to smart alecks: No, I'm not moving to Mac (or Linux) now. There are plenty of reasons why; I'll tell you all about them some other time. Incidentally, what is the preeminent GUI for git on Mac these days? From what I understand, many of the real hackers are perfectly content using git from the command line…? I shudder to think of reading diffs and histories that way.

Rails Thinking Sphinx Plugin: Full Text Searching that’s Cooler than a Polar Bear’s Toenails

Continuing my series of reviews of the plugins and products that have made Bonanzle great, today I'll talk about Thinking Sphinx: how we've used it, what it's done, and why it's a dandy.

What It Is Bro, What It Is

What it is is a full text search Rails plugin that uses the Sphinx search engine to let you search big tables for data that would take a long-assed time (and a lot of custom application code) to find if you used MySQL full text searching.

What Are Your Other Options?

In the space of legitimate Rails full text plugins, the commonly mentioned choices are Sphinx (via Thinking Sphinx or Ultra Sphinx), Xapian (via acts_as_xapian), Solr (via acts_as_solr) and (shudder) Ferret (via acts_as_ferret).

Jim Mulholland does a great job of covering the various choices at a glance, so if you'd like a good overview, start with his blog post about the choices.

To his commentary, I would add that Solr looks complicated to get running, appears to have been abandoned by its creator, and hasn't been updated in quite a while. It should also be mentioned that if you were to choose Solr, every time you wished to talk about it online you'd have the burdensome task of backspacing the "a" out of the name your fingers were intent on typing.

Xapian seems alright, but the documentation on it seemed lacking and not a little arcane. Despite Jim's post on how to use it, the Xapian Rails community seemed pretty sparse. My impression was that if it didn't "just work," it would be I alone who would have to figure out why. Also, from what I could tell in Jim's post, it sounded like one has to stop Xapian from serving search requests in order to rebuild the index. Update: the FUD patrol informs me that you can index and serve concurrently. Oh, what joy!

Ferret sucks. We tried it in our early days. It caused mysterious indexing exceptions left and right whenever we changed our models or ran migrations. The day we expunged it from our system was the day I started programming our site and stopped worrying about what had broken Ferret that day.

Ultra Sphinx looks OK, but as you can read here, its ease of use leaves something to be desired compared to the star of our blog post, who has now entered the building. Ladies and gentlemen, may I present to you, hailing from Australia and weighing in at many thousand lines of code:

Thinking Sphinx!

There's a lot to like about Thinking Sphinx: it has easy-to-read docs with examples, it has an extremely active Google Group behind it, and it supports useful features like location-based searches and delta indexing (i.e., keeping search results up to date in near-real time).

But if there is one reason I would recommend Thinking Sphinx above your other choices, it's that you probably don't care a hell of a lot about full text searching. Because I didn't. I care about writing my website. This is where Thinking Sphinx really shines. With the tutorials and Railscasts that exist for Thinking Sphinx, you can write an index for your model and actually be serving results within a couple hours' time. That doesn't mean it's an oversimplified app, though. Its feature list is long (most of the features we don't yet use), but smart defaults are assumed, and it's super easy to get rolling with a basic setup, allowing you to hone the parameters of the search as your situation dictates.
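
To make that concrete, here's a minimal sketch of what an index definition looks like. The Item model and its columns here are hypothetical placeholders, not Bonanzle's actual schema:

class Item < ActiveRecord::Base
  define_index do
    # Full text fields to search against
    indexes title, :sortable => true
    indexes description

    # Attributes available for filtering and sorting
    has created_at, price

    # Delta indexing keeps newly created records searchable between
    # full reindexes (requires a boolean "delta" column on the table)
    set_property :delta => true
  end
end

After a rake thinking_sphinx:index and rake thinking_sphinx:start, serving results is a one-liner:

Item.search "vintage lamp", :page => 1, :per_page => 20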

Also extremely important in choosing a full text search system is reliability. Programming a full text engine (and its interface into your application) is rocket science, as far as I’m concerned. I don’t want to spend my time interpreting esoteric error messages from my full text search engine. It must work. Consistently. All the time. Without me knowing anything about it. Thinking Sphinx has done just that for us. In more than a month since we started using it, it’s been a solid, reliable champ.

A final, if somewhat lesser, consideration in my recommendation of TS is who you'll be dealing with if something goes wrong. Being open source, my usual expectation is that if Google and I can't solve a particular problem, it will be a long wait for a response from a random, ever-so-slightly-more-experienced-than-me user of the system in question who will hopefully, eventually answer my question in a forum. Thinking Sphinx's creator Pat Allen blows away this expectation by tirelessly answering almost all questions about Thinking Sphinx in its Google Group. From what I can tell, he does this practically every night. This is a man possessed. I don't claim to know or understand what's in the punch he's drinking (probably not beginner's enthusiasm, since TS has been around for quite some time now), but whatever's driving him, I would recommend you take advantage of his expertise soon, before he becomes jaded and sour like the rest of us.

What About the Performance and Results?

Performance: great. Our usual TS query returns in a fraction of a second from a table of more than 200,000 rows indexed on numerous attributes. Indexing the table currently takes about 1-2 minutes and doesn't lock the database. Nevertheless, we recently moved our indexing to a remote server, since having it constantly running did bog down the system somewhat. I plan to describe in the next couple days how we got remote indexing working, but suffice it to say, it wasn't very hard (especially with Pat's guidance on the Google Group).

Results: fine. I don't know what the pertinent metrics are here, but you can use weighting for your results and search on any number of criteria. Our users are happy enough with the search results they're getting from TS out of the box, and when we do get more customized with our search weighting, I have little doubt that TS will be up to the task, and that it'll probably be easy to set up.
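
When that day comes, the change should be modest. Here's a hedged sketch of search-time weighting (field names hypothetical), where matches in a title count ten times as much as matches in a description:

# Title matches weigh 10x description matches when ranking results.
Item.search "hammer", :field_weights => { :title => 10, :description => 1 }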

Final Analysis

If you want to do full text searching on a Rails model, do yourself a favor and join the bandwagon enjoying Thinking Sphinx. It’s one of the best written and supported plugin/systems I’ve stumbled across so far in the creation of Bonanzle.

I’m Bill Harding, and I approved of this message.

MySQL: Use "DISTINCT" and "ORDER BY" with Multiple Columns, AKA Apply "ORDER BY" before "GROUP BY"

I've had a devil of a time trying to get Google to tell me how to write a MySQL query that 1) filters rows on a distinct column, 2) returns other columns in the query besides the distinct column, and 3) allows us to order by a column. In our case, we (and you, if you're running Savage Beast!) have a list of the most recent forum posts on the site. Currently, if you list all recent posts, the search just finds all posts and orders them by date of creation, but this makes for some dumb-looking output, since you often end up with a list where 10 of the 20 posts are all from the same forum topic. All the user really wants to know is which topics have a new post in them, and to get a brief glimpse of what that new post might be.

Thus, we want to create a query that returns the new posts, ordered by date of creation, that have a distinct topic_id.

Here’s the SQL that can make it happen:

Post.find_by_sql("select posts.* from posts
  LEFT JOIN posts t2
    ON posts.topic_id = t2.topic_id AND posts.created_at < t2.created_at
  WHERE t2.topic_id IS NULL
  ORDER BY posts.created_at DESC")

Hope that Google sees fit to lead other people here instead of letting them struggle to get GROUP BY to order results beforehand (GROUP BY posts.topic_id works, but it returns the first post in each distinct topic rather than the last post, as we desire), or to get SELECT DISTINCT to return more than one column, as many forum posters unhelpfully suggested in all the results I was getting.

Update 11/26/08 – A Word of Caution

I finally got around to setting up some profiling for our site yesterday and was surprised to discover that the above query was taking longer per execution than almost anything else on our entire site. The SQL EXPLAIN was not much help in explaining why, but it showed three joins, with the join on the topics table involving every row of that table (presently almost 10,000 rows).

Takeaway: for this query to work, it seems to consider every distinct topic in the table, rather than being smart and stopping when it hits the per-page paginated limit. Since I had already determined that GROUP BY and DISTINCT were non-starters for picking the newest post in each topic, I ended up revising the logic into an easier-to-manage and far more DB-efficient form:

We now track in each topic the newest post_id within that topic. While this adds a bit of overhead to keeping the topic updated when new posts are made to it, it allows us to do a far simpler query where we just select the most recent topics, join each to its newest post, and order by the age of those posts.
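
For illustration, here's a minimal sketch of the shape of that denormalization, assuming a hypothetical last_post_id column (this isn't our exact code):

# Migration: give each topic a pointer to its newest post.
add_column :topics, :last_post_id, :integer
add_index  :topics, :last_post_id

# app/models/post.rb
class Post < ActiveRecord::Base
  belongs_to :topic

  # Keep the topic pointed at its newest post.
  def after_create
    topic.update_attribute(:last_post_id, id)
  end
end

# The "recent posts, one per topic" list becomes a simple indexed join:
Topic.find(:all,
  :joins => "INNER JOIN posts ON posts.id = topics.last_post_id",
  :order => "posts.created_at DESC",
  :limit => 20)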

If you have the ability to create an analogous situation to solve your problem, your database will thank you for it. The above query starts getting extremely slow with more than a few thousand rows. Yet, I defy you to find an alternative to it that works at all using “group” or “distinct.”

Rails Hosting: Review of Slicehost vs. EC2

It's a goal of mine to write a series of reviews for all the major plugins and services that have gone into the creation of Bonanzle. Previously, I reviewed Fiveruns and gave it a "thumbs down," which gave me the blues, since I'd like Fiveruns to be the killer app for monitoring Rails performance. Unfortunately, though two or three different Fiveruns salespeople have noticed Bonanzle and told me I should use Fiveruns, none have gotten back to me with a promise that they could make it easier to use after I pointed them to my review. But I digress. Today we discuss Slicehost.

Synopsis

Slicehost has been very good to Bonanzle. After a short and bad experience with another Rails hosting provider that gave limited shell access, we started using Slicehost almost a year ago, first with a 256 MB slice to host our main Bonanzle server. A Slicehost "slice" is their name for a server partition, very much like an EC2 server instance (I'll get to a comparison of the two shortly). When you sign up for a slice, you can pick from a number of sizes: 256 MB, 512 MB, 1 GB, 2 GB, or 4 GB. When you set up your slice, you can choose from a variety of OSes to have pre-installed (including most all the flavors of Ubuntu). You have full shell access with any slice you set up, so you essentially have the full range of configuration possibilities you'd have if the server were in your basement. If you choose to add more slices in the future, you can copy the disk image from your existing slices as a starting point for the new slice (as long as you have backups turned on for the slice whose disk image you want to copy). This has been very convenient for us, as it saves us the trouble of repeatedly installing basic stuff like MySQL and Apache on each new slice we add.

How We’ve Used It

From that initial 256 MB slice we started with a year ago, we now have grown to seven slices ranging in size from 512 MB to 4 GB. As mentioned above, it’s very convenient to get a new slice up to speed using the disk image of an old slice. It’s also very fast — when we’ve put in our request to get a new slice, it has taken from 30 minutes to a couple hours max to get the new slice created.

Uptime

None of our slices have gone down in a year of use. That’s nice.

Performance

The bigger your slice, the more CPU you get to use in times of contention. According to the support personnel I've spoken with, the servers are hosted on quad-core, 64-bit 2 GHz Opteron machines, and a 4 GB slice gets up to 25% of the CPU cycles in times of contention (which there rarely are). Scale down from that 25% for each level down in slice size (e.g., a 2 GB slice gets 12.5%, or 1/8th, of the cycles).

In terms of practical speed, we're currently serving about 50,000 pages/day, mostly non-cached, on a site that has a lot of interactive features and image processing. We're doing this on one 4 GB slice that currently runs 8 Mongrels AND the MySQL server itself. Most page load times are less than a second; creating images takes longer. Good enough for me for now.

Compared to EC2

The closest comparable EC2 offering to a 4 GB Slicehost slice is the following:

7.5 GB of memory, $288.00/month, 850 GB of instance storage, no bandwidth included in the price, 4 EC2 Compute Units

Compare to Slicehost:

4 GB of memory, $280.00/month ($250/month with the automatic 10% discount), 160 GB HD, 1600 GB of bandwidth included in the price, and the equivalent of 2 EC2 Compute Units (i.e., one 2 GHz processor) during resource contention, more otherwise

EC2 jumps out to the early lead, as it offers about twice as much computing power and RAM for $30 more. But Slicehost catches up quickly when you consider bandwidth and storage:

EC2 bandwidth = $0.10-$0.17 per GB transferred. Slicehost = up to 1600 GB transfer free.

That is, if you were to use all of your slice's bandwidth, you'd save yourself something in the neighborhood of $250/month vs. Amazon (1600 GB at $0.10-$0.17 per GB works out to $160-$272). For storage, Amazon offers more space by default, but they make no guarantee that your instance storage won't evaporate at any time, which is why they also offer Elastic Block Storage (EBS), which is intended to be your "real" disk when operating in an EC2 instance. EBS costs $0.10 per GB-month plus $0.10 per million I/O requests, which Amazon estimates adds up to about $26/month more for a "medium sized web site."

When you add up the total costs, assuming you were going to use your storage and bandwidth, Slicehost offers about half the memory and half the computing power, but it does so at less than half the price of EC2. And a 4GB Slicehost slice is no small computing organism. As mentioned above, it’s serving 50k daily pages of dynamic content and getting by well enough (except when it comes to image creation, which can take 5-10 seconds to process including thumbnails).

Where does EC2 win?

Still, there are a number of advantages to EC2. The first is that 4 GB (the size I've been discussing) is the largest instance size currently listed at Slicehost, whereas Amazon has a couple of instances with significantly more computing power and memory available. This alone is reason enough that we will probably need to switch to EC2 in the not-distant future, since at times of peak traffic we're pushing the maximum performance of our current slice. Update: The Slicehost support team informs me that they also have 8 GB and 15.5 GB slices available by request. Both of the unlisted, larger-sized slices have corresponding 2x or 4x increases in HD space and processing power (and, of course, cost).

Another annoying limitation of Slicehost is that all traffic is throttled at 10 Mbps. While that's not a "low" amount per se (Wikipedia says 8-12 Mbps is equivalent to a "medium to high-definition digital channel with DVD quality data," i.e., about a megabyte of transfer per second), it is not conducive to high-traffic, image-heavy sites, and it is annoying that the throttle is set at the same level regardless of slice size. Update: The Slicehost support team informs me that this limit can be adjusted as necessary by request. I requested that they double our bandwidth allowance and they had it done within an hour.

Where does Slicehost win?

Firstly, there are the cost wins described above if you are hosting a site that uses lots of bandwidth.

Secondly, I get the sense (from documents I'd previously read but can no longer locate) that it is far less likely that one's instance storage will evaporate with Slicehost. I know that it's never happened to us in the year we've been hosted, whereas I recall reading that EC2 makes no guarantee that instance storage will be available at any given time. I'd love more details on this if anyone can cite where I might have read it.

Another great feature of Slicehost that’s easy to underestimate is the availability of their help. They have a Slicehost chat room that is staffed by a handful of Slicehost employees during all normal business hours (Update: and non-normal hours too… I was talking to them at 3 AM last night about the progress of our resize to an 8GB slice. There were two Slicehost employees manning the chat window at that hour (!)). I’ve ended up visiting this chat room on numerous occasions when I want instant answers to my questions, and I’ve found the people in the chat room to be very knowledgeable and patient. Getting good support at Amazon is very expensive ($100-$400 per month, or more, for a service Slicehost provides free of charge).

Also, I've found that our slice almost always has more than the "guaranteed" CPU cycles available: according to top, our slice regularly uses more than "100%" CPU (100% = one of the four cores, which is what's guaranteed with a 4 GB slice).

Final Summary

I hope to continue adding to this article as I gain experience with the two services. As mentioned above, we have stuck exclusively with Slicehost so far, but if our site gets into the millions of uniques we might end up making the move to EC2. Update: I did some research on EC2 recently and was pretty surprised at how esoteric their documentation is (see the section on creating your own AMI if you need to lull yourself to sleep), so I'd just as soon stay at Slicehost, where there are fewer proprietary concepts involved. For people making the decision today about where to host, I'd pick Slicehost if you're looking for high configurability, less learning about proprietary concepts, more human support, and lower, more predictable costs. I'd pick EC2 if you already know how to use it or are planning to run a complex, scalable architecture where you want to be able to swap in more servers on a whim. I'd imagine EC2 is pretty easy to get up and running with some of the pre-configured AMIs (I haven't researched it, but I'm sure there's one for Rails). But then again, Slicehost is pretty damn easy to get Rails rolling on too, since you can follow any of the kajillion tutorials about setting up Rails on an Ubuntu machine. (Or you can use mod_rails, which from what I've heard is pretty much plug-and-use.)

Stay tuned for updates, and if you have comparable experience with either, please post it below!

Rails MySQL Indexes: Step 1 in Pitiful-to-Prime Performance

Like any breathing Rails developer, I love blogging about performance. I do it all the time. I’ve done it here, here, and quite famously, here.

But one thing I haven’t done is blog about Rails performance from a perspective of experience. But tripling in traffic for a few months in a row has a way of changing that.

So now I'm a real Rails performance guy. Ask me anything about Rails performance, and I'll tell you to get back to me in a couple months, because this ain't exactly yellowpages.com I'm running here. BUT, these are the lessons and facts from our first few months of operation:

  • One combined Rails server + MySQL slice at Slicehost is handling about 3,000 daily visits and 30,000 daily pageviews (on a highly real-time, interactive site) with relative ease. Almost all pageviews render in less than 2 seconds, most in less than 1.
  • Memcached saves our ass repeatedly
  • Full text searching (we’re using Thinking Sphinx) saves our ass repeatedly
  • BackgroundRb will ruin your life; cron-scheduled rake tasks will save it (see the crontab sketch after this list)
  • Database ain’t nothing but a chicken wing with indexing
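
To elaborate on the BackgroundRb point: the replacement is nothing fancier than a crontab entry invoking a rake task. A sketch, with a hypothetical task name and app path:

# Run a maintenance rake task at the top of every hour, in production mode.
0 * * * * cd /var/www/bonanzle && RAILS_ENV=production rake maintenance:hourly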

Now, those are five salient observations to take from a growing site, but you'll notice that it was the last one I chose to single out in the title of this blog. Why? Because if I called this entry "Rails Performance Blog," your eyes would glaze over and you wouldn't be able to read through the hazy glare.

Why else? Because the day I spent indexing our tables was the only time in the history of Bonanzle that I will ever bring forth a sitewide 2x-3x performance increase within about 4 hours' time. God damn, that was a fantastic day. I spent the second half of it writing airy musings to my girlfriend and anyone who would listen about how much fun web sites are to program. Then I drank beer and went rafting. Those who haven't indexed their DB lately: don't you hate me and want to be like me more than you ever have before?

Well, I can’t help you with the former, but the latter, that we can work on.

  1. Download the Query Analyzer plugin, which logs an EXPLAIN alongside each query in your development log.
  2. Delete your development.log file. Start your site in development mode. Go to your slowest page. Open your development.log file in an editor that can automatically update as the file changes.
  3. Look through the queries your Rails site is making. Any query where the "type" column reads "ALL" is scanning every row of the table to satisfy the query (a full table scan). Hundreds of rows? OK, whatever. Thousands of rows? Ouch. Tens of thousands of rows (or more)? Your request might never be heard from again.
  4. Create indexes to make those “ALL”s go away. Adding an index in Rails is the simplest thing ever. In a migration: add_index :table_name, :column_name and you’re done. remove_index :table_name, :column_name and you’re undone.
  5. Observe that, at least for MySQL, queries that filter on more than one attribute in the where clause (e.g., select * from items where status = "active" and hidden = false) are still slow if you only create separate indexes for "status" and "hidden." Why? Because MySQL generally picks just one index per table to satisfy a query, so the other condition still gets checked row by row. What I do know is that add_index :items, [:status, :hidden] creates a compound index that will get you back to log(n) time on queries with compound where clauses. (See the migration sketch after this list.)
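
Here are steps 4 and 5 rolled into a migration sketch; the items table and its columns are hypothetical stand-ins for your own schema:

class AddIndexesToItems < ActiveRecord::Migration
  def self.up
    # Index a foreign key we join on regularly...
    add_index :items, :user_id

    # ...and add a compound index for the two-clause where from step 5.
    add_index :items, [:status, :hidden]
  end

  def self.down
    remove_index :items, :user_id
    remove_index :items, :column => [:status, :hidden]
  end
end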

Now, if you are like me or the 50 people in the Rails wiki and forums who have just learned about this crazy, wonderful thing called "indexes," your first question is: "Indexing sounds pretty bangin. Why not just index the hell out of everything?"

Answer: Them indexes aren’t immaculately conceived, son. Every index you create has to be generated and maintained. So the more indexes you create, the more overhead there is to inserting or deleting records from your table. Of course, most queries on most sites are read queries, so you will make up the extra insert/delete time by 10x or more, but if you were to go buck wild and index the farm, you probably wouldn’t be much better off on balance than if you indexed nothing at all. You see why downloading Query Analyzer was the first step?

The general rule given for indexes is that most any foreign key should be indexed, as should any criteria upon which you regularly search or sort. That's worked well for us. For tables with fewer than 500 rows, I usually get lazy and don't do any indexing, and that seems fine. But assuredly, if you're working with a table of 1,000 or more rows and you're querying on columns that aren't indexed, you are 15 minutes away from a beer-enabling, management-impressing performance optimization that would make Ferris Bueller proud.

Change ACL of Amazon S3 Files in Recursive Batch

We're in the process of moving our images to be served off of S3, and I wanted to share a quick recommendation I came across this evening while trying to change our presently-private S3 image files to be public. The answer is Bucket Explorer. All things being equal, you certainly won't mistake it for a high-budget piece of UI mastery, but it is surprisingly capable of doing many things that have been troublesome for me with the Firefox S3 plugin (which is a major pain to even get working with Firefox 2, which is itself a major pain to upgrade to Firefox 3: I upgraded for a month, spent 5 hours trying to figure out why some pages seemed to randomly freeze indefinitely, then gave up and downgraded; my best guess was that it was Flash-related), the AWS-S3 gem, and the other free S3 browsing web service I found somewhere or another.

In addition to providing a capable, FTP-like interface to one's S3 buckets, it can also get stats on directories, do the aforementioned recursive batch permission setting, delete buckets (the S3 gem won't let me, even with the :force => true option), and a bunch of other things. Probably most importantly (for me): it works! Tra-lee!
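
That said, if you'd rather script the permission change than buy a GUI (and have better luck with the AWS-S3 gem than I did), the gem does expose an ACL grants API. A rough sketch, with bucket name and credentials as placeholders; note that Bucket.objects returns at most 1,000 keys per call, so a run over 20,000 files would need to page through with the :marker option:

require 'rubygems'
require 'aws/s3'
include AWS::S3

Base.establish_connection!(
  :access_key_id     => 'YOUR_ACCESS_KEY',
  :secret_access_key => 'YOUR_SECRET_KEY'
)

# Add a public-read grant to every object in the bucket.
Bucket.objects('your-images-bucket').each do |object|
  policy = object.acl
  policy.grants << ACL::Grant.grant(:public_read)
  object.acl(policy)  # write the modified policy back to S3
end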

Bucket Explorer is $50 to buy, but once it finishes changing batch permissions on the roughly 20,000 image files it's currently processing, I would seriously consider paying for it. For the time being, I'm on a fully functional 30-day trial.

Bonanzle: “The Best eBay Alternative They’ve Seen”

An incredible accolade for a site that's still technically in beta: Ecommerce Guide just named Bonanzle "The Best eBay Alternative They've Seen" in four years of reviewing eBay alternatives. The pessimistic side of me says that an article this effusive is an open invitation for every Tom, Dick and Harry to quibble and point out the faults of Bonanzle (of which there are admittedly still several… we haven't even officially launched yet, people), or to question how Bonanzle can be called an "eBay alternative" when it doesn't even do auctions.

That said, it’s hard to imagine this project going much better than it has so far. While I’m fully aware that the hundreds of PHP eBay lookalikes are going to slowly start nibbling at what are now Bonanzle-only features, it’s comforting to know that they’re going to have to program those features in PHP (or maybe Java).

If you haven’t already, pay a visit to Bonanzle and cast your vote that Rails is an unfair advantage.

Bloggity – An Idea for a Rails Blog Plugin

Update: This plugin now exists! It’s under active development, and has some pretty cool features. Read more about it through the link.

There comes a time in most every Rails application's life when its owner wants to up the "Web 2.0" factor and create a blog presence. Luckily, this is Rails, a framework ideally suited to creating a blog. Unluckily, there doesn't seem to be a de facto Rails blog plugin floating around the community yet.

Now I know what you're thinking… Mephisto! No thanks. The same factors that drove me away from Beast drive me away from Mephisto as well. Who wants to manage multiple subdomains, multiple app repositories, two sets of Mongrels, and cross-subdomain authentication issues, all while figuring out how to share resources (like views and images) between your Mephisto app and your standard app?

Upon discovering that no good blog plugin existed, my first instinct was to see if I could bring to life a sort of “Savage Mephisto.” Unfortunately, whereas I had a little bit of time to squeeze in Savage Beast development during Bonanzle’s final run to launch, I have virtually no time whatsoever to work on a project like this, now that Bonanzle is live and doubling in traffic every few weeks. Nevertheless, I did at least give a half-hearted attempt to make Mephisto into a plugin before realizing that as vast and deep as Beast is, Mephisto is probably twice as complicated, and thus, half as plugin-able.

So I turned my attention to creating a blog plugin from scratch. You can get a sense of how far I've gotten on the other Bonanzle blog. Like SB, I wrote it using the Engines plugin, so it could theoretically be extracted to a repository for download with relatively little effort. However, I'm not sure what the demand for a plugin like this would be. I mean, you could write your own fairly functional blog in a day or less, or you could conquer the cross-domain Mephisto issues, or you could try out one of the not-well-rated blog plugin attempts on Agile Web Development.

Anyway, I’ve decided to use the responses to this post to gauge how much enthusiasm there is for an easy-to-setup Rails blog plugin. If there’s lots, I’ll try to make the time frame to launch short. If there’s little, it probably remains a Bonanzle-only plugin.

Rails script/server command line options

Usage: server [options]
  -p, --port=port              Runs Rails on the specified port. Default: 3000
  -b, --binding=ip             Binds Rails to the specified ip. Default: 0.0.0.0
  -d, --daemon                 Make server run as a Daemon.
  -u, --debugger               Enable ruby-debugging for the server.
  -e, --environment=name       Specifies the environment to run this server under (test/development/production). Default: development
  -s, --config-script=path     Uses the specified mongrel config script.
  -h, --help                   Show this help message.
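
For example, to daemonize a production server on port 3001:

script/server -e production -p 3001 -d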