Rails 3 Performance: Abysmal to Good to Great

So you’ve upgraded from Rails 2.x to Rails 3 and you’re not happy with the response times you’re seeing? Or you are considering the upgrade, but don’t want to close your eyes and step into the great unknown without having some idea what to expect from a performance standpoint? Well, here’s what.

Performance in Rails 2.2

Prior to upgrading, this action (one of the most commonly called in our application) averaged 225-250 ms.

Rails 3.0.4 Day 1

One our first day after upgrading, we found the same unmodified action averaging 480ms. Not cool.

Thus began the investigative project. Since New Relic was not working with Rails 3 views this week (seriously. that’s a whole different story), I (sigh) headed into the production logs, which I was happy to discover actually broke out execution times by partial. But there seemed to be an annoyingly inconsistent blip every time we called this action, where one of the partials (which varied from action to action) would have would have something like 250-300 ms allocated to it.

I casually mentioned this annoyance to Mr. Awesome (aka Jordan), who speculated that it could have something to do with garbage collection. I’d heard of Ruby GC issues from time to time in the past, but never paid them much mind since I assumed that, since we were already using Ruby Enterprise Edition, the defaults would likely be fine enough. But given my lack of other options, I decided to start the investigation. That’s when I discovered this document from those otherworldly documenters at Phusion, describing the memory settings that “Twitter uses” (quoted because it is probably years old) to run their app. Running low on alternatives, I gave it a shot. Here were our results:

304 Twitter mem settings

New average time: 280ms. Now that is change we can believe in! Compared with the default REE, we were running more than 40% faster, practically back to our 2.2 levels. (Bored of reading this post and want to go implement these changes immediately yourself? This is a great blog describing how. Thanks, random internet dude with broken commenting system!)

That was good. But it inspired Jordan to start tracking our garbage collection time from directly within New Relic (I don’t have a link to help ya there, but Google it — I assure you it’s possible, and awesome), and we discovered that even with these changes, we were still spending a good 25-30% of our time collecting garbage in our app (an improvement over the 50-60% from when we initially launched, but still). I wondered if we could get rid of GC altogether by pushing our garbage collection to happen between requests rather than during them?

Because every director will tell you that audiences love a good cliffhanger, I’ll leave that question for the reader to consider. Hint: after exploring the possibility, our action is now faster in Rails 3 than it had been in Rails 2. It’s all about the garbage, baby.

23 Replies to “Rails 3 Performance: Abysmal to Good to Great”

  1. Unicorn has support for performing a GC flush between requests, though be aware that this means a given worker is going to be out of the action for a longer period between requests, and deferring GC runs means the total heap size grows more quickly than it would with intra-request flushes.

    I highly recommend looking at Ruby implementations with better GC performance like Rubinius or JRuby and seeing if one of those might meet your needs.

  2. That’s great info Duncan, thanks much! For us, we have plenty of open Passenger processes at any given time, so it makes sense for us to use some of those open cycles to garbage collect so that when the actual request comes in it can be handled more expeditiously. But obviously if we were running our entire application off one app server, it would probably make more sense to let GC happen naturally.

    That’s great info re: unicorn.

  3. Did you guys do any analysis of the request with perf tools or something akin? It’d be great to understand what was increasing the response time or causing the GC to become more active in Rails 3 vs pre Rails 3.

  4. @tim: We haven’t had the time to analyze that. My hunch is that Rails 3 main jam was to become more de-couped and abstracted, and that usually comes at the cost of more objects. Jordan had used scrap (http://www.coffeepowered.net/2009/03/24/announcing-scrap/) to figure out where all of the memory was coming from in Rails 3…the vast majority of it (I think something ridiculous like 500k allocations per request) was strings. Then it scales down to other stuff.

    Since our performance had been “good enough” with Rails 2.2 I never bothered to analyze how GC may have been affecting our performance prior to migration. I would certainly be curious. But now, with our New Relic mixin, we can see specifically how much time for each action is being spent on Garbage Collection, which, after the memory improvements and before the per-request collection, was still usually the single largest time user in any given action. If your Rails 3 performance isn’t where you want it to be and you’d like to get to the bottom of “why,” I’d highly recommend the New Relic GC tracking and the use of that Scrap middleware.

  5. so rails 3, with GC tuned, is still slower than rails 2 without the GC tuned?

    what if you do the GC tuning in the rails 2 version?

    i bet that improves the rails 2 performance significantly as well.

  6. Hey Chris,

    There’s no denying that Rails 3 is slower. But if, after GC tuning, we are only down 10%, it feels like a worthwhile trade-off for all the extra niceties Rails 3 affords.

    Of course, if the Rails guys were to put a premium on framework performance in one of these next releases, I can’t say that I’d exactly be opposed to that…! (Given that “beauty” and “performance” are often enemies of one another, and we know whose side the Rails/Ruby purists are on, I’m not holding my breath)

  7. @ken: That sounds super useful, thanks for sharing! Before discovering the GC slowness, we experimented with moving our application to 1.9.2 and found the performance identical to REE (with neither using smart about GC settings). I have to think that 1.9.2 has the potential to be faster than REE under the right circumstances though, since I thought that was one of the main selling points of Ruby 1.9?

    @wes: Hey Wes, great question. It’s sort of a two pronged answer. On one hand, there’s the new coolnesses like the query chaining (Item.where(“price > 50”).order(“created_at DESC”) grows on you quickly) and the built in support for engines (something we use extensively given the size of our app). And of course Bundler is a godsend. But the other, probably more important reason that upgrading made sense for us is that it’s going to be harder and harder to find plugins that are 2.x compatible; given how much of our app depends on gems and plugins, we didn’t want to be in a position where more than half the gems we wanted to use wouldn’t even work with our app. Right now, it feels like about 25% of the gems will only work in Rails 3…by the end of the year that will probably be more like 75%…so we figured it made sense to upgrade more sooner than later.

  8. Which version of Rails are you using? I think they made some really big changes to Arel in 3.0.3 which should cut down on the number of objects dramatically. Are you already on 3.0.3?

  9. Hey Bill,

    Nice writeup.

    How were you able to get around the problem of New Relic not reporting view numbers? I am facing a similar issue with out app which we migrated from Rails 2.3.8 to Rails 3.

  10. Hey Jatinder,

    We have the following code in our config/initializers directory as a mixin:

    # NewRelic agent version 2.13.5.beta4 no longer is able to trace views on Rails 3,
    # this just adds back the methodology used to trace views on pre-3 versions,
    # pending some massive update from NewRelic to presumably support Rails3’s
    # new instrumentation hooks
    ActionView::Template.class_eval do
    add_method_tracer :render, ‘View/#{virtual_path[%r{^(/.*/)?(.*)$},2]}/Rendering’

  11. I ran into this same issue upgrading an app from Rails2 to Rails3. Was going crazy trying to figure out what it was — thanks for this blog post, using REE and GC tuning fixed it.

    But here’s the thing… I was wondering, the date on your post is only a couple months ago, Rails has been out forever…. nobody else noticed this until now?

    Out of curiosity, I ran my app under rails 3.0.0 instead of the latest 3.0.8… and there was no slowness problem even without using REE at all. (I never used to use REE in my rails2 app, I was getting acceptable performance without it and didn’t need another thing to deal with).

    Looks to me like something has crept into Rails recently to cause this.

  12. Great article. We’re doing GC tuning and so far so good.

    However, we’re running into trouble with between-request GC. I separately tried both your patch to Passenger and the code provided in this StackOverflow question. In each case, GC.start appears to load the world (do a bunch of `SQL (0.0ms) SHOW TABLES` and other stuff like when the app first starts up).

    If we had a bunch of free app instances, this would be OK. But we don’t, so this means when an instance does between-request GC, it takes a lot longer than an inline GC, and our response times go down.

    Have you seen this? Do you know of a way to prevent GC.start from loading the world, or is it required?

  13. Hey Raphael,

    I haven’t seen that specific behavior. But I also haven’t paid that close of attention. I was paying attention primarily to the response times reported by New Relic. I guess I’ve always figured that server power is cheap (at least, compared to programmer power).

  14. Ok, so I was a little confused.

    The method in this SO question was actually causing app instances to die after the finished serving every request (and then passenger would spawn another). Obviously not the way to achieve between-request GC.

    Using apachebench (ab) I’m not seeing any significant change in response times using between-request GC. Did you use a benchmarking tool for testing? Or just throw it on production and see what happened? 🙂

    BTW, 5 days post GC-tuning, and our response times are down by 50%!


Leave a Reply

Your email address will not be published. Required fields are marked *