Copy Ruby Gems Between Ruby Installations/Computers/Systems

Surprisingly, this isn't better documented on Google. Here's how I do it.

Make a list of your existing gems in a text file (run from machine with gems already installed):

gem list | tail -n+4 | awk '{print $1}' > gemlist

Copy the file to your new system. Make sure your new system has the packages installed that will be needed to build your gems (such as ruby1.8, ruby1.8-dev, rubygems1.8). Then run:

cat gemlist | xargs sudo gem install

This runs through each line of your gemlist file and runs sudo gem install on it. If you don't sudo install your gems, you can drop the sudo.

Or if you want to go for the gold with a single command line:

ssh -o 'StrictHostKeyChecking=no' #{gem_server} #{path_to_gem} list | tail -n+1 | awk '{print $1 $2}' | sed 's/(/ --version /' | sed 's/)//' | tail -n+3 | xargs -n 3 #{path_to_gem} install

After installing REE 1.8.7, I used a slight permutation of this to install my gems with Ruby 1.8.7 from their previous Ruby 1.8.6 installation (previous gem install was accessible as “gem”, REE 1.8.7 gem accessible as “/opt/ruby-enterprise-1.8.7-20090928/bin/gem”):

gem list | tail -n+1 | awk '{print $1 $2}' | sed 's/(/ --version /' | sed 's/)//' | tail -n+3 | xargs -n 3 /opt/ruby-enterprise-1.8.7-20090928/bin/gem install

That reinstalled all my existing gems with the REE 1.8.7 gem manager.

New Relic Apdex: The Best Reason So Far to Use New Relic

Since we first signed up with New Relic about six months ago, they’ve impressed me with the constant stream of features that they have added to their software on a monthly basis. When we first signed up, they were a pretty vanilla monitoring solution, and impressed me little more than FiveRuns had previously. They basically let you see longest running actions sorted by average time consumed, and they let you see throughput, but beyond that, there was little reason to get excited at the time.

Since then, they’ve been heaping on great additions. First, they added a new view (requested by yours truly, amongst others) that let actions be sorted not just by the average time taken, but by the arguably more important “Time taken * times called,” which tends to give a better bang-per-buck idea of where optimization time should be spent.

They've also been rearranging which features are available at which levels, which has made the "Silver" level a much more tempting proposition, with both the "Controller Summary" (described in the last paragraph) and "Transaction Traces," which lets you see which specific database calls are taking longest to complete.

But by far my favorite New Relic feature added is their brand new “Apdex” feature. If you’re a busy web programmer or operator, the last thing you want to do is spend time creating subjective criteria to prioritize which parts of your application should be optimized first. You also don’t want to spend time determining when, exactly, an action has become slow enough that it warrants optimization time. Apdex provides a terrific way to answer both of these prickly, subjective questions, and it does it in typical New Relic fashion — with a very coherent and readable graphical interface.

I've included some screenshots of the Apdex for one of our slower actions at right. These show (from top to bottom): the actions in our application, ordered from most to least "dissatisfying"; the performance breakdown of one of our more dissatisfying actions; and the degree to which this action has been dissatisfying today, broken down by hour and put onto a color-coded scale that ranges from "Excellent" (not dissatisfying) down to "Poor." Apdex measures "dissatisfaction" as a combination of the number of times that a controller action has been "tolerable" (takes 2-8 seconds to complete) and "frustrating" (takes more than 8 seconds to complete).
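
For the curious, the Apdex standard that underlies this boils those buckets down to a single score between 0 and 1; here's a quick Ruby illustration (the request counts are invented):

# Standard Apdex formula, with New Relic's T = 2s: "satisfied" <= 2s,
# "tolerating" 2-8s, "frustrated" > 8s
def apdex(satisfied, tolerating, frustrated)
  total = satisfied + tolerating + frustrated
  (satisfied + tolerating / 2.0) / total
end

apdex(900, 80, 20) # => 0.94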

New Relic is to be commended for tackling an extremely subjective problem (when and where to optimize) and creating a very sensible, objective framework through which to filter that decision. Bravo, guys. Now, hopefully after Railsconf they can spend a couple hours running Apdex on their Apdex window, since the rendering time for the window generally falls into their “dissatisfaction” range (greater than 8 seconds) 🙂

But I’m more than willing to cut them some slack for an addition this useful (and this new).

Rails Ajax Image Uploading Made Simple with jQuery

Last week, as part of getting Bloggity rolling with the key features of WordPress, I realized that we needed to allow the user to upload images without doing a page reload. Expecting that a task as ordinary as this would be well covered by Google, I dutifully set out in search of "rails ajax uploading" and found a bunch of pages that either provided code that simply didn't work or claimed that it couldn't be done without a Rails plugin.

Not so, provided you use jQuery and the jQuery Form plugin.

The main challenge in getting AJAX uploading working is that the standard remote_form_for doesn't understand multipart form submission, so it won't send the file data Rails expects along with the AJAX request. That's where the jQuery Form plugin comes into play. Here's the Rails code for it:

<% remote_form_for(:image_form, :url => { :controller => "blogs", :action => :create_asset }, :html => { :method => :post, :id => 'uploadForm', :multipart => true }) do |f| %>
 Upload a file: <%= f.file_field :uploaded_data %>
<% end %>
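
For completeness, both libraries need to be loaded in your layout; a minimal sketch, assuming you've saved them under public/javascripts with these filenames:

<%= javascript_include_tag 'jquery', 'jquery.form' %>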

Here’s the associated Javascript:

$('#uploadForm input').change(function(){
  $(this).parent().ajaxSubmit({
    beforeSubmit: function(a, f, o) {
      o.dataType = 'json';
    },
    complete: function(XMLHttpRequest, textStatus) {
      // XMLHttpRequest.responseText will contain the URL of the uploaded image.
      // Put it in an image element you create, or do with it what you will.
      // For example, if you have an image element with id "my_image", then
      //   $('#my_image').attr('src', XMLHttpRequest.responseText);
      // will set that image tag to display the uploaded image.
    }
  });
});

And here’s the Rails controller action, pretty vanilla:

def create_asset
  @image = Image.new(params[:image_form])
  @image.save
  render :text => @image.public_filename
end
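
For reference, here's a minimal sketch of what the attachment_fu-backed Image model might look like; the options are illustrative, not lifted from Bloggity:

class Image < ActiveRecord::Base
  has_attachment :content_type => :image,
                 :storage      => :file_system,
                 :max_size     => 5.megabytes
  validates_as_attachment
end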

As you can see, all quite straightforward with the help of jQuery. I’ve been using this for the past few weeks with Bloggity, and it’s worked like a champ.

Me No Blog Hella Ugly!

Welcome to the 2000’s, self!

I’m ever so excited to be blogging at a blog that not only understands code highlighting, but doesn’t look like it was crafted by a mad scientist with cataracts in 1992. Now it looks more like it was crafted by a mad scientist without cataracts circa 2008 — which is an entirely more accurate representation of the truth.

That’s the good news.

The bad news? That I don't have anything meaningful to report in this post.

Maybe I’ll just write some highlighted code instead.

# ---------------------------------------------------------------------------
# options[:except_list]: list of symbols that we will exclude from this copy
# options[:dont_overwrite]: if true, all attributes in to_model that aren't #blank? will be preserved
# options[:save]: if true, to_model is saved after the copy
def self.copy_attributes_between_models(from_model, to_model, options = {})
  return unless from_model && to_model
  except_list = options[:except_list] || []
  except_list << :id
  to_model.attributes.each do |attr, val|
    to_model[attr] = from_model[attr] unless except_list.index(attr.to_sym) || (options[:dont_overwrite] && !to_model[attr].blank?)
  end
  to_model.save if options[:save]
  to_model
end
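
A quick usage sketch (the models and field are hypothetical, and I'm assuming the method lives on some utility class):

Util.copy_attributes_between_models(old_user, new_user, :except_list => [:email], :save => true)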

Hey hey hey code, you're looking quite sexy this evening -- you come around here often?

Rails Blog Plugin Bloggity v. 0.5 – Now Available for Consumption

Made another pass at incorporating my newer changes to bloggity this evening. Now in the trunk:

  • FCKEditor used to write blog posts (=WYSIWYG, WordPress-like text area)
  • Images can be uploaded (via AJAX) while creating a blog post. You can then link to them via the aforementioned FCKEditor
  • Added scaffolding for blog categories, and allowing categories to have a “group_id” specified, so you could maintain different sets of blogs on your site (i.e., main blog, CEO blog, user blogs, etc. Each would draw from categories that had a different group_id)
  • Blog comments can be edited by commenter
  • Blog commenting can be locked
  • Blog comments can be deleted by blog writer

With new features come new dependencies, but most of these are hopefully common enough that you’ll already have them:

  • attachment_fu (if you want to save images)
  • jQuery and the jQuery Form plugin (if you want to upload images via AJAX; the jQuery Form plugin is bundled in the bloggity source code)
  • FCKEditor (if you want to use a WYSIWYG editor)

If you’re already running bloggity, you can update your DB tables by running the migration under /vendor/plugins/bloggity/db/migrations. If not, you can just follow the instructions in the previous bloggity post and you should be good to go.

I’m hoping in the next week to do some more testing of these new features and add a README to the repository, but it’s too late for such niceties this evening.

P.S. Allow me to pre-emptively answer why it’s in Google’s SVN instead of Github.

Rails: Fix Slow Loads in Development When Images Are Missing

I have found it useful to populate my local development database with data from our production server in order to be able to get good test coverage. However, a perpetual problem I’ve had with this approach is that it introduces an environment where sometimes images are available and sometimes they aren’t (the database knows about all the images, but some were uploaded locally, some reside on our main servers, and some are on S3).

What I’ve found is that even though Rails doesn’t give exceptions when it finds missing images, it does start to get painfully slow. Each missing image it has to process usually takes about 2 seconds. On pages with 5-10 missing images, the wait could be quite painful.

So I finally got fed up yesterday and wrote a hacky patch to get around this problem. Here it is:

# Returns image_location untouched if it's a remote URL or exists locally;
# otherwise falls back to a placeholder image.
def self.force_image_exists(image_location)
  default_image = "/images/dumpster.gif"
  if image_location && (image_location.index("http") || File.exists?(RAILS_ROOT + "/public" + image_location.gsub(/\?.*/, '')))
    image_location
  else
    default_image
  end
end

This function is part of a utility class (named “UtilityGeneral”) that we use for various miscellaneous tasks. I call this method from a simple mixin:

if RAILS_ENV == 'development'
  module ActionView
    module Helpers #:nodoc:
      module AssetTagHelper
        # Replace the standard image path lookup with our existence check
        def path_to_image(source)
          original_path = ImageTag.new(self, @controller, source).public_path
          UtilityGeneral.force_image_exists(original_path)
        end
      end
    end
  end
end

If anyone else works locally with images that may or may not exist, this wee patch should come in handy to save you from load times of doom on pages that are missing images. It just subs in an alternate image when the real image doesn’t exist locally.
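
With the patch in place, an image_tag for a file that only exists on our production servers silently falls back to the placeholder (the filename here is made up):

image_tag("/photos/only_on_production.jpg") # src becomes /images/dumpster.gif locally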

P.S. When I grow up, I want a blog about coding that lets me paste code.
P.P.S. 4/10: I grew up!

Monitor Phusion Passenger Memory Usage

We are on the cusp of having Passenger running, but I am paranoid, based on our Mongrel experiences, that Passenger instances will leak memory up the wazoo and eventually exhaust our system resources. With Mongrel, we've used monit to ensure that memory usage stays in check for each Mongrel, but I hadn't yet found a straightforward way to do the same with Passenger. So I'm improvising:

kill $(passenger-memory-stats | grep '[56789]..\.. MB.*Rails' | awk '{ print $1 }')

This single line (run via crontab) ought to do what our thousand line monit config file used to do: kill off Rails processes that exceed 500 MB. From my testing so far, it seems to do the trick.
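
A minimal crontab entry for it might look like the following; note that passenger-memory-stats usually needs its absolute path spelled out, since cron's PATH is spartan:

*/5 * * * * kill $(passenger-memory-stats | grep '[56789]..\.. MB.*Rails' | awk '{ print $1 }')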

I have verified that it does indeed kill one or multiple Rails processes started by Passenger if their memory usage is reported as being a three digit number that starts with 5-9. Obviously if a Rails instance were able to jump past the 500-999 MB range in less time than the frequency of our cron task, that would be a problem.
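
If that edge case worries you, a numeric comparison in awk sidesteps the digit-pattern issue; this sketch assumes, as the grep above does, that the PID is the first column and the VMSize number the second:

passenger-memory-stats | awk '/Rails/ && $2 + 0 > 500 { print $1 }' | xargs -r kill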

Will report back once I’ve witnessed it at work in the wild.

Update from the wild: Yes, it works.

Nginx “24: Too many open files” error with Rails? Here’s why.

We had been racking our brains on this one for a couple weeks. We have monit looking over our Mongrels, which usually keeps everything on the up and up. But every so often, our server would go bananas and the nginx error log would flood with the message:

939#0: accept() failed (24: Too many open files) while accepting new connection

Usually the problem automatically resolved itself, but last night it didn't. Taking the error at face value, our server guy started looking at the number of open files on our system and the maximum number that could be opened (it's confusing: "ulimit -a" reports one limit while "cat /proc/sys/fs/file-max" reports another. I believe the former is the per-process limit, while the latter is the system-wide cap on file handles, which also covers open IP connections and such). But even after upping the limits and rebooting repeatedly, the problem persisted.

After our server guy (literally) fell asleep on the keyboard around 2 AM, I figured out what had really been happening: any time a new visitor came to our site, we were geocoding their IP with a service that had gone AWOL. About a week earlier I'd noticed a similar slowdown of about 1-2 seconds in actions that created sessions, but I assumed it was the session creation itself causing the slowdown, when in fact it was the geocoding that happened alongside the session creation that was responsible for the lag.

Long story short, when nginx gives this error, what it really seems to mean is that it is holding too many open connections, and usually that is happening because you are using round robin dispatching (bad, I know, but we have our reasons) and one or more of the Mongrels is stuck and forcing the Mongrel queue to skyrocket.

The other lesson here is an obvious one that I’ve read many times before but have been slow to actually act on: making remote API calls without timeouts is asking for trouble. Here is a fine article if you’re interested in solving that problem in your own site before it is your ruin.
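
To make that lesson concrete, here's a minimal sketch of a guarded call; GeoService and the 2-second limit are illustrative, not our actual code:

require 'timeout'

def geocode_ip(ip)
  Timeout.timeout(2) do
    GeoService.lookup(ip) # hypothetical remote geocoding call
  end
rescue Timeout::Error
  nil # a missing geocode beats a request that hangs a Mongrel
end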

Rails Performance Analysis and Monitoring? Free and Easy with pl_analyze

Since writing about our lackluster experience with FiveRuns a while back, I had been keeping my eyes open for a good way to get reliable, comprehensive, and easy-to-interpret performance metrics for Bonanzle. Prior to our transition away from shared hosting in November, I would browse through our production logs and regularly find actions taking 10 seconds or more, but I had no way to get a good overall sense of which actions were taking the longest, how long they were taking, and how that was changing from day to day.

Now we have a solution, and it’s a dandy.

Many have heard of pl_analyze, but I think that few may realize how truly easy it is to get up and running on any Rails 2.X application (we’re now on 2.2). To start, Geoff has written a strikingly excellent blog about how to change your log format to be compatible with pl_analyze by adding a single file to your lib directory and changing a couple lines in environment.rb.

After you follow his instructions, here are a couple more key points that I found relevant to polish the rough edges:

* By default, when we were running 2.0 (don’t know if it’s the case with 2.2, since we were pre-patched when we upgraded), logging of DB times was disabled. This monkey patch fixes that.

* By default, when we moved to 2.2, the log format changed and broke pl_analyze. This fixes that.

* After enabling the Hodel logger as described by Geoff, our production log became extremely verbose. One of the helpful commenters on Geoff’s blog pointed out the solution to this: add the line “config.logger.level = Logger::INFO” to your production.rb

After that, you’re good to go. Took me less than five minutes to get it set up, after which I was able to get priceless info like the following:

Request Times Summary:               Count   Avg     Std Dev   Min     Max
ALL REQUESTS:                         1576   0.242   0.229     0.000   2.487

ItemsController#show:                  685   0.328   0.141     0.008   0.751
ChatsController#service_user_list:     286   0.022   0.037     0.005   0.301
HomeController#index:                  158   0.256   0.308     0.000   1.461

Well, doesn’t look so pretty on my crap blog (shouldn’t I have upgraded this by now?), but it’ll look like a champ in your terminal window.

After getting our basic setup established a couple weeks ago, I started experimenting today with archiving our daily performance info into the database so I can track how action times are changing from day to day. Geoff provides a great start for this as well, with his pl_analyze Rails plugin, which you can install by running

./script/plugin install http://topfunky.net/svn/plugins/mint

After you’ve installed the plugin, run

rake mint:create_analyze_table

to create your table, and then

rake mint:analyze PRODUCTION_LOG=/path/to/production.log

to actually extract and store the day's data. Be sure to read my comment in Geoff's blog (it's about the 30th comment) for a slight modification you may need to make to the analyze task if you get an error to the effect of "0 parameters were expected, got 1" when you run mint:analyze.

Also note that this will store everything in your production log, so if you haven't already set up log rotation on your production log, you'll want to do that (it can be as simple as this).
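
To keep the archive truly daily, a cron entry along these lines should work; the path and schedule are illustrative, and rake may need an absolute path under cron:

0 1 * * * cd /var/www/myapp && rake mint:analyze PRODUCTION_LOG=log/production.log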

After you’re archiving performance metrics to your database, you are in a performance analysis wonderland — it’s just a matter of applying creativity to data. For starters, I plan to create reports that list the top 10 daily time users (= count of action * average time per action), and the top 10 biggest daily deltas in time taken for various actions. I plan to use the latter to help triangulate where code changes affect performance. It would be trivial to create a model that actually allowed daily notes to be added to a report, so if the items#show action starts running 20% slower one day, I could demarcate the code changes that were made on the day that happened (though our changelist makes such a demarcation unnecessary).
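
That first report could be as simple as this sketch; the model and column names (action, request_count, avg_time, created_on) are guesses at the mint schema, so check them against the table that rake mint:create_analyze_table built for you:

class AnalyzeRecord < ActiveRecord::Base
  set_table_name "analyze_records" # assumed table name
end

AnalyzeRecord.find(:all,
  :conditions => ["created_on = ?", Date.today],
  :order      => "request_count * avg_time DESC",
  :limit      => 10).each do |r|
  puts "#{r.action}: #{(r.request_count * r.avg_time).round} seconds total"
end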

I’m not sure what other tools are being used by the community to keep an eye on performance, but with the quickness and reliability (= stable results reported for weeks) of getting a pl_analyze solution running, it gets high recommends from ’round these parts. I can’t imagine returning to a day where we didn’t know if our average request time was 300 ms, 3000 ms, or 30000 ms (yes, we had some very dark days before our server move).

Rails Phusion Passenger vs. Mongrel Handlers, Anyone Dare Try?

Update: An answer from the heavens descends! On the same day of my post, no less. Gotta love an active framework.

Like most of the Rails community, I've been swept up in the fascination lately over easier deployment via Passenger. After getting 2.2 deployed, I had hoped that Bonanzle could make the leap to Passenger and become yet another success story: a site with multi-million monthly page views proving that the hype behind Passenger is legit.

But then I remembered our Mongrel handlers; specifically, that said handlers currently satisfy about 50% (or more) of the requests for our entire site (mostly through chat and instant message reads + posts). As such, moving to Passenger puts me ill at ease, because I would think that if our Mongrel handlers became standard Rails actions, we would 1) require twice as many Rails instances to run our site, possibly needing more memory with Passenger instead of less, and 2) incur a much greater overhead per request, as Rails ran its routing and other pricey (and in this case, unnecessary) default behaviors on our requests.

Normally I might pose this question to a forum, but given how esoteric the problem is, I suspect such a question would be greeted with crickets. If anyone else has actually tried this, or seen any stories comparing the two, I'd love to hear about them and post them to this blog to make it easier for others to find.

If all else fails, I plan to eventually set up a test case to actually benchmark and get some numbers, but that figures to be at least a 1-2 day ordeal by the time sufficient reliable data could be gathered.