Rails Performance: A Brief History of the Universe

After much ado, the time has finally come to get Bonanzle moved to real servers, with Capistrano and all the tricks the big boy Rails apps use. But as I started to look into the logistics of this, I ran into a quick interview with Alex Payne regarding Rails performance on Twitter that made me raise an eyebrow. A couple minutes later, I came to the realization that this “quick interview” actually represents the center of an impassioned web of controversy that has enveloped the Rails community regarding the scalability of Rails.

On one side of the argument are DHH and the programming purists, whose solution to the performance problem is to say any number of things other than, “yes, performance is an important issue to us, and we intend to make it better.” On the other side of the argument are people like Rob Conery and Joel Spolsky, who write that maybe RoR developers should give some weight to these performance concerns, as they seem to be the only argument of substance against mainstream RoR love.

Judging by the responses to Rob’s article, methinks there is still quite a ways to go before the community will come to acknowledge the problem at hand. What makes me think it’s a problem? Um, how about objective data from reliable sources? Ruby is slow. Rails is slow and getting slower as time goes on. To be fair, I tend to agree with Joel that Rails can never be fast until Ruby is, so this isn’t all on DHH, as some have implied. At the same time, when some crazy Japanese guy (developer of YARV) with this web page is the supposed prophet of Ruby optimization, I am concerned. Even if the guy is crazy like a fox, the extent of what he will divulge about the future of YARV is that “YARV development is too HOT!” Better HOT! than cold, but something like an expected date of arrival would be a comforting piece of data to add to the development page.

Other data bits I’ve seen around:

* Some suggest JRuby is the answer to Ruby’s problems. But by all accounts, it isn’t the solution to Ruby’s performance problems. At least, not yet.

* A common argument I see being used to dilute the significance of performance is that “performance isn’t scalability.” What people mean when they say this is that poor performance doesn’t always mean you’re unequivocally screwed. And to the extent that you have developers that like to spend their time setting up caching and multi-database transactions, I suppose that is true. But as a programmer who knows a lot of other programmers, I can pretty confidently say that “how to implement a load balancer” is not the sort of problem that many programmers relish facing. Esoteric details about how to implement Pound or memcached or HTTP connections between multiple Mongrel processes and an Apache frontend: this is the punishment that Ruby on Rails developers bear for the joy that they experience programming their apps.

* Another common “solution” to the problem I see is to use something similar to RoR that makes an effort at better performance. Entries from this side of the arena include Grails and Merb. The question I’d be asking myself if I were a Rails evangelist who wanted to see the framework thrive is, “why do these similar frameworks exist?” Partly because people are comfortable with legacy languages (in the case of Grails), but more so because there are many practical people who love RoR as a framework, but can’t stomach the performance trade-offs that go along with it.

I suspect that many Rails evangelists may be quick to point out that if I don’t like it, I can go to hell or fix it myself. And if I won’t fix it, I am in no position to complain, blah blah blah. Whatever. The first step is admitting we have a problem. If there is any blemish upon Rails’ name, this is the one. Can we agree on that?

Delete DOS Directories Recursively

Windows XP doesn’t include my old standby “deltree,” and there is no combination of options that “del” can take to delete hidden directories recursively (yeah, I’m talkin bout you, .svn directories!)

After some rooting around the web and two tablespoons of experimentation, I’ve finally come up with the following Windows PowerShell command that can recursively delete hidden directories and their contents:

get-childitem . -include .svn -force -recurse | foreach ($_) {remove-item -force $_.fullname}

The first “.” after the get-childitem is the base directory you want to start in. The parameter after “-include” is the pattern you want to operate on. In my case, the wretched Subversion directories (.svn).

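Unfortunately, this script still prompts me for each directory I want to delete, because remove-item asks for confirmation before deleting a directory that has children. Adding -recurse to the remove-item call should suppress the prompt.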
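
For anyone who already has Ruby installed, the same cleanup can be sketched without PowerShell. This is only a rough equivalent, not a translation of the command above: remove_dirs_named is a made-up helper name, and it assumes you really do want every matching directory gone, contents and all, with no prompt.

```ruby
require "find"
require "fileutils"

# Hypothetical helper: remove every directory called `name` beneath `root`,
# including hidden ones, and return the paths that were removed.
def remove_dirs_named(root, name = ".svn")
  removed = []
  Find.find(root) do |path|
    next unless File.directory?(path) && File.basename(path) == name
    FileUtils.rm_rf(path) # deletes the directory and everything inside it
    removed << path
    Find.prune            # nothing left to descend into
  end
  removed
end

# Example call (left commented out, since it deletes things):
# remove_dirs_named(".")
```

The returned list is handy for a quick sanity check of what was deleted before you commit to running it on a big tree.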

Assorted Plugs

Today begins week two of my “un-“, er, “self-“, employment. Week #1 was spent in large part taking another pass at my processes to see what further enhancements I could make to the way I get stuff done. Here’s the cheat sheet of my favorites.

TODO List: Who knows how many different incarnations I’ve had of TODO lists? The one I’ve kept going back to is just a running document that sits in my Gmail “Drafts” folder. But I think I’ve finally found a piece of software that can beat the elegance of nothing: rememberthemilk.com. I found the site after Googling “best todo software” and finding a web poll where people rated about 10 different pieces of online TODO software. RTM beat the others by a factor of many. After using it, I can see why. Its interface is dirt-simple, but it does everything I hoped it would, without bells and whistles getting in the way. Which is not to say it doesn’t also have bells and whistles: it took about 2 minutes to set up my Google Calendar to show which TODO items I have scheduled on which day. I consider RTM a must-have for any person who makes their own schedule.

Park on a Downhill Slope: The idea, which I read on Lifehacker (Confession in order: I got almost all these ideas from Lifehacker), says that the last thing you do before you leave should be to leave a folder on your desktop that describes in specifics what your first task will be for the next day. Then, when you get in, you’ll get straight to work and set up your momentum for the rest of the day. This goes hand-in-hand with a similar idea: don’t check your email for the first hour of the day (until you finish your first task). It sets the stage for distraction and not getting tasks done.

Close Thine Email: I’ve heard this recommendation countless times, but have finally started taking it. What I’ve noticed is that even though having Gmail open doesn’t seem like a big distraction (it doesn’t pop up or make noises when I receive an email, like some of my friends’ Outlook configurations do), it serves as an escape valve that my brain uses whenever hard problems arise. My tendency had been to double-check whether there might be any email to read instead of solving said problem. Bad, brain, bad!

Ban You from Your Most Visited Sites: Another Lifehacker special: here’s a script that you can use along with GreaseMonkey to keep yourself from visiting certain, customizable, web sites at customizable times of day. I scoffed at this one when I first saw it (isn’t that what self-discipline is for?), but it is handy to have the computer be strong for me when I am weak.

Time > Money: I came up with this one myself — when one is a programmer, and one has a lot to get done in not a lot of time, many hardware expenses, most software expenses, and pretty much all book expenses are worth the cost if they can make you work faster. When you figure that the average Rails contractor makes $75-$150 an hour, if there is a book that could cumulatively save you one hour, it is worth whatever it costs. It was this same stream of logic that emboldened me to get my fancy new Quad Core machine last week, which is not only going to minimize my idle time, but improve morale when I get more done with the same amount of brain power.

One Liner

This comes from an anonymous programmer applicant, who had the following assessment of my last posting about working at a startup:

“I had in fact already read your blog on startup environment. I have not experienced a work setting of precisely this nature, except possibly as an EMT on scene.”

Must Love Chaos and Compromise

Progress continues to lurch forward in fits and starts as we settle on the personnel configuration to lead us to launch. Having watched a handful of similar revelations occur to many of our previous team members, it has dawned on me that the same factors that make Bonanzle so exhilarating for me are the factors that cause others to turn tail and head for the highway. My conclusion is that, until you’ve experienced the atmosphere before, it is easy to over- or underestimate how difficult it is to be a part of creating something big from scratch. As usual, Paul Graham has insightful observations on the topic, but reading his poetic account of “hard work” makes it sound more romantic than I think it is. For my time, and the time of our future potential applicants, I think it is vital to accurately describe the most important differences between the startup and non-startup company.

I don’t think that work at a startup is most accurately described as “harder” than work at a large company. One of the “hardest” jobs I ever had was keeping my brain busy while I did nothing for 8 hours a day as a web programmer at the University Bookstore. A better point of comparison between small and large company is the degree of chaos and compromise you experience on a daily basis.

Specifically, here are the five biggest contrasts that I think startle people who haven’t been immersed in a startup before:

1. The roadmap is drawn as you go. Well, technically, the roadmap is drawn at the beginning, but the more time gets spent drawing that original roadmap, the more time was wasted when that everything-you-know-is-wrong moment happens. Startups are about doing, not speculating.
2. Despite the best intentions, things will be broken. Sometimes with no easy solutions. And it will take creativity to work around it.
3. You are beholden to deadlines. No matter what excellent new service pack is available; no matter what important features from the next milestone one would rather work on. Of course, sometimes that excellent service pack absolutely does need to be installed, so you have to figure out the relative degree of necessity.
4. You are beholden to deadlines. Items only get checked off the schedule if each and every team member is 100% productive with their time. Working at Microsoft it might well be weeks before somebody notices you’ve been spinning your wheels over a certain problem. At a startup, spinning your wheels for 3 days will show up on the schedule.
5. You are your manager. And you are your everything else. Even in a company that attempts to create specialized roles, there usually isn’t time to send an email to the manager to get a task clarified, then get ahold of your web designer to create HTML, before finally working on the original bug that had been assigned to you. Instead, each person must often use their confidence (and common sense) to guide them to a sensible solution when a task has not been well-defined (see also item #1).

Look down the list, and there it is: compromise, chaos, compromise, chaos, chaos. Is it intrinsically harder to deal with chaos and compromise than a lack thereof? I doubt it. But it does take a special personality to have the confidence, patience, and foresight to see that the decisions made on an everyday basis might seem like chaos, yet when the dust settles, suddenly something amazing stands where moments ago there was nothing.

That is the payoff that awaits those with the grit to make something big happen.

Oh my gosh that’s kind of evil

Who would have thought? Some 5 years after Google emerged in the public consciousness, they have done their first kind-of-evil thing. When I logged into my Gmail this morning, there was a big, red stripe across the top of my Gmail account warning me that if I didn’t disable Firebug (web debugging software used by 99% of all web developers, 100% of all web developers in their right mind), then there could be severe performance penalties while I used Gmail.

I will admit that I was pretty disappointed that Google chose to try to make users change their habits, rather than taking the extra time (and yes, cost) to get Gmail to work with a tool so common amongst web developers. But at the same time, it made me re-realize just how seldom Google chooses to take the low road like this. In the pre-Google days, it was routine for applications to make me reboot my computer to install them. Or for applications to not include an uninstaller. And include spyware. And pester me with waits to use them if I didn’t pay.

Of course, many applications still do these things, and I don’t think that Google alone has turned the tide away from these annoying practices, but it has certainly raised public awareness that “don’t be evil” is a viable business strategy that can create both adoring users and a profitable bottom line. Katy recently pointed out that it should only be a matter of time until Starbucks’ insistence on charging users $6.00/hour to use the Internet comes back to hurt them. I hope she’s right. There is no need for these kinds of business practices, when righteous users now know to stand up to them.

Manually Set Rails Migration Version

Here’s another piece of info that wasn’t easy to discover using the Google query I expected. If you should ever find yourself needing to explicitly set the migration version that Rails thinks you are at, perform the following steps:

1. Open your database (using MySQL, this should be something like “mysql -h localhost -u root -p”, followed by “use [database name];” in the MySQL console)

2. Run “UPDATE schema_info SET version=[version number];”

And you’re done!
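
If you end up doing this more than once, the statement from step 2 can be generated from Ruby rather than typed by hand. This is a minimal sketch, not anything Rails provides: schema_version_sql is a made-up helper name, and it assumes the table is still called schema_info as above.

```ruby
# Hypothetical helper: build the UPDATE statement from step 2.
# Integer() raises ArgumentError for non-numeric input such as "abc",
# so a stray string never ends up inside the SQL.
def schema_version_sql(version)
  "UPDATE schema_info SET version = #{Integer(version)};"
end

puts schema_version_sql(42)
```

Paste the printed line into the MySQL console as in step 2.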

All Coming Together

You can look it up on www.williambharding.com — there are few things that I love more than seeing a plan hatch, and that’s just what I’m seeing as this site makes some big steps toward being the most usable site of its kind for its users. We began our first round of seeking seed investment this week, and so far, so great. People seem to quickly comprehend how this will be a big step forward in shopping, and the fact that I have a team that personifies awesome certainly makes for an easier sale.

Once we finish acquiring our seed funding, things get really fun. The old adage of “time is money” inverts to state “money is time,” and my hope with seeking our first group of investors is that the money we can obtain will buy us many months of time compared to the means that I’ve used to get the site to where it is now. I’ve been fortunate to get significant discounts on basically everything we’ve purchased so far (web design about half price, programming costs ridiculously low for the quality) by getting people excited to be part of the idea, but goodness, having a bit of capital to work with is going to open a lot of exciting new doors.

There’s still a hell of a lot to do, but if the team matters as much as the product, we’re golden.

Automatically Reload Modules On Change In Rails 2.0.x

This problem had me stumped longer than any other single problem I’ve run into so far, and no group or search query I could come up with was able to help me; ultimately I had to figure this out by diving into the actual Rails source code and digging until I found the right concoction. Perhaps this solution was not more findable because we’re running edge Rails? All data I could find on the forums seemed to say that Rails 1.2 did not have these sorts of problems, but I can neither confirm nor refute that.

The problem I’d been having was that I had created a Rails Module in a file that was in a subdirectory of my /lib directory. The point of this Module was to allow me to share some methods between a select few Rails controllers without putting the methods into the global Application.rb (which seems sloppy to me…methods should only be available in the scope they’re needed says I). Actually, figuring out that I could create a Module in the lib directory to DRY up my controllers was also surprisingly difficult to find Google information on… you’d think people would object to putting stuff in Application.rb more often. Anyway, I digress.
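
For the Googlers who land here looking for the Module trick itself, the shape of it is roughly this. It is a sketch rather than my actual code: ItemSetup, clean_title, and the controller names are invented for illustration, and plain Ruby classes stand in for Rails controllers so it runs on its own.

```ruby
# lib/item_setup/item_setup.rb (hypothetical path and module name)
module ItemSetup
  # Shared helper: tidy up a raw title before using it.
  def clean_title(raw)
    raw.to_s.strip.squeeze(" ")
  end
end

# Stand-ins for the select few controllers that need the shared methods.
class ItemsController
  include ItemSetup
end

class DraftsController
  include ItemSetup
end

# A controller that never includes the module never sees its methods.
class AccountsController
end
```

Only the classes that include the module get clean_title, which is the whole point of keeping it out of the global application controller.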

This Module in the lib subdirectory had one annoying problem: when I made changes to it, I had to stop and re-start my Rails server for those changes to take effect. That got old fast. After the aforementioned digging, here were the TWO LINES OF CODE that took me way too many hours to figure out.

In environment.rb: Add the path for the lib subdirectory where my module was stored, e.g.

config.load_paths += %W( #{RAILS_ROOT}/lib/item_setup )

There is a commented line in environment.rb that has an example of the syntax for this.

In development.rb: Add the name of my Module to the explicitly_unloadable_constants list, e.g.,

Dependencies.explicitly_unloadable_constants = ['NameOfMyModule']

And voila! It works. If you have other modules you want to automatically reload, use the << operator to add them to the explicitly_unloadable_constants array, e.g.,

Dependencies.explicitly_unloadable_constants << 'NameOfAnotherModule'
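
One caveat about <<, shown in plain Ruby with no Rails involved: it only adds a second entry when the list is an Array. If the setting holds a bare String, << glues the two names together instead. The add_unloadable helper below is made up purely to show the difference.

```ruby
# Hypothetical helper: append a constant name the same way the line above does.
def add_unloadable(list, name)
  list << name
end

# With an Array, << adds a new element.
as_array = add_unloadable(['NameOfMyModule'], 'NameOfAnotherModule')

# With a String, << appends characters to that same string.
as_string = add_unloadable('NameOfMyModule'.dup, 'NameOfAnotherModule')

puts as_array.inspect
puts as_string.inspect
```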

The other possible problem that I read about on the forums that didn’t come about for me was that if my lib directory had somehow gotten listed in Dependencies.load_once_paths, then it would not reload on being changed. You can Google around to figure out how to fix that (hint: delete your directory from the load_once_paths array), but it probably won’t be an issue.
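
Since I never hit that case myself, here is only the general idea of the hint, using an ordinary array as a stand-in for load_once_paths. The paths and the allow_reloading helper are made up; in a real app you would apply the same Array#delete to the actual list in your development config.

```ruby
# Hypothetical helper: drop one directory from a list of load-once paths.
# Array#delete removes every matching entry in place.
def allow_reloading(load_once_paths, dir)
  load_once_paths.delete(dir)
  load_once_paths
end

# Stand-in for the real list; these paths are invented.
paths = [
  "/app/vendor/plugins/some_plugin/lib",
  "/app/lib/item_setup"
]

puts allow_reloading(paths, "/app/lib/item_setup").inspect
```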