Introduction
Based on the scant number of Google results I get for queries relating to Rails slave databases (and/or the best Rails slave database plugin), I surmise that not many Rails apps grow to the point of needing slave databases. But ours has. So I’ve been evaluating the various choices intermittently over the last week, and have arrived at the following understanding of the current slave DB ecosystem:
Masochism
Credibility: Was the first viable Rails DB plugin, used to rule the roost for Google search results. The first result for “rails slave database” still points to a Masochism-based approach.
Pros: Once-high usage means that it is the best documented of the Rails slave plugins. Seems pretty straightforward to set up initially.
Cons: The author himself has admitted (in comments) that the project has fallen into a bit of a state of disrepair, and apparently it doesn’t play nice with Rails 2.2 and higher. The GitHub page lists multiple monkey patches necessary to get it working. It only appears to work with one slave DB.
master_slave_adapter
Credibility: It’s currently the most watched slave plugin-related project I can find on GitHub (with about 90 followers). Also got mentioned in Ruby Inside a couple of months ago. Has been updated in the last six months.
Pros: Doesn’t use as much monkey patching to reach its goals, therefore theoretically more stable than other solutions as time passes.
Cons: Appears to only handle a connection to one slave DB. I’m not sure how many sites grow to the point of needing a slave DB, but then expect to stop growing such that they won’t need multiple slave DBs in the future. Not us. There’s also less support here than in the other choices for limited use of the slave DB. This one assumes that you’ll want to use the slave for all SELECTs in the entire app, unless you’ve specifically wrapped a query in a block that tells it to use the master.
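To make that "slave for every SELECT unless told otherwise" behavior concrete, here is a minimal plain-Ruby sketch of the routing idea. It is an illustration only: the ReadWriteRouter class and its with_master and connection_for methods are hypothetical names, not master_slave_adapter’s actual API, and the symbols stand in for real connections.

```ruby
# Hypothetical router illustrating the idea; NOT master_slave_adapter's real API.
class ReadWriteRouter
  def initialize(master, slave)
    @master = master
    @slave = slave
    @force_master = false
  end

  # Everything run inside the block goes to the master, reads included.
  def with_master
    previous = @force_master
    @force_master = true
    yield
  ensure
    @force_master = previous
  end

  # Returns the connection that should run the given SQL statement:
  # SELECTs go to the slave, everything else goes to the master.
  def connection_for(sql)
    return @master if @force_master
    sql =~ /\A\s*select\b/i ? @slave : @master
  end
end

router = ReadWriteRouter.new(:master, :slave)
router.connection_for("SELECT * FROM users")         # routed to the slave
router.connection_for("UPDATE users SET name = 'x'") # routed to the master
router.with_master { router.connection_for("SELECT * FROM users") } # forced to the master
```

The catch with this opt-out style is the one noted in the Cons: every read that must see fresh data has to be found and wrapped by hand.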
Db Charmer
Credibility: Used in production by Scribd.com, which has about 4m uniques. Development is ongoing. Builds on acts_as_readonlyable, which has been around quite a while.
Pros: Seems to strike a nice balance between the multiple database capabilities of Seamless Database Pool (SDP) and the lightweight implementation of master_slave_adapter (MSA). Allows one or more slaves to be declared in a given model, or for a model to use a different database entirely (aka db sharding). Doesn’t require any proprietary database.yml changes. Didn’t immediately break anything when I installed it.
Cons: In my first hour of usage, it didn’t work. It seems to route most of its functionality through a method called #switch_connection_to, and that method doesn’t do anything (including raise an error) when I try to call it. It just uses our existing production database rather than a slave. The documentation for this plugin is currently bordering on “non-existent,” although that is not surprising given that the plugin was only released a couple of months ago. I emailed the plugin’s author a week ago to try to get some more details about it and never heard back.
Seamless Database Pool
Credibility: Highest rated DB plugin on the Agile Web Development plugin directory. Has been updated in the last six months.
Pros: More advertised functionality than any other slave plugin, including failover (if one of your slaves stops working, this plugin will try to use other slaves or your master). Documentation is comparatively pretty good amongst the slave DB choices, with rdoc available. Supports multiple slave databases, even allowing weighting of the DBs. And with the exception of Thinking Sphinx, it has “just worked” since dropping it in.
Cons: Tried to index Thinking Sphinx and ran into difficulty since this plugin redefines the connection adapter used in database.yml*. The changes needed to database.yml (which are quite proprietary) make me suspicious that this may also conflict with New Relic (which detects the DB plugin in a similar manner to TS). Would be nice if it provided a way to specify the database on a per-model basis, like Db Charmer. Also, it would inspire more confidence if this had a GitHub project to gauge the number of people using it.
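For readers wondering what weighting and failover amount to in practice, here is a small plain-Ruby sketch of the general technique. It assumes nothing about SDP’s internals: WeightedReadPool, mark_dead, mark_alive, and read_connection are hypothetical names made up for this example, and the strings stand in for real connections.

```ruby
# Hypothetical pool illustrating weighting plus failover;
# NOT Seamless Database Pool's real classes or configuration.
class WeightedReadPool
  def initialize(master, weighted_slaves)
    @master = master
    @weights = weighted_slaves # e.g. { "slave_a" => 2, "slave_b" => 1 }
    @dead = []
  end

  # Record that a slave stopped responding so reads skip it.
  def mark_dead(name)
    @dead << name unless @dead.include?(name)
  end

  # Put a recovered slave back into rotation.
  def mark_alive(name)
    @dead.delete(name)
  end

  # Picks a live slave with probability proportional to its weight.
  # Falls back to the master when no slave is available.
  def read_connection
    live = @weights.select { |name, weight| weight > 0 && !@dead.include?(name) }
    return @master if live.empty?

    total = live.values.inject(0) { |sum, weight| sum + weight }
    pick = rand(total) # integer in 0...total
    live.each do |name, weight|
      return name if pick < weight
      pick -= weight
    end
  end
end

pool = WeightedReadPool.new("master", { "slave_a" => 2, "slave_b" => 1 })
pool.read_connection      # "slave_a" roughly twice as often as "slave_b"
pool.mark_dead("slave_a")
pool.read_connection      # always "slave_b" now
pool.mark_dead("slave_b")
pool.read_connection      # falls back to "master"
```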
Conclusion
Unfortunately, working with multiple slave databases in Rails seems to be one of the “wild west” areas of development. It’s not uninhabited, but there is no go-to solution that seems ready to drop in and work with Rails 2.2 and above. For those running Rails 2.2+ and looking to use multiple slaves, Db Charmer and Seamless Database Pool are the two clear frontrunners. I like the simpler, model-driven style plus lack of database.yml weirdness of Db Charmer. But I really like the extra functionality of SDP. At this point, our choice will probably boil down to which one gives us the least hassle to get working, and that appears to be SDP, which worked immediately except for Thinking Sphinx.
I’ll be sure to post updates as I get more familiar with these plugins. Especially if it looks like there is any intelligent life out there besides me that is attempting to get this working.
Update 10/13: The more I use SDP, the more I’m getting to like it. Though I was initially drawn to the Db Charmer model-based approach to databases, I now think that the SDP action-based approach might make more sense. Rationale: Most of the time when we’re rendering a page, we’ll be using data from models that are deeply connected, e.g., a user has user_settings and extended_user_info models associated with it. We could end up in hot water if the user model used a slave, while user_settings used the master and extended_user_info used a different slave, as would be possible with a model-based slave approach: with each database at a different point in replication, the page could be built from mutually inconsistent data. SDP abstracts this away by ensuring that every SELECT statement in the action will automatically use the same slave database from within your slave pool.
Also, though I didn’t notice it documented at first, SDP is smart enough to know that even if you marked an action to read from the slave pool, if you happen to call an INSERT/UPDATE/DELETE within the action, it will still use the master.
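As a rough illustration of the action-based idea (one slave for every SELECT in an action, the master for every write), here is a plain-Ruby sketch. It is my own simplification, not SDP’s code: ActionConnectionChooser, within_action, and connection_for are hypothetical names, and sending reads outside an action to the master is an assumption of the sketch.

```ruby
# Hypothetical per-action chooser; NOT SDP's real implementation.
class ActionConnectionChooser
  def initialize(master, slaves)
    @master = master
    @slaves = slaves
    @sticky = nil
  end

  # Runs one action's worth of work with a single slave pinned for all reads.
  def within_action
    @sticky = @slaves.sample # nil if there are no slaves, so reads use the master
    yield self
  ensure
    @sticky = nil
  end

  # SELECTs inside an action use the pinned slave; writes always use the master.
  # Assumption of this sketch: reads outside an action also go to the master.
  def connection_for(sql)
    if @sticky && sql =~ /\A\s*select\b/i
      @sticky
    else
      @master
    end
  end
end

chooser = ActionConnectionChooser.new(:master, [:slave_a, :slave_b])
chooser.within_action do |c|
  c.connection_for("SELECT * FROM users")          # some slave
  c.connection_for("SELECT * FROM user_settings")  # the same slave as above
  c.connection_for("UPDATE users SET name = 'x'")  # the master
end
```

Pinning the slave once per action is what keeps user, user_settings, and extended_user_info reading from the same replica.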
* Thinking Sphinx will still start/stop with SDP, it just won’t index. Luckily for us, we are already indexing our TS files on a separate machine, so I’ll just set up the database.yml on the TS building machine to not use SDP, which ought to solve the problem for us. If you know of a way to get TS to index with SDP installed, please do post to the comments below.
Thank you so much for writing this comparison. It has saved us so much time.
I can’t believe the lack of multiple slave support isn’t a bigger issue for people. It almost makes those plugins not worth the effort, in my opinion.
We also use TS, and your solution was exactly the one on my mind when I was reading your article.
Have you found SDP to be bulletproof in terms of reliability?
Also, any comments on using SDP on multiple load-balanced app servers to connect to multiple slaves?
Thanks again.
Hey Chris,
We’ve been rolling with SDP in production for the last week, and it has been bulletproof so far, with one major exception. When I first installed it, I noticed that if I updated a model and then requeried it from the database, I was getting the unchanged model back. After about 10 hours of debugging, I finally determined that SDP wasn’t playing nice with Rails’ built in query cache. Turned out to need a one line fix in seamless_database_pool_adapter.rb. After the class methods are evaled in the “self.adapter_class” method, add this line:
ActiveRecord::ConnectionAdapters::QueryCache.dirties_query_cache(self, :insert, :update, :delete)
In my version of the source, it’s line 108. I have tried to find the author to let him know, but his email address is hard to get ahold of.
Our environment is multiple load balanced app servers with a master database and currently just one slave DB. But I spent quite a bit of time understanding the SDP source to fix the above bug, and it looks like it should pretty smoothly scale to multiple slaves when we need to cross that bridge. If you try using it in a multi-app, multi-slave environment, do come back to share your results with the class. 🙂
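For anyone curious why a missed cache invalidation produces the "unchanged model" symptom described above, here is a toy plain-Ruby model of a query cache. It is not Rails’ QueryCache or SDP’s adapter: CachingConnection and its methods are hypothetical, and a Hash stands in for the database table.

```ruby
# Toy model of a query cache; NOT Rails' QueryCache or SDP's adapter.
class CachingConnection
  def initialize(rows)
    @rows = rows  # fake table: id => name
    @cache = {}   # fake query cache: id => cached name
  end

  # Reads go through the cache, the way a cached SELECT would.
  def select_name(id)
    @cache.fetch(id) { @cache[id] = @rows[id] }
  end

  # Buggy write: changes the table but leaves the cache untouched.
  def update_without_dirtying(id, name)
    @rows[id] = name
  end

  # Correct write: clears the cache first, so the next read hits the table.
  def update(id, name)
    @cache.clear
    @rows[id] = name
  end
end

conn = CachingConnection.new({ 1 => "old" })
conn.select_name(1)                    # reads "old" and caches it
conn.update_without_dirtying(1, "new")
conn.select_name(1)                    # still the cached "old": the symptom
conn.update(1, "newer")
conn.select_name(1)                    # cache was cleared, so this reads "newer"
```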
Did you also check the “ex”-fiveruns plugin? “data_fabric”.
And there’s also multi_db on github which seems to allow multiple slaves with failover and claims to handle the query cache correctly.
Hey Sam,
I didn’t, but I’d welcome reports from those who have tried these. My methodology for choosing which plugins to review was to pick those that I could find with any combination of commonsense queries on Google and the Agile Web Development plugin directory. I figure that’s what “real” users will be doing, so if the plugin can’t be found through one of those two channels, then it’s probably going to be infrequently used, and infrequent use is the leading cause of plugin atrophy amongst 87% of Americans who responded to my imaginary poll.
But again, if you’ve tried either of those and/or can make a case for their credibility, others may well find your observations helpful. For our part, we’ve been quite happy with SDP after making the small patch I mentioned earlier.
Another note from the field: to get database failover “really” working, there are a few patches I had to make, both to SDP and to the Mysql adapter itself. FYI.
Alright, we just finished migrating to a new app server setup and it’s finally SDP time.
Hey Bill, if you have a chance, could you outline the changes you had to make to get SDP to work with Thinking Sphinx, along with those failover changes you mentioned in November? Thanks again 🙂