In the crazy wild days of Rails 2.x…
In the pre-Rails 3 ecosystem, there were a number of confusingly similar choices for getting master/slave database functionality established. These options included Masochism, DB Charmer, master_slave_adapter, and seamless_database_pool, amongst others. When it came time for Bonanza to choose a slave plugin, I made my best effort to assess the velocity and functionality of each of the prominent slave database solutions, and wrote what went on to become a fairly popular post comparing the relative strengths of each choice.
Octopus
Fast forward to Rails 3, and the field has narrowed considerably. Almost all of the top Google results for Rails slave database options these days point to Octopus, and with good reason. Its documentation is sound, and its GitHub project has maintained good velocity for the better part of the past year. Reading between the lines of the Octopus documentation, it would seem that it was built first and foremost as a tool to make it stupidly easy to shard databases. Secondarily, it also supports using slave databases in a non-sharding setup, but the implementation here gets a little more sketchy, as the examples show users needing to explicitly name a slave database for each query. In the documentation, this is done at query time, e.g.,
User.where(:name => "Thiago").limit(3).using(:slave_one)
or
Octopus.using(:slave_two) do
  User.create(:name => "Mike")
end
Seamless Database Pool
Upon learning about Octopus, my natural inclination was to compare it to our current solution, seamless_database_pool. Admittedly, when we got to the Rails 3 party, SDP was running a bit behind. The author had been kind enough to do much of the legwork to get it compliant with AR3, but we still encountered errors when actually trying to use the plugin within controllers and views the way we had been able to with the previous version.
So I fixed it.
What Seamless Database Pool now represents is a slave database plugin that is specifically built with the purpose of making it as easy as possible to A) connect to one or more weighted slave databases B) declare whether a particular Rails action should attempt to use slaves, masters or both (automatically defaulting to the master when write operations occur) and C) gracefully handle failover if one or more of the slave databases declared should become unavailable for whatever reason.
SDP does not have any built-in support for sharding, so if that is what your DB needs, Octopus is your best bet. But if what you need is specifically a Rails 3 supported solution that will allow you to mix and match your main database and N slaves, in a weighted way and with failover automatically baked in, this is where seamless_database_pool really shines.
Bonanza has been using SDP in production for more than a year now, and in the meantime we have experienced failures of our slave database every few months, which at one point would have brought down the entire site. Now, within seconds, Rails figures out that it needs to re-route requests and finds a database it can use that is still available. The still-good SDP documentation describes how to make it happen.
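To make the "weighted, with failover" idea concrete, here is a minimal plain-Ruby sketch of the general technique. It is not SDP's implementation: the `Replica` struct and `WeightedPool` class are hypothetical names invented for illustration, and availability is a simple flag rather than a real connection check.

```ruby
# Hypothetical sketch: weighted read-connection choice with failover.
# None of these names come from seamless_database_pool itself.
Replica = Struct.new(:name, :weight, :available)

class WeightedPool
  def initialize(replicas, master)
    @replicas = replicas
    @master = master
  end

  # Pick a live replica with probability proportional to its weight.
  # Falls back to the master when no weighted replica is available.
  def pick(rng = Random.new)
    live = @replicas.select { |r| r.available && r.weight > 0 }
    return @master if live.empty?

    total = live.sum(&:weight)
    target = rng.rand(total) # integer in 0...total
    live.each do |r|
      return r if target < r.weight
      target -= r.weight
    end
  end
end

master    = Replica.new("master", 0, true)
slave_one = Replica.new("slave_one", 2, true)
slave_two = Replica.new("slave_two", 1, true)
pool = WeightedPool.new([slave_one, slave_two], master)

slave_one.available = false # simulate an outage
puts pool.pick.name         # only slave_two is left in the pool
slave_two.available = false
puts pool.pick.name         # no slaves left, so the master is used
```

With both slaves up, `slave_one` would be chosen roughly twice as often as `slave_two`. Once a slave is flagged unavailable it simply drops out of the draw, which is the behavior that keeps a site up when a slave disappears.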
Bottom line
Prior to this post, if you Googled for a Rails master/slave database solution, you would probably come away thinking there was only one, and that it was only secondarily focused on allowing N slaves to be configured. I may be wrong about the level of support that Octopus already has for setting up multiple weighted failover slaves (and being able to declare usage of these on a per-action vs. per-query basis), but the documentation makes me think that this is at best a future roadmap feature. In the meantime, if it’s specifically slave database support you need, try the drag-and-droppable SDP gem. I will continue linking my fork of the project until the original author decides what he wants to do with my pull request (which fixes fundamental issues with Rails 3 controller integration, plus adds more robust slave failover).
Installation
Is as easy as possible. In your bundler Gemfile:
gem "seamless_database_pool", :git => "git://github.com/wbharding/seamless_database_pool.git"
Your database.yml file will then look something like:
production:
  adapter: seamless_database_pool
  port: 3306
  username: app_user
  password: app_pass
  pool_adapter: mysql
  master:
    host: 1.2.3.4
    pool_weight: 0 # 0 means we only use master for writes if the controller action has been setup to use slaves
  read_pool:
    - host: 2.3.4.5
      username: slave_login
      password: slave_pass
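The nesting matters here: `master` and `read_pool` are children of `production`, and each `read_pool` entry is a list item with its own keys. As a quick sanity check, a trimmed copy of the config can be loaded with Ruby's standard YAML library. This is only a sketch: it inspects the parsed hash and does not go through SDP at all, and treating a missing `pool_weight` as 1 is an assumption made for the printout.

```ruby
require "yaml"

# Trimmed copy of the database.yml above, with indentation made explicit.
config = YAML.load(<<~YML)
  production:
    adapter: seamless_database_pool
    pool_adapter: mysql
    master:
      host: 1.2.3.4
      pool_weight: 0
    read_pool:
      - host: 2.3.4.5
        username: slave_login
        password: slave_pass
YML

prod = config["production"]

# Assumption for this sketch: an entry without pool_weight counts as 1.
members = [prod["master"]] + prod["read_pool"]
members.each do |m|
  puts "#{m["host"]} weight=#{m.fetch("pool_weight", 1)}"
end
```

If the indentation is wrong, `prod["read_pool"]` will come back as `nil` or the slave credentials will end up attached to the wrong level, which is easy to spot this way before deploying.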
Do drop a line in the comments with any questions or feedback if you have experience with either SDP or Octopus as solutions for Rails slave database support!
Looks great! I have something I’m going to try it on in the next couple weeks.
In my 5 minutes looking at it so far, I’m not sure about all the options you can pass to “use_database_pool” for the controller filter. With this line:
use_database_pool :all => :persistent, [:create, :update, :destroy] => :master
– is that ‘:persistent’ just referencing the ‘use_persistent_read_connection’ method? If so, that makes it seem like all methods, including those that write, will go to the read pool.
– If I’m breaking restful crud and adding some member methods that do their own update, for instance, say I have an ‘activate’ method, do I just add that to the array pointing to :master, like this?
[:create, :update, :destroy, :activate] => :master
– can you blog an example of how to use code blocks with the static methods on the SeamlessDatabasePool? I’m more interested in any rationale on *why* I would do so – what forces would lead me to such a decision over declaring with the controller_filter.
Thanks for this – I’ll mention it on the Ruby5 podcast this coming Friday if it’s not covered before then.
Hey David,
I’m not sure how the precedence would work if one were to concurrently declare a method as going to the persistent pool (which includes all masters and slaves, weighted per the database.yml file), *and* declare certain methods as going only to master. My hunch is that, with the example you give, create/update/destroy would only work with master, but that isn’t how we’ve used SDP, so I could not conclusively verify that.
For our project, there were no good reasons to use code blocks with SDP — we are only using it on a per-action basis. This is exactly why I preferred SDP to Octopus, as with Octopus, it would appear that one can only delegate to slave databases on an individual query basis. In our use case, we’re simply using SDP to delegate our most trafficked actions to the slave database. I.e., in our ItemsController, we have
use_database_pool :show => :persistent
Given that that action comprises more than half of the overall calls for our application, it’s nice to be able to offload it to our slave database. It’s also nice to know that stuff like updating the session (we use DB store for our session) still goes to master, even if the rest of the SQL queries in the items#show action are sent to the slave.
For our specific database.yml, I ended up just setting the weight of the master database to 0, so effectively, if use_database_pool isn’t declared, then only the master is used. The only time we go to our slave database is for the small handful of actions that comprise the vast majority of our traffic.
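To make the precedence question concrete, here is a small plain-Ruby sketch of one way a declaration like David’s could be resolved per action. It is not SDP’s code: `expand_pool_options` and `pool_for` are hypothetical helpers, and the rule that a specific action entry beats `:all` (with master as the final fallback) is an assumption, not verified SDP behavior.

```ruby
# Hypothetical helpers; not part of seamless_database_pool's API.

# Flattens a hash like the one passed to use_database_pool into a
# per-action lookup table. Keys may be a single action or an array.
def expand_pool_options(options)
  options.each_with_object({}) do |(actions, pool_method), table|
    Array(actions).each { |action| table[action.to_sym] = pool_method }
  end
end

# Assumption: a specific action entry wins over :all, and an action
# with no entry at all falls back to the master.
def pool_for(table, action)
  table[action.to_sym] || table[:all] || :master
end

davids = expand_pool_options({
  :all => :persistent,
  [:create, :update, :destroy, :activate] => :master
})
puts pool_for(davids, :show)     # persistent
puts pool_for(davids, :activate) # master

ours = expand_pool_options({ :show => :persistent })
puts pool_for(ours, :show)       # persistent
puts pool_for(ours, :index)      # master
```

Under these assumptions, adding a custom write action such as `activate` to the master array is enough to keep it off the read pool, and a controller that only declares `:show` leaves every other action on the master.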
Hope this helps!
Hi Bill!
I’m the maintainer/creator of Octopus.
Some thoughts about your review:
1) first of all, congrats for your solution, it looks nice 🙂
2) Octopus doesn’t support weighted slaves (It’s on my TODO)
3) Octopus doesn’t support failover connections (It’s on my TODO)
4) It looks like you have a replicated environment. Octopus supports this through a flag in the config file: when you add this flag, it will spread all read queries across the slaves, and all write queries will be sent to the master database. You can read more about it at: https://github.com/tchandy/octopus/wiki/Replication
I will dig into your code, and try to “import” some ideas/solutions into Octopus. (That’s the beauty of open source software 😉 )
Thank you for mentioning Octopus.
Any ideas/questions, feel free to send me a message.
Thiago
Hey Thiago,
Good to hear from you. As I have no intention of maintaining SDP beyond keeping it functional, I am excited to hear of your intent to incorporate its key pieces of functionality into a well-maintained project like Octopus.
In addition to the weighted slaves and failover connections, the one other major piece of functionality we’d need to get Octopus to feature parity with SDP is the ability to specify when the slave pool should be used on an action-by-action basis. The reason for this is that we only have one slave, and if we sent all read queries to it, it would get crushed under the load. For us, it’s important that we can pick and choose which actions are best suited for our slave (a couple) and which are best suited for our master DB (most of the rest). I could also imagine a world where we added a second slave database that we might only use for long queries, like viewing stats. Again, SDP would let us create another “pool” where we could say that the second slave is used for viewing stats, the first slave is used for looking at items, and the master is used for everything else.
I don’t know how much overhead it would be to get “all of the above” implemented cleanly, but you will impress me thoroughly if you can pull it off! 🙂 I think that with the ever-growing number of scaled Rails sites, a single, well maintained plugin that can handle slaves, failover and sharding with equal aplomb is something that is sorely needed and would be deeply appreciated by the community.
Hi Bill,
Can I use both SDP and Octopus in one application? I want to use SDP to redirect all queries/actions from my controller to the slave DB, except for one action, which needs to be executed on a separate sharded database, and for that I’m thinking of using Octopus. Do you have any input on this scenario?
Thanks.
I grasp your concerns about failover. But in the back of my mind, I’m thinking, “Is this not something that can be addressed via a load balancer at the web server, flattening out the requests while leveraging database slaves?” Applications all have their own persona and thus their own adapted solutions, so I’m curious how this line of thought was arrived at.