The Problem
After adding our third app server a couple days ago, the appeal of digging through three separate production.log files when things go awry on Bonanzle was officially over.
Like many Rails developers in this situation, I Googled numerous terms in search of a solution, and most of these terms sent me to my good friend Jesse Proudman’s blog on using Syslog-ng to unify Rails logfiles. So we installed it (well, to be specific, we had Blue Box set it up for us, because it looked complicated), and determined that it was not what we were looking for. Installation issues aside (of which there were a few), the real killer when using Syslogger with Rails is that you lose the buffered production log output you have come to know and love, leaving your production logfiles a stew of mishmashed lines from numerous Passenger process in numerous states of processing. In short, if you get an appreciable amount of traffic (and I’d imagine you do if you’re reading this in the first place), and you are a human being, you will not be able to read an unbuffered Rails log without considerable time and frustration.
The Solution
Since it looked like Syslog was the only game in town currently for merging application server logs, I decided to spend the afternoon writing a plugin that would allow us to take an arbitrary number of production logfiles, from an arbitrary number of hosts, and merge them together into one file without changing the formatting of the production logs or affecting performance on the app servers.
The basic mechanism of this plugin is that it uses rsync to grab your production logs, then it boils those production logs down into hashes of { :time => action_time, :text => action text }. It then outputs the actions from all of your app servers into a single file in chronological order.
As a bonus, it also lets you specify the maximum size of your unified log file, and handles keeping the logfiles broken into bite-sized chunks, so that you can actually read the output afterwards (rather than ending up with a 5GB log file). This functionality is built in, and can be configured via an included YML file.
The remainder of this post will just quote from the Github project, which does a pretty fine job of explaining what’s going on:
========================
Log Weaver v0.2 sports the following featureset:
* Sync as many production logfiles, from as many hosts as you want, into a single logfile on a single server
* Use your existing Rails production.log, no need to modify how it’s formatted
* Break up your unified log file into bite sized chunks that you specify from a YML file, so you don’t end up with a 10 GB unified logfile that you can’t open or use
* Does not run on or adversely affect performance of app servers. Uses rsync to grab production log files from app servers, then does the “hard work” of combining them on what is presumably a separate server.
Installation
============
Clone the log-weaver github project into your vendor/plugins directory. No models, database tables, or installation is needed for this plugin. Simply edit the /log-weaver/config/system_logs.yml file to specify the settings of your hosts.
Usage
=====
Run “rake log_weaver:weave_logs” to initiate the process of log merging.
When you run this task, Log Weaver will rsync your logfiles from the locations you specified in the YML file. The first time you run Log Weaver, this might take a minute or two, depending on how big your production logs are. On subsequent runs, it should be pretty instanteous, since rsync is smart about merging files.
After rsyncing your production logs from your app servers, Log Weaver will build the production logs into bite-sized hashes and merge them in chronological order into the file that you specified in your YML file. Limited testing has shown that combining three logfiles that were each 1+ GB used less than 300MB of memory and completed in about 10 minutes.
Run “rake log_weaver:weave_logs” periodically to add to your unified log. When the size of your unified log exceeds the size you specified in the settings YML file, the unified log will be renamed “[log_name]_X.log” where X is the lowest integer of a file that doesn’t exist in your log directory. That is, if you named your file “unified.log,” Log Weaver would move the original log to “unified_2.log” and then open a new “unified.log” file to continue merging your logs.
Future Improvements
===================
Log Weaver was written over the course of a few hours to fit the baseline needs of Bonanzle, there is surely plenty of room to improve! For starters, this would probably make more sense as a gem than a Rails plugin.
Feel free to fork and add whatever you think it needs and ping me to pull in your improvements and we can make this plugin a worthwhile thing.
I’ve recently implemented the unified logging deal for as well. In constrast, one of our main motivations was to not have logs on the app server because they received enough traffic to generate enough logs to fill up the disk (despite using logrotate). The disk being full usually resulted in the app server going down until space was freed up.
That aside, I’m pretty sure you could implement most of Log Weaver’s functionality using a bit of syslog-ng-fu. On each app server, you can filter to an appropriate destination in the file system (ie /path/to/deploy/log/production.lg). Then on the centralized server, you can filter the incoming logs to a logfile which includes the app server’s hostname in addition to a unified log.
Like I said, hadn’t tried implementing it yet, but will try to report back if I do.
Yeah, that sounds conceivable, if a bit of a pain in the ass. The main challenge of using Syslog-ng for this, from my experience, would be to coalesce the unbuffered output into readable buffer-like streams.
For my purposes of creating readable, buffered logs in bite-sized chunks, Log Weaver pretty ideally performs the task. I wonder if you could just adapt Log Weaver such that it deletes the older production logs from the application server after it has finished consolidating them into the unified log? Do you have enough disk space for one day’s worth of production logs? That would probably be a 10 minute adaptation to the rake file, rather than the fairly complicated exercise of mashing together scattered Syslog-ng output into something readable.
Will be curious to hear what you end up with either way! Thanks for commenting.
This plugin assumes that the request data in the incoming log files is not interleaved. If an application server has multiple concurrent processes handling requests, however, you’ll get interleaved logs. You’ll have the same problem you had with syslog.
How did you resolve that issue? One process per server? One log file per process? Log file locking?
The buffered logger that ships with Rails will only prevent interleaving if all of your requests are handled by the same process. And even then, only if you never generate more log data per request than would fill the logger’s buffer.
The only way I can see to address this (without using one of the less-than-ideal solutions listed above) is to send all your log data to syslog and then de-interleave the results from there. Syslog will prefix each log line with unique information hostname and pid that can be used to safely de-interleave. (Alternatively, you could modify your Rails logger to prepend this information to every line of your production.log output.) Then you just weave the results of each hostname-pid combination separately as if they had been separate, independent log files the whole time.