Caching external HTTP requests with VCR in Rack

I was recently working on a project to build continuous integration performance tests on an application which makes extensive HTTP based calls to a middleware application. I was running in to issues where the middleware calls were unstable in the test, stage and load test environments. For the purpose of these tests, I was only interested in the render times of the Rails calls. The obvious choice seemed to cache the external HTTP requests.

Enter VCR. VCR is a test tool designed to cache external HTTP requests. It uses your favorite mocking tool (webmock, flexmock, etc.) and records HTTP requests and saves them in a YAML (or optionally JSON) file.

I decided to run a standalone instance of the server with a custom Rack config implementing VCR as a caching layer for initial HTTP requests. After a bit of hacking, it turned out to be way simpler then I expected.

Install VCR and Mock framework

Along with VCR, I choose webmock, or rather it was choosen for me -- it's the mock framework currently used in the application I work on.

Install VCR

$ gem install vcr
$ gem install webmock

Or via Gemfile

# file: <app_root>/Gemfile
sources :rubygems

group :performance do
  gem 'vcr'
  gem 'webmock'
end

Configure Rack

Next add the following to your Rack config file.

# file: <app_root>/config.ru

# only use cache when running performance tests   
if ENV['RACK_ENV'] == "performance" 
  require 'vcr'
  require 'webmock'

  # configure VCR 
  VCR.configure do |c|
    c.cassette_library_dir  = './cache/' # where to save cache files
    c.hook_into             = :webmock   # select mock framework
  end

  use VCR::Middleware::Rack do |cassette|
    cassette.name     'cassette'
    cassette.options  :record => :new_episodes,  # record only new requests
                        :match_requests_on => [:host, :path] 
                                                 # define new by host and/or path changes
  end
end

... your app config ...

So what does this record?

For this example, imagine that your application is making an external call to a service which returns data about a passed keyword. We'll define the host as "example.org" and the URI format as "/q/".

  1. http://example.org/q/food -- recorded
  2. http://example.org/q/bar -- recorded
  3. http://example.org/q/food -- from cache
  4. http://example.org/q/bar -- from cache
  5. etc., etc., etc.

VCR as a Production Cache

VCR+Rack can be used in this way, although it could be slow if you're caching a large number of HTTP requests or your cached requests are large themselves.

Also, I would add an expiration to your cache, which is supported in newer version of VCR

While this is untested, it should look something like this.

# file: <app_root>/config.ru

require 'vcr'
require 'webmock'
VCR.configure do |c|
  c.cassette_library_dir  = './cache/'
  c.hook_into             = :webmock
end
use VCR::Middleware::Rack do |cassette|
  cassette.name     'cassette'
  cassette.option   :record => :new_episodes,  
                      :re_record_interval => 5.minutes, # expire every 5 minutes
                      :match,requests_on => [:host, :path]
end

... your app config ...

The primary issue with this is that all cached requests are stored in a YAML file as a serialized HTTP objects. This can be quite large. If you're making many different quests and/or the results are large, the cache file can quickly become unwieldy (I've seen 150K line YAML files) and the parse time can actually take a while. Use this solution at your own risk and performance test with and without a full cache before putting it in a production environment.

Enjoy!