I’m a strong proponent of separating application code from the Rails framework. It leads to better design, and allows for faster feedback when doing TDD.
RSpec 3 encourages this approach by providing separate helpers for code which depends on Rails, and that which doesn’t.
But there are some ways you can end up coupling your code to Rails without realizing it.
Let’s say we’re writing a very simple validation class which we want to isolate from Rails.
We’ve installed the rspec-rails gem and run the installer, so we have a .rspec file which contains:
    --color
    --require spec_helper
This means that spec_helper.rb will be automatically available in our tests, so we don’t need to require it.
It provides only a minimal setup so we do need to manually require the class being tested:
    require "validator"

    RSpec.describe Validator, ".call" do
      it "returns true if a name is provided" do
        result = Validator.call("foo")
        expect(result).to be(true)
      end
    end
Notice how we aren’t using rails_helper, since this shouldn’t depend on Rails.
Here’s our implementation:
    class Validator
      def self.call(name)
        name.present?
      end
    end
Let’s run rake to check it’s working:
    % bundle exec rake
    Finished in 0.00296 seconds (files took 1.94 seconds to load)
    1 example, 0 failures
Looks good. But now, let’s try running just that one test in isolation:
    % bundle exec rspec spec/validator_spec.rb
    Failures:
      1) Validator.call returns true if a name is provided
         Failure/Error: result = Validator.call("foo")
         NoMethodError:
           undefined method `present?' for "foo":String
         # ./lib/validator.rb:5:in `call'
         # ./spec/validator_spec.rb:5:in `block (2 levels) in <top (required)>'
It fails. Confused?
The error message gives us a clue to what’s happening.
The #present? method is added by ActiveSupport, and isn’t part of Ruby.
When running the full test suite with rake, Rails will be loaded automatically, and the tests which use spec_helper are no longer isolated.
This is a trivial example, but there are many more subtle ways in which code can end up accidentally dependent on parts of Rails, or other parts of your application. This often makes it more difficult to reason about behaviour.
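You can confirm this from a plain Ruby process, with no Rails or Spring involved. The sketch below just asks String which instance methods it defines; string_responds_to? is a helper name made up for this illustration.

```ruby
# Run with plain `ruby` so that ActiveSupport is not loaded.
# `string_responds_to?` is a hypothetical helper for this example.
def string_responds_to?(name)
  String.method_defined?(name)
end

puts string_responds_to?(:empty?)    # part of core Ruby
puts string_responds_to?(:present?)  # only defined once ActiveSupport is loaded
```

Running the same two lines inside a Rails console should report that both methods exist, which is exactly the hidden dependency described here.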
So how do we make it pass? In this case it’s simple.
We could replace the use of #present? with our own check, such as name != "".
Or we could pull in just the small part of ActiveSupport which we actually need, by adding require "active_support/core_ext/object/blank".
This could be put in either the spec or the implementation, but it makes more sense to have it in implementation so that the dependencies are explicit.
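For the first option, a plain-Ruby version of the class might look like the sketch below. It is only an approximation of #present? for strings: it treats nil, empty and whitespace-only names as missing, which is an assumption about what the validator should accept rather than something stated above.

```ruby
# A Rails-free Validator: no ActiveSupport required.
class Validator
  # Returns true when `name` contains at least one non-whitespace character.
  # `to_s` turns nil into "" so the same check covers a missing name.
  def self.call(name)
    !name.to_s.strip.empty?
  end
end

puts Validator.call("foo")
puts Validator.call("")
puts Validator.call(nil)
```

With this version the spec passes whether it is run on its own or as part of the full suite, because nothing in the class relies on Rails having been loaded.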
Prefer Explicit over Implicit
The act of listing dependencies provides a valuable form of design feedback. If a class has a large number of dependencies, then it’s probably violating the Single Responsibility Principle.
The simplest way to reduce the chance of accidentally introducing these kinds of couplings is to be in the habit of running specs individually. If you do TDD by the book, then this shouldn’t be a problem. You should only need to run the full suite occasionally, e.g. before pushing.
If your project uses Spring, this can lead to further complications. As Spring pre-loads Rails, the Rails environment is always present. To run a spec without Spring, you need to pass an environment setting to the Springified RSpec executable:
% DISABLE_SPRING=1 bin/rspec spec/validator_spec.rb
or run it via bundle exec, which bypasses Spring:
% bundle exec rspec spec/validator_spec.rb
Selectively enabling Spring
Ideally we want to use Spring only when the spec uses rails_helper.
If you normally run tests from within your editor, e.g. with vim-spec, you could write a function to skip Spring unless the spec uses rails_helper:
    let [rails_helper_line, rails_helper_col] = searchpos('require.*rails_helper')
    " check column number in case line is commented out
    if rails_helper_line > 0 && rails_helper_col == 1
      let command = "rspec"
    else
      let command = "DISABLE_SPRING=1 rspec"
    endif
Finding existing coupling
If you’ve already separated your Rails and non-Rails specs, and want to verify that you don’t have any accidental Rails coupling, you can run:
find . -name "*_spec.rb" -print | xargs -n 1 bundle exec rspec
It may take a while to run on large test suites.
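If you would rather have a list of the offending files than a wall of output, the same idea can be sketched in Ruby. The failing_specs method and its runner: parameter are names invented for this example; the default runner shells out to bundle exec rspec once per file, just like the one-liner above.

```ruby
# Runs each spec file in its own process and returns the files that fail.
# `runner` is injectable so the logic can be tried without RSpec installed.
def failing_specs(files, runner: ->(file) { system("bundle", "exec", "rspec", file) })
  # `system` returns true on success, false or nil otherwise.
  files.reject { |file| runner.call(file) }
end

# Example usage from the project root (commented out so nothing runs on load):
#   failures = failing_specs(Dir.glob("spec/**/*_spec.rb").sort)
#   puts failures.empty? ? "All specs pass in isolation" : failures
```

Any file that passes in the full suite but shows up in this list is a candidate for accidental coupling.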
One of the famous benefits of Rails is convention over configuration. Many things which need to be handled manually in other frameworks are handled automatically for us by Rails. However, there are usually trade-offs to be aware of.
If you’ve seen Java or C# code, you’ll be familiar with there being a list of import or using statements at the top of each class.
And if you’ve written Ruby code outside Rails, you’ll know that you have to call require, or perhaps require_relative, to call code from another file.
In Rails, it’s normal for all of the app’s classes and modules to be automatically loaded on boot. This means any module or class is callable from anywhere else in the app.
This also applies to any gems that your app uses.
After you create a new application, you’ll find the following code in config/application.rb:
    # Require the gems listed in Gemfile, including any gems
    # you've limited to :test, :development, or :production.
    Bundler.require(*Rails.groups)
This means that, based on your Rails environment, only gems matching that group will be loaded when Rails starts up. If a gem isn’t within any particular group, then it will be loaded regardless of the Rails environment.
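To make the group selection concrete, here is a toy model in plain Ruby. It is not Bundler’s implementation, and the gem list is made up for illustration; it only mimics the rule that a gem is loaded when one of its groups is requested, with ungrouped gems belonging to :default.

```ruby
# Toy model of Gemfile groups; the gems and groups here are hypothetical.
GEMFILE = {
  "rails"       => [:default],
  "pg"          => [:default],
  "rspec-rails" => [:development, :test],
  "capybara"    => [:test],
  "web-console" => [:development]
}.freeze

# Roughly what Rails.groups provides: :default plus the current environment.
def groups_for(env)
  [:default, env.to_sym]
end

# Names of the gems whose groups overlap with the requested ones.
def gems_loaded_in(env)
  wanted = groups_for(env)
  GEMFILE.select { |_name, groups| (groups & wanted).any? }.keys
end

puts gems_loaded_in("test").inspect
puts gems_loaded_in("production").inspect
```

In this model the production environment loads only the two ungrouped gems, while the test environment also picks up rspec-rails and capybara.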
The problem with Bundler.require
If you’ve ever worked on a long-running Rails project, you’ll have noticed that startup time gradually creeps up over time. This is often due to the accumulation of many gems.
Loading each gem takes time. There is the filesystem overhead, since gems often contain hundreds of files. There is a parsing overhead, as Ruby loads all the modules and classes defined and verifies the syntax. And there may also be an initialization process for the gem.
This process typically takes in the order of 10ms to 100ms per gem, based on its complexity. That doesn’t sound like much, but once your app grows, that can add additional seconds to the time to start the Rails server, the Rails console, any generators, or any rake tasks.
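If you want to see where the time goes in your own app, one rough approach is to time each require call. The sketch below uses Benchmark from the standard library; require_time is a helper name invented here, and the three standard libraries are only stand-ins for whatever gems you care about.

```ruby
require "benchmark"

# Times a single `require` call and returns the elapsed seconds.
# A library that is already loaded will report close to zero.
def require_time(name)
  Benchmark.realtime { require name }
end

# Stand-in libraries; substitute the gems from your own Gemfile.
%w[json set time].each do |lib|
  puts format("%-10s %.1f ms", lib, require_time(lib) * 1000)
end
```

The numbers vary between runs and machines, so it is worth running this a few times in a fresh process before drawing conclusions.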
But most importantly, they add to the overhead of running a single test. Despite DHH’s position, many of us still strongly believe that TDD is a useful approach for writing clean code and to help drive good design.
If it takes ten seconds to run a single test, a developer is less likely to run tests as often. This means mistakes are not caught quickly, and it becomes necessary to backtrack. It’s much easier to fix code you wrote 10 seconds ago than 10 minutes ago.
If you mark gems with require: false in the Gemfile, they will not be loaded until the first time they’re explicitly required.
This means it’s up to you to decide when to require each gem. Normally, you’ll do so close to where you actually make use of them.
At that point you will still have the load-time overhead, but the difference is that this will now be limited to a particular subset of the app, rather than happening every time on startup.
Let’s say you’re writing some code which needs to extract the URLs of all the links within a web page. We’ll use HTTParty to fetch the content, and Nokogiri to parse it.
    # Gemfile
    gem 'httparty', require: false
    gem 'nokogiri', require: false
We could require these gems manually in our class immediately before they’re needed:
    class LinkParser
      def initialize(url)
        @url = url
      end

      # Returns the href of every <a> element on the page.
      def links
        require 'nokogiri'
        doc = Nokogiri::HTML(html)
        doc.css('a').map do |link|
          link[:href]
        end
      end

      private

      # Fetches the raw HTML for the page.
      def html
        require 'httparty'
        HTTParty.get(@url).body
      end
    end
But I think calling require from within a class feels a little awkward.
My preference, and the Ruby convention, is to group all the require calls together at the top of the implementation.
This means you can see all of a class’s dependencies at a glance:
    require 'nokogiri'
    require 'httparty'

    class LinkParser
      # ...
    end
This list also acts as a form of design feedback – if your class has lots of dependencies then it probably has too many responsibilities and should be split up.
Remember, there is no harm in calling require twice for the same path. Ruby will recognize that the file has already been loaded, and ignore the call.
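This is easy to check for yourself. The sketch below writes a throwaway file that bumps a counter each time it is loaded, then requires it twice; the file name and the $load_count global are made up for the demonstration.

```ruby
require "tmpdir"

$load_count = 0

results = Dir.mktmpdir do |dir|
  path = File.join(dir, "counter.rb")
  File.write(path, "$load_count += 1\n")

  first  = require(path)  # loads the file and returns true
  second = require(path)  # already loaded, so Ruby skips it and returns false

  [first, second]
end

puts "first: #{results[0]}, second: #{results[1]}, loads: #{$load_count}"
```

The file’s body runs only once, so a repeated require costs a lookup rather than a second load.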
But what about Spring?
Tools such as Spring or Zeus attempt to solve the slow startup problem by keeping Rails loaded in memory. But there are some important downsides.
The first, and usually the most common, objection to Spring is that it can lead to confusing situations where changes are not reloaded.
The second is less obvious but more important.
Recently there has been a movement towards keeping the majority of application code in classes which are not dependent on Rails.
The rspec-rails gem now helpfully generates both a plain spec_helper and a rails_helper, so that you can choose to test your class in isolation from Rails.
However, once you add Spring to your project, Rails will always be loaded, so it’s very easy to accidentally make your class implicitly dependent on Rails, a gem, or another class, without even realising it.
A few people have asked for benchmarks, so I’ve added this section to show some measurements.
I’ve taken the Gemfile from Discourse as it’s a reasonable-sized, production Rails app.
I had to make some small modifications which you can see in this gist.
Let’s first measure how long it would take to require all the gems in the default and test groups. This is equivalent to requiring each gem in the traditional way, without the require: false option.
    $ time bundle exec ruby -e 'Bundler.require(:default, :test)'
    bundle exec ruby -e 'Bundler.require(:default, :test)'  3.90s user 1.40s system 95% cpu 5.527 total
Now let’s run with Bundler.setup, which is the equivalent of setting require: false for every gem:
    $ time bundle exec ruby -e 'Bundler.setup(:default, :test)'
    bundle exec ruby -e 'Bundler.setup(:default, :test)'  1.19s user 0.16s system 98% cpu 1.375 total
(I ran these several times to get an average time. My machine is a 2013 MacBook Air).
So, the difference is over four seconds, which is very significant when you’re doing real TDD and running hundreds of tests over the course of a day.
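To put that in perspective, here is the arithmetic as a small Ruby sketch. The two timings are the wall-clock totals measured above; the figure of 200 test runs per day is purely an assumption for illustration.

```ruby
# Wall-clock totals from the two measurements (seconds).
require_all = 5.527
setup_only  = 1.375

saving_per_run = (require_all - setup_only).round(3)

# Assumption for illustration: 200 test runs in a working day.
runs_per_day = 200
minutes_per_day = (saving_per_run * runs_per_day / 60.0).round(1)

puts "Saved per run: #{saving_per_run}s"
puts "Saved per day: #{minutes_per_day} minutes"
```

At that rate the saving comes to roughly 14 minutes a day, before counting the cost of losing focus while waiting.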
A recent post on Reddit asked about how an experienced developer can improve their Rails knowledge.
There’s an abundance of resources online for those looking to pick up the basics of building a Rails app. But once you get beyond that, finding good quality content takes some effort.
I’ve been writing Rails code professionally for around six years. In this post, I’ve listed the resources which have been most useful to me over that time.
None of these resources are free. Paying for them is a high-return investment in yourself. If you work as an employee of a company, most will have a training budget per employee, so ask them to contribute.
- Design Patterns in Ruby
- Eloquent Ruby
- Metaprogramming Ruby 2
- Practical Object-Oriented Design in Ruby
- Rails 4 Test Prescriptions: Build a Healthy Codebase
- Refactoring: Ruby Edition
- The Rails 4 Way
- The Ruby Way (3rd Edition)
- The Well-Grounded Rubyist
- Fearless Refactoring: Rails Controllers
- Growing Rails Applications
- Objects on Rails
- Ruby Science
- Understanding The 4 Rules of Simple Design
There are two simple maxims which have had a huge impact on how I write code:
Always check a module in cleaner than when you checked it out. – Uncle Bob
For each desired change, make the change easy (warning: this may be hard), then make the easy change – Kent Beck
These have led to the following principle which I apply whenever committing code:
A single commit on the mainline branch should be either an improvement to the structure of existing code, or a change in behaviour, but not both.
Separate commits are easier to review
diff is pretty smart, but it’s not perfect.
It often can’t distinguish between code which is new and code which has been moved around.
This makes changes more difficult to review.
By using separate commits, the reviewer can examine each diff separately, making it easier to understand what’s changed.
Separate commits improve Continuous Integration
If a conflated commit breaks the build, it’s often not clear whether it was the clean-up or the behaviour change which is to blame. Distinct commits help to pinpoint the cause of the breakage.
Separate commits reduce waste
Sometimes there’s a need to roll back a commit.
The git revert subcommand applies a new commit which reverses the changes made in a given commit.
If that commit was responsible for restructuring some existing code, as well as changing behaviour, then we’d lose out on the benefit of that clean-up.
Separate commits are git-friendly
The git bisect command helps to discover when a defect was introduced.
Using separate commits makes this more effective.
Separate commits give a clearer history
The commits on a file should give a clear history of how the file has changed over time. Keeping the commits focussed makes this easier to understand.
I’ve been aiming to do this in my own code for some time now. And I’ve worked for clients on many struggling, legacy Rails projects where following this simple guideline would have saved a lot of pain.
I was curious to see if I could programmatically enforce this convention on a real project, using the fig-leaf gem introduced by the book.