Active Record and Sequel migrations provide an easy way for Ruby developers to alter their database schemas without having to write SQL by hand. This abstraction means that the same migration file could work against both a Postgres and MySQL database by simply changing the underlying database adapter. For large projects with many developers, however, it can be difficult to keep migration style consistent and enforce best practices without additional tooling.
As an example, the Cloud Foundry Cloud Controller service uses the Ruby Sequel gem as its ORM and can be run on both Postgres and MySQL-compatible databases. This flexibility means we need to take some special care when writing our migration files. Consider the following Sequel migration:
It’s simple enough. As you might have guessed, this migration will create a new table called strings that has a single “String” column named my_string.
But what type of column actually is “String”? Well, it depends on the database you’re targeting! Since we didn’t specify a limit, in MySQL Sequel will create a varchar(255) column that will hold 255 characters, and in Postgres you’ll get a text column that will hold an “unlimited” number of characters.
This discrepancy can be difficult to reason about throughout the rest of the codebase, so we’d prefer the column to have the same type on both databases. In Sequel this is as simple as providing the size argument when creating the column:
This requirement is easy for a developer to overlook while focused on a new feature, and it might not be caught in code review either. This is where Rubocop comes in.
What is Rubocop
If you’ve been developing in Ruby for some time, chances are you’ve used or at least heard of Rubocop. If not, Rubocop is a code linter that trawls through your code files using static code analysis and finds problems, such as methods with too many lines, code style violations, or common gotchas.
Rubocop itself is just a gem that you can include as a development dependency in your project. It provides a handy rubocop CLI and several rake tasks, which you can run either manually or as part of your CI test suites. It even supports an --auto-correct option (spelled --autocorrect in newer releases) that can automatically correct certain issues. It’s pretty handy. 🙂
Writing our own Rubocop Cop
- Note: Unfortunately the methods available within Rubocop for developing your own cops aren’t very discoverable. My general technique is to try to find an official cop in the Rubocop codebase that vaguely matches what I’m trying to do and reverse engineer it. They do have a development README which might help a bit, but it’s pretty sparse.
Rubocop includes a number of built-in linters which it calls “cops” (you can tweak which ones it runs by providing a .rubocop.yml file). It doesn’t, however, include any that will enforce our specific migration concerns. So let’s make one!
Let’s take a look at the following Sequel migration:
This migration is adding a new widgets table that has columns for guid, widget_name, and count. The migration developer wisely decided to add an index on the guid column since we’ll likely be making a lot of queries using it. There’s just one problem… the developer forgot to give this index a name!
This means that the index will have an autogenerated name which could be different across the different database types and it will be difficult to make changes to this index in the future if need be. So let’s write a Rubocop cop that will catch this for them next time.
Let’s start out by creating a new linters directory and the following file within it:
Now, we’ll set the message that Rubocop will provide whenever it finds offending lines by setting the MSG constant.
Now it’s time to get to the actual implementation of our cop. Rubocop analyzes code by using the parser gem to create an Abstract Syntax Tree, or AST, for the code. The following AST represents the migration that we’re analyzing:
Rubocop uses a handful of methods to traverse the various nodes in these trees. Since migrations are typically written inside Ruby blocks, we’ll use the on_block method that Rubocop provides.
First we’ll start simple. The code above will go through every Ruby block, check the method name, and skip to the next unless we’re dealing with one of Sequel’s column adding methods (for simplicity we’re just checking a subset of them in this cop). We can find the column adding method by looping through each “send” node descendant of the create_table block that we’re on. The send nodes look like this:
Our method_name(node) method extracts the second child from these nodes, which happens to be the method name. Since “String” is one of Sequel’s column adding methods, we’ve found something to inspect!
Now let’s complete our cop:
So what’s going on here? First, if we’ve found a column adding method we’ll extract its options by grabbing the final child of the send_node in the AST that we’re on. This works because Sequel’s column adding methods typically follow the pattern:
Then the true magic happens in the following line:
In this very basic implementation of the cop, if we see that the options passed to the method are just a single symbol and that symbol is :index then we are clearly missing a name for our index. We then call Rubocop’s add_offense method with the current AST node that we’re on. This will mark the expression as an offending piece of lint and it, along with the MSG that we declared earlier, will show up in the output when we run rubocop.
Now it’s time to make Rubocop aware of our new cop’s existence by adding it to the .rubocop.yml file:
Then simply run the rubocop command:
Voilà, you’ve now written a basic Rubocop cop for linting Sequel migrations! The same basic principles apply to Active Record migrations and for writing Rubocop cops in general.
If you’re interested in seeing a more complete example of the cop above, check out add_constraint_name.rb in the Cloud Controller repo and its associated test.
Hope you found this useful! 😊