learns_to build Academic Archive::Part 2:Setting up a New Rails App and a First Iteration on the Paper Model, Featuring our First Tests

28 August, 2006

Welcome to Part 2 of learns_to build Academic Archive, where I try to blog every last detail involved in building a Ruby on Rails application for publishing and peer-editing academic papers. As requested by Benjamin in the comments on Part 1, from now on, I'll be providing a table of contents to each post. So if you're looking for some specific piece of knowledge, you can jump right into the middle to get it. If you have any other ideas on how to make this series better, I'd love to hear about them in the comments.


  1. Creating a new Rails project
  2. Designing the Paper Model
  3. Setting Up the Database and Generating the Model
  4. Validating the Presence of Papers' Titles
  5. Getting Started with Testing: Fixtures
  6. Testing the Fixtures: Our First Test and First Test Helper
  7. Running Tests: Under Rake, Under Ruby
  8. How To Write a Test: Given, When, Then
  9. Philosophy of Testing

Well, we're airborne now. I posted Part 1 just before boarding a flight for LA and we just reached our cruising altitude.

At the end of Part 1, we'd thought our way through to a good starting design for the whole app and we were ready to start writing some real code. Specifically, we wanted to start with our central object: the Paper model. But before we write even our first line, we've got to do some setup and the tiniest bit more thinking.

Creating a new Rails project

First things first: run the "rails" command to generate the spine of a new Rails application in the file system:

gabc:~/Sites Greg$ rails archive

I ran this command from my "Sites" directory where I keep all my projects. It will generate a new folder in there called "archive" and inside it will create a whole bunch of files and folders which constitute a fresh default Rails application.

If you cd into this directory and run "rails --version" you may find that you've got an old version of the framework (mine was at 1.1.2). Rails is a relatively new framework and it's undergoing a ton of rapid development. This is good because it means that new features get added all the time, which makes your job easier, and old bugs get fixed. To take full advantage of this situation, we want to always be running the most recent version (as I write this it's 1.1.6). Thankfully all this takes is a single command:

gabc:~/Sites/archive Greg$ rake rails:freeze:edge

We're using Rake, the handy-dandy Ruby build utility. Rake automates common Ruby programming tasks like creating, writing, and running files (especially tests). We'll be using Rake constantly in the setup and development of our app; to see all that it can do, run "rake -T" and you'll see a list of all the available rake commands with their descriptions. This particular rake command pulls the latest development ("edge") version of Rails into our app's vendor/rails directory, so the app runs against that copy rather than whatever version is installed system-wide. Rerun it whenever you want to pick up newer changes. When you run it, you'll probably see a bunch of subversion changes scroll down your screen as the framework gets updated to the most recent version.

Now, I've got to confess that I did all of this setup so far at home last night. I knew that I'd be working without internet access while I was traveling and obviously commands like "rake rails:freeze:edge" have to go out over the wire to get their job done. Also, since I was going to be traveling, I wanted to grab a local copy of the Rails documentation which I normally use online. So, if you're working with dependable web access you might skip this step, but it's nice to know how for when you need it:

gabc:~/Sites/archive Greg$ rake doc:rails

Rake will go ahead and build the Rails API documentation and install it in your project's doc/api directory. It will take a good chunk of time and generate a whole bunch of files.

Designing the Paper Model

Ok, we're good to go. Setup is done. We could start generating app-specific files and writing code right now if we wanted, but just the slightest bit more thinking and note-taking is probably in order first. We decided at the end of our last post that we were going to start work by building papers and then the surrounding paper-approval-category relationship. What we didn't discuss was any of the specifics of the Paper model itself. What is a "paper" really? What attributes does it have? Is that really the right name for it? During the electronic blackout period of our ascent here, I sketched some answers in my moleskine. I'll explain them now.

Oops. Speaking of electronic blackouts, I lost battery power just as I polished off that last paragraph. I spent the rest of the flight into LA napping and reading. Not altogether unpleasant. Now, I'm in the corner of an LAX gate about 100 yards from where my flight will board, hunched over the only open outlet in the vicinity, trying to catch a quick charge before my flight for NY boards in 45 minutes.

Anyway, the last question that I asked in the air over Oregon may seem kind of nit-picky, but when it comes to domain modeling, the names we choose for things turn out to be surprisingly important. They should be expressive and unambiguous. We need to be able to remember what they mean without confusion upon returning to our code after a long break. A good rule of thumb is: would this name make sense to someone who knows about the domain, but is not in any way a coder? For example, we could call our main object Article instead of Paper. Usage differs even within academia. In the humanities they tend to be papers when delivered at conferences and articles when printed in journals. Students and teachers think of them as papers. Engineers and scientists tend to lean towards papers as well -- for them "article" has a more formal ring to it. I chose paper instead of article because it has less linguistic ambiguity and talking about "an editor's articles" makes me think of parts of speech as much as written documents. You'll find as we go along that I do some hand wringing each time a new name needs to be coined. The process is even tougher when dealing with join models and other nouns that don't have a precise correlation to words in the real world (at work right now we're thinking about changing the name of a model from Batch to Batching because it really represents an event wherein some things are joined together into a batch. Both of those choices sound ugly and are confusing in different contexts).

So, what attributes does a Paper have? Here's a transcription of the sketch I made on my way in from Portland:

  • title
  • created_at
  • updated_at
  • url?
  • file_column?

The first attribute is pretty self-explanatory. The next two are time stamps; created_at tells you when the paper first entered our system and updated_at when it was last changed. These are pretty standard in database-driven web apps and if you include them on your models in a Rails app, Rails will automatically make sure that they get set in the way you'd expect.

A note here about attributes and the role of the database in a Rails app. So far, we've talked about our models in terms of the way they capture real world objects into the abstraction of our design. From another point of view, though, our models are simply representations in code of the database tables we're going to create. The database acts as persistent memory for our program. Here's how it works. At various points along the way, for example when we create a fresh object, the instance of our model will correspond exactly to the state of one row in our database. In concrete terms, if we wrote:

thesis = Paper.create :title => "It's Not Just Academic"

Then the object stored in "thesis" would correspond exactly with a row in the papers table. Each of its attribute-reader methods would return precisely the values of the corresponding columns in the database. Now, say we start changing the values of our paper's attributes like so:

thesis.title = "It Is Just Academic"

Well, now the object we have in memory, the paper we're working with in our Ruby code, has diverged from the corresponding paper that we've got saved in the database. This will remain true until we call either "save" or "reload". Calling "save" makes Rails write our version of the object to the database, updating each of the columns so they represent the current values of the attributes. Calling "reload" causes Rails to revert the paper we've got in memory to the state stored in the database: attributes get reset to the values of their corresponding columns, and whatever information we'd placed into them is overwritten.
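If that's still a little abstract, here's a rough plain-Ruby sketch of the idea. It doesn't use Rails or a real database at all: ToyPaper and FAKE_TABLE are names I made up for illustration, and a hash stands in for the papers table.

```ruby
# A hash standing in for the papers table: row id => hash of column values.
# FAKE_TABLE and ToyPaper are hypothetical names, not part of Rails.
FAKE_TABLE = {}

class ToyPaper
  attr_accessor :title
  attr_reader :id

  def initialize(id, title)
    @id = id
    @title = title
  end

  # Build an object and immediately write it to the "table".
  def self.create(id, title)
    paper = new(id, title)
    paper.save
    paper
  end

  # Copy the in-memory attributes into the stored row.
  def save
    FAKE_TABLE[@id] = { title: @title }
    true
  end

  # Throw away in-memory changes and re-read the stored row.
  def reload
    @title = FAKE_TABLE[@id][:title]
    self
  end
end

thesis = ToyPaper.create(1, "It's Not Just Academic")
thesis.title = "It Is Just Academic" # memory and "database" now disagree
thesis.reload                        # memory is reset from the stored row
thesis.title = "It Is Just Academic"
thesis.save                          # now the stored row is updated too
```

The real ActiveRecord versions do a great deal more, but the split between "the object in memory" and "the row in storage" works the same way.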

The last two attributes on our Paper model, url and file_column, represent two different ideas I had for keeping track of the location of the actual HTML files that our authors upload. The first and simpler of the two (the one I'll probably start with, in other words) is url. That would just be a string that keeps track of the location in the file system to which we uploaded the HTML file. Under this system, the part of our code that accepts uploads will have to be sure to record the uploaded-file's name so that we'll know where to look for it and how to link to it. The other option "file_column" represents an option I know a little less about, the File Column Plugin. I've never actually used it myself, but I've heard tell of a Rails plugin that allows you to store uploaded files in the actual database itself, handling all of the conversion code so that you can access the file from the database just as you would any other attribute stored there. That sounds intriguing and probably has important optimization repercussions (in other words, it probably plays a big part in determining what resource the application will consume most voraciously: memory on disk, database calls, processor time, etc.). Right now, storing the url as a string seems simpler to me so I'm going to start with that while making a note that the file column plugin is something I should look into more closely later.

Setting Up the Database and Generating the Model

Now that does it for theory and it's time to start actually coding our app (finally!). Wait. Wait. I just realized we've got one more small piece of configuration business to take care of: setting up and configuring the database. This bit is easy and once you've made a few Rails apps you'll be able to do it by rote. There are a ton of different combinations of databases, database engines, operating systems, etc. out there, so I'm just going to tell you what I have to do to get setup. If you're running on a contemporary Mac with a well-configured copy of MySQL things shouldn't be too different for you. If not, Google around, there are plenty of resources out there to help you get things right. Here we go:

First I've got to create the trio of databases on which a Rails app depends: development, test, and production. I'll do this from the command line:

gabc:~/Sites/archive Greg$ mysql -p -u root
(type your root password)
mysql> create database archive_development;
mysql> create database archive_test;
mysql> create database archive_production;
mysql> exit

Then, I'll open up config/database.yml and add my MySQL password to each of the three entries. Now we should be totally good to go. Serious this time. Let's run the server just to make sure:

gabc:~/Sites/archive Greg$ mongrel_rails start -d

Bringing up localhost:3000 in my browser I see: "Welcome aboard: You're riding the Rails!"

At last, it's time to get started on our Paper model. First I'll run the Rails model generator to get all of the files I'll need created and setup:

gabc:~/Sites/archive Greg$ script/generate model Paper

This'll give us, in addition to the model itself, a unit test and fixtures that are all set up and ready to go as well as a migration for setting up the database to handle our new model.

I'll write the migration next since we've basically done all the work already when thinking about what attributes our papers need to have. Here it is (archive/db/migrate/001_create_papers.rb):

class CreatePapers < ActiveRecord::Migration
  def self.up
    create_table :papers do |t|
      t.column :title, :string
      t.column :url, :string
      t.column :updated_at, :datetime
      t.column :created_at, :datetime
    end
  end

  def self.down
    drop_table :papers
  end
end

The generator left me with empty self.up and self.down methods, which I've filled in to create the papers table with all the proper fields. Like I said above, the table that corresponds to our model is basically just another view on our model. When we save an individual Paper object the table will store the values that we've assigned to the object. And Rails provides us with convenient methods for reading them back out again. In a minute we'll get to using those, but first let's actually run our migration:

gabc:~/Sites/archive Greg$ rake migrate

Now the papers table exists and has the right fields. We can even go in right away and make a paper by hand if we want via Rails' "console", a shell the framework provides for interacting directly with our data. The console is a great place to sift through your data by hand or try out expressions when you're working on writing custom methods:

gabc:~/Sites/archive Greg$ script/console
>> thesis = Paper.new :title => "It's Not Just Academic"
=> #<Paper:0x26b6e5c @attributes={"updated_at"=>nil, "title"=>"It's Not Just Academic", "url"=>nil, "created_at"=>nil}, @new_record=true>
>> Paper.count
=> 0
>> thesis.save
=> true
>> Paper.count
=> 1
>> thesis
=> #<Paper:0x26b6e5c @attributes={"updated_at"=>Mon Aug 21 14:47:04 EDT 2006, "title"=>"It's Not Just Academic", "url"=>nil, "id"=>1, "created_at"=>Mon Aug 21 14:47:04 EDT 2006}, @new_record=false, @errors=#<ActiveRecord::Errors:0x2637a6c @base=#<Paper:0x26b6e5c ...>, @errors={}>>
>> thesis.title
=> "It's Not Just Academic"

If you follow along with that input, you'll see that I made a new paper with the title "It's Not Just Academic," storing it in a local variable called "thesis". Since I hadn't yet saved the new paper, there were still no papers to be found in the database. Then I did save it, which succeeded, returning true, and re-counted the papers in the database to discover that it was there now. Next, I looked at the object stored in thesis to find a paper different from the one I'd originally put there. It now had non-nil values for "created_at" and "updated_at" along with an additional instance variable by the name of @errors where Rails would store any errors that it happened upon while saving the object (you can read out the current errors on any object by saying something like this: thesis.errors.full_messages). And finally I used a method automatically added by Rails to read off the thesis's title attribute.

Validating the Presence of Papers' Titles

Ok. Now that we're past the total basics of getting our Paper model up and running, we can actually start doing something with it. What do we want the Paper model to do? Well, from when we thought about our screens earlier we know that when users upload papers they're going to be giving us two things: the title, and the HTML file. We're then going to need to store the title in the database, store the file in the filesystem, and store the file's location in the database as well, specifically in the url field we added to the papers table. It would be great if we could give the papers nice urls. For example, I'd love it if the url for my thesis could be something along the lines of: www.academicarchive.org/borenstein/art_history/its_not_just_academic.html. Now I don't want to think too hard about the "/borenstein/art_history" part right now because that's going to have to do with routing and right now I'm trying to concentrate on the Paper model. What I do know from this is that we don't want to save any papers into the database that don't have titles and we're going to want to figure out a system for making the titles our users give us safe to use as urls (there are rules about what can and can't be in a url, i.e. you can't have spaces, can't have apostrophes, they have to be under a certain length, etc.).
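Just to make that second job concrete before we set it aside, here's a rough sketch of what cleaning up a title might look like in plain Ruby. The method name title_to_slug and the 50-character limit are my own assumptions for illustration; we haven't actually decided on either yet.

```ruby
# Hypothetical helper: turn a paper title into a URL-safe file name.
MAX_SLUG_LENGTH = 50 # assumed limit, chosen only for illustration

def title_to_slug(title)
  slug = title.downcase
  slug = slug.gsub("'", "")           # drop apostrophes: "it's" -> "its"
  slug = slug.gsub(/[^a-z0-9]+/, "_") # runs of anything else become "_"
  slug = slug.gsub(/\A_+|_+\z/, "")   # trim leading/trailing underscores
  slug[0, MAX_SLUG_LENGTH]            # keep it under the length limit
end

puts title_to_slug("It's Not Just Academic") + ".html"
# prints: its_not_just_academic.html
```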

I want to take the first of these first: making sure that every paper we save in the database has a title. Thankfully, Rails makes this super easy with a system called validations. In essence, validations are just methods that automatically get run at different points in an object's life cycle (when you make a new one, when it gets saved, etc.), throwing errors unless the object meets certain criteria. When our app has actual views, we can use the validation errors to let our users know that they've done something wrong through on-screen feedback. At this point though, we're just going to use it to make sure that all of our papers have titles. The validation is a one-liner add, like so (in archive/app/models/paper.rb):

class Paper < ActiveRecord::Base
  validates_presence_of :title
end

What does Rails' implementation of this validation actually look like in practice? Let's jump into script/console and find out:

gabc:~/Sites/archive Greg$ script/console
Loading development environment.
>> thesis = Paper.new
=> #<Paper:0x2662e9c @attributes={"updated_at"=>nil, "title"=>nil, "url"=>nil, "created_at"=>nil}, @new_record=true>
>> thesis.title
=> nil
>> thesis.save!
ActiveRecord::RecordInvalid: Validation failed: Title can't be blank
from ./script/../config/../config/../vendor/rails/activerecord/lib/active_record/validations.rb:756:in `save!'
from (irb):3

You can see that we built a new paper and didn't assign it a title. Then when we tried to save the paper, Rails raised an "ActiveRecord::RecordInvalid" error that included a message explaining its cause and a traceback showing us exactly where in the code the problem came up (we called "save!" with the exclamation mark at the end because that tells Rails to throw an error in our face if one comes up instead of quietly returning false the way plain "save" does).
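To make the difference between "save" and "save!" concrete, here's a minimal plain-Ruby sketch. It's not how Rails implements validations internally; TitledPaper and ToyRecordInvalid are made-up stand-ins for a model and for ActiveRecord::RecordInvalid, and nothing actually gets stored anywhere.

```ruby
# Hypothetical stand-in for ActiveRecord::RecordInvalid.
class ToyRecordInvalid < StandardError; end

# Hypothetical stand-in for a model with validates_presence_of :title.
class TitledPaper
  attr_accessor :title
  attr_reader :errors

  def initialize(title = nil)
    @title = title
    @errors = []
  end

  # Rebuild the error list and report whether it came out empty.
  def valid?
    @errors = []
    @errors << "Title can't be blank" if title.nil? || title.strip.empty?
    @errors.empty?
  end

  # Quiet version: reports failure through its return value.
  def save
    valid?
  end

  # Loud version: raises when validation fails.
  def save!
    raise ToyRecordInvalid, "Validation failed: #{errors.join(', ')}" unless valid?
    true
  end
end

paper = TitledPaper.new
puts paper.save # prints: false
begin
  paper.save!
rescue ToyRecordInvalid => e
  puts e.message # prints: Validation failed: Title can't be blank
end
```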

Getting Started with Testing: Fixtures

Now that we've finally written some actual code, our next job is to make sure that code actually works as we expect it to and that means tests. Testing is a big subject, but suffice it to say here that it has two main purposes: to make sure our code does what we think it does and to make it easy for us to change our code later on (if we make a major change and all the tests still pass, that's a good sign that the rest of our code still works; if they don't, well that means we've probably got some fixing to do). (Don't worry if you're totally new to testing and the whole concept seems a little fuzzy to you. It will become clear in a minute when we actually write our first test -- tests are one of those things, like spiral staircases, that are much easier to show than to describe.)

Anyway, for our tests to be most effective, we want to cover as much of our code as possible and that means starting right away. The more untested code you write the less likely you are to ever go back and add tests and the more likely you are to end up with confusing, unmaintainable code. In fact, some people insist that you should "test first," writing tests that define the behavior you want from your code before writing your code itself. That way you don't "overcode"; you make sure not only that your code works, but that it doesn't have any undesirable side effects. We may do some test first development a little later on, but right now we're in a simple enough situation that I'm perfectly happy to start testing with a whopping one line of existing code.

What do we want to test? We want to test that our code actually does require each paper to have a title like we're trying to get it to and, further, that a paper without a title will always throw an error. So, the first thing we need is some fake papers to play around with for testing. As part of its testing suite, Rails gives us a place to create these papers: the fixtures. You can think of fixtures as just like tables in the database, only they happen to be represented in a flat file. At the start of a test run, Rails loads the data in these files into a temporary testing database so you can access it in your test methods. This makes it perfect for creating different scenarios against which to run your code and make sure that it does the right thing. In our case, we're going to want to make some papers and see if our code can tell whether or not they're valid.

Rails already created our fixture file for us when we generated the Paper model, so let's open it up and take a look (it lives at test/fixtures/papers.yml):

# Read about fixtures at http://ar.rubyonrails.org/classes/Fixtures.html
first:
  id: 1
another:
  id: 2

Here's how this works: the non-indented lines are "names" by which we can refer to each entry. The other lines are pairs of column names and row values in the table. It will quickly become clear if I show you how I turned the version of my thesis we were playing with before in script/console into a fixture:

first:
  id: 1
  title: "It's Not Just Academic"
  created_at: 2006-08-21 09:34:28
  updated_at: 2006-08-21 09:34:28

Pretty self-explanatory. The one gotcha is the format of the "created_at" and "updated_at" fields, which look different than what Ruby printed to the screen when we were in script/console. This is MySQL datetime format. When I can't remember how it goes, I make a new record in script/console and then just go look at my database using a GUI tool like YourSQL (especially when I'm on an airplane on the way from NY to San Francisco with no access to the web). There are a few other things that commonly go wrong when working with fixtures and I'll just point them out here, while we're on the subject: (1) the .yml format (rhymes with "camel") is super picky about white space; indentations need to be 2-spaces wide, there can only be one space between the colon and the value, etc. (2) each entry in a particular fixture file needs to have a unique id; if you accidentally re-use the same id twice in one file everything will go haywire. (3) the test database doesn't necessarily get reloaded each time you run your test, only if you run it under rake; sometimes this can get especially confusing because the fixtures that get loaded up for one test tend to stick around for the next one and so you can have tests that pass or fail depending on what order you run them in (for example a functional test that fails when you run "rake test:functionals" may pass if you run just "rake" (which runs the units first before the functionals)).
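If the structure of a fixture file still feels fuzzy, it can help to see what Ruby makes of it. Here's a small sketch, independent of Rails, that parses fixture-style text with Ruby's standard YAML library. The entry names ("first" and "another") and the load_fixture helper are just illustrative assumptions, and I've left the timestamps out to keep it short.

```ruby
require "yaml"

# Fixture-style text: non-indented names, then indented column/value pairs.
FIXTURE_TEXT = <<~FIXTURE
  first:
    id: 1
    title: "It's Not Just Academic"
  another:
    id: 2
FIXTURE

# Hypothetical helper: parse the text into a hash of name => columns.
def load_fixture(text)
  YAML.load(text)
end

load_fixture(FIXTURE_TEXT).each do |name, columns|
  puts "#{name}: id=#{columns['id']} title=#{columns['title'].inspect}"
end
```

Each name maps to a hash of column names and values, which is more or less the row that ends up in the test database.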

If you're totally new to tests, some of that may have just seemed like gibberish. Don't worry about it. You can always reread that paragraph if you're running into mysterious errors at some future point down the line...

Testing the Fixtures: Our First Test and First Test Helper

I'm back in Portland now and recovered from my travels. Where were we? That's right. We've got our fixture in place so it's time to write some tests! Before we try and test our actual code, though, it's probably a good idea to make sure that our fixture itself is well-formed, or else our tests will be pretty useless. I've got a little test helper method from some earlier projects that's super helpful for this (for full disclosure, like most things it was probably actually Chris's idea). If we want a method to be available to all our tests, we just stick it in test/test_helper.rb, so that's where we'll stick the following code (there's a helpful little comment in test_helper.rb that will guide you once you're in there):

def assert_all_valid klass
  klass.find(:all).each do |obj|
    assert obj.valid?, "#{obj.class} with id #{obj.id} is invalid"
  end
end

Let's walk through this method. First of all, it takes a class as an argument. Since "class" itself is a reserved word (a word that has special properties in Ruby and is hence unavailable as a name for a normal variable) we call it "klass". We might as well have called it "bob," but "klass" is conventional because it's easy to remember what it means. Once that's understood, there's not too much else going on here. We use Rails' "find(:all)" syntax to find all the members of our class and then we assert the validity of each particular member in turn, printing out a helpful message if the object is not valid. When defining custom test_helper methods of your own you'll save yourself a lot of headaches if you make the error message as specific as possible so that, when the test fails, it will be clear what went wrong as well as, importantly, which particular objects or attributes were involved (hence the inclusion of obj.class and obj.id in the message).

A note of syntactical explanation: Rails adds a method to our objects called "valid?" that returns true if the object passes its class's validations and false if not; "assert" is the simplest testing method, passing if its argument is true and failing if it is false. Put these two together and you've got a test that passes if and only if the object is valid.
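Here's a stripped-down sketch of how those two pieces fit together, written in plain Ruby so it runs without Rails or a test framework. Everything in it (toy_assert, ToyRecord, toy_assert_all_valid, ToyAssertionFailed) is a made-up miniature for illustration, not the real testing code.

```ruby
# Hypothetical error type raised when an assertion fails.
class ToyAssertionFailed < StandardError; end

# Passes quietly when the condition is truthy, raises otherwise.
def toy_assert(condition, message = "assertion failed")
  raise ToyAssertionFailed, message unless condition
  true
end

# A record that counts as valid only when it has a non-empty title.
ToyRecord = Struct.new(:id, :title) do
  def valid?
    !title.nil? && !title.empty?
  end
end

# Same shape as assert_all_valid, but over a plain array of records.
def toy_assert_all_valid(records)
  records.each do |obj|
    toy_assert obj.valid?, "#{obj.class} with id #{obj.id} is invalid"
  end
end

records = [ToyRecord.new(1, "It's Not Just Academic"), ToyRecord.new(2, nil)]
begin
  toy_assert_all_valid(records)
rescue ToyAssertionFailed => e
  puts e.message # prints: ToyRecord with id 2 is invalid
end
```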

Now, let's write and run the test. In the test for our Paper model that Rails automatically stubbed out for us (test/unit/paper_test.rb), we'll replace the sample method with:

def test_fixtures
  assert_all_valid Paper
end

save the file and then run the tests like so:

gabc:~/Sites/archive Greg$ rake test:units
(in /Users/Greg/Sites/archive)
/opt/local/bin/ruby -Ilib:test "/opt/local/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake/rake_test_loader.rb" "test/unit/paper_test.rb"
Loaded suite /opt/local/lib/ruby/gems/1.8/gems/rake-0.7.1/lib/rake/rake_test_loader
Finished in 0.186302 seconds.
1) Failure:
[./test/unit/../test_helper.rb:30:in `assert_all_valid'
./test/unit/../test_helper.rb:29:in `assert_all_valid'
./test/unit/paper_test.rb:7:in `test_fixtures']:
Paper with id 2 is invalid.
<false> is not true.
1 tests, 2 assertions, 1 failures, 0 errors
rake aborted!
Command failed with status (1): [/opt/local/bin/ruby -Ilib:test "/opt/local...]

What's this? Our very first test and we've already failed it! Well, thanks to the message we added to our custom assertion, it's really easy to tell what's going on: we have an invalid paper in our fixtures (when tests fail or throw errors they print out Es and Fs and then report back on the problem with a trace, showing which lines in which files got run before the problem hit; if you're trying to track down a less obvious problem than this one, that trace will be your lifeline). If we look at our paper fixtures (test/fixtures/papers.yml), we'll see that, in addition to the paper we created above, we've got the second one that Rails automatically created for us still hanging around:

another:
  id: 2

And that paper is definitely not valid. Remember, we're validating the presence of our papers' titles and this one hasn't got one. It's only got an id. So, in order to get this test to pass, we've got to either delete this paper from our fixture or edit it so it'll be valid. Let's do the latter, like so:

another:
  id: 2
  title: "Simulacra and Simulacrum"
  created_at: 1996-08-21 09:34:28
  updated_at: 1996-08-21 09:34:28

Now, saving the file and rerunning should result in our first clean test run:

gabc:~/Sites/archive Greg$ rake db:test:prepare
(in /Users/Greg/Sites/archive)
gabc:~/Sites/archive Greg$ ruby test/unit/paper_test.rb
Loaded suite test/unit/paper_test
Finished in 0.194362 seconds.
1 tests, 2 assertions, 0 failures, 0 errors

This is great! After a little bit of setup, we've successfully tested the code we just wrote: our validation catches papers that don't have titles.

Looking a little closer at the output from the test run, notice that we got credit for two assertions rather than just one. That's because the test framework counts every call to "assert", and "assert_all_valid" calls "assert obj.valid?" once for each paper in our fixtures. We have two papers, so we get two assertions; with three papers we'd see three, and so on.

Running Tests: Under Rake, Under Ruby

It is probably worthwhile to spend a moment here on some of the specifics involved when running tests. There are four basic ways to run tests: a "full rake", just the units (the tests that exercise our models), just the functionals (those that exercise our controllers), or individual test files one at a time. The first three we do by invoking rake ("rake", "rake test:units", and "rake test:functionals" respectively) and the last we do by just running the test file as if it were any other Ruby program ("ruby test/unit/paper_test.rb", for example). When you run your tests, Rails uses a different database from the one you're developing on. If you remember some of the configuration we did above, when we set up our database.yml file, we told Rails to use a database called "archive_test" for this purpose. At the start of each run, rake clears that database and then loads it up with the data you stored in your fixtures so that you'll have a controlled environment in which to do your testing. Further, the Rails testing framework keeps the data generated in each test method from polluting your database for other methods. Each test method gets a clean start.

Besides running different sets of test files, each of the three different rakes (full, units, and functionals) does this database destroying and recreating process separately. So, if you run a full rake, your test database gets destroyed and recreated twice, once at the start of the rake when the units run and once halfway through before the functionals do. Since rake only loads up the tables that you tell it to (by including different sets of fixtures at the top of each of your test files) this ordering can mean that you can get different results from the same test! Let's say you were working on a functional test. When you run that test under rake test:functionals only the set of tables explicitly asked for in the functional tests get loaded. Under a full rake the units run first, so by the time your tests get run, the tables created by the units will still be hanging around. If your test's passing or failing hinges on this difference, you'll see different behavior in the two situations. If you encounter this issue just make sure that each of your tests calls all of the fixtures that it needs (don't forget the ones being referenced through associations either!).

And finally, when you're just running a single test like "ruby test/unit/paper_test.rb" -- which can be a real time saver once you've got a lot of tests written and running the whole suite takes a full minute or two -- you don't have the benefit of rake's database loading at all. Your test will run with whatever state the test database was left in by your last rake. This can produce some seriously strange results that will have you chasing ghost bugs that aren't really there. To prevent that problem, simply run "rake db:test:prepare" before your test and rake will set up your test database just how you want it.

How To Write a Test: Given, When, Then

Now, while our first test definitely exercised the code we just wrote (the validation obviously got run), it plays kind of a more general role: guarding our paper fixtures from any invalid data. More to the point, if we stopped validating on the presence of a paper's title, the test would still pass (try it, go delete the whole line and then rerun your tests). Therefore, this can't quite be said to be a test on that validation as such. So, let's write one.

How, generally, do you write a test? Well, most tests have three parts: the setup that must be in place to accomplish some action, the actual code that runs the action (this is the code you're trying to test), and then some ideas about what we expect the effect of that action to be. Splitting these parts up in your mind and then addressing them one at a time usually makes it much easier to write a test. When I start my test methods, I find it helps to start by writing these parts down explicitly as comments so I can keep focused on exactly what I have to do (plus it lets me do a bunch of typing, which feels productive, without having to actually do any thinking), like so (in test/unit/paper_test.rb):

def test_validates_presence_of_title
  # given
  # when
  # then
end

Giving tests descriptive names is always a good idea since the whole point of them is that if you ever see them in a test run they should tell you exactly what's gone wrong. Rails will only run test methods that actually start with "test_", so a good recipe for naming tends to be appending some description of what you're testing onto there.

Back to the question of how to test our validation. Let's try to say the three parts of our test in words. Given a paper that has no title, when we try to save it, then the paper should throw an error, remain unsaved, and report itself invalid. Now, that's starting to sound like something I could write up in code. I'll give it a shot. Here's my first draft:

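def test_validates_presence_of_title
  p = Paper.new       # given: a paper with no title
  p.save!             # when: we try to save it
  assert !p.valid?    # then: it should report itself invalid
end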

I make a new paper. Don't assign it a title. Try to save it. And then assert that it is not valid. Just like I planned. What happens when I run that test?

1) Error:
ActiveRecord::RecordInvalid: Validation failed: Title can't be blank
/Users/Greg/Sites/archive/config/../vendor/rails/activerecord/lib/active_record/validations.rb:756:in `save!'
./test/unit/paper_test.rb:14:in `test_validates_presence_of_title'

Oops! Trying to save the paper failed, like it was supposed to, but the error that it threw prevented the rest of our test from executing. What we need to do is wrap our save call in an assertion which knows to expect the error, like so:

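def test_validates_presence_of_title
  p = Paper.new
  # expect the failed save to raise instead of letting it abort the test
  assert_raises(ActiveRecord::RecordInvalid) { p.save! }
  assert !p.valid?
end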

This is a passing test. The assert_raises assertion takes an error type as an argument (thankfully we knew exactly what type of error to expect since we'd already seen it on the first run) and passes only if the code in its block throws that error.

Now, I'll show you just one more iteration of this test with a few more trimmings:

def test_validates_presence_of_title
  paper_count = Paper.count
  p = Paper.new
  assert !p.title
  assert_raises(ActiveRecord::RecordInvalid) { p.save! }
  assert !p.valid?
  assert_equal paper_count, Paper.count
end

What have I added? Start with the first and last lines. One of the things we'd said we wanted to test was that the paper should remain unsaved. Well, there are two sides to that: the object's side and the model's side. We're already testing for the error thrown by the call to "save!", but now we want to test the model side, i.e. that the number of papers in the database doesn't change. To test that, we store the count of papers in a local variable (paper_count) on the first line and then compare it to a fresh count on the last line ("count" is a useful method that Rails adds to all of your model classes; it returns the same number as Model.find(:all).length, but gets it by asking the database for a count rather than loading every record). As long as these two are the same, we'll know that nothing we've done has affected the count of papers in the database.

The other thing I've added is the assertion that, just after it is newly made, the paper does not have a title. While somewhat extraneous, the purpose of this assertion is to make explicit one of the assumptions in our given state: a new paper doesn't have a title. Since it's the very absence of that title that renders the paper invalid, it made sense to write an assertion verifying it before getting to the heart of the matter.
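If you'd like to poke at the given/when/then shape without a Rails app and a test database handy, here's a minimal, self-contained sketch. Big assumptions: the Paper class below is a tiny in-memory stand-in I've made up for illustration (it is not ActiveRecord), RecordInvalid stands in for ActiveRecord::RecordInvalid, and the test runs under Minitest rather than the Rails test setup used in this series.

```ruby
require "minitest/autorun"

# Stand-in for ActiveRecord::RecordInvalid (an assumption for this sketch).
class RecordInvalid < StandardError; end

# Hypothetical in-memory stand-in for the Paper model; not ActiveRecord.
class Paper
  @saved = []

  class << self
    attr_reader :saved

    # Mimics Paper.count: how many papers have been saved so far.
    def count
      saved.length
    end
  end

  attr_accessor :title

  def initialize(attrs = {})
    @title = attrs[:title]
  end

  # Mimics validates_presence_of :title.
  def valid?
    !title.to_s.strip.empty?
  end

  # Mimics save!: raises instead of storing an invalid paper.
  def save!
    raise RecordInvalid, "Validation failed: Title can't be blank" unless valid?
    self.class.saved << self
    true
  end
end

class PaperTest < Minitest::Test
  def test_validates_presence_of_title
    # given: a brand new paper with no title
    paper_count = Paper.count
    paper = Paper.new
    assert_nil paper.title

    # when: we try to save it
    assert_raises(RecordInvalid) { paper.save! }

    # then: it reports itself invalid and nothing was stored
    refute paper.valid?
    assert_equal paper_count, Paper.count
  end
end
```

Save it to any file and run it with plain "ruby". The three comments map one-to-one onto the three parts of the test described above, which is the only point of the exercise; the real test still belongs in test/unit/paper_test.rb against the real model.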

Philosophy of Testing

Is this overkill? This particular example is obviously somewhat contrived. I probably wouldn't be this thorough in testing such a simple situation if I weren't trying to demonstrate the ins and outs of my thought process while writing tests. But what should our "philosophy of testing" be? Is it possible to have too many tests? What should be the thrust of the tests that we do write?

Like so many other things, answers to these questions are partially a matter of taste and partially a matter of responding to the particular situation you find yourself in. Both of those are hard to learn any way other than through experience (I work all day with coders who are better at them than I am), but I think I can lay down a few guidelines that shape my thinking.

Let's start with some don'ts:

  • Don't test something that's part of the framework or a third-party library. If you don't trust other people's code enough to use it without redundant testing, you should probably just avoid using it altogether. Plus, this is just unnecessary extra work when the whole point of using libraries and frameworks is to avoid duplicating effort that other people have already put in. (To a certain extent we're breaking this rule in our test above, but not too badly. The key difference is that we're testing whether we've successfully used the framework to enforce a business logic rule (that papers must have titles) rather than whether or not the framework's code for enforcing that rule works in the first place.)
  • Don't let your tests lock down the specifics of your code too much. When I first got into the swing of writing tests, I got hooked on assertions. I wanted to run up the score, to see more dots zoom across my screen. And so for a while, I picked up the bad habit of writing assertions on everything I knew to be true in my code: the exact wording of error messages, the exact values of a bunch of attributes in the fixtures, etc. This turned out to be a bad idea because it made my tests incredibly fragile. Anytime I'd twiddle around with my fixtures at all (say, to fix a typo), my tests would break. My tests were making more work for me when they were supposed to make my life easier. Which brings me to. . .

The dos:

  • Do write tests that ensure outcomes. Our goal with writing tests is to leverage a specific situation we've thought of (and, often, captured in the fixtures) into a general structure that will make sure that our code will act right in all situations. For example, in testing our validation above, we could have written something like this:

    def test_validates_presence_of_title
      p = Paper.new
      assert !p.valid?
      p.title = "My title"
      assert p.valid?
    end

    On the surface, this test seems a lot like the one we wrote above. It asserts that a paper without a title is not valid, adds a title to the paper, and then asserts that the paper is valid. What it doesn't do is engage with the more general purpose of our validation: preventing papers that lack titles from getting saved to the database. It also has some specifics hard-coded into it: the choice of "My title" as a title. While that seems fine right now, what if we made a change later on that, say, required all of our titles to be formatted in Unicode for internationalization? Then this test would start to fail even though it was unrelated to the new code we were trying to write. It would become yet another spot in our code we had to change to add a new feature or to alter our design.
  • Do write tests first to specify behavior. Oftentimes tests are just a better medium in which to think about the design for your program than the program itself. Writing a test lets you think precisely about what you want your code to do without worrying about how it's going to have to get it done. For example, take the goal I mentioned of having pretty URLs for our papers (getting the URL for my thesis to end with "its_not_just_academic.html"). Well, I still don't have a clear plan for how to accomplish that goal, but I know how to write a test on it:

    def test_paper_url
      p = Paper.new :title => "It's Not Just Academic"
      assert_equal "its_not_just_academic.html", p.url
    end

    Right now, running this test will result in a failure:

    1) Failure:
    test_paper_url(PaperTest) [./test/unit/paper_test.rb:28]:
    <"its_not_just_academic.html"> expected but was

    But now I've got the beginning of a kind of objective standard against which I can write my system for generating papers' urls from their titles. If this test (and presumably some others) passed then I would be done. Working this way lets me focus on making the individual parts of my code work without having to constantly be trying to remember what the point of all of this code was. The tests keep track of the larger context so I don't have to.
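To make that target concrete, here is one rough way a title could be turned into the file name the test expects. This is only a sketch under stated assumptions, not the approach this series settles on: the helper name url_for_title is made up for illustration, and the slug rules (lowercase, drop punctuation, underscores for spaces) are just the simplest ones that satisfy the single example above.

```ruby
# Hypothetical helper (the name and the slug rules are assumptions, not
# part of the app): derive a file name from a paper's title.
def url_for_title(title)
  slug = title.downcase
  slug = slug.gsub(/[^a-z0-9\s]/, "")  # drop punctuation such as apostrophes
  slug = slug.strip.gsub(/\s+/, "_")   # turn runs of whitespace into underscores
  "#{slug}.html"
end

puts url_for_title("It's Not Just Academic")  # prints its_not_just_academic.html
```

Note how the test came first and the sketch only had to satisfy it. Titles with accented characters or other awkward input would need more tests before a helper like this could be trusted.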

If all of these testing ideas seem a bit hypothetical to you right now, don't worry about it. Hopefully you'll be seeing them all in practice a lot as we continue work.

Speaking of which, now that we've got a basic Paper model, it's time for us to write some real screens with the forms and views that our users will interact with to upload their own papers. So, stay tuned until next time when we'll:

gabc:~/Sites/archive Greg$ script/generate controller papers
