Ruby on Rails – ODIN Project – 3

odinDatabases and Active Record

This section covers the back end of Rails, which is the most important part of the framework. You’ll learn about databases and go deep into SQL before applying that knowledge to Rails’ Active Record gem.


 Step 1: Databases and SQL

Data is the core of every major web app and here you’ll learn how to use SQL.


Databases and SQL

Introduction

Data is the core of any good web app and a good working knowledge of SQL will take you a long way. That knowledge lets you not just understand what’s going on behind the scenes with your ORM tool (ie. Active Record) but also to feel comfortable asking more complicated questions of your data. And that’s really what SQL is all about – asking questions of your database and occasionally also adding or changing things in it. Querying can be incredibly useful for you.

In more straightforward cases, you might want to display all users who signed up in December via the promotion code “FREESTUFF”. You might want to display all comments created by the current user and sorted by topic and creation date. In more complex cases, you may want to show a list of all the orders shipped to states with more than 1,000 users by quantity and total order value. Or, for internal reasons, you might ask marketing analysis questions like which promotion channels produce users who meet your specified engagement criteria of reading 5 articles per work week.

All of those examples involve you engaging with your database. Luckily, many databases (the ones we’ll be focusing on) speak SQL and luckily SQL really only has a dozen or so words that you’ll consistently use out of a few dozen total. So it’s not a large language, it’s much more about the concepts behind it.

You will start with the questions like the ones above and then have to figure out how to ask them properly of your database, which likely has a bunch of different tables in it. Everyone probably visualizes it a bit differently, but finding a way to visualize what’s going on when you do SQL queries is pretty important. I actually think of Excel tables moving in my head and combining with each other and reshuffling as necessary.

We’ll move beyond just the simple SELECT "users".* FROM "users" LIMIT 1 queries (this one is used whenever you ask Rails for the first user with User.first) and into more dynamic topics like joining tables together, performing calculations on the results, and grouping results together in new ways.

All this is being used by Rails behind the scenes so understanding it will make you much better at writing queries in Rails. This is why we’re going over databases before learning Active Record. We’ll do the same thing with forms.  You’ll learn how to build them in HTML and then how to use Rails to make them for you.


A Note on Resources

SQL is one of those topics that’s been stored away in dusty old technical manuals and 90’s style websites. Even the best books out there can make it seem oddly complicated because they tend to write for the database engineer who actually does need to know all the details.

Though the prevalence of web applications these days has grown the demand among new users to focus on understanding the concepts of SQL, the learning tools haven’t really caught up. We’ll do our best to impart those concepts using the tools available.

Points to Ponder

Look through these now and then use them to test yourself after doing the tasks

Important Concepts:

  • What is the Primary Key?
  • What are Foreign Keys?
  • What is a Schema?

Statements:

  • SELECT
  • CREATE TABLE
  • DROP TABLE
  • CREATE INDEX
  • DROP INDEX
  • UPDATE
  • DELETE
  • INSERT INTO
  • CREATE DATABASE
  • DROP DATABASE
  • COMMIT (concept)
  • ROLLBACK (concept)

Clauses:

  • DISTINCT
  • WHERE
  • IN
  • AND
  • OR
  • BETWEEN
  • LIKE
  • ORDER BY
  • COUNT

Functions

  • GROUP BY
  • HAVING
  • AVG
  • COUNT
  • MIN
  • MAX
  • SUM

Other

  • What are Indexes good for?
  • What’s the difference between WHERE and HAVING?

The World’s Fastest Semi-Complete Explanation of SQL

This is a very brief explanation of SQL. It won’t do a particularly good job teaching you specific new tactics but should present a general overview to have going into the reading tasks.

SQL is the language used to talk to many relational databases. These databases use lots of tables to store different types of data (ie. “users” and “posts” tables). Tables are long lists like spreadsheets where each row is a different record (or object, ie. a single user) and each column is one of that record’s attributes (like name, email, etc). The one column that all tables include is an “ID” column, which gives the unique row numbers, and is called the record’s “primary key”.

You can “link” tables together by making one of the columns in one table point to the ID of another table, for instance a row in the “posts” table might include the author’s ID under the column called “user_id”. Because the “posts” table has the ID of another table in it, that column is called a “foreign key”.


Setting Up

SQL lets you do everything. The first category of commands are for setting up the database (CREATE DATABASE), setting up an individual table (CREATE TABLE), and similar commands for altering or destroying them. The setup information for your database is stored in a special file called the “Schema”, and this is updated whenever you make changes to the structure of your database. Think of the schema as saying “here’s our database and it’s got a couple tables. The first table is ‘users’ and it’s got columns for ‘ID’ (which is an integer), ‘name’ (which is a number of characters), ’email’ (which is a number of characters) …”

In addition to setting up tables, you can tell your database to only allow unique values in a particular column (e.g. for usernames) or to index a column for faster searching later with CREATE INDEX. Create indexes, which basically do all the hard work of sorting your table ahead of time, for columns that you’ll likely be using to search on later (like username). It will make your database much faster.

SQL likes semicolons at the end of lines and using single quotes (‘) instead of double quotes(“).


 Mucking Around with Data

Once your database is set up and you’ve got empty tables to work with, you use SQL’s statements to start populating it. The main actions you want to do are CRUD (which we’ve seen before) – Create, Read, Update, and Destroy. Most of the commands you run will fall under the “Read” category, since you’ll spend lots of time asking questions of your data and trying to display it.

Every CRUDdy command in SQL contains a few parts – the action (“statement”), the table it should run on, and the conditions (“clauses”). If you just do an action on a table without specifying conditions, it will apply to the whole database and you’ll probably break something.

For “Destroy” queries, the classic mistake is typing DELETE * FROM users without a WHERE clause, which removes all your users from the table. You probably needed to delete just one user, who you would specify based on some (hopefully unique) attribute like “name” or “id” as part of your condition clause, e.g. DELETE * FROM users WHERE users.id = 1. You can do all kinds of common sense things like using >, <, <= etc. comparison operators to specify groups of rows to run commands on or logical operators like AND, OR, NOT etc to chain multiple clauses together, ie. DELETE * FROM users WHERE id > 12 AND name = 'foo'.

“Create” queries use INSERT INTO and you’ll need to specify which columns to insert stuff into and then which values to put in those columns, which looks something like INSERT INTO Users (name, email) VALUES ('foobar','foo@bar.com');. This is one of the few queries that you don’t need to be careful about which rows you’ve selected since you’re actually just adding new ones into the table.

“Update” queries use UPDATE and you’ll need to tell it what data to SET (using key=”value” pairs) and which rows to do those updates for. Be careful because if your WHERE clause finds multiple rows (ie. if you’ve searched based on a common first name), they’ll all get updated. A standard query for updating a user’s email may look something like the following (though in the real world you’d search on ID because it’s always unique):

    UPDATE Users 
    SET name='barfoo', email='bar@foo.com' 
    WHERE email='foo@bar.com';`

“Read” queries, which use SELECT, are the most common, e.g. SELECT * FROM users WHERE created_at < '2013-12-11 15:35:59 -0800'. The * you see just says “all the columns”. Specify a column using both the table name and the column name. You can get away with just the column name for simple queries but as soon as there are more than one table involved, SQL will yell at you so just always specify the table name: SELECT users.id, users.name FROM users.

A close cousin of SELECT, for if you only want unique values of a column, is SELECT DISTINCT. Say you want a list of all the different names of your users without any duplicates… try SELECT DISTINCT users.name FROM users.


 Mashing Tables Together

If you want to get all the posts created by a given user, you need to tell SQL which columns it should use to zip the tables together with the ON clause. Perform the “zipping” with the JOIN command. But wait, if you mash two tables together where the data doesn’t perfectly match up (e.g. there are multiple posts for one user), which rows do you actually keep? There are four different possibilities:

Note: the “left” table is the original table (the one that the FROM clause was ON), ie. “users” in examples below.

1.INNER JOIN, aka JOIN – Your best friend and 95% of what you’ll use. Keeps only the rows from both tables where they match up. If you asked for all the posts for all users (SELECT * FROM users JOIN posts ON users.id = posts.user_id), it would return only the users who have actually written posts and only posts which have specified their author in the user_id column. If an author has written multiple posts, there will be multiple rows returned (but the columns containing the user data will just be repeated).

2.LEFT OUTER JOIN – Keep all the rows from the left table and add on any rows from the right table which match up to the left table’s. Set any empty cells this produces to NULL. ie. return all the users whether they have written posts or not. If they do have posts, list those posts as above. If not, set the columns we asked for from the “posts” table to NULL.

3.RIGHT OUTER JOIN – The opposite. Keep all rows in the right table.

4.FULL OUTER JOIN – Keep all rows from all tables, even if there are mismatches between them. Set any mismatched cells to NULL.

Joins naturally let you specify conditions too, like if you only want the posts from a specific user: SELECT * FROM users JOIN posts ON users.id = posts.user_id WHERE users.id = 42.


Using Functions to Aggregate Your Data

When you run a vanilla SQL query, you often get back a bunch of rows. Sometimes you want to just return a single relevant value that aggregates a column, like the COUNT of posts a user has written. In this case, just use one of the helpful “aggregate” functions offered by SQL (most of which you’d expect to be there – functions like SUM and MIN and MAX etc). You include the function as a part of the SELECT statement, like SELECT MAX(users.age) FROM users. The function will operate on just a single column unless you specify *, which only works for some functions like COUNT (because how would you MAX a column for “name”?).

You often see aliases (AS) used to rename columns or aggregate functions so you can call them by that alias later, ie. SELECT MAX(users.age) AS highest_age FROM users will return a column called highest_age with the maximum age in it.

Now we’re getting into the fun stuff. Aggregate functions like COUNT which return just a single value for your whole dataset are nice, but they become really useful when you want to use them on very specific chunks of your data and then group them together, ie. displaying the COUNT of posts for EACH user (as opposed to the count of all posts by all users). That would look like:

    SELECT users.name, COUNT(posts.*) AS posts_written
    FROM users
    JOIN posts ON users.id = posts.user_id
    GROUP BY users.name;

The last trick is if you want to only display a subset of your data. In a normal situation, you’d use a WHERE clause to narrow it down. But if you’ve used an aggregate function like COUNT (say to get the count of posts written for each user in the example above), WHERE won’t work anymore. So to conditionally retrieve records based on aggregate functions, you use the HAVING function, which is essentially the WHERE for aggregates. So say I only want to display users who have written more than 10 posts:

    SELECT users.name, COUNT(posts.*) AS posts_written
    FROM users
    JOIN posts ON users.id = posts.user_id
    GROUP BY users.name
    HAVING posts_written >= 10;
  • Go back to the W3 example (above) and add the line HAVING NumberOfOrders > 60; to the bottom.

SQL is faster than Ruby!

Learning SQL is particularly relevant because it’s MUCH faster for you to build queries that use SQL intelligently than to just get a lot of data out of your database and then use Ruby to process it. For instance, if you want all the unique names of your users, you COULD just grab the whole list from your database using SQL like SELECT users.name FROM users (which Active Record will do for you with User.select(:name)) then remove duplicates using Ruby’s #uniq method, e.g. User.select(:name).uniq… but that requires you to pull all that data out of your database and then put it into memory and then iterate through it using Ruby. Use SELECT DISTINCT users.name FROM users instead to have SQL do it all in one step.

SQL is built to be fast. It has a special query optimizer which takes a look at the whole query you’re about to run and it figures out exactly which tables it needs to join together and how it can most quickly execute the query. The difference between using SELECT and SELECT DISTINCT is negligible compared to the time cost of doing it in Ruby. Learning your SQL will help you write Active Record queries that can do more which will make you app much faster.


Your Tasks

1. Look at the Databases section (http://bedford-computing.co.uk/learning/ruby-on-rails/web-development-101-odin-project/)

2. Read this simple SQL Course (http://www.sqlcourse.com/) for the basics.

3. Read this more advanced SQL Course (http://www.sqlcourse2.com/) to learn about the advanced SELECT stuff like aggregate functions and GROUP BY


 Conclusion

The next step, once you’ve had a chance to practice this all in the projects, is to apply it to Rails with Active Record. You’ll quickly find that Active Record makes your life much, much, much easier.


Additional Resources

  • Read Zed Shaw’s Learn SQL The Hard Way (http://sql.learncodethehardway.org/). The book is imperfect and also incomplete but should help solidify things for you. It’ll have you doing a fair bit of work from the command line, so you’ll get a chance to catch up on your Bash scripting as well.

Project: SQL and Databases

Project: SQL Zoo

SQL Zoo is one of the few resources online that actually lets you build and run queries against existing tables. Each tutorial will show you a table and then have you run queries against it to answer specific questions. Some of the queries, particularly at first, are quite simple. They definitely get more challenging towards the end.


 Your Task

1. Go to SQL Zoo (http://sqlzoo.net/wiki/SQL_Tutorial) and complete all 10 tutorials listed under the SQL Tutorial – Tutorials: Learn SQL in stages section and the quizzes listed at the end of each. The first is Select basics (http://sqlzoo.net/wiki/SELECT_basics). Make sure the drop-down on the left of the main page for Engine says MySQL (the default). Large results will be cut off and not all rows or columns shown, so the answers may not look 100% correct.


Additional Resources


Step 2: Active Record Basics

Active Record turns all the bare database queries (like SQL) into Ruby methods.

Introduction

Probably the most important way that logical thinking is required when building a website is in setting up your data model properly. Data is the foundation of almost all major web applications, from a simple blog site to Facebook’s massively complex web of data. Having an obscure or overly complex data model can cripple you when you try to grow and make your life as a developer exceedingly painful. If you’re working with the wrong tools, something “simple” like asking to display all the comments a user has made on another user’s web posts can take up far too many brain and CPU cycles to accomplish.

If data is the most important piece of a web application, then how Rails handles data should be very interesting to you. Luckily, this is one of the most significant reasons that Rails has performed so well compared with the options available just a few years ago. ActiveRecord is the interface that Rails gives you between the database and your application. It lets you structure your data models for your users, blog posts, comments, followers, etc. in a logical and nearly plain-English way. If it seems complicated (which it will at points), just imagine life before Active Record.

Having a solid understanding of Active Record will make the rest of Rails seem simple by comparison.  In this lesson, we’ll cover all the basics of working with models, from setting them up to building simple associations between them.

Points to Ponder

Look through these now and then use them to test yourself after doing the tasks

  • What is an ORM?
  • Why is Active Record more useful than just using SQL?
  • What are the two steps required to make a new row in your database table with ActiveRecord?
  • What are “generators” in Rails?

What is an ORM?

What is Active Record anyway? Recall that Rails is actually seven Ruby gems that work harmoniously together. Active Record is, to put it inelegantly, the gem that takes care of all the database activity. It’s known as an ORM.

ORM stands for Object-Relational-Mapping. It basically means that Active Record takes data which is stored in a database table using rows and columns, which needs to be modified or retrieved by writing SQL statements (if you’re using a SQL database), and it lets you interact with that data as though it was a normal Ruby object.

So if I want to get an array containing a listing of all the users, instead of writing code to initiate a connection to the database, then doing some sort of SELECT * FROM users query, and converting those results into an array, you can just type User.all and Active Record gives you an array filled with User objects that you can use.

Even more impressive, it doesn’t really matter which type of database you’re using (as long as you’ve set up the config/database.yml file properly), Active Record smooths out all the differences between those databases for you so you don’t have to think about it. You focus on writing code for your application, and Active Record will think about the  details of connecting you to your database. It also means that if you switch from one database to another, you don’t actually need to change any major application code – just some configuration files.


Rails Models

That’s a step ahead of ourselves, though, because first it makes sense to think about what the relationship is between Rails and a database anyway. It’s actually pretty straightforward — you want to store information about your users, so you create a database table called users. You want to be able to access that data from your application, so you create a model called User, which is really just a Ruby file which inherits from Active Record and thus gets to use all the methods referred to earlier like all and find and create. One table corresponds with one model which inherits from Active Record.


 30 Seconds About Working With Models

Very briefly, Active Record lets you create a Ruby object that represents a row in one of your database tables, like a User. To create a new User is a two-step process: First, you’ll need to do a User.new and might pass it a hash full of its attributes like

u = User.new(name: "Sven", email: "sven@theodinproject.com")

If you don’t pass a hash, you’ll need to manually add the attributes by setting them like with any other Ruby object: u.name = "Sven". The second step is to actually save that model instance into the database. Until now, it’s just been sitting in memory and evaporates if you don’t do anything with it. To save, simply call u.save. You can run both steps at once using the #create method:

u = User.create(name: "Sven", email: "sven@theodinproject.com")

This saves you time, but, as you’ll see later, you’ll sometimes want to separate them in your application.


 Your Tasks

That was really just a ‘teaser’ about what Active Record can do. In the reading below, you’ll learn about how to specifically interact with Active Record in your models.

  1. Read the Rails Guides Active Record Basics (http://guides.rubyonrails.org/active_record_basics.html).
  • We’ll go more into Migrations and Validations in the next section.
  • Model files in Rails live in the app/models folder and are just normal .rb files. The key points are that the file and the class name is named after the table in your database (but singular), and that class inherits from ActiveRecord::Base to get its super powers.

Migrations

When You Need Them

Imagine you’re staring at a blank computer screen and you need to start your new Rails project. What’s the first thing you do? You type $ rails new MyProjectName then cd into that directory.  Then what?

Figure out the data models that you’ll need to use for the first iteration of your site and start getting them set up. For our purposes, we’ll just assume all you need is the ubiquitous User model to keep track of all the dozens of users who will be on your site someday. After you’ve actually created the database in the first place (using $ rake db:create), to create that model you’ll need to do two steps:

1. Create a model file in app/models which is set up like you just learned above.

2. Create a database table called “users” that has the appropriate columns. This is done using a migration file and then running the migration.

The best part is that Rails knows that you want to do this and has given you a handy shortcut for doing so: the $ rails generate model YourModelNameHere command. When you type it in, you will see in the Terminal output which files are being created. Don’t worry about any specs or test files that also get created, the important ones are the model file and the migration file. Rails has lots of these handy generators which don’t do much except create new files in the right spots of your application for you. The output looks something like:

  invoke  active_record
  create    db/migrate/20131223154310_create_testmodels.rb
  create    app/models/testmodel.rb
  invoke    rspec
  create      spec/models/testmodel_spec.rb

The model file that the generator creates is just a bare-bones model file in the app/models directory (which you could easily have created yourself). The other main file is the migration file in the db/migrations folder, which starts with a complicated looking timestamp like 20130924230504_create_users.rb.

If you dive into that file, you’ll see that there’s not much in it except another bare-bones ruby class that inherits from ActiveRecord::Migration and some timestamps[*]. The timestamps just create created_at and updated_at columns for you so you can track when your database records were created or modified. These two columns are just helpful enough that they are included as standard practice.

If you want to only create the database migration file (without the Model or any of the test files), just use $ rails generate migration NameYourMigration. You’ll end up using this one more once you’ve got things up and running since you’ll probably be modifying your data table instead of creating a new one. There’s a syntax for specifying additional parameters when you call this, but there’s no need to remember that syntax because you can also manually go in and edit the migration file yourself.

[*]: Unless you passed the Rails generator the column names you wanted, in which case they would show up automatically in the migration fields. Generators let you pass in arguments to do even more for you.


 What Are They?

So what’s a migration? A migration is basically a script that tells Rails how you want to set up or change a database. It’s the other part of Active Record magic that allows you to avoid manually going in and writing SQL code to create your database table. You just specify the correct Ruby method (like the aptly named create_table) and its parameters and you’re almost good to go.

Migrations are just a script, so how do you tell Rails to run that script and actually execute the code to create your table and update your database’s schema? By using the $ rake db:migrate command, which runs any migrations that haven’t yet been run. Rails knows this because it keeps track of which migrations have been run (using timestamps) behind the scenes. When you run that command, Rails will execute the proper SQL code to set up your database table and you can go back to actually building the website.

Why is this useful? Obviously it lets you set up your database using user-friendly Ruby code instead of SQL, but it’s more than that. Over time, you’ll build up a bunch of these migration files. If you decide that you want to blow away your database and start from scratch, you can do that easily and then rerun the migrations. If you decide to deploy to the web, you will run those same migrations and the production database will be there waiting for you… even if it’s a different type of database! Again, Active Record does the heavy lifting for you here so you can focus on building your website.

The most immediately useful feature of migrations is when you’ve screwed something up because they’re (usually) reversible. Let’s say you just migrated to create a new database column but forgot a column to store the user’s email… oops! You can just type $ rake db:rollback and the last series of migrations that you ran will be reversed and you’re back to where you were. Then you just edit the file, rerun the migrations, and move on with your life.

This introduces the last nuance of migrations that we’ll talk about here – reversibility. For each method that you use in the migration, you want to specify how to reverse it if you have to. The reverse of adding a table is dropping that table, of adding a column is removing the column and so on. Many methods have a really obvious reverse case, so you don’t need to explicitly state it and can set up the whole migration file using the change method. But some of them do not, so you will need to separately specify up and down methods. You’ll read more about that in the assignment.

A final note, you never want to rollback migrations unless you’ve screwed something up. In situations where you have a legitimate case for removing a column (because you no longer need it for any purpose), you actually create a new migration that removes that column using the remove_column method. It preserves the database. Once you get familiar with this, you can build a database just using the schema file.


 How Much do I Need to Know about Database?

Migrations don’t involve writing SQL, but you do need to understand enough about databases to know how you want yours structured! Which columns do you want? Which ones should be indexed (and why)? Should you set a default value? What data type will be stored in your column – a string or text?


Your Tasks

1. Read the Rails Guides Active Record Migrations (http://edgeguides.rubyonrails.org/active_record_migrations.html). Don’t worry about 3.6-3.8. Just skim section 7. Seeds (section 8) are useful and you’ll be using them later.


 Basic Validations

Imagine you’ve got your database up and running and want to make sure that the data people are sending to your database is good data. For instance, to create an account on your site, a user needs to enter both a username and an email address. How do you enforce this?

There are three levels of validations that you can enforce, each more strict and secure than the previous. At the topmost level, you can write code using Javascript in your browser that detects if someone has filled out the form properly and will prompt them to finish it before moving on. The advantage here is that it is almost immediate so has great user experience. The problem with this is that Javascript is easy to circumvent and the user could easily submit a malicious or faulty request.

The second layer of enforcement for your validations of user data (which you should never trust) is to do so at the server level. This means writing code in your Rails application (specifically in the model that you are trying to save an instance of, ie. User) that examines user inputs, checks them versus the constraints you set up, and returns errors if there are any.

This is more secure than javascript but has the disadvantage of taking a full round-trip HTTP request to your application in order to check it. Model validations are generally pretty effective and that’s what we’ll focus on here.

Another problem occurs when your application has scaled up to the point where you are running multiple instances of it on multiple servers that all talk to the same central database. Let’s say you want to make sure a username is unique. What happens if two users almost simultaneously submit the same username and it is received by two separate concurrent instances of your application? When each instance of your application checks with the database to see if the username is unique, both times it looks okay so they both go ahead and save the model… oops! That may not sound plausible, but how about in rapidly occurring automated transactions? These “race conditions” are very real.

So the only way to truly enforce constraints is on the database level, since your single database is the sole arbiter of what is unique and valid in this world. You can use extra parameters passed to some of the now-familiar migration methods like add_index to say add_index :users, :username, unique: true, which enforces in the most secure way that the column is unique. Again, though, most of your validations can be implemented in your Rails application’s models.


 Your Tasks

1. Read the Rails Guides Active Record Validations (http://guides.rubyonrails.org/active_record_validations.html).  Section 2 on helpers can be skimmed. These help you get more specific with your validations and you’ll run into them later.  You can skim section 6 about custom validators. Section 8 will likely only be interesting if you’ve seen ERB in rails views before..


 Basic Associations

In the databases sections, you learned about how a relational database like sqlite3 or PostgreSQL lets you link two tables together using their primary keys (called a foreign key in the specific table that is referencing another one). It’s the real power of relational databases that they let you leverage these, well, relationships. Active Record takes that feature and lets you use it in all kinds of useful ways. Do you want to get all of your first user’s blog posts? Try User.first.posts. It’s as simple as that.

That functionality doesn’t come out of the box. You need to tell Rails that posts actually belong to a user. On the database table level, this means that every row in the posts table will have column for user_id that tells you which user “owns” that post. The users table doesn’t need to acknowledge the posts at all… after all, a single user can have an infinite number of posts. If we’re interested in a user’s posts, we just have to query the posts table for all posts that link back to that user’s ID. Rails makes these relationships very easy to specify. What we just talked about is aptly named a “has many / belongs to” association (a User has_many Post objects associated with it and a Post belongs_to a single User).

Step one with understanding this stuff is just to think about which different types of relationships are possible. Remember, half the battle of setting up your application is understanding what your data relationships will look like, so give this some thought and keep at it when it gets confusing.

The has_many / belongs_to, or a “one-to-many”, relationship is pretty common, and usually easiest to think of in terms of actual objects… a Child can have many Marble objects, each of which belongs to that child. But it also applies in slightly less intuitive cases, like where a single object belongs_to multiple other objects. An example would be a FranchiseLocation for a McDonalds, which belongs_to the Corporation Mcdonalds but might also belongs_to the City San Francisco.

It’s clear that it should belong to its corporate parent, but why does it belong to a City too? It’s often easier to think of it from the opposite perspective — a City can certainly have many FranchiseLocation objects. As long as a FranchiseLocation can only be in a single city, it effectively “belongs_to” that city in the way that Rails describes it.

Another common relationship is the many-to-many relationship, which can also be called has_and_belongs_to_many in Rails terms. This often comes up in actual relationships — a Human can have many favorite Dog objects, and each Dog object can have many favorite Human objects. In this case, how would you specify which dog objects are your favorites? It actually requires you to create another table (a join table, or “through” table) that specifically keeps track of all those relationships.

A key distinction here is that we’re not talking about how many Post objects a User currently has or how many FranchiseLocation objects a City currently has, we’re trying to model how many they COULD have over the entire lifetime of your application. If your User only has one Post now, you could hard code that post’s ID in your user table. Then when he has another Post, you’d have to create another table column to fit that ID. And another. And another. Which doesn’t make a lick of sense… so that’s why we say the User has_many :posts and let the Posts table hardcode in the User’s ID, since a post will only ever have one User associated with it (assuming you only have one author).


Your Tasks

1. Read the Rails Guides Active Record Associations (http://guides.rubyonrails.org/association_basics.html).  Read the beginning up to section 2.7.


 Conclusion

Active Record is the most powerful part of Rails and also complex to understand. You need to be able to translate the real world into database tables, which takes a bit of time to become familiar with. The most difficult concepts for new beginners are usually associations.

It’s easiest to start thinking about concrete relationships in the real world and then turning them into Active Record associations. Once you’re comfortable with which model has_many of which other model and who actually belongs_to the other, you can start modeling more abstract concepts like, say, event invitations and invitation acceptances.


Additional Resources


Project: Building With Active Record

In this project, you’ll get to spend some quality time with models. The tutorial will give you a chance to apply some of what you’ve learned in a structured environment and then the additional project will give you the chance to do it on your own.

Warmup: Thinking Data First

The first step to building a good data model is usually not on the computer. You’ll want to take a minute and think through what you’re going to need. You may not really know everything when you first start, but mapping out an approach path is crucial to avoiding costly mistakes later.

In this warmup, you’ll be given a variety of scenarios and you’ll be asked to map out what the data models will look like for each one. You can do it with a pen and paper or, if you’re lucky enough to be around one, a whiteboard. There are a few specific accepted techniques for how to map out models and associations but just do whatever comes naturally to you. One way to do it is with a list of models and another is to create a visual diagram where each model is represented by a box and you connect them with appropriate arrows.

For each scenario, you’ll be asked to write down the data, associations, and validations necessary to build it. That means which models (data tables) will be necessary to store the data (and which columns you will need), which fields of those tables will be subject to validation (ie. minimum password length or username uniqueness). Don’t worry if you’re not quite sure how to implement a given thing, the point here is to start thinking about how the data would be structured.

Example: You are building a blog for your startup which will have multiple authors and each author can write multiple posts.

This might look like:

Note: I’ll include the :id, :created_at and :updated_at columns but you can safely assume they’re always there since Rails or the database gives them to you automatically

  • Authors
    username:string [unique, 4-12 chars, present]
    email:string [unique, present]
    password:string [6-16 chars, present]
    id:integer
    created_at:datetime
    updated_at:datetime
    
    has_many posts
    
  • Posts
    title:string [unique, present]
    body:text [present]
    author_id:integer [present]
    id:integer
    created_at:datetime
    updated_at:datetime
    
    belongs_to author
    

Use whatever format feels best to you.


Your Task

For each of the following scenarios, write down the models, columns, validations and associations you might use to implement it. Some of these are more difficult than others and you’ll have to use a bit of creativity to infer which columns might need to be present for the scenario to make sense in the real world.

The trick is identifying what should be a different model and how these models will relate to each other via simple associations (all the ones below are has_many, has_one and/or belongs_to relationship).

Remember, if you feel like you will be hard coding data multiple times, it’s probably a sign that you should create a separate table. A common example is address information. You could write down the city and state explicitly for each user. How about making separate City and State models and relating them to each other?

1. You are building an online learning platform. You’ve got many different courses, each with a title and description, and each course has multiple lessons. Lesson content consists of a title and body text.

2. You are building the profile tab for a new user on your site. You are already storing your user’s username and email, but now you want to collect demographic information like city, state, country, age and gender. Think how many profiles should a user have? How would you relate this to the User model?

3. You want to build a virtual pinboard, so you’ll have users on your platform who can create “pins”. Each pin will contain the URL to an image on the web. Users can comment on pins (but can’t comment on comments).

4. You want to build a message board like Hacker News (https://news.ycombinator.com/). Users can post links. Other users can comment on these submissions or comment on the comments. How would you make sure a comment knows where in the hierarchy it lives?


Project 1: Ruby on Rails Tutorial

This chapter of the tutorial will give you a chance to start working with the database and models. It will guide you through setting up your first database migrations, making sure your user inputs are properly validated, and how to add a secure password to your User model.


 Your Task

1. Complete the Ruby on Rails Tutorial Chapter 6 Modeling Users (https://www.railstutorial.org/book/modeling_users)


 Project 2: Micro-Reddit

Let’s build Reddit (https://www.reddit.com/). Well, maybe a very junior version of it called micro-reddit. In this project, you’ll build the data structures necessary to support link submissions and commenting. We won’t build a front end for it because we don’t need to. You can use the Rails console to play around with models without the overhead of making HTTP requests and involving controllers or views.


Your Task

Get Started

1. Just like in the warmup, plan out what data models you would need to allow users to be on the site (don’t worry about login/logout or securing the passwords right now), to submit links (“posts”), and to comment on links. Users do NOT need to be able to comment on comments. Each comment refers to a Post.

2. Generate a new rails app from the command line ($ rails new micro-reddit) and open it up. We’ll use the default SQLite3 database so you shouldn’t have to change anything on that front.

3. Generate your User model and fill out the migration to get the columns you want.

4. Run the migration with $ rake db:migrate. You can use $ rake db:rollback if you realize you forgot anything or just create a new migration for the correction (which might involve the #add_column#remove_column or #change_column commands).


 Playing with Validations

1. In a new tab, open up the $ rails console. Try asking for all the users with > User.all. You should get back an empty array (no users yet!). Now create a blank new user and store it to a variable with > u = User.new. This user has been created in the ether of Ruby’s memory but hasn’t been saved to the database yet. Remember, if you’d used the #create method instead of the #new method, it would have just gone ahead and tried to save the new user right off the bat. Instead, we now get to play with it.

2. Check whether your new user is actually valid (ie. will it save if we tried?). > u.valid? will run all the validations. It comes up true… surprise! We haven’t written any validations so that’s to be expected. It’s also a problem because we don’t want to have users running around with blank usernames.

3. Implement the user validations you thought of in the first step in your app/models/user.rb file. These might involve constraints on the size of the username and that it must be present (otherwise you’ll potentially have users with no usernames!) and that it must be unique.

4. Reload your console using > reload!. You’ll need to do this every time you make changes to your app so the console can reload the current version. If it still seems broken, just > quit out of it and relaunch (sometimes #reload! doesn’t seem to do the trick). Build another new user but don’t save it yet by using > u2 = User.new. Run > u.valid? again to run the validations and it should come up false.

5. How do we find out what went wrong? Rails is helpful because it actually attaches error messages directly onto your user object when you fail validations so you can read into them with the #errors method. Try out > u.errors to see the errors or, better, > u.errors.full_messages to return a nice friendly array of messages. If you wrote custom messages into your validations, they will show up here as well.

6. Create a user who will actually save with > u3 = User.new(your_attributes_here) and run the validations. They should come up true. Save your user with the #save method so you’ve got your first user in the database.


 Playing with Associations

1. Create your Post model by referencing your data plan from the first step above, migrate the database, and add its validations.

2. Test your validations from the console, remembering to reload or relaunch it between changes.

3. Now set up your associations between User and Post models. Did you remember to include the foreign key column (user_id) in your posts table? If not, you can just add a new migration ($ rails generate migration yourmigrationname) and use the #add_column method mentioned above.

4. If you’ve properly set up your associations, you should be able to use a few more methods in the console, including finding a User’s Posts and finding the Post’s User. First test finding your lonely User’s Posts — > User.first.posts. It should be an empty array since you haven’t created posts, but it shouldn’t throw an error at you.

5. Build (but don’t yet save) a new post from the console, called p1, something like > p1 = Post.new(your_attributes_here). Don’t forget to include the ID of the user in your user_id field!

6. Now build another post using the association to the user — substitute #new with #build and run through the association instead — p2 = User.first.posts.build. Don’t fill in any fields yet. Examine the object that was created and you’ll see that the ID field already got filled out for you.

7. Save your original new post p1 so your user has officially written something. Test that you can use the other side of the association by trying > Post.first.user, which should return the original User object whose ID you pointed to when building the post.


Add in Commenting

1. You’ve now got a User and a Post and they’ve been linked. Commenting will look quite similar to your Post model but will be related not just to the post who is its “parent” but also to the user who has authored it. Set up the migration and migrate the database for your Comment model.

2. As before, add validations into your model and test them out in the console (refresh it!). Make sure you’ve required the two foreign keys (for posts and users) to be submitted, otherwise you could potentially have an orphan comment. You should not be able to save an invalid Comment and be able to save a valid Comment.

3. Build a second user and create a new comment which represents this user commenting on the first user’s post.

4. As before, add the associations you need between users, posts, and comments. You’ll need to be able to do the following methods successfully from the console (assuming your second user has an ID of 2):

  1. > u2 = User.find(2)
  2. > c1 = u2.comments.first should return that user’s comment. #comments returns an array with comments, which is why we need to use #first to actually retrieve the comment itself.
  3. > c1.user should return that comment’s author User (u2).
  4. > p1 = Post.first
  5. > p1.comments.first should return the comment c1.
  6. > c1.post should return the post p1.
  • If any of those don’t work, double check your associations. Sometimes the error messages can be helpful in prompting you for how to set up those associations.

 Additional Resources