In this post, I describe my journey from knowing essentially nothing about Docker to having a dockerized Padrino application with each service living in its own container (more on that below). To follow this guide, you need Ruby installed, but no prior experience with Docker.
We begin with the very basics: ensuring that Docker and docker-compose are installed, creating a minimal Docker setup, and generating a new toy Padrino application within its container. We then continue with a brief overview of the most basic Docker nomenclature and concepts and apply them to our setup by describing one container for each service: the Padrino application backed by a PostgreSQL database. Finally, we discuss how to initialize our PostgreSQL database, how to run commands inside containers, and how to upgrade whole containers.
Prerequisites
You need both Ruby and Docker installed. For Ruby, I suggest using the most recent version (2.3.1 at the time of this post), since new releases usually come with lots of speed improvements and bug fixes. For Docker, I suggest using at least version 1.9. This particular version comes with a couple of changes, most notably support for version 2 of the docker-compose syntax. We will learn about docker-compose later; just trust me for now that Docker 1.9 or greater is a good choice.
You can check the installed versions with ruby -v and docker -v, respectively.
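The output will look something like this (the exact versions and build identifiers will differ on your machine):

$ ruby -v
ruby 2.3.1p112 (2016-04-26 revision 54768) [x86_64-linux]
$ docker -v
Docker version 1.11.2, build b9f10c9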
Starter project
Let’s start by generating a sample Padrino project.
Padrino stub
To generate the sample project, we need Padrino installed: gem install padrino.
To have an at least remotely realistic setup, we create an application that is backed by a PostgreSQL database.
Let’s pick a name for the new project, ruby_docker for instance, and create the Padrino app: padrino gen project ruby_docker -a postgres -d sequel -t rspec.
In addition, add gem 'puma' to the project’s Gemfile.
In a usual non-dockerized project, you would now cd into your new project, bundle install --binstubs everything, and then start the app with padrino s.
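Spelled out, that conventional sequence is:

cd ruby_docker
bundle install --binstubs
padrino s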
We, however, want to move all those steps into Docker.
Basic Docker configuration
A Docker container is, loosely speaking, a virtual machine image. What differentiates a container from a virtual machine image as in VirtualBox is that the operating system is kept separate from your configuration (installed packages, user management, the actual code). The way to describe a container is by providing a Dockerfile. A Dockerfile specifies the parent container from which your container inherits its setup, plus commands to run (like installing operating system packages or copying files into the container).
This still is very abstract, so let’s take a look at a concrete example.
Create a file Dockerfile in the root of our Padrino application with the following content:
FROM ruby:slim
# Lay the base for our containerized Padrino app
RUN apt-get -qq update && \
apt-get -qq -y install build-essential --fix-missing --no-install-recommends
# Packages required by PostgreSQL:
# - libpq-dev is required to build the pg gem with native extensions
# - postgresql-client provides the usual Postgres commands like createdb
RUN apt-get -qq -y install libpq-dev postgresql-client
ENV APP_HOME /app
RUN mkdir -p $APP_HOME
WORKDIR $APP_HOME
# Copy the Gemfiles from the local filesystem into the container
COPY Gemfile Gemfile.lock ./
RUN bundle install --binstubs
COPY . .
EXPOSE 3000
You might want to have the Dockerfile reference at hand while we go through this file line by line.
Starting at the top, we define ruby:slim to be the parent container. Even if you start fresh and would like to build your own container from the ground up (which gives you ultimate control over all packages but might prevent you from deploying your container(s) to services like Heroku), you need to specify a parent container. ruby:slim is a Debian-based container with only a minimum of packages and the latest Ruby installed (2.3.1 at the time of this post).
Other options (assuming you want to stick with any of the official Ruby container configurations) are:
- ruby: contains the most common packages for booting Ruby applications;
- ruby:onbuild: like ruby, plus some starter configuration for new Ruby containers;
- ruby:alpine: even smaller than ruby:slim, but might require a bit more work to get all the libraries running, especially when relying on native extensions for curb, nokogiri, or puma.
Public Docker containers, including the above Ruby variants, can be found in the Docker Hub.
After the parent container is chosen, we run a couple of apt-get commands to install the packages required to build and run our application.
Next, we set the root directory for our application. It is not strictly required to introduce an environment variable (here: APP_HOME) for this purpose, but it helps keep our Dockerfile maintainable. Moreover, any container that uses ours as its parent can read this variable to obtain the absolute path to the application’s root.
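As a minimal sketch of that inheritance, here is a hypothetical child image (it assumes the parent was tagged ruby_docker, as we do when building below; the copied file name is made up):

# Hypothetical child image: APP_HOME (= /app) is inherited from the parent
FROM ruby_docker
# Copy an extra configuration file straight into the application root
COPY extra_settings.yml $APP_HOME/config/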
The WORKDIR directive basically tells Docker to cd into APP_HOME before executing any RUN, ADD, or COPY command. Put differently: everything that is done after the workdir has been set is relative to your application’s root.
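As a quick illustration (the paths shown are just for this example):

WORKDIR /app
COPY Gemfile ./   # the file ends up at /app/Gemfile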
That’s the case for the next command: copying Gemfile and Gemfile.lock into the container (to APP_HOME) from our local filesystem (outside the container).
With the Gemfile available inside the container, we tell Docker to install the specified gems.
Finally, we copy our Padrino application into the container and open port 3000 (the port our webserver will listen on).
Note that this port is only accessible from within the container’s network but not from the host.
We come back to both topics – the container’s network and how to open container ports to the outside world – in a moment when we discuss docker-compose.yml.
Building our container
To build the container, run docker build . in the project’s root.
Docker then loads the Dockerfile from the current directory and starts building our image.
This includes (in this order): downloading the parent image, downloading and installing the Debian packages, copying files into the container, and installing the specified gems.
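An untagged image is awkward to reference later (you would have to use its ID), so in practice you will usually tag it at build time with the standard -t flag; the tag name ruby_docker here is just a suggestion:

docker build -t ruby_docker .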
One process per container
What we’ve got so far is a sample Padrino application wrapped into a Docker container. We can also build it, but we haven’t started it yet. Let’s tackle that now.
Docker Compose basics
The goal we have set at the beginning of this post was to dockerize a Padrino application with each service (database, webserver) running in its own container.
The Docker way to set up and configure such a network of containers is via Docker Compose, with a configuration file named docker-compose.yml.
Within this file we specify our services, volumes, and networks:
- Services: each service runs within its own container. To specify a service, you give it a name, describe its dependent services, how to build it, the ports it should expose to the host, and finally how to start it. Note that each service is accessible from within the network of containers by the name given in the docker-compose.yml.
- Volumes: application data can be distributed across volumes, and each service can mount multiple volumes. Most importantly, volumes are persistent: they are neither destroyed nor recreated when containers are rebuilt.
- Networks: by default, all services within a docker-compose.yml share the same network. This means they can see each other and, using their configured service name, can access each other through their exposed ports. Docker comes with three default networks (you can list them with docker network ls). Unless explicitly specified otherwise, each project gets its own network named after the project directory’s basename.
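For example, once you boot this project (we do so below) from a directory named ruby_docker, the listing will look roughly like this (the IDs are made up and the exact columns vary between Docker versions):

$ docker network ls
NETWORK ID          NAME                 DRIVER
5f2b42012c44        bridge               bridge
9b21eff9183c        host                 host
7ac3b47122a2        none                 null
1c8d42d04b2e        rubydocker_default   bridge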
Our docker-compose.yml
We are going to set up two services:
- web runs the webserver, Puma in this case;
- postgres runs our PostgreSQL database.
Let’s first take a look at the whole configuration and discuss it subsequently.
version: '2'
services:
  web:
    build: .
    command: 'bundle exec puma -p 3000'
    environment:
      - POSTGRES_USER
      - POSTGRES_PASSWORD
    volumes:
      - .:/app
    ports:
      - "9000:3000"
    depends_on:
      - postgres
postgres:
image: postgres:9.5
environment:
- POSTGRES_USER
- POSTGRES_PASSWORD
ports:
- '9001:5432'
volumes:
- 'postgres:/var/lib/postgresql/data'
volumes:
postgres:
The configurable properties are identical for all services, so let’s take a look at the most important parts.
Build
The build context. For our purposes, this will always be the application’s root directory.
Volumes
volumes specifies a list of volumes and where to mount them. The format is <volume>:<container-local-path>. One important thing to note is that services and volumes are two different concepts; thus they can be named identically. We use this fact and name both the PostgreSQL service and its volume postgres.
Ports
ports is a list of container ports to be exposed to your host machine. The EXPOSE directive in your container’s Dockerfile only opens ports within the container’s network; no port will be reachable from your host machine unless you explicitly list it here.
If you do, the ports directive comes in two forms: <container-port> and <host-port>:<container-port>. In its first form, Docker generates a host port for you. You can list the generated ports with docker-compose port <service-name> <container-port>.
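For example, had we specified only the container port 3000 for web, we could look up the host port Docker picked like this (the port number shown is just illustrative):

$ docker-compose port web 3000
0.0.0.0:32768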
Command
The command to run the process that runs inside this container. For web in particular, we start the Puma webserver on port 3000 (matching the EXPOSE directive in our Dockerfile). The postgres image already specifies how to start the PostgreSQL server.
Dependent Services
List all services that should be up and running before a particular service can be started. docker-compose then takes care of traversing the graph of services (hopefully a DAG, a directed acyclic graph) in the proper order and booting each service.
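For illustration, here is a hypothetical extension of our setup: a worker service that must only boot after both of the others (the rake task name is made up):

worker:
  build: .
  command: 'bundle exec rake jobs:work'
  depends_on:
    - postgres
    - web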
Booting our setup
Two pieces are still missing to complete the initial setup for our container configuration: the puma gem and the correct database credentials for Padrino. If you haven’t already, add gem 'puma' to your Gemfile. Then add a .env file with the contents
POSTGRES_USER=my_postgres_user
POSTGRES_PASSWORD=my_postgres_password
to your project’s root, and change the Sequel.connect directives in your config/database.rb to use these environment variables (docker-compose passes them into the containers via the environment directives) and to address the database by its service name postgres instead of localhost — remember, services within the Compose network reach each other by name:
when :development then Sequel.connect("postgres://#{ENV['POSTGRES_USER']}:#{ENV['POSTGRES_PASSWORD']}@postgres/ruby_docker_development", :loggers => [logger])
when :production then Sequel.connect("postgres://#{ENV['POSTGRES_USER']}:#{ENV['POSTGRES_PASSWORD']}@postgres/ruby_docker_production", :loggers => [logger])
when :test then Sequel.connect("postgres://#{ENV['POSTGRES_USER']}:#{ENV['POSTGRES_PASSWORD']}@postgres/ruby_docker_test", :loggers => [logger])
That’s it for the base setup.
Go ahead and boot both services with docker-compose up (docker-compose run would start a one-off container and, by default, would not publish the configured ports).
You should see two services starting up, namely rubydocker_postgres_1 and rubydocker_web_1.
When you now open http://localhost:9000/ in your web browser, you should see Sinatra greeting you (though with a “no resource found” error page).
Let our application do something
We have now containerized a generated Padrino application, but it does not do anything yet.
As a starting point for further explorations, let’s create a very simple application for managing remote photos (think stock photos you’d like to use within presentations).
First, we create the photos table along with its Sequel model. Create db/migrate/001_add_photos.rb with the contents
Sequel.migration do
change do
create_table :photos do
primary_key :id
String :title, null: false
String :url, null: false
end
end
end
and models/photo.rb with:
class Photo < Sequel::Model ; end
Add the following code to the photos controller (app/controllers/photo.rb), containing two actions: listing all photos, and adding new photos.
RubyDocker::App.controllers :photos do
get :index do
content_type :json
halt 200, Photo.all.map(&:to_hash).to_json
end
post :photo, map: '/photos' do
content_type :json
payload = JSON.parse(request.body.read)
begin
photo = Photo.create(payload)
halt 201, photo.to_hash.to_json
rescue StandardError => e
halt 400, { message: e.message }.to_json
end
end
end
At this point we don’t want to worry too much about securing our API.
To this end, add set :protect_from_csrf, false at the bottom of config/apps.rb.
The code is now in place, but the database still has to be set up and migrated.
In a local project, you would now execute the respective rake tasks directly to create and migrate the database. To run the same tasks inside the web container (which reaches the database through the postgres service), we use docker-compose run <service> <command>:
- Create the database: docker-compose run web bundle exec padrino rake sq:create
- Run the migrations: docker-compose run web bundle exec padrino rake sq:migrate
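With the database migrated, you can also poke at the model interactively via Padrino’s console inside the web container (the title and URL below are made-up sample values):

$ docker-compose run web bundle exec padrino console
>> Photo.create(title: 'Sunset', url: 'https://example.com/sunset.jpg')
>> Photo.all.map(&:to_hash)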
That’s it.
Run docker-compose up to start both containers and point your browser to http://localhost:9000.
To create a photo entry, run
curl -H 'Content-Type: application/json' -XPOST -d '{"title":"<title>","url":"<url>"}' http://localhost:9000/photos
from the command line.
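To list the stored photos (the index action defined above), run:

curl http://localhost:9000/photos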
What’s next
There are so many topics left open. Here are just a few ideas for what to explore next:
- Extract more environment variables into .env and support multiple environments. Our current setup uses the same configuration for all environments (development, test, and production).
- How to deploy our container setup to services like Heroku or Amazon Web Services (AWS)? How to do blue-green deployments?
- Running only one webserver is fine to start with, but to scale we would want to run multiple webservers behind a load balancer.
I hope you found this starter configuration useful. Drop me a comment if you did, and also if you have ideas (besides the above) for a follow-up post on Docker and Ruby.