Building in isolation with Docker

by Bruce Szalwinski

Background

Over at Device Detection, I wrote about creating an Apache handler that could be used to do real-time device detection.  The handler depends on a number of other Perl modules: 51Degrees, JSON, Apache2::Filter, Apache2::RequestRec, Apache2::RequestUtil, and so on.  And those modules have dependencies as well.  I wanted our build server, Bamboo, to build my module without the nasty side effect of having to install third-party libs onto all of the build agents.  In the Maven world, I would just add all of these dependencies to my pom.xml and Maven would download them from the repository into my local environment.  At build time, Bamboo would take care of establishing a clean environment and downloading my dependencies, and most importantly, when the build was complete, the plates would be wiped clean, ready to serve another guest, leaving no traces of my build party.  The challenge, then: how to do this in the Perl world.  Spoiler alert: full source code is available at DeviceDetection.

Enter Docker

[Image: Homer in an isolation tank]

The fancy new way for developers to deliver applications to production is via Docker.  Developers define the application dependencies via a plain old text file, conventionally named Dockerfile.  Using the Docker toolkit, the Dockerfile is used to build a portable image that can be deployed to any environment.  A running image is known as a container, which behaves like an operating system running inside of a host operating system.  This is a lot like VMs but lighter weight.  Developers are now empowered to deliver an immutable container to production.  Let’s see if this can also be used to provide an isolated build environment.

[Diagram: the Docker big picture, from Dockerfile to image to container]

For the big picture folks, here is what we are trying to do.  We’ll start by defining all of the application dependencies in our Dockerfile.  We’ll use the Docker toolkit to build an image from this Dockerfile.  We’ll run the image to produce a container.  The container’s job will be to build and test the Perl modules and when all tests are successful, produce an RPM.

Building the image was an iterative process and I had fun dusting off my sysadmin hat.  Here is where I finally ended up.

FROM google/debian:wheezy
RUN    apt-get -y install make gcc build-essential sudo
RUN    apt-get -y install apache2-threaded-dev
RUN    apt-get -y install libapache2-mod-perl2
RUN    apt-get -y install libtest-harness-perl libtap-formatter-junit-perl libjson-perl
RUN apt-get -y install rpm

Let’s break this down.   The first non-comment line in the Dockerfile must be the “FROM” command.  This defines the image upon which our image will be based.  I’m using the “google/debian” image tagged as “wheezy”.   Think of images as layers.  Each image may have dependencies on images below it.  Eventually, you get to a base image, which is defined as an image without a parent.

FROM google/debian:wheezy

The RUN command is used to add layers to the image, creating a new image with each successful command.  The 51Degrees Perl module is built using the traditional Makefile.PL process, so we start by installing the make, gcc and build-essential packages.  Containers generally run as root, so we wouldn't normally need to install sudo, but our handler uses Apache::Test for its unit tests and Apache::Test doesn't allow root to create the required httpd process.  So we will end up running our install as a non-root user and give that user sudo capabilities.  More about that in a bit.

RUN     apt-get -y install make gcc build-essential sudo

Next, we install our Apache environment.  With Apache2, there is a pre-fork and a threaded version, which has to do with how Apache handles multi-processing.  For my purposes, I didn't really care which one I picked.  It was important, however, to pick up the -dev version as this includes additional testing features.

RUN     apt-get -y install apache2-threaded-dev

Next, we install mod-perl since the device detector is a mod perl handler.

RUN     apt-get -y install libapache2-mod-perl2

Next, add our Perl dependencies.  Each Linux distro has its own way of naming Perl modules.  Why?  Because they can.  Debian uses a "lib" prefix and a "-perl" suffix, converts "::" to "-", and lower-cases everything.  To install the Perl module known as "Test::Harness", you would request "libtest-harness-perl".

RUN     apt-get -y install libtest-harness-perl libtap-formatter-junit-perl libjson-perl
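
If you are not sure you have derived the right Debian package name, apt-cache can confirm the guess before it goes into the Dockerfile (a quick sanity check run by hand, not part of the image):

# Look up the Debian package that provides the JSON Perl module
apt-cache search --names-only libjson-perl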

And since we’ll be delivering a couple of RPMs at the end of this, we install the rpm package.

RUN apt-get -y install rpm

With the Dockerfile in place, it is time to build our image.  We tell Docker to build our image and tag it as "device-detection", using the current directory as the build context, where it expects to find a file named Dockerfile.

$ docker build -t device-detection .

Time for some coffee as Docker downloads the internet and builds our image.  Here is the pretty version of the log produced after the initial construction of the image.  If there are no changes to the Dockerfile, the image is just assembled from the cached results.  The 12-character hex strings are the IDs (full IDs are 64 characters long) of the images saved after each step.

Sending build context to Docker daemon 2.048 kB
Sending build context to Docker daemon
Step 0 : FROM google/debian:wheezy
 ---> 11971b6377ef
Step 1 : RUN apt-get -y install make gcc build-essential sudo
 ---> Using cache
 ---> 2438117da917
Step 2 : RUN apt-get -y install apache2-threaded-dev
 ---> Using cache
 ---> 41f878809025
Step 3 : RUN apt-get -y install libapache2-mod-perl2
 ---> Using cache
 ---> 43eadc4ec9eb
Step 4 : RUN apt-get -y install libtest-harness-perl libtap-formatter-junit-perl libjson-perl
 ---> Using cache
 ---> 106d5f017b5c
Step 5 : RUN apt-get -y install rpm
 ---> Using cache
 ---> fd0dc5f192d6
Successfully built fd0dc5f192d6
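
Two related commands are worth knowing at this point: docker build accepts a --no-cache flag when you want to rebuild every layer from scratch, and docker history shows the layers behind the finished image.

# Rebuild without reusing cached layers
docker build --no-cache -t device-detection .

# Show the layers (and the commands that created them) for the image
docker history device-detection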

Use the docker images command to see the images that have been built.  The ubuntu:14.04 image was from before I got religion and started using google/debian.

$ docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
device-detection    latest              2f2b3e38e8c1        29 minutes ago      358.6 MB
ubuntu              14.04               d0955f21bf24        2 weeks ago         188.3 MB
google/debian       wheezy              11971b6377ef        9 weeks ago         88.2 MB
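
If you want to reclaim the space taken by that abandoned image, docker rmi will remove it:

# Remove the image left over from the earlier experiment
docker rmi ubuntu:14.04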

Running the Image

At this point, we have an image that contains our isolated build environment.  Now we are ready to do some building by running the image.  In Docker terms, a running image is known as a container.  The build-docker script will be used to produce a container.   When we create our Bamboo build plan, this is the script that we will execute.

#!/bin/bash
docker run --rm -v $PWD:/opt/51d device-detection:latest /opt/51d/entry.sh

The --rm removes the container when finished.  The -v mounts the current directory as /opt/51d inside of the container.  The device-detection:latest refers to the image we just built.  And finally, /opt/51d/entry.sh is the command to execute inside of the container.
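
When something inside the container misbehaves, it can be handy to run the same image interactively instead of handing off to entry.sh; this is just a debugging convenience, not part of the build plan:

# Drop into a shell inside the build environment with the source mounted
docker run --rm -it -v $PWD:/opt/51d device-detection:latest /bin/bash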

#!/bin/bash
adduser --disabled-password --gecos '' r
adduser r sudo
echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers
su -m r -c /opt/51d/build

The entry.sh script will be executed inside of the container.  To test the handler, we’ll need the vendor’s module installed.  And to install modules, we need to have root privileges.  We are going to use Apache::Test to test the handler but Apache::Test won’t let us start the httpd process as root.  The solution is to create a new user, r, and give him sudo capabilities.  With that in place, we hand off execution to the next process /opt/51d/build.

That all worked well in my local environment, but something interesting happened when I went to deploy this from Bamboo.  Files mounted into the container keep the ownership they have on the host, so inside Bamboo they turned out to be owned by the build agent user rather than by me.  When the user 'r' attempted to create a file, it got a permission denied error because the directories are not owned by him.  I discovered this by having Bamboo list the files from inside the running container.  They are owned by the mysterious user with UID:GID of 3366:777.
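
A one-off run along these lines produces that kind of listing (a sketch; the exact command in the Bamboo plan isn't reproduced here):

# List the mounted directory with numeric UID:GID values
docker run --rm -v $PWD:/opt/51d device-detection:latest ls -ln /opt/51d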

build   08-Apr-2015 09:28:35    /opt/51d:
build   08-Apr-2015 09:28:35    total 28
build   08-Apr-2015 09:28:35    drwxr-xr-x 7 3366 777 4096 Apr  8 16:28 51Degrees-PatternWrapper-Perl
build   08-Apr-2015 09:28:35    drwxr-xr-x 5 3366 777 4096 Apr  8 16:28 CDK-51DegreesFilter
build   08-Apr-2015 09:28:35    -rwxr-xr-x 1 3366 777  530 Apr  8 16:28 build
build   08-Apr-2015 09:28:35    -rwxr-xr-x 1 3366 777  120 Apr  8 16:28 build-docker
build   08-Apr-2015 09:28:35    drwxr-xr-x 2 3366 777 4096 Apr  8 16:28 docker
build   08-Apr-2015 09:28:35    -rwxr-xr-x 1 3366 777  146 Apr  8 16:28 entry.sh
build   08-Apr-2015 09:28:35    -rwxr-xr-x 1 3366 777  497 Apr  8 16:28 rpm.sh

We can use this UID:GID information when creating our user. The stat command can be used to return the UID and GID of a file.  We’ll create a group associated with the group that owns the /opt/51d directory and then we’ll create our user with the UID and GID associated with the owner of the directory.  Our modified entry.sh script is then:

#!/bin/bash
addgroup --gid=$(stat -c %g /opt/51d) r
adduser --disabled-password --gecos '' --uid=$(stat -c %u /opt/51d) --gid=$(stat -c %g /opt/51d) r
adduser r sudo
echo '%sudo ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers
su -m r -c /opt/51d/build

And we can see that user ‘r’ is now the “owner” of the files.

build   08-Apr-2015 09:28:35    /opt/51d:
build   08-Apr-2015 09:28:35    total 28
build   08-Apr-2015 09:28:35    drwxr-xr-x 7 r r 4096 Apr  8 16:28 51Degrees-PatternWrapper-Perl
build   08-Apr-2015 09:28:35    drwxr-xr-x 5 r r 4096 Apr  8 16:28 CDK-51DegreesFilter
build   08-Apr-2015 09:28:35    -rwxr-xr-x 1 r r  530 Apr  8 16:28 build
build   08-Apr-2015 09:28:35    -rwxr-xr-x 1 r r  120 Apr  8 16:28 build-docker
build   08-Apr-2015 09:28:35    drwxr-xr-x 2 r r 4096 Apr  8 16:28 docker
build   08-Apr-2015 09:28:35    -rwxr-xr-x 1 r r  146 Apr  8 16:28 entry.sh
build   08-Apr-2015 09:28:35    -rwxr-xr-x 1 r r  497 Apr  8 16:28 rpm.sh

With the user set up, entry.sh hands off control to the build script to do the heavy lifting.  Here we set up our Apache environment and start building the two Perl modules.  The () is a convenient bash-ism that runs its commands in a subshell, leaving us in the current directory when it completes (there is a quick illustration of this after the script).  And PERL_TEST_HARNESS_DUMP_TAP is an environment variable recognized by the TAP::Formatter::JUnit package; unit test results will be written to the location it specifies.

#!/bin/bash

source /etc/apache2/envvars
export APACHE_TEST_HTTPD=/usr/sbin/apache2
export PERL_TEST_HARNESS_DUMP_TAP=/opt/51d/CDK-51DegreesFilter/dist/results

(cd /opt/51d/51Degrees-PatternWrapper-Perl && \
        perl Makefile.PL && \
        make && \
        make dist && \
        sudo make install && \
        ../rpm.sh FiftyOneDegrees-PatternV3-0.01.tar.gz)

(cd /opt/51d/CDK-51DegreesFilter && \
        perl Build.PL && \
        ./Build && \
        ./Build test && \
        ./Build dist && \
        ../rpm.sh CDK-51DegreesFilter-0.01.tar.gz)
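
As a quick illustration of the subshell behavior, the cd inside the parentheses never affects the calling shell:

pwd                  # e.g. /opt/51d
(cd /tmp && pwd)     # prints /tmp
pwd                  # still /opt/51d, because the cd happened in a subshell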

When the build script completes, we are done and the container is stopped and removed.  Because we have mounted the current directory inside of the container, artifacts produced by the container are available after the build completes.  This is exactly the side effect we need to have.  We can publish the test results produced by the build process as well as the RPMs.  And we have accomplished the goal of having an isolated build environment.  D'oh!
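
For example, the JUnit-formatted test results end up under the directory we pointed PERL_TEST_HARNESS_DUMP_TAP at, so Bamboo can collect them straight from the mounted checkout (where the RPMs land depends on what rpm.sh does with its output):

# Test results written by TAP::Formatter::JUnit during ./Build test
ls CDK-51DegreesFilter/dist/results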

Fun things learned along the way

Inside of the container, Apache::Test starts an httpd server on port 8529.  It then tries to set up mod_cgid by binding a socket to a path in the /opt/51d/CDK-51DegreesFilter/t/logs directory via this directive:


<IfModule mod_cgid.c>
    ScriptSock /opt/51d/CDK-51DegreesFilter/t/logs/cgisock
</IfModule>

The httpd server had issues with this; I'm not sure why, but perhaps because Docker is bind-mounting the file system.  I resolved it by moving the ScriptSock location to /tmp/cgisock.  More details on this conundrum are available on Stack Overflow, where I asked and answered my own question: http://stackoverflow.com/questions/29424132/error-accessing-cgi-script-inside-docker-container-operation-not-permitted-cou.
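
The adjusted directive ends up looking something like this (a sketch; exactly where it lives depends on how the Apache::Test configuration is generated):

<IfModule mod_cgid.c>
    ScriptSock /tmp/cgisock
</IfModule>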

