<

Setting up CI environment using Docker, ECS and terraform for Thoughtworks GoCD

One of my first tasks for my new job at Ditto Music, was to setup the CI server in a way that would be easier to manage and scalable. There was an existing instance of Thoughtworks GOCD CI, which consisted of the Server Running on a T2.small, and a couple of agents running on T2.micro. Each of the agents were manually configured, so scaling could be a challenge, and also creating new agents to be able to create different resource types could also prove timecostly.

Docker to the rescue?

It felt like a natural fit for using docker.

I needed to spin up potentially multiple instances of the same instance, and wanted a way to easily manage the software installed on the images.

Our cloud service provider is AWS, so it felt right that this went hand in hand with ECS - Elastic Container Service, and stored our agent images in ECR - Elastic Container Repository. I also wanted to use terraform, so that changes to the overall infrastructure could be managed through a deployment.

Deployment Flow

There are 4 stages to the deployment:

  • Create the Container Host
  • Create the Elastic Container Register
  • Build and Upload the Docker Image
  • Create the Elastic Container Service

These we relatively straightforward to setup in terraform and bash, and can be tailored to your own requirements.

Setting up the container host

Setting up the container host in terraform is pretty straight forward, as it is just an EC2 instance, running an optimised ECS ami. I opted to use the Amazon ecsInstanceRole for the iam role, rather than create my own, as it already existed for me.

IAM docs for ecsInstanceRole

resource "aws_ecs_cluster" "ci-container-cluster" {
  name = "CI-Container-Cluster"
}

resource "aws_instance" "ci-container-host-1" {
  ami           = "ami-2e9866c5" #ecs optimized image
  instance_type = "t2.medium"

  vpc_security_group_ids      = ["${aws_security_group.ci-container-host-security-group.id}"]
  subnet_id                   = "${element(data.aws_subnet_ids.public_subnets.ids, 0)}"
  key_name                    = "infrastructure"
  associate_public_ip_address = true

  user_data = <<EOF
#!/bin/bash
echo ECS_CLUSTER=${aws_ecs_cluster.ci-container-cluster.name} >> /etc/ecs/ecs.config
echo ECS_BACKEND_HOST= >> /etc/ecs/ecs.config
echo NO_PROXY=169.254.169.254,169.254.170.2,/var/run/docker.sock >> /etc/ecs/ecs.config
echo 'vm.max_map_count = 262144' >> /etc/sysctl.conf
sysctl -p
EOF

  iam_instance_profile = "${aws_iam_instance_profile.ingest.name}"

  tags {
    Name = "CI-Container-Host-1"
  }
}

data "aws_iam_role" "ecsInstanceRole" {
  name = "ecsInstanceRole"
}
resource "aws_iam_instance_profile" "ingest" {
  name  = "ingest_profile"
  role = "${data.aws_iam_role.ecsInstanceRole.name}"
}

Setting up the Elastic Container Registry

I opted to create a terraform module for this, as it was a very repetative process.

#container.tf

module "gocd-agent-git-repository" {
  source     = "../modules/ci-repository"
  agent_name = "gocd-agent-git"
}

module "gocd-agent-terraform-repository" {
  source     = "../modules/ci-repository"
  agent_name = "gocd-agent-terraform"
}

module "gocd-agent-node-repository" {
  source     = "../modules/ci-repository"
  agent_name = "gocd-agent-node"
}

module "gocd-agent-dotnetcore-repository" {
  source     = "../modules/ci-repository"
  agent_name = "gocd-agent-dotnetcore"
}

module "gocd-agent-docker-repository" {
  source     = "../modules/ci-repository"
  agent_name = "gocd-agent-docker"
}

Here I specified 5 different repositories, for each agent type:

  • git
  • terraform
  • node
  • dotnetcore
  • docker (docker in docker)

The module is straightforward, and looks like this

#ecr module

variable "agent_name" {}

resource "aws_ecr_repository" "agent-repository" {
  name = "${var.agent_name}"
}

resource "aws_ecr_repository_policy" "ci-agent-ecr-policy" {
  repository = "${aws_ecr_repository.agent-repository.name}"

  policy = <<EOF
{
    "Version": "2008-10-17",
    "Statement": [
        {
            "Sid": "new policy",
            "Effect": "Allow",
            "Principal": "*",
            "Action": [
                "ecr:GetDownloadUrlForLayer",
                "ecr:BatchGetImage",
                "ecr:BatchCheckLayerAvailability",
                "ecr:PutImage",
                "ecr:InitiateLayerUpload",
                "ecr:UploadLayerPart",
                "ecr:CompleteLayerUpload",
                "ecr:DescribeRepositories",
                "ecr:GetRepositoryPolicy",
                "ecr:ListImages",
                "ecr:DeleteRepository",
                "ecr:BatchDeleteImage",
                "ecr:SetRepositoryPolicy",
                "ecr:DeleteRepositoryPolicy"
            ]
        }
    ]
}
EOF
}

resource "aws_ecr_lifecycle_policy" "ci-agent-ecr-policy" {
  repository = "${aws_ecr_repository.agent-repository.name}"

  policy = <<EOF
{
    "rules": [
        {
            "rulePriority": 1,
            "description": "Keep last 5 images",
            "selection": {
                "tagStatus": "tagged",
                "tagPrefixList": ["v"],
                "countType": "imageCountMoreThan",
                "countNumber": 5
            },
            "action": {
                "type": "expire"
            }
        }
    ]
}
EOF
}

This will now give us something to upload our docker image to.

Building and Uploading the Docker Image.

I stored each of the agent dockerfiles within a directory, with the name of the agent as the name of the agent. I then had a simple bash script, that would loop through each of the dockerfiles, build it and upload the the elastic container registry

#build.sh 
#!/bin/bash
set -e

for dir in `find . -type d`
do

echo "using directory "$dir
if [ $dir = "." ]; then
    echo ""
else

    BASE_REPO=XXXXXXXXXXXX.dkr.ecr.eu-west-2.amazonaws.com
    IMAGE_NAME=${dir:2}
    VERSION_LATEST=latest
    VERSION=2.0
    echo "ImageName:"$IMAGE_NAME
    eval $(aws ecr get-login --region eu-west-2 --no-include-email)
    sleep 1

    docker build $dir -t $IMAGE_NAME:$VERSION
    docker tag $IMAGE_NAME:$VERSION $BASE_REPO/$IMAGE_NAME:$VERSION_LATEST
    docker tag $IMAGE_NAME:$VERSION $BASE_REPO/$IMAGE_NAME:$VERSION
    docker push $BASE_REPO/$IMAGE_NAME:$VERSION 
    docker push $BASE_REPO/$IMAGE_NAME:$VERSION_LATEST 

fi

done

Here is an example of one of the dockerfiles we created. This one is for the agent that runs terraform files

FROM gocd/gocd-agent-ubuntu-16.04:v18.7.0
RUN apt-get update -y && apt-get upgrade -y && \
    apt-get install -y bash tree tar zip unzip xz-utils
    
RUN curl -o terraform.zip https://releases.hashicorp.com/terraform/0.11.1/terraform_0.11.1_linux_amd64.zip && \
    unzip terraform.zip && \
    mv terraform /usr/local/bin/

RUN apt-get install -y python-setuptools python-dev build-essential  && \
    easy_install pip && \
    pip install --upgrade pip && \
    pip install awscli 
    
ENV GO_SERVER_URL=https://XXXX.XXXXXXXX.com:8154/go/
ENV AGENT_AUTO_REGISTER_KEY=XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
ENV AGENT_AUTO_REGISTER_RESOURCES=terraform,aws-cli
ENV AGENT_AUTO_REGISTER_ENVIRONMENTS=Build,Infrastructure,Prod,QA
ENV AGENT_AUTO_REGISTER_HOSTNAME=gocd-agent-terraform

ENTRYPOINT /docker-entrypoint.sh

Setting up the Elastic Container Service

The elastic container service starts docker images, and manages them on the container host.

Here is some of the terraform config I used to create this service, most of the infra is so similar, that again it made sense to move this into a module.

#containers.tf
module "gocd-agent-git-container" {
  source              = "../modules/ci-agent"
  agent_name          = "gocd-agent-git"
  ecr_base_repository = "${var.repository}"
  tag                 = "2.0"
  memory_reservation  = 128
  instance_count      = 2
}

module "gocd-agent-terraform-container" {
  source              = "../modules/ci-agent"
  agent_name          = "gocd-agent-terraform"
  ecr_base_repository = "${var.repository}"
  tag                 = "2.0"
  memory_reservation  = 128
  instance_count      = 2
}

module "gocd-agent-node-container" {
  source              = "../modules/ci-agent"
  agent_name          = "gocd-agent-node"
  ecr_base_repository = "${var.repository}"
  tag                 = "2.0"
  memory_reservation  = 384
  instance_count      = 2
}
module "gocd-agent-dotnetcore-container" {
  source              = "../modules/ci-agent"
  agent_name          = "gocd-agent-dotnetcore"
  ecr_base_repository = "${var.repository}"
  tag                 = "2.0"
  memory_reservation  = 384
  instance_count      = 1
}

module "gocd-agent-docker-container" {
  source              = "../modules/ci-agent"
  agent_name          = "gocd-agent-docker"
  ecr_base_repository = "${var.repository}"
  tag                 = "2.0"
  memory_reservation  = 256
  instance_count      = 1
}

The in the actual definition for the module:

#container-module.tf

variable "instance_count" {
  default = 1
}
variable "agent_name" {}

variable "tag" {
  default = "latest"
}

variable "memory_reservation" {
  default = 256
}

variable "ecr_base_repository" {}


resource "aws_ecs_service" "ci-agent-service" {
  name                = "${var.agent_name}"
  cluster             = "${data.aws_ecs_cluster.ci-cluster.id}"
  task_definition     = "${aws_ecs_task_definition.agent-definition.arn}"
  desired_count       = "${var.instance_count}"
  scheduling_strategy = "REPLICA"
}

resource "aws_ecs_task_definition" "agent-definition" {
  family       = "${var.agent_name}"
  network_mode = "host"
  volume = {
    name      = "dockerdaemon"
    host_path = "/var/run/docker.sock"
  }
  container_definitions = <<DEFINITION
[
  {
    "name": "${var.agent_name}",
    "image": "${var.ecr_base_repository}/${var.agent_name}:${var.tag}",
    "hostname": "${var.agent_name}",
    "essential": true,
    "privileged": true,
    "memoryReservation": ${var.memory_reservation},
    "mountPoints": [
        {
          "sourceVolume": "dockerdaemon",
          "containerPath": "/var/run/docker.sock"
        }
    ],
    "requiresAttributes": [
        {
        "value": null,
        "name": "com.amazonaws.ecs.capability.ecr-auth",
        "targetId": null,
        "targetType": null
        },
        {
        "value": null,
        "name": "com.amazonaws.ecs.capability.task-iam-role",
        "targetId": null,
        "targetType": null
        },
        {
        "value": null,
        "name": "com.amazonaws.ecs.capability.docker-remote-api.1.19",
        "targetId": null,
        "targetType": null
        }
    ]
  }
]
DEFINITION
}

data "aws_ecs_cluster" "ci-cluster" {
  cluster_name = "CI-Container-Cluster"
}

The terraform for this again is quite straightforward, its probably worth that:

  • We are mounting /var/run/docker.sock from the container to the host, so that rather than running docker in docker on the docker agent, we are utilising the host docker service, which prevents many issues -

Written on August 7, 2018.