From my experience, there are two main use cases for Docker: building an output container with an application that can be deployed somewhere, and building a container with development dependencies (e.g. language or database versions) that lets you build and compile the application without installing everything on your machine. You can do both at the same time, but this used to result in huge images going to production. Thankfully, there is a better way now: multi-stage builds.

Traditional approach

In the traditional, old-fashioned way of building Docker images, you specify layers that are stacked on top of each other, with the last instruction defining which process runs as the main process of the container. Whatever you do along the way ends up in the output image.

For example, if you want an image that builds and runs a Go application using the latest language version (1.10 as of 2.03.2018), you need to choose a base image, copy the application sources, run the tests (yeah!), compile a static binary and run it:

# Dockerfile
FROM golang:1.10-alpine

COPY . ${GOPATH}/src/github.com/mycodesmells/misc-examples/docker/multi-stage

WORKDIR ${GOPATH}/src/github.com/mycodesmells/misc-examples/docker/multi-stage

RUN go test ./...
RUN CGO_ENABLED=0 go build -a -installsuffix cgo -o /dockerapp main.go

CMD ["/dockerapp"]

The problem with this solution is that you end up with an image much larger than it has to be, since it contains not only the binary but also all the development dependencies. The image built with that Dockerfile weighs almost 400 megabytes:

dockerapp-simple                                latest                4aa20236b14e        47 seconds ago       397MB

In Go's case, the problems don't stop here. As of now, the race detector cannot run on Alpine Linux, so to use this feature of the toolchain you need an even larger base image:

# Dockerfile
FROM golang:1.10-stretch

COPY . ${GOPATH}/src/github.com/mycodesmells/misc-examples/docker/multi-stage

WORKDIR ${GOPATH}/src/github.com/mycodesmells/misc-examples/docker/multi-stage

RUN go test -race ./...
RUN CGO_ENABLED=0 go build -a -installsuffix cgo -o /dockerapp main.go

CMD ["/dockerapp"]

You probably realize that this one is even larger than the first, since just about everything is larger than Alpine:

dockerapp-simple                                latest                4f901fea2b5c        39 seconds ago       801MB

800 megabytes just to run a binary? That seems like massive overkill...

Multi-stage approach

Docker 17.05, released in May 2017, introduced a huge change: a Dockerfile can now be composed of stages, so whatever is needed while building an output image can be used and then dropped before the final result is produced.

This has been a life-changer, especially with Go: your final image requires just a single binary, but along the way you need the sources, the dependencies and the language toolchain itself. To handle this, you create a first stage with all the necessary dependencies, build the binary and copy it to the output image:

# Dockerfile
FROM golang:1.10-stretch

COPY . ${GOPATH}/src/github.com/mycodesmells/misc-examples/docker/multi-stage

WORKDIR ${GOPATH}/src/github.com/mycodesmells/misc-examples/docker/multi-stage

RUN go test -race ./...
RUN CGO_ENABLED=0 go build -a -installsuffix cgo -o /dockerapp main.go

# end of first stage, beginning of the second one
FROM alpine:3.7

COPY --from=0 /dockerapp /dockerapp
CMD ["/dockerapp"]

As you can see, each stage starts with a FROM of its own, which works just as in a regular Dockerfile. To copy files between stages, you add --from=N to COPY, where N is the zero-based stage number. You can also name stages and use those names in COPY:

# Dockerfile
FROM golang:1.10-stretch AS base
...
COPY --from=base /dockerapp /dockerapp
...
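
For completeness, here is a sketch of how the whole two-stage Dockerfile might look with a named stage. It reuses the stage name base from the snippet and the paths from the earlier examples; treat it as an illustration rather than the exact file from the repository.

```dockerfile
# Dockerfile
# First stage, named "base": full Go toolchain, used only for testing and building
FROM golang:1.10-stretch AS base

COPY . ${GOPATH}/src/github.com/mycodesmells/misc-examples/docker/multi-stage

WORKDIR ${GOPATH}/src/github.com/mycodesmells/misc-examples/docker/multi-stage

RUN go test -race ./...
RUN CGO_ENABLED=0 go build -a -installsuffix cgo -o /dockerapp main.go

# Second stage: small runtime image that receives only the compiled binary
FROM alpine:3.7

# Refer to the first stage by name instead of by its index
COPY --from=base /dockerapp /dockerapp
CMD ["/dockerapp"]
```

A named stage can also be built on its own with docker build --target base ., which stops after that stage. That can be handy if you only want to run the tests without producing the final image.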

Most importantly, the final image is much smaller than the ones built with the traditional approach:

dockerapp-multi                                 latest                92ea77a47f16        3 seconds ago       10.7MB

Summary

Multi-stage Docker builds can make your production images smaller, and therefore easier to manage, at the cost of just a tiny bit more complexity. The main reason is that the feature doesn't introduce any new keywords, so you can understand what is happening just by reading the Dockerfile, even if you've never used multi-stage builds before (so your teammates will understand it too!).

The full source code of this example is available on GitHub.