Multi-stage
In the beginning
When we first started to build Docker images, long before the multi-stage syntax, our projects would look something like this: we would have a base Docker image containing the dependencies shared by the development and production images, then one file for each final image.

The script `build.sh` first builds the base image, tagging it as `base_image`, then, depending on the argument passed to the script, builds the final image.
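A minimal sketch of that layout (file names and the argument handling are illustrative, not taken from the original project; `Dockerfile.development` and `Dockerfile.production` would each start with `FROM base_image`):

```shell
#!/bin/sh
# build.sh — hypothetical reconstruction of the pre-multi-stage setup:
# build the shared base first, then the requested final image.
set -e

docker build -t base_image -f Dockerfile.base .

case "$1" in
  development) docker build -t development_image -f Dockerfile.development . ;;
  production)  docker build -t production_image  -f Dockerfile.production  . ;;
  *) echo "usage: $0 development|production" >&2; exit 1 ;;
esac
```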
Introducing multi-stage with `--target`
Docker then introduced the multi-stage syntax so we could do everything in one file, along with the `--target` argument for `docker build`. In the script `build.sh` we use that argument to specify the image we want to build.
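The single-file version might look like this (the stage names follow the article; the `sleep` placeholders stand in for the real per-stage work):

```dockerfile
FROM alpine:3.18 AS base_image
RUN apk add --no-cache python3

FROM base_image AS development_image
RUN sleep 5   # placeholder for dev-only setup

FROM base_image AS production_image
RUN sleep 5   # placeholder for production-only setup
```

`build.sh` then reduces to something like `docker build --target "${1}_image" -t "${1}_image" .`.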
So far it looks like a good deal, but there is a catch: if you want to build the last image (`production` in that case), the builder will go through all the instructions, even those that are not part of the final image.

As we can see in the result above, the builder actually runs through ALL the instructions, but the final image will only contain the result of `base_image` + `production_image`.
I added a `sleep` in each image to demonstrate the time spent building the final image. The final image is built in ~23 sec because every image was built.
Using `ONBUILD` to skip instructions
The `ONBUILD` instruction in a `Dockerfile` defers an instruction so that it is only run when the image is used as a base, i.e. when it is actually part of the final build.

Let's take a look at the next example: we still use the multi-stage syntax + `--target`, but this time we use `ONBUILD` to make things faster.
The `Dockerfile` now has a dynamic final image name: we declare the variable with `ARG` and set it at build time with `--build-arg`.
With the `ONBUILD` instruction in each image, the builder first goes through every instruction until it reaches the target image (in that case the last one, `FROM ${_APP_ENV_}_image`). Once it is done going through the file, it only runs the instructions that are really used.
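Reconstructed from the description, the `ONBUILD` variant might look like the sketch below. Only `_APP_ENV_` comes from the article; the stage contents are placeholders:

```dockerfile
# Global ARG so it can be used in a FROM line; value set with --build-arg
ARG _APP_ENV_=production

FROM alpine:3.18 AS base_image
RUN sleep 5

FROM base_image AS development_image
# Deferred: only runs when this image is used as a base
ONBUILD RUN sleep 5

FROM base_image AS production_image
ONBUILD RUN sleep 5

# "Fake" target stage: its base is picked at build time, so only
# the ONBUILD triggers of the selected stage are executed
FROM ${_APP_ENV_}_image AS final
```

Built with something like `docker build --build-arg _APP_ENV_=production -t production_image .`.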
This time the build only took ~14 sec to finish, with the same final image. It's a really good improvement, but at what cost? The `Dockerfile` is not super clear, and we had to create this "fake" target image.
BuildKit
BuildKit was introduced later and works in a different way. Instead of running through the file and executing all the instructions one by one, it reads the file and builds a dependency tree of the stages. The classic builder runs all the stages in sequence and discards the ones we don't use in the end; BuildKit traverses the tree from the bottom to the top and only executes what we need.
The `Dockerfile` is the same as before all the fuss with `ONBUILD`, and much clearer. In the build script, we set `DOCKER_BUILDKIT=1` to tell the builder that we want to use BuildKit.
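Assuming the same `build.sh` interface as before, the only change is the environment variable:

```shell
#!/bin/sh
# build.sh — same plain multi-stage Dockerfile, with BuildKit enabled.
# BuildKit resolves the stage dependency graph and skips unreachable stages.
set -e
DOCKER_BUILDKIT=1 docker build --target "${1}_image" -t "${1}_image" .
```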
This time the build only took ~11 sec. That's really close to the previous build, but the `Dockerfile` is clear and we did not have to use a "fake" target image.
Caching
The issue
When using multi-stage builds in a CI/CD environment, the image is built in a discardable environment, which means that most of the time the builder won't have a cache.

One way to fix that is to `pull` the previous image and pass it to the builder with the `--cache-from` argument.
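A hypothetical CI sketch of that idea (`REGISTRY` is a placeholder for your registry/namespace; the `|| true` lets the first-ever build proceed without a previous image):

```shell
#!/bin/sh
# Seed the builder's cache from the last pushed image.
set -e
REGISTRY=registry.example.com/myapp

docker pull "$REGISTRY/production_image:latest" || true
docker build \
  --cache-from "$REGISTRY/production_image:latest" \
  --target production_image \
  -t "$REGISTRY/production_image:latest" .
```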
First case
In this case we use `base_image` as the base for `production_image`.

If our final image uses the base image as its base, then there is no issue: all the layers are part of the final image. If the base image has layers [1, 2, 3, 4] and the final image adds layers [5, 6, 7] on top of it, then the image we `push` to the registry will contain layers [1, 2, 3, 4, 5, 6, 7].
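The shape of this first case, with illustrative stage contents:

```dockerfile
# base_image: layers [1, 2, 3, 4]
FROM alpine:3.18 AS base_image
RUN apk add --no-cache python3

# production_image extends it: layers [5, 6, 7] on top of [1, 2, 3, 4]
FROM base_image AS production_image
COPY . /app
CMD ["python3", "/app/main.py"]
```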
Here we can see the result of the build using the previous image: it uses the cache from the final image and is super fast (~0.2 sec).
Second case
In this case we use `base_image` to build artifacts, then start `production_image` from a clean slate and copy the result of `base_image` into `production_image`.
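The shape of this second case, again with illustrative contents (the build commands and paths are placeholders):

```dockerfile
# base_image builds the artifact; its layers never reach the registry
FROM alpine:3.18 AS base_image
RUN apk add --no-cache build-base
COPY . /src
RUN make -C /src

# production_image starts from a clean slate and only copies the result,
# so the pushed image contains none of base_image's layers
FROM alpine:3.18 AS production_image
COPY --from=base_image /src/app /usr/local/bin/app
CMD ["app"]
```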
If the base image has layers [1, 2, 3, 4] and the final image starts clean and adds layers [5, 6, 7], then the image we `push` to the registry will only contain layers [5, 6, 7].

Because of that, `base_image` won't be able to use the cache, which in turn invalidates the cache for the final image.
Here, the cache is never used.
A potential solution
Instead of building all the stages in one go, we can run the builder once for each stage of the final image. For each stage, we `pull` the previous build and use it as a cache. Once each stage is built (e.g. `base_image` and `production_image`), we `push` them to the registry.
As we can see here, the cache is used for both images.

This solution works; the only trade-off is that we have to `push` and `pull` each stage to and from the registry.
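The steps above can be sketched as a CI script (registry name and stage list are placeholders; the later stages may also want the earlier stages' images passed via additional `--cache-from` flags):

```shell
#!/bin/sh
# Build, cache, and push each stage separately so every stage
# can reuse its own previous image as a cache.
set -e
REGISTRY=registry.example.com/myapp

for stage in base_image production_image; do
  docker pull "$REGISTRY/$stage:latest" || true
  docker build \
    --cache-from "$REGISTRY/$stage:latest" \
    --target "$stage" \
    -t "$REGISTRY/$stage:latest" .
  docker push "$REGISTRY/$stage:latest"
done
```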