In this article, we will learn about Dockerfile best practices. In a production environment, the size of our Docker images matters, so we'll look at how to optimize an image by following best practices when writing the Dockerfile.
If you want to learn about Docker architecture, take a look at this article: Getting Started With Docker.
It's much like learning DSA in the software development journey: sooner or later we have to learn how to optimize our code, even though many newcomers shy away from DSA concepts!
In Docker, a Dockerfile defines how an image is created and built, and that image is then run as a container. As DevOps engineers, we must always follow best practices when deploying to production.
Importance of optimizing Docker images
Let's say we create an image for a Node.js application that is 500 MB including its dependencies. Over time the application and its dependencies change, and the image grows to 1 GB.
This leads to higher storage and memory usage, a larger attack surface (i.e. security risks), and slower build times. Let's explore how to optimize the Dockerfile by applying these methods:
Utilize a .dockerignore file
Take advantage of Docker Caching
Opt for minimal-size images
Implement Multi-Stage Builds
Distroless Image
Minimize the Number of Layers
1) Utilize a .dockerignore file
The .dockerignore file works much like a .gitignore file. When building a Docker image, it is essential to exclude files that are not required for the build, such as the .git directory, node_modules, log files, etc. Excluding them keeps the build context small and improves caching performance.
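For example, a minimal .dockerignore for a typical project might look like the sketch below; the exact entries depend entirely on your application.
.git
node_modules
*.log
Dockerfile
.dockerignore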
2) Docker Caching
As we know, Docker builds an image layer by layer. During the first build of an image, every instruction (installing dependencies, copying files, and so on) produces a layer that is saved to the local filesystem.
When we build the same image again, Docker reuses those cached layers, which makes the build much faster. To prove this, let's build a simple Golang application for the first time and see how long it takes.
package main

import (
    "fmt"
    "log"
    "net/http"
)

func health(w http.ResponseWriter, r *http.Request) {
    w.WriteHeader(http.StatusOK)
    fmt.Fprint(w, "Healthy")
}

func handler(w http.ResponseWriter, r *http.Request) {
    fmt.Fprint(w, "Hello, World!")
}

func main() {
    http.HandleFunc("/", handler)
    http.HandleFunc("/health", health)
    port := "8080"
    log.Printf("Starting server on port %s...", port)
    log.Fatal(http.ListenAndServe(":"+port, nil))
}
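Since the Dockerfiles later in this article copy a go.mod file, the project also needs a minimal module file. A sketch (the module name here is only an assumption) looks like this:
module golang-app

go 1.20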
For this Golang application, we also need to write a Dockerfile, which is here:
FROM golang:latest
WORKDIR /app
COPY . .
EXPOSE 8080
CMD ["go", "run", "main.go"]
The first build of this simple Dockerfile takes 233.3 seconds. Now, when we build the same Dockerfile again, Docker reuses the cached layers from the filesystem and the build time drops. Let's see.
Now it takes only 2.9 seconds. That is Docker caching at work.
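To get the most out of the cache, order instructions from least to most frequently changing. A common pattern, sketched here under the assumption that the project has a go.mod file, is to copy go.mod and download modules before copying the rest of the source:
FROM golang:latest
WORKDIR /app
# Copy only the module file first; this layer stays cached until go.mod changes
COPY go.mod ./
RUN go mod download
# Copy the rest of the source; only the layers below rebuild when code changes
COPY . .
EXPOSE 8080
CMD ["go", "run", "main.go"]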
3) Minimal Size Images
In the Dockerfile above, the base image in the FROM instruction is golang:latest. As we can see, the resulting image size is 838 MB.
golang-app latest 42a2cfe9cb7e 838MB
Our priority should always be to select a base image with a minimal operating system footprint. And rather than the latest tag, we should always pin a specific version of the base image.
golang 1.20-alpine 71719a2da3d1 255MB
We have now used the golang:1.20-alpine image, which brings the size down to 255 MB. Using Alpine images is more secure and efficient and results in a smaller image. Alpine Linux is a security-focused, lightweight Linux distribution commonly used as a base image in Docker containers.
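In practice this is a one-line change in our Dockerfile; a sketch of the same build on the smaller base image:
FROM golang:1.20-alpine
WORKDIR /app
COPY . .
EXPOSE 8080
CMD ["go", "run", "main.go"]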
4) Multi-Stage Builds
The concept of multi-stage builds in Docker involves using multiple FROM statements; only the final stage ends up as the runtime image.
By using multiple build environments, we can reduce the image size and include only the necessary artifacts in the final runtime environment. This approach is designed to eliminate unwanted layers in the image, which is truly impressive.
Let's look at the example app, for which we've already built an image of about 838 MB.
Simple Dockerfile
FROM golang:latest
WORKDIR /app
COPY . .
EXPOSE 8080
CMD ["go", "run", "main.go"]
Performing Multi-Stage Build
# Stage 1: Build the Go application
FROM golang:1.20-alpine AS builder
# Set the working directory inside the builder container
WORKDIR /app
# Copy the Go source code
COPY go.mod main.go ./
# Build the Go application binary
RUN go build -o main .
# Stage 2: Create a minimal runtime image
FROM alpine:3.16
# Set the working directory inside the runtime container
WORKDIR /app
# Copy the compiled Go binary from the builder stage
COPY --from=builder /app/main .
# Expose the port that the application listens on
EXPOSE 8080
# Run the binary
CMD ["./main"]
golang-app 1 7ed0e6043f58 About a minute ago 12.2MB
Wow! Look at that—the image size is slashed down to just 12.2 MB! This is the incredible power of multi-stage builds in action!
Let’s break down the Multi-Stage Dockerfile
1) There are two FROM instructions. The first uses golang:1.20-alpine as the base image, which is 255 MB (as we discussed above); the second uses alpine:3.16, which is only 5.54 MB.
In the first stage we build the Go application, which is why we use the golang base image.
2) RUN go build -o main . builds the Go application binary and stores it at /app/main.
3) COPY --from=builder /app/main . is the key part of a multi-stage build: the COPY instruction copies the compiled binary from the first stage, i.e. our builder stage, into the runtime image.
Can you believe it? With the simple Dockerfile, our image size was around a whopping 800 MB, but now, thanks to the magic of multi-stage builds, it's shrunk down to just 12 MB! That's an incredible transformation!
5) Using Distroless Images
A Distroless image is like a minimalist's dream for Docker! It includes only the essential parts needed to run an application, leaving out all those extra OS files, like package managers, shells, or utilities.
Isn't it fascinating how "distroless" images aim to shrink the Docker container's size by removing everything except the crucial runtime libraries or dependencies the application needs?
Let's dive in and explore building a Multi-Stage Build with a distroless image! How exciting is that?
# Stage 1: Build the Go application
FROM golang:1.20-alpine AS builder
# Set the working directory inside the builder container
WORKDIR /app
# Copy the go.mod file first to leverage the Docker cache
COPY go.mod ./
# Download dependencies so this layer stays cached until go.mod changes
RUN go mod download
# Copy the rest of the application code
COPY . .
# Build the Go application binary
RUN go build -o main .
# Stage 2: Create a minimal runtime image using Distroless
FROM gcr.io/distroless/static-debian11
# Set the working directory inside the runtime container
WORKDIR /app
# Copy the compiled Go binary from the builder stage
COPY --from=builder /app/main .
# Expose the port that the application listens on
EXPOSE 8080
# Run the binary
CMD ["./main"]
FROM gcr.io/distroless/static-debian11
gcr.io is the Google Container Registry, distroless indicates the family of images, and static-debian11 is the Debian 11-based variant intended for statically linked binaries.
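One detail worth noting: gcr.io/distroless/static-debian11 ships no C library, so the binary must be statically linked. If your build ever pulls in cgo, you can disable it explicitly in the builder stage, for example:
# Force a statically linked binary suitable for the distroless "static" image
RUN CGO_ENABLED=0 GOOS=linux go build -o main .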
6) Minimizing the number of layers
Throughout this article we have seen the importance of reducing the number of layers. This not only keeps the image size manageable but also enhances security.
Many DevOps engineers write Dockerfiles with multiple RUN and COPY statements, which can create extra layers and increase the image size. Let's look at a simple example to understand this better.
FROM ubuntu:20.04
# Install dependencies
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get clean && rm -rf /var/lib/apt/lists/*
WORKDIR /app
When we build the first Dockerfile as ubuntu:1, with its multiple RUN instructions, the image comes out to 143 MB.
ubuntu 1 22f72962e64e 6 minutes ago 143MB
This is a common practice followed by many DevOps engineers, so let's optimize it.
# Start with a base image
FROM ubuntu:20.04
RUN apt-get update && \
    apt-get install -y \
    curl && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*
# Set the working directory
WORKDIR /app
ubuntu 2 fcf6b0dbc93a 7 seconds ago 89MB
But guess what? When we build the second Dockerfile as ubuntu:2, the size drops to just 89 MB! Combining the RUN statements makes such a difference because the apt-get clean and rm -rf /var/lib/apt/lists/* cleanup now runs in the same layer that created those files; in the first Dockerfile the cleanup sits in a separate layer and cannot shrink the layers created before it.
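If you want to see where the size comes from, docker history lists every layer of an image along with its size, so you can compare the two builds yourself:
docker history ubuntu:1
docker history ubuntu:2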
Conclusion
In this article, we explored building Dockerfiles using various optimization techniques. We also learned about Multi-Stage Builds and how minimizing layers can significantly reduce image size and enhance performance. It's truly impressive to see the impact these strategies can have!
As DevOps engineers, we should always use best practices when building Dockerfiles. If you're working with Kubernetes, check out this article to get started: Kubernetes.
Follow me on Twitter for your queries and ideas on DevOps.
Here is the Docker Commands Cheatsheet that you can learn and start using right away!