In this article, we will learn about Dockerfile best practices. In a production environment, the size of our Docker images matters, so we'll look at how to optimize an image by following best practices when writing the Dockerfile.
If you want to learn about Docker architecture, take a look at this article: Getting Started With Docker.
It's much like learning DSA in the software development journey: sooner or later we have to learn how to optimize our code, even though many newcomers shy away from DSA concepts!
In Docker, a Dockerfile defines how an image is created and built, and that image is then run as a container. As DevOps engineers, we must always follow best practices when deploying to production.
Importance of optimizing Docker images
Let's say we create an image for a Node.js application that is 500 MB including its dependencies. Over time the application and its dependencies change, and the image grows to 1 GB.
This leads to higher storage and memory usage, a larger attack surface (i.e. security risks), and slower build times. Let's explore how to optimize the Dockerfile by applying these methods:
Utilize a .dockerignore file
Take advantage of Docker Caching
Opt for minimal-size images
Implement Multi-Stage Builds
Distroless Image
Minimize the Number of Layers
1) Utilize a .dockerignore file
The .dockerignore file works much like a .gitignore file. When building a Docker image, it is essential to exclude files that are not required for the build, such as the .git directory, node_modules, log files, etc. Excluding them keeps the build context small and improves caching performance.
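For example, a minimal .dockerignore for a typical project might look like the sketch below; the exact entries depend entirely on your application.
.git
node_modules
*.log
Dockerfile
.dockerignore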
2) Docker Caching
As we know, Docker builds an image layer by layer. During the first build of an image, every instruction (installing dependencies, copying files, and so on) produces a layer that is saved to the local filesystem.
When we build the same image again, Docker reuses those cached layers, which makes the build much faster. To prove this, let's build a simple Golang application for the first time and see how long it takes.
package main

import (
    "fmt"
    "log"
    "net/http"
)

func health(w http.ResponseWriter, r *http.Request) {
    w.WriteHeader(http.StatusOK)
    fmt.Fprint(w, "Healthy")
}

func handler(w http.ResponseWriter, r *http.Request) {
    fmt.Fprint(w, "Hello, World!")
}

func main() {
    http.HandleFunc("/", handler)
    http.HandleFunc("/health", health)
    port := "8080"
    log.Printf("Starting server on port %s...", port)
    log.Fatal(http.ListenAndServe(":"+port, nil))
}
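Since the Dockerfiles later in this article copy a go.mod file, the project also needs a minimal module file. A sketch (the module name here is only an assumption) looks like this:
module golang-app

go 1.20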
For this Golang application, we also need to write a Dockerfile, which is here:
FROM golang:latest
WORKDIR /app
COPY . .
EXPOSE 8080
CMD ["go", "run", "main.go"]
The first build of this simple Dockerfile takes 233.3 seconds. Now, when we build the same Dockerfile again, Docker reuses the cached layers from the filesystem and the build time drops. Let's see.
Now it takes only 2.9 seconds. That is Docker caching at work.
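To get the most out of the cache, order instructions from least to most frequently changing. A common pattern, sketched here under the assumption that the project has a go.mod file, is to copy go.mod and download modules before copying the rest of the source:
FROM golang:latest
WORKDIR /app
# Copy only the module file first; this layer stays cached until go.mod changes
COPY go.mod ./
RUN go mod download
# Copy the rest of the source; only the layers below rebuild when code changes
COPY . .
EXPOSE 8080
CMD ["go", "run", "main.go"]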
3) Minimal Size Images
In the Dockerfile above, the base image in the FROM instruction is golang:latest. As we can see, the resulting image size is 838 MB.
golang-app latest 42a2cfe9cb7e 838MB
Our priority should always be to select a base image with a minimal operating system footprint. And rather than the latest tag, we should always pin a specific version of the base image.
golang 1.20-alpine 71719a2da3d1 255MB
We have now used the golang:1.20-alpine image, which brings the size down to 255 MB. Using Alpine images is more secure and efficient and results in a smaller image. Alpine Linux is a security-focused, lightweight Linux distribution commonly used as a base image in Docker containers.
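In practice this is a one-line change in our Dockerfile; a sketch of the same build on the smaller base image:
FROM golang:1.20-alpine
WORKDIR /app
COPY . .
EXPOSE 8080
CMD ["go", "run", "main.go"]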
4) Multi-Stage Builds
The concept of multi-stage builds in Docker involves using multiple FROM statements; only the final stage ends up as the runtime image.
By using multiple build environments, we can reduce the image size and include only the necessary artifacts in the final runtime environment. This approach is designed to eliminate unwanted layers in the image, which is truly impressive.
Let's look at the example app, for which we've already built an image of about 838 MB.
Simple Dockerfile
FROM golang:latest
WORKDIR /app
COPY . .
EXPOSE 8080
CMD ["go", "run", "main.go"]
Performing Multi-Stage Build
# Stage 1: Build the Go application
FROM golang:1.20-alpine AS builder
# Set the working directory inside the builder container
WORKDIR /app
# Copy the Go source code
COPY go.mod main.go ./
# Build the Go application binary
RUN go build -o main .
# Stage 2: Create a minimal runtime image
FROM alpine:3.16
# Set the working directory inside the runtime container
WORKDIR /app
# Copy the compiled Go binary from the builder stage
COPY --from=builder /app/main .
# Expose the port that the application listens on
EXPOSE 8080
# Run the binary
CMD ["./main"]
golang-app 1 7ed0e6043f58 About a minute ago 12.2MB
Wow! Look at that—the image size is slashed down to just 12.2 MB! This is the incredible power of multi-stage builds in action!
Let’s break down the Multi-Stage Dockerfile
1) There are two FROM instructions. The first uses golang:1.20-alpine as the base image, which is 255 MB (as we discussed above); the second uses alpine:3.16, which is only 5.54 MB.
In the first stage we build the Go application, which is why we use the golang base image.
2) RUN go build -o main . builds the Go application binary and stores it at /app/main.
3) COPY --from=builder /app/main . is the key part of a multi-stage build: the COPY instruction copies the compiled binary from the first stage, i.e. our builder stage, into the runtime image.
Can you believe it? With the simple Dockerfile, our image size was around a whopping 800 MB, but now, thanks to the magic of multi-stage builds, it's shrunk down to just 12 MB! That's an incredible transformation!
5) Using Distroless Images
A Distroless image is like a minimalist's dream for Docker! It includes only the essential parts needed to run an application, leaving out all those extra OS files, like package managers, shells, or utilities.
Isn't it fascinating how "distroless" images aim to shrink the Docker container's size by removing everything except the crucial runtime libraries or dependencies the application needs?
Let's dive in and explore building a Multi-Stage Build with a distroless image! How exciting is that?
# Stage 1: Build the Go application
FROM golang:1.20-alpine AS builder
# Set the working directory inside the builder container
WORKDIR /app
# Copy the go.mod file first to leverage the Docker cache
COPY go.mod ./
# Download dependencies so this layer stays cached until go.mod changes
RUN go mod download
# Copy the rest of the application code
COPY . .
# Build the Go application binary
RUN go build -o main .
# Stage 2: Create a minimal runtime image using Distroless
FROM gcr.io/distroless/static-debian11
# Set the working directory inside the runtime container
WORKDIR /app
# Copy the compiled Go binary from the builder stage
COPY --from=builder /app/main .
# Expose the port that the application listens on
EXPOSE 8080
# Run the binary
CMD ["./main"]
FROM gcr.io/distroless/static-debian11
gcr.io is the Google Container Registry, distroless indicates the family of images, and static-debian11 is the Debian 11-based variant intended for statically linked binaries.
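One detail worth noting: gcr.io/distroless/static-debian11 ships no C library, so the binary must be statically linked. If your build ever pulls in cgo, you can disable it explicitly in the builder stage, for example:
# Force a statically linked binary suitable for the distroless "static" image
RUN CGO_ENABLED=0 GOOS=linux go build -o main .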
6) Minimizing the number of layers
Throughout this article we have seen the importance of reducing the number of layers. This not only keeps the image size manageable but also enhances security.
Many DevOps engineers write Dockerfiles with multiple RUN and COPY statements, which can create extra layers and increase the image size. Let's look at a simple example to understand this better.
FROM ubuntu:20.04
# Install dependencies
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get clean && rm -rf /var/lib/apt/lists/*
WORKDIR /app
When we build the first Dockerfile as ubuntu:1, with its multiple RUN instructions, the image comes out to 143 MB.
ubuntu 1 22f72962e64e 6 minutes ago 143MB
This is a common practice followed by many DevOps engineers, so let's optimize it.
# Start with a base image
FROM ubuntu:20.04
RUN apt-get update && \
    apt-get install -y \
    curl && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*
# Set the working directory
WORKDIR /app
ubuntu 2 fcf6b0dbc93a 7 seconds ago 89MB
But guess what? When we build the second Dockerfile as ubuntu:2, the size drops to just 89 MB! Combining the RUN statements makes such a difference because the apt-get clean and rm -rf /var/lib/apt/lists/* cleanup now runs in the same layer that created those files; in the first Dockerfile the cleanup sits in a separate layer and cannot shrink the layers created before it.
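If you want to see where the size comes from, docker history lists every layer of an image along with its size, so you can compare the two builds yourself:
docker history ubuntu:1
docker history ubuntu:2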
Conclusion
In this article, we explored building Dockerfiles using various optimization techniques. We also learned about Multi-Stage Builds and how minimizing layers can significantly reduce image size and enhance performance. It's truly impressive to see the impact these strategies can have!
As DevOps engineers, we should always use best practices when building Dockerfiles. If you're working with Kubernetes, check out this article to get started: Kubernetes.
Follow me on Twitter for your queries and ideas on DevOps.
Here is the Docker Commands Cheatsheet that you can learn and start using right away!