Prerequisites: This article assumes an intermediate-level understanding of Docker and Django-based application development.
Docker has revolutionized software development and has proven to be the nucleus of new-age development practices like CI/CD, distributed development, and collaboration.
Still, there is no popular consensus on what good Docker development principles look like. Dockerfiles written for Java or Scala don’t translate directly to Python (we will explore why).
This article discusses an opinionated, production-ready Docker setup for Django applications, which can be used with a docker-compose file (also given below) or with Kubernetes. A further requirement is that containers can be scaled up and down without any side effects.
If you need the code without going into the reasoning, a sample Django repo with this Docker setup is available for download on GitHub, here.
So without any further ado, below is our Dockerfile.
# Dockerfile for Django Applications
# Section 1- Base Image
FROM python:3.8-slim
# Section 2- Python Interpreter Flags
ENV PYTHONUNBUFFERED 1
ENV PYTHONDONTWRITEBYTECODE 1
# Section 3- Compiler and OS libraries
RUN apt-get update \
&& apt-get install -y --no-install-recommends build-essential libpq-dev \
&& rm -rf /var/lib/apt/lists/*
# Section 4- Project libraries and User Creation
COPY requirements.txt /tmp/requirements.txt
RUN pip install --no-cache-dir -r /tmp/requirements.txt \
&& rm -rf /tmp/requirements.txt \
&& useradd -U app_user \
&& install -d -m 0755 -o app_user -g app_user /app/static
# Section 5- Code and User Setup
WORKDIR /app
USER app_user:app_user
COPY --chown=app_user:app_user . .
RUN chmod +x docker/*.sh
# Section 6- Docker Run Checks and Configurations
ENTRYPOINT [ "docker/entrypoint.sh" ]
CMD [ "docker/start.sh", "server" ]
Before going into each section of the above Dockerfile, and the entrypoint.sh and start.sh scripts mentioned in it, let’s discuss the specifications of the test Django application for which we are writing this Docker setup:
- Celery is used for background tasks, with Redis as the celery backend.
- Celery beat is used for cron jobs, to schedule periodic tasks.
- Flower is used for background tasks monitoring.
- We are using PostgreSQL as our Database.
Let’s explore each section of our Dockerfile:
Section 1- Base Image
FROM python:3.8-slim
We have selected python:3.8-slim as the base image. When choosing a base image, a key consideration is its size, as a bigger base image results in a bigger Docker image. Developers often prefer the alpine flavor due to its small size, and for languages such as Java or Scala it is, in most cases, the right way to go. Alpine is a minimal Docker image based on Alpine Linux.
But for Python applications, many requisite libraries are not supported by the alpine flavor out of the box. This means you would end up downloading dependencies onto the alpine flavor, which results in a bigger image size, longer image build times, and application incompatibility. The slim flavor sits between alpine and the full version and hits the sweet spot in terms of size and compatibility.
At present, the Python community has started recognizing this issue, and you will find many articles like this which discuss this issue in detail.
Section 2- Python Interpreter Flags
ENV PYTHONUNBUFFERED 1
ENV PYTHONDONTWRITEBYTECODE 1
We have set two flags, PYTHONUNBUFFERED and PYTHONDONTWRITEBYTECODE, to non-empty values to modify the behavior of the Python interpreter.
When set to a non-empty value, PYTHONUNBUFFERED sends Python output straight to the terminal (standard output) without buffering. This helps in two ways. Firstly, it allows us to get logs in real-time. Secondly, in case of a container crash, it ensures that we receive the output and hence the reason for the failure.
We also set PYTHONDONTWRITEBYTECODE to a non-empty value. This ensures that the Python interpreter doesn’t generate .pyc files which, apart from being useless in our use-case, can also lead to a few hard-to-find bugs.
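The effect of the bytecode flag is easy to verify empirically. Below is a minimal sketch (the module name demo_mod is made up for illustration) showing that with PYTHONDONTWRITEBYTECODE=1 no __pycache__ folder is created:

```python
import os
import pathlib
import subprocess
import sys
import tempfile

# Import a throwaway module in a child interpreter with
# PYTHONDONTWRITEBYTECODE=1 and check that no __pycache__ is left behind.
with tempfile.TemporaryDirectory() as tmp:
    pathlib.Path(tmp, "demo_mod.py").write_text("x = 1\n")
    env = dict(os.environ, PYTHONDONTWRITEBYTECODE="1")
    subprocess.run(
        [sys.executable, "-c", "import demo_mod"],
        cwd=tmp, env=env, check=True,
    )
    leftover = pathlib.Path(tmp, "__pycache__").exists()

print(leftover)  # False
```

Dropping the environment variable from `env` and re-running the experiment shows the __pycache__ directory reappearing.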
Section 3- Compiler and OS libraries
apt-get update
Commands in this section install compilers, tools, and OS-level libraries. For example apt-get update, as you may already know, updates the list of available packages. It doesn’t upgrade the packages themselves; it just fetches their latest version information.
apt-get install -y --no-install-recommends build-essential libpq-dev
The build-essential is a meta-package that pulls in the tools necessary to compile software. This includes, but is not limited to, the g++/GNU compiler collection, make, and a few other tools and libraries. The complete list of build-essential packages can be found here. As per the official documentation, libpq-dev contains,
Header files and static library for compiling C programs to link with the libpq library in order to communicate with a PostgreSQL database backend.
Since libpq-dev contains libraries concerning the PostgreSQL database, feel free to drop it if you are using some other database, and install the requisite packages for that database instead.
The flag --no-install-recommends skips the installation of recommended-but-optional packages. This is done to reduce the Docker image size. Please note that dependencies that are mandatory for our packages are still installed.
rm -rf /var/lib/apt/lists/*
Cleaning /var/lib/apt/lists/* can easily reduce your Docker image size by ~5%-25%. The package lists fetched by apt-get update are no longer required once build-essential and libpq-dev are installed. Hence, in this step, we clean out all the files that apt-get update added.
Section 4- Project libraries and User Creation
In this section, we install the project libraries listed in requirements.txt and create a non-root user to run the application, for security purposes.
COPY requirements.txt /tmp/requirements.txt
If you notice, instead of copying the whole project, which we eventually do in Section 5, we copy only requirements.txt and then install all the libraries listed in it. This is because Docker builds images in layers: if a layer changes, all subsequent layers are rebuilt. Hence, copying only requirements.txt ensures that the installation layer is reused across Docker builds; it is invalidated only when the requirements.txt file itself changes. Had we copied the entire project here, as in Section 5, every new commit or code change would invalidate these layers and trigger a re-installation of the libraries.
pip install --no-cache-dir -r /tmp/requirements.txt
In this stage, we install all the project dependencies listed in requirements.txt. The --no-cache-dir flag disables caching during pip installation. By default, pip caches wheel files (.whl etc.) and source archives (.tar.gz etc.). Inside a Docker build we never reinstall from this cache, so disabling it reduces the image size.
useradd -U app_user
Here, we create a non-root user app_user using the useradd command. By default, Docker runs container processes as root inside the container. This is a bad practice, since attackers can gain root access to the Docker host if they manage to break out of the container (source). The -U flag also creates a user group with the same name.
install -d -m 0755 -o app_user -g app_user /app/static
At the end of the section, we create the folder /app/static and give our user app_user ownership of it. Django will use this folder to collect all static resources of our project when we run the command python manage.py collectstatic.
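For intuition, here is a rough Python analogue of what install -d -m 0755 does. This is a sketch only; the chown to app_user requires root privileges and is therefore omitted:

```python
import pathlib
import stat
import tempfile

# Create the directory tree and set permission bits 0755, mirroring
# `install -d -m 0755 ... /app/static` (ownership change omitted: needs root).
with tempfile.TemporaryDirectory() as tmp:
    static_dir = pathlib.Path(tmp, "app", "static")
    static_dir.mkdir(parents=True)
    static_dir.chmod(0o755)
    mode = stat.S_IMODE(static_dir.stat().st_mode)

print(oct(mode))  # 0o755
```

Mode 0755 lets app_user write to the folder while other users can only read and traverse it.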
Section 5- Code and User Setup
WORKDIR /app
We start this section by setting the working directory. The WORKDIR instruction sets the working directory for subsequent instructions. Since we don’t want to copy our code into the root folder, we copy it into the /app folder.
USER app_user:app_user
Then we switch to the non-root user created at the end of Section 4; subsequent instructions and the container’s processes run as this user. As mentioned earlier, this improves our security.
COPY --chown=app_user:app_user . .
With everything set up, we copy the project into the Docker image. Any code change will only invalidate this and subsequent layers, keeping Docker image build times low. While copying, we give ownership of the content to our user app_user created in Section 4.
RUN chmod +x docker/*.sh
At the end of this section, we give executable permission to our two script files, entrypoint.sh and start.sh. We will go into detail about these two files after Section 6.
Section 6- Docker Run Checks and Configurations
ENTRYPOINT [ "docker/entrypoint.sh" ]
The ENTRYPOINT of a Dockerfile is always executed, hence we use it for validations and Django commands such as migrate. The CMD, in contrast, is overridden by the command section of a docker-compose file, so the value given here serves as a default.
CMD [ "docker/start.sh", "server" ]
For a better understanding of what we are doing with ENTRYPOINT and CMD, let’s look at the corresponding files, entrypoint.sh and start.sh, which they invoke.
#!/bin/bash
# entrypoint.sh file of Dockerfile

# Section 1- Bash options
set -o errexit
set -o pipefail
set -o nounset

# Section 2: Health of dependent services
postgres_ready() {
python << END
import sys

from psycopg2 import OperationalError, connect

try:
    connect(
        dbname="${DJANGO_POSTGRES_DATABASE}",
        user="${DJANGO_POSTGRES_USER}",
        password="${DJANGO_POSTGRES_PASSWORD}",
        host="${DJANGO_POSTGRES_HOST}",
        port="${DJANGO_POSTGRES_PORT}",
    )
except OperationalError:
    sys.exit(-1)
END
}

redis_ready() {
python << END
import sys

from redis import Redis, RedisError

try:
    redis = Redis.from_url("${CELERY_BROKER_URL}", db=0)
    redis.ping()
except RedisError:
    sys.exit(-1)
END
}

until postgres_ready; do
    >&2 echo "Waiting for PostgreSQL to become available..."
    sleep 5
done
>&2 echo "PostgreSQL is available"

until redis_ready; do
    >&2 echo "Waiting for Redis to become available..."
    sleep 5
done
>&2 echo "Redis is available"

# Section 3- Idempotent Django commands
python manage.py collectstatic --noinput
python manage.py makemigrations
python manage.py migrate

exec "$@"
Let’s look at the above entrypoint.sh, though in less detail than the Dockerfile.
Docker uses /bin/sh as the default shell. On most systems it is a symbolic link, and on Debian and Ubuntu it points to dash, not bash, so assuming bash-specific behavior could be wrong (source). Hence we explicitly select /bin/bash via the shebang line.
Section 1- Bash options
set -o errexit
set -o pipefail
set -o nounset
Here, we set a few bash options. The errexit option makes the script fail on the first error it encounters instead of proceeding further, which is not the default bash behavior. The pipefail option means that if any element of a pipeline fails, the pipeline as a whole fails. The nounset option raises an error whenever an unset variable is expanded.
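To see what nounset buys us, the following sketch (assuming bash is installed, and using a deliberately unset variable name) runs a bash snippet that expands an unset variable and confirms it exits with a non-zero status:

```python
import subprocess

# With `set -o nounset`, expanding an unset variable aborts the script with a
# non-zero exit status instead of silently substituting an empty string.
result = subprocess.run(
    ["bash", "-c", 'set -o nounset; echo "$SOME_UNSET_VARIABLE_XYZ"'],
    capture_output=True,
    text=True,
)
failed = result.returncode != 0

print(failed)  # True
```

Without nounset, the same snippet would print an empty line and exit with status 0, silently masking a missing environment variable.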
Section 2: Health of dependent services
Earlier, we assumed that our application uses a PostgreSQL database and Redis as the celery backend. In this section, we check if both services are up; if not, we wait for them to come up.
Similarly, you may add other such critical services which are necessary for the normal functioning of your application.
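The shell wait loops in entrypoint.sh can be sketched as a generic Python helper; the fake check below, standing in for postgres_ready, is of course illustrative:

```python
import time

def wait_for(check, service_name, interval=5):
    """Block until check() returns truthy, logging progress like the
    until-loops in entrypoint.sh. Returns the number of waits."""
    attempts = 0
    while not check():
        attempts += 1
        print(f"Waiting for {service_name} to become available...")
        time.sleep(interval)
    print(f"{service_name} is available")
    return attempts

# Fake check that succeeds on the third call:
calls = iter([False, False, True])
attempts = wait_for(lambda: next(calls), "PostgreSQL", interval=0)
print(attempts)  # 2
```

In production you would pass a real connectivity probe as `check`, exactly as the postgres_ready and redis_ready functions do.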
Section 3- Idempotent Django commands
python manage.py collectstatic --noinput
python manage.py makemigrations
python manage.py migrate
There are many Django management commands that we need to run before starting the Django server: collectstatic to collect all static resources, makemigrations to generate migration files, and migrate to apply those migrations to the database. In this section, we run all such commands.
The only thing to keep in mind is that all these commands should be idempotent, i.e. multiple runs of these commands should have no side effect on the state of our application. Idempotency is required because if, for example, Kubernetes scales these containers, multiple instances will run the entrypoint and would otherwise interfere with each other.
In fact, any idempotent operation can be executed here, not just Django commands.
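As a contrived illustration of the property we want, directory creation with exist_ok is idempotent in exactly this sense: however many containers run it, and however many times, the end state is identical:

```python
import pathlib
import tempfile

def ensure_static_dir(root):
    # `exist_ok=True` makes repeated runs harmless: no error if another
    # container (or an earlier run) created the directory first.
    d = pathlib.Path(root, "static")
    d.mkdir(exist_ok=True)
    return d

with tempfile.TemporaryDirectory() as tmp:
    first = ensure_static_dir(tmp)
    second = ensure_static_dir(tmp)  # second run is a no-op
    same_state = first == second and second.is_dir()

print(same_state)  # True
```

Django's migrate behaves the same way: already-applied migrations are skipped, so concurrent entrypoint runs converge to the same database state.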
We use the start.sh file to leverage the same Dockerfile for the Django server, Celery worker, Celery beat, and Flower containers, passing a different argument for each.
#!/bin/bash

cd /app

if [ $# -eq 0 ]; then
    echo "Usage: start.sh [PROCESS_TYPE](server/beat/worker/flower)"
    exit 1
fi

PROCESS_TYPE=$1

if [ "$PROCESS_TYPE" = "server" ]; then
    if [ "$DJANGO_DEBUG" = "true" ]; then
        gunicorn \
            --reload \
            --bind 0.0.0.0:8000 \
            --workers 2 \
            --worker-class eventlet \
            --log-level DEBUG \
            --access-logfile "-" \
            --error-logfile "-" \
            dockerapp.wsgi
    else
        gunicorn \
            --bind 0.0.0.0:8000 \
            --workers 2 \
            --worker-class eventlet \
            --log-level DEBUG \
            --access-logfile "-" \
            --error-logfile "-" \
            dockerapp.wsgi
    fi
elif [ "$PROCESS_TYPE" = "beat" ]; then
    celery \
        --app dockerapp.celery_app \
        beat \
        --loglevel INFO \
        --scheduler django_celery_beat.schedulers:DatabaseScheduler
elif [ "$PROCESS_TYPE" = "flower" ]; then
    celery \
        --app dockerapp.celery_app \
        flower \
        --basic_auth="${CELERY_FLOWER_USER}:${CELERY_FLOWER_PASSWORD}" \
        --loglevel INFO
elif [ "$PROCESS_TYPE" = "worker" ]; then
    celery \
        --app dockerapp.celery_app \
        worker \
        --loglevel INFO
fi
In the above script, we use gunicorn to run our application server, which is the recommended approach for production. The python manage.py runserver command should be used only in a development setup.
The command in the docker-compose file for each container type will be the following:
- Django server:
start.sh server
- Celery beat:
start.sh beat
- Flower:
start.sh flower
- Celery worker:
start.sh worker
A Django repo with the above Docker setup, along with the docker-compose file, is available for download on GitHub, here.
I will be writing a follow-up article on this soon, where I will be discussing in detail the docker-compose and Kubernetes files corresponding to the above Docker setup.
This article’s heading doesn’t claim, but rather describes the intent to have, a production-ready Docker setup. Hence, please comment on any gaps or improvements in the above setup.
That’s all for this blog, please follow for upcoming articles, thank you!