Understanding Docker basics
Docker is a set of products that allow software to be run in virtualized environments, called containers. In the case of Docker Toolbox running on Windows, those containers run inside a VirtualBox virtual machine, normally called default, which Docker creates when it first runs. Last time we created a container called some-postgres. But where did we get all the software to make the container: the operating system and PostgreSQL? All of that was contained in an image. We used an image called postgres that is stored on the Docker Hub. Let’s look at the command used to create the container:
Most docker commands start with the keyword docker. Running docker run --help will give you a list of possible flags.
The command run creates and starts a container. The flag --name gives it a name. Containers are unusual in Docker, in that if you don’t specify a name they are given a randomly generated default name such as angry_davinci, jolly_wing or tender_banach. For most of the other Docker objects, if you don’t specify a name, you end up having to use the ID hash that Docker generates. In this case I used some-postgres, as suggested on the postgres Hub page.
Ports
The next flag publishes the container’s port. All of the flags have a multi-character name preceded by 2 dashes (the GNU long-option convention, rather than POSIX proper), and some of them also have a single-character alias preceded by a single dash. So, I could have used -p 5432:5432 or --publish 5432:5432. The first number is the external port on the Docker host, and the second one is the internal port inside the container. The default port for PostgreSQL servers is 5432. So, if we create multiple PostgreSQL containers running on the default port (which we will shortly), we need to give them different external ports. If I wanted to create 3 such containers that could all be accessed from pgAdmin at the same time, I’d use the following port flags:
-p 5432:5432
-p 5433:5432
-p 5434:5432
On pgAdmin, I’d create 3 servers with ports 5432, 5433 and 5434.
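The port flags above follow a simple pattern: the external port goes up by one each time, while the internal port stays at 5432. As a rough sketch, the function below prints one docker run command per container rather than starting anything. The container names (pg1, pg2, pg3), the print_run_commands helper and the password value are placeholders of mine, not something taken from the postgres Hub page.

```shell
# Print (not run) one "docker run" command per container.
# Each container listens on 5432 internally; only the external port changes.
print_run_commands() {
  count=$1          # how many containers to describe
  host_port=5432    # first external port to publish on
  i=1
  while [ "$i" -le "$count" ]; do
    # pg$i and the password are hypothetical placeholders
    echo "docker run --name pg$i -p $host_port:5432 -e POSTGRES_PASSWORD=mysecretpassword -d postgres"
    host_port=$((host_port + 1))
    i=$((i + 1))
  done
}

print_run_commands 3
```

Once the printed commands look right, they can be pasted into the terminal one at a time.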
The next flag, -e or --env, sets an environment variable inside the container. Which variables are meaningful depends on the image. In this case, we want to set the postgres user password so that we can connect via pgAdmin. If an image expects such variables, they should be documented somewhere, and in the case of this PostgreSQL image they are explained in detail halfway down the front page.
The last flag is -d or --detach, specifying that we want the container to run in the background. If you forget this, the container runs in the foreground with its output attached to your terminal, and when you exit (for example with Ctrl+C), the container will stop.
Finally, the name of the image is specified, in this case postgres. You can also type a command for the container to run, along with its arguments, immediately after the image name, but in this case we don’t need to do that.
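Putting the flags above together, the command probably looked something like the one below. This is a reconstruction from the description, not a copy of the original: the password value in particular is an assumption, borrowed from the Dockerfile used later on. The DOCKER variable is only a convenience of mine so the command can be previewed with DOCKER=echo before running it for real.

```shell
# Reconstruction of the run command from the flags described above.
# DOCKER defaults to the real docker binary; set DOCKER=echo to preview.
start_some_postgres() {
  "${DOCKER:-docker}" run \
    --name some-postgres \
    -p 5432:5432 \
    -e POSTGRES_PASSWORD=mysecretpassword \
    -d \
    postgres
}

# Preview only: prints the arguments instead of contacting Docker.
DOCKER=echo start_some_postgres
```

Calling start_some_postgres without the DOCKER=echo prefix would hand the same arguments to Docker itself.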
Dockerfile
Running containers directly in this way doesn’t give you a lot of control, especially as Docker is supposed to improve automation rather than typing skills. The normal way to run Docker is a three-step process: write a Dockerfile, build an image from it, and run a container from that image. First you create a text file, called a Dockerfile, that names a base image. The base image is the first thing in the Dockerfile (although you can have comments, starting with #, before it) and is preceded by the keyword FROM. There is a special base image, called scratch, that doesn’t contain anything at all. So, to build your own image from scratch, the first line of your Dockerfile will be:
FROM scratch
You could then add whatever you want into your image. But you don’t need to start from scratch. You can extend existing images. You could create an Ubuntu image with specific tools installed and variables set. From there, you could build other images by installing different versions of PostgreSQL onto exactly the same underlying OS. That would allow you to test only the changes in PostgreSQL versions. Using the Dockerfile, you build your own image, just like the ones you can pull from the Docker Hub. And using that image, you run a container.
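To make the idea of extending an image a little more concrete, here is a minimal sketch of a shared base image. The ubuntu:22.04 tag, the TZ variable and the choice of curl are illustrative assumptions of mine, as is the base-image directory name. The heredoc simply writes the Dockerfile to disk so that nothing needs to be typed into an editor.

```shell
# Write a small Dockerfile for a shared base image into ./base-image.
mkdir -p base-image
cat > base-image/Dockerfile <<'EOF'
# Start from an existing image rather than from scratch
FROM ubuntu:22.04
# A variable every derived image will inherit (illustrative)
ENV TZ=UTC
# Tools every derived image will share (illustrative)
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl ca-certificates \
 && rm -rf /var/lib/apt/lists/*
EOF

# Show what was written
cat base-image/Dockerfile
```

Once an image has been built from this file and given a name, other Dockerfiles could name it in their own FROM line and each install a different PostgreSQL version on top.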
Images
So, let’s build an image based on the postgres image we looked at last time. Create a text file called Dockerfile. By default, Docker looks for the Dockerfile in the root of the build context, which is the directory you pass to the build command (usually the current working directory). You can, of course, run Docker commands from a Command Prompt or PowerShell window, not just the Docker Terminal program. And you can store your Dockerfiles in whatever directory suits you; just use the -f flag to specify the file location. I’m going to be lazy and create a test directory right in the Docker Toolbox directory. Don’t forget that when moving around in the Terminal window, you need to use Linux commands. So, ls instead of dir, and pwd instead of echo %cd%. You also have access to the vi editor if you want. My Dockerfile consists of just 2 lines:
FROM postgres
ENV POSTGRES_PASSWORD=mysecretpassword
To build the image, type
docker build -t craig/postgres:version1 .
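Don’t forget the full stop (period) at the end. That sets the build context to the current directory, which is also where Docker looks for the Dockerfile by default. The -t flag tags the image with a name and, after the colon, an optional version.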
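For completeness, here is a hedged sketch of the second and third steps, building the image and running a container from it, including the -f flag mentioned earlier. The container name my-postgres, the external port 5433 and the docker_cmd wrapper are assumptions for illustration. The wrapper only exists so that the commands can be previewed with DOCKER=echo on a machine where Docker isn't running.

```shell
# Thin wrapper: uses the real docker unless DOCKER is overridden (e.g. DOCKER=echo).
docker_cmd() {
  "${DOCKER:-docker}" "$@"
}

build_and_run() {
  dockerfile=$1   # path to the Dockerfile, e.g. ./Dockerfile
  context=$2      # build context directory, e.g. .
  # Step 2: build and tag the image from the given Dockerfile and context
  docker_cmd build -f "$dockerfile" -t craig/postgres:version1 "$context" || return 1
  # Step 3: run a container from that image (name and port are placeholders)
  docker_cmd run --name my-postgres -p 5433:5432 -d craig/postgres:version1
}

# Preview the two commands without contacting Docker
DOCKER=echo build_and_run ./Dockerfile .
```

Because the Dockerfile already sets POSTGRES_PASSWORD, the run command here no longer needs an -e flag.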
You can see the build is a two-step process, one step per line of the Dockerfile. First it finds the base image. If it’s already downloaded, as here, then it moves on to the next step; otherwise it pulls it from the Docker Hub for you. Then it applies the next instruction, in this case setting the POSTGRES_PASSWORD variable. That involves an intermediate container, which is automatically removed for you. After the build has succeeded and the image has been given an Image ID, you get a security warning. That is important if you’re considering doing this for a real system, but for now you can safely ignore it.
If I build version2 of my image without changing the Dockerfile, Docker is smart enough to know that nothing needs changing. It will create a new tag, craig/postgres:version2, but the Image ID will be the same as for craig/postgres:version1.
You can see the possibility of multiple levels of dependencies being created here. Unfortunately, there isn’t an easy method of viewing dependencies short of third-party scripts and tools. The closest thing to a Docker command is:
docker inspect --format='{{.Id}} {{.Parent}} {{.RepoTags}}' $(docker images --quiet)
This will list the full sha256 ID of each image (the first 12 hex digits of which are used as the Image ID) followed by the image’s immediate parent, if any, and its tags. For more on the formatting command, see this blog. When it comes time to delete images, you won’t be able to delete one that has a child image, so you may need to resort to this in order to get rid of images you no longer need. To see the layers that make up an image, you can also use:
docker history postgres
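Since the inspect output shows full sha256 IDs while the image list shows the short Image ID, it can help to convert one to the other when matching parents to children. The helper below is a small sketch of my own (short_id is not a Docker command): it strips the sha256: prefix and keeps the first 12 characters. The ID in the example is made up.

```shell
# Convert a full image ID (sha256:<64 hex chars>) to the 12-character
# short form. short_id is a hypothetical helper, not part of Docker.
short_id() {
  digest=${1#sha256:}                    # drop the "sha256:" prefix if present
  printf '%s\n' "$digest" | cut -c1-12   # keep the first 12 characters
}

# Made-up ID, for illustration only; prints 0123456789ab
short_id "sha256:0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"
```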