How to Postgres on Kubernetes

In part one, Nathan Koopmans, cloud platform engineer at OptimaData, showed how to create a simple, plain PostgreSQL setup. Although that works fine, you usually want more assurance for your data. Therefore, in this second part, we look at the CloudNativePG operator for Kubernetes.
One of the biggest advantages of using the CloudNativePG operator is that it takes a lot of work off your hands. For example, failover is automatic, the operator manages its own volume claims, and it has a built-in exporter for Prometheus metrics. When you scale the deployment, it also automatically adds the pods and the related database resources. You miss these things with a setup like the one discussed in part one. Since the CloudNativePG operator is open source and free to use, it is also suitable for hobby and small-business environments, where high availability is desired but costs must be kept down.
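To give an idea of that convenience: once a Cluster resource exists (we create one later in this article), scaling it up or down is just a matter of changing spec.instances. As a sketch, using the cluster and namespace names that appear later in this article:
kubectl patch clusters.postgresql.cnpg.io test-cluster -n testcluster --type merge -p '{"spec": {"instances": 5}}'
The operator then takes care of creating or removing the extra pods and their volumes itself.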
To get started, we are going to download the latest version of the operator. This can be found on CloudNativePG’s GitHub page.
On the right side, click on the version under Releases.
On the new page you will see all bug fixes and improvements. Scroll down to the Assets heading and look for the .yaml file; in this case, it is cnpg-1.20.1.yaml. We can use wget to get this file:
wget https://github.com/cloudnative-pg/cloudnative-pg/releases/download/v1.20.1/cnpg-1.20.1.yaml
You can also apply this directly by using kubectl apply -f instead of wget, but I always like to have the file offline for verification before I apply it.
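For completeness, the direct variant would look like this, using the same release as the file above:
kubectl apply -f https://github.com/cloudnative-pg/cloudnative-pg/releases/download/v1.20.1/cnpg-1.20.1.yaml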
After downloading the yaml file, it still needs to be applied. You do that in the following way:
kubectl apply -f cnpg-1.20.1.yaml
In my screenshot, you can see that some roles and CRDs (CustomResourceDefinitions) are marked as unchanged. That is due to an earlier installation of cnpg-1.20.1.yaml, so your output may differ slightly from my screenshot. At the top of the output you can see that a namespace called cnpg-system has been created. This namespace contains the operator pod and other operator-related items.
Run the command kubectl get all -n cnpg-system to see everything that is in the namespace.
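Before continuing, you can also wait for the operator deployment to finish rolling out. In my installation the deployment is called cnpg-controller-manager; if yours is named differently, kubectl get deploy -n cnpg-system will show the exact name:
kubectl rollout status deployment/cnpg-controller-manager -n cnpg-system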
Now that we have the operator installed, we can install the command-line plug-in. This will allow us to retrieve additional information once we have deployed our cluster later. The documentation describes how to install the plug-in under “CloudNativePG Plugin”:
curl -sSfL https://github.com/cloudnative-pg/cloudnative-pg/raw/main/hack/install-cnpg-plugin.sh | sudo sh -s -- -b /usr/local/bin
After installing the plug-in, you can use kubectl cnpg status <cluster-name> -n <namespace> to retrieve the status of the cluster. I will show the output in a later step.
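To confirm that kubectl actually picks the plug-in up, you can list the installed plug-ins and ask the plug-in for its help text:
kubectl plugin list
kubectl cnpg --help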
Now that all the preparations are done, we can start creating a cluster. The CloudNativePG documentation provides an example cluster in cluster-example.yaml:
# Example of PostgreSQL cluster
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: cluster-example
spec:
  instances: 3

  # Example of rolling update strategy:
  # - unsupervised: automated update of the primary once all
  #   replicas have been upgraded (default)
  # - supervised: requires manual supervision to perform
  #   the switchover of the primary
  primaryUpdateStrategy: unsupervised

  # Require 1Gi of space
  storage:
    size: 1Gi
We modify this slightly to make it work for us. Note: check what your storage class is and how much space you want to allocate. In my case, I chose longhorn, since that is the storage solution I use, and 5Gi of space, since I have plenty available. With kubectl get sc you can see which storage classes you have at your disposal.
In the end, my cluster-example.yaml looks like this:
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: test-cluster
  labels:
    env: database
spec:
  imageName: ghcr.io/cloudnative-pg/postgresql:13.6
  primaryUpdateStrategy: supervised
  instances: 3
  storage:
    size: 5Gi
    pvcTemplate:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 5Gi
      storageClassName: longhorn
      volumeMode: Filesystem
  postgresql:
    parameters:
      log_line_prefix: '%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h '
    pg_hba:
      - host all all 0.0.0.0/0 md5
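If you want to check the manifest before actually creating anything, a client-side dry run parses the file and shows what would be sent to the API server without applying it:
kubectl apply --dry-run=client -f cluster-example.yaml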
We want the cluster to run in its own namespace, so I create a new namespace first:
kubectl create ns testcluster
Then we can apply the cluster-example.yaml:
kubectl apply -f cluster-example.yaml -n testcluster
To keep an eye on what is being created, you can use the command kubectl get all -n testcluster. You can run the command several times, but you can also put watch in front of it; the watch program refreshes the screen every 2 seconds. The command then looks like this: watch kubectl get all -n testcluster, or, if you only want to watch the pods, watch kubectl get pods -n testcluster.
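kubectl also has a watch flag of its own; instead of redrawing the whole screen every 2 seconds, it streams changes as they happen:
kubectl get pods -n testcluster -w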
After a moment of patience, there are 3 pods in the running state:
Since it is a newly set up cluster, for convenience we assume that test-cluster-1 is the primary and that 2 and 3 are the replicas. We can check this with the cnpg plug-in we installed a few steps back.
kubectl cnpg status test-cluster -n testcluster
You will then see the following output:
This confirms the presumption that test-cluster-1 is indeed the primary. We have three instances, all three are ready, and the cluster is healthy. Now it’s time to see if we can connect.
The setup with CloudNativePG requires a different way to connect to the database within the pod. To start, we need to find out the password assigned to the postgres user. This can be done with the following command:
kubectl get secrets -n testcluster
Here we see a secret called test-cluster-superuser; this is the secret from which we need to extract the password. The fields contained in the secret can be retrieved with kubectl describe secrets test-cluster-superuser -n testcluster, which shows the password field. Now we are going to extract the password and make it readable:
kubectl get secret test-cluster-superuser -n testcluster -o jsonpath='{.data.password}' | base64 --decode
I’m not showing the password here; it is a long string of numbers and letters. Copy it to a separate document so you can use it easily later. What we’re going to do now is start a port-forward to the primary pod, so we can see that the database is working.
kubectl port-forward -n testcluster test-cluster-1 5432:5432 &
The & character at the end runs the task in the background, which allows us to keep working in the same window. In part one, I already showed the use of the psql client (psql). However, instead of connecting to the node IP address, we now connect to the localhost IP address, 127.0.0.1. This is handled by the port-forward.
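The port-forward is handy for a quick check from your own machine. For clients inside the Kubernetes cluster, the operator also creates a set of services for the cluster; in my setup they are named test-cluster-rw (primary), test-cluster-ro (replicas) and test-cluster-r (any instance). You can list them with:
kubectl get svc -n testcluster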
The default user is postgres, and we now need the password we saved in a separate document earlier.
psql -h 127.0.0.1 -U postgres
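As a small convenience, and assuming a Bash-like shell, you can also put the extracted password in the PGPASSWORD environment variable that psql reads, so you do not have to paste it by hand:
export PGPASSWORD=$(kubectl get secret test-cluster-superuser -n testcluster -o jsonpath='{.data.password}' | base64 --decode)
psql -h 127.0.0.1 -U postgres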
With \l, you can see the databases. Now that we are actually getting information back, we know it is working properly.
In part one and part two, I showed how to use PostgreSQL within Kubernetes. I assume you have already thought about security within the Kubernetes cluster, such as setting a securityContext per pod. That is beyond the scope of this series. As for the operator, the securityContext and RBAC (Role-Based Access Control) are already well taken care of. More information about the securityContext can be found in the Kubernetes documentation.
In these examples, we did a simple setup of PostgreSQL. Nothing has been done to the rest of the parameter configuration, and that is where a lot can be gained in terms of performance. It is always advisable to take a look at this, so that your data, or that of your client, is safe and quickly available.
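As a sketch of where such tuning would go: the postgresql.parameters block we already used for log_line_prefix also accepts regular PostgreSQL settings. The values below are placeholders, not recommendations; size them for your own workload and check the CloudNativePG documentation for parameters the operator manages itself.
# Fragment of cluster-example.yaml, under spec:
  postgresql:
    parameters:
      shared_buffers: 256MB    # placeholder value
      work_mem: 8MB            # placeholder value
      max_connections: "200"   # placeholder value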