# Production Databases
- Production databases are best managed via Kubernetes.
- Generally, the installation and management of databases is handled by a Kubernetes 'Operator' (a controller).
- The operator manages the lifecycle of the database cluster, handling replication, load balancing, etc.
- There are a few options (in order of preference, 2024):
    - CloudNativePG
    - Crunchy PGO
    - Zalando Postgres Operator
    - Percona, Stolon, KubeDB, etc.
## CloudNativePG

This guide assumes you have a functioning Kubernetes cluster, plus the `kubectl` and `helm` command line tools available.
CloudNativePG makes four types of resources available:

- `Cluster` - a replicated database cluster.
- `Pooler` - load balancing on top of a cluster (PgBouncer).
- `Backup` - an on-demand database backup.
- `ScheduledBackup` - a regularly scheduled database backup.
These resources can be used in any namespace of the cluster, so a database could be deployed in the same namespace as the application it serves.
### Operator Install

Via Helm:

```bash
# Add the CloudNativePG chart repository (first time only)
helm repo add cnpg https://cloudnative-pg.github.io/charts
helm repo update

helm upgrade --install cnpg \
  --namespace cnpg-system \
  --create-namespace \
  cnpg/cloudnative-pg
```
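Once installed, verify the operator is running:

```bash
kubectl get pods --namespace cnpg-system
```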
### Install Kubectl Plugin

- To simplify management of databases, it's best to install the `cnpg` plugin for `kubectl`.
- Details are in the CloudNativePG plugin documentation.
- This allows us to easily perform maintenance tasks such as backups, re-scaling replicas, deleting specific replicas, upgrading the database image, etc. (see the examples below).
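For example, a few common maintenance commands (a sketch; the cluster name `fmtm-db` matches the example created below):

```bash
# Overall cluster health: instance roles, replication, backup status
kubectl cnpg status fmtm-db

# Take an on-demand backup (requires a configured barmanObjectStore)
kubectl cnpg backup fmtm-db

# Promote a specific replica to primary
kubectl cnpg promote fmtm-db fmtm-db-2

# Trigger a rolling restart of all instances
kubectl cnpg restart fmtm-db
```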
### Create a PostGIS Database

- Now that the operator is installed, we can create databases in Kubernetes using `apiVersion: postgresql.cnpg.io/v1` and `kind: Cluster`.
- If we deploy a spec with these properties, a database will automatically be deployed and managed by CloudNativePG.
- There are many configuration options available under the `spec` key.
Example:
```bash
cat <<EOF | kubectl apply --filename=-
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: fmtm-db
spec:
  instances: 3
  imageName: ghcr.io/cloudnative-pg/postgis:16
  bootstrap:
    initdb:
      database: fmtm
      postInitTemplateSQL:
        - CREATE EXTENSION postgis;
        - CREATE EXTENSION postgis_topology;
        - CREATE EXTENSION fuzzystrmatch;
        - CREATE EXTENSION postgis_tiger_geocoder;
  storage:
    size: 1Gi
  backup:
    barmanObjectStore:
      destinationPath: "s3://fmtm-db-backups/"
      endpointURL: "https://s3.fmtm.hotosm.org"
      s3Credentials:
        accessKeyId:
          name: fmtm-s3-creds
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: fmtm-s3-creds
          key: ACCESS_SECRET_KEY
    retentionPolicy: "90d"
EOF
```
Fish shell equivalent:

```fish
kubectl apply --filename (echo '
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: fmtm-db
spec:
  instances: 3
  imageName: ghcr.io/cloudnative-pg/postgis:16
  bootstrap:
    initdb:
      database: fmtm
      postInitTemplateSQL:
        - CREATE EXTENSION postgis;
        - CREATE EXTENSION postgis_topology;
        - CREATE EXTENSION fuzzystrmatch;
        - CREATE EXTENSION postgis_tiger_geocoder;
  storage:
    size: 1Gi
  backup:
    barmanObjectStore:
      destinationPath: "s3://fmtm-db-backups/"
      endpointURL: "https://s3.fmtm.hotosm.org"
      s3Credentials:
        accessKeyId:
          name: fmtm-s3-creds
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: fmtm-s3-creds
          key: ACCESS_SECRET_KEY
    retentionPolicy: "90d"
' | psub)
```
Notes:

- This will use the latest PostGIS version.
    - For Postgres 16 this is PostGIS 3.4.
- For testing, the `backup` section can be removed.
- The full API reference for options (v1.21) is in the CloudNativePG documentation.
The `cnpg` plugin can also generate manifests. Note that `kubectl cnpg install generate` produces the operator installation manifest (an alternative to the Helm install above), rather than a database `Cluster` spec:

```bash
kubectl cnpg install generate \
  -n fmtm \
  --replicas 3
```
To delete the cluster, simply run the same command but use `kubectl delete` instead of `kubectl apply`.
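Once applied, the instances can be watched while the cluster initialises (the operator labels each instance pod with `cnpg.io/cluster`):

```bash
# Watch the database pods start up
kubectl get pods --selector cnpg.io/cluster=fmtm-db --watch

# Then inspect replication and health via the plugin
kubectl cnpg status fmtm-db
```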
### Get the DB Credentials

- The database is created in whichever namespace the `Cluster` spec was applied to (no `metadata.namespace` was set above, so the current context's namespace is used).
- A `postgres` superuser password and an application user password are generated as secrets.
- To retrieve them:
```bash
kubectl get secret fmtm-db-app -o jsonpath='{.data.password}' | base64 -d
kubectl get secret fmtm-db-app -o jsonpath='{.data.pgpass}' | base64 -d
```
In production, the `superuserSecret` and `initdb.secret` spec values can be set to use a secret from a centralised secret manager.
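A minimal sketch of what that looks like in the `Cluster` spec, assuming pre-created secrets of type `kubernetes.io/basic-auth` (the secret names here are hypothetical):

```yaml
spec:
  # Required on recent operator versions for the superuser secret to be used
  enableSuperuserAccess: true
  # Secret holding the 'postgres' superuser credentials
  superuserSecret:
    name: fmtm-db-superuser-creds
  bootstrap:
    initdb:
      database: fmtm
      # Secret holding the application user's credentials
      secret:
        name: fmtm-db-app-creds
```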
### Connection Pooling

PgBouncer can be placed in front of the cluster with a `Pooler` config such as:
```bash
cat <<EOF | kubectl apply --filename=-
apiVersion: postgresql.cnpg.io/v1
kind: Pooler
metadata:
  name: fmtm-pooler-rw
spec:
  cluster:
    name: fmtm-db
  instances: 3
  type: rw
  pgbouncer:
    poolMode: session
    parameters:
      max_client_conn: "1000"
      default_pool_size: "10"
EOF
```
Fish shell equivalent:

```fish
kubectl apply --filename (echo '
apiVersion: postgresql.cnpg.io/v1
kind: Pooler
metadata:
  name: fmtm-pooler-rw
spec:
  cluster:
    name: fmtm-db
  instances: 3
  type: rw
  pgbouncer:
    poolMode: session
    parameters:
      max_client_conn: "1000"
      default_pool_size: "10"
' | psub)
```
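Applications then connect through the pooler's service (named after the `Pooler` resource) instead of the cluster service directly; a sketch, assuming the `default` namespace:

```bash
# Connect via PgBouncer rather than directly to the primary
psql "postgresql://fmtm:$DB_PASSWORD@fmtm-pooler-rw.default.svc.cluster.local:5432/fmtm"
```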
### Scheduled Backups

To run a daily scheduled backup to S3 (note that CloudNativePG uses a six-field cron schedule, including seconds):
```bash
cat <<EOF | kubectl apply --filename=-
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: fmtm-backup
spec:
  schedule: "0 0 0 * * *"
  backupOwnerReference: self
  cluster:
    name: fmtm-db
  target: prefer-standby
  method: barmanObjectStore
EOF
```
Fish shell equivalent:

```fish
kubectl apply --filename (echo '
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: fmtm-backup
spec:
  schedule: "0 0 0 * * *"
  backupOwnerReference: self
  cluster:
    name: fmtm-db
  target: prefer-standby
  method: barmanObjectStore
' | psub)
```
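An on-demand backup can also be taken at any time, either with the plugin (`kubectl cnpg backup fmtm-db`) or by applying a `Backup` resource; a minimal sketch:

```bash
cat <<EOF | kubectl apply --filename=-
apiVersion: postgresql.cnpg.io/v1
kind: Backup
metadata:
  name: fmtm-db-backup-manual
spec:
  cluster:
    name: fmtm-db
  method: barmanObjectStore
EOF
```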
### Restoring The Backup

A new cluster can be deployed using the backup as a source:
```bash
cat <<EOF | kubectl apply --filename=-
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: fmtm-db
spec:
  instances: 3
  storage:
    size: 1Gi
  bootstrap:
    recovery:
      source: clusterBackup
  externalClusters:
    - name: clusterBackup
      barmanObjectStore:
        destinationPath: "s3://fmtm-db-backups/"
        endpointURL: "https://s3.fmtm.hotosm.org"
        s3Credentials:
          accessKeyId:
            name: fmtm-s3-creds
            key: ACCESS_KEY_ID
          secretAccessKey:
            name: fmtm-s3-creds
            key: ACCESS_SECRET_KEY
EOF
```
Fish shell equivalent
kubectl apply --filename (echo '
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
name: fmtm-db
spec:
instances: 3
storage:
size: 1Gi
bootstrap:
recovery:
source: clusterBackup
externalClusters:
- name: clusterBackup
barmanObjectStore:
destinationPath: "s3://fmtm-db-backups/"
endpointURL: "https://s3.fmtm.hotosm.org"
s3Credentials:
accessKeyId:
name: fmtm-s3-creds
key: ACCESS_KEY_ID
secretAccessKey:
name: fmtm-s3-creds
key: ACCESS_SECRET_KEY
' | psub)
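Note: if the original cluster (or its data in the bucket) still exists, the restored cluster should generally use a different `metadata.name`, with `serverName` set under the external cluster's `barmanObjectStore` pointing at the original cluster's name, so recovery reads from the correct folder and the new cluster's WAL archive does not clash with the old one. See the CloudNativePG recovery documentation for details.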
### Accessing / Exposing The DB

- Note that exposing the database via a URL is typically not recommended, and is generally not required.
- To connect to the database from an application deployed in the same cluster, we can use the service URL: `<service-name>.<namespace>.svc.cluster.local:<service-port>` (see the example below).
- If external access is in fact required, we can instead use an Nginx Ingress.
- Documentation for this is available in the CloudNativePG docs.
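For the example cluster above, CloudNativePG creates `fmtm-db-rw` (primary), `fmtm-db-ro` (replicas only), and `fmtm-db-r` (any instance) services. A sketch of an in-cluster connection, assuming the `default` namespace and the default application user (which matches the database name, `fmtm`):

```bash
# Connect to the primary via the read-write service
psql "postgresql://fmtm:$DB_PASSWORD@fmtm-db-rw.default.svc.cluster.local:5432/fmtm"
```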