Toulouse, October 17th, 2023
Today, we are releasing pglift 1.0, a new framework for deploying and operating PostgreSQL at scale. The project ships with both a command-line interface to manage the life-cycle of your databases and a collection of Ansible modules to drive your infrastructure as code in a production context.
By this 1.0 milestone, the user interface of pglift is now stable and the product considered ready for general use.
What is pglift?
The project originates from the need of Dalibo’s customers to be able to deploy and operate many PostgreSQL clusters as part of their production environments, in a uniform manner across their infrastructure:
- The production aspect means that PostgreSQL, the database engine, usually needs to be complemented by other services for backup, monitoring or high-availability.
- The uniform aspect typically refers to automation, meaning that end-users want to streamline the deployment of their database service so as to operate it in a safe manner.
pglift tries to answer these challenges by providing:
- a single entry-point and user-friendly command-line interface to operate a database cluster on site, and,
- a collection of Ansible modules, to automate deployment and operations across an infrastructure.
Either way, the behaviour of pglift is highly configurable, from how the PostgreSQL cluster is deployed (authentication, storage layout, logging, backup) to which components are set up alongside the instance to make it production-ready (physical backup, monitoring, high-availability).
Anyone wanting to deploy and operate PostgreSQL at a higher level than bare
initdb
or pg_ctl
commands or with the need for some level of automation
might find pglift attractive. This goes from Enterprise IT services to devops
wanting their infrastructure managed through GitOps.
On the other hand, pglift is not meant to be the single solution to every use case but rather a piece of software that composes well with other tools. As an illustration, while pglift is local first and agnostic of any particular infrastructure framework, it can seamlessly integrate with Ansible already and could be made working with Terraform (or Kubernetes) similarly.
While it remains as configurable as possible, pglift is opinionated in some respect, as when some business choices need to be made, they usually result from Dalibo’s expertise.
Enough theory, let’s see what pglift can do!
The command-line interface
The logical unit of work of pglift is the instance, which at least consists of a PostgreSQL cluster but usually also includes satellite components. Creating an instance from the command-line is done as follows:
$ pglift instance create main --pgbackrest-stanza=main
INFO initializing PostgreSQL
INFO configuring PostgreSQL authentication
INFO configuring PostgreSQL
INFO starting PostgreSQL 16-main
INFO creating role 'powa'
INFO creating role 'prometheus'
INFO creating role 'backup'
INFO altering role 'backup'
INFO creating 'powa' database in 16/main
INFO creating extension 'btree_gist' in database powa
INFO creating extension 'pg_qualstats' in database powa
INFO creating extension 'pg_stat_statements' in database powa
INFO creating extension 'pg_stat_kcache' in database powa
INFO creating extension 'powa' in database powa
INFO configuring Prometheus postgres_exporter 16-main
INFO configuring pgBackRest stanza 'main' for
pg1-path=/srv/pgsql/16/main/data
INFO creating pgBackRest stanza main
INFO starting Prometheus postgres_exporter 16-main
As we can see, aside from PostgreSQL, the instance includes a set of extensions required to make PoWA available, a Prometheus postgres_exporter service for monitoring and a pgBackRest setup for physical backup. These integrations are not defined during operations, but rather configured beforehand and locally through so-called site settings, in the form of plain YAML file such as:
# file: /etc/pglift/settings.yaml
prefix: /srv
postgresql:
auth:
host: scram-sha-256
prometheus:
execpath: /usr/bin/prometheus-postgres-exporter
pgbackrest:
repository:
mode: path
path: /srv/pgsql-backups
powa: {}
systemd: {}
rsyslog: {}
In addition to services running alongside PostgreSQL (monitoring, backup),
pglift can also handle system integrations like systemd
and rsyslog
as in
our example. All this works without root
privileges for better separation of
concerns and security.
One fundamental feature of pglift is that it is stateless but still aware of the state of objects it manages at runtime; typically, this allows to retrieve the state of an instance (at some point in time):
$ pglift instance get main -o json
{
"name": "main",
"version": "16",
"port": 5432,
"settings": {
"unix_socket_directories": "/run/user/1000/pglift/postgresql",
"shared_buffers": "1 GB",
"wal_level": "replica",
"archive_mode": true,
"archive_command": "/usr/bin/pgbackrest --config-path=/etc/pgbackrest --stanza=main --pg1-path=/srv/pgsql/16/main/data archive-push %p",
"effective_cache_size": "4 GB",
"log_destination": "syslog",
"logging_collector": true,
"log_directory": "/var/log/postgresql",
"log_filename": "16-main-%Y-%m-%d_%H%M%S.log",
"syslog_ident": "postgresql-16-main",
"cluster_name": "main",
"lc_messages": "C",
"lc_monetary": "C",
"lc_numeric": "C",
"lc_time": "C",
"shared_preload_libraries": "pg_qualstats, pg_stat_statements, pg_stat_kcache"
},
"data_checksums": false,
"locale": "C",
"encoding": "UTF8",
"standby": null,
"state": "started",
"pending_restart": false,
"wal_directory": "/srv/pgsql/16/main/wal",
"prometheus": {
"port": 9187
},
"data_directory": "/srv/pgsql/16/main/data",
"powa": {},
"pgbackrest": {
"stanza": "main"
}
}
or to modify the instance:
$ pglift pgconf -i main set log_connections=on
INFO configuring PostgreSQL
INFO instance 16/main needs reload due to parameter changes: log_connections
INFO reloading PostgreSQL configuration for 16-main
log_connections: None -> True
$ pglift instance alter main --prometheus-port 8188
INFO configuring PostgreSQL
INFO reconfiguring Prometheus postgres_exporter 16-main
INFO instance 16/main needs reload due to parameter changes: log_connections
INFO reloading PostgreSQL configuration for 16-main
INFO starting Prometheus postgres_exporter 16-main
$ pglift instance get main
name version port data_checksums locale encoding pending_restart prometheus pgbackrest
main 16 5432 False C UTF8 False port: 9187 stanza: main
pglift can also operate on PostgreSQL objects: databases, roles, schemas, privileges, etc.
Moreover, instances and other PostgreSQL objects can be manipulated through native PostgreSQL utilities from the pglift command-line, which is composable as it forwards the instance environment to invoked command:
$ pglift instance exec main -- pgbench -i bench
creating tables...
generating data (client-side)...
100000 of 100000 tuples (100%) done (elapsed 0.06 s, remaining 0.00 s)
vacuuming...
creating primary keys...
done in 0.18 s (drop tables 0.00 s, create tables 0.01 s, client-side generate 0.08 s, vacuum 0.04 s, primary keys 0.05 s).
$ pglift instance exec main -- pgbench bench
pgbench (16.0 (Debian 16.0-1.pgdg120+1))
starting vacuum...end.
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
maximum number of tries: 1
number of transactions per client: 10
number of transactions actually processed: 10/10
number of failed transactions: 0 (0.000%)
latency average = 1.669 ms
initial connection time = 4.544 ms
tps = 599.125277 (without initial connection time)
This also applies to third party tools, e.g.:
$ pglift instance exec main -- pgbackrest info
stanza: main
status: ok
cipher: none
db (current)
wal archive min/max (16): 000000010000000000000001/000000010000000000000007
full backup: 20231016-092726F
timestamp start/stop: 2023-10-16 09:27:26+02 / 2023-10-16 09:27:31+02
wal start/stop: 000000010000000000000004 / 000000010000000000000004
database size: 32.0MB, database backup size: 32.0MB
repo1: backup set size: 4.2MB, backup size: 4.2MB
diff backup: 20231016-092726F_20231016-092821D
timestamp start/stop: 2023-10-16 09:28:21+02 / 2023-10-16 09:28:24+02
wal start/stop: 000000010000000000000007 / 000000010000000000000007
database size: 54.5MB, database backup size: 22.6MB
repo1: backup set size: 6MB, backup size: 1.8MB
backup reference list: 20231016-092726F
See the CLI tutorial in the documentation for more details.
Ansible modules
pglift ships with a collection of Ansible modules, under the dalibo.pglift
namespace. Here’s an example playbook illustrating its capabilities:
- name: Set up database instances
hosts: dbserver
tasks:
- name: main instance
dalibo.pglift.instance:
name: main
state: started
port: 5444
settings:
max_connections: 100
shared_buffers: 1GB
shared_preload_libraries: 'pg_stat_statements, passwordcheck'
surole_password: '{{ postgresql_surole_password }}'
pgbackrest:
stanza: main
password: '{{ backup_role_password }}'
prometheus:
password: '{{ prometheus_role_password }}'
port: 9186
roles:
- name: admin
login: true
password: '{{ admin_password }}'
connection_limit: 10
validity: '2025-01-01T00:00'
in_roles:
- pg_read_all_stats
- pg_signal_backend
databases:
- name: main
owner: admin
settings:
work_mem: 3MB
extensions:
- name: unaccent
schema: public
As we can see, the dalibo.pglift.instance
module makes it possible to manage
not only an instance, but also related objects such as roles
and
databases
. Sensitive data (such as passwords) can be managed by a secret
management solution, such as Ansible vault. Ansible modules allow for much
more control than the command-line, in that deeply nested or complex fields,
such as PostgreSQL instance (or database) settings
, database extensions
,
etc.
This interface is completely declarative and idempotent, but still stateless (as are Ansible modules in general). As for the command-line, head out to Ansible tutorial in the documentation for more details. Also note that these modules can work together with other Ansible modules, such as community.postgresql ones.
About the project
Under the hood, all the business logic is implemented as a Python API. The core idea of pglift is to make PostgreSQL infrastructure management declarative and stateless so that it can be used through infrastructure-as-code but does not need to store anything. This design made it possible to expose both an imperative command-line interface and a declarative interface, in the form of Ansible modules.
The project is open source native, released under the GPLv3 licence. It was made possible by a team of enthusiastic developers and administrators following the idea of encoding Dalibo’s expertise into a powerful, flexible and user-friendly toolkit for day-to-day PostgreSQL administration. Its development happens in public at:
- https://gitlab.com/dalibo/pglift/, for the main project (Python API and the CLI), and
- https://gitlab.com/dalibo/pglift-ansible/, the
dalibo.pglift
Ansible collection.
Finally, great care have been taken to provide an extensive but accessible documentation including tutorials, user guides and how-tos.