Create a Highly Available PostgreSQL Cluster Using Patroni and HAProxy
Updated by Kulshekhar Kabra Written by Kulshekhar Kabra
What is PostgreSQL?
PostgreSQL (Postgres) is an open-source, fully ACID compliant relational database that runs on all major operating systems. While Postgres is a highly versatile, feature-rich, and powerful database, it doesn’t have a built-in solution for high availability.
This guide shows you how to create a highly available Postgres cluster of three servers using Patroni.
Before You Begin
Familiarize yourself with our Getting Started guide and familiarize yourself with SSH and connecting to your linode.
This guide will use
sudo
wherever possible. Complete the sections of our Securing Your Server to create a standard user account and harden SSH access.Update your system:
sudo apt update && sudo apt upgrade
Create five Linodes on your account, all within the same data center. Take note of each Linode’s private IP address
NoteThis guide is written for a non-root user. Commands that require elevated privileges are prefixed withsudo
. If you’re not familiar with thesudo
command, see the Users and Groups guide.
Install PostgreSQL
Install Postgres on three Linodes in your setup. Because the configuration in this guide uses private IP addresses to communicate between Linodes in the same data center, this setup may not meet certain Highly Available requirements. For more information about private IPs, visit our Remote Access guide.
The examples in this guide assign the private IP addresses of the three Postgres Linodes 192.0.2.11
, 192.0.2.12
and 192.0.2.13
. To setup a private IP address on a Linode, refer to the Remote Access guide for more information.
On the three Linodes where you want to install Postgres, update the package lists:
sudo apt update
Install Postgres:
sudo apt install postgresql-9.5 -y
Upon installation, Postgres automatically runs as a service. Stop the Postgres service so that Patroni can manage it from this point on:
sudo systemctl stop postgresql
Patroni uses utilities that come installed with Postgres, located in the
/usr/lib/postgresql/9.5/bin
directory by default on Ubuntu 16.04. Create symbolic links in thePATH
to ensure that Patroni can find the utilities:sudo ln -s /usr/lib/postgresql/9.5/bin/* /usr/sbin/
Instead of creating symlinks, you can include the
/usr/lib/postgresql/9.5/bin
directory in yourPATH
.Repeat these steps on each of the three Linodes.
Install Patroni
Patroni is an open-source Python package that manages Postgres configuration. It can be configured to handle tasks like replication, backups and restorations.
In this guide, you will use Patroni to:
- Configure the Postgres instance running on the same server
- Configure replication from master to slaves
- Automatically failover to the best slave in case the master goes down.
Install
python
andpip
:sudo apt install python python-pip -y
Ensure that you have latest version of the
setuptools
python package:sudo pip install --upgrade setuptools
Use
pip
to install Patroni:sudo pip install patroni
Repeat these steps on each of the three Linodes.
Install etcd
Etcd is a fault-tolerant, distributed key-value store that is used to store the state of the Postgres cluster. Via Patroni, all of the Postgres nodes make use of etcd to keep the Postgres cluster up and running.
In this guide you use a single-server etcd cluster. However, in production, it may be best to use a larger etcd cluster so that one etcd node fails, it doesn’t affect your Postgres servers.
On the Linode where you want etcd installed, update the package lists:
sudo apt update
Install etcd:
sudo apt install etcd -y
The remainder of this guide uses 192.0.2.21
as the private IP address of this Linode.
Install HAProxy
When developing an application that uses a database, it can be cumbersome to keep track of the database endpoints if they keep changing. Using HAProxy simplifies this by giving a single endpoint to which you can connect the application.
HAProxy forwards the connection to whichever node is currently the master. It does this using a REST endpoint that Patroni provides. Patroni ensures that, at any given time, only the master Postgres node will appear as online, forcing HAProxy to connect to the correct node.
On the Linode where you want HAProxy installed, update the package lists:
sudo apt update
Install HAProxy:
sudo apt install haproxy -y
This guide uses 192.0.2.31
as the private IP address of this server and 203.0.113.1
as its public IP address.
Current Status
At this stage, you should have a total of five Linodes:
Example Private IP Address | Software Installed | Example Public IP Address |
---|---|---|
192.0.2.11 | Postgres, Patroni | - |
192.0.2.12 | Postgres, Patroni | - |
192.0.2.13 | Postgres, Patroni | - |
192.0.2.21 | etcd | - |
192.0.2.31 | HAProxy | 203.0.113.1 |
Configure etcd
Edit the
/etc/default/etcd
file to add the following configuration:- /etc/default/etcd
-
1 2 3 4 5 6 7 8 9 10 11 12 13
ETCD_LISTEN_PEER_URLS="http://192.0.2.21:2380" ETCD_LISTEN_CLIENT_URLS="http://localhost:2379,http://192.0.2.21:2379" ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.0.2.21:2380" ETCD_INITIAL_CLUSTER="etcd0=http://192.0.2.21:2380," ETCD_ADVERTISE_CLIENT_URLS="http://192.0.2.21:2379" ETCD_INITIAL_CLUSTER_TOKEN="cluster1" ETCD_INITIAL_CLUSTER_STATE="new"
Save the file, then restart the etcd service:
sudo systemctl restart etcd
Configure Patroni
Patroni can be configured using a YAML file which can be placed anywhere. In this guide, you will place this file at /etc/patroni.yml
.
Create a patroni.yml
file on all three Linodes that have Postgres and Patroni installed (192.0.2.11
, 192.0.2.12
, and 192.0.2.13
in this guide). Change name
to something unique, and change listen
and connect_address
(under postgresql
and restapi
) to the appropriate values on each Linode.
Edit this file to have the following content:
- /etc/patroni.yml
-
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58
scope: postgres namespace: /db/ name: postgresql0 restapi: listen: 192.0.2.11:8008 connect_address: 192.0.2.11:8008 etcd: host: 192.0.2.21:2379 bootstrap: dcs: ttl: 30 loop_wait: 10 retry_timeout: 10 maximum_lag_on_failover: 1048576 postgresql: use_pg_rewind: true initdb: - encoding: UTF8 - data-checksums pg_hba: - host replication replicator 127.0.0.1/32 md5 - host replication replicator 192.0.2.11/0 md5 - host replication replicator 192.0.2.12/0 md5 - host replication replicator 192.0.2.13/0 md5 - host all all 0.0.0.0/0 md5 users: admin: password: admin options: - createrole - createdb postgresql: listen: 192.0.2.11:5432 connect_address: 192.0.2.11:5432 data_dir: /data/patroni pgpass: /tmp/pgpass authentication: replication: username: replicator password: rep-pass superuser: username: postgres password: secretpassword parameters: unix_socket_directories: '.' tags: nofailover: false noloadbalance: false clonefrom: false nosync: false
Make note of the
data_dir
value in the above file. Thepostgres
user needs the ability to write to this directory. If this directory doesn’t exist, create it:sudo mkdir /data/patroni -p
Make
postgres
the owner of/data/patroni
:sudo chown postgres:postgres /data/patroni
Change the permissions on this directory to make it accessible only to the
postgres
user:sudo chmod 700 /data/patroni
Every option in the above file is configurable. View the latest version of the postgres0.yml file in Patroni’s Github repository.
Create a
systemd
script that will allow you to start, stop and monitor Patroni. Create a file at/etc/systemd/system/patroni.service
with the following content:- /etc/systemd/system/patroni.service
-
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
[Unit] Description=Runners to orchestrate a high-availability PostgreSQL After=syslog.target network.target [Service] Type=simple User=postgres Group=postgres ExecStart=/usr/local/bin/patroni /etc/patroni.yml KillMode=process TimeoutSec=30 Restart=no [Install] WantedBy=multi-user.targ
If
patroni
is installed in a location other than/usr/local/bin/patroni
on your machine, update the above file accordingly.Start Patroni and Postgres:
sudo systemctl start patroni
Check the status of Patroni:
sudo systemctl status patroni
If everything is set up correctly, the output from the first node (leader) will resemble:
● patroni.service - Runners to orchestrate a high-availability PostgreSQL Loaded: loaded (/etc/systemd/system/patroni.service; enabled; vendor preset: enabled) Active: active (running) since Thu 2017-07-29 16:49:18 UTC; 8min ago Main PID: 13097 (patroni) . . . ... INFO: Lock owner: postgresql0; I am postgresql0 ... INFO: no action. i am the leader with the lock
When starting subsequent nodes, the log will resemble:
INFO: no action. i am a secondary and i am following a leader Lock owner: postgresql0; I am postgresql2
Repeat these steps on each of the three Linodes with Postgres installed to create a highly available Postgres cluster with one master and two slaves.
Configure HAProxy
With the Postgres cluster set up, you need a way to connect to the master regardless of which of the servers in the cluster is the master. This is where HAProxy comes in. All Postgres clients (your applications, psql
, etc.) will connect to HAProxy which will make sure you connect to the master in the cluster.
On the Linode that has HAProxy installed, edit the configuration file at
/etc/haproxy/haproxy.cfg
to contain the following:- /etc/haproxy/haproxy.cfg
-
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
global maxconn 100 defaults log global mode tcp retries 2 timeout client 30m timeout connect 4s timeout server 30m timeout check 5s listen stats mode http bind *:7000 stats enable stats uri / listen postgres bind *:5000 option httpchk http-check expect status 200 default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions server postgresql_192.0.2.11_5432 192.0.2.11:5432 maxconn 100 check port 8008 server postgresql_192.0.2.12_5432 192.0.2.12:5432 maxconn 100 check port 8008 server postgresql_192.0.2.13_5432 192.0.2.13:5432 maxconn 100 check port 8008
This configuration exposes HAProxy stats on a public URL. In a production setup, it might be better to restrict this to an internal network/localhost and access it via an SSH tunnel.
Restart HAProxy to use the new settings:
sudo systemctl restart haproxy
If HAProxy fails to start, check for syntax errors:
/usr/sbin/haproxy -c -V -f /etc/haproxy/haproxy.cfg
Test the Setup
Connect Postgres clients to the public IP address of the Linode on which you installed HAProxy (in this guide,
203.0.113.1
) on port5000
.You can also connect to the HAProxy Linode on port
7000
to see the HAProxy dashboard:In the
postgres
section, thepostgresql_192.0.2.11_5432
row is highlighted in green. This indicates that192.0.2.11
is currently acting as the master.If you kill the primary Linode (using
sudo systemctl stop patroni
or by shutting down the server), the dashboard will look similar to:In the
postgres
section, thepostgresql_192.0.2.11_5432
row is now red and thepostgresql_192.0.2.13_5432
row is highlighted in green. This indicates that192.0.2.13
is currently acting as the master.Note
In this case, it just so happens that the third Postgres server is promoted to master. This might not always be the case. It is equally likely that the second server may be promoted to master.
When you now bring up the first server, it will rejoin the cluster as a slave and will sync up with the master.
You now have a robust, highly available Postgres cluster ready for use.
Possible Next Steps
While the setup in this guide should go far in making your Postgres deployment highly available, here are steps you can take to improve it further:
- Use a larger etcd cluster to improve availability.
- Use PgBouncer to pool connections.
- Add another HAProxy server and configure IP failover to create a highly available HAProxy cluster.
More Information
You may wish to consult the following resources for additional information on this topic. While these are provided in the hope that they will be useful, please note that we cannot vouch for the accuracy or timeliness of externally hosted materials.
Join our Community
Find answers, ask questions, and help others.
This guide is published under a CC BY-ND 4.0 license.