Project

General

Profile

Actions

Database Replication 2

New setup is a pair of PostgreSQL servers running on CentOS VMs inside the normal PPD Network with streaming WAL replication set up between them.

  • heplnv118 is the nominal primary
  • heplnv119 is the nominal secondary

Database setup and configuration is mainly handled by puppet.

Database files are stored in `/pgsql` although data is streamed in real time between the servers the Write Ahead Log "WAL" files are copied to a shared network disk `/net/pgsql/pg_archive/archive/heplnv118/`. The secondary PostgreSQL server manages cleaning the files up when they are no longer required.

Re-establish broken streaming

  • On the secondary
    Stop Puppet and PostgresSQL
    puppet agent --disable
    systemctl stop postgresql-11.service
    

    Unmount, clean and recreate the data directory
    umount /pgsql
    mkfs.xfs -f /dev/mapper/VolGroup00-pgsql
    mount /pgsql
    chown postgres:postgres /pgsql
    
  • On the Primary
    Create a new backup of the DB and copy it to the shared area
    /usr/local/sbin/pgsql_backup.sh
    cp /tmp_backup/pgsql/last/heplnv118_db_backup_YYYYMMDDHHmm.tgz /net/pgsql/pg_archive/backups/heplnv118/
    
  • Back on the Secondary
    Unpack the backup
    cd /pgsql/
    sudo -u postgres tar xzf /net/pgsql/pg_archive/backups/heplnv118/heplnv118_db_backup_202011051700.tgz
    

    Re-enable and run Puppet. This should create the recovery.conf file then start PostgreSQL in streaming recovery mode.
    puppet agent --enable
    puppet agent --test
    

    Check the status of PostgreSQL
    # systemctl status postgresql-11.service
    ● postgresql-11.service - PostgreSQL 11 database server
       Loaded: loaded (/etc/systemd/system/postgresql-11.service; enabled; vendor preset: disabled)
       Active: active (running) since Thu 2020-11-05 17:03:45 GMT; 56min ago
         Docs: https://www.postgresql.org/docs/11/static/
     Main PID: 181653 (postmaster)
       CGroup: /system.slice/postgresql-11.service
               ├─181653 /usr/pgsql-11/bin/postmaster -D /pgsql
               ├─181655 postgres: logger
               ├─181656 postgres: startup   recovering 000000010000000000000070
               ├─181658 postgres: checkpointer
               ├─181659 postgres: background writer
               ├─181661 postgres: stats collector
               └─181662 postgres: walreceiver   streaming 0/70F2D438
    
    Nov 05 17:03:44 heplnv119.pp.rl.ac.uk systemd[1]: Starting PostgreSQL 11 database server...
    Nov 05 17:03:44 heplnv119.pp.rl.ac.uk postmaster[181653]: 2020-11-05 17:03:44.891 GMT [181653] LOG:  listening on IPv4 address "0.0.0.0", port 5432
    Nov 05 17:03:44 heplnv119.pp.rl.ac.uk postmaster[181653]: 2020-11-05 17:03:44.891 GMT [181653] LOG:  listening on IPv6 address "::", port 5432
    Nov 05 17:03:44 heplnv119.pp.rl.ac.uk postmaster[181653]: 2020-11-05 17:03:44.894 GMT [181653] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432" 
    Nov 05 17:03:44 heplnv119.pp.rl.ac.uk postmaster[181653]: 2020-11-05 17:03:44.899 GMT [181653] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432" 
    Nov 05 17:03:44 heplnv119.pp.rl.ac.uk postmaster[181653]: 2020-11-05 17:03:44.913 GMT [181653] LOG:  redirecting log output to logging collector process
    Nov 05 17:03:44 heplnv119.pp.rl.ac.uk postmaster[181653]: 2020-11-05 17:03:44.913 GMT [181653] HINT:  Future log output will appear in directory "log".
    Nov 05 17:03:45 heplnv119.pp.rl.ac.uk systemd[1]: Started PostgreSQL 11 database server.
    Nov 05 17:03:45 heplnv119.pp.rl.ac.uk systemd[1]: Reloading PostgreSQL 11 database server.
    Nov 05 17:03:45 heplnv119.pp.rl.ac.uk systemd[1]: Reloaded PostgreSQL 11 database server.
    

Updated by Brew, Chris 11 months ago · 2 revisions