18

Trying to do an hourly incremental backup of a single Postgres server (Win7 64).

I have the following setup in postgresql.conf:

max_wal_senders = 2
wal_level       = archive
archive_mode    = on
archive_command = 'copy "%p" "c:\\postgres\\foo\\%f"'

(restart)

I did a base backup with pg_basebackup -U postgres -D ..\foo -F t -x

Which made a big base.tar file in the foo folder and added some 16,384 KB files, which I assume are WALs.

What I don't understand is why the WALs in foo don't change. The WALs in data/pg_xlog change. Is pg not supposed to copy them over? How does it decide to do so?

Perhaps I need to set archive_timeout=3600 ?

I've seen several sites (pg's mailing lists, bacula's postgres page) that say you need to call pg_start_backup() and pg_stop_backup(), but I believe that those are not required. Is that true?

Secondary questions:

  1. How often do the WALs in data/pg_xlog get written? What triggers a write?

    It seems to update a WAL if I do some DML then \q in psql. Or edit a table in pgAdmin then close the window. I figured it would write on commit.

  2. Best practices? pg_basebackup once a week? Archive WALs to same machine as PG or a remote machine?

Neil McGuigan
  • 7,653
  • 3
  • 36
  • 52

2 Answers2

8

There is an existing tool that'll help you a great deal, WAL-E. It provides an archive_command and restore_command for PITR to S3.

There are no commands to do incremental or differential logical backups. pg_dump can't take an incremental or differential. The only way to do that is via log archiving.

In theory you could take a new full backup, do a binary diff between it and the last backup, and upload the diff. This strikes me as a fragile and inefficient way to do things, though, and I really wouldn't recommend it.

Additionally, PgBarman supports integration with S3 via hook scripts, and will automate much of the backup rotation and management for you. Again, this may not be an option on Windows.

Craig Ringer
  • 51,279
  • 3
  • 136
  • 175
5

You want to do an incremental backup of the archive folder to remote storage.

Should you need to restore from the backup, the basic scenario is that you'd need your base backup as the starting point, and the entire contents of the archive folder to replay the transactional activity that happened between the starting point and the crash.

Also to avoid having the files in the archive folder piling up forever, you want to do a new base backup from time to time and delete the files that were archived before the new base backup.

Daniel Vérité
  • 26,604
  • 3
  • 60
  • 69
  • Thanks. A few questions: 1. Do I do pg_start_backup(), copy the data, then run pg_stop_backup(), or pg_start_backup(); pg_stop_backup(), then copy? – Neil McGuigan Jul 08 '13 at 20:30
  • @Neil: 1. and 2. don't apply when you use `pg_basebackup`, it already takes care of this. 3. postgres will automatically delete the WAL files in `pg_log` when they're no longer needed. You shouldn't do anything manually in `pg_log`. Otherwise see the `wal_keep_segments` parameter – Daniel Vérité Jul 08 '13 at 20:44
  • by "archive" folder you mean pg_xlog, yes? – Neil McGuigan Jun 26 '15 at 21:43
  • @NeilMcGuigan: not at all. The archive folder is the destination folder of your archive command, for instance here "c:\postgres\foo". `pg_xlog` is entirely managed automatically by postgres, whereas the archive folder is entirely managed by the DBA. – Daniel Vérité Jun 26 '15 at 23:09
  • I guess I'm confused because the WALs in foo never change after the initial pg_basebackup – Neil McGuigan Jun 26 '15 at 23:15
  • @NeilMcGuigan, perhaps check the PostgreSQL database logs to see why there may be a failure to copy new WAL files from `pg_xlog` to `foo`. Insufficient permissions to write to the destination can in theory be one reason. – Asclepius Jun 30 '15 at 22:16