<!--
-$PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.40 2004/08/03 20:32:30 tgl Exp $
+$PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.41 2004/08/03 23:42:59 tgl Exp $
-->
<chapter id="backup">
<title>Backup and Restore</title>
<sect1 id="backup-online">
<title>On-line backup and point-in-time recovery</title>
+ <indexterm zone="backup">
+ <primary>on-line backup</primary>
+ </indexterm>
+
+ <indexterm zone="backup">
+ <primary>point-in-time recovery</primary>
+ </indexterm>
+
<para>
At all times, <productname>PostgreSQL</> maintains a <firstterm>write ahead
log</> (WAL) that shows details of every change made to the database's data
<listitem>
<para>
If we continuously feed the series of WAL files to another machine
- that's been loaded with the same base backup, we have a <quote>hot
+ that's been loaded with the same base backup file, we have a <quote>hot
standby</> system: at any point we can bring up the second machine
and it will have a nearly-current copy of the database.
</para>
many situations where high reliability is needed.
</para>
+ <para>
+ To recover successfully using an on-line backup, you need a continuous
+ sequence of archived WAL files that extends back at least as far as the
+ start time of your backup. So to get started, you should set up and test
+ your procedure for archiving WAL files <emphasis>before</> you take your
+ first base backup. Accordingly, we first discuss the mechanics of
+ archiving WAL files.
+ </para>
+
+ <sect2 id="backup-archiving-wal">
+ <title>Setting up WAL archiving</title>
+
+ <para>
+ In an abstract sense, a running <productname>PostgreSQL</> system
+ produces an indefinitely long sequence of WAL records. The system
+ physically divides this sequence into WAL <firstterm>segment files</>,
+ which are normally 16Mb apiece (although the size can be altered when
+ building the server). The segment files are given numeric names that
+ reflect their position in the abstract WAL sequence. When not using WAL
+ archiving, the system normally creates just a few segment files and then
+ <quote>recycles</> them by renaming no-longer-needed segment files to
+ higher segment numbers. It's assumed that a segment file whose contents
+ precede the checkpoint-before-last is no longer of interest and can be
+ recycled.
+ </para>
+
+ <para>
+ When archiving WAL data, we want to capture the contents of each segment
+ file once it is filled, and save that data somewhere before the segment
+ file is recycled for reuse. Depending on the application and the
+ available hardware, there could be many different ways of <quote>saving
+ the data somewhere</>: we could copy the segment files to an NFS-mounted
+ directory on another machine, or write them onto a tape drive, or batch
+ them together and burn them onto CDs, or something else entirely. To
+ provide the database administrator with as much flexibility as possible,
+ <productname>PostgreSQL</> tries not to make any assumptions about how
+ the archiving will be done. Instead, <productname>PostgreSQL</> lets
+ the administrator specify a shell command to be executed to copy a
+ completed segment file to wherever it needs to go. The command could be
+ as simple as a <application>cp</>, or it could invoke a complex shell
+ script --- it's all up to you.
+ </para>
+
+ <para>
+ The shell command to use is specified by the <xref
+ linkend="guc-archive-command"> configuration parameter, which in practice
+ will always be placed in the <filename>postgresql.conf</filename> file.
+ In this string,
+ any <literal>%p</> is replaced by the absolute path of the file to
+ archive, while any <literal>%f</> is replaced by the file name only.
+ Write <literal>%%</> if you need to embed an actual <literal>%</>
+ character in the command. The simplest useful command is something
+ like
+<programlisting>
+archive_command = 'cp %p /mnt/server/archivedir/%f'
+</programlisting>
+ which will copy archivable WAL segments to the directory
+ <literal>/mnt/server/archivedir</>.
+ </para>
+
+ <para>
+ The archive command will be executed under the ownership of the same
+ user that the <productname>PostgreSQL</> server is running as. Since
+ the series of WAL files being archived contains effectively everything
+ in your database, you will want to be sure that the archived data is
+ protected from prying eyes; for example, archive into a directory that
+ does not have group or world read access.
+ </para>
+
+ <para>
+ It is important that the archive command return zero exit status if and
+ only if it succeeded. Upon getting a zero result,
+ <productname>PostgreSQL</> will assume that the WAL segment file has been
+ successfully archived, and it may be overwritten with new data very
+ soon thereafter. However, a nonzero status tells
+ <productname>PostgreSQL</> that the file was not archived; it will try
+ again periodically until it succeeds.
+ </para>
+
+ <para>
+ Speed of the archiving command is not important, so long as it can keep up
+ with the average rate at which your server generates WAL data. It is okay
+ if the archiving process falls a little behind (or even a lot behind, if
+ you don't mind the <literal>pg_xlog/</> directory filling up with
+ not-yet-archived segment files).
+ </para>
+
+ <para>
+ If you are concerned about being able to recover right up to the current
+ instant, you may want to take additional steps to ensure that the current,
+ partially-filled WAL segment is also copied someplace. This is
+ particularly important if your server generates only little WAL traffic
+ (or has slack periods where it does so), since it could take a long time
+ before a WAL segment file is completely filled and ready to archive.
+ One possible way to handle this is to set up a <application>cron</> job
+ that periodically (once a minute, perhaps) identifies the current WAL
+ segment file and saves it someplace safe. The combination of the archived
+ WAL segments and the saved current segment will then be enough to ensure
+ you can always restore to within a minute of current time. This behavior
+ is not presently built into <productname>PostgreSQL</> because we did not
+ want to complicate the definition of the <xref
+ linkend="guc-archive-command"> by requiring it to keep track of
+ successively archived, but different, copies of the same WAL file.
+ The <xref linkend="guc-archive-command"> is only invoked on finished
+ WAL segments that will not change anymore; and except in the case of
+ retrying a failure, it will be called only once for any given file name.
+ </para>
+
+ <para>
+ In writing your archive command, you should assume that the filenames to
+ be archived may be up to 64 characters long and may contain any
+ combination of ASCII letters, digits, and dots. It is not necessary to
+ remember the original full path (<literal>%p</>) but it is necessary to
+ remember the file name (<literal>%f</>).
+ </para>
+ </sect2>
+
+ <sect2 id="backup-base-backup">
+ <title>Making a Base Backup</title>
+
+ <para>
+ The procedure for making a base backup is relatively simple:
+ <orderedlist>
+ <listitem>
+ <para>
+ Ensure that WAL archiving is enabled and working.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Connect to the database as a superuser, and issue the command
+<programlisting>
+SELECT pg_start_backup('label');
+</programlisting>
+ where <literal>label</> is any string you want to use to uniquely
+ identify this backup operation. (One good practice is to use the
+ full path where you intend to put the backup dump file.) It does
+ not matter which database within the cluster you connect to to issue
+ this command. You can ignore the result returned by the function;
+ but if it reports an error, deal with that before proceeding.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Perform the backup, using any convenient filesystem-backup tool
+ such as <application>tar</> or <application>cpio</>. It is neither
+ necessary nor desirable to stop normal operation of the database
+ while you do this.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Again connect to the database as a superuser, and issue the command
+<programlisting>
+SELECT pg_stop_backup();
+</programlisting>
+ If this returns successfully, you're done.
+ </para>
+ </listitem>
+ </orderedlist>
+ </para>
+
+ <para>
+ It is not necessary to be very concerned about the amount of time elapsed
+ between <function>pg_start_backup</> and the start of the actual backup,
+ nor between the end of the backup and <function>pg_stop_backup</>; a
+ few minutes' delay won't hurt anything. You
+ must however be quite sure that these operations are carried out in
+ sequence and don't overlap.
+ </para>
+
+ <para>
+ Be certain that your backup dump includes all of the files underneath
+ the database cluster directory (e.g., <literal>/usr/local/pgsql/data</>).
+ If you are using tablespaces that do not reside underneath this directory,
+ be careful to include them as well (and be sure that your backup dump
+ archives symbolic links as links, otherwise the restore will mess up
+ your tablespaces).
+ </para>
+
+ <para>
+ You may, however, omit from the backup dump the files within the
+ <literal>pg_xlog/</> subdirectory of the cluster directory. This
+ slight complication is worthwhile because it reduces the risk
+ of mistakes when restoring. This is easy to arrange if
+ <literal>pg_xlog/</> is a symbolic link pointing to someplace outside
+ the cluster directory, which is a common setup anyway for performance
+ reasons.
+ </para>
+
+ <para>
+ To make use of this backup, you will need to keep around all the WAL
+ segment files generated at or after the starting time of the backup.
+ To aid you in doing this, the <function>pg_stop_backup</> function
+ creates a <firstterm>backup history file</> that is immediately stored
+ into the WAL archive area. This file is named after the first WAL
+ segment file that you need to have to make use of the backup. For
+ example, if the starting WAL file is <literal>0000000100001234000055CD</>
+ the backup history file will be named something like
+ <literal>0000000100001234000055CD.007C9330.backup</>. (The second part of
+ this file name stands for an exact position within the WAL file, and can
+ ordinarily be ignored.) Once you have safely archived the backup dump
+ file, you can delete all archived WAL segments with names numerically
+ preceding this one. The backup history file is just a small text file.
+ It contains the label string you gave to <function>pg_start_backup</>, as
+ well as the starting and ending times of the backup. If you used the
+ label to identify where the associated dump file is kept, then the
+ archived history file is enough to tell you which dump file to restore,
+ should you need to do so.
+ </para>
+
+ <para>
+ Since you have to keep around all the archived WAL files back to your
+ last full dump, your interval between full dumps would usually be chosen
+ based on how much storage you want to expend on archived WAL files.
+ You should also consider how long you are prepared to spend recovering,
+ if recovery should be necessary --- the system will have to replay all
+ those segments, and that could take awhile if it's been a long time
+ since the full dump.
+ </para>
+
+ <para>
+ It's also worth noting that the <function>pg_start_backup</> function
+ makes a file named <literal>backup_label</> in the database cluster
+ directory, which is then removed again by <function>pg_stop_backup</>.
+ This file will of course be archived as a part of your backup dump file.
+ The backup label file includes the label string you gave to
+ <function>pg_start_backup</>, as well as the time at which
+ <function>pg_start_backup</> was run, and the name of the starting WAL
+ file. In case of confusion it will
+ therefore be possible to look inside a backup dump file and determine
+ exactly which backup session the dump file came from.
+ </para>
+
+ <para>
+ It is also possible to make a backup dump while the postmaster is
+ stopped. In this case, obviously you can't use
+ <function>pg_start_backup</> or <function>pg_stop_backup</>, and
+ you will therefore be left to your own devices to keep track of which
+ backup dump is which and how far back the associated WAL files go.
+ It's generally better to follow the on-line backup procedure above.
+ </para>
+ </sect2>
+
+ <sect2 id="backup-pitr-recovery">
+ <title>Recovering with an On-line Backup</title>
+
+ <para>
+ Okay, the worst has happened and you need to recover from your backup.
+ Here is the procedure:
+ <orderedlist>
+ <listitem>
+ <para>
+ Stop the postmaster, if it's running, and clean out all existing files
+ under the cluster data directory and under the root directories of any
+ tablespaces you are using.
+ (If there are recent, unarchived WAL segment files in
+ <literal>pg_xlog/</> that you want to use during restore, move these aside
+ instead of removing them.)
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Restore the database files from your backup dump. Be careful that they
+ are restored with the right ownership (the database system user, not
+ root!) and with the right permissions. If you are using tablespaces,
+ you may want to verify that the symbolic links in <literal>pg_tblspc/</>
+ were correctly restored.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Remove any files present in <literal>pg_xlog/</>; these came from the
+ backup dump and are therefore probably obsolete rather than current.
+ If you didn't archive <literal>pg_xlog/</> at all, then re-create it,
+ and be sure to re-create the subdirectory
+ <literal>pg_xlog/archive_status/</> as well.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ If you had unarchived WAL segment files that you saved aside in step 1,
+ copy them into <literal>pg_xlog/</>. (It's best to copy them, not move
+ them back in, so that you still have the unmodified files if the worst
+ happens and you have to start over.)
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Create a recovery command file <literal>recovery.conf</> in the cluster
+ data directory, as discussed below. You may also want to temporarily
+ modify <literal>pg_hba.conf</> to prevent ordinary users from connecting
+ until you are sure the recovery has worked.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Start the postmaster. The postmaster will go into recovery mode and
+ proceed to read through the archived WAL files it needs. Upon completion
+ of the recovery process, the postmaster will rename
+ <literal>recovery.conf</> to <literal>recovery.done</> (to prevent
+ accidentally re-entering recovery mode in case of a crash later) and then
+ commence normal database operations.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Inspect the contents of the database to ensure you have recovered to
+ where you want to be. If not, return to step 1. If all is well,
+ let in your users by restoring <literal>pg_hba.conf</> to normal.
+ </para>
+ </listitem>
+ </orderedlist>
+ </para>
+
+ <para>
+ The key part of all this is to set up a recovery command file
+ that describes how you want to recover and how far the recovery
+ should run. You can use <literal>recovery.conf.sample</> (normally
+ installed in the installation <literal>share/</> directory) as a
+ prototype. The one thing that you absolutely must specify in
+ <literal>recovery.conf</> is the <literal>restore_command</>,
+ which tells how to get back archived WAL file segments. Like
+ the <literal>archive_command</>, this is a shell command string.
+ It may contain <literal>%f</>,
+ which is replaced by the name of the desired log file, and <literal>%p</>,
+ which is replaced by the absolute path to copy the log file to.
+ Write <literal>%%</> if you need to embed an actual <literal>%</>
+ character in the command. The simplest useful command is something
+ like
+<programlisting>
+restore_command = 'cp /mnt/server/archivedir/%f %p'
+</programlisting>
+ which will copy previously archived WAL segments from the directory
+ <literal>/mnt/server/archivedir</>. You could of course use something
+ much more complicated, perhaps even a shell script that requests the
+ operator to mount an appropriate tape.
+ </para>
+
+ <para>
+ It is important that the command return nonzero exit status on failure.
+ The command <emphasis>will</> be asked for log files that are not present
+ in the archive; it must return nonzero when so asked. This is not an
+ error condition. Be aware also that the basename of the <literal>%p</>
+ path will be different from <literal>%f</>; do not expect them to be
+ interchangeable.
+ </para>
+
+ <para>
+ WAL segments that cannot be found in the archive will be sought in
+ <literal>pg_xlog/</>; this allows use of recent un-archived segments.
+ However segments that are available from the archive will be used in
+ preference to files in <literal>pg_xlog/</>. The system will not
+ overwrite the existing contents of <literal>pg_xlog/</> when retrieving
+ archived files.
+ </para>
+
+ <para>
+ Normally, recovery will proceed through all available WAL segments,
+ thereby restoring the database to current time (or as close as we can
+ get given the available WAL segments). But if you want to recover to
+ some previous point in time (say, right before the junior DBA dropped your
+ main transaction table), just specify the required stopping point in
+ <literal>recovery.conf</>. You can specify the stop point either by
+ date/time or by transaction ID. As of this writing only the date/time
+ option is very usable, since there are no tools to help you identify
+ which transaction ID to use. Note that the stop point must be after
+ the ending time of the backup (ie, the time of
+ <function>pg_stop_backup</>). You cannot use a base backup to recover
+ to a time when that backup was still going on. (To recover to such
+ a time, you must go back to your previous base backup and roll forward
+ from there.)
+ </para>
+ </sect2>
+
+ <sect2 id="backup-timelines">
+ <title>Timelines</title>
+
+ <indexterm zone="backup">
+ <primary>timelines</primary>
+ </indexterm>
+
+ <para>
+ The ability to restore the database to a previous point in time creates
+ some complexities that are akin to science-fiction stories about time
+ travel and parallel universes. In the original history of the database,
+ perhaps you dropped a critical table at 5:15PM on Tuesday evening.
+ Unfazed, you get out your backup, restore to the point-in-time 5:14PM
+ Tuesday evening, and are up and running. In <emphasis>this</> history of
+ the database universe, you never dropped the table at all. But suppose
+ you later realize this wasn't such a great idea after all, and would like
+ to return to some later point in the original history? You won't be able
+ to if, while your database was up-and-running, it overwrote some of the
+ sequence of WAL segment files that led up to the time you now wish you
+ could get back to. So you really want to distinguish the series of
+ WAL records generated after you've done a point-in-time recovery from
+ those that were generated in the original database history.
+ </para>
+
+ <para>
+ To deal with these problems, <productname>PostgreSQL</> has a notion
+ of <firstterm>timelines</>. Each time you recover to a point-in-time
+ earlier than the end of the WAL sequence, a new timeline is created
+ to identify the series of WAL records generated after that recovery.
+ (If recovery proceeds all the way to the end of WAL, however, we do not
+ start a new timeline: we just extend the existing one.) The timeline
+ ID number is part of WAL segment file names, and so a new timeline does
+ not overwrite the WAL data generated by previous timelines. It is
+ in fact possible to archive many different timelines. While that might
+ seem like a useless feature, it's often a lifesaver. Consider the
+ situation where you aren't quite sure what point-in-time to recover to,
+ and so have to do several point-in-time recoveries by trial and error
+ until you find the best place to branch off from the old history. Without
+ timelines this process would soon generate an unmanageable mess. With
+ timelines, you can recover to <emphasis>any</> prior state, including
+ states in timeline branches that you later abandoned.
+ </para>
+
+ <para>
+ Each time a new timeline is created, <productname>PostgreSQL</> creates
+ a <quote>timeline history</> file that shows which timeline it branched
+ off from and when. These history files are necessary to allow the system
+ to pick the right WAL segment files when recovering from an archive that
+ contains multiple timelines. Therefore, they are archived into the WAL
+ archive area just like WAL segment files. The history files are just
+ small text files, so it's cheap and appropriate to keep them around
+ indefinitely (unlike the segment files which are large). You can, if
+ you like, add comments to a history file to make your own notes about
+ how and why this particular timeline came to be. Such comments will be
+ especially valuable when you have a thicket of different timelines as
+ a result of experimentation.
+ </para>
+
+ <para>
+ The default behavior of recovery is to recover along the same timeline
+ that was current when the base backup was taken. If you want to recover
+ into some child timeline (that is, you want to return to some state that
+ was itself generated after a recovery attempt), you need to specify the
+ target timeline in <literal>recovery.conf</>. You cannot recover into
+ timelines that branched off earlier than the base backup.
+ </para>
+ </sect2>
</sect1>
<sect1 id="migration">