Some desultory copy-editing on the backup/restore docs.
parent 812bf6984b
commit 8e179aeb9e
@ -1,4 +1,4 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.110 2007/12/15 15:41:02 adunstan Exp $ -->
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.111 2007/12/17 04:30:05 tgl Exp $ -->
|
||||
|
||||
<chapter id="backup">
|
||||
<title>Backup and Restore</title>
|
||||
@ -8,7 +8,7 @@
|
||||
<para>
|
||||
As with everything that contains valuable data, <productname>PostgreSQL</>
|
||||
databases should be backed up regularly. While the procedure is
|
||||
essentially simple, it is important to have a basic understanding of
|
||||
essentially simple, it is important to have a clear understanding of
|
||||
the underlying techniques and assumptions.
|
||||
</para>
|
||||
|
||||
@ -21,6 +21,7 @@
|
||||
<listitem><para>Continuous archiving</para></listitem>
|
||||
</itemizedlist>
|
||||
Each has its own strengths and weaknesses.
|
||||
Each is discussed in turn below.
|
||||
</para>
|
||||
|
||||
<sect1 id="backup-dump">
|
||||
@ -75,11 +76,11 @@ pg_dump <replaceable class="parameter">dbname</replaceable> > <replaceable cl
|
||||
|
||||
<para>
|
||||
Dumps created by <application>pg_dump</> are internally consistent,
|
||||
that is, updates to the database while <application>pg_dump</> is
|
||||
running will not be in the dump. <application>pg_dump</> does not
|
||||
that is, the dump represents a snapshot of the database as of the time
|
||||
<application>pg_dump</> begins running. <application>pg_dump</> does not
|
||||
block other operations on the database while it is working.
|
||||
(Exceptions are those operations that need to operate with an
|
||||
exclusive lock, such as <command>VACUUM FULL</command>.)
|
||||
exclusive lock, such as most forms of <command>ALTER TABLE</command>.)
|
||||
</para>
|
||||
|
||||
<important>
|
||||
@ -109,7 +110,7 @@ psql <replaceable class="parameter">dbname</replaceable> < <replaceable class
|
||||
before executing <application>psql</> (e.g., with
|
||||
<literal>createdb -T template0 <replaceable
|
||||
class="parameter">dbname</></literal>). <application>psql</>
|
||||
supports similar options to <application>pg_dump</> for specifying
|
||||
supports options similar to <application>pg_dump</>'s for specifying
|
||||
the database server to connect to and the user name to use. See
|
||||
the <xref linkend="app-psql"> reference page for more information.
|
||||
</para>
|
||||
@ -131,8 +132,8 @@ psql <replaceable class="parameter">dbname</replaceable> < <replaceable class
|
||||
<programlisting>
|
||||
\set ON_ERROR_STOP
|
||||
</programlisting>
|
||||
Either way, you will only have a partially restored
|
||||
dump. Alternatively, you can specify that the whole dump should be
|
||||
Either way, you will have an only partially restored database.
|
||||
Alternatively, you can specify that the whole dump should be
|
||||
restored as a single transaction, so the restore is either fully
|
||||
completed or fully rolled back. This mode can be specified by
|
||||
passing the <option>-1</> or <option>--single-transaction</>
|
||||
@ -146,7 +147,7 @@ psql <replaceable class="parameter">dbname</replaceable> < <replaceable class
|
||||
<para>
|
||||
The ability of <application>pg_dump</> and <application>psql</> to
|
||||
write to or read from pipes makes it possible to dump a database
|
||||
directly from one server to another; for example:
|
||||
directly from one server to another, for example:
|
||||
<programlisting>
|
||||
pg_dump -h <replaceable>host1</> <replaceable>dbname</> | psql -h <replaceable>host2</> <replaceable>dbname</>
|
||||
</programlisting>
|
||||
@ -156,7 +157,7 @@ pg_dump -h <replaceable>host1</> <replaceable>dbname</> | psql -h <replaceable>h
|
||||
<para>
|
||||
The dumps produced by <application>pg_dump</> are relative to
|
||||
<literal>template0</>. This means that any languages, procedures,
|
||||
etc. added to <literal>template1</> will also be dumped by
|
||||
etc. added via <literal>template1</> will also be dumped by
|
||||
<application>pg_dump</>. As a result, when restoring, if you are
|
||||
using a customized <literal>template1</>, you must create the
|
||||
empty database from <literal>template0</>, as in the example
|
||||
@ -196,13 +197,21 @@ pg_dumpall > <replaceable>outfile</>
|
||||
psql -f <replaceable class="parameter">infile</replaceable> postgres
|
||||
</synopsis>
|
||||
(Actually, you can specify any existing database name to start from,
|
||||
but if you are reloading in an empty cluster then <literal>postgres</>
|
||||
should generally be used.) It is always necessary to have
|
||||
but if you are reloading into an empty cluster then <literal>postgres</>
|
||||
should usually be used.) It is always necessary to have
|
||||
database superuser access when restoring a <application>pg_dumpall</>
|
||||
dump, as that is required to restore the role and tablespace information.
|
||||
If you use tablespaces, be careful that the tablespace paths in the
|
||||
dump are appropriate for the new installation.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<application>pg_dumpall</> works by emitting commands to re-create
|
||||
roles, tablespaces, and empty databases, then invoking
|
||||
<application>pg_dump</> for each database. This means that while
|
||||
each database will be internally consistent, the snapshots of
|
||||
different databases might not be exactly in sync.
|
||||
</para>
|
||||
</sect2>
|
||||
|
||||
<sect2 id="backup-dump-large">
|
||||
@ -215,6 +224,7 @@ psql -f <replaceable class="parameter">infile</replaceable> postgres
|
||||
be larger than the maximum size allowed by your system. Since
|
||||
<application>pg_dump</> can write to the standard output, you can
|
||||
use standard Unix tools to work around this possible problem.
|
||||
There are several ways to do it:
|
||||
</para>
|
||||
|
||||
<formalpara>
|
||||
@ -230,7 +240,6 @@ pg_dump <replaceable class="parameter">dbname</replaceable> | gzip > <replace
|
||||
Reload with:
|
||||
|
||||
<programlisting>
|
||||
createdb <replaceable class="parameter">dbname</replaceable>
|
||||
gunzip -c <replaceable class="parameter">filename</replaceable>.gz | psql <replaceable class="parameter">dbname</replaceable>
|
||||
</programlisting>
|
||||
|
||||
@ -257,14 +266,13 @@ pg_dump <replaceable class="parameter">dbname</replaceable> | split -b 1m - <rep
|
||||
Reload with:
|
||||
|
||||
<programlisting>
|
||||
createdb <replaceable class="parameter">dbname</replaceable>
|
||||
cat <replaceable class="parameter">filename</replaceable>* | psql <replaceable class="parameter">dbname</replaceable>
|
||||
</programlisting>
|
||||
</para>
|
||||
</formalpara>
|
||||
|
||||
<formalpara>
|
||||
<title>Use the custom dump format.</title>
|
||||
<title>Use <application>pg_dump</>'s custom dump format.</title>
|
||||
<para>
|
||||
If <productname>PostgreSQL</productname> was built on a system with the
|
||||
<application>zlib</> compression library installed, the custom dump
|
||||
@ -278,12 +286,22 @@ pg_dump -Fc <replaceable class="parameter">dbname</replaceable> > <replaceabl
|
||||
</programlisting>
|
||||
|
||||
A custom-format dump is not a script for <application>psql</>, but
|
||||
instead must be restored with <application>pg_restore</>.
|
||||
instead must be restored with <application>pg_restore</>, for example:
|
||||
|
||||
<programlisting>
|
||||
pg_restore -d <replaceable class="parameter">dbname</replaceable> <replaceable class="parameter">filename</replaceable>
|
||||
</programlisting>
|
||||
|
||||
See the <xref linkend="app-pgdump"> and <xref
|
||||
linkend="app-pgrestore"> reference pages for details.
|
||||
</para>
|
||||
</formalpara>
|
||||
|
||||
<para>
|
||||
For very large databases, you might need to combine <command>split</>
|
||||
with one of the other two approaches.
|
||||
</para>
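As a rough sketch of such a combination, the pipeline below compresses a dump, splits the compressed stream into pieces, and then reverses both steps on reload. The dump_source function is a hypothetical stand-in for "pg_dump dbname" so that the round trip can be tried without a server; the 64k piece size, the dbname.gz. prefix, and the temporary directory are likewise assumptions made only for illustration.

```shell
#!/bin/sh
# Hypothetical stand-in for "pg_dump dbname": any command that writes a
# plain-text dump to standard output.
dump_source() {
    awk 'BEGIN { for (i = 1; i <= 50000; i++) print "INSERT INTO t VALUES (" i ");" }'
}

workdir=$(mktemp -d)

# Dump, compress, then split the compressed stream into 64 kB pieces.
# With a real database this would read:
#   pg_dump dbname | gzip | split -b 1m - filename.gz.
dump_source | gzip | split -b 64k - "$workdir/dbname.gz."

# Reload: concatenate the pieces in name order, decompress, and hand the
# result to the consumer. With a real database this would read:
#   cat filename.gz.* | gunzip | psql dbname
cat "$workdir"/dbname.gz.* | gunzip > "$workdir/restored.sql"

# Checksums of the original and the restored stream, for comparison.
original_sum=$(dump_source | cksum)
restored_sum=$(cksum < "$workdir/restored.sql")
```

Compressing before splitting keeps each piece under the size limit while storing less data overall; the shell expands the wildcard in name order, so the pieces are reassembled in the order split wrote them.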
|
||||
|
||||
</sect2>
|
||||
</sect1>
|
||||
|
||||
@ -314,9 +332,10 @@ tar -cf backup.tar /usr/local/pgsql/data
|
||||
The database server <emphasis>must</> be shut down in order to
|
||||
get a usable backup. Half-way measures such as disallowing all
|
||||
connections will <emphasis>not</emphasis> work
|
||||
(mainly because <command>tar</command> and similar tools do not take an
|
||||
atomic snapshot of the state of the file system at a point in
|
||||
time). Information about stopping the server can be found in
|
||||
(in part because <command>tar</command> and similar tools do not take
|
||||
an atomic snapshot of the state of the file system,
|
||||
but also because of internal buffering within the server).
|
||||
Information about stopping the server can be found in
|
||||
<xref linkend="server-shutdown">. Needless to say, you
|
||||
also need to shut down the server before restoring the data.
|
||||
</para>
|
||||
@ -336,7 +355,7 @@ tar -cf backup.tar /usr/local/pgsql/data
|
||||
table and the associated <filename>pg_clog</filename> data
|
||||
because that would render all other tables in the database
|
||||
cluster useless. So file system backups only work for complete
|
||||
restoration of an entire database cluster.
|
||||
backup and restoration of an entire database cluster.
|
||||
</para>
|
||||
</listitem>
|
||||
</orderedlist>
|
||||
@ -354,18 +373,18 @@ tar -cf backup.tar /usr/local/pgsql/data
|
||||
However, a backup created in this way saves
|
||||
the database files in a state where the database server was not
|
||||
properly shut down; therefore, when you start the database server
|
||||
on the backed-up data, it will think the server had crashed
|
||||
and replay the WAL log. This is not a problem, just be aware of
|
||||
on the backed-up data, it will think the previous server instance had
|
||||
crashed and replay the WAL log. This is not a problem; just be aware of
|
||||
it (and be sure to include the WAL files in your backup).
|
||||
</para>
|
||||
|
||||
<para>
|
||||
If your database is spread across multiple file systems, there might not
|
||||
be any way to obtain exactly-simultaneous frozen snapshots of all
|
||||
If your database is spread across multiple file systems, there might not
|
||||
be any way to obtain exactly-simultaneous frozen snapshots of all
|
||||
the volumes. For example, if your data files and WAL log are on different
|
||||
disks, or if tablespaces are on different file systems, it might
|
||||
not be possible to use snapshot backup because the snapshots must be
|
||||
simultaneous.
|
||||
not be possible to use snapshot backup because the snapshots
|
||||
<emphasis>must</> be simultaneous.
|
||||
Read your file system documentation very carefully before trusting
|
||||
to the consistent-snapshot technique in such situations. The safest
|
||||
approach is to shut down the database server for long enough to
|
||||
@ -472,10 +491,10 @@ tar -cf backup.tar /usr/local/pgsql/data
|
||||
</para>
|
||||
|
||||
<para>
|
||||
To recover successfully using continuous archiving (also called "online
|
||||
backup" by many database vendors), you need a continuous
|
||||
To recover successfully using continuous archiving (also called
|
||||
<quote>online backup</> by many database vendors), you need a continuous
|
||||
sequence of archived WAL files that extends back at least as far as the
|
||||
start time of your backup. So to get started, you should setup and test
|
||||
start time of your backup. So to get started, you should set up and test
|
||||
your procedure for archiving WAL files <emphasis>before</> you take your
|
||||
first base backup. Accordingly, we first discuss the mechanics of
|
||||
archiving WAL files.
|
||||
@ -488,8 +507,8 @@ tar -cf backup.tar /usr/local/pgsql/data
|
||||
In an abstract sense, a running <productname>PostgreSQL</> system
|
||||
produces an indefinitely long sequence of WAL records. The system
|
||||
physically divides this sequence into WAL <firstterm>segment
|
||||
files</>, which are normally 16MB apiece (although the size can be
|
||||
altered when building <productname>PostgreSQL</>). The segment
|
||||
files</>, which are normally 16MB apiece (although the segment size
|
||||
can be altered when building <productname>PostgreSQL</>). The segment
|
||||
files are given numeric names that reflect their position in the
|
||||
abstract WAL sequence. When not using WAL archiving, the system
|
||||
normally creates just a few segment files and then
|
||||
@ -500,7 +519,7 @@ tar -cf backup.tar /usr/local/pgsql/data
|
||||
</para>
|
||||
|
||||
<para>
|
||||
When archiving WAL data, we want to capture the contents of each segment
|
||||
When archiving WAL data, we need to capture the contents of each segment
|
||||
file once it is filled, and save that data somewhere before the segment
|
||||
file is recycled for reuse. Depending on the application and the
|
||||
available hardware, there could be many different ways of <quote>saving
|
||||
@ -509,7 +528,7 @@ tar -cf backup.tar /usr/local/pgsql/data
|
||||
you have a way of identifying the original name of each file), or batch
|
||||
them together and burn them onto CDs, or something else entirely. To
|
||||
provide the database administrator with as much flexibility as possible,
|
||||
<productname>PostgreSQL</> tries not to make any assumptions about how
|
||||
<productname>PostgreSQL</> tries not to make any assumptions about how
|
||||
the archiving will be done. Instead, <productname>PostgreSQL</> lets
|
||||
the administrator specify a shell command to be executed to copy a
|
||||
completed segment file to wherever it needs to go. The command could be
|
||||
@ -527,7 +546,7 @@ tar -cf backup.tar /usr/local/pgsql/data
|
||||
In <varname>archive_command</>,
|
||||
any <literal>%p</> is replaced by the path name of the file to
|
||||
archive, while any <literal>%f</> is replaced by the file name only.
|
||||
(The path name is relative to the working directory of the server,
|
||||
(The path name is relative to the current working directory,
|
||||
i.e., the cluster's data directory.)
|
||||
Write <literal>%%</> if you need to embed an actual <literal>%</>
|
||||
character in the command. The simplest useful command is something
|
||||
@ -536,7 +555,7 @@ tar -cf backup.tar /usr/local/pgsql/data
|
||||
archive_command = 'cp -i %p /mnt/server/archivedir/%f </dev/null'
|
||||
</programlisting>
|
||||
which will copy archivable WAL segments to the directory
|
||||
<filename>/mnt/server/archivedir</>. (This is an example, not a
|
||||
<filename>/mnt/server/archivedir</>. (This is an example, not a
|
||||
recommendation, and might not work on all platforms.)
|
||||
</para>
|
||||
|
||||
@ -580,14 +599,18 @@ archive_command = 'test ! -f .../%f && cp %p .../%f'
|
||||
|
||||
<para>
|
||||
While designing your archiving setup, consider what will happen if
|
||||
the archive command fails repeatedly because some aspect requires
|
||||
the archive command fails repeatedly because some aspect requires
|
||||
operator intervention or the archive runs out of space. For example, this
|
||||
could occur if you write to tape without an autochanger; when the tape
|
||||
could occur if you write to tape without an autochanger; when the tape
|
||||
fills, nothing further can be archived until the tape is swapped.
|
||||
You should ensure that any error condition or request to a human operator
|
||||
is reported appropriately so that the situation can be
|
||||
resolved relatively quickly. The <filename>pg_xlog/</> directory will
|
||||
is reported appropriately so that the situation can be
|
||||
resolved reasonably quickly. The <filename>pg_xlog/</> directory will
|
||||
continue to fill with WAL segment files until the situation is resolved.
|
||||
(If the file system containing <filename>pg_xlog/</> fills up,
|
||||
<productname>PostgreSQL</> will do a PANIC shutdown. No prior
|
||||
transactions will be lost, but the database will be unavailable until
|
||||
you free some space.)
|
||||
</para>
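One way to make such failures visible is to wrap the copy in a small script that refuses to overwrite an existing archive file, says why it failed on standard error, and returns a nonzero status so that the segment is kept and retried. The following is only a sketch: the archive_wal function, its three arguments, and the temporary-name convention are hypothetical, and a real installation would add whatever operator notification it needs.

```shell
#!/bin/sh
# archive_wal SOURCE_PATH FILE_NAME ARCHIVE_DIR
# Hypothetical helper. SOURCE_PATH and FILE_NAME play the roles of %p and %f
# in archive_command; ARCHIVE_DIR is wherever the archive lives.
archive_wal() {
    src=$1
    name=$2
    archive_dir=$3

    # Never overwrite a file that is already in the archive.
    if [ -f "$archive_dir/$name" ]; then
        echo "archive_wal: $archive_dir/$name already exists" >&2
        return 1
    fi

    # Copy under a temporary name first, so that a partial copy is never
    # mistaken for a complete segment.
    if ! cp "$src" "$archive_dir/$name.tmp"; then
        echo "archive_wal: copying $src failed (archive full or unavailable?)" >&2
        rm -f "$archive_dir/$name.tmp"
        return 1
    fi

    # Rename into place; the function returns the status of this last step.
    mv "$archive_dir/$name.tmp" "$archive_dir/$name"
}
```

Saved as a script that ends by calling archive_wal "$1" "$2" with a fixed archive directory, this could be referenced from archive_command with %p and %f as its arguments; the script name and location are left to the administrator.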
|
||||
|
||||
<para>
|
||||
@ -625,7 +648,7 @@ archive_command = 'test ! -f .../%f && cp %p .../%f'
|
||||
|
||||
<para>
|
||||
The archive command is only invoked on completed WAL segments. Hence,
|
||||
if your server generates only little WAL traffic (or has slack periods
|
||||
if your server generates little WAL traffic (or has slack periods
|
||||
where it does so), there could be a long delay between the completion
|
||||
of a transaction and its safe recording in archive storage. To put
|
||||
a limit on how old unarchived data can be, you can set
|
||||
@ -653,9 +676,12 @@ archive_command = 'test ! -f .../%f && cp %p .../%f'
|
||||
of one of these statements, WAL would not contain enough information
|
||||
for archive recovery. (Crash recovery is unaffected.) For
|
||||
this reason, <varname>archive_mode</> can only be changed at server
|
||||
start. (<varname>archive_command</> can be changed with a
|
||||
configuration file reload, and setting it to <literal>''</> does
|
||||
prevent archiving.)
|
||||
start. However, <varname>archive_command</> can be changed with a
|
||||
configuration file reload. If you wish to temporarily stop archiving,
|
||||
one way to do it is to set <varname>archive_command</> to the empty
|
||||
string (<literal>''</>).
|
||||
This will cause WAL files to accumulate in <filename>pg_xlog/</> until a
|
||||
working <varname>archive_command</> is re-established.
|
||||
</para>
|
||||
</sect2>
|
||||
|
||||
@ -685,7 +711,7 @@ SELECT pg_start_backup('label');
|
||||
</para>
|
||||
|
||||
<para>
|
||||
It does not matter which database within the cluster you connect to to
|
||||
It does not matter which database within the cluster you connect to in order to
|
||||
issue this command. You can ignore the result returned by the function;
|
||||
but if it reports an error, deal with that before proceeding.
|
||||
</para>
|
||||
@ -730,12 +756,12 @@ SELECT pg_stop_backup();
|
||||
<para>
|
||||
Once the WAL segment files used during the backup are archived, you are
|
||||
done. The file identified by <function>pg_stop_backup</>'s result is
|
||||
the last segment that needs to be archived to complete the backup.
|
||||
the last segment that needs to be archived to complete the backup.
|
||||
Archival of these files will happen automatically, since you have
|
||||
already configured <varname>archive_command</>. In many cases, this
|
||||
happens fairly quickly, but you are advised to monitor your archival
|
||||
system to ensure this has taken place so that you can be certain you
|
||||
have a complete backup.
|
||||
have a complete backup.
|
||||
</para>
|
||||
</listitem>
|
||||
</orderedlist>
|
||||
@ -753,7 +779,7 @@ SELECT pg_stop_backup();
|
||||
GNU <application>tar</> return an error code indistinguishable from
|
||||
a fatal error if a file was truncated while <application>tar</> was
|
||||
copying it. Fortunately, GNU <application>tar</> versions 1.16 and
|
||||
later exits with <literal>1</> if a file was changed during the backup,
|
||||
later exit with <literal>1</> if a file was changed during the backup,
|
||||
and <literal>2</> for other errors.
|
||||
</para>
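A backup script can use this distinction to accept the "file changed" status while still failing on real errors. The sketch below assumes GNU tar 1.16 or later and the exit codes just described; the interpret_tar_status function and its result strings are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# interpret_tar_status STATUS
# Hypothetical helper: maps a GNU tar (1.16 or later) exit status to a
# decision for a base-backup script.
interpret_tar_status() {
    case $1 in
        0) echo "ok" ;;
        1) echo "ok-files-changed" ;;   # a file changed during the backup
        *) echo "failed" ;;             # 2 or anything else: a real error
    esac
}

# Typical use, with the paths from the earlier examples:
#   tar -cf backup.tar /usr/local/pgsql/data
#   result=$(interpret_tar_status $?)
#   [ "$result" != "failed" ] || exit 1
```

With older versions of GNU tar, or with other archiving tools, the exit codes may not separate these cases, so the mapping should be checked against the tool actually in use.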
|
||||
|
||||
@ -763,7 +789,7 @@ SELECT pg_stop_backup();
|
||||
nor between the end of the backup and <function>pg_stop_backup</>; a
|
||||
few minutes' delay won't hurt anything. (However, if you normally run the
|
||||
server with <varname>full_page_writes</> disabled, you might notice a drop
|
||||
in performance between <function>pg_start_backup</> and
|
||||
in performance between <function>pg_start_backup</> and
|
||||
<function>pg_stop_backup</>, since <varname>full_page_writes</> is
|
||||
effectively forced on during backup mode.) You must ensure that these
|
||||
steps are carried out in sequence without any possible
|
||||
@ -800,7 +826,7 @@ SELECT pg_stop_backup();
|
||||
<literal>0000000100001234000055CD</> the backup history file will be
|
||||
named something like
|
||||
<literal>0000000100001234000055CD.007C9330.backup</>. (The second
|
||||
number in the file name stands for an exact position within the WAL
|
||||
part of the file name stands for an exact position within the WAL
|
||||
file, and can ordinarily be ignored.) Once you have safely archived
|
||||
the file system backup and the WAL segment files used during the
|
||||
backup (as specified in the backup history file), all archived WAL
|
||||
@ -814,7 +840,7 @@ SELECT pg_stop_backup();
|
||||
The backup history file is just a small text file. It contains the
|
||||
label string you gave to <function>pg_start_backup</>, as well as
|
||||
the starting and ending times and WAL segments of the backup.
|
||||
If you used the label to identify where the associated dump file is kept,
|
||||
If you used the label to identify where the associated dump file is kept,
|
||||
then the archived history file is enough to tell you which dump file to
|
||||
restore, should you need to do so.
|
||||
</para>
|
||||
@ -867,10 +893,10 @@ SELECT pg_stop_backup();
|
||||
<listitem>
|
||||
<para>
|
||||
If you have the space to do so,
|
||||
copy the whole cluster data directory and any tablespaces to a temporary
|
||||
copy the whole cluster data directory and any tablespaces to a temporary
|
||||
location in case you need them later. Note that this precaution will
|
||||
require that you have enough free space on your system to hold two
|
||||
copies of your existing database. If you do not have enough space,
|
||||
copies of your existing database. If you do not have enough space,
|
||||
you need at the least to copy the contents of the <filename>pg_xlog</>
|
||||
subdirectory of the cluster data directory, as it might contain logs which
|
||||
were not archived before the system went down.
|
||||
@ -886,7 +912,8 @@ SELECT pg_stop_backup();
|
||||
<para>
|
||||
Restore the database files from your backup dump. Be careful that they
|
||||
are restored with the right ownership (the database system user, not
|
||||
root!) and with the right permissions. If you are using tablespaces,
|
||||
<literal>root</>!) and with the right permissions. If you are using
|
||||
tablespaces,
|
||||
you should verify that the symbolic links in <filename>pg_tblspc/</>
|
||||
were correctly restored.
|
||||
</para>
|
||||
@ -896,8 +923,10 @@ SELECT pg_stop_backup();
|
||||
Remove any files present in <filename>pg_xlog/</>; these came from the
|
||||
backup dump and are therefore probably obsolete rather than current.
|
||||
If you didn't archive <filename>pg_xlog/</> at all, then recreate it,
|
||||
and be sure to recreate the subdirectory
|
||||
<filename>pg_xlog/archive_status/</> as well.
|
||||
being careful to ensure that you re-establish it as a symbolic link
|
||||
if you had it set up that way before.
|
||||
Be sure to recreate the subdirectory
|
||||
<filename>pg_xlog/archive_status/</> as well.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
@ -912,7 +941,7 @@ SELECT pg_stop_backup();
|
||||
<para>
|
||||
Create a recovery command file <filename>recovery.conf</> in the cluster
|
||||
data directory (see <xref linkend="recovery-config-settings">). You might
|
||||
also want to temporarily modify <filename>pg_hba.conf</> to prevent
|
||||
also want to temporarily modify <filename>pg_hba.conf</> to prevent
|
||||
ordinary users from connecting until you are sure the recovery has worked.
|
||||
</para>
|
||||
</listitem>
|
||||
@ -939,7 +968,7 @@ SELECT pg_stop_backup();
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The key part of all this is to setup a recovery command file that
|
||||
The key part of all this is to set up a recovery command file that
|
||||
describes how you want to recover and how far the recovery should
|
||||
run. You can use <filename>recovery.conf.sample</> (normally
|
||||
installed in the installation <filename>share/</> directory) as a
|
||||
@ -950,7 +979,7 @@ SELECT pg_stop_backup();
|
||||
a shell command string. It can contain <literal>%f</>, which is
|
||||
replaced by the name of the desired log file, and <literal>%p</>,
|
||||
which is replaced by the path name to copy the log file to.
|
||||
(The path name is relative to the working directory of the server,
|
||||
(The path name is relative to the current working directory,
|
||||
i.e., the cluster's data directory.)
|
||||
Write <literal>%%</> if you need to embed an actual <literal>%</>
|
||||
character in the command. The simplest useful command is
|
||||
@ -986,29 +1015,29 @@ restore_command = 'cp /mnt/server/archivedir/%f %p'
|
||||
Normally, recovery will proceed through all available WAL segments,
|
||||
thereby restoring the database to the current point in time (or as
|
||||
close as we can get given the available WAL segments). So a normal
|
||||
recovery will end with a "file not found" message, the exact text
|
||||
of the error message depending upon your choice of
|
||||
recovery will end with a <quote>file not found</> message, the exact text
|
||||
of the error message depending upon your choice of
|
||||
<varname>restore_command</>. You may also see an error message
|
||||
at the start of recovery for a file named something like
|
||||
<filename>00000001.history</>. This is also normal and does not
|
||||
indicate a problem in simple recovery situations. See
|
||||
indicate a problem in simple recovery situations. See
|
||||
<xref linkend="backup-timelines"> for discussion.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
If you want to recover to some previous point in time (say, right before
|
||||
the junior DBA dropped your main transaction table), just specify the
|
||||
required stopping point in <filename>recovery.conf</>. You can specify
|
||||
the stop point, known as the <quote>recovery target</>, either by
|
||||
date/time or by completion of a specific transaction ID. As of this
|
||||
writing only the date/time option is very usable, since there are no tools
|
||||
the junior DBA dropped your main transaction table), just specify the
|
||||
required stopping point in <filename>recovery.conf</>. You can specify
|
||||
the stop point, known as the <quote>recovery target</>, either by
|
||||
date/time or by completion of a specific transaction ID. As of this
|
||||
writing, only the date/time option is very usable, since there are no tools
|
||||
to help you identify with any accuracy which transaction ID to use.
|
||||
</para>
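For illustration, a recovery command file for a date/time target might be written as in the sketch below. The write_recovery_conf function is a hypothetical helper, the timestamp is an invented example value, and the restore_command is the simple cp form used elsewhere in this chapter; in a real recovery the target directory would be the cluster's data directory.

```shell
#!/bin/sh
# write_recovery_conf TARGET_DIR
# Hypothetical helper: writes a minimal recovery.conf that stops recovery
# at a given date/time. The timestamp is an example value only.
write_recovery_conf() {
    target_dir=$1
    cat > "$target_dir/recovery.conf" <<'EOF'
restore_command = 'cp /mnt/server/archivedir/%f %p'
recovery_target_time = '2007-12-18 17:14:00'
EOF
}
```

The here-document delimiter is quoted so that the shell leaves %f and %p untouched; the server substitutes them when it runs the command.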
|
||||
|
||||
<note>
|
||||
<para>
|
||||
The stop point must be after the ending time of the base backup (the
|
||||
time of <function>pg_stop_backup</>). You cannot use a base backup
|
||||
The stop point must be after the ending time of the base backup, i.e.,
|
||||
the time of <function>pg_stop_backup</>. You cannot use a base backup
|
||||
to recover to a time when that backup was still going on. (To
|
||||
recover to such a time, you must go back to your previous base backup
|
||||
and roll forward from there.)
|
||||
@ -1018,7 +1047,7 @@ restore_command = 'cp /mnt/server/archivedir/%f %p'
|
||||
<para>
|
||||
If recovery finds a corruption in the WAL data then recovery will
|
||||
halt at that point and the server will not start. In such a case the
|
||||
recovery process could be re-run from the beginning, specifying a
|
||||
recovery process could be re-run from the beginning, specifying a
|
||||
<quote>recovery target</> before the point of corruption so that recovery
|
||||
can complete normally.
|
||||
If recovery fails for an external reason, such as a system crash or
|
||||
@ -1053,15 +1082,14 @@ restore_command = 'cp /mnt/server/archivedir/%f %p'
|
||||
replaced by the name of the file to retrieve from the archive,
|
||||
and any <literal>%p</> is replaced by the path name to copy
|
||||
it to on the server.
|
||||
(The path name is relative to the working directory of the server,
|
||||
(The path name is relative to the current working directory,
|
||||
i.e., the cluster's data directory.)
|
||||
Any <literal>%r</> is replaced by the name of the file containing the
|
||||
last valid restart point. That is the earliest file that must be kept
|
||||
to allow a restore to be restartable, so this information can be used
|
||||
to truncate the archive to just the minimum required to support
|
||||
restart of the current restore. <literal>%r</> would only be used in a
|
||||
warm-standby configuration (see <xref
|
||||
linkend="warm-standby-planning">).
|
||||
warm-standby configuration (see <xref linkend="warm-standby">).
|
||||
Write <literal>%%</> to embed an actual <literal>%</> character
|
||||
in the command.
|
||||
</para>
|
||||
@ -1079,7 +1107,7 @@ restore_command = 'copy /mnt/server/archivedir/%f "%p"' # Windows
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry id="recovery-target-time" xreflabel="recovery_target_time">
|
||||
<term><varname>recovery_target_time</varname>
|
||||
<term><varname>recovery_target_time</varname>
|
||||
(<type>timestamp</type>)
|
||||
</term>
|
||||
<listitem>
|
||||
@ -1089,7 +1117,7 @@ restore_command = 'copy /mnt/server/archivedir/%f "%p"' # Windows
|
||||
At most one of <varname>recovery_target_time</> and
|
||||
<xref linkend="recovery-target-xid"> can be specified.
|
||||
The default is to recover to the end of the WAL log.
|
||||
The precise stopping point is also influenced by
|
||||
The precise stopping point is also influenced by
|
||||
<xref linkend="recovery-target-inclusive">.
|
||||
</para>
|
||||
</listitem>
|
||||
@ -1100,29 +1128,29 @@ restore_command = 'copy /mnt/server/archivedir/%f "%p"' # Windows
|
||||
<listitem>
|
||||
<para>
|
||||
This parameter specifies the transaction ID up to which recovery
|
||||
will proceed. Keep in mind
|
||||
that while transaction IDs are assigned sequentially at transaction
|
||||
will proceed. Keep in mind
|
||||
that while transaction IDs are assigned sequentially at transaction
|
||||
start, transactions can complete in a different numeric order.
|
||||
The transactions that will be recovered are those that committed
|
||||
before (and optionally including) the specified one.
|
||||
At most one of <varname>recovery_target_xid</> and
|
||||
<xref linkend="recovery-target-time"> can be specified.
|
||||
The default is to recover to the end of the WAL log.
|
||||
The precise stopping point is also influenced by
|
||||
The precise stopping point is also influenced by
|
||||
<xref linkend="recovery-target-inclusive">.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry id="recovery-target-inclusive"
|
||||
<varlistentry id="recovery-target-inclusive"
|
||||
xreflabel="recovery_target_inclusive">
|
||||
<term><varname>recovery_target_inclusive</varname>
|
||||
<term><varname>recovery_target_inclusive</varname>
|
||||
(<type>boolean</type>)
|
||||
</term>
|
||||
<listitem>
|
||||
<para>
|
||||
Specifies whether we stop just after the specified recovery target
|
||||
(<literal>true</literal>), or just before the recovery target
|
||||
(<literal>true</literal>), or just before the recovery target
|
||||
(<literal>false</literal>).
|
||||
Applies to both <xref linkend="recovery-target-time">
|
||||
and <xref linkend="recovery-target-xid">, whichever one is
|
||||
@ -1133,9 +1161,9 @@ restore_command = 'copy /mnt/server/archivedir/%f "%p"' # Windows
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry id="recovery-target-timeline"
|
||||
<varlistentry id="recovery-target-timeline"
|
||||
xreflabel="recovery_target_timeline">
|
||||
<term><varname>recovery_target_timeline</varname>
|
||||
<term><varname>recovery_target_timeline</varname>
|
||||
(<type>string</type>)
|
||||
</term>
|
||||
<listitem>
|
||||
@@ -1150,14 +1178,14 @@ restore_command = 'copy /mnt/server/archivedir/%f "%p"' # Windows
 </listitem>
 </varlistentry>

-<varlistentry id="log-restartpoints"
+<varlistentry id="log-restartpoints"
 xreflabel="log_restartpoints">
-<term><varname>log_restartpoints</varname>
+<term><varname>log_restartpoints</varname>
 (<type>boolean</type>)
 </term>
 <listitem>
 <para>
-Specifies whether to log each restart point as it occurs. This
+Specifies whether to log each restart point as it occurs. This
 can be helpful to track the progress of a long recovery.
 Default is <literal>false</>.
 </para>
@@ -1181,12 +1209,14 @@ restore_command = 'copy /mnt/server/archivedir/%f "%p"' # Windows
 The ability to restore the database to a previous point in time creates
 some complexities that are akin to science-fiction stories about time
 travel and parallel universes. In the original history of the database,
-perhaps you dropped a critical table at 5:15PM on Tuesday evening.
+perhaps you dropped a critical table at 5:15PM on Tuesday evening, but
+didn't realize your mistake until Wednesday noon.
 Unfazed, you get out your backup, restore to the point-in-time 5:14PM
 Tuesday evening, and are up and running. In <emphasis>this</> history of
 the database universe, you never dropped the table at all. But suppose
 you later realize this wasn't such a great idea after all, and would like
-to return to some later point in the original history. You won't be able
+to return to sometime Wednesday morning in the original history.
+You won't be able
 to if, while your database was up-and-running, it overwrote some of the
 sequence of WAL segment files that led up to the time you now wish you
 could get back to. So you really want to distinguish the series of
@@ -1240,37 +1270,48 @@ restore_command = 'copy /mnt/server/archivedir/%f "%p"' # Windows
 <title>Tips and Examples</title>

 <para>
-Some examples of configuring Continuous Archiving are given here.
+Some tips for configuring continuous archiving are given here.
 </para>

 <sect3 id="backup-standalone">
-<title>Recovery Settings</title>
+<title>Standalone hot backups</title>

 <para>
-It is possible to use the existing backup facilities to produce
-standalone hot backups. These are backups that cannot be used for
-point-in-time recovery, yet are much faster to backup and restore
-than <application>pg_dump</>.
+It is possible to use <productname>PostgreSQL</>'s backup facilities to
+produce standalone hot backups. These are backups that cannot be used
+for point-in-time recovery, yet are typically much faster to backup and
+restore than <application>pg_dump</> dumps. (They are also much larger
+than <application>pg_dump</> dumps, so in some cases the speed advantage
+could be negated.)
 </para>

 <para>
-To configure standalone backups you should use a switch file. If the
-file exists then archives are made, otherwise archiving is ignored.
+To prepare for standalone hot backups, set <varname>archive_mode</> to
+<literal>on</>, and set up an <varname>archive_command</> that performs
+archiving only when a <quote>switch file</> exists. For example:
 <programlisting>
 archive_command = 'if [ -f /var/lib/pgsql/backup_in_progress ]; then cp -i %p /var/lib/pgsql/archive/%f </dev/null ; fi'
 </programlisting>
-Backup can then be taken using a script like the following:
+This command will perform archiving when
+<filename>/var/lib/pgsql/backup_in_progress</> exists, and otherwise
+silently return zero exit status (allowing <productname>PostgreSQL</>
+to recycle the unwanted WAL file).
+</para>
+
+<para>
+With this preparation, a backup can be taken using a script like the
+following:
 <programlisting>
 touch /var/lib/pgsql/backup_in_progress
 psql -c "select pg_start_backup('hot_backup');"
-tar -cvf /var/lib/pgsql/backup.tar /var/lib/pgsql/data/
+tar -cf /var/lib/pgsql/backup.tar /var/lib/pgsql/data/
 psql -c "select pg_stop_backup();"
 sleep 20
 rm /var/lib/pgsql/backup_in_progress
-tar -rvf /var/lib/pgsql/backup.tar /var/lib/pgsql/archive/
+tar -rf /var/lib/pgsql/backup.tar /var/lib/pgsql/archive/
 </programlisting>
 The switch file <filename>/var/lib/pgsql/backup_in_progress</> is
-created first, allowing archiving to start prior to the backup.
+created first, enabling archiving of completed WAL files to occur.
 After the backup the switch file is removed. Archived WAL files are
 then added to the backup so that both base backup and all required
 WAL files are part of the same <application>tar</> file.
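The switch-file behavior described in the hunk above can be spelled out as a small shell function, which may make the two outcomes easier to follow. This is only a sketch under stated assumptions: the function name `archive_if_switched` and its three arguments (switch file, WAL file path, archive directory) are hypothetical and do not appear in the commit, which shows only the one-line `archive_command`. The sketch also fails explicitly when the target already exists rather than relying on `cp -i`.

```shell
#!/bin/sh
# Hypothetical helper illustrating the switch-file idea from the hunk above.
# Usage: archive_if_switched SWITCH_FILE WAL_PATH ARCHIVE_DIR
archive_if_switched() {
    switch_file=$1
    wal_path=$2
    archive_dir=$3

    # No switch file: do nothing and report success, so the server is
    # free to recycle the WAL segment.
    if [ ! -f "$switch_file" ]; then
        return 0
    fi

    # Switch file present: copy the segment, refusing to overwrite a
    # file that is already in the archive.
    target="$archive_dir/$(basename "$wal_path")"
    if [ -e "$target" ]; then
        echo "archive target already exists: $target" >&2
        return 1
    fi
    cp "$wal_path" "$target"
}
```

With a function like this saved in a script, the `archive_command` would pass the switch file path, `%p`, and the archive directory as arguments; that wiring is likewise an assumption, not part of the change shown here.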
@@ -1281,30 +1322,34 @@ tar -rvf /var/lib/pgsql/backup.tar /var/lib/pgsql/archive/
 <title><varname>archive_command</varname> scripts</title>

 <para>
-Many people choose to use scripts to define their
+Many people choose to use scripts to define their
 <varname>archive_command</varname>, so that their
-<filename>postgresql.conf</> looks very simple:
+<filename>postgresql.conf</> entry looks very simple:
 <programlisting>
 archive_command = 'local_backup_script.sh'
 </programlisting>
+Using a separate script file is advisable any time you want to use
+more than a single command in the archiving process.
 This allows all complexity to be managed within the script, which
 can be written in a popular scripting language such as
-<application>bash</> or <application>perl</>. Statements echoed to
-<literal>stderr</> will appear in the database server log, allowing
-complex configurations to be easily diagnosed if they fail.
+<application>bash</> or <application>perl</>.
+Any messages written to <literal>stderr</> from the script will appear
+in the database server log, allowing complex configurations to be
+diagnosed easily if they fail.
 </para>

 <para>
-Example of how scripts might be used include:
+Examples of requirements that might be solved within a script include:
 <itemizedlist>
 <listitem>
 <para>
-Copying data to a secure off-site data storage provider
+Copying data to secure off-site data storage
 </para>
 </listitem>
 <listitem>
 <para>
-Batching WAL files so they are transferred every three hours, rather than
-one at a time as they fill
+Batching WAL files so that they are transferred every three hours,
+rather than one at a time
 </para>
 </listitem>
 <listitem>
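To illustrate the point about using more than one command and reporting problems on `stderr`, here is a minimal sketch of what the body of a script such as `local_backup_script.sh` might look like. Everything in it is an assumption made for illustration: the function name `archive_wal`, the `ARCHIVE_DIR` variable, and the idea that the script receives the WAL file path and file name as arguments are not taken from the commit.

```shell
#!/bin/sh
# Sketch of a possible local_backup_script.sh body. Assumptions (not from
# the commit): the function receives the WAL file's path and name, and
# ARCHIVE_DIR names an existing archive directory.
archive_wal() {
    wal_path=$1
    wal_name=$2
    archive_dir=${ARCHIVE_DIR:-}

    if [ -z "$archive_dir" ] || [ ! -d "$archive_dir" ]; then
        echo "archive_wal: ARCHIVE_DIR is unset or not a directory" >&2
        return 1
    fi

    dest="$archive_dir/$wal_name"
    if [ -e "$dest" ]; then
        echo "archive_wal: refusing to overwrite $dest" >&2
        return 1
    fi

    # Copy under a temporary name, then rename, so a partially written
    # file never appears under the final name.
    if ! cp "$wal_path" "$dest.tmp"; then
        echo "archive_wal: copy of $wal_path failed" >&2
        rm -f "$dest.tmp"
        return 1
    fi
    if ! mv "$dest.tmp" "$dest"; then
        echo "archive_wal: rename to $dest failed" >&2
        rm -f "$dest.tmp"
        return 1
    fi
}

# A real script would finish by calling: archive_wal "$1" "$2"
```

Each failure path writes one line to `stderr` and returns a nonzero status, so the message would land in the server log and the segment would be retried later.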
@@ -1314,7 +1359,7 @@ archive_command = 'local_backup_script.sh'
 </listitem>
 <listitem>
 <para>
-Interfacing with monitoring software to report errors directly
+Interfacing with monitoring software to report errors
 </para>
 </listitem>
 </itemizedlist>
@@ -1441,7 +1486,7 @@ archive_command = 'local_backup_script.sh'
 </para>

 <para>
-Directly moving WAL or "log" records from one database server to another
+Directly moving WAL records from one database server to another
 is typically described as log shipping. <productname>PostgreSQL</>
 implements file-based log shipping, which means that WAL records are
 transferred one file (WAL segment) at a time. WAL files (16MB) can be
@@ -1474,7 +1519,7 @@ archive_command = 'local_backup_script.sh'
 capability as a warm standby configuration that offers high
 availability. Restoring a server from an archived base backup and
 rollforward will take considerably longer, so that technique only
-really offers a solution for disaster recovery, not high availability.
+offers a solution for disaster recovery, not high availability.
 </para>

 <sect2 id="warm-standby-planning">
@@ -1498,10 +1543,11 @@ archive_command = 'local_backup_script.sh'
 </para>

 <para>
-In general, log shipping between servers running different major release
+In general, log shipping between servers running different major
+<productname>PostgreSQL</> release
 levels will not be possible. It is the policy of the PostgreSQL Global
 Development Group not to make changes to disk formats during minor release
-upgrades, so it is likely that running different minor release levels
+upgrades, so it is likely that running different minor release levels
 on primary and standby servers will work successfully. However, no
 formal support for that is offered and you are advised to keep primary
 and standby servers at the same release level as much as possible.
@@ -1556,8 +1602,9 @@ if (!triggered)

 <para>
 A working example of a waiting <varname>restore_command</> is provided
-as a contrib module, named <application>pg_standby</>. This can be
-extended as needed to support specific configurations or environments.
+as a <filename>contrib</> module named <application>pg_standby</>. This
+example can be extended as needed to support specific configurations or
+environments.
 </para>

 <para>
@@ -1642,7 +1689,7 @@ if (!triggered)
 time as it is being read by the standby database server.
 Thus, running a standby server for high availability can be performed at
 the same time as files are stored for longer term disaster recovery
-purposes.
+purposes.
 </para>

 <para>
@@ -1663,9 +1710,9 @@ if (!triggered)
 <para>
 If the standby server fails then no failover need take place. If the
 standby server can be restarted, even some time later, then the recovery
-process can also be immediately restarted, taking advantage of
+process can also be immediately restarted, taking advantage of
 restartable recovery. If the standby server cannot be restarted, then a
-full new standby server should be created.
+full new standby server instance should be created.
 </para>

 <para>
@@ -1673,40 +1720,40 @@ if (!triggered)
 a mechanism for informing it that it is no longer the primary. This is
 sometimes known as STONITH (Shoot the Other Node In The Head), which is
 necessary to avoid situations where both systems think they are the
-primary, which can lead to confusion and ultimately data loss.
+primary, which will lead to confusion and ultimately data loss.
 </para>

 <para>
 Many failover systems use just two systems, the primary and the standby,
 connected by some kind of heartbeat mechanism to continually verify the
 connectivity between the two and the viability of the primary. It is
-also possible to use a third system (called a witness server) to avoid
-some problems of inappropriate failover, but the additional complexity
-might not be worthwhile unless it is set-up with sufficient care and
+also possible to use a third system (called a witness server) to prevent
+some cases of inappropriate failover, but the additional complexity
+might not be worthwhile unless it is set up with sufficient care and
 rigorous testing.
 </para>

 <para>
 Once failover to the standby occurs, we have only a
 single server in operation. This is known as a degenerate state.
-The former standby is now the primary, but the former primary is down
+The former standby is now the primary, but the former primary is down
 and might stay down. To return to normal operation we must
-fully recreate a standby server,
-either on the former primary system when it comes up, or on a third,
-possibly new, system. Once complete the primary and standby can be
-considered to have switched roles. Some people choose to use a third
+fully recreate a standby server,
+either on the former primary system when it comes up, or on a third,
+possibly new, system. Once complete the primary and standby can be
+considered to have switched roles. Some people choose to use a third
 server to provide backup to the new primary until the new standby
 server is recreated,
-though clearly this complicates the system configuration and
+though clearly this complicates the system configuration and
 operational processes.
 </para>

 <para>
 So, switching from primary to standby server can be fast but requires
 some time to re-prepare the failover cluster. Regular switching from
-primary to standby is encouraged, since it allows regular downtime on
-each system for maintenance. This also acts as a test of the
-failover mechanism to ensure that it will really work when you need it.
+primary to standby is useful, since it allows regular downtime on
+each system for maintenance. This also serves as a test of the
+failover mechanism to ensure that it will really work when you need it.
 Written administration procedures are advised.
 </para>
 </sect2>
@@ -1729,7 +1776,7 @@ if (!triggered)
 over to the standby server(s). With this approach, the window for data
 loss is the polling cycle time of the copying program, which can be very
 small, but there is no wasted bandwidth from forcing partially-used
-segment files to be archived. Note that the standby servers'
+segment files to be archived. Note that the standby servers'
 <varname>restore_command</> scripts still deal in whole WAL files,
 so the incrementally copied data is not ordinarily made available to
 the standby servers. It is of use only when the primary dies —
@@ -1755,8 +1802,8 @@ if (!triggered)
 In a warm standby configuration, it is possible to offload the expense of
 taking periodic base backups from the primary server; instead base backups
 can be made by backing
-up a standby server's files. This concept is generally known as
-incrementally updated backups, log change accumulation or more simply,
+up a standby server's files. This concept is generally known as
+incrementally updated backups, log change accumulation, or more simply,
 change accumulation.
 </para>

@@ -1776,7 +1823,7 @@ if (!triggered)
 far back you need to keep WAL segment files to have a recoverable
 backup. You can do this by running <application>pg_controldata</>
 on the standby server to inspect the control file and determine the
-current checkpoint WAL location, or by using the
+current checkpoint WAL location, or by using the
 <varname>log_restartpoints</> option to print values to the server log.
 </para>
 </sect2>
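As a rough sketch of the `pg_controldata` approach mentioned in the hunk above, the checkpoint location can be picked out of the tool's output with a short filter. The function name `latest_checkpoint_location` is hypothetical, and the label text it matches (`Latest checkpoint location:`) is assumed to be what `pg_controldata` prints; check it against the output of the installed version before relying on it.

```shell
#!/bin/sh
# Hypothetical filter: reads pg_controldata output on stdin and prints the
# value from the "Latest checkpoint location:" line (label text assumed).
latest_checkpoint_location() {
    awk -F': *' '/^Latest checkpoint location:/ { print $2 }'
}
```

On a standby it might be used as `pg_controldata "$PGDATA" | latest_checkpoint_location`, where `PGDATA` is assumed to point at the standby's data directory.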
@@ -1807,8 +1854,8 @@ if (!triggered)
 the number after the first dot changes). This does not apply to
 different minor releases under the same major release (where the
 number after the second dot changes); these always have compatible
-storage formats. For example, releases 7.2.1, 7.3.2, and 7.4 are
-not compatible, whereas 7.2.1 and 7.2.2 are. When you update
+storage formats. For example, releases 8.1.1, 8.2.3, and 8.3 are
+not compatible, whereas 8.2.3 and 8.2.4 are. When you update
 between compatible versions, you can simply replace the executables
 and reuse the data directory on disk. Otherwise you need to back
 up your data and restore it on the new server. This has to be done
@@ -1839,15 +1886,15 @@ pg_dumpall -p 5432 | psql -d postgres -p 6543
 to transfer your data. Or use an intermediate file if you want.
 Then you can shut down the old server and start the new server at
 the port the old one was running at. You should make sure that the
-old database is not updated after you run <application>pg_dumpall</>,
-otherwise you will obviously lose that data. See <xref
+old database is not updated after you begin to run
+<application>pg_dumpall</>, otherwise you will lose that data. See <xref
 linkend="client-authentication"> for information on how to prohibit
 access.
 </para>

 <para>
-It is also possible to use replication like <productname>Slony</> to
-create a slave server with the updated version of
+It is also possible to use replication methods, such as
+<productname>Slony</>, to create a slave server with the updated version of
 <productname>PostgreSQL</>. The slave can be on the same computer or
 a different computer. Once it has synced up with the master server
 (running the older version of <productname>PostgreSQL</>), you can
@@ -1864,10 +1911,10 @@ pg_dumpall -p 5432 | psql -d postgres -p 6543
 </para>

 <para>
-If you cannot or do not want to run two servers in parallel you can
+If you cannot or do not want to run two servers in parallel, you can
 do the backup step before installing the new version, bring down
 the server, move the old version out of the way, install the new
-version, start the new server, restore the data. For example:
+version, start the new server, and restore the data. For example:

 <programlisting>
 pg_dumpall > backup
@@ -1890,11 +1937,16 @@ psql -f backup postgres
 When you <quote>move the old installation out of the way</quote>
 it might no longer be perfectly usable. Some of the executable programs
 contain absolute paths to various installed programs and data files.
-This is usually not a big problem but if you plan on using two
+This is usually not a big problem, but if you plan on using two
 installations in parallel for a while you should assign them
 different installation directories at build time. (This problem
-is rectified in <productname>PostgreSQL</> 8.0 and later, but you
-need to be wary of moving older installations.)
+is rectified in <productname>PostgreSQL</> 8.0 and later, so long
+as you move all subdirectories containing installed files together;
+for example if <filename>/usr/local/postgres/bin/</> goes to
+<filename>/usr/local/postgres.old/bin/</>, then
+<filename>/usr/local/postgres/share/</> must go to
+<filename>/usr/local/postgres.old/share/</>. In pre-8.0 releases
+moving an installation like this will not work.)
 </para>
 </note>
 </sect1>