828 lines
32 KiB
Plaintext
828 lines
32 KiB
Plaintext
From pgsql-admin-owner+M15281=pgman=candle.pha.pa.us@postgresql.org Thu Oct 21 18:57:36 2004
|
|
Return-path: <pgsql-admin-owner+M15281=pgman=candle.pha.pa.us@postgresql.org>
|
|
Received: from svr1.postgresql.org (svr1.postgresql.org [200.46.204.71])
|
|
by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i9LLvYf17059
|
|
for <pgman@candle.pha.pa.us>; Thu, 21 Oct 2004 17:57:34 -0400 (EDT)
|
|
Received: from localhost (unknown [200.46.204.144])
|
|
by svr1.postgresql.org (Postfix) with ESMTP id 79D9132A71A
|
|
for <pgman@candle.pha.pa.us>; Thu, 21 Oct 2004 22:57:29 +0100 (BST)
|
|
Received: from svr1.postgresql.org ([200.46.204.71])
|
|
by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
|
|
with ESMTP id 80515-02 for <pgman@candle.pha.pa.us>;
|
|
Thu, 21 Oct 2004 21:57:26 +0000 (GMT)
|
|
Received: from postgresql.org (svr1.postgresql.org [200.46.204.71])
|
|
by svr1.postgresql.org (Postfix) with ESMTP id 1209432A70E
|
|
for <pgman@candle.pha.pa.us>; Thu, 21 Oct 2004 22:57:29 +0100 (BST)
|
|
X-Original-To: pgsql-admin-postgresql.org@localhost.postgresql.org
|
|
Received: from localhost (unknown [200.46.204.144])
|
|
by svr1.postgresql.org (Postfix) with ESMTP id 4B39932A6C3
|
|
for <pgsql-admin-postgresql.org@localhost.postgresql.org>; Thu, 21 Oct 2004 22:51:01 +0100 (BST)
|
|
Received: from svr1.postgresql.org ([200.46.204.71])
|
|
by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
|
|
with ESMTP id 78125-02
|
|
for <pgsql-admin-postgresql.org@localhost.postgresql.org>;
|
|
Thu, 21 Oct 2004 21:50:48 +0000 (GMT)
|
|
Received: from news.hub.org (news.hub.org [200.46.204.72])
|
|
by svr1.postgresql.org (Postfix) with ESMTP id 27F0632A6C2
|
|
for <pgsql-admin@postgresql.org>; Thu, 21 Oct 2004 22:50:49 +0100 (BST)
|
|
Received: from news.hub.org (news.hub.org [200.46.204.72])
|
|
by news.hub.org (8.12.9/8.12.9) with ESMTP id i9LLojJ7079086
|
|
for <pgsql-admin@postgresql.org>; Thu, 21 Oct 2004 21:50:45 GMT
|
|
(envelope-from news@news.hub.org)
|
|
Received: (from news@localhost)
|
|
by news.hub.org (8.12.9/8.12.9/Submit) id i9LLnd7p078783
|
|
for pgsql-admin@postgresql.org; Thu, 21 Oct 2004 21:49:39 GMT
|
|
From: Gaetano Mendola <mendola@bigfoot.com>
|
|
X-Newsgroups: comp.databases.postgresql.admin
|
|
Subject: Re: [ADMIN] replication using WAL archives
|
|
Date: Thu, 21 Oct 2004 23:49:35 +0200
|
|
Organization: PYRENET Midi-pyrenees Provider
|
|
Lines: 216
|
|
Message-ID: <41782EEF.5040708@bigfoot.com>
|
|
References: <002801c4b739$68450870$7201a8c0@mst1x5r347kymb> <1098384082.15573.14.camel@camel>
|
|
MIME-Version: 1.0
|
|
Content-Type: multipart/mixed;
|
|
boundary="------------060900090803090101060101"
|
|
X-Complaints-To: abuse@pyrenet.fr
|
|
cc: iain@mst.co.jp
|
|
To: Robert Treat <xzilla@users.sourceforge.net>
|
|
User-Agent: Mozilla Thunderbird 0.8 (Windows/20040913)
|
|
X-Accept-Language: en-us, en
|
|
In-Reply-To: <1098384082.15573.14.camel@camel>
|
|
X-Enigmail-Version: 0.86.1.0
|
|
X-Enigmail-Supports: pgp-inline, pgp-mime
|
|
To: pgsql-admin@postgresql.org
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Mailing-List: pgsql-admin
|
|
Precedence: bulk
|
|
Sender: pgsql-admin-owner@postgresql.org
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on
|
|
candle.pha.pa.us
|
|
X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham
|
|
version=2.61
|
|
Status: OR
|
|
|
|
This is a multi-part message in MIME format.
|
|
--------------060900090803090101060101
|
|
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
|
|
Content-Transfer-Encoding: 7bit
|
|
|
|
Robert Treat wrote:
|
|
> On Thu, 2004-10-21 at 02:44, Iain wrote:
|
|
>
|
|
>>Hi,
|
|
>>
|
|
>>I thought I read something about this in relation to v8, but I can't
|
|
>>find any reference to it now... is it (or will it be) possible to do
|
|
>>master-slave style database replication by transmitting log files to the
|
|
>>standby server and having it process them?
|
|
>>
|
|
>
|
|
>
|
|
> I'm not certain if this is 8.0, but some folks have created a working
|
|
> version against the 8.0 code that will do something like this. Search
|
|
> the pgsql-hacker mail list archives for more information on it.
|
|
|
|
I sent a post on hackers, I put it here:
|
|
|
|
=======================================================================
|
|
Hi all,
|
|
I seen that Eric Kerin did the work suggested by Tom about
|
|
how to use the PITR in order to have an hot spare postgres,
|
|
writing a C program.
|
|
|
|
I did the same writing 2 shell scripts, one of them perform
|
|
the restore the other one deliver the partial filled wal and
|
|
check if the postmaster is alive ( check if the pid process
|
|
still exist ).
|
|
|
|
With these two scripts I'm able to have an hot spare installation,
|
|
and the spare one go alive when the first postmaster dies.
|
|
|
|
How test it:
|
|
|
|
1) Master node:
|
|
modify postgresql.conf using:
|
|
|
|
~ archive_command = 'cp %p /mnt/server/archivedir/%f'
|
|
|
|
~ launch postgres and perform a backup as doc
|
|
|
|
~ http://developer.postgresql.org/docs/postgres/backup-online.html
|
|
|
|
suggest to do
|
|
|
|
launch the script:
|
|
|
|
partial_wal_deliver.sh <PID> /mnt/server/partialdir <pg_xlog path>
|
|
|
|
~ this script will delivery each 10 seconds the "current" wal file,
|
|
~ and touch the "alive" file in order to notify the spare node that
|
|
~ the master node is up and running
|
|
|
|
|
|
2) Spare node:
|
|
create a recovery.conf with the line:
|
|
|
|
~ restore_command = 'restore.sh /mnt/server/archivedir/%f %p /mnt/server/partialdir'
|
|
|
|
~ replace the content of data directory with the backup performed at point 1,
|
|
~ remove any file present in the pg_xlog directory ( leaving there the archive_status
|
|
~ directory ) and remove the postmaster.pid file ( this is necessary if you are running
|
|
~ the spare postgres on the same hw ).
|
|
|
|
~ launch the postmaster, the restore will continue till the "alive" file present in the
|
|
~ /mnt/server/partialdir directory is not updated for 60 seconds ( you can modify this
|
|
~ values inside the restore.sh script ).
|
|
|
|
Be sure that restore.sh and all directories involved are accessible
|
|
|
|
Let me know.
|
|
|
|
|
|
This is a first step, of course, as Eric Kerin did, is better port these script
|
|
in C and make it more robust.
|
|
|
|
Postgres can help this process, as suggested by Tom creating a pg_current_wal()
|
|
or even better having two new GUC parameters: archive_current_wal_command and
|
|
archive_current_wal_delay.
|
|
|
|
I problem I discover during the tests is that if you shut down the spare node
|
|
and the restore_command is still waiting for a file then the postmaster will never
|
|
exit :-(
|
|
==========================================================================
|
|
|
|
I hope that is clear.
|
|
|
|
|
|
|
|
Regards
|
|
Gaetano Mendola
|
|
|
|
|
|
|
|
--------------060900090803090101060101
|
|
Content-Type: text/plain;
|
|
name="restore.sh"
|
|
Content-Transfer-Encoding: 7bit
|
|
Content-Disposition: inline;
|
|
filename="restore.sh"
|
|
|
|
#!/bin/bash
|
|
|
|
|
|
SOURCE=$1
|
|
TARGET=$2
|
|
PARTIAL=$3
|
|
|
|
SIZE_EXPECTED=16777216 #bytes 16 MB
|
|
DIED_TIME=60 #seconds
|
|
|
|
function test_existence
|
|
{
|
|
if [ -f ${SOURCE} ]
|
|
then
|
|
COUNTER=0
|
|
|
|
#I have to check if the file is begin copied
|
|
#I assume that it will reach the right
|
|
#size in a few seconds
|
|
|
|
while [ $(stat -c '%s' ${SOURCE} ) -lt $SIZE_EXPECTED ]
|
|
do
|
|
sleep 1
|
|
let COUNTER+=1
|
|
if [ 20 -lt $COUNTER ]
|
|
then
|
|
exit 1 # BAILING OUT
|
|
fi
|
|
done
|
|
|
|
cp $SOURCE $TARGET
|
|
exit 0
|
|
fi
|
|
echo ${SOURCE}"> not found"
|
|
|
|
#if is looking for a history file and not exist
|
|
#I have suddenly exit
|
|
echo $SOURCE | grep history > /dev/null 2>&1 && exit 1
|
|
}
|
|
|
|
|
|
while [ 1 ]
|
|
do
|
|
|
|
test_existence
|
|
|
|
#CHECK IF THE MASTER IS ALIVE
|
|
DELTA_TIME=$(( $( date +'%s' ) - $( stat -c '%Z' ${PARTIAL}/alive ) ))
|
|
if [ $DIED_TIME -lt $DELTA_TIME ]
|
|
then
|
|
echo "Master is dead..."
|
|
# Master is dead
|
|
CURRENT_WAL=$( basename $SOURCE )
|
|
echo "Partial: " ${PARTIAL}
|
|
echo "Current wal: " ${CURRENT_WAL}
|
|
echo "Target: " ${TARGET}
|
|
cp ${PARTIAL}/${CURRENT_WAL}.partial ${TARGET} > /dev/null 2>&1 && exit 0
|
|
exit 1
|
|
fi
|
|
|
|
sleep 1
|
|
|
|
done
|
|
|
|
--------------060900090803090101060101
|
|
Content-Type: text/plain;
|
|
name="partial_wal_deliver.sh"
|
|
Content-Transfer-Encoding: 7bit
|
|
Content-Disposition: inline;
|
|
filename="partial_wal_deliver.sh"
|
|
|
|
#!/bin/bash
|
|
|
|
PID=$1
|
|
PARTIAL=$2
|
|
PGXLOG=$3
|
|
|
|
function copy_last_wal
|
|
{
|
|
FILE=$( ls -t1p $PGXLOG | grep -v / | head -1 )
|
|
|
|
echo "Last Wal> " $FILE
|
|
|
|
cp ${PGXLOG}/${FILE} ${PARTIAL}/${FILE}.tmp
|
|
mv ${PARTIAL}/${FILE}.tmp ${PARTIAL}/${FILE}.partial
|
|
find ${PARTIAL} -name *.partial | grep -v ${FILE} | xargs -i rm -fr {}
|
|
}
|
|
|
|
|
|
while [ 1 ]
|
|
do
|
|
ps --pid $PID > /dev/null 2>&1
|
|
ALIVE=$?
|
|
|
|
if [ "${ALIVE}" == "1" ]
|
|
then
|
|
#The process is dead
|
|
echo "Process dead"
|
|
copy_last_wal
|
|
exit 1
|
|
fi
|
|
|
|
#The process still exist
|
|
touch ${PARTIAL}/alive
|
|
copy_last_wal
|
|
|
|
sleep 10
|
|
done
|
|
|
|
--------------060900090803090101060101
|
|
Content-Type: text/plain
|
|
Content-Disposition: inline
|
|
Content-Transfer-Encoding: 8bit
|
|
MIME-Version: 1.0
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 5: Have you checked our extensive FAQ?
|
|
|
|
http://www.postgresql.org/docs/faqs/FAQ.html
|
|
|
|
--------------060900090803090101060101--
|
|
|
|
From pgsql-admin-owner+M15295=pgman=candle.pha.pa.us@postgresql.org Fri Oct 22 06:32:38 2004
|
|
Return-path: <pgsql-admin-owner+M15295=pgman=candle.pha.pa.us@postgresql.org>
|
|
Received: from svr1.postgresql.org (svr1.postgresql.org [200.46.204.71])
|
|
by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i9M9Waf18397
|
|
for <pgman@candle.pha.pa.us>; Fri, 22 Oct 2004 05:32:36 -0400 (EDT)
|
|
Received: from localhost (unknown [200.46.204.144])
|
|
by svr1.postgresql.org (Postfix) with ESMTP id 9C9A532AC61
|
|
for <pgman@candle.pha.pa.us>; Fri, 22 Oct 2004 10:32:32 +0100 (BST)
|
|
Received: from svr1.postgresql.org ([200.46.204.71])
|
|
by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
|
|
with ESMTP id 53654-01 for <pgman@candle.pha.pa.us>;
|
|
Fri, 22 Oct 2004 09:32:26 +0000 (GMT)
|
|
Received: from postgresql.org (svr1.postgresql.org [200.46.204.71])
|
|
by svr1.postgresql.org (Postfix) with ESMTP id 3132D32AC53
|
|
for <pgman@candle.pha.pa.us>; Fri, 22 Oct 2004 10:32:32 +0100 (BST)
|
|
X-Original-To: pgsql-admin-postgresql.org@localhost.postgresql.org
|
|
Received: from localhost (unknown [200.46.204.144])
|
|
by svr1.postgresql.org (Postfix) with ESMTP id DC46E32A095
|
|
for <pgsql-admin-postgresql.org@localhost.postgresql.org>; Fri, 22 Oct 2004 10:23:07 +0100 (BST)
|
|
Received: from svr1.postgresql.org ([200.46.204.71])
|
|
by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
|
|
with ESMTP id 49812-03
|
|
for <pgsql-admin-postgresql.org@localhost.postgresql.org>;
|
|
Fri, 22 Oct 2004 09:22:52 +0000 (GMT)
|
|
Received: from cmailg3.svr.pol.co.uk (cmailg3.svr.pol.co.uk [195.92.195.173])
|
|
by svr1.postgresql.org (Postfix) with ESMTP id 5AEA6329F2F
|
|
for <pgsql-admin@postgresql.org>; Fri, 22 Oct 2004 10:22:57 +0100 (BST)
|
|
Received: from modem-21.monkey.dialup.pol.co.uk ([217.135.208.21] helo=Nightingale)
|
|
by cmailg3.svr.pol.co.uk with smtp (Exim 4.41)
|
|
id 1CKvdM-0005eh-NO; Fri, 22 Oct 2004 10:22:53 +0100
|
|
Message-ID: <011a01c4b818$b7370a20$06e887d9@Nightingale>
|
|
From: "Simon Riggs" <simon@2ndquadrant.com>
|
|
To: "Gaetano Mendola" <mendola@bigfoot.com>,
|
|
"Robert Treat" <xzilla@users.sourceforge.net>, <pgsql-admin@postgresql.org>
|
|
cc: <iain@mst.co.jp>
|
|
References: <002801c4b739$68450870$7201a8c0@mst1x5r347kymb> <1098384082.15573.14.camel@camel> <41782EEF.5040708@bigfoot.com>
|
|
Subject: Re: [ADMIN] replication using WAL archives
|
|
Date: Fri, 22 Oct 2004 10:22:54 +0100
|
|
MIME-Version: 1.0
|
|
Content-Type: text/plain;
|
|
charset="iso-8859-1"
|
|
Content-Transfer-Encoding: 7bit
|
|
X-Priority: 3
|
|
X-MSMail-Priority: Normal
|
|
X-Mailer: Microsoft Outlook Express 6.00.2800.1409
|
|
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1409
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Mailing-List: pgsql-admin
|
|
Precedence: bulk
|
|
Sender: pgsql-admin-owner@postgresql.org
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on
|
|
candle.pha.pa.us
|
|
X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham
|
|
version=2.61
|
|
Status: OR
|
|
|
|
> Gaetano Mendola wrote
|
|
> Postgres can help this process, as suggested by Tom creating a
|
|
pg_current_wal()
|
|
> or even better having two new GUC parameters: archive_current_wal_command
|
|
and
|
|
> archive_current_wal_delay.
|
|
|
|
OK, we can modify the archiver to do this as well as the archive-when-full
|
|
functionality. I'd already agreed to do something similar for 8.1
|
|
|
|
PROPOSAL:
|
|
By default, archive_max_delay would be 10 seconds.
|
|
By default, archive_current_wal_command is not set.
|
|
If archive_current_wal_command is not set, the archiver will archive a file
|
|
using archive_command only when the file is full.
|
|
If archive_current_wal_command is set, the archiver would archive a file
|
|
whichever of these occurs first...
|
|
- it is full
|
|
- the archive_max_delay timeout occurs (default: disabled)
|
|
...as you can see I've renamed archive_current_wal_delay to reflect the fact
|
|
that there is an interaction between the current mechanism (only when full)
|
|
and this additional mechanism (no longer than X secs between log files).
|
|
With that design, if the logs are being created quickly enough, then a
|
|
partial log file is never created, only full ones.
|
|
|
|
When an xlog file is archived because it is full, then it is sent to both
|
|
archive_current_wal_command and archive_command (in that order). When the
|
|
timeout occurs and we have a partial xlog file, it would only be sent to
|
|
archive_current_wal_command. It may also be desirable to not use
|
|
archive_command at all, only to use archive_current_wal_command. That's not
|
|
currently possible because archive_command is the switch by which all of the
|
|
archive functioanlity is enabled, so you can't actually turn this off.
|
|
|
|
There is already a timeout feature designed into archiver for safety...so I
|
|
can make that read the GUCs, above and act accordingly.
|
|
|
|
There is an unresolved resilience issue: if the archiver goes down (or
|
|
whatever does the partial_wal copy functionality) then it it is possible
|
|
that users will continue writing to the database and creating xlog records.
|
|
It would be up to the user to define how to handle records that had been
|
|
committed to the first database in the interim before cutover. It would also
|
|
be up to the user to shut down the first node from the second - Shoot the
|
|
Other Node in the Head, as its known. All of that is up to the second node,
|
|
and as Tom says, is "the hard part"....I'm not proposing to do anything
|
|
about that at this stage, since it is implementation dependant.
|
|
|
|
I was thinking perhaps to move to having variable size xlog files, since
|
|
their contents are now variable - no padded records at EOF. If we did that,
|
|
then the archiver could simply issue a "switch logfile" and then the
|
|
archiver would cut in anyway to copy away the xlog. Having said that it is
|
|
lots easier just to put a blind timeout in the archiver and copy the file -
|
|
though I'm fairly uneasy about the point that we'd be ignoring the fact that
|
|
many people are still writing to it. But I propose doing the easy way....
|
|
|
|
Thoughts?
|
|
|
|
= - = - =
|
|
|
|
Gaetano - skim-reading your script, how do you handle the situation when a
|
|
new xlog file has been written within 10 seconds? That way the current file
|
|
number will have jumped by 2, so when your script looks for the "Last wal"
|
|
using head -1 it will find the N+2 and the intermediate file will never be
|
|
copied. Looks like a problem to me...
|
|
|
|
> I problem I discover during the tests is that if you shut down the spare
|
|
node
|
|
> and the restore_command is still waiting for a file then the postmaster
|
|
will never
|
|
> exit :-(
|
|
|
|
Hmm....Are you reporting this as a bug for 8.0? It's not on the bug list...
|
|
|
|
Do we consider that to be desirable or not?
|
|
|
|
Best Regards, Simon Riggs
|
|
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 2: you can get off all lists at once with the unregister command
|
|
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
|
|
|
|
From pgsql-admin-owner+M15302=pgman=candle.pha.pa.us@postgresql.org Fri Oct 22 13:56:14 2004
|
|
Return-path: <pgsql-admin-owner+M15302=pgman=candle.pha.pa.us@postgresql.org>
|
|
Received: from svr1.postgresql.org (svr1.postgresql.org [200.46.204.71])
|
|
by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i9MGuCf28637
|
|
for <pgman@candle.pha.pa.us>; Fri, 22 Oct 2004 12:56:13 -0400 (EDT)
|
|
Received: from localhost (unknown [200.46.204.144])
|
|
by svr1.postgresql.org (Postfix) with ESMTP id 54E77EAEDAA
|
|
for <pgman@candle.pha.pa.us>; Fri, 22 Oct 2004 17:55:51 +0100 (BST)
|
|
Received: from svr1.postgresql.org ([200.46.204.71])
|
|
by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
|
|
with ESMTP id 86116-09 for <pgman@candle.pha.pa.us>;
|
|
Fri, 22 Oct 2004 16:55:57 +0000 (GMT)
|
|
Received: from postgresql.org (svr1.postgresql.org [200.46.204.71])
|
|
by svr1.postgresql.org (Postfix) with ESMTP id 0EC98EAEDA7
|
|
for <pgman@candle.pha.pa.us>; Fri, 22 Oct 2004 17:55:51 +0100 (BST)
|
|
X-Original-To: pgsql-admin-postgresql.org@localhost.postgresql.org
|
|
Received: from localhost (unknown [200.46.204.144])
|
|
by svr1.postgresql.org (Postfix) with ESMTP id 5DB98EAEDBE
|
|
for <pgsql-admin-postgresql.org@localhost.postgresql.org>; Fri, 22 Oct 2004 17:45:13 +0100 (BST)
|
|
Received: from svr1.postgresql.org ([200.46.204.71])
|
|
by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
|
|
with ESMTP id 82473-08
|
|
for <pgsql-admin-postgresql.org@localhost.postgresql.org>;
|
|
Fri, 22 Oct 2004 16:45:11 +0000 (GMT)
|
|
Received: from pns.mm.eutelsat.org (pns.mm.eutelsat.org [194.214.173.227])
|
|
by svr1.postgresql.org (Postfix) with ESMTP id E49F0EAEDB5
|
|
for <pgsql-admin@postgresql.org>; Fri, 22 Oct 2004 17:45:00 +0100 (BST)
|
|
Received: from nts-03.mm.eutelsat.org (localhost [127.0.0.1])
|
|
by pns.mm.eutelsat.org (8.11.6/linuxconf) with ESMTP id i9MGh0U26124;
|
|
Fri, 22 Oct 2004 18:43:01 +0200
|
|
Received: from [127.0.0.1] (accesspoint.mm.eutelsat.org [194.214.173.4])
|
|
by nts-03.mm.eutelsat.org (8.11.6/linuxconf) with ESMTP id i9MGj5f09681;
|
|
Fri, 22 Oct 2004 18:45:05 +0200
|
|
Message-ID: <4179390B.10700@bigfoot.com>
|
|
Date: Fri, 22 Oct 2004 18:44:59 +0200
|
|
From: Gaetano Mendola <mendola@bigfoot.com>
|
|
User-Agent: Mozilla Thunderbird 0.8 (Windows/20040913)
|
|
X-Accept-Language: en-us, en
|
|
MIME-Version: 1.0
|
|
To: Simon Riggs <simon@2ndquadrant.com>
|
|
cc: Robert Treat <xzilla@users.sourceforge.net>, pgsql-admin@postgresql.org,
|
|
iain@mst.co.jp
|
|
Subject: Re: [ADMIN] replication using WAL archives
|
|
References: <002801c4b739$68450870$7201a8c0@mst1x5r347kymb> <1098384082.15573.14.camel@camel> <41782EEF.5040708@bigfoot.com> <011a01c4b818$b7370a20$06e887d9@Nightingale>
|
|
In-Reply-To: <011a01c4b818$b7370a20$06e887d9@Nightingale>
|
|
X-Enigmail-Version: 0.86.1.0
|
|
X-Enigmail-Supports: pgp-inline, pgp-mime
|
|
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
|
|
Content-Transfer-Encoding: 7bit
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Mailing-List: pgsql-admin
|
|
Precedence: bulk
|
|
Sender: pgsql-admin-owner@postgresql.org
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on
|
|
candle.pha.pa.us
|
|
X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham
|
|
version=2.61
|
|
Status: OR
|
|
|
|
-----BEGIN PGP SIGNED MESSAGE-----
|
|
Hash: SHA1
|
|
|
|
Simon Riggs wrote:
|
|
|>Gaetano Mendola wrote
|
|
|>Postgres can help this process, as suggested by Tom creating a
|
|
|
|
|
| pg_current_wal()
|
|
|
|
|
|>or even better having two new GUC parameters: archive_current_wal_command
|
|
|
|
|
| and
|
|
|
|
|
|>archive_current_wal_delay.
|
|
|
|
|
|
|
|
| OK, we can modify the archiver to do this as well as the archive-when-full
|
|
| functionality. I'd already agreed to do something similar for 8.1
|
|
|
|
|
| PROPOSAL:
|
|
| By default, archive_max_delay would be 10 seconds.
|
|
| By default, archive_current_wal_command is not set.
|
|
| If archive_current_wal_command is not set, the archiver will archive a file
|
|
| using archive_command only when the file is full.
|
|
| If archive_current_wal_command is set, the archiver would archive a file
|
|
| whichever of these occurs first...
|
|
| - it is full
|
|
| - the archive_max_delay timeout occurs (default: disabled)
|
|
| ...as you can see I've renamed archive_current_wal_delay to reflect the fact
|
|
| that there is an interaction between the current mechanism (only when full)
|
|
| and this additional mechanism (no longer than X secs between log files).
|
|
| With that design, if the logs are being created quickly enough, then a
|
|
| partial log file is never created, only full ones.
|
|
|
|
|
| When an xlog file is archived because it is full, then it is sent to both
|
|
| archive_current_wal_command and archive_command (in that order). When the
|
|
| timeout occurs and we have a partial xlog file, it would only be sent to
|
|
| archive_current_wal_command. It may also be desirable to not use
|
|
| archive_command at all, only to use archive_current_wal_command. That's not
|
|
| currently possible because archive_command is the switch by which all of the
|
|
| archive functioanlity is enabled, so you can't actually turn this off.
|
|
|
|
It seems good to me, the script behind archive command can be a nop if someone
|
|
want use the archive_current_wal_command
|
|
|
|
|
|
| = - = - =
|
|
|
|
|
| Gaetano - skim-reading your script, how do you handle the situation when a
|
|
| new xlog file has been written within 10 seconds? That way the current file
|
|
| number will have jumped by 2, so when your script looks for the "Last wal"
|
|
| using head -1 it will find the N+2 and the intermediate file will never be
|
|
| copied. Looks like a problem to me...
|
|
|
|
|
|
Yes, the only window failure I seen ( but I don't know if it's possible )
|
|
|
|
Master:
|
|
~ log N created
|
|
log N filled
|
|
archive log N
|
|
log N+1 created
|
|
log N+1 filled
|
|
~ log N+2 created
|
|
~ <---- the master die here before to archive the log N+1
|
|
~ archive log N+1
|
|
|
|
|
|
in this case as you underline tha last log archived is the N and the N+2
|
|
partial wal is added to archived wal collection and in the recovery fase
|
|
the recovery stop after processing the log N.
|
|
|
|
Is it possible that the postmaster create the N+2 file without finish to archive
|
|
the N+1 ? ( I suspect yes :-( )
|
|
|
|
The only cure I see here is to look for not archived WAL ( if possible ).
|
|
|
|
|
|
|>I problem I discover during the tests is that if you shut down the spare
|
|
|>node and the restore_command is still waiting for a file then the postmaster
|
|
|>will never exit :-(
|
|
|
|
|
|
|
|
| Hmm....Are you reporting this as a bug for 8.0? It's not on the bug list...
|
|
|
|
For me is a behave to avoid.
|
|
|
|
|
|
|
|
Regards
|
|
Gaetano Mendola
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
-----BEGIN PGP SIGNATURE-----
|
|
Version: GnuPG v1.2.5 (MingW32)
|
|
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
|
|
|
|
iD8DBQFBeTkJ7UpzwH2SGd4RAsMxAKCbV7W+wrGBocf2Ftlthm0egAlIWACgp87L
|
|
KU/YusyHuvT7jSFwZVKpP3M=
|
|
=rWZx
|
|
-----END PGP SIGNATURE-----
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
|
|
|
|
From pgsql-admin-owner+M15303=pgman=candle.pha.pa.us@postgresql.org Fri Oct 22 14:43:36 2004
|
|
Return-path: <pgsql-admin-owner+M15303=pgman=candle.pha.pa.us@postgresql.org>
|
|
Received: from svr1.postgresql.org (svr1.postgresql.org [200.46.204.71])
|
|
by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i9MHhZf06453
|
|
for <pgman@candle.pha.pa.us>; Fri, 22 Oct 2004 13:43:35 -0400 (EDT)
|
|
Received: from localhost (unknown [200.46.204.144])
|
|
by svr1.postgresql.org (Postfix) with ESMTP id 01DD2EADBB7
|
|
for <pgman@candle.pha.pa.us>; Fri, 22 Oct 2004 18:43:13 +0100 (BST)
|
|
Received: from svr1.postgresql.org ([200.46.204.71])
|
|
by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
|
|
with ESMTP id 01872-03 for <pgman@candle.pha.pa.us>;
|
|
Fri, 22 Oct 2004 17:43:19 +0000 (GMT)
|
|
Received: from postgresql.org (svr1.postgresql.org [200.46.204.71])
|
|
by svr1.postgresql.org (Postfix) with ESMTP id 9E633EADAD4
|
|
for <pgman@candle.pha.pa.us>; Fri, 22 Oct 2004 18:43:12 +0100 (BST)
|
|
X-Original-To: pgsql-admin-postgresql.org@localhost.postgresql.org
|
|
Received: from localhost (unknown [200.46.204.144])
|
|
by svr1.postgresql.org (Postfix) with ESMTP id C1133EAED89
|
|
for <pgsql-admin-postgresql.org@localhost.postgresql.org>; Fri, 22 Oct 2004 18:31:20 +0100 (BST)
|
|
Received: from svr1.postgresql.org ([200.46.204.71])
|
|
by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
|
|
with ESMTP id 97130-03
|
|
for <pgsql-admin-postgresql.org@localhost.postgresql.org>;
|
|
Fri, 22 Oct 2004 17:31:17 +0000 (GMT)
|
|
Received: from cmailm2.svr.pol.co.uk (cmailm2.svr.pol.co.uk [195.92.193.210])
|
|
by svr1.postgresql.org (Postfix) with ESMTP id 276CAEADBBD
|
|
for <pgsql-admin@postgresql.org>; Fri, 22 Oct 2004 18:31:07 +0100 (BST)
|
|
Received: from modem-558.snake.dialup.pol.co.uk ([62.137.114.46] helo=[192.168.0.102])
|
|
by cmailm2.svr.pol.co.uk with esmtp (Exim 4.41)
|
|
id 1CL3G3-0001Tx-K5; Fri, 22 Oct 2004 18:31:20 +0100
|
|
Subject: Re: [ADMIN] replication using WAL archives
|
|
From: Simon Riggs <simon@2ndquadrant.com>
|
|
To: Gaetano Mendola <mendola@bigfoot.com>
|
|
cc: Robert Treat <xzilla@users.sourceforge.net>, pgsql-admin@postgresql.org,
|
|
iain@mst.co.jp
|
|
In-Reply-To: <4179390B.10700@bigfoot.com>
|
|
References: <002801c4b739$68450870$7201a8c0@mst1x5r347kymb>
|
|
<1098384082.15573.14.camel@camel> <41782EEF.5040708@bigfoot.com>
|
|
<011a01c4b818$b7370a20$06e887d9@Nightingale> <4179390B.10700@bigfoot.com>
|
|
Content-Type: text/plain
|
|
Organization: 2nd Quadrant
|
|
Message-ID: <1098466150.20926.13.camel@localhost.localdomain>
|
|
MIME-Version: 1.0
|
|
X-Mailer: Ximian Evolution 1.4.6 (1.4.6-2)
|
|
Date: Fri, 22 Oct 2004 18:29:10 +0100
|
|
Content-Transfer-Encoding: 7bit
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Mailing-List: pgsql-admin
|
|
Precedence: bulk
|
|
Sender: pgsql-admin-owner@postgresql.org
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on
|
|
candle.pha.pa.us
|
|
X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham
|
|
version=2.61
|
|
Status: OR
|
|
|
|
On Fri, 2004-10-22 at 17:44, Gaetano Mendola wrote:
|
|
> | Gaetano - skim-reading your script, how do you handle the situation when a
|
|
> | new xlog file has been written within 10 seconds? That way the current file
|
|
> | number will have jumped by 2, so when your script looks for the "Last wal"
|
|
> | using head -1 it will find the N+2 and the intermediate file will never be
|
|
> | copied. Looks like a problem to me...
|
|
>
|
|
>
|
|
> Yes, the only window failure I seen ( but I don't know if it's possible )
|
|
>
|
|
> Master:
|
|
> ~ log N created
|
|
> log N filled
|
|
> archive log N
|
|
> log N+1 created
|
|
> log N+1 filled
|
|
> ~ log N+2 created
|
|
> ~ <---- the master die here before to archive the log N+1
|
|
> ~ archive log N+1
|
|
>
|
|
>
|
|
> in this case as you underline tha last log archived is the N and the N+2
|
|
> partial wal is added to archived wal collection and in the recovery fase
|
|
> the recovery stop after processing the log N.
|
|
>
|
|
> Is it possible that the postmaster create the N+2 file without finish to archive
|
|
> the N+1 ? ( I suspect yes :-( )
|
|
>
|
|
> The only cure I see here is to look for not archived WAL ( if possible ).
|
|
>
|
|
|
|
Hmm...well you aren't looking for archived wal, you're just looking at
|
|
wal...which is a different thing...
|
|
|
|
Situation I thought I saw was:
|
|
|
|
- copy away current partial filled xlog N
|
|
- xlog N fills, N+1 starts
|
|
- xlog N+1 fills, N+2 starts
|
|
- copy away current partial filled xlog: N+2 (+10 secs later)
|
|
|
|
i.e. if time to fill xlog (is ever) < time to copy away current xlog,
|
|
then you miss one.
|
|
|
|
So problem: you can miss one and never know you've missed one until the
|
|
recovery can't find it, which it never returns from...so it just hangs.
|
|
|
|
[Just so we're all clear: we're talking about Gaetano's script, not the
|
|
PostgreSQL archver. The postgresql archiver doesn't do it that way, so
|
|
it never misses one.]
|
|
|
|
--
|
|
Best Regards, Simon Riggs
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 9: the planner will ignore your desire to choose an index scan if your
|
|
joining column's datatypes do not match
|
|
|
|
From pgsql-admin-owner+M15306=pgman=candle.pha.pa.us@postgresql.org Fri Oct 22 17:56:07 2004
|
|
Return-path: <pgsql-admin-owner+M15306=pgman=candle.pha.pa.us@postgresql.org>
|
|
Received: from svr1.postgresql.org (svr1.postgresql.org [200.46.204.71])
|
|
by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id i9MKu6f05264
|
|
for <pgman@candle.pha.pa.us>; Fri, 22 Oct 2004 16:56:06 -0400 (EDT)
|
|
Received: from localhost (unknown [200.46.204.144])
|
|
by svr1.postgresql.org (Postfix) with ESMTP id 4F4C2EAE4AE
|
|
for <pgman@candle.pha.pa.us>; Fri, 22 Oct 2004 21:55:41 +0100 (BST)
|
|
Received: from svr1.postgresql.org ([200.46.204.71])
|
|
by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
|
|
with ESMTP id 62857-05 for <pgman@candle.pha.pa.us>;
|
|
Fri, 22 Oct 2004 20:55:48 +0000 (GMT)
|
|
Received: from postgresql.org (svr1.postgresql.org [200.46.204.71])
|
|
by svr1.postgresql.org (Postfix) with ESMTP id 095CEEAE4AC
|
|
for <pgman@candle.pha.pa.us>; Fri, 22 Oct 2004 21:55:41 +0100 (BST)
|
|
X-Original-To: pgsql-admin-postgresql.org@localhost.postgresql.org
|
|
Received: from localhost (unknown [200.46.204.144])
|
|
by svr1.postgresql.org (Postfix) with ESMTP id 3FC9BEAE486
|
|
for <pgsql-admin-postgresql.org@localhost.postgresql.org>; Fri, 22 Oct 2004 21:50:48 +0100 (BST)
|
|
Received: from svr1.postgresql.org ([200.46.204.71])
|
|
by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024)
|
|
with ESMTP id 62565-02
|
|
for <pgsql-admin-postgresql.org@localhost.postgresql.org>;
|
|
Fri, 22 Oct 2004 20:50:48 +0000 (GMT)
|
|
Received: from news.hub.org (news.hub.org [200.46.204.72])
|
|
by svr1.postgresql.org (Postfix) with ESMTP id 06C49EAE46B
|
|
for <pgsql-admin@postgresql.org>; Fri, 22 Oct 2004 21:50:40 +0100 (BST)
|
|
Received: from news.hub.org (news.hub.org [200.46.204.72])
|
|
by news.hub.org (8.12.9/8.12.9) with ESMTP id i9MKolJB062812
|
|
for <pgsql-admin@postgresql.org>; Fri, 22 Oct 2004 20:50:48 GMT
|
|
(envelope-from news@news.hub.org)
|
|
Received: (from news@localhost)
|
|
by news.hub.org (8.12.9/8.12.9/Submit) id i9MKoRHh062731
|
|
for pgsql-admin@postgresql.org; Fri, 22 Oct 2004 20:50:27 GMT
|
|
From: Gaetano Mendola <mendola@bigfoot.com>
|
|
X-Newsgroups: comp.databases.postgresql.admin
|
|
Subject: Re: [ADMIN] replication using WAL archives
|
|
Date: Fri, 22 Oct 2004 22:50:34 +0200
|
|
Organization: PYRENET Midi-pyrenees Provider
|
|
Lines: 39
|
|
Message-ID: <4179729A.5020401@bigfoot.com>
|
|
References: <002801c4b739$68450870$7201a8c0@mst1x5r347kymb> <1098384082.15573.14.camel@camel> <41782EEF.5040708@bigfoot.com> <011a01c4b818$b7370a20$06e887d9@Nightingale> <4179390B.10700@bigfoot.com> <1098466150.20926.13.camel@localhost.localdomain>
|
|
MIME-Version: 1.0
|
|
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
|
|
Content-Transfer-Encoding: 7bit
|
|
X-Complaints-To: abuse@pyrenet.fr
|
|
To: Simon Riggs <simon@2ndquadrant.com>
|
|
User-Agent: Mozilla Thunderbird 0.8 (Windows/20040913)
|
|
X-Accept-Language: en-us, en
|
|
In-Reply-To: <1098466150.20926.13.camel@localhost.localdomain>
|
|
X-Enigmail-Version: 0.86.1.0
|
|
X-Enigmail-Supports: pgp-inline, pgp-mime
|
|
To: pgsql-admin@postgresql.org
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Mailing-List: pgsql-admin
|
|
Precedence: bulk
|
|
Sender: pgsql-admin-owner@postgresql.org
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Checker-Version: SpamAssassin 2.61 (1.212.2.1-2003-12-09-exp) on
|
|
candle.pha.pa.us
|
|
X-Spam-Status: No, hits=-4.9 required=5.0 tests=BAYES_00 autolearn=ham
|
|
version=2.61
|
|
Status: OR
|
|
|
|
Simon Riggs wrote:
|
|
|
|
> Situation I thought I saw was:
|
|
>
|
|
> - copy away current partial filled xlog N
|
|
> - xlog N fills, N+1 starts
|
|
> - xlog N+1 fills, N+2 starts
|
|
> - copy away current partial filled xlog: N+2 (+10 secs later)
|
|
>
|
|
> i.e. if time to fill xlog (is ever) < time to copy away current xlog,
|
|
> then you miss one.
|
|
>
|
|
> So problem: you can miss one and never know you've missed one until the
|
|
> recovery can't find it, which it never returns from...so it just hangs.
|
|
|
|
No. The restore.sh is not smart enough to know the last wal that must be
|
|
replayed, the only "smart thing" is to copy the supposed "current wal" in the
|
|
archive directory.
|
|
|
|
The script hang (and is a feature not a bug) if and only if the master is alive
|
|
( at least I'm not seeing any other hang ).
|
|
|
|
In your example in the archived directory will be present the files until logN
|
|
and logN+2 ( the current wal ) is in the partial directory, if the master die,
|
|
the restore.sh will copy logN+2 in the archived directory, the spare node will
|
|
execute restore.sh with file logN+1 as argument and if is not found then the
|
|
restore.sh will exit.
|
|
|
|
|
|
Regards
|
|
Gaetano Mendola
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 8: explain analyze is your friend
|
|
|