mirror of https://github.com/postgres/postgres
From goran@kirra.net Mon Dec 20 14:30:54 1999
Received: from villa.bildbasen.se (villa.bildbasen.se [193.45.225.97])
    by candle.pha.pa.us (8.9.0/8.9.0) with SMTP id PAA29058
    for <pgman@candle.pha.pa.us>; Mon, 20 Dec 1999 15:30:17 -0500 (EST)
Received: (qmail 2485 invoked from network); 20 Dec 1999 20:29:53 -0000
Received: from a112.dial.kiruna.se (HELO kirra.net) (193.45.238.12)
    by villa.bildbasen.se with SMTP; 20 Dec 1999 20:29:53 -0000
Sender: goran
Message-ID: <385E9192.226CC37D@kirra.net>
Date: Mon, 20 Dec 1999 21:29:06 +0100
From: Goran Thyni <goran@kirra.net>
Organization: kirra.net
X-Mailer: Mozilla 4.6 [en] (X11; U; Linux 2.2.13 i586)
X-Accept-Language: sv, en
MIME-Version: 1.0
To: Bruce Momjian <pgman@candle.pha.pa.us>
CC: "neil d. quiogue" <nquiogue@ieee.org>,
    PostgreSQL-development <pgsql-hackers@postgreSQL.org>
Subject: Re: [HACKERS] Re: QUESTION: Replication
References: <199912201508.KAA20572@candle.pha.pa.us>
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
Status: OR

Bruce Momjian wrote:
> We need major work in this area, or at least a plan and an FAQ item.
> We are getting major questions on this, and I don't know enough even to
> make an FAQ item telling people their options.

My 2 cents, or 2 ören since I'm a Swede, on this:

It is pretty simple to build a replica with pg_dump: dump, transfer,
empty the replica, and reload.
But if we want "live replicas", we had better base our efforts on a
mechanism that uses WAL logs to roll the replicas forward.

regards,
-----------------
Göran Thyni
On quiet nights you can hear Windows NT reboot!

From owner-pgsql-hackers@hub.org Fri Dec 24 10:01:18 1999
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA11295
    for <pgman@candle.pha.pa.us>; Fri, 24 Dec 1999 11:01:17 -0500 (EST)
Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id KAA20310 for <pgman@candle.pha.pa.us>; Fri, 24 Dec 1999 10:39:18 -0500 (EST)
Received: from localhost (majordom@localhost)
    by hub.org (8.9.3/8.9.3) with SMTP id KAA61760;
    Fri, 24 Dec 1999 10:31:13 -0500 (EST)
    (envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Fri, 24 Dec 1999 10:30:48 -0500
Received: (from majordom@localhost)
    by hub.org (8.9.3/8.9.3) id KAA58879
    for pgsql-hackers-outgoing; Fri, 24 Dec 1999 10:29:51 -0500 (EST)
    (envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from bocs170n.black-oak.COM ([38.149.137.131])
    by hub.org (8.9.3/8.9.3) with ESMTP id KAA58795
    for <pgsql-hackers@postgreSQL.org>; Fri, 24 Dec 1999 10:29:00 -0500 (EST)
    (envelope-from DWalker@black-oak.com)
From: DWalker@black-oak.com
To: pgsql-hackers@postgreSQL.org
Subject: [HACKERS] database replication
Date: Fri, 24 Dec 1999 10:27:59 -0500
Message-ID: <OFD38C9424.B391F434-ON85256851.0054F41A@black-oak.COM>
X-Priority: 3 (Normal)
X-MIMETrack: Serialize by Router on notes01n/BOCS(Release 5.0.1|July 16, 1999) at 12/24/99
    10:28:01 AM
MIME-Version: 1.0
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Sender: owner-pgsql-hackers@postgreSQL.org
Status: OR

I've been toying with the idea of implementing database replication for
the last few days. The system I'm proposing will be a separate program
which can be run on any machine and will most likely be implemented in
Python. What I'm looking for at this point are gaping holes in my
thinking/logic/etc. Here's what I'm thinking...

1) I want to make this program an additional layer over PostgreSQL. I
really don't want to hack server code if I can get away with it. At this
point I don't feel I need to.

2) The replication system will need to add at least one field to each
table in each database that needs to be replicated. This field will be a
date/time stamp which identifies the "last update" of the record. This
field will be called PGR_TIME for lack of a better name. Because this
field will be used from within programs and triggers it can be longer so
as to not mistake it for a user field.

3) For each table to be replicated the replication system will
programmatically add one plpgsql function and trigger to modify the
PGR_TIME field on both UPDATEs and INSERTs. The name of this function and
trigger will be along the lines of <table_name>_replication_update_trigger
and <table_name>_replication_update_function. The function is a simple
two-line chunk of code to set the field PGR_TIME equal to NOW. The trigger
is called before each insert/update. When looking at the Docs I see that
times are stored in Zulu (GMT) time. Because of this I don't have to worry
about time zones and the like. I need direction on this part (such as
"hey dummy, look at page N of file X.").

4) At this point we have tables which can, at a basic level, tell the
replication system when they were last updated.

5) The replication system will have a database of its own to record the
last replication event, hold configuration, logs, etc. I'd prefer to
store the configuration in a PostgreSQL table but it could just as easily
be stored in a text file on the filesystem somewhere.

6) To handle replication I basically check the local "last replication
time" and compare it against the remote PGR_TIME fields. If the remote
PGR_TIME is greater than the last replication time then change the local
copy of the database, otherwise, change the remote end of the database.
At this point I don't have a way to know WHICH field changed between the
two replicas so either I do ROW level replication or I check each field.
I check PGR_TIME to determine which field is the most current. Some fine
tuning of this process will have to occur no doubt.

7) The command-line utility, fired off by something like cron, could run
several times during the day -- command line parameters can be implemented
to say PUSH ALL CHANGES TO SERVER A, or PULL ALL CHANGES FROM SERVER B.

Questions/Concerns:

1) How far do I go with this? Do I start manhandling the system catalogs
(pg_* tables)?

2) As to #2 and #3 above, I really don't like tools automagically changing
my tables but at this point I don't see a way around it. I guess this is
where the testing comes into play.

3) Security: the replication app will have to have pretty good rights to
the database so it can add the necessary functions and triggers, modify
table schema, etc.

So, any "you're insane and should run home to momma" comments?

            Damond
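
The per-table function and trigger from item 3 could be generated along the following lines. This is only a sketch in Python, the language proposed above. The helper name `replication_trigger_ddl` and the lower-case column name `pgr_time` are assumptions made for illustration, and the emitted SQL uses present-day PostgreSQL syntax (dollar quoting, `RETURNS trigger`), which is not necessarily what the server accepted when this message was written.

```python
def replication_trigger_ddl(table_name: str, column: str = "pgr_time") -> str:
    """Return DDL that stamps `column` with the current time on INSERT/UPDATE.

    The function and trigger names follow the naming scheme from the message;
    everything else (column type, helper name) is a hypothetical choice.
    """
    # Crude guard: only accept plain identifiers, since the name is
    # interpolated directly into SQL text.
    if not table_name.isidentifier() or not column.isidentifier():
        raise ValueError(f"unsafe identifier: {table_name!r} / {column!r}")

    func = f"{table_name}_replication_update_function"
    trig = f"{table_name}_replication_update_trigger"
    return f"""\
ALTER TABLE {table_name} ADD COLUMN {column} timestamp with time zone;

CREATE FUNCTION {func}() RETURNS trigger AS $$
BEGIN
    NEW.{column} := now();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER {trig}
    BEFORE INSERT OR UPDATE ON {table_name}
    FOR EACH ROW EXECUTE PROCEDURE {func}();
"""


ddl = replication_trigger_ddl("orders")
print(ddl)
```

The generated text would still have to be run by a role that is allowed to alter the table, which is the security concern raised in question 3.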

************

From owner-pgsql-hackers@hub.org Fri Dec 24 18:31:03 1999
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA26244
    for <pgman@candle.pha.pa.us>; Fri, 24 Dec 1999 19:31:02 -0500 (EST)
Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id TAA12730 for <pgman@candle.pha.pa.us>; Fri, 24 Dec 1999 19:30:05 -0500 (EST)
Received: from localhost (majordom@localhost)
    by hub.org (8.9.3/8.9.3) with SMTP id TAA57851;
    Fri, 24 Dec 1999 19:23:31 -0500 (EST)
    (envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Fri, 24 Dec 1999 19:22:54 -0500
Received: (from majordom@localhost)
    by hub.org (8.9.3/8.9.3) id TAA57710
    for pgsql-hackers-outgoing; Fri, 24 Dec 1999 19:21:56 -0500 (EST)
    (envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from Mail.austin.rr.com (sm2.texas.rr.com [24.93.35.55])
    by hub.org (8.9.3/8.9.3) with ESMTP id TAA57680
    for <pgsql-hackers@postgresql.org>; Fri, 24 Dec 1999 19:21:25 -0500 (EST)
    (envelope-from ELOEHR@austin.rr.com)
Received: from austin.rr.com ([24.93.40.248]) by Mail.austin.rr.com with Microsoft SMTPSVC(5.5.1877.197.19);
    Fri, 24 Dec 1999 18:12:50 -0600
Message-ID: <38640E2D.75136600@austin.rr.com>
Date: Fri, 24 Dec 1999 18:22:05 -0600
From: Ed Loehr <ELOEHR@austin.rr.com>
X-Mailer: Mozilla 4.7 [en] (X11; U; Linux 2.2.12-20smp i686)
X-Accept-Language: en
MIME-Version: 1.0
To: DWalker@black-oak.com
CC: pgsql-hackers@postgreSQL.org
Subject: Re: [HACKERS] database replication
References: <OFD38C9424.B391F434-ON85256851.0054F41A@black-oak.COM>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-pgsql-hackers@postgreSQL.org
Status: OR

DWalker@black-oak.com wrote:

> 6) To handle replication I basically check the local "last
> replication time" and compare it against the remote PGR_TIME
> fields. If the remote PGR_TIME is greater than the last replication
> time then change the local copy of the database, otherwise, change
> the remote end of the database. At this point I don't have a way to
> know WHICH field changed between the two replicas so either I do ROW
> level replication or I check each field. I check PGR_TIME to
> determine which field is the most current. Some fine tuning of this
> process will have to occur no doubt.

Interesting idea. I can see how this might sync up two databases
somehow. For true replication, however, I would always want every
replicated database to be, at the very least, internally consistent
(i.e., referential integrity), even if it was a little behind on
processing transactions. In this method, it's not clear how
consistency is ever achieved/guaranteed at any point in time if the
input stream of changes is continuous. If the input stream ceased,
then I can see how this approach might eventually catch up and totally
resync everything, but it looks *very* computationally expensive.

But I might have missed something. How would internal consistency be
maintained?


> 7) The command-line utility, fired off by something like cron, could
> run several times during the day -- command line parameters can be
> implemented to say PUSH ALL CHANGES TO SERVER A, or PULL ALL CHANGES
> FROM SERVER B.

My two cents is that, while I can see this kind of database syncing as
valuable, this is not the kind of "replication" I had in mind. This
may already be possible by simply copying the database. What replication
means to me is a live, continuously streaming sequence of updates from
one database to another where the replicated database is always
internally consistent, available for read-only queries, and never "too
far" out of sync with the source/primary database.

What does replication mean to others?

Cheers,
Ed Loehr

************

From owner-pgsql-hackers@hub.org Fri Dec 24 21:31:10 1999
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA02578
    for <pgman@candle.pha.pa.us>; Fri, 24 Dec 1999 22:31:09 -0500 (EST)
Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id WAA16641 for <pgman@candle.pha.pa.us>; Fri, 24 Dec 1999 22:18:56 -0500 (EST)
Received: from localhost (majordom@localhost)
    by hub.org (8.9.3/8.9.3) with SMTP id WAA89135;
    Fri, 24 Dec 1999 22:11:12 -0500 (EST)
    (envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Fri, 24 Dec 1999 22:10:56 -0500
Received: (from majordom@localhost)
    by hub.org (8.9.3/8.9.3) id WAA89019
    for pgsql-hackers-outgoing; Fri, 24 Dec 1999 22:09:59 -0500 (EST)
    (envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from bocs170n.black-oak.COM ([38.149.137.131])
    by hub.org (8.9.3/8.9.3) with ESMTP id WAA88957;
    Fri, 24 Dec 1999 22:09:11 -0500 (EST)
    (envelope-from dwalker@black-oak.com)
Received: from gcx80 ([151.196.99.113])
    by bocs170n.black-oak.COM (Lotus Domino Release 5.0.1)
    with SMTP id 1999122422080835:6 ;
    Fri, 24 Dec 1999 22:08:08 -0500
Message-ID: <001b01bf4e9e$647287d0$af63a8c0@walkers.org>
From: "Damond Walker" <dwalker@black-oak.com>
To: <owner-pgsql-hackers@postgreSQL.org>
Cc: <pgsql-hackers@postgreSQL.org>
References: <OFD38C9424.B391F434-ON85256851.0054F41A@black-oak.COM> <38640E2D.75136600@austin.rr.com>
Subject: Re: [HACKERS] database replication
Date: Fri, 24 Dec 1999 22:07:55 -0800
MIME-Version: 1.0
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.00.2314.1300
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
X-MIMETrack: Itemize by SMTP Server on notes01n/BOCS(Release 5.0.1|July 16, 1999) at 12/24/99
    10:08:09 PM,
    Serialize by Router on notes01n/BOCS(Release 5.0.1|July 16, 1999) at 12/24/99
    10:08:11 PM,
    Serialize complete at 12/24/99 10:08:11 PM
Content-Transfer-Encoding: 7bit
Content-Type: text/plain;
    charset="iso-8859-1"
Sender: owner-pgsql-hackers@postgreSQL.org
Status: OR

>
> Interesting idea. I can see how this might sync up two databases
> somehow. For true replication, however, I would always want every
> replicated database to be, at the very least, internally consistent
> (i.e., referential integrity), even if it was a little behind on
> processing transactions. In this method, it's not clear how
> consistency is ever achieved/guaranteed at any point in time if the
> input stream of changes is continuous. If the input stream ceased,
> then I can see how this approach might eventually catch up and totally
> resync everything, but it looks *very* computationally expensive.
>

What's the typical unit of work for the database? Are we talking about
update transactions which span the entire DB? Or are we talking about
updating maybe 1% or less of the database every day? I'd think it would be
more towards the latter than the former. So, yes, this process would be
computationally expensive but how many records would actually have to be
sent back and forth?

> But I might have missed something. How would internal consistency be
> maintained?
>

Updates that occur at site A will be moved to site B and vice versa.
Consistency would be maintained. The only problem that I can see right off
the bat would be what if site A and site B made changes to a row and then
site C was brought into the picture? Which one wins?

Someone *has* to win when it comes to this type of thing. You really
DON'T want to start merging row changes...
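
One simple way to guarantee that "someone wins" is a last-writer-wins rule applied to whole rows. The sketch below is an illustration only, not something proposed in the thread: the `RowVersion` type and `pick_winner` function are hypothetical names, and the rule assumes the sites' clocks are comparable, which the thread does not establish.

```python
from datetime import datetime
from typing import NamedTuple


class RowVersion(NamedTuple):
    """One site's copy of the same logical row (hypothetical structure)."""
    site: str
    pgr_time: datetime
    values: dict


def pick_winner(versions):
    """Return the whole row that wins; fields are never merged.

    The newest PGR_TIME wins. Ties are broken by site name so that every
    site, looking at the same candidates, picks the same winner.
    """
    return max(versions, key=lambda v: (v.pgr_time, v.site))


candidates = [
    RowVersion("A", datetime(1999, 12, 24, 10, 0), {"qty": 5}),
    RowVersion("B", datetime(1999, 12, 24, 10, 5), {"qty": 7}),
    RowVersion("C", datetime(1999, 12, 24, 9, 0), {"qty": 1}),
]
winner = pick_winner(candidates)
print(winner.site, winner.values)
```

Because the comparison key depends only on the timestamp and the site name, adding site C later does not change which of A and B won.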

>
> My two cents is that, while I can see this kind of database syncing as
> valuable, this is not the kind of "replication" I had in mind. This
> may already be possible by simply copying the database. What replication
> means to me is a live, continuously streaming sequence of updates from
> one database to another where the replicated database is always
> internally consistent, available for read-only queries, and never "too
> far" out of sync with the source/primary database.
>

Sounds like you're talking about distributed transactions to me. That's
an entirely different subject altogether. What you describe can be done
by copying a database...but as you say, this would only work in a read-only
situation.


Damond

************

From owner-pgsql-hackers@hub.org Sat Dec 25 16:35:07 1999
Received: from hub.org (hub.org [216.126.84.1])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA28890
    for <pgman@candle.pha.pa.us>; Sat, 25 Dec 1999 17:35:05 -0500 (EST)
Received: from localhost (majordom@localhost)
    by hub.org (8.9.3/8.9.3) with SMTP id RAA86997;
    Sat, 25 Dec 1999 17:29:10 -0500 (EST)
    (envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Sat, 25 Dec 1999 17:28:09 -0500
Received: (from majordom@localhost)
    by hub.org (8.9.3/8.9.3) id RAA86863
    for pgsql-hackers-outgoing; Sat, 25 Dec 1999 17:27:11 -0500 (EST)
    (envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from mtiwmhc08.worldnet.att.net (mtiwmhc08.worldnet.att.net [204.127.131.19])
    by hub.org (8.9.3/8.9.3) with ESMTP id RAA86798
    for <pgsql-hackers@postgreSQL.org>; Sat, 25 Dec 1999 17:26:34 -0500 (EST)
    (envelope-from pgsql@rkirkpat.net)
Received: from [192.168.3.100] ([12.74.72.219])
    by mtiwmhc08.worldnet.att.net (InterMail v03.02.07.07 118-134)
    with ESMTP id <19991225222554.VIOL28505@[12.74.72.219]>;
    Sat, 25 Dec 1999 22:25:54 +0000
Date: Sat, 25 Dec 1999 15:25:47 -0700 (MST)
From: Ryan Kirkpatrick <pgsql@rkirkpat.net>
X-Sender: rkirkpat@excelsior.rkirkpat.net
To: DWalker@black-oak.com
cc: pgsql-hackers@postgreSQL.org
Subject: Re: [HACKERS] database replication
In-Reply-To: <OFD38C9424.B391F434-ON85256851.0054F41A@black-oak.COM>
Message-ID: <Pine.LNX.4.10.9912251433310.1551-100000@excelsior.rkirkpat.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-pgsql-hackers@postgreSQL.org
Status: OR

On Fri, 24 Dec 1999 DWalker@black-oak.com wrote:

> I've been toying with the idea of implementing database replication
> for the last few days.

I too have been thinking about this some over the last year or
two, just trying to find a quick and easy way to do it. I am not so
much interested in replication as in synchronization, as in between a
desktop machine and a laptop, so I can keep the databases on each in sync
with each other. For this sort of purpose, both the local and remote
databases would be "idle" at the time of syncing.

> 2) The replication system will need to add at least one field to each
> table in each database that needs to be replicated. This field will be
> a date/time stamp which identifies the "last update" of the record.
> This field will be called PGR_TIME for lack of a better name.
> Because this field will be used from within programs and triggers it
> can be longer so as to not mistake it for a user field.

How about a single, separate table with the fields 'database',
'tablename', 'oid', 'last_changed', that would store the same data as your
PGR_TIME field? It would be separated from the actual data tables, and
therefore would be totally transparent to any database interface
applications. The 'oid' field would hold each row's OID, a nice, unique
identification number for the row, while the other fields would tell which
table and database the oid is in. Then this table can be compared with the
same table on a remote machine to quickly find updates and changes, then
each difference can be dealt with in turn.
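
Comparing the two tracking tables could look roughly like this. It is a minimal sketch under stated assumptions: each side's table has already been loaded into a Python dict keyed by `(tablename, row_id)`, and the function name `diff_change_logs` is made up for illustration. The key is called `row_id` rather than `oid` on purpose, since nothing here depends on how rows are identified.

```python
from datetime import datetime


def diff_change_logs(local, remote):
    """Compare two {(tablename, row_id): last_changed} mappings.

    Returns the keys present on only one side and the keys whose
    timestamps differ; deciding what to do with them is a separate step.
    """
    local_keys, remote_keys = set(local), set(remote)
    return {
        "only_local": sorted(local_keys - remote_keys),
        "only_remote": sorted(remote_keys - local_keys),
        "differing": sorted(
            k for k in local_keys & remote_keys if local[k] != remote[k]
        ),
    }


local_log = {
    ("customer", 1): datetime(1999, 12, 24, 10, 0),
    ("customer", 2): datetime(1999, 12, 25, 9, 0),
}
remote_log = {
    ("customer", 1): datetime(1999, 12, 24, 10, 0),
    ("customer", 2): datetime(1999, 12, 24, 8, 0),
    ("customer", 3): datetime(1999, 12, 25, 12, 0),
}
result = diff_change_logs(local_log, remote_log)
print(result)
```

Only the keys and timestamps have to cross the network for this step; the row data itself is needed only for the entries that show up in the result.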

> 3) For each table to be replicated the replication system will
> programmatically add one plpgsql function and trigger to modify the
> PGR_TIME field on both UPDATEs and INSERTs. The name of this function
> and trigger will be along the lines of
> <table_name>_replication_update_trigger and
> <table_name>_replication_update_function. The function is a simple
> two-line chunk of code to set the field PGR_TIME equal to NOW. The
> trigger is called before each insert/update. When looking at the Docs
> I see that times are stored in Zulu (GMT) time. Because of this I
> don't have to worry about time zones and the like. I need direction
> on this part (such as "hey dummy, look at page N of file X.").

I like this idea, better than any I have come up with yet. Though,
how are you going to handle DELETEs?

> 6) To handle replication I basically check the local "last replication
> time" and compare it against the remote PGR_TIME fields. If the
> remote PGR_TIME is greater than the last replication time then change
> the local copy of the database, otherwise, change the remote end of
> the database. At this point I don't have a way to know WHICH field
> changed between the two replicas so either I do ROW level replication
> or I check each field. I check PGR_TIME to determine which field is
> the most current. Some fine tuning of this process will have to occur
> no doubt.

Yeah, this is indeed the sticky part, and would indeed require some
fine-tuning. Basically, the way I see it, is if the two timestamps for a
single row do not match (or even if the row and therefore timestamp is
missing on one side or the other altogether):

    local ts > remote ts => Local row is exported to remote.
    remote ts > local ts => Remote row is exported to local.
    local ts > last sync time && no remote ts =>
        Local row is inserted on remote.
    local ts < last sync time && no remote ts =>
        Local row is deleted.
    remote ts > last sync time && no local ts =>
        Remote row is inserted on local.
    remote ts < last sync time && no local ts =>
        Remote row is deleted.

where the synchronization process is running on the local machine. By
exported, I mean the local values are sent to the remote machine, and the
row on that remote machine is updated to the local values. How does this
sound?
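
The six rules above can be written down as a single decision function. This is a sketch, not code from the thread: the function name `sync_action` and the returned action strings are invented here, and the handling of a timestamp exactly equal to the last sync time (treated like "older") is an arbitrary choice, because the rules do not cover that case.

```python
from datetime import datetime
from typing import Optional


def sync_action(local_ts: Optional[datetime],
                remote_ts: Optional[datetime],
                last_sync: datetime) -> str:
    """Map one row's timestamps to an action; None means the row is absent."""
    if local_ts is not None and remote_ts is not None:
        if local_ts > remote_ts:
            return "update_remote"   # local row is exported to remote
        if remote_ts > local_ts:
            return "update_local"    # remote row is exported to local
        return "none"                # timestamps match, nothing to do
    if local_ts is not None:
        # Row exists only locally: new since the last sync, or deleted remotely.
        return "insert_remote" if local_ts > last_sync else "delete_local"
    if remote_ts is not None:
        # Row exists only remotely: new since the last sync, or deleted locally.
        return "insert_local" if remote_ts > last_sync else "delete_remote"
    return "none"


last_sync = datetime(1999, 12, 25)
older = datetime(1999, 12, 24)
newer = datetime(1999, 12, 26)
print(sync_action(newer, older, last_sync))
print(sync_action(older, None, last_sync))
```

One consequence worth noting: a row that is updated on one side after the last sync and deleted on the other looks the same as a brand-new row, so under these rules it would be re-inserted rather than deleted.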

> 7) The command-line utility, fired off by something like cron, could
> run several times during the day -- command line parameters can be
> implemented to say PUSH ALL CHANGES TO SERVER A, or PULL ALL CHANGES
> FROM SERVER B.

Or run manually for my purposes. Also, maybe follow it
with a vacuum run on both sides for all databases, as this is going to
potentially cause lots of table changes that could stand a cleanup.

> 1) How far do I go with this? Do I start manhandling the system catalogs (pg_* tables)?

Initially, I would just stick to user table data... If you have
changes in triggers and other meta-data/executable code, you are going to
want to make syncs of that stuff manually anyway. At least I would want
to.

> 2) As to #2 and #3 above, I really don't like tools automagically
> changing my tables but at this point I don't see a way around it. I
> guess this is where the testing comes into play.

Hence the reason for the separate table with just a row's
identification and last update time. The only modification to the synced
database is the update trigger, which should be pretty harmless.

> 3) Security: the replication app will have to have pretty good rights
> to the database so it can add the necessary functions and triggers,
> modify table schema, etc.

Just run the sync program as the postgres super user, and there
are no problems. :)

> So, any "you're insane and should run home to momma" comments?

No, not at all. Though it probably should be renamed from
replication to synchronization. The former is usually associated with a
continuous stream of updates between the local and remote databases, so
they are almost always in sync, and have a queuing ability if their
connection is lost for a span of time as well. Very complex and difficult
to implement, and would require hacking server code. :( Something only
Sybase and Oracle have (as far as I know), and from what I have seen of
Sybase's replication server support (dated by 5 yrs) it was a pain to set
up and get running correctly.

The latter, synchronization, is much more manageable, and can still
be useful, especially when you have a large database you want in two
places, mainly for read only purposes at one end or the other, but don't
want to waste the time/bandwidth to move and load the entire database each
time it changes on one end or the other. Same idea as mirroring software
for FTP sites, just transfers the changes, and nothing more.

I also like the idea of using Python. I have been using it
recently for some database interfaces (to PostgreSQL of course :), and it
is a very nice language to work with. Some worries about performance of
the program though, as Python is only an interpreted language, and I have
yet to really be impressed with the speed of execution of my database
interfaces.

Anyway, it sounds like a good project, and finally one where I
actually have a clue of what is going on, and the skills to help. So, if
you are interested in pursuing this project, I would be more than glad to
help. TTYL.

---------------------------------------------------------------------------
| "For to me to live is Christ, and to die is gain." |
| --- Philippians 1:21 (KJV) |
---------------------------------------------------------------------------
| Ryan Kirkpatrick | Boulder, Colorado | http://www.rkirkpat.net/ |
---------------------------------------------------------------------------

************

From owner-pgsql-hackers@hub.org Sun Dec 26 08:31:09 1999
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
    by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id JAA17976
    for <pgman@candle.pha.pa.us>; Sun, 26 Dec 1999 09:31:07 -0500 (EST)
Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id JAA23337 for <pgman@candle.pha.pa.us>; Sun, 26 Dec 1999 09:28:36 -0500 (EST)
Received: from localhost (majordom@localhost)
    by hub.org (8.9.3/8.9.3) with SMTP id JAA90738;
    Sun, 26 Dec 1999 09:21:58 -0500 (EST)
    (envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Sun, 26 Dec 1999 09:19:19 -0500
Received: (from majordom@localhost)
    by hub.org (8.9.3/8.9.3) id JAA90498
    for pgsql-hackers-outgoing; Sun, 26 Dec 1999 09:18:21 -0500 (EST)
    (envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from bocs170n.black-oak.COM ([38.149.137.131])
    by hub.org (8.9.3/8.9.3) with ESMTP id JAA90452
    for <pgsql-hackers@postgreSQL.org>; Sun, 26 Dec 1999 09:17:54 -0500 (EST)
    (envelope-from dwalker@black-oak.com)
Received: from vmware98 ([151.196.99.113])
    by bocs170n.black-oak.COM (Lotus Domino Release 5.0.1)
    with SMTP id 1999122609164808:7 ;
    Sun, 26 Dec 1999 09:16:48 -0500
Message-ID: <002201bf4fb3$623f0220$b263a8c0@vmware98.walkers.org>
From: "Damond Walker" <dwalker@black-oak.com>
To: "Ryan Kirkpatrick" <pgsql@rkirkpat.net>
Cc: <pgsql-hackers@postgreSQL.org>
Subject: Re: [HACKERS] database replication
Date: Sun, 26 Dec 1999 10:10:41 -0500
MIME-Version: 1.0
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 4.72.3110.1
X-MimeOLE: Produced By Microsoft MimeOLE V4.72.3110.3
X-MIMETrack: Itemize by SMTP Server on notes01n/BOCS(Release 5.0.1|July 16, 1999) at 12/26/99
    09:16:51 AM,
    Serialize by Router on notes01n/BOCS(Release 5.0.1|July 16, 1999) at 12/26/99
    09:16:54 AM,
    Serialize complete at 12/26/99 09:16:54 AM
Content-Transfer-Encoding: 7bit
Content-Type: text/plain;
    charset="iso-8859-1"
Sender: owner-pgsql-hackers@postgreSQL.org
Status: OR

>
> I too have been thinking about this some over the last year or
> two, just trying to find a quick and easy way to do it. I am not so
> much interested in replication as in synchronization, as in between a
> desktop machine and a laptop, so I can keep the databases on each in sync
> with each other. For this sort of purpose, both the local and remote
> databases would be "idle" at the time of syncing.
>

I don't think it would matter if the databases are idle or not to be
honest with you. At any single point in time when you replicate I'd figure
that the database would be in a consistent state. So, you should be able to
replicate (or sync) a remote database that is in use. After all, you're
getting a snapshot of the database as it stands at 8:45 PM. At 8:46 PM it
may be totally different...but the next time syncing takes place those
changes would appear in your local copy.

The one problem you may run into is if the remote host is running a
large batch process. It's very likely that you will get only part of their
changes (say, 50%) when you replicate...but then again, that's why you can
schedule the event to work around such things.

> How about a single, separate table with the fields 'database',
> 'tablename', 'oid', 'last_changed', that would store the same data as your
> PGR_TIME field? It would be separated from the actual data tables, and
> therefore would be totally transparent to any database interface
> applications. The 'oid' field would hold each row's OID, a nice, unique
> identification number for the row, while the other fields would tell which
> table and database the oid is in. Then this table can be compared with the
> same table on a remote machine to quickly find updates and changes, then
> each difference can be dealt with in turn.
>

The problem with OIDs is that they are unique at the local level but if
you try and use them between servers you can run into overlap. Also, if a
database is under heavy use this table could quickly become VERY large. Add
indexes to this table to help performance and you're taking up even more
disk space.

Using the PGR_TIME field with an index will allow us to find rows which
have changed VERY quickly. All we need to do now is somehow programmatically
find the primary key for a table so the person setting up replication (or
syncing) doesn't have to have an in-depth knowledge of the schema in order to
set up a syncing schedule.
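
The "find changed rows through an index on PGR_TIME" step might look like the following. This is a hedged sketch: it uses Python's built-in `sqlite3` module purely as a stand-in so the example runs without a server, and the `customer` table, its `id` key, and the function name `rows_changed_since` are assumptions. Against PostgreSQL the same query shape would be issued through whichever driver is in use.

```python
import sqlite3


def rows_changed_since(conn, table, last_sync):
    """Return (id, pgr_time) for rows stamped after `last_sync`.

    `table` is interpolated into the SQL text, so it must come from
    trusted configuration, never from user input.
    """
    cur = conn.execute(
        f"SELECT id, pgr_time FROM {table} WHERE pgr_time > ? ORDER BY pgr_time",
        (last_sync,),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, pgr_time TEXT)"
)
# The index is what keeps the "changed since" lookup cheap on large tables.
conn.execute("CREATE INDEX customer_pgr_time_idx ON customer (pgr_time)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [
        (1, "alice", "1999-12-24 10:00:00"),
        (2, "bob", "1999-12-26 09:00:00"),
        (3, "carol", "1999-12-26 09:30:00"),
    ],
)

changed = rows_changed_since(conn, "customer", "1999-12-25 00:00:00")
print(changed)
```

Timestamps are stored here as ISO-formatted text, which sorts correctly as strings; a real timestamp column would not need that trick.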

>
> I like this idea, better than any I have come up with yet. Though,
> how are you going to handle DELETEs?
>

Oops...how about defining a trigger for this? With deletion I guess we
would have to move a flag into another table saying we deleted record 'X'
with this primary key from this table.
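
A delete trigger that records such a flag could be sketched as follows. As before, `sqlite3` from the Python standard library stands in for PostgreSQL so that the example is runnable, and the trigger body is SQLite syntax rather than plpgsql. The `pgr_deleted` table and its columns are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);

    -- One row per deleted record: which table, which primary key, and when.
    CREATE TABLE pgr_deleted (
        tablename  TEXT NOT NULL,
        pk         TEXT NOT NULL,
        deleted_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    );

    CREATE TRIGGER customer_replication_delete_trigger
    AFTER DELETE ON customer
    FOR EACH ROW
    BEGIN
        INSERT INTO pgr_deleted (tablename, pk) VALUES ('customer', OLD.id);
    END;

    INSERT INTO customer VALUES (1, 'alice'), (2, 'bob');
    DELETE FROM customer WHERE id = 2;
    """
)

tombstones = conn.execute("SELECT tablename, pk FROM pgr_deleted").fetchall()
remaining = conn.execute("SELECT id FROM customer").fetchall()
print(tombstones, remaining)
```

With such a table in place, a missing row no longer has to be interpreted from timestamps alone: the sync run can tell "deleted on the other side" apart from "not yet created here".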
|
|
|
|
>
|
|
> Yea, this is indeed the sticky part, and would indeed require some
|
|
>fine-tunning. Basically, the way I see it, is if the two timestamps for a
|
|
>single row do not match (or even if the row and therefore timestamp is
|
|
>missing on one side or the other altogether):
|
|
> local ts > remote ts => Local row is exported to remote.
|
|
> remote ts > local ts => Remote row is exported to local.
|
|
> local ts > last sync time && no remote ts =>
|
|
> Local row is inserted on remote.
|
|
> local ts < last sync time && no remote ts =>
|
|
> Local row is deleted.
|
|
> remote ts > last sync time && no local ts =>
|
|
> Remote row is inserted on local.
|
|
> remote ts < last sync time && no local ts =>
|
|
> Remote row is deleted.
|
|
>where the synchronization process is running on the local machine. By
|
|
>exported, I mean the local values are sent to the remote machine, and the
|
|
>row on that remote machine is updated to the local values. How does this
|
|
>sound?
|
|
>
|
|
|
|
The replication part will be the most complex...that much is for
|
|
certain...
|
|
|
|
I've been writing systems in Lotus Notes/Domino for the last year or so
|
|
and I've grown quite spoiled with what it can do in regards to replication.
|
|
It's not real-time but you have to gear your applications to this type of
|
|
thing (it's possible to create documents, fire off email to notify people of
|
|
changes and have the email arrive before the replicated documents do).
|
|
Replicating large Notes/Domino databases takes quite a while....I don't see
|
|
any kind of replication or syncing running in a blink of an eye.
|
|
|
|
Having said that, a good algo will have to be written to cut down on
|
|
network traffic and to keep database conversations down to a minimum. This
|
|
will be appreciated by people with low bandwidth connections I'm sure
|
|
(dial-ups, fractional T1's, etc).
|
|
|
|
> Or run manually for my purposes. Also, maybe follow it
|
|
>with a vacuum run on both sides for all databases, as this is going to
|
|
>potentially cause lots of table changes that could do with a cleanup.
|
|
>
|
|
|
|
What would a vacuum do to a system being used by many people?
|
|
|
|
>	No, not at all. Though it probably should be renamed from
|
|
>replication to synchronization. The former is usually associated with a
|
|
>continuous stream of updates between the local and remote databases, so
|
|
>they are almost always in sync, and have a queuing ability if their
|
|
>connection is lost for a span of time as well. Very complex and difficult to
|
|
>implement, and would require hacking server code. :( Something only Sybase
|
|
>and Oracle have (as far as I know), and from what I have seen of Sybase's
|
|
>replication server support (dated by 5yrs) it was a pain to set up and get
|
|
>running correctly.
|
|
|
|
It could probably be named either way...but the one thing I really don't
|
|
want to do is start hacking server code. The PostgreSQL people have enough
|
|
to do without worrying about trying to meld anything I've done to their
|
|
server. :)
|
|
|
|
Besides, I like the idea of having it operate as a stand-alone product.
|
|
The only PostgreSQL feature we would require would be triggers and
|
|
plpgsql...what was the earliest version of PostgreSQL that supported
|
|
plpgsql? Even then I don't see the triggers being that complex to boot.
|
|
|
|
> I also like the idea of using Python. I have been using it
|
|
>recently for some database interfaces (to PostgreSQL of course :), and it
|
|
>is a very nice language to work with. Some worries about performance of
|
|
>the program though, as Python is only an interpreted language, and I have
|
|
>yet to really be impressed with the speed of execution of my database
|
|
>interfaces.
|
|
|
|
The only thing we'd need for Python is the Python extensions for
|
|
PostgreSQL...which in turn requires libpq and that's about it. So, it
|
|
should be able to run on any platform supported by Python and libpq. Using
|
|
Tk for the interface components will require NT people to get additional
|
|
software from the 'net.  At least it did with older versions of Windows
|
|
Python. Unix folks should be happy....assuming they have X running on the
|
|
machine doing the replication or syncing. Even then I wrote a curses based
|
|
Python interface awhile back which allows buttons, progress bars, input
|
|
fields, etc (I called it tinter and it's available at
|
|
http://iximd.com/~dwalker). It's a simple interface and could probably be
|
|
cleaned up a bit but it works. :)
|
|
|
|
>	Anyway, it sounds like a good project, and finally one where I
|
|
>actually have a clue of what is going on, and the skills to help. So, if
|
|
>you are interested in pursuing this project, I would be more than glad to
|
|
>help. TTYL.
|
|
>
|
|
|
|
|
|
That would be a Good Thing. Have webspace somewhere? If I can get
|
|
permission from the "powers that be" at the office I could host a website on
|
|
our (Domino) webserver.
|
|
|
|
Damond
|
|
|
|
|
|
************
|
|
|
|
From owner-pgsql-hackers@hub.org Sun Dec 26 19:11:48 1999
|
|
Received: from hub.org (hub.org [216.126.84.1])
|
|
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA26661
|
|
for <pgman@candle.pha.pa.us>; Sun, 26 Dec 1999 20:11:46 -0500 (EST)
|
|
Received: from localhost (majordom@localhost)
|
|
by hub.org (8.9.3/8.9.3) with SMTP id UAA14959;
|
|
Sun, 26 Dec 1999 20:08:15 -0500 (EST)
|
|
(envelope-from owner-pgsql-hackers)
|
|
Received: by hub.org (bulk_mailer v1.5); Sun, 26 Dec 1999 20:07:27 -0500
|
|
Received: (from majordom@localhost)
|
|
by hub.org (8.9.3/8.9.3) id UAA14820
|
|
for pgsql-hackers-outgoing; Sun, 26 Dec 1999 20:06:28 -0500 (EST)
|
|
(envelope-from owner-pgsql-hackers@postgreSQL.org)
|
|
Received: from mtiwmhc02.worldnet.att.net (mtiwmhc02.worldnet.att.net [204.127.131.37])
|
|
by hub.org (8.9.3/8.9.3) with ESMTP id UAA14749
|
|
for <pgsql-hackers@postgreSQL.org>; Sun, 26 Dec 1999 20:05:39 -0500 (EST)
|
|
(envelope-from rkirkpat@rkirkpat.net)
|
|
Received: from [192.168.3.100] ([12.74.72.56])
|
|
by mtiwmhc02.worldnet.att.net (InterMail v03.02.07.07 118-134)
|
|
with ESMTP id <19991227010506.WJVW1914@[12.74.72.56]>;
|
|
Mon, 27 Dec 1999 01:05:06 +0000
|
|
Date: Sun, 26 Dec 1999 18:05:02 -0700 (MST)
|
|
From: Ryan Kirkpatrick <pgsql@rkirkpat.net>
|
|
X-Sender: rkirkpat@excelsior.rkirkpat.net
|
|
To: Damond Walker <dwalker@black-oak.com>
|
|
cc: pgsql-hackers@postgreSQL.org
|
|
Subject: Re: [HACKERS] database replication
|
|
In-Reply-To: <002201bf4fb3$623f0220$b263a8c0@vmware98.walkers.org>
|
|
Message-ID: <Pine.LNX.4.10.9912261742550.7666-100000@excelsior.rkirkpat.net>
|
|
MIME-Version: 1.0
|
|
Content-Type: TEXT/PLAIN; charset=US-ASCII
|
|
Sender: owner-pgsql-hackers@postgreSQL.org
|
|
Status: OR
|
|
|
|
On Sun, 26 Dec 1999, Damond Walker wrote:
|
|
|
|
> >	How about a single, separate table with the fields of 'database',
|
|
> >'tablename', 'oid', 'last_changed', that would store the same data as your
|
|
> >PGR_TIME field. It would be separated from the actual data tables, and
|
|
...
|
|
> The problem with OID's is that they are unique at the local level but if
|
|
> you try and use them between servers you can run into overlap.
|
|
|
|
	Yea, forgot about that point, but it became dead obvious once you
|
|
mentioned it. Boy, I feel stupid now. :)
|
|
|
|
> Using the PGR_TIME field with an index will allow us to find rows which
|
|
> have changed VERY quickly.  All we need to do now is somehow programmatically
|
|
> find the primary key for a table so the person setting up replication (or
|
|
> syncing) doesn't have to have an in-depth knowledge of the schema in order to
|
|
> set up a syncing schedule.
|
|
|
|
Hmm... Yea, maybe look to see which field(s) has a primary, unique
|
|
index on it? Then use those field(s) as a primary key. Just require that
|
|
any table to be synchronized have some set of fields that uniquely
|
|
identify each row. Either that, or add another field to each table with
|
|
our own, cross system consistent, identification system. Don't know which
|
|
would be more efficient and easier to work with.
|
|
	The former could potentially get sticky if it takes a lot of
|
|
fields to generate a unique key value, but has the smallest effect on the
|
|
table to be synced. The latter could be difficult to keep straight between
|
|
systems (local vs. remote), and would require a trigger on inserts to
|
|
generate a new, unique id number, that does not exist locally or
|
|
remotely (nasty issue there), but would remove the uniqueness
|
|
requirement.
|
|
|
|
> Oops...how about defining a trigger for this? With deletion I guess we
|
|
> would have to move a flag into another table saying we deleted record 'X'
|
|
> with this primary key from this table.
|
|
|
|
Or, according to my logic below, if a row is missing on one side
|
|
or the other, then just compare the remaining row's timestamp to the last
|
|
synchronization time (stored in a separate table/db elsewhere). The
|
|
results of the comparison and the state of row existence tell one if the
|
|
row was inserted or deleted since the last sync, and what should be done
|
|
to perform the sync.
|
|
|
|
> > Yea, this is indeed the sticky part, and would indeed require some
|
|
> >fine-tuning.  Basically, the way I see it, is if the two timestamps for a
|
|
> >single row do not match (or even if the row and therefore timestamp is
|
|
> >missing on one side or the other altogether):
|
|
> > local ts > remote ts => Local row is exported to remote.
|
|
> > remote ts > local ts => Remote row is exported to local.
|
|
> > local ts > last sync time && no remote ts =>
|
|
> > Local row is inserted on remote.
|
|
> > local ts < last sync time && no remote ts =>
|
|
> > Local row is deleted.
|
|
> > remote ts > last sync time && no local ts =>
|
|
> > Remote row is inserted on local.
|
|
> > remote ts < last sync time && no local ts =>
|
|
> > Remote row is deleted.
|
|
> >where the synchronization process is running on the local machine. By
|
|
> >exported, I mean the local values are sent to the remote machine, and the
|
|
> >row on that remote machine is updated to the local values. How does this
|
|
> >sound?
|
|
|
|
> Having said that, a good algo will have to be written to cut down on
|
|
> network traffic and to keep database conversations down to a minimum. This
|
|
> will be appreciated by people with low bandwidth connections I'm sure
|
|
> (dial-ups, fractional T1's, etc).
|
|
|
|
	Of course! On reflection, the assigned identification number I
|
|
mentioned above might be the best then, instead of having to transfer the
|
|
entire set of key fields back and forth.
|
|
|
|
> What would a vacuum do to a system being used by many people?
|
|
|
|
Probably lock them out of tables while they are vacuumed... Maybe
|
|
not really required in the end, possibly optional?
|
|
|
|
> It could probably be named either way...but the one thing I really don't
|
|
> want to do is start hacking server code. The PostgreSQL people have enough
|
|
> to do without worrying about trying to meld anything I've done to their
|
|
> server. :)
|
|
|
|
Yea, they probably would appreciate that. They already have enough
|
|
on their plate for 7.x as it is! :)
|
|
|
|
> Besides, I like the idea of having it operate as a stand-alone product.
|
|
> The only PostgreSQL feature we would require would be triggers and
|
|
> plpgsql...what was the earliest version of PostgreSQL that supported
|
|
> plpgsql? Even then I don't see the triggers being that complex to boot.
|
|
|
|
No, provided that we don't do the identification number idea
|
|
(which the more I think about it, probably will not work). As for what
|
|
version supports plpgsql, I don't know; one of the more hard-core pgsql
|
|
hackers can probably tell us that.
|
|
|
|
> The only thing we'd need for Python is the Python extensions for
|
|
> PostgreSQL...which in turn requires libpq and that's about it. So, it
|
|
> should be able to run on any platform supported by Python and libpq.
|
|
|
|
Of course. If it ran on NT as well as Linux/Unix, that would be
|
|
even better. :)
|
|
|
|
> Unix folks should be happy....assuming they have X running on the
|
|
> machine doing the replication or syncing. Even then I wrote a curses
|
|
> based Python interface awhile back which allows buttons, progress
|
|
> bars, input fields, etc (I called it tinter and it's available at
|
|
> http://iximd.com/~dwalker). It's a simple interface and could
|
|
> probably be cleaned up a bit but it works. :)
|
|
|
|
Why would we want any type of GUI (X11 or curses) for this sync
|
|
program? I imagine just a command line program with a few options (local
|
|
machine, remote machine, db name, etc...), and nothing else.
|
|
Though I will take a look at your curses interface, as I have been
|
|
wanting to make a curses interface to a few db interfaces I have, in a
|
|
manner as simple as possible.
|
|
|
|
> That would be a Good Thing. Have webspace somewhere? If I can get
|
|
> permission from the "powers that be" at the office I could host a website on
|
|
> our (Domino) webserver.
|
|
|
|
Yea, I got my own web server (www.rkirkpat.net) with 1GB+ of disk
|
|
space available, sitting on a decent-speed DSL. I can even set up a
|
|
virtual server if we want (i.e. pgsync.rkirkpat.net :). CVS repository,
|
|
email lists, etc... possible with some effort (and time).
|
|
So, where should we start? TTYL.
|
|
|
|
PS. The current pages on my web site are very out of date at the
|
|
moment (save for the pgsql information). I hope to have updated ones up
|
|
within the week.
|
|
|
|
---------------------------------------------------------------------------
|
|
| "For to me to live is Christ, and to die is gain." |
|
|
| --- Philippians 1:21 (KJV) |
|
|
---------------------------------------------------------------------------
|
|
| Ryan Kirkpatrick | Boulder, Colorado | http://www.rkirkpat.net/ |
|
|
---------------------------------------------------------------------------
|
|
|
|
|
|
************
|
|
|
|
From owner-pgsql-hackers@hub.org Mon Dec 27 12:33:32 1999
|
|
Received: from hub.org (hub.org [216.126.84.1])
|
|
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA24817
|
|
for <pgman@candle.pha.pa.us>; Mon, 27 Dec 1999 13:33:29 -0500 (EST)
|
|
Received: from localhost (majordom@localhost)
|
|
by hub.org (8.9.3/8.9.3) with SMTP id NAA53391;
|
|
Mon, 27 Dec 1999 13:29:02 -0500 (EST)
|
|
(envelope-from owner-pgsql-hackers)
|
|
Received: by hub.org (bulk_mailer v1.5); Mon, 27 Dec 1999 13:28:38 -0500
|
|
Received: (from majordom@localhost)
|
|
by hub.org (8.9.3/8.9.3) id NAA53248
|
|
for pgsql-hackers-outgoing; Mon, 27 Dec 1999 13:27:40 -0500 (EST)
|
|
(envelope-from owner-pgsql-hackers@postgreSQL.org)
|
|
Received: from gtv.ca (h139-142-238-17.cg.fiberone.net [139.142.238.17])
|
|
by hub.org (8.9.3/8.9.3) with ESMTP id NAA53170
|
|
for <pgsql-hackers@hub.org>; Mon, 27 Dec 1999 13:26:40 -0500 (EST)
|
|
(envelope-from aaron@genisys.ca)
|
|
Received: from stilborne (24.67.90.252.ab.wave.home.com [24.67.90.252])
|
|
by gtv.ca (8.9.3/8.8.7) with SMTP id MAA01200
|
|
for <pgsql-hackers@hub.org>; Mon, 27 Dec 1999 12:36:39 -0700
|
|
From: "Aaron J. Seigo" <aaron@gtv.ca>
|
|
To: pgsql-hackers@hub.org
|
|
Subject: Re: [HACKERS] database replication
|
|
Date: Mon, 27 Dec 1999 11:23:19 -0700
|
|
X-Mailer: KMail [version 1.0.28]
|
|
Content-Type: text/plain
|
|
References: <199912271135.TAA10184@netrinsics.com>
|
|
In-Reply-To: <199912271135.TAA10184@netrinsics.com>
|
|
MIME-Version: 1.0
|
|
Message-Id: <99122711245600.07929@stilborne>
|
|
Content-Transfer-Encoding: 8bit
|
|
Sender: owner-pgsql-hackers@postgreSQL.org
|
|
Status: OR
|
|
|
|
hi..
|
|
|
|
> Before anyone starts implementing any database replication, I'd strongly
|
|
> suggest doing some research, first:
|
|
>
|
|
> http://sybooks.sybase.com:80/onlinebooks/group-rs/rsg1150e/rs_admin/@Generic__BookView;cs=default;ts=default
|
|
|
|
good idea, but perhaps Sybase isn't the best case study.. here's some extremely
|
|
detailed online coverage of Oracle 8i's replication, from the Oracle online
|
|
library:
|
|
|
|
http://bach.towson.edu/oracledocs/DOC/server803/A54651_01/toc.htm
|
|
|
|
--
|
|
Aaron J. Seigo
|
|
Sys Admin
|
|
|
|
************
|
|
|
|
From owner-pgsql-hackers@hub.org Thu Dec 30 08:01:09 1999
|
|
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
|
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id JAA10317
|
|
for <pgman@candle.pha.pa.us>; Thu, 30 Dec 1999 09:01:08 -0500 (EST)
|
|
Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id IAA02365 for <pgman@candle.pha.pa.us>; Thu, 30 Dec 1999 08:37:10 -0500 (EST)
|
|
Received: from localhost (majordom@localhost)
|
|
by hub.org (8.9.3/8.9.3) with SMTP id IAA87902;
|
|
Thu, 30 Dec 1999 08:34:22 -0500 (EST)
|
|
(envelope-from owner-pgsql-hackers)
|
|
Received: by hub.org (bulk_mailer v1.5); Thu, 30 Dec 1999 08:32:24 -0500
|
|
Received: (from majordom@localhost)
|
|
by hub.org (8.9.3/8.9.3) id IAA85771
|
|
for pgsql-hackers-outgoing; Thu, 30 Dec 1999 08:31:27 -0500 (EST)
|
|
(envelope-from owner-pgsql-hackers@postgreSQL.org)
|
|
Received: from sandman.acadiau.ca (dcurrie@sandman.acadiau.ca [131.162.129.111])
|
|
by hub.org (8.9.3/8.9.3) with ESMTP id IAA85234
|
|
for <pgsql-hackers@postgresql.org>; Thu, 30 Dec 1999 08:31:10 -0500 (EST)
|
|
(envelope-from dcurrie@sandman.acadiau.ca)
|
|
Received: (from dcurrie@localhost)
|
|
by sandman.acadiau.ca (8.8.8/8.8.8/Debian/GNU) id GAA18698;
|
|
Thu, 30 Dec 1999 06:30:58 -0400
|
|
From: Duane Currie <dcurrie@sandman.acadiau.ca>
|
|
Message-Id: <199912301030.GAA18698@sandman.acadiau.ca>
|
|
Subject: Re: [HACKERS] database replication
|
|
In-Reply-To: <OFD38C9424.B391F434-ON85256851.0054F41A@black-oak.COM> from "DWalker@black-oak.com" at "Dec 24, 99 10:27:59 am"
|
|
To: DWalker@black-oak.com
|
|
Date: Thu, 30 Dec 1999 10:30:58 +0000 (AST)
|
|
Cc: pgsql-hackers@postgresql.org
|
|
X-Mailer: ELM [version 2.4ME+ PL39 (25)]
|
|
MIME-Version: 1.0
|
|
Content-Type: text/plain; charset=US-ASCII
|
|
Content-Transfer-Encoding: 7bit
|
|
Sender: owner-pgsql-hackers@postgresql.org
|
|
Status: OR
|
|
|
|
Hi Guys,
|
|
|
|
Now for one of my REALLY rare posts.
|
|
Having done a little bit of distributed data systems, I figured I'd
|
|
pitch in a couple cents worth.
|
|
|
|
> 2) The replication system will need to add at least one field to each
|
|
>         table in each database that needs to be replicated.  This
|
|
>         field will be a date/time stamp which identifies the "last
|
|
> update" of the record. This field will be called PGR_TIME
|
|
>         for lack of a better name.  Because this field will be used
|
|
> from within programs and triggers it can be longer so as to not
|
|
> mistake it for a user field.
|
|
|
|
I just started reading this thread, but I figured I'd throw in a couple
|
|
suggestions for distributed data control (a few idioms I've had to
|
|
deal with before):
|
|
- Never use time (not reliable from system to system). Use
|
|
a version number of some sort that can stay consistent across
|
|
all replicas
|
|
|
|
     This way, if a system's time is or goes out of whack, it doesn't
|
|
cause your database to disintegrate, and it's easier to track
|
|
     conflicts (see below; if using time, the algorithm gets
|
|
nightmarish)
|
|
|
|
- On an insert, set to version 1
|
|
|
|
- On an update, version++
|
|
|
|
- On a delete, mark deleted, and add a delete stub somewhere for the
|
|
replicator process to deal with in sync'ing the databases.
|
|
|
|
- If two records have the same version but different data, there's
|
|
a conflict. A few choices:
|
|
1. Pick one as the correct one (yuck!! invisible data loss)
|
|
2. Store both copies, pick one as current, and alert
|
|
database owner of the conflict, so they can deal with
|
|
it "manually."
|
|
3. If possible, some conflicts can be merged. If a disjoint
|
|
set of fields were changed in each instance, these changes
|
|
may both be applied and the record merged. (Problem:
|
|
takes a lot more space. Requires a version number for
|
|
every field, or persistent storage of some old records.
|
|
However, this might help the "which fields changed" issue
|
|
you were talking about in #6)
|
|
|
|
- A unique id across all systems should exist (or something that
|
|
      effectively simulates a unique id).  Maybe a composition of the
|
|
originating oid (from the insert) and the originating database
|
|
(oid of the database's record?) might do it. Store this as
|
|
an extra field in every record.
|
|
|
|
(Two extra fields so far: 'unique id' and 'version')
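
The version-number scheme listed above might be sketched as follows. All names here (`Record`, `make_uid`, `update`, `delete`, `compare`, `merge_disjoint`) are hypothetical, the uid format is just one way of composing an originating database with an originating oid, and the merge for choice 3 assumes the last common copy of the record has been kept around, which is the extra storage cost already noted above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Record:
    uid: str                  # unique across all replicas
    data: Dict[str, object]
    version: int = 1          # on an insert, set to version 1
    deleted: bool = False


def make_uid(database_id: str, local_oid: int) -> str:
    """Compose a cross-system id from the originating database and oid."""
    return f"{database_id}:{local_oid}"


def update(rec: Record, **changes) -> None:
    """On an update, apply the changes and bump the version."""
    rec.data.update(changes)
    rec.version += 1


def delete(rec: Record, stubs: List[str]) -> None:
    """On a delete, mark the record and leave a stub for the replicator."""
    rec.deleted = True
    stubs.append(rec.uid)


def compare(a: Record, b: Record) -> str:
    """Classify two copies of the same record held on different replicas."""
    if a.version != b.version:
        return "a_newer" if a.version > b.version else "b_newer"
    if a.data != b.data or a.deleted != b.deleted:
        return "conflict"     # same version, different contents
    return "in_sync"


def merge_disjoint(base: dict, a: dict, b: dict) -> Optional[dict]:
    """Merge two edits of `base` if they touched disjoint sets of fields."""
    changed_a = {k for k in a if a[k] != base.get(k)}
    changed_b = {k for k in b if b[k] != base.get(k)}
    if changed_a & changed_b:
        return None           # overlapping edits: needs a manual decision
    merged = dict(base)
    merged.update({k: a[k] for k in changed_a})
    merged.update({k: b[k] for k in changed_b})
    return merged


base = {"name": "Ann", "city": "Oslo"}
left = Record(make_uid("db1", 42), dict(base))
right = Record(left.uid, dict(base))
update(left, name="Anne")     # edited on one replica
update(right, city="Bergen")  # edited independently on the other
print(compare(left, right))
print(merge_disjoint(base, left.data, right.data))
```

Both copies end up at version 2 with different data, so `compare` reports a conflict, and because the two edits touched different fields the merge can keep both changes.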
|
|
|
|
I do like your approach: triggers and a separate process. (Maintainable!! :)
|
|
|
|
Anyway, just figured I'd throw in a few suggestions,
|
|
Duane
|
|
|
|
************
|
|
|
|
From owner-pgsql-patches@hub.org Sun Jan 2 23:01:38 2000
|
|
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
|
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA16274
|
|
for <pgman@candle.pha.pa.us>; Mon, 3 Jan 2000 00:01:28 -0500 (EST)
|
|
Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id XAA02655 for <pgman@candle.pha.pa.us>; Sun, 2 Jan 2000 23:45:55 -0500 (EST)
|
|
Received: from hub.org (hub.org [216.126.84.1])
|
|
by hub.org (8.9.3/8.9.3) with ESMTP id XAA13828;
|
|
Sun, 2 Jan 2000 23:40:47 -0500 (EST)
|
|
(envelope-from owner-pgsql-patches@hub.org)
|
|
Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sun, 02 Jan 2000 23:38:34 +0000 (EST)
|
|
Received: (from majordom@localhost)
|
|
by hub.org (8.9.3/8.9.3) id XAA13624
|
|
for pgsql-patches-outgoing; Sun, 2 Jan 2000 23:37:36 -0500 (EST)
|
|
(envelope-from owner-pgsql-patches@postgreSQL.org)
|
|
Received: from falla.videotron.net (falla.videotron.net [205.151.222.106])
|
|
by hub.org (8.9.3/8.9.3) with ESMTP id XAA13560
|
|
for <pgsql-patches@postgresql.org>; Sun, 2 Jan 2000 23:37:02 -0500 (EST)
|
|
(envelope-from P.Marchesso@Videotron.ca)
|
|
Received: from Videotron.ca ([207.253.210.234])
|
|
by falla.videotron.net (Sun Internet Mail Server sims.3.5.1999.07.30.00.05.p8)
|
|
with ESMTP id <0FNQ000TEST8VI@falla.videotron.net> for pgsql-patches@postgresql.org; Sun,
|
|
2 Jan 2000 23:37:01 -0500 (EST)
|
|
Date: Sun, 02 Jan 2000 23:39:23 -0500
|
|
From: Philippe Marchesseault <P.Marchesso@Videotron.ca>
|
|
Subject: [PATCHES] Distributed PostgreSQL!
|
|
To: pgsql-patches@postgreSQL.org
|
|
Message-id: <387027FB.EB88D757@Videotron.ca>
|
|
MIME-version: 1.0
|
|
X-Mailer: Mozilla 4.51 [en] (X11; I; Linux 2.2.11 i586)
|
|
Content-type: MULTIPART/MIXED; BOUNDARY="Boundary_(ID_GeYGc69fE1/bkYLTPwOGFg)"
|
|
X-Accept-Language: en
|
|
Sender: owner-pgsql-patches@postgreSQL.org
|
|
Precedence: bulk
|
|
Status: ORr
|
|
|
|
This is a multi-part message in MIME format.
|
|
|
|
--Boundary_(ID_GeYGc69fE1/bkYLTPwOGFg)
|
|
Content-type: text/plain; charset=us-ascii
|
|
Content-transfer-encoding: 7bit
|
|
|
|
Hi all!
|
|
|
|
Here is a small patch to make postgres a distributed database. By
|
|
distributed I mean that you can have the same copy of the database on N
|
|
different machines and keep them all in sync.
|
|
It does not improve performance unless you distribute your clients in a
|
|
sensible manner. It does not allow you to do parallel selects.
|
|
|
|
The support page is : pages.infinit.net/daemon and soon to be in
|
|
English.
|
|
|
|
The patch was tested with RedHat Linux 6.0 on Intel with kernel 2.2.11.
|
|
Only two machines were used so I'm not completely sure that it works
|
|
with more than two. -But it should-
|
|
|
|
I would like to know if somebody else is interested in this otherwise
|
|
I'm probably not going to keep developing it. So please reply to my e-mail
|
|
(P.Marchesso@videotron.ca) to give me an idea of the number of people
|
|
interested in this.
|
|
|
|
Thanks all.
|
|
|
|
Philippe Marchesseault
|
|
|
|
--
|
|
It's not the size of the dog in the fight,
|
|
but the size of the fight in the dog.
|
|
-Archie Griffen
|
|
|
|
|
|
|
|
--Boundary_(ID_GeYGc69fE1/bkYLTPwOGFg)
|
|
Content-type: application/octet-stream; name=replicator-0.1.tgz
|
|
Content-disposition: attachment; filename=replicator-0.1.tgz
|
|
Content-transfer-encoding: base64
|
|
|
|
H4sIAOeZbzgAA+w8a3PayLL7Nara/9Dx3iRAME/brO11tjDGDjc2OIBvTm5OihLSAFoLiUgi
|
|
LGfX//12z0MaAX7UOU72nLpW7cagmenp6enp9xCwmetYZuQHxR++1QM7pVq5DD8AQG2vxv+W
|
|
d3b4X/mUsKVUrdbKO7XKLrZWdkqVH2D3m2GkPfMwMgOAH2Z+GI0DFt7ejwW3Nv7nPkGy/xfm
|
|
NRs5Lnv0Ocql0l6pdOv+l8u1XbH/e7W90m6NWsu7pR+g9OiYbHj+n+9/q904vzppwhFst6A4
|
|
D4NiGFjF2Tj84vJPjme5c5vFra5vma5sl21FwzBdFw7AGwb+NQtgKv8mvAXYZWw6HnayXGZ6
|
|
d3cVXQ7AeBZMYXukOquv0/TXZKhhGLJrjEvBAvlhYjwbW5b2/r8ycu1Z2B6P7SFs+8lMYRQ4
|
|
s/irodA8UJOvwZ3eCXeahqu+GtqyD7SFrEFPNW2aQAO07c6+wPb52m65ztB49pNAQKfZzIys
|
|
CR7DA+OZ7YxGm5hgaFrXzLOLkeXPiuqgFHzXfnhnC96A+rJX2C3wWf9q5n96dPn/zbaH5L+U
|
|
95vkf6VUS/T/Xq2Crbt7ldqT/P8eD+6NWS6V8+VSzXgDxdzlBfTnDE6YBZWfcSMOqtWDnX0o
|
|
7+/v54rY4yelELaKE3/KYqYpaoykZNeW3v+XcBkWo+UMZcHkzVqDM7M2vZ6GY/G6mHuOmHEM
|
|
jL3arrn3cxU/PnO8SJmQan72fs6C5SH22sNeu/m9/T3qWcwBAuiZEfw36hYoQ7l6UKrg6pD9
|
|
SiXqATlojSCaMECOiNiUeVFIHwP8s3CiCZjw6vWrPPXw0t3AQkqEMAr8qQREzZpIngW+xcKw
|
|
IFppEepxRpD5QggPSDB740+lz3B0RBNl/+C9Ya0VXsGrQ9GWXjM2ndbPe03eeEP/MDdkf9zS
|
|
td+90npy+iJme7V9a29/3/gFnskH4CVHIQoYG7hOGOXBZvSvabmdrywIHJtlD43t7W0x0V3P
|
|
gyDVSjsmegH5WnWfID7j/6wxZvWgXDso7SSMSX0APuC2oPKRG4DThRHR2IleheD5Ee5hyFxm
|
|
Rcnm8aEgQcTbQThuv8F9nZqe3UeuhedH0Lg4GfSa581GH16+XKGo2C5cf69/0R9c9M4AefdQ
|
|
viM2nhLvI93L6mU4wy2NRhnRyH5HQmy9fhFu5VNbnlXd5R+HDwg9m/6MWZRpd06ag3fNj3lo
|
|
XTYGjW6z3oc/obS3t5fNw0vslAfEZtBr/W8zD6Us/FKKUQWYIdX9ILN1EY57SLVTE81/+wC2
|
|
4lk5cwgWURshGOWvlltPz+M8mtiOTdjHnuNu/69cre0I/69Sq5XLpSq1Vit7T/r/ezzFnP4Y
|
|
/YkTQivk8nOGhrw5ZvgZtSZ6eP4ijIWe43vKnO+9PwfbjMyhGTJAF880bIdk13AeMRuY99UJ
|
|
fI8kbcEwGv5sGTjjSQSZRpbLboDLieM6M5SNF2ZgTVBNMnPuRoZABTXnODCn4JB6ZSjc/VG0
|
|
MAN2CEt/Dhaq8oAl06GcB5TXRXRrpj46M0t6Mfds9NxoQRELpiH4Qsmfta/gjHksMF24nA9x
|
|
VXDuWMwLmWHivPQmnOAChkve/ZRm78nZ4dRHqJwKh8DQNsAJUIGFRJWKISeQ0PKABkAGKYgI
|
|
B+DPaFAWsVyCi7ojHldYX7BORkeYHBN/JvcDF7Zw0OseMkDOHM3dvIE94UOr/7Zz1Yd6+yN8
|
|
qHe79Xb/4yE3X3xsZV+l6eJMcRsRLC4mML1oiUQxLprdxlvsXz9unbf6Hwnt01a/3ez14LTT
|
|
hTpc1rv9VuPqvN6Fy6vuZafXLKC6Y4QQM24nJ4wQ0tRHqtksQgWDZpDxETcvRJTQg5yYX0lV
|
|
W8z5igiZaEvNlg/ZI9f3xsIuizS6HZK2R0Wfh0XgIENE/vruGcnuodL0rEIedvehz5AmDC5d
|
|
02KwDb05Da9WS3k4Rj6nrhd1gFKlXC5vl6ulWh6uevWCkTo+RcNAIyQJBRxAf0J8HdJGyxCC
|
|
Iw+XMAlxBeh8R7gJ4RS3lO+NhwSbiiNGJiXSxPNtZqBZqT8CXKhW6HMmpI6qpUCTI9Vxwvbr
|
|
8goWwdzz0LoA31uFS8Cmoi9KPOROvn9tDlrABIQmV4FNBI0vwrQsNkNb2fI9Dy0sRB93GnIE
|
|
/wOZzAsmrGkSJfTF9gnKwsRl0wwEThuKJ5060QmnSQiKJBla2pmLeq/f7MJxt/MO/1x2Ow1k
|
|
02Yva5Ahp7kQke345ECkXrnOcPUdWVrpd2gYeatD0SEJfeuaRen3Hosc/L/oeOv9aXn01rjF
|
|
D9roBW30gZK3W4l/ZRhffQeF7O9ONPDwZLjLDLlEhOWpjXYeWnjCyNxquH5IG56QmIw8AAvf
|
|
s4wcQC8IVqaEH2+ImXsT349WvB3JcZp/E02CuWAc3B9SGWjBzhluBceuPpu5y54anrEmZpBD
|
|
cFM8pN3mWavX79b7rU4bN3gQsDHHmhbxxbHzgIZul4WoDwi1tGmNL4q5fhoRjzGbH4lrz1+I
|
|
07NANzWF2wLFO/oIEaKXtsz59NtvWic0F/KENVvqxjlhzClEExdPjq/OFG0z8sRnIV7lAbxA
|
|
Nn1hZ//uoUGfghK4zEvgZhVIXC7iII16wd+3mPWHdyDw/qp1cnAEL2w+L8JU4BO/gdP1NsdA
|
|
dwq0ZZFLNeLOgeAaDpID5eeEXNZm5bh1lo35jX9VnVP96g08rFpH/r23setp/eq8r3Xl3zf2
|
|
bJ10L7SO9HVzv3a/q/fDr7f0+5/6eaojft/Ys91BKmo9+XfVkx+mMv9yIw6UMLNQmnFtfzr3
|
|
+FnkTqsQqa0TMmHAM6dMqUEu1ddOHXq05ERzkcfPWZeNHZLbbeyeuedkrR8nvqX6cUAIg/7H
|
|
yyY1Tdl05TRwiMjOzj+YP0rN9k9y9L/Ko2L1gdCbTwz7CAyLNOMsC2/JjOD6GY1G5B+p/Ikt
|
|
j4UqSpiwIfRL32/HrasqKeY7cjAYl+w6/yCLJzt5KFnWm0+PlxELlfzNdYXFmOrMJTquNaN6
|
|
I/uhZflVqjfkKL33ZuYlNruTzxrcaqVYUrABhY0HH9JHM4WF4v0FekEMR4hp9RWmI5J8jXJP
|
|
b1+noGy8QkXx1dU9cIVqUSsLo+fGkCeLMG6gsYD2/oL8ExRXaOBJ+5zJrY6RD7HBmmTEWyFx
|
|
NJQs8iTPO413g8t6412zf4BWKzOvD1c7nMXtsgPi0PFc9Kz8cXrWZ3wEJ4MaYjyDVcskRodL
|
|
uNVtegYb0Wj+rZUA1Q0xZVMhUk18jQRFq95dHnC6MImjE/4aEwXQQRqR73uQkGKzpJPO0tC0
|
|
5RIpQqnvEmzEJG6+0XYP/yczihzsWBkJz8xjX3lmNpoHdLQS9bX5KRp0UKem42X4Xl6z5QDP
|
|
PVp7zH7HlofSWsPXSpeJNlIufrBEFTieCrNMnXm2uERNBZufYu4qFO45+QeZrDj8ZJ6awwYe
|
|
Jjtg3qdy6fOhWp6UWqQzSbGSN27JfvFYLx6KR6q0OnUaEB6+IdIHQXEwtKfclw1n5sJjNoeJ
|
|
7DO3hPAzbTsYoBvPU7J1/KJEGzWOUCELg7aH5J6tOEO4yejaFnJFgYYUcmIc4il8kkz9dNBq
|
|
N/t56NHJQYHWrF/I8w7Jgb/ruFvI3hGT8BJ2Sh35G6VM43UU0K8YjMyp41JaQWJxuNZj5geU
|
|
qzmCCTrTYeYcRW6zPbjsdPvZ9c4m/8D/xCNa7frJSXdQb3/kZ+qcBqApwek8/AcL/MzLTBoK
|
|
vUSR93NiASDdhg7aF4J2ecisbFAuCy9jGLH4XOn0UP1AM+n6QD+OI1vqBoEUMSV6BQqtYxQn
|
|
A5Rv+tbdNZMY/pC5ijkZBzj3/VlIR4cOOadhWgNt4Fyh/RXrxme0x6VL8gop30O6EWNuJB+C
|
|
ynLvjbD5oNx/buNqnlzC/VI+cn2nZkPgItxw504mCKM0l2il1d9dRBUTrDnMd1CY20xSHxZz
|
|
ZzzdlAx/LheiVi6iJxTdYYs4HCS7o2CTMjIxstJ0uOTGtRB9Dz/h1P8hy9ARbY1ItJnKBOSy
|
|
TqKboKQwOtIQ4fLeU0uiBZHM1UJU4HsrsatCoRCrw02WZMwA2cM1raWrq8RM2kSNNbixxuM2
|
|
DwF4/vxuMpGkNkcsWuK0UYK1IJxIuyoifCCdx2YU5LOuVzVPgqiuthI19Po1ZXylJkzvzQ0Z
|
|
FujmIMQPdHQ5pI1benNH/H9D2v7Rcwz31P9VK3s7qv6vWqtWeP3fTvUp//M9nqf8z1P+5yn/
|
|
83j5H+MnZ4QMN5K5icHbH42f8KvjMe1N/Oqi/rdBo9NuNxsUfOhRLVLcplnIXEzu7lSrCSxp
|
|
ISrXpFJKA20fJ63VSnm/koBFt/ld86MaCLs/7yRDVTGJatwraQglITw5Eo9v0ijDdGpkuVRB
|
|
uCLWf3Lcrl80j/7Ymi7t4dbNITmSypMKtYyCTDvRN4asTSzOQwkUjyS9ibpcSB6HDKUppTMc
|
|
Mg+pA5GcoUsGf2iBg7wWJMjr7n9ed9vzcajzBujfQeeyKaJBiKmCLczLP7ingvPykAXPn+AK
|
|
gUcMPikSfBbRMxV0wW+x4xwHfvG/eZhIl8TK48HfdaLIxsiX0SYWGLLMjKxHbjrBBZ0xPXtI
|
|
4hHlA5rDOJOQNsrz5quZB4xH7mDDKosS0dYJODai44wcFgIzrQmX+3KrkjQintwpNhIrZBx+
|
|
egOmEJiitPs1cdBFgoUTboKg2uaUfdpFR924ScUBkfh4dpiHchyxUYcnV/zxqfboER/N/vP+
|
|
mvof2N2t7Kj6H5Rxu7z+p7z7ZP99j+fJ/nuy/57sv0et/7mj2sXxnMgxXQqP3RoA/5E0O1kB
|
|
vDRGi55oJpKQ1NQT1eG/Y71Lqre9Mn/kTNm3LIrR8hA/bg7CCzKqUKbsQMYIGV453Lcx43aJ
|
|
nibPqTz5vclKFZN0WfxKptyFYXVpBiFTMf6RM55Ls5YuX8a1UEJSiPiQIeNIVN4iA4ZyY2Tk
|
|
6RHyAd6/kg8oFvudkw78Znrz0BDRLEVCRAg/EW2HSypqyGzxXgpcshO35hHWOt2TSljrz9MI
|
|
1D+XUWFi+TaX1TDdfjPhLyUQlVNYAbaeVNictFH7It/cGahOprgr57CeCti0Z4m8EFBvyUhL
|
|
vCl+OA/Ws+hyy/iG6Y0FZbFz7halSDleFIiOgvI3vCWKZJTOc3QbuPAzw+vVkjFicx+dBREV
|
|
D1EL2f40Q5Ih0746PxdFJKmZcYIjkP14KxKXymZiyj4wu/9AQvJrJA9I7ce05NKeRtHSabVk
|
|
FKXlNl9rqizmjtsbh2s5mdWCnY2lFEmCQaVWzLj0TZaTKlP3C5oP4rJmEsEWNTiB9fWWGpx0
|
|
Gc4mEqq6BVF+A4nc0LLnWoJBrwFAumhO+qFKRFHpUTo3n5TPpYt70vxwVxXEg4UggbxD9tH/
|
|
SQ3lveH2f7tH8//0S8aPOsd9/l9V+n90/3Nnl+5/7JbK5Sf/73s838gES16i9Tn7sj1i/HUx
|
|
d+b6Q9Mls+byjHRVDsYDSn3hx08UM/x8aKRLF3mzXtynuol6DUrihTyiCOl0oFSEFGcjISys
|
|
JinuVOIxQMM7EtG3cK2A7Vj0lfXKswGOp8m40ODBKzlDT1zN3C1XPitrz7nXTDTimsoVrRXP
|
|
c0dtJVU5kd9IMTkqtUZVnJoAvFd073GGfqzNfjPR0QRunl3URbUEKoWMc1RCX+mXmILw+rUj
|
|
5WEx1/ahzUQOWoUbxbVXzxc2KDbI1cdqAzLrO+V8pnJqygqvaPKsFKcy26rp0nUgEkPKg6ar
|
|
WrOoVV3XtzK3EkqS+HaYDzIaOKjXr2NekwpH3d9MsUEetshCOnqBHilaqUe7O9UK2EOyo474
|
|
3c5bjCkRIt9Q2v0i5JXcqUl4t+TYOESay/eyiz3MrHc2xA5dvqdCwnmY0QdnaX+SJMTguH4i
|
|
GUHEtsPxp3Ll589SGyeXVnGpf/cyidKIM+vEgcgg8vzAi/AAqSGLcR9CgFgf4yR31lHH0fT4
|
|
0JszimuE6xcWeOBann4aFqLVazHun6hyLq3gWomJxcQXsfj5Csik3pWXEMbCIswk3uJsQPgT
|
|
JS/PAn6HIYeLD1My4iGH8dQJ8DBbE2Zdg7NyQR1N7qk063gUJEacAg7eqwgWplg/t2adSNWg
|
|
FIsOdzGs6Uz5Y2SBEjccHaFxhByEPv6cxeXDfDHbb6SVdrThmIqzvjpUzpbiaHE3QYcodzng
|
|
layX79nvzEqxqNY7sfmIoZ/TkD//xDGCwj3B3vglSxe2L8+6zd6g0bm4qLdPBp13scEXc3al
|
|
tKNY+x7mjstEgY6kNG7JvCYeBnFMdSzzt9BIsXxsUq4xO9BhdpkZ8HXoBrNeWLK5zw0SWxaF
|
|
IG4rNZv8qlcH3b3f/CFnexLvps1DVNoNmVAxVBxu4gqSx6QUK/EYFQ7z6W69cnKUCaXYMOR5
|
|
IQk8uT8m7qT5VKiLHiBBcjzkaM+S50+Lgmr5pQQdylmlq01/TEdZ5DECPFXGP3ET4T6n6y/x
|
|
r1a48R4nS3xedbUc9WMD8hCrtKdWsfVBFrGmNT/uiG49pau/YN1kWnXPbimLEjPp8qwg5Knk
|
|
KV1oJ3VhqxL3ZXJyhEP2H+V//dWP5v/hOTi5aH6DOe75/Z+9cq0W+3/lCv3+G358uv//XZ5u
|
|
EpL7WiqUDcM4b120+tzy7cmUlMwD0sVJLrUj9C4oeUTJFxd1/e+Abp/HXKgUKoVymaR1l9lv
|
|
zUi27hVKBaPuhn4qrUjgdEgC9II8i+lsHtHdZin2PYavg+sC/T5P6E8ZaY0JjqYK3ZCrMZ5e
|
|
kvUHiDAqxlD+EAyaRzzxRj/lQvdnCpChlFX9qv+208Uu3G40hsz1F1nDaIEZhvMpaSuyRma+
|
|
F5pDx3UiMiqTZJPKQhbgCueJc3sLDwInvMY5WrCgAJLhOtdSlgpx7Xi4skS9cp5aVXp5uvqs
|
|
sptkTZkWKjpmmOIermkHjJSx+MaErufjbd+aCyHa15EEDpptm+5sYj7nqTeOHIVqIz8ykeoX
|
|
9ROO5dwTkCiLS+jYc2FW60lcIABccg8Z8wycwWN2gX7/Rz2G8eEtatRWD1r9X9Ms5JAVQQUp
|
|
28JHSFLISLM4yUz1IiKlzHeRLqGHwEOlvayBo4gZcPR4wsgaoIEhMnEkbPTIxB0Ac8qrdAiQ
|
|
MF5wmDPFJSHeqFJ5jQwaHGEhjZ/tM/E7QUuGQOcznmmg2xeui+zNf5cHKM3KE3NEQwph2+YS
|
|
9TtZNgbn4YW5jPdvw5x5osJwmaR3ycARPMrr1YntDfolPjSUKMQh7Ka6NKPk7w9pP1FFW6yu
|
|
xseRDX6JQZ2IgsFvRdH7dqe9fSuImTk2tdLzpCybUrbG284HOOk0aVfhQ6f77leDJzrp1wm8
|
|
NZi0RpXUzSeXWMmhinO9aY+KZ7P5kU3ngFNuHP8dJ7E2vY48idvym+ho1dG+slQfHr9Irtmu
|
|
XWfXTF2+l15ee21wSPEvS8WG8vpVeFXIv2K5huSnye0wMglgDWPQadx/2wTyW67arYaIINUv
|
|
Ou0zXqDXk+Tvo9xzPGeqsslobwUmUsfkXMCNJ48Hktbse9pvrcbMSGWV11eP7hJd7OPL5dfp
|
|
nWidFgU4UTxrqHKQUJ6SuUx3ciqIWfJJMQJ/q1yAAjQV4kAyPvKN2BDFI87cUXKpMLVlHQHM
|
|
JikcOrTsib+gKwF55TXrW2mjQBdLxAW5hB1tVhj5MzprkAkZo/pCd5mFse/bVANnamKTV2CM
|
|
jERLMJ3hN/pPYqkyPSQYk4nfjdM8mwIdNC491Ulrd+izsU1JYyGFxA+YhbLjqx78X3vX2tzG
|
|
dWQ/79T+iAlSFYlVEEyKlmRbqexCJCQhoUiZDzv6tBkSQ3BsAIOdAUhjf/32Od195w4ethMn
|
|
dtUuUXFsAjP32befp/ueDN5eppdn8th/yHOXEZAy+z4v71WDn43lt2+9dJ09IGP+Xo4d8CUL
|
|
2Rd5qNYFDKadHLeQqiAN9MGg00tznd5nfEoZrBH67WTVCw8Ko8dshd6XdbqcA3MxG+9ZQZOe
|
|
DTa7Lu+1WFsB2xECXv6vYpLepByPeYoc+omXPoihXzyTcwPblSJxHkBJpFFKJ7cvXazizeEm
|
|
N3Y2s6VaX/omv8mWTGjNZ2sPweAUTQFnqc1KMjtzsqBTUS3b9U1aT97K2a3vxJhNP+TZjEZn
|
|
d53v8qSOSiUZ7qIJ9xsnxbqc3CtKSWaNXU7Xp8OF2CiKF6UXyUHIyHbrnMvcFKdBdZiGkRj9
|
|
Z7TQ1sw2DmqKsr3ZTMfvW+ioQll+1CCxc0BIAaZRyaFZgGaS5M3VuwsnUlPzfJE7VonvtS2O
|
|
/9lBa3Xuf+t6jblewjhSLBCcSYRQLR4KsJchZVPGCiEzEeZjTn74ZEq5/9+E/dTLClirB92m
|
|
H7CZohD04hMU0RxfUm6yAs6HYXifalH5ZHvpRankkKXXxUJdstATqijR2z0JHSQPutM0dIyn
|
|
R6CxDoBEo+sOv6HkbKz2abnWFouxQBdayrFe9Ri+gDNR8Qq2cJ1iflN31MeDrvB3Ne3YIa00
|
|
AQzrIntPtYobUf8OUkuV6WQnlE9+6Pk35X/eCy8tF6JO9m6yRyjvb/2J7P/l/F/Uh9j/r37E
|
|
/k+fvzww+3//4OWrA9j/+4fPH+3/X+Pz+999Jqzms/ouSX5vduNNVcyDOrCcwxosq1FefSVP
|
|
HOzFCoar355bKQ881wdOt2nn8vPhnmkPQTzFbs2k50Uo/yD/OWv+s3nmD7/1gv0f+0TnH5rf
|
|
v6SPnzr/6SvD/7/Yf7XP+t+Hh4eP+P9f5dOcf1gC0Pv9cgL/e7b2d0Mxv/XYHz+//BOd/6Oz
|
|
j5+Gp+/++X3I+f989/n//HD/1cs0ffn54YsXL/afHwL/dfji5WP9/1/lo3XAmWgxOB2c90/S
|
|
j1dvToZHqfwzOL0YJP/mNd2/8QSXbvrnpZhmB19+eZAk6XpKzxdfdvnTzowZzblI0p2fl69e
|
|
oFppnfbvxSI+yoQhFaNxzgSM/ecHh18y9SJJB/d5tVIjERb9tFgsHJo0X9GSiTKD5Nlr6X6K
|
|
H4u8ToLXfGJZKu4979LBe3OXzeh5KGgush4IXBpwdSf/pmvyscplbJMctUwuWVmILdWW8FIv
|
|
Ghc8zfi8LsYzc69m38uX5iiuEqQ2jeAyKjUlhIPnEJDS1EvTNytGAqoM5fK3Z8Yknm5D//1C
|
|
7H3tarzMkOKTWxzkx7rCb4mP+dkzQsi/N2tYrfYmosBsLEa/malbayyjl1KHTHZkAgV4Tqnr
|
|
Y76XXdk5T6IghoEPguUZIAIPd7C0s+XirqyYFsw6umWyrHX7ZEhPL+BT0td2UWVrcjfwmtGr
|
|
kvhinxTXVVatduU4wX2ZZ6PeXsr4CFz/WRPJTrj0NmJ4CsqyB6oJmULzPGNpkFaOW9frrlT5
|
|
LW5HoFPDN7ALmkzmFX0aBHVsH1m9QXvxnmroKgmB+Ig6orOjR2ZjfOlTo51qTFJINDaWV/fS
|
|
tTs3Hor6bq8bugrODgVgSdNiJ6DcjyzYOEdgJfEXgVwqFtGrjIcppbaoUV6Hl07GeKOjRCMz
|
|
IhY4Xl/31+bdsOZYItbbHZXmSaKTrebuXJZ4dYFsCu4fuVzNXXEQJNdShIaslJUtZvOyGNfF
|
|
KBFiBXvCYuYzjQlpJ9oSBg6Srr/Xn0rsSpWHdEV9ipGMer0XFIxG/iDZXV4tcLeWuaKLELDE
|
|
8UTLuqLJ1h2NV5Kpirb8IX+SS/FWfsh/yJAh1/UntjZXL1EJugllPtzlOHbJGO5bzlgRM7e5
|
|
NMR+4CQdu/tLqKOYaxDMgxa2VlhXHCM6unp6yvjuGjnDCccD1g2kFpEXgh0R5Uk7fSGJMI6a
|
|
Pr+7fOrEwIRRjVuvlGDo0kt8axLeOLKFSixI8SB7usjn9Vfp04M9yiUVle1VF7JMnorhXCJ+
|
|
YmQSSaaHO5TXxhrV/HGSj+WYU+LVtYEt0XQ33mFNg/VtjPvjqBGN73Iv6N5V9vmk9qkQS6m5
|
|
QUrwjiPkahvBJVzw3KUw82wRxBnVYSuUnc7KJtFUXeEmQJImkB7u3onYMAdfqHfbHcIAEXFo
|
|
88zinRhfYtyijinIy0BzMA9OHOopNZlOd7tsSTHLJoi765QgZGQhRLRPKUsZCNdhqNdTXa5a
|
|
LeqW18l4JStrKzF59ATW0nKRWTV0nCT8PFl12UnMnhSHyow7MGoR91jLhYgQzt6E4xw/w5cM
|
|
ugNvJQchElVTjoQ7VjrjKDG6VJRBposeJCcmUcxGxX0xoms4La/JSLSToM904QHKhTZveNos
|
|
vSk0g8BxVeSiRa96xjQR2ltwm7uhLOI0GzE/m6jF1NfZJqTH7zroUIqsdNJ6YuoGuDxLxS2a
|
|
5zQ1vec62Bz7H04u5VMpM1Suecu6cvMVcdpN+F1pXTPG3aV/W0Lb6yX/nvykgiy/Xg7OP1yk
|
|
/dNj4KiPh1rLBUnTZlJ102PAyYdvrjSUKw9+ODsevrXYLga/bzGULaqSkSMXG/Ec6jHEQChn
|
|
IEZEFJAEsacFZO8cScwhdb1hO3flBMKlzlam2k5FA73O40zzZGu6vAxsu3rR02XvfNTxdUR7
|
|
BpSmm1BnCcOnWIjmgNF71KnDqSBcHNAL3lqCiFLtGfbRL2iDIJG8Ku5lx+5zXRAdfDPhSfbw
|
|
lZ5phbvKzKVbfdaWzbP14paJ3QcZUJnoJk2Kv9oQmAH4e0wytbPcIJuRtc/5c8eSiZzNZTbG
|
|
kj1FRWVhBLcLhKf9BatsY5kzo5C1PylEpbWfZ4nvTNqJe+9A82Tk3E6GYm1Go0phIVmddkR2
|
|
dOSg9IW936uCUNq6Emi041y0JkllEopnoyErdRg5vFYWS61suUAYnky5ltadVIBmKm8Tx0fE
|
|
S29M2TUdIEcsHoXWWsgpeyWJlHXGOXnXF8sokk8qGy0WlIjpBqEl3vNTYYP5HKrXjFaJxlKJ
|
|
cFLGJfPcMuK9HjAwvohKZNUS6vbc4c8ud8IkATEiuzpQKBWO4M8wWF1Xs2ae1LEeg+2NlWuo
|
|
zcBwyQmZihRYiiKGwHdBk9CVfizNvLhZlst6or0LzyEvF9qVb6wERsCS2CDjp5LmpBnnsUnc
|
|
TLJiqsVxXfK/1iqIheLZTLtL9LXaJdatlyiKOeEsIJIACLOYPeYWmk7wDJXIxj6MFIH20rXw
|
|
RU0/7YoS4WkrVsFdUkuHymsI6c/vVnUBPJLStR5mN9e0J1XwVtZKuxKI6XxBPYr0LwjdH9wy
|
|
d6WZlPO8oRzT7wziiFlV2wnGOaZxtkQ5W0pkBrAQOtydrLhrslTpNFY0ydrbjNAY/LbKKxc2
|
|
uYOEsI8tdCmkIQr3NM8XjkLxEL/L8a+0eGy21xgBitNg0Nt1RhRTIM+/kbXlwsocFVBHkrPk
|
|
YkJSIhuT6608R1twDjSCtWWEp0/1dBzXG+MgbRKj6M1G6wWAiZ4sM22LGZtRHBIMMNZ7XgSx
|
|
zu9qFXUxtqi9sZZLXZnaXd7CCGppVADMWi8ZVsHpGSKKp7GoRqEVENAuTcBFv07/Zs9V97D0
|
|
Lui14rmwEiB16gZrAPdUlUEMCZ+xyQujFQYb2YS6lKBR/shiKtK2c2GcCJCeJjY0DVJJtGo5
|
|
6mOqRiJpK8IyuSZM8sGqyaaIogSCVnqazcqlcBeFL1II81C0OF66leNlbMC+2G37PIVOO0FJ
|
|
INPAAn3YKdBxhBf2GoeFIoVx4lv1jsiBbLW5XWxh/cCYGM0nE5dfaM4BLvdF/rDGE9lKo+E9
|
|
HfyAUsjS1FeOpQsiWyF5RWtT4UFlEwFOGyhBF1+9BLPWkneVibU4kM9mU0MgWGzUbOdaY709
|
|
0dzdb8JnDeiq2KWyTa/sszkeNEYTr3qXiRnoOHE7AzAn+YoqQzuPZpdyyapiyBmoy5m0Rlcu
|
|
VKOKGmKjd2jlP6CNF6rO1qbvTWWN7w2vOGudQd1ZQp9xRLsOP4rmWYpoC8N/CCiiFg1NU96J
|
|
FHcNp/NyEV5I1oiOmM3QLApmZQr3chajpgmRVs2WJutChYw1VjhNaGkbbhTaW86FkvYKqAO4
|
|
cYeonadKgCvDNcCxeVAkEmxtZd24kql54uoOkS9ofOq0qnycVaMJEXG3xDA/QEyrc+xSXuxG
|
|
YQKMlP73RWCYdZMUS8Uo8v9RUa0XSew6coQ7KlhJO6kOVh0B8tzrVHbpjoZD0xXNmyT/Ia/U
|
|
/HXHmWVqL6pysnWxIwOqrESdm8Cb4eZUvVUVkDkPieEvNJozJTpsPMYqebMOAuQ8WAt/S0PJ
|
|
uq5FBmlXLe/URPa02OR9OVkyYTcBALOsAHRTnt7MT3XfhgtdV87/otEp2yRNw0rZKuUOf1xV
|
|
X5/C+uhhQqowdfXn+R5RNdffwafiPnBFKpLfQCPbIn+TCz9xBxzDcwPP7lCihBnAZWZnSl0a
|
|
sgKN/tS/Qa4N1BWWk7PdwHeTnLKuUp8yBaGh159BmGOQqkA1RkjXzryf2rh83W5NUGVNezpa
|
|
TVQ370ZaK6dZhapQS3cMNU5CCB3Vxl7LEnaDRrY5syycJ6rc3fQ+mxTaHBKVhDsv6H/Tea3y
|
|
rGKgpjErqCCRIay6ppCbBjWzBCQa0gzoUTGyCJdbCJB+ijsm1FwXLqbXLqWwrj1bWF/xVl5E
|
|
e3Na+0DFTwXwz9uD3euvM/kH9uBmF3XFBWojm5X6qSeeYINU9q/FoXZMGToKvWfZRMYyU35m
|
|
aoyFbdU9oDdNzJgMJpxSzLYNd4e7ESD08H4rXSX4an/y8HK+QUHNAtXBLJd1qSzp7WJ57dLh
|
|
WldfVBfmrsTm/W3DVNQjpmNhWFC3YxokJx5CMM48tW3LjCk0PY3nZLN40OqRC0dfe0/Yu3bp
|
|
8ZiNcQEyVI6WsJWKxmoRy26yrGmZZHVd3hTuEJMjgAqLLBtdhLJ3/rzyYQAQrUCySDWXXxhc
|
|
YX4yqj3wkE8mWaw4NDOSWb73BBDodkk9z7njuSuz3Y35xMeFIT5IDfPHJaGsmbt6glIbv/YU
|
|
Zru6C61lWaNrWiAJ9mkvTl/5jhrAVCia2ulTnSFGrAmVqprUYON7NsPEUiFwAFb1QlQ3Opl4
|
|
i2xr/rCUZFWXM+otHHPoKjG1PbMT6iUzotUTIX+7oS1ErUPFik4AojXmJyOhIxOhSc8k37Lk
|
|
eA1Fkxropjat1t9Koa4La8Yo1xrYoD5Xt6mMsjHkRVDPr5NtamWLS1r+Ubkc30W8vbCIuTo5
|
|
p/Oc6TFbhrDmLooWA1GDNP280Rm0Lh8cQequEfuPTvRQHna7LpEopYJ68x/mcOTSgDJR7+w8
|
|
UlUQzYSDSahivkio4zxQGyx3dr+7d/BPxJWUBhkrypYQAwsTZpAiBTayFffcMqwknENfYKjQ
|
|
7bqv6rPiYniYndsLCeEaWuQTDPE3Ry4UVQO/CQPj0eE2wbwBL/YBiD2IQJf873Y5Uc4yKZh+
|
|
Bun1QrfOzbvY2rRLgdoWSF3AKenBaZKOwS3IbMP0LYVHY5hjmPjqtm2Hcs2lJyx8x8aUzIRb
|
|
j30o9gYWb+ZWWcUg3V1xXSzUVT/JHkL0vowvOVojo0t4L3A3VBcQGB0Q07CqdtbqhuLrjq9t
|
|
TvY9de4g4HgTqEb7b9X+8D3WGwQRpobH0WFGf09gT0cchp+sLeKaiWNQh5eWgYhSlKag/Jiq
|
|
/xMzXsSghrUDZMQPE9lPo7O0xAPJ9osiRfQQt32JUYDfxyWn+zstAZosIhhDe80cQmHsqRDJ
|
|
YJ7L22XFeFULcBIKprlT/UkajE3PoFMGQLqWpbhjiKuXtE+SIVSszm0OXfBGr6X2E2ghpYgd
|
|
cx5rFtkrFANQwU53CrLaPDIAISBW+3fLEespp6qkRNapxpwT0UQhcXJ/6Nb20+MH8NekTzXa
|
|
PC0MW2jxas1L2+smERVSGeY6khBAO08N/4JJ6aiYSSEDF3PZO2449Z7LaUD95JgsTNMPXayd
|
|
kVbOGcQFnJ/oN4jG3e8q5MLwT3g99umXpo3XQO0IedXFdDmRY5prsEgDGCJDxqZXNlw/icM2
|
|
EVovrxbqfo9eM9G/sYlQvZ0wd5w9C/tvIpMy392AngmVPBUjmlblSsyE1TNCCqLDHekJ3osw
|
|
P1V79VrTpo6DhVhGhWYvqts+/CVmJLUKmYdOkZynXRuAo/LlvZZFstr1bR4YFWpARL2C0Aru
|
|
IG7yjwxfdbgo6LPhkJL/vMsn0KTVGAaSbqaHMqeWp6KXTeAw3iwnqHtYVDfLqaZrK4e7ziZx
|
|
Lm3UfIRETdQp6fEUfygKS6whVw1AOVMSSuJuEUEdtlxu82VFDrbF5yY7szT5zL/01Efok7qB
|
|
VcDRL6S6Mu8Z3XUO1DNfnToOChbNZyP0ZuuTr9udM4GdKuOkNUKP8hmSBpMeV9biwmCYjYHd
|
|
2mJV+rvBv5qglD85iYr4ucIznPrndMljwdL0A/cxL1HmPkBykjFwHXKsletYN8EUf0AIv2IM
|
|
Eui+jSHlo8SpnazLbBKiEY2fay3OutDi6nHerDnTE33ptTlRUUnbwr0EUX02Kme6ASORPiMi
|
|
Swm1Sus70gyUQYr3lrMgjNXH1zAjG6TCTwJewtigSUJlxHdlQZ3wcu3UxGRKSBwGil7g3SfA
|
|
6cGMxGtZhvxeDwBqY69LK5Wq9WKDPdOI+KLnwbV1P8Vnhnpd41isQuDwCYQPHBxKw6jiHT6F
|
|
S6SI+q9XTWQrttOVRzfqyAaWCFyRplfdGsemGUCOno1G6ncAEch2j3M8Pr9jBL01xQj0InJN
|
|
Y3GJMuIwla5CM7NF+9VWOoC6c2ZUAlATJmkWQlnHsrYO8hFE4kyDU7jaudvmxaLkl3KCESKp
|
|
ydCjIco5F6p0B6OFH6/L0QbKgMrLlyyxsRuKjpVy9EWV3xeM3uqWA9RsN4bUftHIrms3qANA
|
|
i8VxwpULaXqBucVt8PCAMEXCF2DuMvZ6XlRRUT+hJxxce0PTIzBCLVGDF/ReD7J4BRz5laaK
|
|
oNQwhxAiIZBUrv2yFFkY+Ffhb8QWyh4vZdLgi/6E3rLc4EPdNqY3h3dSZWvPbhgSyiojQJ1J
|
|
2g6Yd+sSlk63seIosh2j0TjPIwdqW6F2kJhHCH1QZeWogVZX22+YSbaQw8bcm4CGLsJq2xKs
|
|
BclWAcNSup7vr/Da1B+/7ybKyVDo0n7PlUfHoEang7rCBv6EWDjlvzEKtbb4XesErynVoRjS
|
|
2q0uKh8Sw9BDfW8saVMNgxQI8ciYzf3Eyv/YJTKtK4CQwlFOcxyyOqE8CE7GOiCeLU0DQozr
|
|
7tf3CMmPmrEAMj4uswlPN89ede9kp2oBS5ySpuT9xglQ+9WybeUBtrNqHNMy2OzI/FFsA644
|
|
MDESXhkrP2FRDkt1Oj0Ltwlx/w966ZvBUf/qYsBKRR/Pz96d9z+g5JehYo/Tt+eDQXr2Nj16
|
|
3z9/N+jiufMBnojbAkY2aqCLMjb4e/DXy8HpZfpxcP5heHkprb35lPY/fpTG+29OBulJ/1tZ
|
|
zcFfjwYfL9Nv3w9OkzM0/+1QxnNx2ccLw9P02/Ph5fD0ndVS+vjpfPju/WX6/uzkeHBOtO5n
|
|
0jtf1KuNBheJjOOb4XF7Up3+hQy7E65W8sFjcrhm6S/D0+NuOhiyocFfP54PLmT+ibQ9/CAj
|
|
HsiPw9Ojk6tjAoHfXGlNH1bZk3FennFp/FlvXQYj7W/cyQTk8M+4lIlLKI3Igp8PL/6S9i8S
|
|
W9ivr/qhIVldaeND//SIG7W2kZhu+unsClJD5n1yjAcSfwALNUiPB29RNPob2V55Urq5uPow
|
|
sPW+uOQCnZykp4MjGW///FN6MTj/ZniEdUjOBx/7Q1l+YKTPz7X0tPKW5z1snlDJ4BvQwNXp
|
|
CWZ7Pvj6SuazhRLQRv+dUBsWM9r35NuhdI4dWt/8Ll+RH5rN/yRkhALpnxSY/cnIQ4YZkNtt
|
|
qhCiaKiz/+YMa/BGxjPksGQgWBBs0XH/Q//d4KKbBCJg1wYm76YXHwdHQ/yH/C6kJ3t9oqsi
|
|
p+jrK+yifGGNpH3ZTkwNdGhbhjMIWjt1GpG+18/l06bvNfoDXZycXYDYpJPLfsoRy7/fDPD0
|
|
+eBU1ovHqX90dHUuRwtP4A0ZzcWVHLbhKTclwXx5mofnx36euM7p2/7w5Op8g8akZ71oc6C0
|
|
FjbEiexir0saSIdvpauj97Z7aevUfkrfy1a8Gchj/eNvhuA82k8iZ+FiaGtyZi3YOpKxMddU
|
|
5sfntwD4gf3vzwHOKX74Ck5cyAEtT6t+1ktqAfLlJ7DdU1F5TNbVoGOTjyMRr5Ny3lzz3qAp
|
|
oyw3w+qZyBwzC6ReJGKJqLNsWQcppAae2d05K0yt1DN9B0NDVR9Fu1MSFYukLRFUEoa0nY07
|
|
9KKE0BAydiei58W5Y3axyCzw1ChIAdLr+qM6I1KrvFRnt5gaRhzenjb3khqMiDAci7TwWixP
|
|
GdU8FEUOippwn68sciUqfG3KWgM5JpAHTbGN+MY5j/lTk+8EpaCDkqXmvErnJe0gLXuXWxIs
|
|
AwYG9UMaE9QAg0L+EevJ9x03EC3Ak1pLzGvT12KB3GpVOUKK9EY/YsP/xLbW06pXK2nfa9Sr
|
|
4vMn7fUX3ZNo/mNFUf4DdyWma3clBsTez78v0TuJ7ktkKz/rzsRtC/D33ptIyMgvuzsRTfyy
|
|
+xNDktHPvkMRb/zyexQVIvGP36WI9zfvU/x5KfxIRgFOCV6BGBYCz5myWy9/CyIWBZmVD6ty
|
|
JuPXFEDR91EDfqKuzhZCo4VI7Tov9ESSDMtWBRCv1umF+5qAR9S3XNCIYRZFC9sqByY3BNW7
|
|
mSjV96rNOzm/lKltObvtk7vxthaC5A7031ycnYi2cfIp1pRfkwJs8/VG7L8xW/XhSa85BOun
|
|
v5EzZPz5BP1olbgWM2ALljsV/EVugr2Ou7t5Eg+kp1CVu9Uchh3jWg3K28fHMYS3jVo907aV
|
|
TdKyG3fmm53dMpRi0Y+mP4aKa3g1V3BoIMbGCLDYZfQoRMlOW4dmuUvqmedpv86TaSlNPruR
|
|
EXxPR8Y0ny1lwfJp/ewZuDaN5xr1/9I4x799xSnBeEg/5iO4Z7RcoS6WZ7oH+LG9PUVdWM3d
|
|
rpIaJvtEYxszRbAjuIzEucYZ16TcdJrMFNc1UKgUqfG1Zmi+N2R6BtzEfCIigqgpvgMy1fyK
|
|
T+WqHK1muZ9oyD8tWmw+8WwSewN5QqCNGMO1zqWhv0V0/gQBMWIE5TTWmsLLUtYOfKn3ghNN
|
|
OvszRpO+R6H/igzvjwodQbK3UMnlSk5aOftTNz0QvawqJqw+AgVFf+iiQkddeE7XN0JB5snd
|
|
wWSDX8UiRY1PI5RonkfejCTKfA1FBkJYrYpZUYagbFUiJg1mw1ISwSmTOB6cGZlg8iqZGG7U
|
|
kYhSQTRX3GOr6qnjUBJr3J1GyhQeHBbqadwjUd48Y2ZLdYtke3WLTWfmb1295vHzSz9R/afh
|
|
qZhzJyf//D5+4v6H588P9v3+h8NXuAvw4PDF4eP9f7/KZ+P+h6+vhkd/SY0WNi+AiG9siC6A
|
|
D+pP7wWFqvy7dwhFbiiCYWLXQSThOohu+8YIYe4He2l6NUNPUYG5ZzKi3mL8P+o8Vqba9Gnm
|
|
4aJiWuRzaeBDA4ahwed7itHMs8XNXagOrY9UdYB3qrEXCrpDqBmCGkH9EhyVuup0rYvm0kNZ
|
|
kygZn0/opWahloNDmsF5P51dnadeJ51yt5ce57eolwuO3ZmuRtedXnoUKvV+EJGJHnvJ4Z5d
|
|
gqOS1ufF5bHQZpSu3NEHnn3cT/+4uSCd5HNp7Tw3ARwe6KVPj3mBmGgTLJxSsjg664QTGbaX
|
|
vJAXj5pyPPPxf91dZz3caG0phaJVBS+Dgrn4p5WxfvpJhO3QK3QYWLteip6117NyWvLeUPP5
|
|
6jUfgzT5FV/g5Vv+gUoizONLIamXXwjl7IO5vHjR83/2+dSikiOfvOS01d8Q7q1SRO+qQXU+
|
|
KwDoBaY8X3gd5730dpKNheJeSRNUjTs0LqT3jtosupSRGlQnX+BZSw1O14C6pKNZKJbaVE7d
|
|
CwlJ/NIFuJORQjIKRHFd7fKbUpIvMb2lftlZzjsGqpZzts+rzDDrCZMQUXn70nZX94couspf
|
|
RlVQf113revbxvLwSxTJlzZ21H7eWfn5QlSSP4sunh6mz/f3fxVW//h5/Dx+Hj+Pn8fP4+fx
|
|
8/h5/Dx+Hj+Pn8fP/9fP/wKykq3cAMgAAA==
|
|
|
|
--Boundary_(ID_GeYGc69fE1/bkYLTPwOGFg)--
|
|
|
|
************
|
|
|
|
|
|
From owner-pgsql-hackers@hub.org Mon Jan 3 13:47:07 2000
|
|
Received: from hub.org (hub.org [216.126.84.1])
|
|
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA23987
|
|
for <pgman@candle.pha.pa.us>; Mon, 3 Jan 2000 14:47:06 -0500 (EST)
|
|
Received: from localhost (majordom@localhost)
|
|
by hub.org (8.9.3/8.9.3) with SMTP id OAA03234;
|
|
Mon, 3 Jan 2000 14:39:56 -0500 (EST)
|
|
(envelope-from owner-pgsql-hackers)
|
|
Received: by hub.org (bulk_mailer v1.5); Mon, 3 Jan 2000 14:39:49 -0500
|
|
Received: (from majordom@localhost)
|
|
by hub.org (8.9.3/8.9.3) id OAA03050
|
|
for pgsql-hackers-outgoing; Mon, 3 Jan 2000 14:38:50 -0500 (EST)
|
|
(envelope-from owner-pgsql-hackers@postgreSQL.org)
|
|
Received: from ara.zf.jcu.cz (zakkr@ara.zf.jcu.cz [160.217.161.4])
|
|
by hub.org (8.9.3/8.9.3) with ESMTP id OAA02975
|
|
for <pgsql-hackers@postgreSQL.org>; Mon, 3 Jan 2000 14:38:05 -0500 (EST)
|
|
(envelope-from zakkr@zf.jcu.cz)
|
|
Received: from localhost (zakkr@localhost)
|
|
by ara.zf.jcu.cz (8.9.3/8.9.3/Debian/GNU) with SMTP id UAA19297;
|
|
Mon, 3 Jan 2000 20:23:35 +0100
|
|
Date: Mon, 3 Jan 2000 20:23:35 +0100 (CET)
|
|
From: Karel Zak - Zakkr <zakkr@zf.jcu.cz>
|
|
To: P.Marchesso@videotron.ca
|
|
cc: pgsql-hackers <pgsql-hackers@postgresql.org>
|
|
Subject: [HACKERS] replicator
|
|
Message-ID: <Pine.LNX.3.96.1000103194931.19115A-100000@ara.zf.jcu.cz>
|
|
MIME-Version: 1.0
|
|
Content-Type: TEXT/PLAIN; charset=US-ASCII
|
|
Sender: owner-pgsql-hackers@postgresql.org
|
|
Status: OR
|
|
|
|
|
|
Hi,
|
|
|
|
I looked at your (Philippe's) replicator, but I don't understand your
replication concept well.
|
|
|
|
|
|
node1:  SQL --IPC--> node-broker
                          |
                        TCP/IP
                          |
                     master-node --IPC--> replicator
                                            | | |
                                            libpq
                                            | | |
                                        node2 node..n
|
|
|
|
(Is this the right picture?)
|
|
|
|
If I understand correctly, all nodes connect to the master node, and the
"replicator" on this master node replicates the data. But the master node
is a critical single point in this concept: if the master node fails,
replication is lost for *all* nodes. Hmm... but I want to use replication
for highly available applications...
|
|
|
|
IMHO there is also a problem with node registration / authentication on the
master node. Why isn't the concept more direct? For example:
|
|
|
|
SQL --IPC--> node-replicator
                 | | |
       via libpq send data to all nodes with
       current client/backend auth.

(no master node exists; every node has a connection to every other node)
|
|
|
|
|
|
Using the replicator as an external process and copying data from SQL to
this replicator via IPC is a very good idea (of yours).
|
|
|
|
Karel
|
|
|
|
|
|
----------------------------------------------------------------------
|
|
Karel Zak <zakkr@zf.jcu.cz> http://home.zf.jcu.cz/~zakkr/
|
|
|
|
Docs: http://docs.linux.cz (big docs archive)
|
|
Kim Project: http://home.zf.jcu.cz/~zakkr/kim/ (process manager)
|
|
FTP: ftp://ftp2.zf.jcu.cz/users/zakkr/ (C/ncurses/PgSQL)
|
|
-----------------------------------------------------------------------
|
|
|
|
|
|
************
|
|
|
|
From owner-pgsql-hackers@hub.org Tue Jan 4 10:31:01 2000
|
|
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
|
|
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA17522
|
|
for <pgman@candle.pha.pa.us>; Tue, 4 Jan 2000 11:31:00 -0500 (EST)
|
|
Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.10 $) with ESMTP id LAA01541 for <pgman@candle.pha.pa.us>; Tue, 4 Jan 2000 11:27:30 -0500 (EST)
|
|
Received: from localhost (majordom@localhost)
|
|
by hub.org (8.9.3/8.9.3) with SMTP id LAA09992;
|
|
Tue, 4 Jan 2000 11:18:07 -0500 (EST)
|
|
(envelope-from owner-pgsql-hackers)
|
|
Received: by hub.org (bulk_mailer v1.5); Tue, 4 Jan 2000 11:17:58 -0500
|
|
Received: (from majordom@localhost)
|
|
by hub.org (8.9.3/8.9.3) id LAA09856
|
|
for pgsql-hackers-outgoing; Tue, 4 Jan 2000 11:17:17 -0500 (EST)
|
|
(envelope-from owner-pgsql-hackers@postgreSQL.org)
|
|
Received: from ara.zf.jcu.cz (zakkr@ara.zf.jcu.cz [160.217.161.4])
|
|
by hub.org (8.9.3/8.9.3) with ESMTP id LAA09763
|
|
for <pgsql-hackers@postgreSQL.org>; Tue, 4 Jan 2000 11:16:43 -0500 (EST)
|
|
(envelope-from zakkr@zf.jcu.cz)
|
|
Received: from localhost (zakkr@localhost)
|
|
by ara.zf.jcu.cz (8.9.3/8.9.3/Debian/GNU) with SMTP id RAA31673;
|
|
Tue, 4 Jan 2000 17:02:06 +0100
|
|
Date: Tue, 4 Jan 2000 17:02:06 +0100 (CET)
|
|
From: Karel Zak - Zakkr <zakkr@zf.jcu.cz>
|
|
To: Philippe Marchesseault <P.Marchesso@Videotron.ca>
|
|
cc: pgsql-hackers <pgsql-hackers@postgreSQL.org>
|
|
Subject: Re: [HACKERS] replicator
|
|
In-Reply-To: <38714B6F.2DECAEC0@Videotron.ca>
|
|
Message-ID: <Pine.LNX.3.96.1000104162226.27234D-100000@ara.zf.jcu.cz>
|
|
MIME-Version: 1.0
|
|
Content-Type: TEXT/PLAIN; charset=US-ASCII
|
|
Sender: owner-pgsql-hackers@postgreSQL.org
|
|
Status: OR
|
|
|
|
|
|
On Mon, 3 Jan 2000, Philippe Marchesseault wrote:
|
|
|
|
> So it could become:
|
|
>
|
|
> SQL --IPC--> node-replicator
|
|
> | | |
|
|
> via TCP send statements to each node
|
|
> replicator (on local node)
|
|
> |
|
|
> via libpq send data to
|
|
> current (local) backend.
|
|
>
|
|
> > (not exist any master node, all nodes have connection to all nodes)
|
|
>
|
|
> Exactly, if the replicator dies only the node dies, everything else keeps
|
|
> working.
|
|
|
|
|
|
Hi,
|
|
|
|
I explored the replication concepts of Oracle and Sybase a little (in the
manuals). (Does anyone know of interesting links or publications about this?)
|
|
|
|
First, I am sure it is premature to write replication for PgSQL now, while
we don't have an exact concept for it. It needs more suggestions from more
developers. We first need answers to the following questions:
|
|
|
|
1/ Which replication concept should we choose for PG?
2/ How should transactions be managed across nodes? (We also need to
   define a replication protocol for this.)
3/ How should replication be integrated into the current PG transaction code?
|
|
|
|
My idea (dream :-) is replication that allows full read-write access on all
nodes and that uses the current transaction method in PG, so that there is
no difference between several backends on one host and several backends on
several hosts. That gives "global transaction consistency".
|
|
|
|
Transactions are now managed via IPC (on one host); my dream is to manage
them the same way, but between several hosts via TCP. (And to optimize this
by transferring only committed data/commands.)
|
|
|
|
|
|
Any suggestions?
|
|
|
|
|
|
-------------------
|
|
Note:
|
|
|
|
(transaction oriented replication)
|
|
|
|
Sybase - I. model (only one node is read-write)
|
|
|
|
    primary SQL data (READ-WRITE)
            |
    replication agent (transaction log monitoring)
            |
    primary distribution server (one or more repl. servers)
        |      /  |  \
        |     nodes (READ-ONLY)
        |
    secondary dist. server
         /  |  \
       nodes (READ-ONLY)
|
|
|
|
|
|
If the primary SQL server is read-write and the other nodes are *read-only*
=> the system keeps working when a connection is down (the data are saved
to the replication log, and when the connection is available again the
log is written to the node).
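
A minimal sketch of the store-and-forward behaviour described above, not
Sybase's actual implementation. The names ReadOnlyNode, Distributor, publish
and flush are hypothetical and chosen only for illustration; the "replication
log" is modelled as one in-memory queue per node.

```python
from collections import deque


class ReadOnlyNode:
    """A replica that only receives changes (hypothetical model)."""

    def __init__(self):
        self.online = True   # False simulates a broken connection
        self.applied = []    # changes applied so far, in order

    def apply(self, change):
        self.applied.append(change)


class Distributor:
    """Keeps one pending-change queue (the replication log) per node."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.pending = [deque() for _ in self.nodes]

    def publish(self, change):
        # Every change is logged for every node first, then delivery is tried.
        for queue in self.pending:
            queue.append(change)
        self.flush()

    def flush(self):
        # Deliver queued changes, in order, to each node that is reachable.
        for node, queue in zip(self.nodes, self.pending):
            while node.online and queue:
                node.apply(queue.popleft())
```

A node that was offline receives its backlog, in the original order, the next
time flush() runs after the connection comes back.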
|
|
|
|
|
|
Sybase - II. model (all nodes read-write)
|
|
|
|
    SQL data 1 --->--+                         NODE I.
        |            |
        ^            |
        |    replication agent 1 (transaction log monitoring)
        V            |
        |            V
        |            |
      replication server 1
              |
              ^
              V
              |
      replication server 2                     NODE II.
        |            |
        ^            +-<-->--- SQL data 2
        |            |
      replication agent 2 -<--
|
|
|
|
|
|
|
|
Sorry, I am not sure I redrew the previous picture entirely correctly...
|
|
|
|
Karel
|
|
|
|
|
|
|
|
|
|
|
|
|
|
************
|
|
|
|
From pgsql-hackers-owner+M3133@hub.org Fri Jun 9 15:02:25 2000
|
|
Received: from hub.org (root@hub.org [216.126.84.1])
|
|
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id PAA22319
|
|
for <pgman@candle.pha.pa.us>; Fri, 9 Jun 2000 15:02:24 -0400 (EDT)
|
|
Received: from hub.org (majordom@localhost [127.0.0.1])
|
|
by hub.org (8.10.1/8.10.1) with SMTP id e59IsET81137;
|
|
Fri, 9 Jun 2000 14:54:14 -0400 (EDT)
|
|
Received: from ultra2.quiknet.com (ultra2.quiknet.com [207.183.249.4])
|
|
by hub.org (8.10.1/8.10.1) with SMTP id e59IrQT80458
|
|
for <pgsql-hackers@postgresql.org>; Fri, 9 Jun 2000 14:53:26 -0400 (EDT)
|
|
Received: (qmail 13302 invoked from network); 9 Jun 2000 18:53:21 -0000
|
|
Received: from 18.67.tc1.oro.pmpool.quiknet.com (HELO quiknet.com) (pecondon@207.231.67.18)
|
|
by ultra2.quiknet.com with SMTP; 9 Jun 2000 18:53:21 -0000
|
|
Message-ID: <39413D08.A6BDC664@quiknet.com>
|
|
Date: Fri, 09 Jun 2000 11:52:57 -0700
|
|
From: Paul Condon <pecondon@quiknet.com>
|
|
X-Mailer: Mozilla 4.73 [en] (X11; U; Linux 2.2.14-5.0 i686)
|
|
X-Accept-Language: en
|
|
MIME-Version: 1.0
|
|
To: ohp@pyrenet.fr, pgsql-hackers@postgresql.org
|
|
Subject: [HACKERS] Re: Big project, please help
|
|
Content-Type: text/plain; charset=us-ascii
|
|
Content-Transfer-Encoding: 7bit
|
|
X-Mailing-List: pgsql-hackers@postgresql.org
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@hub.org
|
|
Status: OR
|
|
|
|
Two-way replication on a single "table" is available in Lotus Notes. In
|
|
Notes, every record has a time-stamp, which contains the time of the
|
|
last update. (It also has a creation timestamp.) During replication,
|
|
timestamps are compared at the row/record level, and compared with the
|
|
timestamp of the last replication. If, for corresponding rows in two
|
|
replicas, the timestamp of one row is newer than the last replication,
|
|
the contents of this newer row are copied to the other replica. But if
|
|
both of the corresponding rows have newer timestamps, there is a
|
|
problem. The Lotus Notes solution is to:
|
|
1. send a replication conflict message to the Notes Administrator,
|
|
which message contains full copies of both rows.
|
|
2. copy the newest row over the less new row in the replicas.
|
|
3. there is a mechanism for the Administrator to reverse the default
|
|
decision in 2, if the semantics of the message history, or off-line
|
|
investigation indicates that the wrong decision was made.
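
The row-level rule described above can be sketched as follows. This is an
illustration of the described behaviour only, not Lotus Notes code; the names
Row and reconcile are hypothetical, and timestamps are plain numbers.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Row:
    key: str
    value: str
    updated_at: float  # time of the last update to this copy of the row


def reconcile(a: Row, b: Row, last_sync: float) -> Tuple[Optional[Row], bool]:
    """Compare two replicas' copies of one row.

    Returns (winner, conflict). winner is the copy to propagate, or None
    if neither copy changed since the last replication. conflict is True
    when both copies changed, i.e. the administrator should be notified.
    """
    a_changed = a.updated_at > last_sync
    b_changed = b.updated_at > last_sync
    if a_changed and b_changed:
        # Default decision: the newest copy wins, but the conflict is reported
        # so that the decision can be reversed by hand.
        winner = a if a.updated_at >= b.updated_at else b
        return winner, True
    if a_changed:
        return a, False
    if b_changed:
        return b, False
    return None, False
```

The conflict flag corresponds to step 1 (the message to the administrator);
the returned winner corresponds to the default decision in step 2.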
|
|
|
|
In practice, the Administrator is not overwhelmed with replication
|
|
conflict messages because updates usually only originate at the site
|
|
that originally created the row. Or updates fill only fields that were
|
|
originally 'TBD'. The full logic is perhaps more complicated than I have
|
|
described here, but it is already complicated enough to give you an idea
|
|
of what you're really being asked to do. I am not aware of a supplier of
|
|
relational database who really supports two way replication at the level
|
|
that Notes supports it, but Notes isn't a relational database.
|
|
|
|
The difficulty of the position that you appear to be in is that
|
|
management might believe that the full problem is solved in brand X
|
|
RDBMS, and you will have trouble convincing management that this is not
|
|
really true.
|
|
|
|
|
|
From pgsql-hackers-owner+M2401@hub.org Tue May 23 12:19:54 2000
|
|
Received: from news.tht.net (news.hub.org [216.126.91.242])
|
|
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA28410
|
|
for <pgman@candle.pha.pa.us>; Tue, 23 May 2000 12:19:53 -0400 (EDT)
|
|
Received: from hub.org (majordom@hub.org [216.126.84.1])
|
|
by news.tht.net (8.9.3/8.9.3) with ESMTP id MAB53304;
|
|
Tue, 23 May 2000 12:00:08 -0400 (EDT)
|
|
(envelope-from pgsql-hackers-owner+M2401@hub.org)
|
|
Received: from gwineta.repas.de (gwineta.repas.de [193.101.49.1])
|
|
by hub.org (8.9.3/8.9.3) with ESMTP id LAA39896
|
|
for <pgsql-hackers@postgresql.org>; Tue, 23 May 2000 11:57:31 -0400 (EDT)
|
|
(envelope-from kardos@repas-aeg.de)
|
|
Received: (from smap@localhost)
|
|
by gwineta.repas.de (8.8.8/8.8.8) id RAA27154
|
|
for <pgsql-hackers@postgresql.org>; Tue, 23 May 2000 17:57:23 +0200
|
|
Received: from dragon.dr.repas.de(172.30.48.206) by gwineta.repas.de via smap (V2.1)
|
|
id xma027101; Tue, 23 May 00 17:56:20 +0200
|
|
Received: from kardos.dr.repas.de ([172.30.48.153])
|
|
by dragon.dr.repas.de (UCX V4.2-21C, OpenVMS V6.2 Alpha);
|
|
Tue, 23 May 2000 17:57:24 +0200
|
|
Message-ID: <010201bfc4cf$7334d5a0$99301eac@Dr.repas.de>
|
|
From: "Kardos, Dr. Andreas" <kardos@repas-aeg.de>
|
|
To: "Todd M. Shrider" <tshrider@varesearch.com>,
|
|
<pgsql-hackers@postgresql.org>
|
|
References: <Pine.LNX.4.04.10005180846290.15739-100000@silicon.su.valinux.com>
|
|
Subject: Re: [HACKERS] failing over with postgresql
|
|
Date: Tue, 23 May 2000 17:56:20 +0200
|
|
Organization: repas AEG Automation GmbH
|
|
MIME-Version: 1.0
|
|
Content-Type: text/plain;
|
|
charset="iso-8859-1"
|
|
Content-Transfer-Encoding: 8bit
|
|
X-Priority: 3
|
|
X-MSMail-Priority: Normal
|
|
X-Mailer: Microsoft Outlook Express 5.00.2314.1300
|
|
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
|
|
X-Mailing-List: pgsql-hackers@postgresql.org
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@hub.org
|
|
Status: OR
|
|
|
|
For a SCADA system (Supervisory Control and Data Acquisition) which consists
of one master and one hot-standby server I have implemented such a
solution. To these UNIX servers client workstations are connected (NT and/or
UNIX). The database client programs run on client and server side.
|
|
|
|
When developing this approach I had two goals in mind:
1) Not to become dependent on the PostgreSQL sources, since they change very
dynamically.
2) Not to become dependent on the fe/be protocol, since there are discussions
about changing it.
|
|
|
|
So the approach is quite simple: Forward all database requests to the
|
|
standby server on TCP/IP level.
|
|
|
|
On both servers the postmaster listens on port 5433 and not on 5432. On
standard port 5432 my program listens instead. This program forks twice for
every incoming connection. The first instance forwards all packets from the
frontend to both backends. The second instance receives the packets from all
backends and forwards the packets from the master backend to the frontend.
So a frontend running on a server machine connects to port 5432 of
localhost.
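
A rough sketch of the two forwarding directions described above, assuming
already-connected stream sockets. It is not the author's program: the function
names are hypothetical, each call handles a single chunk, and the fork-per-
direction structure, port handling and error handling are all left out.

```python
import socket


def forward_to_all(frontend, backends, bufsize=4096):
    """Read one chunk from the frontend and send it to every backend."""
    data = frontend.recv(bufsize)
    for backend in backends:
        backend.sendall(data)
    return data


def relay_master_reply(backends, master_index, frontend, bufsize=4096):
    """Read one chunk from every backend; forward only the master's reply.

    The standby's reply is read so that its connection does not stall,
    and is then discarded.
    """
    replies = [backend.recv(bufsize) for backend in backends]
    frontend.sendall(replies[master_index])
    return replies
```

In the described setup the first function would run in one forked process and
the second in the other, each in a loop, with the backends being connections
to port 5433 on the master and the standby.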
|
|
|
|
On the client machine runs another program (on NT as a service). This
program forks twice for every incoming connection. The first instance
forwards all packets to port 5432 of the current master server and the
second instance forwards the packets from the master server to the frontend.
|
|
|
|
During standby computer startup the database of the master computer is
|
|
dumped, zipped, copied to the standby computer, unzipped and loaded into
|
|
that database.
|
|
If a standby startup took place, all client connections are aborted to allow
|
|
a login into the standby database. The frontends need to reconnect in this
|
|
case. So the database of the standby computer is always in sync.
|
|
|
|
The disadvantage of this method is that a query cannot be canceled in the
standby server since the request key of this connection gets lost. But we
can live with that.
|
|
|
|
Both programs are able to run on Unix and on (native!) NT. On NT threads
are created instead of forked processes.
|
|
|
|
This approach is simple, but it is effective and it works.
|
|
|
|
We hope to survive this way until real replication is implemented in
PostgreSQL.
|
|
|
|
Andreas Kardos
|
|
|
|
-----Original Message-----
From: Todd M. Shrider <tshrider@varesearch.com>
To: <pgsql-hackers@postgresql.org>
Sent: Thursday, 18 May 2000 17:48
Subject: [HACKERS] failing over with postgresql
|
|
|
|
|
|
>
|
|
> is anyone working on or have working a fail-over implementation for the
> postgresql stuff. i'd be interested in seeing if and how any might be
> dealing with just general issues as well as the database syncing issues.
>
> we are looking to do this with heartbeat and lvs in mind. also if anyone
> is load balancing their databases that would be cool to talk about too.
|
|
>
|
|
> ---
|
|
> Todd M. Shrider VA Linux Systems
|
|
> Systems Engineer
|
|
> tshrider@valinux.com www.valinux.com
|
|
>
|
|
|
|
|
|
From pgsql-hackers-owner+M3662@postgresql.org Tue Jan 23 16:23:34 2001
|
|
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
|
|
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id QAA04456
|
|
for <pgman@candle.pha.pa.us>; Tue, 23 Jan 2001 16:23:34 -0500 (EST)
|
|
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
|
|
by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0NLKf004705;
|
|
Tue, 23 Jan 2001 16:20:41 -0500 (EST)
|
|
(envelope-from pgsql-hackers-owner+M3662@postgresql.org)
|
|
Received: from sectorbase2.sectorbase.com ([208.48.122.131])
|
|
by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0NLAe003753
|
|
for <pgsql-hackers@postgresql.org>; Tue, 23 Jan 2001 16:10:40 -0500 (EST)
|
|
(envelope-from vmikheev@SECTORBASE.COM)
|
|
Received: by sectorbase2.sectorbase.com with Internet Mail Service (5.5.2653.19)
|
|
id <DG1W4Q8F>; Tue, 23 Jan 2001 12:49:07 -0800
|
|
Message-ID: <8F4C99C66D04D4118F580090272A7A234D32AF@sectorbase1.sectorbase.com>
|
|
From: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
|
|
To: "'dom@idealx.com'" <dom@idealx.com>, pgsql-hackers@postgresql.org
|
|
Subject: RE: [HACKERS] Re: AW: Re: MySQL and BerkleyDB (fwd)
|
|
Date: Tue, 23 Jan 2001 13:10:34 -0800
|
|
MIME-Version: 1.0
|
|
X-Mailer: Internet Mail Service (5.5.2653.19)
|
|
Content-Type: text/plain;
|
|
charset="iso-8859-1"
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: ORr
|
|
|
|
> I had thought that the pre-commit information could be stored in an
|
|
> auxiliary table by the middleware program ; we would then have
|
|
> to re-implement some sort of higher-level WAL (I thought of the list
|
|
> of the commands performed in the current transaction, with a sequence
|
|
> number for each of them that would guarantee correct ordering between
|
|
> concurrent transactions in case of a REDO). But I fear I am missing
|
|
|
|
This wouldn't work for the READ COMMITTED isolation level.
But why do you want to log commands into WAL, where each modification
is already logged in, hm, correct order?
Well, it makes sense if you're looking for async replication, but
you don't need two-phase commit for that, and you should be aware of the
problems with the READ COMMITTED isolation level.
|
|
|
|
Back to two-phase commit: it's the easiest part of the work required for
distributed transaction processing.
Currently we place a single commit record in the log, and the transaction
is committed when this record (and so all other records of the transaction)
is on disk.
|
|
Two-phase commit:
|
|
|
|
1. For the first phase we'll place a "prepared-to-commit" record into the
log, and this phase is complete once that record is flushed to disk.
At this point the transaction may be committed at any time, because
all its modifications are logged. But it may still be rolled back
if this phase fails on other sites of the distributed system.

2. When all sites are prepared to commit, we'll place a "committed"
record into the log. There is no need to flush it, because in the event
of a crash the recovery process will have to contact the other sites to
learn the status of every "prepared" transaction anyway.
|
|
|
|
That's all! It is really hard to implement distributed lock and
communication managers, but there is no problem with logging two
records instead of one. Period.
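
The two log records described above can be sketched like this. It is a toy
model under stated assumptions, not PostgreSQL code: Rec, Site and
two_phase_commit are hypothetical names, the log is an in-memory list, and
"flushing" is modelled by a counter of durable records.

```python
from enum import Enum


class Rec(Enum):
    PREPARED = "prepared-to-commit"
    COMMITTED = "committed"
    ABORTED = "aborted"


class Site:
    """One participant with its own log (hypothetical model)."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare  # False simulates a failed phase 1
        self.log = []                   # (xid, record type) in append order
        self.flushed = 0                # number of log records known durable

    def prepare(self, xid):
        if not self.can_prepare:
            return False
        self.log.append((xid, Rec.PREPARED))
        self.flushed = len(self.log)    # phase 1 is done only after the flush
        return True

    def finish(self, xid, outcome):
        # Phase 2 record: appended, but no flush is forced here.
        self.log.append((xid, outcome))


def two_phase_commit(xid, sites):
    """Commit on all sites only if every site managed to prepare."""
    prepared = []
    for site in sites:
        if not site.prepare(xid):
            break
        prepared.append(site)
    outcome = Rec.COMMITTED if len(prepared) == len(sites) else Rec.ABORTED
    for site in prepared:
        site.finish(xid, outcome)
    return outcome
```

The hard parts named in the message, distributed locking and communication,
are deliberately absent; the sketch only shows that the logging side amounts
to two records, with a forced flush for the first one.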
|
|
|
|
Vadim
|
|
|
|
From pgsql-hackers-owner+M3665@postgresql.org Tue Jan 23 17:05:26 2001
|
|
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
|
|
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA05972
|
|
for <pgman@candle.pha.pa.us>; Tue, 23 Jan 2001 17:05:24 -0500 (EST)
|
|
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
|
|
by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0NM31008120;
|
|
Tue, 23 Jan 2001 17:03:01 -0500 (EST)
|
|
(envelope-from pgsql-hackers-owner+M3665@postgresql.org)
|
|
Received: from candle.pha.pa.us (candle.navpoint.com [162.33.245.46])
|
|
by mail.postgresql.org (8.11.1/8.11.1) with ESMTP id f0NLsU007188
|
|
for <pgsql-hackers@postgresql.org>; Tue, 23 Jan 2001 16:54:30 -0500 (EST)
|
|
(envelope-from pgman@candle.pha.pa.us)
|
|
Received: (from pgman@localhost)
|
|
by candle.pha.pa.us (8.9.0/8.9.0) id QAA05300;
|
|
Tue, 23 Jan 2001 16:53:53 -0500 (EST)
|
|
From: Bruce Momjian <pgman@candle.pha.pa.us>
|
|
Message-Id: <200101232153.QAA05300@candle.pha.pa.us>
|
|
Subject: Re: [HACKERS] Re: AW: Re: MySQL and BerkleyDB (fwd)
|
|
In-Reply-To: <8F4C99C66D04D4118F580090272A7A234D32AF@sectorbase1.sectorbase.com>
|
|
"from Mikheev, Vadim at Jan 23, 2001 01:10:34 pm"
|
|
To: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
|
|
Date: Tue, 23 Jan 2001 16:53:53 -0500 (EST)
|
|
CC: "'dom@idealx.com'" <dom@idealx.com>, pgsql-hackers@postgresql.org
|
|
X-Mailer: ELM [version 2.4ME+ PL77 (25)]
|
|
MIME-Version: 1.0
|
|
Content-Transfer-Encoding: 7bit
|
|
Content-Type: text/plain; charset=US-ASCII
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
> > I had thought that the pre-commit information could be stored in an
> > auxiliary table by the middleware program ; we would then have
> > to re-implement some sort of higher-level WAL (I thought of the list
> > of the commands performed in the current transaction, with a sequence
> > number for each of them that would guarantee correct ordering between
> > concurrent transactions in case of a REDO). But I fear I am missing
>
> This wouldn't work for READ COMMITTED isolation level.
> But why do you want to log commands into WAL where each modification
> is already logged in, hm, correct order?
> Well, it has sense if you're looking for async replication but
> you need not in two-phase commit for this and should aware about
> problems with READ COMMITTED isolevel.
>
|
|
|
|
I believe the issue here is that while SERIALIZABLE ISOLATION means all
queries can be run serially, our default is READ COMMITTED, meaning that
open transactions see committed transactions, even if the transaction
committed after our transaction started. (FYI, see my chapter on
transactions for help, http://www.postgresql.org/docs/awbook.html.)

To do higher-level WAL, you would have to record not only the queries,
but also which other transactions had committed at the start of each
command in your transaction.

Ideally, you could number every commit by its XID in your log, and then
when processing the query, pass the "committed" transaction ids that
were visible at the time each command began.

In other words, you can replay the queries in transaction commit order,
except that you have to have some transactions committed at specific
points while other transactions are open, i.e.:
|
|
|
|
XID     Open XIDs       Query
500                     UPDATE t SET col = 3;
501     500             BEGIN;
501     500             UPDATE t SET col = 4;
501                     UPDATE t SET col = 5;
501                     COMMIT;
|
|
|
|
This is a silly example, but it shows that 500 must commit after the
first command in transaction 501, but before the second command in the
transaction. This is because UPDATE t SET col = 5 actually sees the
changes made by transaction 500 in READ COMMITTED isolation level.

I am not advocating this. I think WAL is a better choice. I just
wanted to outline how replaying the queries in commit order is
insufficient.
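
To make the ordering problem concrete, here is a minimal Python sketch of
a command log in which each entry carries the set of transactions that
were still open when the command began. The record layout and the names
LogEntry and commit_points are hypothetical; nothing like this exists in
the backend. It assumes every transaction that stops being open has
committed (aborts are ignored).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    xid: int              # transaction issuing the command
    open_xids: frozenset  # other transactions still uncommitted when it began
    query: str


# The example from the table above, in the hypothetical log format.
LOG = [
    LogEntry(500, frozenset(), "UPDATE t SET col = 3;"),
    LogEntry(501, frozenset({500}), "BEGIN;"),
    LogEntry(501, frozenset({500}), "UPDATE t SET col = 4;"),
    LogEntry(501, frozenset(), "UPDATE t SET col = 5;"),
    LogEntry(501, frozenset(), "COMMIT;"),
]


def commit_points(log):
    """Return {xid: index}: xid must be committed before replaying log[index].

    A transaction that was open for one command and is no longer open for
    a later one must have committed somewhere in between.
    """
    points = {}
    seen_open = set()
    for i, entry in enumerate(log):
        for xid in seen_open - entry.open_xids:
            points.setdefault(xid, i)
        seen_open |= entry.open_xids
        seen_open -= set(points)
    return points
```

For the table above this says that 500 has to be committed before the
entry at index 3 (UPDATE t SET col = 5) is replayed, which plain commit
order cannot express.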
|
|
|
|
> Back to two-phase commit - it's easiest part of work required for
> distributed transaction processing.
> Currently we place single commit record to log and transaction is
> committed when this record (and so all other transaction records)
> is on disk.
> Two-phase commit:
>
> 1. For 1st phase we'll place into log "prepared-to-commit" record
> and this phase will be accomplished after record is flushed on disk.
> At this point transaction may be committed at any time because of
> all its modifications are logged. But it still may be rolled back
> if this phase failed on other sites of distributed system.
>
> 2. When all sites are prepared to commit we'll place "committed"
> record into log. No need to flush it because of in the event of
> crash for all "prepared" transactions recoverer will have to
> communicate other sites to know their statuses anyway.
>
> That's all! It is really hard to implement distributed lock- and
> communication- managers but there is no problem with logging two
> records instead of one. Period.
|
|
|
|
Great.
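
The two log records quoted above could be sketched roughly as follows. This
is a toy in-memory model in Python, not WAL code; the class TwoPhaseLog and
its methods are hypothetical names. It only shows which record is flushed
and which transactions recovery would have to ask the other sites about.

```python
class TwoPhaseLog:
    def __init__(self):
        self.records = []  # every appended (xid, kind) record, in order
        self.flushed = 0   # how many records are known to be on disk

    def append(self, xid, kind, flush=False):
        self.records.append((xid, kind))
        if flush:
            # A flush makes this record and everything before it durable.
            self.flushed = len(self.records)

    def prepare(self, xid):
        # Phase 1: flush before reporting "prepared" to the other sites.
        self.append(xid, "prepared", flush=True)

    def commit(self, xid):
        # Phase 2: no flush; recovery consults the other sites anyway.
        self.append(xid, "committed")

    def in_doubt_after_crash(self):
        """XIDs with a durable "prepared" record but no durable commit."""
        durable = self.records[: self.flushed]
        prepared = {xid for xid, kind in durable if kind == "prepared"}
        committed = {xid for xid, kind in durable if kind == "committed"}
        return prepared - committed
```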
|
|
|
|
|
|
--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
|
|
|
|
From pgsql-general-owner+M805@postgresql.org Tue Nov 21 23:53:04 2000
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA19262
for <pgman@candle.pha.pa.us>; Wed, 22 Nov 2000 00:53:03 -0500 (EST)
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
by mail.postgresql.org (8.11.1/8.11.1) with SMTP id eAM5qYs47249;
Wed, 22 Nov 2000 00:52:34 -0500 (EST)
(envelope-from pgsql-general-owner+M805@postgresql.org)
Received: from racerx.cabrion.com (racerx.cabrion.com [166.82.231.4])
by mail.postgresql.org (8.11.1/8.11.1) with ESMTP id eAM5lJs46653
for <pgsql-general@postgresql.org>; Wed, 22 Nov 2000 00:47:19 -0500 (EST)
(envelope-from rob@cabrion.com)
Received: from cabrionhome (gso163-25-211.triad.rr.com [24.163.25.211])
by racerx.cabrion.com (8.8.7/8.8.7) with SMTP id AAA13731
for <pgsql-general@postgresql.org>; Wed, 22 Nov 2000 00:45:20 -0500
Message-ID: <006501c05447$fb9aa0c0$4100fd0a@cabrion.org>
From: "rob" <rob@cabrion.com>
To: <pgsql-general@postgresql.org>
Subject: [GENERAL] Synchronization Toolkit
Date: Wed, 22 Nov 2000 00:49:29 -0500
MIME-Version: 1.0
Content-Type: multipart/mixed;
boundary="----=_NextPart_000_0062_01C0541E.125CAF30"
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4133.2400
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400
Precedence: bulk
Sender: pgsql-general-owner@postgresql.org
Status: OR
|
|
|
|
This is a multi-part message in MIME format.
|
|
|
|
------=_NextPart_000_0062_01C0541E.125CAF30
|
|
Content-Type: text/plain; charset="iso-8859-1"
|
|
Content-Transfer-Encoding: 7bit
|
|
|
|
Not to be confused with replication, my concept of synchronization is to
manage changes between a server table (or tables) and one or more mobile,
disconnected databases (e.g. a PalmPilot or a laptop).

I read through the notes in the TODO for this topic and devised a tool kit
for doing synchronization. I hope that the PostgreSQL development community
will find this useful and will help me refine this concept by offering
insight, experience and some good old-fashioned hacking if you are so
inclined.
|
|
|
|
The bottom of this message describes how to use the attached files.
|
|
|
|
I look forward to your feedback.
|
|
|
|
--rob
|
|
|
|
|
|
Methodology:
|
|
|
|
I devised a concept that I call "session versioning". This means that every
time a row changes it does NOT get a new version. Rather, it gets stamped
with the current session version common to all published tables. Clients,
when they connect for synchronization, will immediately increment this
common version number, reserve the result as a "post version" and then
increment the session version again. This version number, implemented as a
sequence, is common to all synchronized tables and rows.

Any time the server changes a row, the row gets stamped with the current
session version; when the client posts its changes, it uses the reserved
"post version". The client makes all its changes, stamping the changed
rows with its reserved "post version" rather than the current version. The
reason why is explained later. It is important that the client post all its
own changes first so that it does not end up receiving records which changed
since its last session that it is about to update anyway.
|
|
|
|
Reserving the post version is a two-step process. First, the number is
simply stored in a variable for later use. Second, the value is added to a
lock table (last_stable) to indicate to any concurrent sessions that rows
with higher version numbers are to be considered "unstable" at the moment
and they should not attempt to retrieve them at this time. Each client,
upon connection, will use the lowest value in this lock table (max_version)
to determine the upper boundary for versions it should retrieve. The lower
boundary is simply the previous session's "max_version" plus one. Thus
when the client retrieves changes it uses the following SQL "where"
expression:
|
|
|
|
WHERE row_version >= max_version
  AND row_version <= last_stable_version
  AND row_version <> this_post_version
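
As an illustration only, the same filter can be written as a small Python
function over (row_id, version) pairs. The parameter names lower_bound,
upper_bound and own_post_version are my own labels for the three values
the expression compares against; they are not names from the toolkit.

```python
def rows_to_retrieve(rows, lower_bound, upper_bound, own_post_version):
    """Return the ids of rows a client should fetch in this session.

    rows: iterable of (row_id, row_version) pairs.
    A row qualifies when its version lies inside the session window and
    was not stamped by this client's own post version.
    """
    return [
        row_id
        for row_id, version in rows
        if lower_bound <= version <= upper_bound and version != own_post_version
    ]
```

With rows at versions 4, 5, 7 and 9, a window of 5..8 and an own post
version of 7, only the row at version 5 comes back: 4 was seen in an
earlier session, 7 is the client's own change, and 9 is not yet stable.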
|
|
|
|
Reserving and locking a post version is important because it
allows concurrent synchronization by multiple clients. The first, of many,
clients to connect basically dictates to all future clients that they must
not take any rows with versions equal to or greater than the one which it
just reserved and locked. The reason the session version is incremented a
second time is so that the server may continue to post changes concurrent
with any client changes and be certain that these concurrent server changes
will not taint rows the client is about to retrieve. Once the client is
finished with its session it removes the lock on its post version.
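
Here is a rough Python model of the reservation scheme described above, for
readers who want to trace it by hand. The class VersionServer and its
attribute names are hypothetical; a plain integer stands in for the
sequence and a dict stands in for the lock table. It follows the prose
rule that rows at or above the lowest reserved version are off limits.

```python
class VersionServer:
    def __init__(self):
        self.version = 1    # current session version (a sequence in the toolkit)
        self.reserved = {}  # lock table: client name -> reserved post version

    def start_session(self, client):
        """Reserve a post version; return (post_version, upper_bound)."""
        self.reserved.pop(client, None)  # drop a stale lock, if any
        self.version += 1
        post_version = self.version      # this client's changes get this stamp
        # Lowest version locked by any session, or our own if none is locked.
        lowest = min(self.reserved.values(), default=post_version)
        upper_bound = lowest - 1         # highest version that is safe to fetch
        self.reserved[client] = post_version
        self.version += 1                # server changes now use a newer version
        return post_version, upper_bound

    def end_session(self, client):
        del self.reserved[client]        # release the lock on the post version
```

If A and B connect concurrently, B's upper bound stays below A's reserved
version until A ends its session; a later client C is then limited only by
B's lock.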
|
|
|
|
Partitioning data for use by each node is the next challenge we face. How
can we control which "slice" of data each client receives? A slice can be
horizontal or vertical within a table. Horizontal slices are easy: a slice
is just the where clause of an SQL statement that says "give me the rows
that match X criteria". We handle this by storing and appending a where
clause to each client's retrieval statement in addition to the where clause
described above. Actually, two where clauses are stored and appended. One
is per client and one is per publication (table).
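
One way to picture how the stored clauses are combined is the following
Python sketch. The function build_where is a hypothetical helper, not part
of the toolkit, and it assumes the stored clauses are trusted
configuration text rather than user input. Each clause is parenthesized
so that an OR inside one of them cannot change the meaning of the others.

```python
def build_where(version_filter, client_clause=None, publication_clause=None):
    """AND together the version filter and any stored slice clauses.

    Empty or missing clauses are skipped, so no dangling AND is produced.
    """
    parts = [version_filter, client_clause, publication_clause]
    return " AND ".join(f"({part})" for part in parts if part)
```

For example, build_where("row_version >= 5", "region = 'west'") yields
"(row_version >= 5) AND (region = 'west')".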
|
|
|
|
We defined horizontal slices by filtering rows. Vertical slices limit the
columns. The tool kit does provide a mechanism for pseudo vertical
partitioning. When a client is "subscribed" to a publication, the toolkit
stores which columns that node is to receive during a session. These are
stored in the subscribed_cols table. While this does limit the number of
columns transmitted, the insert/update/delete triggers do not recognize
changes based on columns. The "pseudo" nature of our vertical partitioning
is evident by example:
|
|
|
|
Say you have a table with name, address and phone number as columns. You
|
|
restrict a client to see only name and address. This means that phone
|
|
number information will not be sent to the client during synchronization,
|
|
and the client can't attempt to alter the phone number of a given entry.
|
|
Great, but . . . if, on the server, the phone number (but not the name or
|
|
address) is changed, the entire row gets marked with a new version. This
|
|
means that the name and address will get sent to the client even though they
|
|
didn't change.
|
|
|
|
Well, there's the flaw in vertical partitioning. Other than wasting
|
|
bandwidth, the extra row does no harm to the process. The workaround for
|
|
this is to highly normalize your schema when possible.
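
The name/address/phone example can be reproduced with a few lines of
Python. This is only a model of the behaviour described above; the
function rows_for_client and the dict-based rows are assumptions made for
illustration.

```python
def rows_for_client(rows, subscribed_cols, since_version):
    """Project changed rows onto the columns a client is subscribed to.

    rows: list of dicts holding a "version" key plus the data columns.
    The version is tracked per row, not per column, so a change to an
    unsubscribed column still causes the row to be sent.
    """
    return [
        {col: row[col] for col in subscribed_cols}
        for row in rows
        if row["version"] > since_version
    ]
```

If only the phone number of a row changes on the server, a client
subscribed to name and address still receives that row's unchanged name
and address, which is the wasted bandwidth mentioned above.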
|
|
|
|
Collisions are the next crux one encounters with synchronization. When two
clients retrieve the same row and both make (different) changes, which one is
correct? So far the system operates totally independent of time. This is
good because it doesn't rely on the server or client to keep accurate time.
We can just ignore time altogether, but then we force our clients to
synchronize on a strict schedule in order to avoid (or reduce) collisions.
If every node synchronized immediately after making changes we could just
stop here. Unfortunately this isn't reality. In reality, two clients, A
and B, will each pick up the same record on Monday. A will
make changes on Monday, then leave for vacation. B will make changes on
Wednesday because new information was gathered in A's absence. Client B
posts those changes Wednesday. Meanwhile, client A returns from vacation on
Friday and synchronizes his changes. A overwrites B's changes even though
A made changes before the most recent information was posted by B.
|
|
|
|
It is clear that we need some form of time stamp to cope with the above
example. While clocks aren't the most reliable, they are the only common
version control available to solve this problem. The system is set up to
accept (but not require) timestamps from clients, and changes on the server
are time stamped. The system, when presented a time stamp with a row, will
compare them to figure out who wins in a tie. The system makes certain
"sanity" checks with regard to these time stamps. A client may not attempt
to post a change with a timestamp that is more than one hour in the future
(according to what the server thinks "now" is) nor more than one hour
before its last synchronization date/time. The client row will be
immediately placed into the collision table if the timestamp is that far
out of whack. Implementations of the tool kit should take care to ensure
that client & server agree on what "now" is before attempting to submit
changes with timestamps.
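
The sanity window can be expressed compactly. The following Python sketch
is my reading of the rule above, with hypothetical names
(timestamp_is_sane, TOLERANCE); the one-hour tolerance comes from the
text, everything else is illustrative.

```python
from datetime import datetime, timedelta

TOLERANCE = timedelta(hours=1)


def timestamp_is_sane(stamp, server_now, last_sync):
    """True when a client timestamp falls inside the accepted window.

    Accepted: no more than one hour after the server's idea of "now" and
    no more than one hour before the client's last synchronization.
    """
    return last_sync - TOLERANCE <= stamp <= server_now + TOLERANCE
```

A row whose timestamp fails this check would go straight to the collision
table instead of being applied.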
|
|
|
|
Time stamps are not required. Should a client be incapable of tracking
timestamps, the system will assume that any server row which has been
changed since the client's last session will win a tie. This is quite error
prone, so timestamps are encouraged where possible.
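
A hedged sketch of the tie-break in Python follows. The function name
winner is hypothetical. Two details are assumptions on my part rather
than statements from the description: that the later timestamp wins, and
that the server wins when both timestamps are equal.

```python
def winner(client_stamp, server_stamp, server_changed_since_last_session):
    """Decide which side's version of a colliding row is kept.

    Stamps may be any comparable values (for example datetime objects);
    None means that side supplied no timestamp.
    """
    if client_stamp is not None and server_stamp is not None:
        # Assumption: the more recent change wins; ties go to the server.
        return "client" if client_stamp > server_stamp else "server"
    # No usable timestamps: a server row changed since the client's last
    # session wins, as described above.
    return "server" if server_changed_since_last_session else "client"
```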
|
|
|
|
Inserts pose an interesting challenge. Since multiple clients cannot share
a sequence (often used as a primary key) while disconnected, they will be
responsible for their own unique "row_id" when inserting records. Inserts
accept any arbitrary key, and write back to the client a special kind of
update that gives the server's row_id. The client is responsible for making
sure that this update takes place locally.
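
The key hand-off for inserts might look like this in outline. It is a toy
Python model; the class Server, the function apply_rowid_update and the
shape of the returned update are all hypothetical.

```python
import itertools


class Server:
    def __init__(self, first_id=100):
        self._ids = itertools.count(first_id)  # stands in for the row_id sequence
        self.rows = {}

    def insert(self, client_row_id, values):
        """Store the row under a server row_id and report the mapping back."""
        row_id = next(self._ids)
        self.rows[row_id] = values
        return {"client_row_id": client_row_id, "row_id": row_id}


def apply_rowid_update(local_rows, update):
    """Client side: re-key the local row to the server's row_id."""
    local_rows[update["row_id"]] = local_rows.pop(update["client_row_id"])
```

The client can pick any temporary key it likes, since the key is replaced
as soon as the server's answer has been applied locally.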
|
|
|
|
Deletes are the last portion of the process. When deletes occur, the
|
|
row_id, version, etc. are stored in a "deleted" table. These entries are
|
|
retrieved by the client using the same version filter as described above.
|
|
The table is pruned at the end of each session by deleting all records with
|
|
versions that are less than the lowest 'last_version' stored for each
|
|
client.
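
The pruning rule can be stated in a few lines of Python. This is a sketch
with hypothetical names (prune_deleted, last_versions), operating on
plain lists instead of the "deleted" table.

```python
def prune_deleted(deleted, last_versions):
    """Drop delete records that every client has already seen.

    deleted: list of (row_id, version) pairs from the "deleted" table.
    last_versions: mapping of client -> last_version it synchronized to.
    Records older than the lowest last_version are no longer needed.
    """
    if not last_versions:
        return list(deleted)  # no clients known: keep everything
    floor = min(last_versions.values())
    return [(row_id, version) for row_id, version in deleted if version >= floor]
```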
|
|
|
|
Having wrapped up the synchronization process, I'll move on to describe some
|
|
points about managing clients, publications and the like.
|
|
|
|
The tool kit is split into two objects: SyncManager and Synchronize.
The Synchronize object exposes an API that client implementations use to
communicate and receive changes. The management functions handle system
install and uninstall in addition to publication of tables and client
subscriptions.
|
|
|
|
Installation and uninstallation are handled by their corresponding functions
in the API. All system tables are prefixed and suffixed with four
underscores, in hopes that this avoids conflict with any existing tables.
Calling the install function more than once will generate an error message.
Uninstall will remove all related tables, sequences, functions and triggers
from the system.
|
|
|
|
The first step, after installing the system, is to publish a table. A table
can be published more than once under different names. Simply provide a
unique name as the second argument to the publish function. Since object
names are restricted to 32 characters in Postgres, each table is given a
unique id and this id is used to create the trigger and sequence names.
Since one table can be published multiple times, but only needs one set of
triggers and one sequence for change management, a reference count is kept so
that we know when to add/drop triggers and functions. By default, all
columns are published, but the third argument to the publish function
accepts an array reference of column names that allows you to specify a
limited set. Information about the table is stored in the "tables" table,
info about the publication is in the "publications" table and column names
are stored in the "subscribed_cols" table.
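
The reference counting described above can be modelled as follows. This is
an illustrative Python sketch, not a translation of the Perl module; the
class Publisher and its attributes are hypothetical. It borrows the
toolkit's convention of returning nothing on success and an error string
otherwise.

```python
class Publisher:
    def __init__(self):
        self.publications = {}  # publication name -> table name
        self.refcount = {}      # table name -> number of publications
        self.triggers = set()   # tables that currently have change triggers

    def publish(self, table, pubname=None):
        pubname = pubname or table
        if pubname in self.publications:
            return f"publication '{pubname}' already exists"
        self.publications[pubname] = table
        self.refcount[table] = self.refcount.get(table, 0) + 1
        if self.refcount[table] == 1:
            self.triggers.add(table)      # first publication: add triggers
        return None

    def unpublish(self, pubname):
        table = self.publications.pop(pubname, None)
        if table is None:
            return f"no such publication '{pubname}'"
        self.refcount[table] -= 1
        if self.refcount[table] == 0:
            del self.refcount[table]
            self.triggers.discard(table)  # last publication: drop triggers
        return None
```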
|
|
|
|
The next step is to subscribe a client to a table. A client is identified
|
|
by a user name and a node name. The subscribe function takes three
|
|
arguments: user, node & publication. The subscription process writes an
|
|
entry into the "subscribed" table with default values. Of note, the
|
|
"RefreshOnce" attribute is set to true whenever a table is published. This
|
|
indicates to the system that a full table refresh should be sent the next
|
|
time the client connects even if the client requests synchronization rather
|
|
than refresh.
|
|
|
|
The toolkit does not, yet, provide a way to manage the whereclause stored at
|
|
either the publication or client level. To use or test this feature, you
|
|
will need to set the whereclause attributes manually.
|
|
|
|
Tables and users can be unpublished and unsubscribed using the corresponding
|
|
functions within the tool kit's management interface. Because postgres
|
|
lacks an "ALTER TABLE DROP COLUMN" function, the unpublish function only
|
|
removes default values and indexes for those columns.
|
|
|
|
The API isn't the most robust thing in the world right now. All functions
|
|
return undef on success and an error string otherwise (like DBD). I hope to
|
|
clean up the API considerably over the next month. The code has not been
|
|
field tested at this time.
|
|
|
|
|
|
The files attached are:
|
|
|
|
1) SynKit.pm (A perl module that contains install/uninstall functions and a
|
|
simple api for synchronization & management)
|
|
|
|
2) sync_install.pl (Sample code to demonstrate the installation, publishing
|
|
and subscribe process)
|
|
|
|
3) sync_uninstall.pl (Sample code to demonstrate the uninstallation,
|
|
unpublishing and unsubscribe process)
|
|
|
|
|
|
To use them on Linux (don't know about Win32 but should work fine):
|
|
|
|
- set up a test database and make SURE plpgsql is installed
|
|
|
|
- install perl 5.05 along with Date::Parse(TimeDate-1.1) , DBI and DBD::Pg
|
|
modules [www.cpan.org]
|
|
|
|
- copy all three attached files to a test directory
|
|
|
|
- cd to your test directory
|
|
|
|
- edit all three files and change the three DBI variables to suit your
|
|
system (they are clearly marked)
|
|
|
|
- % perl sync_install.pl
|
|
|
|
- check out the tables, functions & triggers installed
|
|
|
|
- % perl sync.pl
|
|
|
|
- check out the 'sync_test' table, do some updates/inserts/deletes and run
|
|
sync.pl again
|
|
NOTE: Sanity checks default to allow no more than 50% of the table
|
|
to be changed by the client in a single session.
|
|
If you delete all (or most of) the rows you will get errors when
|
|
you run sync.pl again! (by design)
|
|
|
|
- % perl sync_uninstall.pl (when you are done)
|
|
|
|
- check out the sample scripts and the perl module code (commented, but
|
|
not documented)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
------=_NextPart_000_0062_01C0541E.125CAF30
|
|
Content-Type: application/octet-stream; name="sync.pl"
|
|
Content-Transfer-Encoding: quoted-printable
|
|
Content-Disposition: attachment; filename="sync.pl"
|
|
|
|
|
|
|
|
# This script depicts the synchronization process for two users.
|
|
|
|
|
|
## CHANGE THESE THREE VARIABLES TO MATCH YOUR SYSTEM ##########
|
|
my $dbi_connect_string =3D 'dbi:Pg:dbname=3Dtest;host=3Dsnoopy'; #
|
|
my $db_user =3D 'test'; #
|
|
my $db_pass =3D 'test'; #
|
|
#################################################################
|
|
|
|
my $ret; #holds return value
|
|
|
|
use SynKit;
|
|
|
|
#create a synchronization object (pass dbi connection info)
|
|
my $s =3D Synchronize->new($dbi_connect_string,$db_user,$db_pass);
|
|
|
|
#start a session by passing a user name, "node" identifier and a collision =
|
|
queue name (client or server)
|
|
$ret =3D $s->start_session('JOE','REMOTE_NODE_NAME','server');
|
|
print "Handle this error: $ret\n\n" if $ret;
|
|
|
|
#call this once before attempting to apply individual changes
|
|
$ret =3D $s->start_changes('sync_test',['name']);
|
|
print "Handle this error: $ret\n\n" if $ret;
|
|
|
|
#call this for each change the client wants to make to the database
|
|
$ret =3D $s->apply_change(CLIENTROWID,'insert',undef,['ted']);
|
|
print "Handle this error: $ret\n\n" if $ret;
|
|
|
|
#call this for each change the client wants to make to the database
|
|
$ret =3D $s->apply_change(CLIENTROWID,'insert','1973-11-10 11:25:00 AM -05=
|
|
',['tim']);
|
|
print "Handle this error: $ret\n\n" if $ret;
|
|
|
|
#call this for each change the client wants to make to the database
|
|
$ret =3D $s->apply_change(999,'update',undef,['tom']);
|
|
print "Handle this error: $ret\n\n" if $ret;
|
|
|
|
#call this for each change the client wants to make to the database
|
|
$ret =3D $s->apply_change(1,'update',undef,['tom']);
|
|
print "Handle this error: $ret\n\n" if $ret;
|
|
|
|
#call this once after all changes have been submitted
|
|
$ret =3D $s->end_changes();
|
|
print "Handle this error: $ret\n\n" if $ret;
|
|
|
|
#call this to get updates from all subscribed tables
|
|
$ret =3D $s->get_all_updates();
|
|
print "Handle this error: $ret\n\n" if $ret;
|
|
|
|
print "\n\nSynchronization session is complete. (JOE) \n\n";
|
|
|
|
|
|
# make some changes to the database (server perspective)
|
|
|
|
print "\n\nMaking changes to the database. (server side) \n\n";
|
|
|
|
use DBI;
|
|
my $dbh =3D DBI->connect($dbi_connect_string,$db_user,$db_pass);
|
|
|
|
$dbh->do("insert into sync_test values ('roger')");
|
|
$dbh->do("insert into sync_test values ('john')");
|
|
$dbh->do("insert into sync_test values ('harry')");
|
|
$dbh->do("delete from sync_test where name =3D 'roger'");
|
|
$dbh->do("update sync_test set name =3D 'tom' where name =3D 'harry'");
|
|
|
|
$dbh->disconnect;
|
|
|
|
|
|
#now do another session for a different user
|
|
|
|
#start a session by passing a user name, "node" identifier and a collision =
|
|
queue name (client or server)
|
|
$ret =3D $s->start_session('KEN','ANOTHER_REMOTE_NODE_NAME','server');
|
|
print "Handle this error: $ret\n\n" if $ret;
|
|
|
|
#call this to get updates from all subscribed tables
|
|
$ret =3D $s->get_all_updates();
|
|
print "Handle this error: $ret\n\n" if $ret;
|
|
|
|
print "\n\nSynchronization session is complete. (KEN)\n\n";
|
|
|
|
print "Now look at your database and see what happened, make changes to the =
|
|
test table, etc. and run this again.\n\n";
|
|
|
|
------=_NextPart_000_0062_01C0541E.125CAF30
|
|
Content-Type: application/octet-stream; name="sync_uninstall.pl"
|
|
Content-Transfer-Encoding: quoted-printable
|
|
Content-Disposition: attachment; filename="sync_uninstall.pl"
|
|
|
|
|
|
# this script uninstalls the synchronization system using the SyncManager o=
|
|
bject;
|
|
|
|
use SynKit;
|
|
|
|
### CHANGE THESE TO MATCH YOUR SYSTEM ########################
|
|
my $dbi_connect_string =3D 'dbi:Pg:dbname=3Dtest;host=3Dsnoopy'; #
|
|
my $db_user =3D 'test'; #
|
|
my $db_pass =3D 'test'; #
|
|
#################################################################
|
|
|
|
|
|
my $ret; #holds return value
|
|
|
|
#create an instance of the SyncManager object
|
|
my $m =3D SyncManager->new($dbi_connect_string,$db_user,$db_pass);
|
|
|
|
# call this to unsubscribe a user/node (not necessary if you are uninstalli=
|
|
ng)
|
|
print $m->unsubscribe('KEN','ANOTHER_REMOTE_NODE_NAME','sync_test');
|
|
|
|
#call this to unpublish a table (not necessary if you are uninstalling)
|
|
print $m->unpublish('sync_test');
|
|
|
|
#call this to uninstall the synchronization system
|
|
# NOTE: this will automatically unpublish & unsubscribe all users
|
|
print $m->UNINSTALL;
|
|
|
|
# now let's drop our little test table
|
|
use DBI;
|
|
my $dbh =3D DBI->connect($dbi_connect_string,$db_user,$db_pass);
|
|
$dbh->do("drop table sync_test");
|
|
$dbh->disconnect;
|
|
|
|
print "\n\nI hope you enjoyed this little demonstration\n\n";
|
|
|
|
|
|
|
|
------=_NextPart_000_0062_01C0541E.125CAF30
|
|
Content-Type: application/octet-stream; name="sync_install.pl"
|
|
Content-Transfer-Encoding: quoted-printable
|
|
Content-Disposition: attachment; filename="sync_install.pl"
|
|
|
|
|
|
# This script shows how to install the synchronization system=20
|
|
# using the SyncManager object
|
|
|
|
use SynKit;
|
|
|
|
### CHANGE THESE TO MATCH YOUR SYSTEM ##########################
|
|
my $dbi_connect_string =3D 'dbi:Pg:dbname=3Dtest;host=3Dsnoopy'; #
|
|
my $db_user =3D 'test'; #
|
|
my $db_pass =3D 'test'; #
|
|
#################################################################
|
|
my $ret; #holds return value
|
|
|
|
|
|
#create an instance of the sync manager object
|
|
my $m =3D SyncManager->new($dbi_connect_string,$db_user,$db_pass);
|
|
|
|
#Call this to install the synchronization management tables, etc.
|
|
$ret =3D $m->INSTALL;
|
|
die "Handle this error: $ret\n\n" if $ret;
|
|
|
|
|
|
|
|
#create a test table for us to demonstrate with
|
|
use DBI;
|
|
my $dbh =3D DBI->connect($dbi_connect_string,$db_user,$db_pass);
|
|
$dbh->do("create table sync_test (name text)");
|
|
$dbh->do("insert into sync_test values ('rob')");
|
|
$dbh->do("insert into sync_test values ('rob')");
|
|
$dbh->do("insert into sync_test values ('rob')");
|
|
$dbh->do("insert into sync_test values ('ted')");
|
|
$dbh->do("insert into sync_test values ('ted')");
|
|
$dbh->do("insert into sync_test values ('ted')");
|
|
$dbh->disconnect;
|
|
|
|
|
|
|
|
|
|
#call this to "publish" a table
|
|
$ret =3D $m->publish('sync_test');
|
|
print "Handle this error: $ret\n\n" if $ret;
|
|
|
|
#call this to "subscribe" a user/node to a publication (table)
|
|
$ret =3D $m->subscribe('JOE','REMOTE_NODE_NAME','sync_test');
|
|
print "Handle this error: $ret\n\n" if $ret;
|
|
|
|
#call this to "subscribe" a user/node to a publication (table)
|
|
$ret =3D $m->subscribe('KEN','ANOTHER_REMOTE_NODE_NAME','sync_test');
|
|
print "Handle this error: $ret\n\n" if $ret;
|
|
|
|
|
|
print "Now you can do: 'perl sync.pl' a few times to play\n\n";
|
|
print "Do 'perl sync_uninstall.pl' to uninstall the system\n";
|
|
|
|
|
|
------=_NextPart_000_0062_01C0541E.125CAF30
|
|
Content-Type: application/octet-stream; name="SynKit.pm"
|
|
Content-Transfer-Encoding: quoted-printable
|
|
Content-Disposition: attachment; filename="SynKit.pm"
|
|
|
|
# Perl DB synchronization toolkit
|
|
|
|
#created for postgres 7.0.2 +
|
|
use strict;
|
|
|
|
BEGIN {
|
|
use vars qw($VERSION);
|
|
# set the version for version checking
|
|
$VERSION =3D 1.00;
|
|
}
|
|
|
|
|
|
package Synchronize;
|
|
|
|
use DBI;
|
|
|
|
use Date::Parse;
|
|
|
|
# new requires 3 arguments: dbi connection string, plus the corresponding u=
|
|
sername and password to get connected to the database
|
|
sub new {
|
|
my $proto =3D shift;
|
|
my $class =3D ref($proto) || $proto;
|
|
my $self =3D {};
|
|
|
|
my $dbi =3D shift;
|
|
my $user =3D shift;
|
|
my $pass =3D shift;
|
|
|
|
$self->{DBH} =3D DBI->connect($dbi,$user,$pass) || die "Failed to connect =
|
|
to database: ".DBI->errstr();
|
|
|
|
$self->{user} =3D undef;
|
|
$self->{node} =3D undef;
|
|
$self->{status} =3D undef; # holds status of table update portion of sessi=
|
|
on
|
|
$self->{pubs} =3D {}; #holds hash of pubs available to session with val =
|
|
=3D 1 if ok to request sync
|
|
$self->{orderpubs} =3D undef; #holds array ref of subscribed pubs ordered =
|
|
by sync_order
|
|
$self->{this_post_ver} =3D undef; #holds the version number under which th=
|
|
is session will post changes
|
|
$self->{max_ver} =3D undef; #holds the maximum safe version for getting up=
|
|
dates
|
|
$self->{current} =3D {}; #holds the current publication info to which chan=
|
|
ges are being applied
|
|
$self->{queue} =3D 'server'; # tells collide function what to do with coll=
|
|
isions. (default is to hold on server)
|
|
|
|
$self->{DBLOG}=3D DBI->connect($dbi,$user,$pass) || die "cannot log to DB:=
|
|
".DBI->errstr();=20
|
|
|
|
|
|
return bless ($self, $class);
|
|
}
|
|
|
|
sub dblog {=20
|
|
my $self =3D shift;
|
|
my $msg =3D $self->{DBLOG}->quote($_[0]);
|
|
my $quser =3D $self->{DBH}->quote($self->{user});
|
|
my $qnode =3D $self->{DBH}->quote($self->{node});
|
|
$self->{DBLOG}->do("insert into ____sync_log____ (username, nodename,stamp=
|
|
, message) values($quser, $qnode, now(), $msg)");
|
|
}
|
|
|
|
|
|
#start_session establishes session wide information and other housekeeping =
|
|
chores
|
|
# Accepts username, nodename and queue (client or server) as arguments;
|
|
|
|
sub start_session {
|
|
my $self =3D shift;
|
|
$self->{user} =3D shift || die 'Username is required';
|
|
$self->{node} =3D shift || die 'Nodename is required';
|
|
$self->{queue} =3D shift;
|
|
|
|
|
|
if ($self->{queue} ne 'server' && $self->{queue} ne 'client') {
|
|
die "You must provide a queue argument of either 'server' or 'client'";
|
|
}
|
|
|
|
my $quser =3D $self->{DBH}->quote($self->{user});
|
|
my $qnode =3D $self->{DBH}->quote($self->{node});
|
|
|
|
my $sql =3D "select pubname from ____subscribed____ where username =3D $qu=
|
|
ser and nodename =3D $qnode";
|
|
my @pubs =3D $self->GetColList($sql);
|
|
|
|
return 'User/Node has no subscriptions!' if !@pubs;
|
|
|
|
# go though the list and check permissions and rules for each
|
|
foreach my $pub (@pubs) {
|
|
my $qpub =3D $self->{DBH}->quote($pub);
|
|
my $sql =3D "select disabled, pubname, fullrefreshonly, refreshonce,post_=
|
|
ver from ____subscribed____ where username =3D $quser and pubname =3D $qpub=
|
|
and nodename =3D $qnode";
|
|
my $sth =3D $self->{DBH}->prepare($sql) || die $self->{DBH}->errstr;
|
|
$sth->execute || die $self->{DBH}->errstr;
|
|
my @row;
|
|
while (@row =3D $sth->fetchrow_array) {
|
|
next if $row[0]; #publication is disabled
|
|
next if !defined($row[1]); #publication does not exist (should never occ=
|
|
ur)
|
|
if ($row[2] || $row[3]) { #fullrefreshonly or refreshonce flag is set
|
|
$self->{pubs}->{$pub} =3D 0; #refresh only
|
|
next;
|
|
}
|
|
if (!defined($row[4])) { #no previous session exists, must refresh
|
|
$self->{pubs}->{$pub} =3D 0; #refresh only
|
|
next;
|
|
}
|
|
$self->{pubs}->{$pub} =3D 1; #OK for sync
|
|
}
|
|
$sth->finish;
|
|
}
|
|
|
|
|
|
$sql =3D "select pubname from ____publications____ order by sync_order";
|
|
my @op =3D $self->GetColList($sql);
|
|
my @orderpubs;
|
|
|
|
#loop through ordered pubs and remove non subscribed publications
|
|
foreach my $pub (@op) {
|
|
push @orderpubs, $pub if defined($self->{pubs}->{$pub});
|
|
}
|
|
=09
|
|
$self->{orderpubs} =3D \@orderpubs;
|
|
|
|
# Now we obtain a session version number, etc.
|
|
|
|
$self->{DBH}->{AutoCommit} =3D 0; #allows "transactions"
|
|
$self->{DBH}->{RaiseError} =3D 1; #script [or eval] will automatically die=
|
|
on errors
|
|
|
|
eval { #start DB transaction
|
|
|
|
#lock the version sequence until we determine that we have gotten
|
|
#a good value. Lock will be released on commit.
|
|
$self->{DBH}->do('lock ____version_seq____ in access exclusive mode');
|
|
|
|
# remove stale locks if they exist
|
|
my $sql =3D "delete from ____last_stable____ where username =3D $quser an=
|
|
d nodename =3D $qnode";
|
|
$self->{DBH}->do($sql);
|
|
|
|
# increment version sequence & grab the next val as post_ver
|
|
$sql =3D "select nextval('____version_seq____')";
|
|
my $sth =3D $self->{DBH}->prepare($sql);
|
|
$sth->execute;
|
|
($self->{this_post_ver}) =3D $sth->fetchrow_array();
|
|
$sth->finish;
|
|
# grab max_ver from last_stable
|
|
|
|
$sql =3D "select min(version) from ____last_stable____";=20
|
|
$sth =3D $self->{DBH}->prepare($sql);
|
|
$sth->execute;
|
|
($self->{max_ver}) =3D $sth->fetchrow_array();
|
|
$sth->finish;
|
|
|
|
# if there was no version in lock table, then take the ID that was in use
|
|
# when we started the session (this_post_ver - 1)
|
|
|
|
$self->{max_ver} =3D $self->{this_post_ver} -1 if (!defined($self->{max_v=
|
|
er}));
|
|
|
|
# lock post_ver by placing it in last_stable
|
|
$self->{DBH}->do("insert into ____last_stable____ (version, username, nod=
|
|
ename) values ($self->{this_post_ver}, $quser,$qnode)");
|
|
|
|
# increment version sequence again (discard result)
|
|
$sql =3D "select nextval('____version_seq____')";
|
|
$sth =3D $self->{DBH}->prepare($sql);
|
|
$sth->execute;
|
|
$sth->fetchrow_array();
|
|
$sth->finish;
|
|
|
|
}; #end eval/transaction
|
|
|
|
if ($@) { # part of transaction failed
$self->{DBH}->rollback; # undo the partial work before returning
$self->{DBH}->{AutoCommit} =3D 1;
$self->{DBH}->{RaiseError} =3D 0;
return 'Start session failed';
} else { # all's well commit block
$self->{DBH}->commit;
}
|
|
$self->{DBH}->{AutoCommit} =3D 1;
|
|
$self->{DBH}->{RaiseError} =3D 0;
|
|
|
|
return undef;
|
|
|
|
}
|
|
|
|
#start changes should be called once before applying individual change requests
|
|
# Requires publication and ref to columns that will be updated as arguments
|
|
sub start_changes {
|
|
my $self =3D shift;
|
|
my $pub =3D shift || die 'Publication is required';
|
|
my $colref =3D shift || die 'Reference to column array is required';
|
|
|
|
$self->{status} =3D 'starting';
|
|
|
|
my $qpub =3D $self->{DBH}->quote($pub);
|
|
my $quser =3D $self->{DBH}->quote($self->{user});
|
|
my $qnode =3D $self->{DBH}->quote($self->{node});
|
|
|
|
my @cols =3D @{$colref};
|
|
	my @subcols = $self->GetColList("select col_name from ____subscribed_cols____ where username = $quser and nodename = $qnode and pubname = $qpub");
|
|
my %subcols;
|
|
foreach my $col (@subcols) {
|
|
$subcols{$col} =3D 1;
|
|
}
|
|
foreach my $col (@cols) {=09
|
|
return "User/node is not subscribed to column '$col'" if !$subcols{$col};
|
|
}
|
|
|
|
	my $sql = "select pubname, readonly, last_session, post_ver, last_ver, whereclause, sanity_limit, sanity_delete, sanity_update, sanity_insert from ____subscribed____ where username = $quser and pubname = $qpub and nodename = $qnode";
	my ($junk, $readonly, $last_session, $post_ver, $last_ver, $whereclause, $sanity_limit, $sanity_delete, $sanity_update, $sanity_insert) = $self->GetOneRow($sql);

|
|
return 'Publication is read only' if $readonly;
|
|
|
|
	$sql = "select whereclause from ____publications____ where pubname = $qpub";
	my ($wc) = $self->GetOneRow($sql);
	# combine the subscription and publication clauses; either may be empty,
	# so only add 'and' when both are present
	$whereclause = '('.$whereclause.')' if $whereclause;
	if ($wc) {
		$whereclause = $whereclause ? $whereclause.' and ('.$wc.')' : '('.$wc.')';
	}
|
|
|
|
	my ($table) = $self->GetOneRow("select tablename from ____publications____ where pubname = $qpub");
|
|
|
|
return 'Publication is not registered correctly' if !defined($table);
|
|
|
|
my %info;
|
|
$info{pub} =3D $pub;
|
|
$info{whereclause} =3D $whereclause;
|
|
$info{post_ver} =3D $post_ver;
|
|
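	# note: the character class is [+-]; a '|' inside brackets would match a literal pipe
	$last_session =~ s/([+-]\d\d?)$/ $1/ if defined $last_session; #put a space before timezone
	$last_session = str2time($last_session) if defined $last_session; #convert to perltime (seconds since 1970)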
|
|
$info{last_session} =3D $last_session;
|
|
$info{last_ver} =3D $last_ver;
|
|
$info{table} =3D $table;
|
|
$info{cols} =3D \@cols;
|
|
|
|
	$sql = "select count(oid) from $table";
	$sql = $sql.' where '.$whereclause if $whereclause; # the stored clause has no 'where' keyword of its own
	my ($rowcount) = $self->GetOneRow($sql);
|
|
|
|
#calculate sanity levels (convert from % to number of rows)
|
|
# limits defined as less than 1 mean no limit
|
|
	$info{sanitylimit} = $rowcount * ($sanity_limit / 100) if $sanity_limit > 0;
	$info{insertlimit} = $rowcount * ($sanity_insert / 100) if $sanity_insert > 0;
	$info{updatelimit} = $rowcount * ($sanity_update / 100) if $sanity_update > 0;
	$info{deletelimit} = $rowcount * ($sanity_delete / 100) if $sanity_delete > 0;
|
|
|
|
$self->{sanitycount} =3D 0;
|
|
$self->{updatecount} =3D 0;
|
|
$self->{insertcount} =3D 0;
|
|
$self->{deletecount} =3D 0;
|
|
|
|
$self->{current} =3D \%info;
|
|
|
|
	$self->{DBH}->{AutoCommit} = 0; #turn on transaction behavior so we can roll back on sanity limits, etc.
|
|
|
|
$self->{status} =3D 'ready';
|
|
|
|
return undef;
|
|
}
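
# Illustrative sketch only, not part of the original module: one way to
# combine the optional subscription-level and publication-level where clauses
# that start_changes and get_updates read from the management tables. The
# name combine_where_sketch is hypothetical. Each non-empty clause is wrapped
# in parentheses and the pieces are joined with "and", so a missing clause
# never leaves a dangling "and" in the generated SQL.
sub combine_where_sketch {
	# keep only clauses that are defined and non-empty
	my @clauses = grep { defined($_) && length($_) } @_;
	# returns '' when no clause was supplied
	return join(' and ', map { "($_)" } @clauses);
}
# Assumed usage: my $where = combine_where_sketch($whereclause, $wc);
# the caller would then append " where $where" only when $where is non-empty.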
|
|
|
|
#call this once all changes are submitted to commit them;
|
|
sub end_changes {
|
|
my $self =3D shift;
|
|
return undef if $self->{status} ne 'ready';
|
|
$self->{DBH}->commit;
|
|
$self->{DBH}->{AutoCommit} =3D 1;
|
|
$self->{status} =3D 'success';
|
|
return undef;
|
|
}
|
|
|
|
#call apply_change once for each row level client update
|
|
# Accepts 4 params: rowid, action, timestamp and reference to data array
|
|
# Note: timestamp can be undef, data can be undef
|
|
# timestamp MUST be in perl time (secs since 1970)
|
|
|
|
#this routine checks basic timestamp info and sanity limits, then passes the info along to do_action() for processing
|
|
sub apply_change {
|
|
my $self =3D shift;
|
|
	my $rowid = shift || return 'Row ID is required'; #don't die just for one bad row
	my $action = shift || return 'Action is required'; #don't die just for one bad row
|
|
my $timestamp =3D shift;
|
|
my $dataref =3D shift;
|
|
$action =3D lc($action);
|
|
|
|
$timestamp =3D str2time($timestamp) if $timestamp;
|
|
|
|
	return 'Status failure, cannot accept changes: '.$self->{status} if $self->{status} ne 'ready';
|
|
|
|
my %info =3D %{$self->{current}};
|
|
|
|
$self->{sanitycount}++;
|
|
if ($info{sanitylimit} && $self->{sanitycount} > $info{sanitylimit}) {
|
|
# too many changes from client
|
|
my $ret =3D $self->sanity('limit');
|
|
return $ret if $ret;
|
|
}
|
|
|
|
=09
|
|
	if ($timestamp && $timestamp > time() + 3600) { # current time + one hour
		#client's clock is way off, cannot submit changes in future
		my $ret = $self->collide('future', $info{table}, $rowid, $action, undef, $timestamp, $dataref, $self->{queue});
		return $ret if $ret;
	}
|
|
|
|
	if ($timestamp && $timestamp < $info{last_session} - 3600) { # last session time less one hour
		#client's clock is way off, cannot submit changes that occurred before last sync date
		my $ret = $self->collide('past', $info{table}, $rowid, $action, undef, $timestamp, $dataref, $self->{queue});
		return $ret if $ret;
	}
|
|
|
|
my ($crow, $cver, $ctime); #current row,ver,time
|
|
if ($action ne 'insert') {
|
|
		my $sql = "select ____rowid____, ____rowver____, ____stamp____ from $info{table} where ____rowid____ = $rowid";
		($crow, $cver, $ctime) = $self->GetOneRow($sql);
		if (!defined($crow)) {
			my $ret = $self->collide('norow', $info{table}, $rowid, $action, undef, $timestamp, $dataref, $self->{queue});
			return $ret if $ret;
		}
|
|
|
|
		# [+-] rather than [+|-]: a '|' inside brackets would match a literal pipe
		$ctime =~ s/([+-]\d\d?)$/ $1/ if defined $ctime; #put space before timezone
		$ctime = str2time($ctime) if $ctime; #convert to perl time
|
|
|
|
		if ($timestamp) {
			if ($ctime < $timestamp) {
				my $ret = $self->collide('time', $info{table}, $rowid, $action, undef, $timestamp, $dataref, $self->{queue});
				return $ret if $ret;
			}

		} else {
			if ($cver > $self->{this_post_ver}) {
				my $ret = $self->collide('version', $info{table}, $rowid, $action, undef, $timestamp, $dataref, $self->{queue});
				return $ret if $ret;
			}
		}
|
|
=09
|
|
}
|
|
|
|
if ($action eq 'insert') {
|
|
$self->{insertcount}++;
|
|
if ($info{insertlimit} && $self->{insertcount} > $info{insertlimit}) {
|
|
# too many changes from client
|
|
my $ret =3D $self->sanity('insert');
|
|
return $ret if $ret;
|
|
}
|
|
|
|
		my $qtable = $self->{DBH}->quote($info{table});
		# fetch the id in list context first; concatenating the call directly
		# would evaluate GetOneRow in scalar context and yield a column count
		my ($tableid) = $self->GetOneRow("select table_id from ____tables____ where tablename = $qtable");
		return 'Table incorrectly registered, cannot get rowid sequence name: '.$self->{DBH}->errstr() if not defined $tableid;
		my $rowidsequence = '_'.$tableid.'__rowid_seq';
|
|
|
|
my @data;
|
|
foreach my $val (@{$dataref}) {
|
|
push @data, $self->{DBH}->quote($val);
|
|
}
|
|
		my $sql = "insert into $info{table} (";
		if ($timestamp) {
			$sql = $sql . join(',',@{$info{cols}}) . ',____rowver____, ____stamp____) values (';
			$sql = $sql . join (',',@data) .','.$self->{this_post_ver}.',\''.localtime($timestamp).'\')';
		} else {
			$sql = $sql . join(',',@{$info{cols}}) . ',____rowver____) values (';
			$sql = $sql . join (',',@data) .','.$self->{this_post_ver}.')';
		}
|
|
		my $ret = $self->{DBH}->do($sql);
		if (!$ret) {
			my $ret = $self->collide($self->{DBH}->errstr(), $info{table}, $rowid, $action, undef, $timestamp, $dataref, $self->{queue});
			return $ret if $ret;
		}
		my ($newrowid) = $self->GetOneRow("select currval('$rowidsequence')");
		return 'Failed to get current rowid on inserted row: '.$self->{DBH}->errstr if not defined $newrowid;
|
|
$self->changerowid($rowid, $newrowid);
|
|
}
|
|
|
|
if ($action eq 'update') {
|
|
$self->{updatecount}++;
|
|
if ($info{updatelimit} && $self->{updatecount} > $info{updatelimit}) {
|
|
# too many changes from client
|
|
my $ret =3D $self->sanity('update');
|
|
return $ret if $ret;
|
|
}
|
|
my @data;
|
|
foreach my $val (@{$dataref}) {
|
|
push @data, $self->{DBH}->quote($val);
|
|
}=09
|
|
|
|
my $sql =3D "update $info{table} set ";
|
|
my @cols =3D @{$info{cols}};
|
|
foreach my $col (@cols) {
|
|
my $val =3D shift @data;
|
|
$sql =3D $sql . "$col =3D $val,";
|
|
}
|
|
		$sql = $sql." ____rowver____ = $self->{this_post_ver}";
		$sql = $sql.", ____stamp____ = '".localtime($timestamp)."'" if $timestamp;
		$sql = $sql." where ____rowid____ = $rowid";
		$sql = $sql." and $info{whereclause}" if $info{whereclause};
		my $ret = $self->{DBH}->do($sql);
		if (!$ret) {
			my $ret = $self->collide($self->{DBH}->errstr(), $info{table}, $rowid, $action, undef, $timestamp, $dataref, $self->{queue});
			return $ret if $ret;
		}
|
|
|
|
}
|
|
|
|
if ($action eq 'delete') {
|
|
$self->{deletecount}++;
|
|
if ($info{deletelimit} && $self->{deletecount} > $info{deletelimit}) {
|
|
# too many changes from client
|
|
my $ret =3D $self->sanity('delete');
|
|
return $ret if $ret;
|
|
}
|
|
		# the extra clause is appended with 'and': each statement already has
		# a 'where', and a second 'where' would be a syntax error
		if ($timestamp) {
			my $sql = "update $info{table} set ____rowver____ = $self->{this_post_ver}, ____stamp____ = '".localtime($timestamp)."' where ____rowid____ = $rowid";
			$sql = $sql . " and $info{whereclause}" if $info{whereclause};
			$self->{DBH}->do($sql) || return 'Predelete update failed: '.$self->{DBH}->errstr;
		} else {
			my $sql = "update $info{table} set ____rowver____ = $self->{this_post_ver} where ____rowid____ = $rowid";
			$sql = $sql . " and $info{whereclause}" if $info{whereclause};
			$self->{DBH}->do($sql) || return 'Predelete update failed: '.$self->{DBH}->errstr;
		}
		my $sql = "delete from $info{table} where ____rowid____ = $rowid";
		$sql = $sql . " and $info{whereclause}" if $info{whereclause};
|
|
		my $ret = $self->{DBH}->do($sql);
		if (!$ret) {
			my $ret = $self->collide($self->{DBH}->errstr(), $info{table}, $rowid, $action, undef, $timestamp, $dataref, $self->{queue});
			return $ret if $ret;
		}
|
|
}
|
|
=09
|
|
=09
|
|
return undef;
|
|
}
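
# Illustrative sketch only, not part of the original module: apply_change and
# start_changes put a space in front of a trailing numeric timezone offset
# before handing the timestamp string to str2time. The helper name
# split_tz_sketch is hypothetical and only isolates that one substitution.
# Caveat: the pattern looks at the end of the string only, so a bare date
# such as '1999-12-20' would also have its final '-20' split off.
sub split_tz_sketch {
	my $stamp = shift;
	return undef if !defined($stamp);
	# a sign followed by one or two digits at the very end of the string
	$stamp =~ s/([+-]\d\d?)$/ $1/;
	return $stamp;
}
# Assumed usage: split_tz_sketch('1999-12-20 21:29:06+01') gives
# '1999-12-20 21:29:06 +01'; a string without an offset is returned unchanged.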
|
|
|
|
sub changerowid {
|
|
my $self =3D shift;
|
|
my $oldid =3D shift;
|
|
my $newid =3D shift;
|
|
$self->writeclient('changeid',"$oldid\t$newid");
|
|
}
|
|
|
|
#writes info to client
|
|
sub writeclient {
|
|
my $self =3D shift;
|
|
my $type =3D shift;
|
|
my @info =3D @_;
|
|
print "$type: ",join("\t",@info),"\n";
|
|
return undef;
|
|
}
|
|
|
|
# Override this for custom behavior.  Default is to echo back the sanity failure reason.
|
|
# If you want to override a collision, you can do so by returning undef.
|
|
sub sanity {
|
|
my $self =3D shift;
|
|
my $reason =3D shift;
|
|
$self->{status} =3D 'sanity exceeded';
|
|
$self->{DBH}->rollback;
|
|
return $reason;
|
|
}
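
# Illustrative sketch only, not part of the original module: how a percentage
# sanity limit becomes a row count in start_changes, and when apply_change
# treats it as exceeded. Both helper names are hypothetical. As in the code
# above, a percentage of zero or less means "no limit", and a computed limit
# of zero rows is also treated as "no limit" because it tests false.
sub sanity_rows_sketch {
	my ($rowcount, $percent) = @_;
	return undef if !defined($percent) || $percent <= 0;	# no limit configured
	return $rowcount * ($percent / 100);
}
sub sanity_exceeded_sketch {
	my ($changes, $limit) = @_;
	return ($limit && $changes > $limit) ? 1 : 0;
}
# Assumed usage: with 200 rows and a 50 percent limit, the limit is 100 rows,
# so the 101st change from a client is the first one to trip the check.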
|
|
|
|
# Override this for custom behavior.  Default is to echo back the failure reason.
|
|
# If you want to override a collision, you can do so by returning undef.
|
|
sub collide {
|
|
my $self =3D shift;
|
|
my ($reason,$table,$rowid,$action,$rowver,$timestamp,$data, $queue) =3D @_;
|
|
|
|
my @data;
|
|
foreach my $val (@{$data}) {
|
|
push @data, $self->{DBH}->quote($val);
|
|
}=09
|
|
|
|
if ($reason =3D~ /integrity/i || $reason =3D~ /constraint/i) {
|
|
		$self->{status} = 'integrity violation';
|
|
$self->{DBH}->rollback;
|
|
}
|
|
|
|
my $datastring;
|
|
my @cols =3D @{$self->{current}->{cols}};
|
|
foreach my $col (@cols) {
|
|
my $val =3D shift @data;
|
|
$datastring =3D $datastring . "$col =3D $val,";
|
|
}
|
|
chop $datastring; #remove trailing comma
|
|
|
|
if ($queue eq 'server') {
|
|
$timestamp =3D localtime($timestamp) if defined($timestamp);
|
|
$rowid =3D $self->{DBH}->quote($rowid);
|
|
$rowid =3D 'null' if !defined($rowid);
|
|
$rowver =3D 'null' if !defined($rowver);
|
|
$timestamp =3D $self->{DBH}->quote($timestamp);
|
|
$data =3D $self->{DBH}->quote($data);
|
|
my $qtable =3D $self->{DBH}->quote($table);
|
|
my $qreason =3D $self->{DBH}->quote($reason);
|
|
my $qaction =3D $self->{DBH}->quote($action);
|
|
my $quser =3D $self->{DBH}->quote($self->{user});
|
|
my $qnode =3D $self->{DBH}->quote($self->{node});
|
|
$datastring =3D $self->{DBH}->quote($datastring);
|
|
|
|
|
|
		# the column list names ten columns, so the queue value must be supplied too
		my $qqueue = $self->{DBH}->quote($queue);
		my $sql = "insert into ____collision____ (rowid,
tablename, rowver, stamp, data, reason, action, username,
nodename, queue) values($rowid,$qtable, $rowver, $timestamp,$datastring,
$qreason, $qaction,$quser, $qnode, $qqueue)";
		$self->{DBH}->do($sql) || die 'Failed to write to collision table: '.$self->{DBH}->errstr;
|
|
|
|
} else {
|
|
|
|
		$self->writeclient('collision',$rowid,$table, $rowver, $timestamp,$reason, $action,$self->{user}, $self->{node}, $data);
|
|
|
|
}
|
|
return $reason;
|
|
}
|
|
|
|
#calls get_updates once for each publication the user/node is subscribed to in correct sync_order
|
|
sub get_all_updates {
|
|
my $self =3D shift;
|
|
my $quser =3D $self->{DBH}->quote($self->{user});
|
|
my $qnode =3D $self->{DBH}->quote($self->{node});
|
|
|
|
foreach my $pub (@{$self->{orderpubs}}) {
|
|
		$self->get_updates($pub, 1); #request update as sync unless overridden by flags
|
|
}
|
|
|
|
}
|
|
|
|
# Call this once for each table the client needs refreshed or sync'ed AFTER all inbound client changes have been posted
|
|
# Accepts publication and sync flag as arguments
|
|
sub get_updates {
|
|
my $self =3D shift;
|
|
my $pub =3D shift || die 'Publication is required';
|
|
my $sync =3D shift;
|
|
|
|
my $qpub =3D $self->{DBH}->quote($pub);
|
|
my $quser =3D $self->{DBH}->quote($self->{user});
|
|
my $qnode =3D $self->{DBH}->quote($self->{node});
|
|
|
|
#enforce refresh and refreshonce flags
|
|
	undef $sync if !$self->{pubs}->{$pub};
|
|
|
|
|
|
	my %info = %{ $self->{current} || {} }; # dereference the stored hash ref (empty if start_changes was never called)
|
|
|
|
	my @cols = $self->GetColList("select col_name from ____subscribed_cols____ where username = $quser and nodename = $qnode and pubname = $qpub");
|
|
|
|
	my ($table) = $self->GetOneRow("select tablename from ____publications____ where pubname = $qpub");
	return 'Table incorrectly registered for read' if !defined($table);
	my $qtable = $self->{DBH}->quote($table);
|
|
|
|
|
|
	my $sql = "select pubname, last_session, post_ver, last_ver, whereclause from ____subscribed____ where username = $quser and pubname = $qpub and nodename = $qnode";
	my ($junk, $last_session, $post_ver, $last_ver, $whereclause) = $self->GetOneRow($sql);
|
|
|
|
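	my ($wc) = $self->GetOneRow("select whereclause from ____publications____ where pubname = $qpub");

	# combine the subscription and publication clauses; either may be empty,
	# so only add 'and' when both are present
	$whereclause = '('.$whereclause.')' if $whereclause;
	if ($wc) {
		$whereclause = $whereclause ? $whereclause.' and ('.$wc.')' : '('.$wc.')';
	}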
|
|
|
|
|
|
if ($sync) {
|
|
$self->writeclient('start synchronize', $pub);
|
|
} else {
|
|
$self->writeclient('start refresh', $pub);
|
|
		$self->{DBH}->do("update ____subscribed____ set refreshonce = false where pubname = $qpub and username = $quser and nodename = $qnode") || return 'Failed to clear RefreshOnce flag: '.$self->{DBH}->errstr;
|
|
}
|
|
|
|
$self->writeclient('columns',@cols);
|
|
|
|
|
|
|
|
	$sql = "select ____rowid____, ".join(',', @cols)." from $table"; # reuse $sql; it is already declared in this sub
|
|
	if ($sync) {
		$sql = $sql." where (____rowver____ <= $self->{max_ver} and ____rowver____ > $last_ver)";
		# also require $post_ver to be defined, since it is the value interpolated below
		if (defined($self->{this_post_ver}) && defined($post_ver)) {
			$sql = $sql . " and (____rowver____ <> $post_ver)";
		}
	} else {
		$sql = $sql." where (____rowver____ <= $self->{max_ver})";
	}
|
|
$sql =3D $sql." and $whereclause" if $whereclause;
|
|
=09
|
|
	my $sth = $self->{DBH}->prepare($sql) || return 'Failed to prepare SQL for updates: '.$self->{DBH}->errstr;
	$sth->execute || return 'Failed to execute SQL for updates: '.$self->{DBH}->errstr;
|
|
my @row;
|
|
while (@row =3D $sth->fetchrow_array) {
|
|
$self->writeclient('update/insert',@row);
|
|
}
|
|
|
|
$sth->finish;
|
|
|
|
# now get deleted rows
|
|
if ($sync) {
|
|
		$sql = "select rowid from ____deleted____ where (tablename = $qtable)";
		$sql = $sql." and (rowver <= $self->{max_ver} and rowver > $last_ver)";
		if (defined($self->{this_post_ver})) {
			$sql = $sql . " and (rowver <> $self->{this_post_ver})";
		}
		$sql = $sql." and $whereclause" if $whereclause;

		$sth = $self->{DBH}->prepare($sql) || return 'Failed to prepare SQL for deletes: '.$self->{DBH}->errstr;
		$sth->execute || return 'Failed to execute SQL for deletes: '.$self->{DBH}->errstr;
|
|
my @row;
|
|
while (@row =3D $sth->fetchrow_array) {
|
|
$self->writeclient('delete',@row);
|
|
}
|
|
|
|
$sth->finish;
|
|
}
|
|
|
|
if ($sync) {
|
|
$self->writeclient('end synchronize', $pub);
|
|
} else {
|
|
$self->writeclient('end refresh', $pub);
|
|
}
|
|
|
|
	# $qpub, $quser and $qnode were already quoted at the top of this sub
|
|
|
|
	$self->{DBH}->do("update ____subscribed____ set last_ver = $self->{max_ver}, last_session = now(), post_ver = $self->{this_post_ver} where username = $quser and nodename = $qnode and pubname = $qpub");
|
|
return undef;
|
|
}
|
|
|
|
|
|
# Call this once when everything else is done.  Does housekeeping.
|
|
# (MAKE THIS AN OBJECT DESTRUCTOR?)
|
|
sub DESTROY {
|
|
my $self =3D shift;
|
|
|
|
#release version from lock table (including old ones)
|
|
my $quser =3D $self->{DBH}->quote($self->{user});
|
|
my $qnode =3D $self->{DBH}->quote($self->{node});
|
|
	my $sql = "delete from ____last_stable____ where username = $quser and nodename = $qnode";
|
|
$self->{DBH}->do($sql);
|
|
|
|
#clean up deleted table
|
|
	my ($version) = $self->GetOneRow("select min(last_ver) from ____subscribed____");
	# prune only when a version is known, and never return early:
	# the database handles below must still be disconnected
	if (defined $version) {
		$self->{DBH}->do("delete from ____deleted____ where rowver < $version") || warn 'Failed to prune deleted table: '.$self->{DBH}->errstr;
	}
|
|
|
|
|
|
#disconnect from DBD sessions
|
|
$self->{DBH}->disconnect;
|
|
$self->{DBLOG}->disconnect;
|
|
return undef;
|
|
}
|
|
|
|
############# Helper Subs ############
|
|
sub GetColList {
|
|
my $self =3D shift;
|
|
my $sql =3D shift || die 'Must provide sql select statement';
|
|
my $sth =3D $self->{DBH}->prepare($sql) || return undef;
|
|
$sth->execute || return undef;
|
|
my $val;
|
|
my @col;
|
|
while (($val) =3D $sth->fetchrow_array) {
|
|
push @col, $val;
|
|
}
|
|
$sth->finish;
|
|
return @col;
|
|
}
|
|
|
|
sub GetOneRow {
|
|
my $self =3D shift;
|
|
my $sql =3D shift || die 'Must provide sql select statement';
|
|
my $sth =3D $self->{DBH}->prepare($sql) || return undef;
|
|
$sth->execute || return undef;
|
|
my @row =3D $sth->fetchrow_array;
|
|
$sth->finish;
|
|
return @row;
|
|
}
|
|
|
|
package SyncManager;
|
|
|
|
use DBI;
|
|
# new requires 3 arguments: dbi connection string, plus the corresponding username and password
|
|
|
|
sub new {
|
|
my $proto =3D shift;
|
|
my $class =3D ref($proto) || $proto;
|
|
my $self =3D {};
|
|
|
|
my $dbi =3D shift;
|
|
my $user =3D shift;
|
|
my $pass =3D shift;
|
|
|
|
	$self->{DBH} = DBI->connect($dbi,$user,$pass) || die "Failed to connect to database: ".DBI->errstr();

	$self->{DBLOG} = DBI->connect($dbi,$user,$pass) || die "cannot log to DB: ".DBI->errstr();

|
|
return bless ($self, $class);
|
|
}
|
|
|
|
sub dblog {
|
|
my $self =3D shift;
|
|
my $msg =3D $self->{DBLOG}->quote($_[0]);
|
|
my $quser =3D $self->{DBH}->quote($self->{user});
|
|
my $qnode =3D $self->{DBH}->quote($self->{node});
|
|
	$self->{DBLOG}->do("insert into ____sync_log____ (username, nodename,stamp, message) values($quser, $qnode, now(), $msg)");
|
|
}
|
|
|
|
#this should never need to be called, but it might if a node bails without releasing their locks
|
|
sub ReleaseAllLocks {
|
|
my $self =3D shift;
|
|
	$self->{DBH}->do("delete from ____last_stable____"); # stray ')' removed from the SQL
|
|
}
|
|
# Adds a publication to the system.  Also adds triggers, sequences, etc. associated with the table if appropriate.
# accepts two arguments: the name of a physical table and the name under which to publish it
#   NOTE: the publication name is optional and will default to the table name if not supplied
|
|
# returns undef if ok, else error string;
|
|
sub publish {
|
|
my $self =3D shift;
|
|
	my $table = shift || die 'You must provide a table name (and optionally a unique publication name)';
|
|
my $pub =3D shift;
|
|
$pub =3D $table if not defined($pub);
|
|
|
|
my $qpub =3D $self->{DBH}->quote($pub);
|
|
	my $sql = "select tablename from ____publications____ where pubname = $qpub";
|
|
my ($junk) =3D $self->GetOneRow($sql);
|
|
return 'Publication already exists' if defined($junk);
|
|
|
|
my $qtable =3D $self->{DBH}->quote($table);
|
|
|
|
	$sql = "select table_id, refcount from ____tables____ where tablename = $qtable";
|
|
my ($id, $refcount) =3D $self->GetOneRow($sql);
|
|
|
|
if(!defined($id)) {
|
|
		$self->{DBH}->do("insert into ____tables____ (tablename, refcount) values ($qtable,1)") || return 'Failed to register table: ' . $self->{DBH}->errstr;
		my $sql = "select table_id from ____tables____ where tablename = $qtable";
		($id) = $self->GetOneRow($sql);
|
|
}
|
|
|
|
if (defined($refcount)) {
|
|
		$self->{DBH}->do("update ____tables____ set refcount = refcount+1 where table_id = $id") || return 'Failed to update reference count: ' . $self->{DBH}->errstr;
|
|
} else {
|
|
=09=09
|
|
		$id = '_'.$id.'_';
|
|
|
|
		my @cols = $self->GetTableCols($table, 1); # 1 = get hidden cols too
|
|
my %skip;
|
|
foreach my $col (@cols) {
|
|
$skip{$col} =3D 1;
|
|
}
|
|
=09=09
|
|
		if (!$skip{____rowver____}) {
			$self->{DBH}->do("alter table $table add column ____rowver____ int4"); #don't fail here in case table is being republished, just accept the error silently
		}
		$self->{DBH}->do("update $table set ____rowver____ = ____version_seq____.last_value - 1") || return 'Failed to initialize rowver: ' . $self->{DBH}->errstr;

		if (!$skip{____rowid____}) {
			$self->{DBH}->do("alter table $table add column ____rowid____ int4"); #don't fail here in case table is being republished, just accept the error silently
		}
|
|
|
|
		my $index = $id.'____rowid____idx';
		$self->{DBH}->do("create index $index on $table(____rowid____)") || return 'Failed to create rowid index: ' . $self->{DBH}->errstr;

		my $sequence = $id.'_rowid_seq';
		$self->{DBH}->do("create sequence $sequence") || return 'Failed to create rowid sequence: ' . $self->{DBH}->errstr;

		$self->{DBH}->do("alter table $table alter column ____rowid____ set default nextval('$sequence')"); #don't fail here in case table is being republished, just accept the error silently

		$self->{DBH}->do("update $table set ____rowid____ = nextval('$sequence')") || return 'Failed to initialize rowid: ' . $self->{DBH}->errstr;
|
|
|
|
		if (!$skip{____stamp____}) {
			$self->{DBH}->do("alter table $table add column ____stamp____ timestamp"); #don't fail here in case table is being republished, just accept the error silently
		}

		$self->{DBH}->do("update $table set ____stamp____ = now()") || return 'Failed to initialize stamp: ' . $self->{DBH}->errstr;
|
|
|
|
		# declare $trigger once and reuse it for the three triggers
		my $trigger = $id.'_ver_ins';
		$self->{DBH}->do("create trigger $trigger before insert on $table for each row execute procedure sync_insert_ver()") || return 'Failed to create trigger: ' . $self->{DBH}->errstr;

		$trigger = $id.'_ver_upd';
		$self->{DBH}->do("create trigger $trigger before update on $table for each row execute procedure sync_update_ver()") || return 'Failed to create trigger: ' . $self->{DBH}->errstr;

		$trigger = $id.'_del_row';
		$self->{DBH}->do("create trigger $trigger after delete on $table for each row execute procedure sync_delete_row()") || return 'Failed to create trigger: ' . $self->{DBH}->errstr;
|
|
}
|
|
|
|
	# use the DBI-quoted values rather than interpolating raw names inside quotes
	$self->{DBH}->do("insert into ____publications____ (pubname, tablename) values ($qpub,$qtable)") || return 'Failed to create publication entry: '.$self->{DBH}->errstr;
|
|
|
|
return undef;
|
|
}
|
|
|
|
|
|
# Removes a publication from the system.  Also drops triggers, sequences, etc. associated with the table if appropriate.
|
|
# accepts one argument: the name of a publication
|
|
# returns undef if ok, else error string;
|
|
sub unpublish {
|
|
my $self =3D shift;
|
|
my $pub =3D shift || return 'You must provide a publication name';
|
|
my $qpub =3D $self->{DBH}->quote($pub);
|
|
	my $sql = "select tablename from ____publications____ where pubname = $qpub";
|
|
my ($table) =3D $self->GetOneRow($sql);
|
|
return 'Publication does not exist' if !defined($table);
|
|
|
|
my $qtable =3D $self->{DBH}->quote($table);
|
|
|
|
	$sql = "select table_id, refcount from ____tables____ where tablename = $qtable";
	my ($id, $refcount) = $self->GetOneRow($sql);
	# double quotes so that $table is interpolated into the message
	return "Table: $table is not correctly registered!" if not defined($id);
|
|
|
|
	$self->{DBH}->do("update ____tables____ set refcount = refcount -1 where tablename = $qtable") || return 'Failed to decrement reference count: ' . $self->{DBH}->errstr;

	$self->{DBH}->do("delete from ____subscribed____ where pubname = $qpub") || return 'Failed to delete user subscriptions: ' . $self->{DBH}->errstr;
	$self->{DBH}->do("delete from ____subscribed_cols____ where pubname = $qpub") || return 'Failed to delete subscribed columns: ' . $self->{DBH}->errstr;
	$self->{DBH}->do("delete from ____publications____ where tablename = $qtable and pubname = $qpub") || return 'Failed to delete from publications: ' . $self->{DBH}->errstr;
|
|
|
|
#if this is the last reference, we want to drop triggers, etc;
|
|
	if ($refcount <= 1) {
		$id = "_".$id."_";

		$self->{DBH}->do("alter table $table alter column ____rowver____ drop default") || return 'Failed to alter column default: ' . $self->{DBH}->errstr;
		$self->{DBH}->do("alter table $table alter column ____rowid____ drop default") || return 'Failed to alter column default: ' . $self->{DBH}->errstr;
		$self->{DBH}->do("alter table $table alter column ____stamp____ drop default") || return 'Failed to alter column default: ' . $self->{DBH}->errstr;

		# declare $trigger once and reuse it for the three triggers
		my $trigger = $id.'_ver_upd';
		$self->{DBH}->do("drop trigger $trigger on $table") || return 'Failed to drop trigger: ' . $self->{DBH}->errstr;

		$trigger = $id.'_ver_ins';
		$self->{DBH}->do("drop trigger $trigger on $table") || return 'Failed to drop trigger: ' . $self->{DBH}->errstr;

		$trigger = $id.'_del_row';
		$self->{DBH}->do("drop trigger $trigger on $table") || return 'Failed to drop trigger: ' . $self->{DBH}->errstr;

		my $sequence = $id.'_rowid_seq';
		$self->{DBH}->do("drop sequence $sequence") || return 'Failed to drop sequence: ' . $self->{DBH}->errstr;

		my $index = $id.'____rowid____idx';
		$self->{DBH}->do("drop index $index") || return 'Failed to drop index: ' . $self->{DBH}->errstr;
		$self->{DBH}->do("delete from ____tables____ where tablename = $qtable") || return 'Failed to remove entry from tables: ' . $self->{DBH}->errstr;
	}
|
|
return undef;
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
#Subscribe user/node to a publication
|
|
# Accepts 3 arguments: Username, Nodename, Publication
#   NOTE: the remaining arguments can be supplied as column names to which the user/node should be subscribed
|
|
# Return undef if ok, else returns an error string
|
|
|
|
sub subscribe {
|
|
my $self =3D shift;
|
|
	my $user = shift || die 'You must provide user, node and publication as arguments';
	my $node = shift || die 'You must provide user, node and publication as arguments';
	my $pub = shift || die 'You must provide user, node and publication as arguments';
|
|
my @cols =3D @_;
|
|
|
|
my $quser =3D $self->{DBH}->quote($user);
|
|
my $qnode =3D $self->{DBH}->quote($node);
|
|
my $qpub =3D $self->{DBH}->quote($pub);
|
|
|
|
	my $sql = "select tablename from ____publications____ where pubname = $qpub";
|
|
my ($table) =3D $self->GetOneRow($sql);
|
|
return "Publication $pub does not exist." if not defined $table;
|
|
my $qtable =3D $self->{DBH}->quote($table);
|
|
|
|
	@cols = $self->GetTableCols($table) if !@cols; # get defaults if cols were not specified by caller
|
|
|
|
	# use the DBI-quoted values rather than interpolating raw names inside quotes
	$self->{DBH}->do("insert into ____subscribed____ (username, nodename,pubname,last_ver,refreshonce) values($quser, $qnode, $qpub, 0, true)") || return 'Failed to create subscription: ' . $self->{DBH}->errstr;

	foreach my $col (@cols) {
		my $qcol = $self->{DBH}->quote($col);
		$self->{DBH}->do("insert into ____subscribed_cols____ (username, nodename, pubname, col_name) values ($quser, $qnode, $qpub, $qcol)") || return 'Failed to subscribe column: ' . $self->{DBH}->errstr;
	}
|
|
|
|
return undef;
|
|
}
|
|
|
|
|
|
#Unsubscribe user/node from a publication
# Accepts 3 arguments: Username, Nodename, Publication
|
|
# Return undef if ok, else returns an error string
|
|
|
|
sub unsubscribe {
|
|
my $self =3D shift;
|
|
	my $user = shift || die 'You must provide user, node and publication as arguments';
	my $node = shift || die 'You must provide user, node and publication as arguments';
	my $pub = shift || die 'You must provide user, node and publication as arguments';
|
|
my @cols =3D @_;
|
|
|
|
my $quser =3D $self->{DBH}->quote($user);
|
|
my $qnode =3D $self->{DBH}->quote($node);
|
|
my $qpub =3D $self->{DBH}->quote($pub);
|
|
|
|
	my $sql = "select tablename from ____publications____ where pubname = $qpub";
	# list context is needed here: in scalar context GetOneRow returns a
	# column count, which is always defined
	my ($table) = $self->GetOneRow($sql);
|
|
return "Publication $pub does not exist." if not defined $table;
|
|
|
|
	$self->{DBH}->do("delete from ____subscribed_cols____ where pubname = $qpub and username = $quser and nodename = $qnode") || return 'Failed to remove column subscription: '. $self->{DBH}->errstr;
	$self->{DBH}->do("delete from ____subscribed____ where pubname = $qpub and username = $quser and nodename = $qnode") || return 'Failed to remove subscription: '. $self->{DBH}->errstr;
|
|
|
|
|
|
return undef;
|
|
}
|
|
|
|
|
|
|
|
#INSTALL creates the necessary management tables.
#returns undef if everything is ok, else returns a string describing the error;
|
|
sub INSTALL {
|
|
my $self =3D shift;
|
|
|
|
#check to see if management tables are already installed
|
|
|
|
my ($test) = $self->GetOneRow("select * from pg_class where relname = '____publications____'");
if (defined($test)) {
	return 'It appears that synchronization management tables are already installed here.  Please uninstall before reinstalling.';
};
|
|
|
|
|
|
|
|
#install the management tables, etc.
|
|
|
|
$self->{DBH}->do("create table ____publications____ (pubname text primary key,description text, tablename text, sync_order int4, whereclause text)") || return $self->{DBH}->errstr();

$self->{DBH}->do("create table ____subscribed_cols____ (nodename text, username text, pubname text, col_name text, description text, primary key(nodename, username, pubname,col_name))") || return $self->{DBH}->errstr();

$self->{DBH}->do("create table ____subscribed____ (nodename text, username text, pubname text, last_session timestamp, post_ver int4, last_ver int4, whereclause text, sanity_limit int4 default 0, sanity_delete int4 default 0, sanity_update int4 default 0, sanity_insert int4 default 50, readonly boolean, disabled boolean, fullrefreshonly boolean, refreshonce boolean, primary key(nodename, username, pubname))") || return $self->{DBH}->errstr();

$self->{DBH}->do("create table ____last_stable____ (version int4, username text, nodename text, primary key(version, username, nodename))") || return $self->{DBH}->errstr();

$self->{DBH}->do("create table ____tables____ (tablename text, table_id int4, refcount int4, primary key(tablename, table_id))") || return $self->{DBH}->errstr();

$self->{DBH}->do("create sequence ____table_id_seq____") || return $self->{DBH}->errstr();

$self->{DBH}->do("alter table ____tables____ alter column table_id set default nextval('____table_id_seq____')") || return $self->{DBH}->errstr();

$self->{DBH}->do("create table ____deleted____ (rowid int4, tablename text, rowver int4, stamp timestamp, primary key (rowid, tablename))") || return $self->{DBH}->errstr();

$self->{DBH}->do("create table ____collision____ (rowid text, tablename text, rowver int4, stamp timestamp, faildate timestamp default now(),data text,reason text, action text, username text, nodename text,queue text)") || return $self->{DBH}->errstr();

$self->{DBH}->do("create sequence ____version_seq____") || return $self->{DBH}->errstr();

$self->{DBH}->do("create table ____sync_log____ (username text, nodename text, stamp timestamp, message text)") || return $self->{DBH}->errstr();
|
|
|
|
$self->{DBH}->do("create function sync_insert_ver() returns opaque as
|
|
'begin
|
|
if new.____rowver____ isnull then
|
|
new.____rowver____ :=3D ____version_seq____.last_value;
|
|
end if;
|
|
if new.____stamp____ isnull then
|
|
new.____stamp____ :=3D now();
|
|
end if;
|
|
return NEW;
|
|
end;' language 'plpgsql'") || return $self->{DBH}->errstr();
|
|
|
|
$self->{DBH}->do("create function sync_update_ver() returns opaque as
|
|
'begin
|
|
if new.____rowver____ =3D old.____rowver____ then
|
|
new.____rowver____ :=3D ____version_seq____.last_value;
|
|
end if;
|
|
if new.____stamp____ =3D old.____stamp____ then
|
|
new.____stamp____ :=3D now();
|
|
end if;
|
|
return NEW;
|
|
end;' language 'plpgsql'") || return $self->{DBH}->errstr();
|
|
|
|
|
|
$self->{DBH}->do("create function sync_delete_row() returns opaque as=20
|
|
'begin=20
|
|
insert into ____deleted____ (rowid,tablename,rowver,stamp) values
|
|
(old.____rowid____, TG_RELNAME, old.____rowver____,old.____stamp____);=20
|
|
return old;=20
|
|
end;' language 'plpgsql'") || return $self->{DBH}->errstr();
|
|
|
|
return undef;
|
|
}
|
|
|
|
#removes all management tables & related stuff
#returns undef if ok, else returns an error message as a string
sub UNINSTALL {
    my $self = shift;

    #Make sure all tables are unpublished first
    my $sth = $self->{DBH}->prepare("select pubname from ____publications____");
    $sth->execute;
    my $pub;
    while (($pub) = $sth->fetchrow_array) {
        $self->unpublish($pub);
    }
    $sth->finish;

    $self->{DBH}->do("drop table ____publications____") || return $self->{DBH}->errstr();
    $self->{DBH}->do("drop table ____subscribed_cols____") || return $self->{DBH}->errstr();
    $self->{DBH}->do("drop table ____subscribed____") || return $self->{DBH}->errstr();
    $self->{DBH}->do("drop table ____last_stable____") || return $self->{DBH}->errstr();
    $self->{DBH}->do("drop table ____deleted____") || return $self->{DBH}->errstr();
    $self->{DBH}->do("drop table ____collision____") || return $self->{DBH}->errstr();
    $self->{DBH}->do("drop table ____tables____") || return $self->{DBH}->errstr();
    $self->{DBH}->do("drop table ____sync_log____") || return $self->{DBH}->errstr();

    $self->{DBH}->do("drop sequence ____table_id_seq____") || return $self->{DBH}->errstr();
    $self->{DBH}->do("drop sequence ____version_seq____") || return $self->{DBH}->errstr();

    $self->{DBH}->do("drop function sync_insert_ver()") || return $self->{DBH}->errstr();
    $self->{DBH}->do("drop function sync_update_ver()") || return $self->{DBH}->errstr();
    $self->{DBH}->do("drop function sync_delete_row()") || return $self->{DBH}->errstr();

    return undef;

}
|
|
|
|
sub DESTROY {
    my $self = shift;

    $self->{DBH}->disconnect;
    $self->{DBLOG}->disconnect;
    return undef;
}
|
|
|
|
############# Helper Subs ############
|
|
|
|
sub GetOneRow {
    my $self = shift;
    my $sql = shift || die 'Must provide sql select statement';
    my $sth = $self->{DBH}->prepare($sql) || return undef;
    $sth->execute || return undef;
    my @row = $sth->fetchrow_array;
    $sth->finish;
    return @row;
}
|
|
|
|
#call this with second non-zero value to get hidden columns
sub GetTableCols {
    my $self = shift;
    my $table = shift || die 'Must provide table name';
    my $wanthidden = shift;
    my $sql = "select * from $table where 0 = 1";
    my $sth = $self->{DBH}->prepare($sql) || return undef;
    $sth->execute || return undef;
    my @row = @{$sth->{NAME}};
    $sth->finish;
    return @row if $wanthidden;
    my @cols;
    foreach my $col (@row) {
        next if $col eq '____rowver____';
        next if $col eq '____stamp____';
        next if $col eq '____rowid____';
        push @cols, $col;
    }
    return @cols;
}
|
|
|
|
|
|
1; #happy require
|
|
|
|
------=_NextPart_000_0062_01C0541E.125CAF30--
|
|
|
|
|
|
From pgsql-hackers-owner+M9917@postgresql.org Mon Jun 11 15:53:25 2001
|
|
Return-path: <pgsql-hackers-owner+M9917@postgresql.org>
|
|
Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
|
|
by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5BJrPL01206
|
|
for <pgman@candle.pha.pa.us>; Mon, 11 Jun 2001 15:53:25 -0400 (EDT)
|
|
Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
|
|
by postgresql.org (8.11.3/8.11.1) with SMTP id f5BJrPE67753;
|
|
Mon, 11 Jun 2001 15:53:25 -0400 (EDT)
|
|
(envelope-from pgsql-hackers-owner+M9917@postgresql.org)
|
|
Received: from mail.greatbridge.com (mail.greatbridge.com [65.196.68.36])
|
|
by postgresql.org (8.11.3/8.11.1) with ESMTP id f5BJmLE65620
|
|
for <pgsql-hackers@postgresql.org>; Mon, 11 Jun 2001 15:48:21 -0400 (EDT)
|
|
(envelope-from djohnson@greatbridge.com)
|
|
Received: from j2.us.greatbridge.com (djohnsonpc.us.greatbridge.com [65.196.69.70])
|
|
by mail.greatbridge.com (8.11.2/8.11.2) with SMTP id f5BJm2Q28847
|
|
for <pgsql-hackers@postgresql.org>; Mon, 11 Jun 2001 15:48:02 -0400
|
|
From: Darren Johnson <djohnson@greatbridge.com>
|
|
Date: Mon, 11 Jun 2001 19:46:44 GMT
|
|
Message-ID: <20010611.19464400@j2.us.greatbridge.com>
|
|
Subject: [HACKERS] Postgres Replication
|
|
To: pgsql-hackers@postgresql.org
|
|
Reply-To: Darren Johnson <djohnson@greatbridge.com>
|
|
X-Mailer: Mozilla/3.0 (compatible; StarOffice/5.2;Linux)
|
|
X-Priority: 3 (Normal)
|
|
MIME-Version: 1.0
|
|
Content-Type: text/plain; charset=ISO-8859-1
|
|
Content-Transfer-Encoding: 8bit
|
|
X-MIME-Autoconverted: from quoted-printable to 8bit by postgresql.org id f5BJmLE65621
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
We have been researching replication for several months now, and
|
|
I have some opinions to share with the community for feedback,
|
|
discussion, and/or participation. Our goal is to get a replication
|
|
solution for PostgreSQL that will meet most needs of users
|
|
and applications alike (mission impossible theme here :).
|
|
|
|
My research work, along with that of other contributors, has been collected
|
|
and presented here http://www.greatbridge.org/genpage?replication_top
|
|
If there is something missing, especially PostgreSQL related
|
|
work, I would like to know about it, and my apologies to
anyone who got left off the list. This work is ongoing and doesn't
draw a conclusion, which IMHO should be left up to the user,
but I'm offering my opinions to spur discussion and/or feedback
from this list, and I will try not to offend anyone.
|
|
|
|
Here's my opinion: of the approaches we've surveyed, the most
|
|
promising one is the Postgres-R project from the Information and
|
|
Communication Systems Group, ETH in Zurich, Switzerland, originally
|
|
produced by Bettina Kemme, Gustavo Alonso, and others. Although
|
|
Postgres-R is a synchronous approach, I believe it is the closest to
|
|
the goal mentioned above. Here is an abstract of the advantages.
|
|
|
|
1) Postgres-R is built on the PostgreSQL-6.4.2 code base. The replication
functionality is an optional parameter, so there will be insignificant
overhead for non-replication situations. The replication and communication
managers are the two new modules added to the PostgreSQL code base.
|
|
|
|
2) The replication manager's main function is controlling the
|
|
replication protocol via a message handling process. It receives
|
|
messages from the local and remote backends and forwards write
|
|
sets and decision messages via the communication manager to the
|
|
other servers. The replication manager controls all the transactions
|
|
running on the local server by keeping track of the states, including
|
|
which protocol phase (read, send, lock, or write) the transaction is
|
|
in. The replication manager maintains a two way channel
|
|
implemented as buffered sockets to each backend.
|
|
|
|
3) The main task of the communication manager is to provide a simple
socket-based interface between the replication manager and the
group communication system (currently Ensemble). The
communication system is a cluster of servers connected via
the communication manager. The replication manager also maintains
three one-way channels to the communication system: a broadcast
channel to send messages, a total-order channel to receive
totally ordered write sets, and a no-order channel to listen for
decision messages from the communication system. Decision
messages can be received at any time, whereas the reception of
totally ordered write sets can be blocked in certain phases.
|
|
|
|
4) Based on a two-phase locking approach, all deadlock situations
are local, can be detected by the Postgres-R code base, and are aborted.
|
|
|
|
5) The write set messages used to send database changes to other
servers can use either the SQL statements or the actual tuples
changed. The choice is a parameter based on the number of tuples
changed by a transaction. While sending the tuple changes reduces
overhead in query parse, plan and execution, there is a negative
effect in sending a large write set across the network.
|
|
|
|
6) Postgres-R uses a synchronous approach that keeps the data on
|
|
all sites consistent and provides serializability. The user does not
|
|
have to bother with conflict resolution, and gets the same
correctness and consistency as a centralized system.
|
|
|
|
7) Postgres-R could be part of a good fault-resilient and load distribution
solution. It is peer-to-peer based and incurs low overhead propagating
updates to the other cluster members. All replicated databases locally
process queries.
|
|
|
|
8) Compared to other synchronous replication strategies (e.g., standard
|
|
distributed 2-phase-locking + 2-phase-commit), Postgres-R has much
|
|
better performance using 2-phase-locking.
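Point 5 above could be sketched roughly as follows. This is only an illustration of choosing a write-set encoding by transaction size, not Postgres-R code: the names `WriteSet`, `build_write_set` and `TUPLE_THRESHOLD`, the threshold value, and the direction of the cut-off (tuples for small transactions, SQL for large ones) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical cut-off; the real parameter name and value are not given here.
TUPLE_THRESHOLD = 100


@dataclass
class WriteSet:
    kind: str      # "tuples" or "sql"
    payload: list  # changed tuples, or the SQL statements to re-execute


def build_write_set(statements: List[str],
                    changed_tuples: List[Tuple],
                    threshold: int = TUPLE_THRESHOLD) -> WriteSet:
    """Ship changed tuples for small transactions, SQL text for large ones.

    Tuples spare the receiver the parse/plan/execute work, but a large
    tuple set is expensive to send, so beyond the threshold the (usually
    much shorter) SQL statements are sent instead.
    """
    if len(changed_tuples) <= threshold:
        return WriteSet("tuples", list(changed_tuples))
    return WriteSet("sql", list(statements))
```

A transaction that touches a handful of rows would be shipped as tuples, while a bulk UPDATE touching thousands of rows would be shipped as its single SQL statement.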
|
|
|
|
|
|
There are some issues that are not currently addressed by
|
|
Postgres-R, but some enhancements made to PostgreSQL since the
|
|
6.4.2 tree are very favorable to addressing these shortcomings.
|
|
|
|
1) WAL, added in 7.1, has the information needed for recovering
failed/off-line servers. Currently all the servers would have to be
stopped, and a copy would be used to get all the servers synchronized
before starting again.
|
|
|
|
2) Being synchronous, Postgres-R would not be a good solution
for off-line/WAN scenarios where asynchronous replication is
|
|
required. There are some theories on this issue which involve servers
|
|
connecting and disconnecting from the cluster.
|
|
|
|
3) As in any serialized synchronous approach, there is a change in the
flow of execution of a transaction. While most of these changes can
be handled by calling newly developed functions at certain points,
synchronous replica control is tightly coupled with concurrency
control. Hence, especially in PostgreSQL 7.2, some parts of the
concurrency control (MVCC) might have to be adjusted. This can lead to
slightly more complicated maintenance than a system that does not
change the backend.
|
|
|
|
4) Partial replication is not addressed.
|
|
|
|
|
|
Any feedback on this post will be appreciated.
|
|
|
|
Thanks,
|
|
|
|
Darren
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 2: you can get off all lists at once with the unregister command
|
|
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
|
|
|
|
From pgsql-hackers-owner+M9923@postgresql.org Mon Jun 11 18:14:23 2001
|
|
Return-path: <pgsql-hackers-owner+M9923@postgresql.org>
|
|
Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
|
|
by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5BMENL18644
|
|
for <pgman@candle.pha.pa.us>; Mon, 11 Jun 2001 18:14:23 -0400 (EDT)
|
|
Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
|
|
by postgresql.org (8.11.3/8.11.1) with SMTP id f5BMEQE14877;
|
|
Mon, 11 Jun 2001 18:14:26 -0400 (EDT)
|
|
(envelope-from pgsql-hackers-owner+M9923@postgresql.org)
|
|
Received: from spoetnik.xs4all.nl (spoetnik.xs4all.nl [194.109.249.226])
|
|
by postgresql.org (8.11.3/8.11.1) with ESMTP id f5BM6ME12270
|
|
for <pgsql-hackers@postgresql.org>; Mon, 11 Jun 2001 18:06:23 -0400 (EDT)
|
|
(envelope-from reinoud@xs4all.nl)
|
|
Received: from KAYAK (kayak [192.168.1.20])
|
|
by spoetnik.xs4all.nl (Postfix) with SMTP id 865A33E1B
|
|
for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 00:06:16 +0200 (CEST)
|
|
From: reinoud@xs4all.nl (Reinoud van Leeuwen)
|
|
To: pgsql-hackers@postgresql.org
|
|
Subject: Re: [HACKERS] Postgres Replication
|
|
Date: Mon, 11 Jun 2001 22:06:07 GMT
|
|
Organization: Not organized in any way
|
|
Reply-To: reinoud@xs4all.nl
|
|
Message-ID: <3b403d96.562404297@192.168.1.10>
|
|
References: <20010611.19464400@j2.us.greatbridge.com>
|
|
In-Reply-To: <20010611.19464400@j2.us.greatbridge.com>
|
|
X-Mailer: Forte Agent 1.5/32.451
|
|
MIME-Version: 1.0
|
|
Content-Type: text/plain; charset=us-ascii
|
|
Content-Transfer-Encoding: 8bit
|
|
X-MIME-Autoconverted: from quoted-printable to 8bit by postgresql.org id f5BM6PE12276
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
On Mon, 11 Jun 2001 19:46:44 GMT, you wrote:
|
|
|
|
>We have been researching replication for several months now, and
|
|
>I have some opinions to share to the community for feedback,
|
|
>discussion, and/or participation. Our goal is to get a replication
|
|
>solution for PostgreSQL that will meet most needs of users
|
|
>and applications alike (mission impossible theme here :).
|
|
>
|
|
>My research work along with others contributors has been collected
|
|
>and presented here http://www.greatbridge.org/genpage?replication_top
|
|
>If there is something missing, especially PostgreSQL related
|
|
>work, I would like to know about it, and my apologies to any
|
|
>one who got left off the list. This work is ongoing and doesn't
|
|
>draw a conclusion, which IMHO should be left up to the user,
|
|
>but I'm offering my opinions to spur discussion and/or feed back
|
|
>from this list, and try not to offend any one.
|
|
>
|
|
>Here's my opinion: of the approaches we've surveyed, the most
|
|
>promising one is the Postgres-R project from the Information and
|
|
>Communication Systems Group, ETH in Zurich, Switzerland, originally
|
|
>produced by Bettina Kemme, Gustavo Alonso, and others. Although
|
|
>Postgres-R is a synchronous approach, I believe it is the closest to
|
|
>the goal mentioned above. Here is an abstract of the advantages.
|
|
>
|
|
>1) Postgres-R is built on the PostgreSQL-6.4.2 code base. The
|
|
>replication
|
|
>functionality is an optional parameter, so there will be insignificant
|
|
>overhead for non replication situations. The replication and
|
|
>communication
|
|
>managers are the two new modules added to the PostgreSQL code base.
|
|
>
|
|
>2) The replication manager's main function is controlling the
|
|
>replication protocol via a message handling process. It receives
|
|
>messages from the local and remote backends and forwards write
|
|
>sets and decision messages via the communication manager to the
|
|
>other servers. The replication manager controls all the transactions
|
|
>running on the local server by keeping track of the states, including
|
|
>which protocol phase (read, send, lock, or write) the transaction is
|
|
>in. The replication manager maintains a two way channel
|
|
>implemented as buffered sockets to each backend.
|
|
|
|
what does "manager controls all the transactions" mean? I hope it does
|
|
*not* mean that a bug in the manager would cause transactions not to
|
|
commit...
|
|
|
|
>
|
|
>3) The main task of the communication manager is to provide simple
|
|
>socket based interface between the replication manager and the
|
|
>group communication system (currently Ensemble). The
|
|
>communication system is a cluster of servers connected via
|
|
>the communication manager. The replication manager also maintains
|
|
>three one-way channels to the communication system: a broadcast
|
|
>channel to send messages, a total-order channel to receive
|
|
>totally ordered write sets, and a no-order channel to listen for
|
|
>decision messages from the communication system. Decision
|
|
>messages can be received at any time where the reception of
|
|
>totally ordered write sets can be blocked in certain phases.
|
|
>
|
|
>4) Based on a two phase locking approach, all dead lock situations
|
|
>are local and detectable by Postgres-R code base, and aborted.
|
|
|
|
Does this imply locking over different servers? That would mean a
|
|
grinding halt when a network outage occurs...
|
|
|
|
>5) The write set messages used to send database changes to other
|
|
>servers, can use either the SQL statements or the actual tuples
|
|
>changed. This is a parameter based on number of tuples changed
|
|
>by a transaction. While sending the tuple changes reduces
|
|
>overhead in query parse, plan and execution, there is a negative
|
|
>effect in sending a large write set across the network.
|
|
>
|
|
>6) Postgres-R uses a synchronous approach that keeps the data on
|
|
>all sites consistent and provides serializability. The user does not
|
|
>have to bother with conflict resolution, and receives the same
|
|
>correctness and consistency of a centralized system.
|
|
>
|
|
>7) Postgres-R could be part of a good fault-resilient and load
|
|
>distribution
|
|
>solution. It is peer-to-peer based and incurs low overhead propagating
|
|
>updates to the other cluster members. All replicated databases locally
|
|
>process queries.
|
|
>
|
|
>8) Compared to other synchronous replication strategies (e.g., standard
|
|
>distributed 2-phase-locking + 2-phase-commit), Postgres-R has much
|
|
>better performance using 2-phase-locking.
|
|
|
|
Coming from a Sybase background I have some experience with
|
|
replication. The way it works in Sybase Replication server is as
|
|
follows:
|
|
- for each replicated database, there is a "log reader" process that
|
|
reads the WAL and captures only *committed transactions* to the
|
|
replication server. (it does not make much sense to replicate other
|
|
things IMHO :-).
|
|
- the replication server stores incoming data in a queue ("stable
|
|
device"), until it is sure it has reached its final destination
|
|
|
|
- a replication server can send data to another replication server in
|
|
a compact (read: WAN friendly) way. A chain of replication servers can
be made, depending on network architecture.
|
|
|
|
- the final replication server makes an almost standard client
|
|
connection to the target database and translates the compact
|
|
transactions back to SQL statements. By using masks, extra
|
|
functionality can be built in.
|
|
|
|
This kind of architecture has several advantages:
|
|
- only committed transactions are replicated which saves overhead
|
|
- it does not have very much impact on performance of the source
|
|
server (apart from reading the WAL)
|
|
- since every replication server has a stable device, data is stored
|
|
when the network is down and nothing gets lost (nor stops performing)
|
|
- because only the log reader and the connection from the final
|
|
replication server are RDBMS specific, it is possible to replicate
|
|
from MS to Oracle using a Sybase replication server (or different
|
|
versions etc).
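The store-and-forward behaviour of the "stable device" described above can be sketched as below. This is a minimal illustration, not Sybase's implementation: the class `StableQueue` and the `send` callback are hypothetical names, and the queue lives in memory here, whereas a real stable device would be persisted to disk.

```python
from collections import deque
from typing import Callable, Deque


class StableQueue:
    """Holds committed transactions until the next hop acknowledges them."""

    def __init__(self, send: Callable[[str], bool]) -> None:
        # send() is assumed to return True only once the destination
        # has acknowledged the transaction.
        self._send = send
        self._queue: Deque[str] = deque()

    def enqueue(self, txn: str) -> None:
        """Accept a committed transaction from the log reader."""
        self._queue.append(txn)

    def flush(self) -> int:
        """Forward queued transactions in order; stop at the first failure."""
        delivered = 0
        while self._queue:
            if not self._send(self._queue[0]):
                break  # link down: keep the transaction and retry later
            self._queue.popleft()
            delivered += 1
        return delivered

    def __len__(self) -> int:
        return len(self._queue)
```

Because a transaction is removed only after it has been acknowledged, a network outage delays delivery but does not lose data, and the source server keeps running in the meantime.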
|
|
|
|
I do not know how much of this is patented or copyrighted, but the
|
|
architecture seems elegant and robust to me. I have done
|
|
implementations of bi-directional replication too. It *is* possible
|
|
but does require some funky setup and maintenance. (but it is better
than letting offices on different continents work on the same
database :-)
|
|
|
|
just my 2 EURO cts :-)
|
|
|
|
|
|
--
|
|
__________________________________________________
|
|
"Nothing is as subjective as reality"
|
|
Reinoud van Leeuwen reinoud@xs4all.nl
|
|
http://www.xs4all.nl/~reinoud
|
|
__________________________________________________
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
|
|
|
|
From pgsql-hackers-owner+M9924@postgresql.org Mon Jun 11 18:41:51 2001
|
|
Return-path: <pgsql-hackers-owner+M9924@postgresql.org>
|
|
Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
|
|
by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5BMfpL28917
|
|
for <pgman@candle.pha.pa.us>; Mon, 11 Jun 2001 18:41:51 -0400 (EDT)
|
|
Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
|
|
by postgresql.org (8.11.3/8.11.1) with SMTP id f5BMfsE25092;
|
|
Mon, 11 Jun 2001 18:41:54 -0400 (EDT)
|
|
(envelope-from pgsql-hackers-owner+M9924@postgresql.org)
|
|
Received: from spider.pilosoft.com (p55-222.acedsl.com [160.79.55.222])
|
|
by postgresql.org (8.11.3/8.11.1) with ESMTP id f5BMalE23024
|
|
for <pgsql-hackers@postgresql.org>; Mon, 11 Jun 2001 18:36:47 -0400 (EDT)
|
|
(envelope-from alex@pilosoft.com)
|
|
Received: from localhost (alexmail@localhost)
|
|
by spider.pilosoft.com (8.9.3/8.9.3) with ESMTP id SAA06092;
|
|
Mon, 11 Jun 2001 18:46:05 -0400 (EDT)
|
|
Date: Mon, 11 Jun 2001 18:46:05 -0400 (EDT)
|
|
From: Alex Pilosov <alex@pilosoft.com>
|
|
To: Reinoud van Leeuwen <reinoud@xs4all.nl>
|
|
cc: pgsql-hackers@postgresql.org
|
|
Subject: Re: [HACKERS] Postgres Replication
|
|
In-Reply-To: <3b403d96.562404297@192.168.1.10>
|
|
Message-ID: <Pine.BSO.4.10.10106111828450.9902-100000@spider.pilosoft.com>
|
|
MIME-Version: 1.0
|
|
Content-Type: TEXT/PLAIN; charset=US-ASCII
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
On Mon, 11 Jun 2001, Reinoud van Leeuwen wrote:
|
|
|
|
> On Mon, 11 Jun 2001 19:46:44 GMT, you wrote:
|
|
|
|
> what does "manager controls all the transactions" mean? I hope it does
|
|
> *not* mean that a bug in the manager would cause transactions not to
|
|
> commit...
|
|
Well yeah it does. Bugs are a fact of life. :)
|
|
|
|
> >4) Based on a two phase locking approach, all dead lock situations
|
|
> >are local and detectable by Postgres-R code base, and aborted.
|
|
>
|
|
> Does this imply locking over different servers? That would mean a
|
|
> grinding halt when a network outage occurs...
|
|
Don't know, but see below.
|
|
|
|
> Coming from a Sybase background I have some experience with
|
|
> replication. The way it works in Sybase Replication server is as
|
|
> follows:
|
|
> - for each replicated database, there is a "log reader" process that
|
|
> reads the WAL and captures only *committed transactions* to the
|
|
> replication server. (it does not make much sense to replicate other
|
|
> things IMHO :-).
|
|
> - the replication server stores incoming data in a queue ("stable
|
|
> device"), until it is sure it has reached its final destination
|
|
>
|
|
> - a replication server can send data to another replication server in
|
|
> a compact (read: WAN friendly) way. A chain of replication servers can
|
|
> be made, depending on network architecture)
|
|
>
|
|
> - the final replication server makes an almost standard client
|
|
> connection to the target database and translates the compact
|
|
> transactions back to SQL statements. By using masks, extra
|
|
> functionality can be built in.
|
|
>
|
|
> This kind of architecture has several advantages:
|
|
> - only committed transactions are replicated which saves overhead
|
|
> - it does not have very much impact on performance of the source
|
|
> server (apart from reading the WAL)
|
|
> - since every replication server has a stable device, data is stored
|
|
> when the network is down and nothing gets lost (nor stops performing)
|
|
> - because only the log reader and the connection from the final
|
|
> replication server are RDBMS specific, it is possible to replicate
|
|
> from MS to Oracle using a Sybase replication server (or different
|
|
> versions etc).
|
|
>
|
|
> I do not know how much of this is patented or copyrighted, but the
|
|
> architecture seems elegant and robust to me. I have done
|
|
> implementations of bi-directional replication too. It *is* possible
|
|
> but does require some funky setup and maintenance. (but it is better
> than letting offices on different continents work on the same
> database :-)
|
|
Yes, the above architecture is what almost every vendor of replication
|
|
software uses. And I'm sure if you worked much with Sybase, you hate the
|
|
garbage that their repserver is :).
|
|
|
|
The architectures of postgres-r and repserver are fundamentally different
for a good reason: repserver only wants to replicate committed
transactions, while postgres-r is more of a 'clustering' solution (although
they don't use that word), and is capable of much more than a simple rep
server.
|
|
|
|
I.e., you can safely put half of your clients on a second server in a
replicated postgres-r cluster without being worried that a conflict (or a
weird locking situation) may occur.
|
|
|
|
Try that with Sybase: it is fundamentally designed for one-way
replication, and the fact that you can do one-way replication in both
directions doesn't mean it's safe to do that!
|
|
|
|
I'm not sure how postgres-r handles network problems. To be useful, a good
|
|
replication solution must have an option of "no network->no updates" as
|
|
well as "no network->queue updates and send them later". However, it is
|
|
far easier to add queuing to a correct 'eager locking' database than it is
|
|
to add proper locking to a queue-based replicator.
|
|
|
|
-alex
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 3: if posting/reading through Usenet, please send an appropriate
|
|
subscribe-nomail command to majordomo@postgresql.org so that your
|
|
message can get through to the mailing list cleanly
|
|
|
|
From pgsql-hackers-owner+M9932@postgresql.org Mon Jun 11 22:17:54 2001
|
|
Return-path: <pgsql-hackers-owner+M9932@postgresql.org>
|
|
Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
|
|
by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5C2HsL15803
|
|
for <pgman@candle.pha.pa.us>; Mon, 11 Jun 2001 22:17:54 -0400 (EDT)
|
|
Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
|
|
by postgresql.org (8.11.3/8.11.1) with SMTP id f5C2HtE86836;
|
|
Mon, 11 Jun 2001 22:17:55 -0400 (EDT)
|
|
(envelope-from pgsql-hackers-owner+M9932@postgresql.org)
|
|
Received: from femail15.sdc1.sfba.home.com (femail15.sdc1.sfba.home.com [24.0.95.142])
|
|
by postgresql.org (8.11.3/8.11.1) with ESMTP id f5C2BXE85020
|
|
for <pgsql-hackers@postgresql.org>; Mon, 11 Jun 2001 22:11:33 -0400 (EDT)
|
|
(envelope-from djohnson@greatbridge.com)
|
|
Received: from greatbridge.com ([65.2.95.27])
|
|
by femail15.sdc1.sfba.home.com
|
|
(InterMail vM.4.01.03.20 201-229-121-120-20010223) with ESMTP
|
|
id <20010612021124.OZRG17243.femail15.sdc1.sfba.home.com@greatbridge.com>;
|
|
Mon, 11 Jun 2001 19:11:24 -0700
|
|
Message-ID: <3B257969.6050405@greatbridge.com>
|
|
Date: Mon, 11 Jun 2001 22:07:37 -0400
|
|
From: Darren Johnson <djohnson@greatbridge.com>
|
|
User-Agent: Mozilla/5.0 (Windows; U; WinNT4.0; en-US; m18) Gecko/20001108 Netscape6/6.0
|
|
X-Accept-Language: en
|
|
MIME-Version: 1.0
|
|
To: Alex Pilosov <alex@pilosoft.com>, Reinoud van Leeuwen <reinoud@xs4all.nl>
|
|
cc: pgsql-hackers@postgresql.org
|
|
Subject: Re: [HACKERS] Postgres Replication
|
|
References: <Pine.BSO.4.10.10106111828450.9902-100000@spider.pilosoft.com>
|
|
Content-Type: text/plain; charset=us-ascii; format=flowed
|
|
Content-Transfer-Encoding: 7bit
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
|
|
Thanks for the feedback. I'll try to address both your issues here.
|
|
|
|
>> what does "manager controls all the transactions" mean?
|
|
>
|
|
The replication manager controls the transactions by serializing the
write set messages. This ensures all transactions are committed in the
same order on each server, so bugs here are not allowed ;-)
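As a rough illustration of what "committed in the same order on each server" means, the sketch below applies write sets strictly by sequence number, whatever order they arrive in. It is not Postgres-R code: the class `OrderedApplier` is hypothetical, and it assumes a global sequence number is attached to every write set (in Postgres-R the ordering comes from the total-order channel of the group communication system).

```python
from typing import Callable, Dict


class OrderedApplier:
    """Applies write sets strictly in sequence-number order."""

    def __init__(self, apply: Callable[[str], None]) -> None:
        self._apply = apply                 # commits one write set locally
        self._next_seq = 1                  # next sequence number to apply
        self._pending: Dict[int, str] = {}  # write sets that arrived early

    def receive(self, seq: int, write_set: str) -> None:
        """Buffer the write set, then apply everything that is now in order."""
        self._pending[seq] = write_set
        while self._next_seq in self._pending:
            self._apply(self._pending.pop(self._next_seq))
            self._next_seq += 1
```

Two servers that see the same write sets end up with the same commit history even if the messages reach them in different orders, which is why no conflict resolution is needed afterwards.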
|
|
|
|
>> I hope it does
|
|
>> *not* mean that a bug in the manager would cause transactions not to
|
|
>> commit...
|
|
>
|
|
> Well yeah it does. Bugs are a fact of life. :)
|
|
|
|
>
|
|
>>> 4) Based on a two phase locking approach, all dead lock situations
|
|
>>> are local and detectable by Postgres-R code base, and aborted.
|
|
>>
|
|
>> Does this imply locking over different servers? That would mean a
|
|
>> grinding halt when a network outage occurs...
|
|
>
|
|
> Don't know, but see below.
|
|
|
|
There is a branch of the Postgres-R code that has some failure detection
implemented, so we will have to merge this functionality with the version
of Postgres-R we have, and test this issue. I'll let you know the results.
|
|
|
|
>>
|
|
>> - the replication server stores incoming data in a queue ("stable
|
|
>> device"), until it is sure it has reached its final destination
|
|
>
|
|
I like this idea for recovering servers that have been down a short
period of time, using WAL to recover transactions missed during the
outage.
|
|
|
|
>>
|
|
>> This kind of architecture has several advantages:
|
|
>> - only committed transactions are replicated which saves overhead
|
|
>> - it does not have very much impact on performance of the source
|
|
>> server (apart from reading the WAL)
|
|
>> - since every replication server has a stable device, data is stored
|
|
>> when the network is down and nothing gets lost (nor stops performing)
|
|
>> - because only the log reader and the connection from the final
|
|
>> replication server are RDBMS specific, it is possible to replicate
|
|
>> from MS to Oracle using a Sybase replication server (or different
|
|
>> versions etc).
|
|
>
|
|
There are some issues with the "log reader" approach:
|
|
1) The databases are not synchronized until the log reader completes its
|
|
processing.
|
|
2) I'm not sure about Sybase, but the log reader sends SQL statements to
|
|
the other servers
|
|
which are then parsed, planned and executed. This over head could be
|
|
avoided if only
|
|
the tuple changes are replicated.
|
|
3) Works fine for read only situations, but peer-to-peer applications
|
|
using this approach
|
|
must be designed with a conflict resolution scheme.
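
As an illustration of point 2, here is a minimal sketch of shipping tuple changes (a "write set") instead of SQL text. It is a toy in-memory model, not Postgres-R or Sybase code: tables are plain dicts keyed by primary key, and the names run_update and apply_write_set are made up for this example. The predicate is evaluated once on the source; the replica only stores rows by key, which stands in for skipping the parse and plan steps.

```python
def run_update(table, predicate, change):
    """Run an update on the source table and return its write set:
    the (primary key, new row) pairs the update touched."""
    write_set = []
    for pk, row in list(table.items()):
        if predicate(row):
            new_row = change(row)
            table[pk] = new_row
            write_set.append((pk, new_row))
    return write_set


def apply_write_set(table, write_set):
    """Apply tuple changes on a replica by key. No predicate is
    evaluated here; the replica just stores the new row versions."""
    for pk, new_row in write_set:
        table[pk] = new_row


source = {1: {"qty": 5}, 2: {"qty": 20}}
replica = {1: {"qty": 5}, 2: {"qty": 20}}

# Roughly "UPDATE t SET qty = qty + 1 WHERE qty > 10" on the source.
ws = run_update(source, lambda r: r["qty"] > 10,
                lambda r: {"qty": r["qty"] + 1})
apply_write_set(replica, ws)
```

Only the one changed row travels to the replica, however expensive the WHERE clause was to evaluate on the source.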
|
|
|
|
Don't get me wrong, I believe we can learn from the replication
|
|
techniques used by commercial
|
|
databases like Sybase, and try to implement the good ones into
|
|
PostgreSQL. Postgres-R is
|
|
a synchronous approach which outperforms the traditional approaches to
|
|
synchronous replication.
|
|
Being based on PostgreSQL-6.4.2, getting this approach in the 7.2 tree
|
|
might be better than
|
|
reinventing the wheel.
|
|
|
|
Thanks again,
|
|
|
|
Darren
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 6: Have you searched our list archives?
|
|
|
|
http://www.postgresql.org/search.mpl
|
|
|
|
From pgsql-hackers-owner+M9936@postgresql.org Tue Jun 12 03:22:51 2001
|
|
Return-path: <pgsql-hackers-owner+M9936@postgresql.org>
|
|
Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
|
|
by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5C7MoL11061
|
|
for <pgman@candle.pha.pa.us>; Tue, 12 Jun 2001 03:22:50 -0400 (EDT)
|
|
Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
|
|
by postgresql.org (8.11.3/8.11.1) with SMTP id f5C7MPE35441;
|
|
Tue, 12 Jun 2001 03:22:25 -0400 (EDT)
|
|
(envelope-from pgsql-hackers-owner+M9936@postgresql.org)
|
|
Received: from reorxrsm.server.lan.at (zep3.it-austria.net [213.150.1.73])
|
|
by postgresql.org (8.11.3/8.11.1) with ESMTP id f5C72ZE25009
|
|
for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 03:02:36 -0400 (EDT)
|
|
(envelope-from ZeugswetterA@wien.spardat.at)
|
|
Received: from gz0153.gc.spardat.at (gz0153.gc.spardat.at [172.20.10.149])
|
|
by reorxrsm.server.lan.at (8.11.2/8.11.2) with ESMTP id f5C72Qu27966
|
|
for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 09:02:26 +0200
|
|
Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2650.21)
|
|
id <M3L15341>; Tue, 12 Jun 2001 09:02:21 +0200
|
|
Message-ID: <11C1E6749A55D411A9670001FA68796336831B@sdexcsrv1.f000.d0188.sd.spardat.at>
|
|
From: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>
|
|
To: "'Darren Johnson'" <djohnson@greatbridge.com>,
|
|
pgsql-hackers@postgresql.org
|
|
Subject: AW: [HACKERS] Postgres Replication
|
|
Date: Tue, 12 Jun 2001 09:02:20 +0200
|
|
MIME-Version: 1.0
|
|
X-Mailer: Internet Mail Service (5.5.2650.21)
|
|
Content-Type: text/plain;
|
|
charset="iso-8859-1"
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
|
|
> Although
|
|
> Postgres-R is a synchronous approach, I believe it is the closest to
|
|
> the goal mentioned above. Here is an abstract of the advantages.
|
|
|
|
If you only want synchronous replication, why not simply use triggers ?
|
|
All you would then need is remote query access and two phase commit,
|
|
and maybe a little script that helps create the appropriate triggers.
|
|
|
|
Doing a replicate all or nothing approach that only works synchronous
|
|
is imho not flexible enough.
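
The two phase commit mentioned above can be sketched as follows. This is a toy in-memory model under stated assumptions, not PostgreSQL code: the Replica class, its fail_on_prepare flag and the replicate_write function are hypothetical names, and a real trigger would reach the other servers through remote query access rather than method calls.

```python
class Replica:
    """Hypothetical stand-in for one database server."""

    def __init__(self, name, fail_on_prepare=False):
        self.name = name
        self.fail_on_prepare = fail_on_prepare
        self.rows = {}       # committed data
        self.pending = None  # change staged by prepare()

    def prepare(self, key, value):
        # Phase 1: stage the change and vote yes (True) or no (False).
        if self.fail_on_prepare:
            return False
        self.pending = (key, value)
        return True

    def commit(self):
        # Phase 2, all voted yes: make the staged change visible.
        key, value = self.pending
        self.rows[key] = value
        self.pending = None

    def rollback(self):
        # Phase 2, someone voted no: discard the staged change.
        self.pending = None


def replicate_write(replicas, key, value):
    """Apply one write to every replica, or to none of them."""
    votes = [r.prepare(key, value) for r in replicas]
    if all(votes):
        for r in replicas:
            r.commit()
        return True
    for r in replicas:
        r.rollback()
    return False
```

The sketch leaves out what makes two phase commit hard in practice: a coordinator or replica that crashes between the two phases, and the locks held while waiting for votes.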
|
|
|
|
Andreas
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 6: Have you searched our list archives?
|
|
|
|
http://www.postgresql.org/search.mpl
|
|
|
|
From pgsql-hackers-owner+M9945@postgresql.org Tue Jun 12 10:18:29 2001
|
|
Return-path: <pgsql-hackers-owner+M9945@postgresql.org>
|
|
Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
|
|
by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5CEISL06372
|
|
for <pgman@candle.pha.pa.us>; Tue, 12 Jun 2001 10:18:28 -0400 (EDT)
|
|
Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
|
|
by postgresql.org (8.11.3/8.11.1) with SMTP id f5CEIQE77517;
|
|
Tue, 12 Jun 2001 10:18:26 -0400 (EDT)
|
|
(envelope-from pgsql-hackers-owner+M9945@postgresql.org)
|
|
Received: from krypton.netropolis.org ([208.222.215.99])
|
|
by postgresql.org (8.11.3/8.11.1) with ESMTP id f5CEDuE75514
|
|
for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 10:13:56 -0400 (EDT)
|
|
(envelope-from root@generalogic.com)
|
|
Received: from [132.216.183.103] (helo=localhost)
|
|
by krypton.netropolis.org with esmtp (Exim 3.12 #1 (Debian))
|
|
id 159ouq-0003MU-00
|
|
for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 10:13:08 -0400
|
|
To: pgsql-hackers@postgresql.org
|
|
Subject: Re: AW: [HACKERS] Postgres Replication
|
|
In-Reply-To: <20010612.13321600@j2.us.greatbridge.com>
|
|
References: <Pine.BSF.4.33.0106120605130.411-100000@mobile.hub.org>
|
|
<20010612.13321600@j2.us.greatbridge.com>
|
|
X-Mailer: Mew version 1.94.2 on Emacs 20.7 / Mule 4.0 (HANANOEN)
|
|
MIME-Version: 1.0
|
|
Content-Type: Text/Plain; charset=us-ascii
|
|
Content-Transfer-Encoding: 7bit
|
|
Message-ID: <20010612123623O.root@generalogic.com>
|
|
Date: Tue, 12 Jun 2001 12:36:23 +0530
|
|
From: root <root@generalogic.com>
|
|
X-Dispatcher: imput version 20000414(IM141)
|
|
Lines: 47
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
|
|
Hello
|
|
|
|
I have hacked up a replication layer for Perl code accessing a
|
|
database through the DBI interface. It works pretty well with MySQL
|
|
(I can run pre-bender slashcode replicated, haven't tried the more
|
|
recent releases).
|
|
|
|
Potentially this hack should also work with Pg but I haven't tried
|
|
yet. If someone would like to test it out with a complex Pg app and
|
|
let me know how it went that would be cool.
|
|
|
|
The replication layer is based on Eric Newton's Recall replication
|
|
library (www.fault-tolerant.org/recall), and requires that all
|
|
database accesses be through the DBI interface.
|
|
|
|
The replicas are live, in that every operation affects all the
|
|
replicas in real time. Replica outages are invisible to the user, so
|
|
long as a majority of the replicas are functioning. Disconnected
|
|
replicas can be used for read-only access.
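
The majority rule described above can be sketched like this. It is a guess at the general idea only, not the Recall library's actual logic; has_quorum and allowed_operations are names invented for the example.

```python
def has_quorum(reachable, total):
    """True when strictly more than half of the replicas are reachable."""
    return reachable > total // 2


def allowed_operations(reachable, total):
    # With a majority: reads and writes. Without one: reads only.
    if has_quorum(reachable, total):
        return {"read", "write"}
    return {"read"}
```

With three replicas, one outage still leaves a majority of two, so updates continue. A replica that can reach only itself falls back to read-only access.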
|
|
|
|
The only code modification that should be required to use the
|
|
replication layer is to change the DSN in connect():
|
|
|
|
my $replicas = '192.168.1.1:7000,192.168.1.2:7000,192.168.1.3:7000';
|
|
my $dbh = DBI->connect("DBI:Recall:database=$replicas");
|
|
|
|
You should be able to install the replication modules with:
|
|
|
|
perl -MCPAN -eshell
|
|
cpan> install Replication::Recall::DBServer
|
|
|
|
and then install DBD::Recall (which doesn't seem to be accessible from
|
|
the CPAN shell yet, for some reason), by:
|
|
|
|
wget http://www.cpan.org/authors/id/AGUL/DBD-Recall-1.10.tar.gz
|
|
tar xzvf DBD-Recall-1.10.tar.gz
|
|
cd DBD-Recall-1.10
|
|
perl Makefile.PL
|
|
make install
|
|
|
|
I would be very interested in hearing about your experiences with
|
|
this...
|
|
|
|
Thanks
|
|
|
|
#!
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 3: if posting/reading through Usenet, please send an appropriate
|
|
subscribe-nomail command to majordomo@postgresql.org so that your
|
|
message can get through to the mailing list cleanly
|
|
|
|
From pgsql-hackers-owner+M9938@postgresql.org Tue Jun 12 05:12:54 2001
|
|
Return-path: <pgsql-hackers-owner+M9938@postgresql.org>
|
|
Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
|
|
by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5C9CrL15228
|
|
for <pgman@candle.pha.pa.us>; Tue, 12 Jun 2001 05:12:53 -0400 (EDT)
|
|
Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
|
|
by postgresql.org (8.11.3/8.11.1) with SMTP id f5C9CnE91297;
|
|
Tue, 12 Jun 2001 05:12:49 -0400 (EDT)
|
|
(envelope-from pgsql-hackers-owner+M9938@postgresql.org)
|
|
Received: from mobile.hub.org (SHW39-29.accesscable.net [24.138.39.29])
|
|
by postgresql.org (8.11.3/8.11.1) with ESMTP id f5C98DE89175
|
|
for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 05:08:13 -0400 (EDT)
|
|
(envelope-from scrappy@hub.org)
|
|
Received: from localhost (scrappy@localhost)
|
|
by mobile.hub.org (8.11.3/8.11.1) with ESMTP id f5C97f361630;
|
|
Tue, 12 Jun 2001 06:07:46 -0300 (ADT)
|
|
(envelope-from scrappy@hub.org)
|
|
X-Authentication-Warning: mobile.hub.org: scrappy owned process doing -bs
|
|
Date: Tue, 12 Jun 2001 06:07:41 -0300 (ADT)
|
|
From: The Hermit Hacker <scrappy@hub.org>
|
|
To: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>
|
|
cc: "'Darren Johnson'" <djohnson@greatbridge.com>,
|
|
<pgsql-hackers@postgresql.org>
|
|
Subject: Re: AW: [HACKERS] Postgres Replication
|
|
In-Reply-To: <11C1E6749A55D411A9670001FA68796336831B@sdexcsrv1.f000.d0188.sd.spardat.at>
|
|
Message-ID: <Pine.BSF.4.33.0106120605130.411-100000@mobile.hub.org>
|
|
MIME-Version: 1.0
|
|
Content-Type: TEXT/PLAIN; charset=US-ASCII
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
|
|
which I believe is what the rserv implementation in contrib currently does
|
|
... no?
|
|
|
|
It's funny ... what is in contrib right now was developed in a weekend by
|
|
Vadim, put in contrib, yet nobody has either used it *or* seen fit to
|
|
submit patches to improve it ... ?
|
|
|
|
On Tue, 12 Jun 2001, Zeugswetter Andreas SB wrote:
|
|
|
|
>
|
|
> > Although
|
|
> > Postgres-R is a synchronous approach, I believe it is the closest to
|
|
> > the goal mentioned above. Here is an abstract of the advantages.
|
|
>
|
|
> If you only want synchronous replication, why not simply use triggers ?
|
|
> All you would then need is remote query access and two phase commit,
|
|
> and maybe a little script that helps create the appropriate triggers.
|
|
>
|
|
> Doing a replicate all or nothing approach that only works synchronous
|
|
> is imho not flexible enough.
|
|
>
|
|
> Andreas
|
|
>
|
|
> ---------------------------(end of broadcast)---------------------------
|
|
> TIP 6: Have you searched our list archives?
|
|
>
|
|
> http://www.postgresql.org/search.mpl
|
|
>
|
|
|
|
Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
|
|
Systems Administrator @ hub.org
|
|
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
|
|
|
|
From pgsql-hackers-owner+M9940@postgresql.org Tue Jun 12 09:39:08 2001
|
|
Return-path: <pgsql-hackers-owner+M9940@postgresql.org>
|
|
Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
|
|
by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5CDd8L03200
|
|
for <pgman@candle.pha.pa.us>; Tue, 12 Jun 2001 09:39:08 -0400 (EDT)
|
|
Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
|
|
by postgresql.org (8.11.3/8.11.1) with SMTP id f5CDcmE58175;
|
|
Tue, 12 Jun 2001 09:38:48 -0400 (EDT)
|
|
(envelope-from pgsql-hackers-owner+M9940@postgresql.org)
|
|
Received: from mail.greatbridge.com (mail.greatbridge.com [65.196.68.36])
|
|
by postgresql.org (8.11.3/8.11.1) with ESMTP id f5CDYAE56164
|
|
for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 09:34:10 -0400 (EDT)
|
|
(envelope-from djohnson@greatbridge.com)
|
|
Received: from j2.us.greatbridge.com (djohnsonpc.us.greatbridge.com [65.196.69.70])
|
|
by mail.greatbridge.com (8.11.2/8.11.2) with SMTP id f5CDXeQ03585;
|
|
Tue, 12 Jun 2001 09:33:40 -0400
|
|
From: Darren Johnson <djohnson@greatbridge.com>
|
|
Date: Tue, 12 Jun 2001 13:32:16 GMT
|
|
Message-ID: <20010612.13321600@j2.us.greatbridge.com>
|
|
Subject: Re: AW: [HACKERS] Postgres Replication
|
|
To: The Hermit Hacker <scrappy@hub.org>
|
|
cc: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>,
|
|
<pgsql-hackers@postgresql.org>
|
|
Reply-To: Darren Johnson <djohnson@greatbridge.com>
|
|
In-Reply-To: <Pine.BSF.4.33.0106120605130.411-100000@mobile.hub.org>
|
|
References: <Pine.BSF.4.33.0106120605130.411-100000@mobile.hub.org>
|
|
X-Mailer: Mozilla/3.0 (compatible; StarOffice/5.2;Linux)
|
|
X-Priority: 3 (Normal)
|
|
MIME-Version: 1.0
|
|
Content-Type: text/plain; charset=ISO-8859-1
|
|
Content-Transfer-Encoding: 8bit
|
|
X-MIME-Autoconverted: from quoted-printable to 8bit by postgresql.org id f5CDYAE56166
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
|
|
> which I believe is what the rserv implementation in contrib currently
|
|
does
|
|
> ... no?
|
|
|
|
We tried rserv, PG Link (Joseph Conway), and PostgreSQL Replicator. All
|
|
these projects are trigger based asynchronous replication. They all have
|
|
some advantages over the current functionality of Postgres-R, some of
|
|
which I believe can be addressed:
|
|
|
|
1) Partial replication - being able to replicate just one table, or part
|

of a table
|
|
2) They make no changes to the PostgreSQL code base. (Postgres-R can't
|
|
address this one ;)
|
|
3) PostgreSQL Replicator has some very nice conflict resolution schemes.
|
|
|
|
|
|
Here are some disadvantages to using a "trigger based" approach:
|
|
|
|
1) Triggers simply transfer individual data items when they are modified,
|
|
they do not keep track of transactions.
|
|
2) The execution of triggers within a database imposes a performance
|
|
overhead to that database.
|
|
3) Triggers require careful management by database administrators.
|
|
Someone needs to keep track of all the "alarms" going off.
|
|
4) The activation of triggers in a database cannot be easily
|
|
rolled back or undone.
|
|
|
|
|
|
|
|
> On Tue, 12 Jun 2001, Zeugswetter Andreas SB wrote:
|
|
|
|
> > Doing a replicate all or nothing approach that only works synchronous
|
|
> > is imho not flexible enough.
|
|
> >
|
|
|
|
|
|
I agree. Partial and asynchronous replication need to be addressed,
|
|
and some of the common functionality of Postgres-R could possibly
|
|
be used to meet those needs.
|
|
|
|
|
|
Thanks for your feedback,
|
|
|
|
Darren
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 5: Have you checked our extensive FAQ?
|
|
|
|
http://www.postgresql.org/users-lounge/docs/faq.html
|
|
|
|
From pgsql-hackers-owner+M9969@postgresql.org Tue Jun 12 16:53:45 2001
|
|
Return-path: <pgsql-hackers-owner+M9969@postgresql.org>
|
|
Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
|
|
by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5CKriL23104
|
|
for <pgman@candle.pha.pa.us>; Tue, 12 Jun 2001 16:53:44 -0400 (EDT)
|
|
Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
|
|
by postgresql.org (8.11.3/8.11.1) with SMTP id f5CKrlE87423;
|
|
Tue, 12 Jun 2001 16:53:47 -0400 (EDT)
|
|
(envelope-from pgsql-hackers-owner+M9969@postgresql.org)
|
|
Received: from sectorbase2.sectorbase.com (sectorbase2.sectorbase.com [63.88.121.62] (may be forged))
|
|
by postgresql.org (8.11.3/8.11.1) with SMTP id f5CHWkE69562
|
|
for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 13:32:46 -0400 (EDT)
|
|
(envelope-from vmikheev@SECTORBASE.COM)
|
|
Received: by sectorbase2.sectorbase.com with Internet Mail Service (5.5.2653.19)
|
|
id <MX6MWMV8>; Tue, 12 Jun 2001 10:30:29 -0700
|
|
Message-ID: <3705826352029646A3E91C53F7189E32016670@sectorbase2.sectorbase.com>
|
|
From: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
|
|
To: "'Darren Johnson'" <djohnson@greatbridge.com>,
|
|
The Hermit Hacker
|
|
<scrappy@hub.org>
|
|
cc: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>,
|
|
pgsql-hackers@postgresql.org
|
|
Subject: RE: AW: [HACKERS] Postgres Replication
|
|
Date: Tue, 12 Jun 2001 10:30:27 -0700
|
|
MIME-Version: 1.0
|
|
X-Mailer: Internet Mail Service (5.5.2653.19)
|
|
Content-Type: text/plain;
|
|
charset="iso-8859-1"
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
> Here are some disadvantages to using a "trigger based" approach:
|
|
>
|
|
> 1) Triggers simply transfer individual data items when they
|
|
> are modified, they do not keep track of transactions.
|
|
|
|
I don't know about other *async* replication engines but Rserv
|
|
keeps track of transactions (if I understood you correctly).
|
|
Rserv transfers not individual modified data items but
|
|
*consistent* snapshot of changes to move slave database from
|
|
one *consistent* state (when all RI constraints satisfied)
|
|
to another *consistent* state.
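
A minimal sketch of the snapshot idea, assuming a change log filled by triggers on the master. This is not Rserv's actual schema or code: the log entry layout, take_snapshot and apply_snapshot are hypothetical, and using the transaction id as a stand-in for commit order is a simplification.

```python
def take_snapshot(change_log, committed, last_synced):
    """Return the changes made by committed transactions newer than
    last_synced. Changes from open or aborted transactions are skipped."""
    return [c for c in change_log
            if c["xid"] in committed and c["xid"] > last_synced]


def apply_snapshot(slave, snapshot):
    """Apply the whole batch to a copy and return it, so the slave
    never exposes a half-applied state."""
    new_state = dict(slave)
    for c in snapshot:
        if c["op"] == "delete":
            new_state.pop(c["key"], None)
        else:
            new_state[c["key"]] = c["value"]
    return new_state


log = [
    {"xid": 1, "op": "upsert", "key": "a", "value": 1},
    {"xid": 2, "op": "upsert", "key": "b", "value": 2},  # not committed
    {"xid": 3, "op": "upsert", "key": "c", "value": 3},
]
snapshot = take_snapshot(log, committed={1, 3}, last_synced=0)
slave = apply_snapshot({}, snapshot)
```

The slave moves from the empty state straight to the state after transactions 1 and 3, and never sees the uncommitted change from transaction 2.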
|
|
|
|
> 4) The activation of triggers in a database cannot be easily
|
|
> rolled back or undone.
|
|
|
|
What do you mean?
|
|
|
|
Vadim
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 2: you can get off all lists at once with the unregister command
|
|
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
|
|
|
|
From pgsql-hackers-owner+M9967@postgresql.org Tue Jun 12 16:42:11 2001
|
|
Return-path: <pgsql-hackers-owner+M9967@postgresql.org>
|
|
Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
|
|
by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5CKgBL17982
|
|
for <pgman@candle.pha.pa.us>; Tue, 12 Jun 2001 16:42:11 -0400 (EDT)
|
|
Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
|
|
by postgresql.org (8.11.3/8.11.1) with SMTP id f5CKgDE80566;
|
|
Tue, 12 Jun 2001 16:42:13 -0400 (EDT)
|
|
(envelope-from pgsql-hackers-owner+M9967@postgresql.org)
|
|
Received: from mail.greatbridge.com (mail.greatbridge.com [65.196.68.36])
|
|
by postgresql.org (8.11.3/8.11.1) with ESMTP id f5CIVdE07561
|
|
for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 14:31:39 -0400 (EDT)
|
|
(envelope-from djohnson@greatbridge.com)
|
|
Received: from j2.us.greatbridge.com (djohnsonpc.us.greatbridge.com [65.196.69.70])
|
|
by mail.greatbridge.com (8.11.2/8.11.2) with SMTP id f5CIUfQ10080;
|
|
Tue, 12 Jun 2001 14:30:41 -0400
|
|
From: Darren Johnson <djohnson@greatbridge.com>
|
|
Date: Tue, 12 Jun 2001 18:29:20 GMT
|
|
Message-ID: <20010612.18292000@j2.us.greatbridge.com>
|
|
Subject: RE: AW: [HACKERS] Postgres Replication
|
|
To: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
|
|
cc: The Hermit Hacker <scrappy@hub.org>,
|
|
Zeugswetter Andreas SB
|
|
<ZeugswetterA@wien.spardat.at>,
|
|
pgsql-hackers@postgresql.org
|
|
Reply-To: Darren Johnson <djohnson@greatbridge.com>
|
|
<3705826352029646A3E91C53F7189E32016670@sectorbase2.sectorbase.com>
|
|
References: <3705826352029646A3E91C53F7189E32016670@sectorbase2.sectorbase.com>
|
|
X-Mailer: Mozilla/3.0 (compatible; StarOffice/5.2;Linux)
|
|
X-Priority: 3 (Normal)
|
|
MIME-Version: 1.0
|
|
Content-Type: text/plain; charset=ISO-8859-1
|
|
Content-Transfer-Encoding: 8bit
|
|
X-MIME-Autoconverted: from quoted-printable to 8bit by postgresql.org id f5CIVdE07562
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
|
|
|
|
> > Here are some disadvantages to using a "trigger based" approach:
|
|
> >
|
|
> > 1) Triggers simply transfer individual data items when they
|
|
> > are modified, they do not keep track of transactions.
|
|
|
|
> I don't know about other *async* replication engines but Rserv
|
|
> keeps track of transactions (if I understood you correctly).
|
|
> Rserv transfers not individual modified data items but
|
|
> *consistent* snapshot of changes to move slave database from
|
|
> one *consistent* state (when all RI constraints satisfied)
|
|
> to another *consistent* state.
|
|
|
|
I thought Andreas did a good job of correcting me here. Points 1 and 4
|

do not apply to transaction-based replication with triggers. I
|
|
should have made a distinction between non-transaction and
|
|
transaction based replication with triggers. I was not trying to
|
|
single out rserv or any other project, and I can see how my wording
|
|
implies this misinterpretation (my apologies).
|
|
|
|
|
|
> > 4) The activation of triggers in a database cannot be easily
|
|
> > rolled back or undone.
|
|
|
|
> What do you mean?
|
|
|
|
Once the trigger fires, it is not an easy task to abort that
|
|
execution via rollback or undo. Again this is not an issue
|
|
with a transaction-based trigger approach.
|
|
|
|
|
|
Sincerely,
|
|
|
|
Darren
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 2: you can get off all lists at once with the unregister command
|
|
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
|
|
|
|
From pgsql-hackers-owner+M9943@postgresql.org Tue Jun 12 10:03:02 2001
|
|
Return-path: <pgsql-hackers-owner+M9943@postgresql.org>
|
|
Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
|
|
by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5CE32L04619
|
|
for <pgman@candle.pha.pa.us>; Tue, 12 Jun 2001 10:03:02 -0400 (EDT)
|
|
Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
|
|
by postgresql.org (8.11.3/8.11.1) with SMTP id f5CE31E70430;
|
|
Tue, 12 Jun 2001 10:03:01 -0400 (EDT)
|
|
(envelope-from pgsql-hackers-owner+M9943@postgresql.org)
|
|
Received: from fizbanrsm.server.lan.at (zep4.it-austria.net [213.150.1.74])
|
|
by postgresql.org (8.11.3/8.11.1) with ESMTP id f5CDoQE64062
|
|
for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 09:50:26 -0400 (EDT)
|
|
(envelope-from ZeugswetterA@wien.spardat.at)
|
|
Received: from gz0153.gc.spardat.at (gz0153.gc.spardat.at [172.20.10.149])
|
|
by fizbanrsm.server.lan.at (8.11.2/8.11.2) with ESMTP id f5CDoJe11224
|
|
for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 15:50:19 +0200
|
|
Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2650.21)
|
|
id <M3L15S4T>; Tue, 12 Jun 2001 15:50:15 +0200
|
|
Message-ID: <11C1E6749A55D411A9670001FA68796336831F@sdexcsrv1.f000.d0188.sd.spardat.at>
|
|
From: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>
|
|
To: "'Darren Johnson'" <djohnson@greatbridge.com>,
|
|
The Hermit Hacker
|
|
<scrappy@hub.org>
|
|
cc: pgsql-hackers@postgresql.org
|
|
Subject: AW: AW: [HACKERS] Postgres Replication
|
|
Date: Tue, 12 Jun 2001 15:50:09 +0200
|
|
MIME-Version: 1.0
|
|
X-Mailer: Internet Mail Service (5.5.2650.21)
|
|
Content-Type: text/plain;
|
|
charset="iso-8859-1"
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
|
|
> Here are some disadvantages to using a "trigger based" approach:
|
|
>
|
|
> 1) Triggers simply transfer individual data items when they
|
|
> are modified, they do not keep track of transactions.
|
|
> 2) The execution of triggers within a database imposes a performance
|
|
> overhead to that database.
|
|
> 3) Triggers require careful management by database administrators.
|
|
> Someone needs to keep track of all the "alarms" going off.
|
|
> 4) The activation of triggers in a database cannot be easily
|
|
> rolled back or undone.
|
|
|
|
Yes, points 2 and 3 are a given, although point 2 buys you the functionality
|
|
of transparent locking across all involved db servers.
|
|
Points 1 and 4 are only the case for a trigger mechanism that does
|
|
not use remote connection and 2-phase commit.
|
|
|
|
Imho an implementation that opens a separate client connection to the
|
|
replication target is only suited for async replication, and for that a WAL
|
|
based solution would probably impose less overhead.
|
|
|
|
Andreas
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 2: you can get off all lists at once with the unregister command
|
|
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
|
|
|
|
From pgsql-hackers-owner+M9946@postgresql.org Tue Jun 12 10:47:09 2001
|
|
Return-path: <pgsql-hackers-owner+M9946@postgresql.org>
|
|
Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
|
|
by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5CEl9L08144
|
|
for <pgman@candle.pha.pa.us>; Tue, 12 Jun 2001 10:47:09 -0400 (EDT)
|
|
Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
|
|
by postgresql.org (8.11.3/8.11.1) with SMTP id f5CEihE88714;
|
|
Tue, 12 Jun 2001 10:44:43 -0400 (EDT)
|
|
(envelope-from pgsql-hackers-owner+M9946@postgresql.org)
|
|
Received: from mail.greatbridge.com (mail.greatbridge.com [65.196.68.36])
|
|
by postgresql.org (8.11.3/8.11.1) with ESMTP id f5CEd6E85859
|
|
for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 10:39:06 -0400 (EDT)
|
|
(envelope-from djohnson@greatbridge.com)
|
|
Received: from j2.us.greatbridge.com (djohnsonpc.us.greatbridge.com [65.196.69.70])
|
|
by mail.greatbridge.com (8.11.2/8.11.2) with SMTP id f5CEcgQ04905;
|
|
Tue, 12 Jun 2001 10:38:42 -0400
|
|
From: Darren Johnson <djohnson@greatbridge.com>
|
|
Date: Tue, 12 Jun 2001 14:37:18 GMT
|
|
Message-ID: <20010612.14371800@j2.us.greatbridge.com>
|
|
Subject: Re: AW: AW: [HACKERS] Postgres Replication
|
|
To: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>
|
|
cc: pgsql-hackers@postgresql.org
|
|
Reply-To: Darren Johnson <djohnson@greatbridge.com>
|
|
<11C1E6749A55D411A9670001FA68796336831F@sdexcsrv1.f000.d0188.sd.spardat.at>
|
|
References: <11C1E6749A55D411A9670001FA68796336831F@sdexcsrv1.f000.d0188.sd.spardat.at>
|
|
X-Mailer: Mozilla/3.0 (compatible; StarOffice/5.2;Linux)
|
|
X-Priority: 3 (Normal)
|
|
MIME-Version: 1.0
|
|
Content-Type: text/plain; charset=ISO-8859-1
|
|
Content-Transfer-Encoding: 8bit
|
|
X-MIME-Autoconverted: from quoted-printable to 8bit by postgresql.org id f5CEd6E85860
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
|
|
|
|
> Imho an implementation that opens a separate client connection to the
|
|
> replication target is only suited for async replication, and for that a
|
|
WAL
|
|
> based solution would probably impose less overhead.
|
|
|
|
|
|
Yes there is significant overhead with opening a connection to a
|
|
client, so Postgres-R creates a pool of backends at start up,
|
|
coupled with the group communication system (Ensemble) that
|
|
significantly reduces this issue.
|
|
|
|
|
|
Very good points,
|
|
|
|
Darren
|
|
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 6: Have you searched our list archives?
|
|
|
|
http://www.postgresql.org/search.mpl
|
|
|
|
From pgsql-hackers-owner+M9982@postgresql.org Tue Jun 12 19:04:06 2001
|
|
Return-path: <pgsql-hackers-owner+M9982@postgresql.org>
|
|
Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
|
|
by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5CN46E10043
|
|
for <pgman@candle.pha.pa.us>; Tue, 12 Jun 2001 19:04:06 -0400 (EDT)
|
|
Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
|
|
by postgresql.org (8.11.3/8.11.1) with SMTP id f5CN4AE62160;
|
|
Tue, 12 Jun 2001 19:04:10 -0400 (EDT)
|
|
(envelope-from pgsql-hackers-owner+M9982@postgresql.org)
|
|
Received: from spoetnik.xs4all.nl (spoetnik.xs4all.nl [194.109.249.226])
|
|
by postgresql.org (8.11.3/8.11.1) with ESMTP id f5CMxaE60194
|
|
for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 18:59:36 -0400 (EDT)
|
|
(envelope-from reinoud@xs4all.nl)
|
|
Received: from KAYAK (kayak [192.168.1.20])
|
|
by spoetnik.xs4all.nl (Postfix) with SMTP id 435353E1B
|
|
for <pgsql-hackers@postgresql.org>; Wed, 13 Jun 2001 00:59:28 +0200 (CEST)
|
|
From: reinoud@xs4all.nl (Reinoud van Leeuwen)
|
|
To: pgsql-hackers@postgresql.org
|
|
Subject: Re: AW: AW: [HACKERS] Postgres Replication
|
|
Date: Tue, 12 Jun 2001 22:59:23 GMT
|
|
Organization: Not organized in any way
|
|
Reply-To: reinoud@xs4all.nl
|
|
Message-ID: <3b499c5b.652202125@192.168.1.10>
|
|
References: <11C1E6749A55D411A9670001FA68796336831F@sdexcsrv1.f000.d0188.sd.spardat.at>
|
|
In-Reply-To: <11C1E6749A55D411A9670001FA68796336831F@sdexcsrv1.f000.d0188.sd.spardat.at>
|
|
X-Mailer: Forte Agent 1.5/32.451
|
|
MIME-Version: 1.0
|
|
Content-Type: text/plain; charset=us-ascii
|
|
Content-Transfer-Encoding: 8bit
|
|
X-MIME-Autoconverted: from quoted-printable to 8bit by postgresql.org id f5CMxcE60196
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
On Tue, 12 Jun 2001 15:50:09 +0200, you wrote:
|
|
|
|
>
|
|
>> Here are some disadvantages to using a "trigger based" approach:
|
|
>>
|
|
>> 1) Triggers simply transfer individual data items when they
|
|
>> are modified, they do not keep track of transactions.
|
|
>> 2) The execution of triggers within a database imposes a performance
|
|
>> overhead to that database.
|
|
>> 3) Triggers require careful management by database administrators.
|
|
>> Someone needs to keep track of all the "alarms" going off.
|
|
>> 4) The activation of triggers in a database cannot be easily
|
|
>> rolled back or undone.
|
|
>
|
|
>Yes, points 2 and 3 are a given, although point 2 buys you the functionality
|
|
>of transparent locking across all involved db servers.
|
|
>Points 1 and 4 are only the case for a trigger mechanism that does
|
|
>not use remote connection and 2-phase commit.
|
|
>
|
|
>Imho an implementation that opens a separate client connection to the
|
|
>replication target is only suited for async replication, and for that a WAL
|
|
>based solution would probably impose less overhead.
|
|
|
|
Well as I read back the thread I see 2 different approaches to
|
|
replication:
|
|
|
|
1: tightly integrated replication.
|
|
pro:
|
|
- bi-directional (or multidirectional): updates are possible
|
|
everywhere
|
|
- A cluster of servers always has the same state.
|
|
- it does not matter to which server you connect
|
|
con:
|
|
- network between servers will be a bottleneck, especially if it is a
|
|
WAN connection
|
|
- only full replication possible
|
|
- what happens if one server is down? (or the network between) are
|
|
commits still possible?
|
|
|
|
2: async replication
|
|
pro:
|
|
- long distance possible
|
|
- no problems with network outages
|
|
- only changes are replicated, selects do not have impact
|
|
- no locking issues across servers
|
|
- partial replication possible (many->one (datawarehouse), or one->many
|

(queries possible everywhere, updates only central))
|
|
- good for failover situations (backup server is standing by)
|
|
con:
|
|
- bidirectional replication hard to set up (you'll have to implement
|
|
conflict resolution according to your business rules)
|
|
- different servers are not guaranteed to be in the same state.
|
|
|
|
I can think of some scenarios where I would definitely want to
|
|
*choose* one of the options. A load-balanced web environment would
|
|
likely want the first option, but synchronizing offices in different
|
|
continents might not work with 2-phase commit over the network....
|
|
|
|
And we have not even started talking about *managing* replicated
|
|
environments. A lot of fail-over scenarios stop planning after the
|
|
backup host has taken control. But how to get back?
|
|
--
|
|
__________________________________________________
|
|
"Nothing is as subjective as reality"
|
|
Reinoud van Leeuwen reinoud@xs4all.nl
|
|
http://www.xs4all.nl/~reinoud
|
|
__________________________________________________
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
|
|
|
|
From pgsql-hackers-owner+M9986@postgresql.org Tue Jun 12 19:48:48 2001
|
|
Return-path: <pgsql-hackers-owner+M9986@postgresql.org>
|
|
Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
|
|
by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5CNmmE13125
|
|
for <pgman@candle.pha.pa.us>; Tue, 12 Jun 2001 19:48:48 -0400 (EDT)
|
|
Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
|
|
by postgresql.org (8.11.3/8.11.1) with SMTP id f5CNmqE76673;
|
|
Tue, 12 Jun 2001 19:48:52 -0400 (EDT)
|
|
(envelope-from pgsql-hackers-owner+M9986@postgresql.org)
|
|
Received: from sss.pgh.pa.us ([192.204.191.242])
|
|
by postgresql.org (8.11.3/8.11.1) with ESMTP id f5CNdQE73923
|
|
for <pgsql-hackers@postgresql.org>; Tue, 12 Jun 2001 19:39:26 -0400 (EDT)
|
|
(envelope-from tgl@sss.pgh.pa.us)
|
|
Received: from sss2.sss.pgh.pa.us (tgl@localhost [127.0.0.1])
|
|
by sss.pgh.pa.us (8.11.3/8.11.3) with ESMTP id f5CNdI016442;
|
|
Tue, 12 Jun 2001 19:39:18 -0400 (EDT)
|
|
To: reinoud@xs4all.nl
|
|
cc: pgsql-hackers@postgresql.org
|
|
Subject: Re: AW: AW: [HACKERS] Postgres Replication
|
|
In-Reply-To: <3b499c5b.652202125@192.168.1.10>
|
|
References: <11C1E6749A55D411A9670001FA68796336831F@sdexcsrv1.f000.d0188.sd.spardat.at> <3b499c5b.652202125@192.168.1.10>
|
|
Comments: In-reply-to reinoud@xs4all.nl (Reinoud van Leeuwen)
|
|
message dated "Tue, 12 Jun 2001 22:59:23 +0000"
|
|
Date: Tue, 12 Jun 2001 19:39:18 -0400
|
|
Message-ID: <16439.992389158@sss.pgh.pa.us>
|
|
From: Tom Lane <tgl@sss.pgh.pa.us>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
reinoud@xs4all.nl (Reinoud van Leeuwen) writes:
> Well as I read back the thread I see 2 different approaches to
> replication:
> ...
> I can think of some scenarios where I would definitely want to
> *choose* one of the options.

Yes. IIRC, it looks to be possible to support a form of async
replication using the Postgres-R approach: you allow the cluster
to break apart when communications fail, and then rejoin when
your link comes back to life. (This can work in principle, how
close it is to reality is another question; but the rejoin operation
is the same as crash recovery, so you have to have it anyway.)

So this seems to me to allow getting most of the benefits of the async
approach. OTOH it is difficult to see how to go the other way: getting
the benefits of a synchronous solution atop a basically-async
implementation doesn't seem like it can work.
|
|
|
|
regards, tom lane
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 6: Have you searched our list archives?
|
|
|
|
http://www.postgresql.org/search.mpl
|
|
|
|
From pgsql-hackers-owner+M9997@postgresql.org Wed Jun 13 09:05:56 2001
|
|
Return-path: <pgsql-hackers-owner+M9997@postgresql.org>
|
|
Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
|
|
by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5DD5tE28260
|
|
for <pgman@candle.pha.pa.us>; Wed, 13 Jun 2001 09:05:55 -0400 (EDT)
|
|
Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
|
|
by postgresql.org (8.11.3/8.11.1) with SMTP id f5DD5xE12437;
|
|
Wed, 13 Jun 2001 09:05:59 -0400 (EDT)
|
|
(envelope-from pgsql-hackers-owner+M9997@postgresql.org)
|
|
Received: from fizbanrsm.server.lan.at (zep4.it-austria.net [213.150.1.74])
|
|
by postgresql.org (8.11.3/8.11.1) with ESMTP id f5DD19E00635
|
|
for <pgsql-hackers@postgresql.org>; Wed, 13 Jun 2001 09:01:10 -0400 (EDT)
|
|
(envelope-from ZeugswetterA@wien.spardat.at)
|
|
Received: from gz0153.gc.spardat.at (gz0153.gc.spardat.at [172.20.10.149])
|
|
by fizbanrsm.server.lan.at (8.11.2/8.11.2) with ESMTP id f5DD13m08153
|
|
for <pgsql-hackers@postgresql.org>; Wed, 13 Jun 2001 15:01:03 +0200
|
|
Received: by sdexcgtw01.f000.d0188.sd.spardat.at with Internet Mail Service (5.5.2650.21)
|
|
id <M6AB97MY>; Wed, 13 Jun 2001 15:00:02 +0200
|
|
Message-ID: <11C1E6749A55D411A9670001FA687963368322@sdexcsrv1.f000.d0188.sd.spardat.at>
|
|
From: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>
|
|
To: "'reinoud@xs4all.nl'" <reinoud@xs4all.nl>, pgsql-hackers@postgresql.org
|
|
Subject: AW: AW: AW: [HACKERS] Postgres Replication
|
|
Date: Wed, 13 Jun 2001 11:55:48 +0200
|
|
MIME-Version: 1.0
|
|
X-Mailer: Internet Mail Service (5.5.2650.21)
|
|
Content-Type: text/plain;
|
|
charset="iso-8859-1"
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
|
|
> Well as I read back the thread I see 2 different approaches to
> replication:
>
> 1: tight integrated replication.
> pro:
> - bi-directional (or multidirectional): updates are possible everywhere
> - A cluster of servers always has the same state.
> - it does not matter to which server you connect
> con:
> - network between servers will be a bottleneck, especially if it is a
> WAN connection
> - only full replication possible

I do not understand that point. If it is trigger based, you
have all the flexibility you need (only some tables, only some rows,
different rows to different targets ...).
(Or do you mean not all targets? That could also be achieved with triggers.)

> - what happens if one server is down? (or the network between) are
> commits still possible

No, updates are not possible if one target is not reachable;
that would not be synchronous and would again need business rules
to resolve conflicts.

Allowing updates when a target is not reachable would require admin
intervention.
|
|
|
|
Andreas
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 4: Don't 'kill -9' the postmaster
|
|
|
|
From pgsql-hackers-owner+M10005@postgresql.org Wed Jun 13 11:15:48 2001
|
|
Return-path: <pgsql-hackers-owner+M10005@postgresql.org>
|
|
Received: from postgresql.org (webmail.postgresql.org [216.126.85.28])
|
|
by candle.pha.pa.us (8.10.1/8.10.1) with ESMTP id f5DFFmE08382
|
|
for <pgman@candle.pha.pa.us>; Wed, 13 Jun 2001 11:15:48 -0400 (EDT)
|
|
Received: from postgresql.org.org (webmail.postgresql.org [216.126.85.28])
|
|
by postgresql.org (8.11.3/8.11.1) with SMTP id f5DFFoE53621;
|
|
Wed, 13 Jun 2001 11:15:50 -0400 (EDT)
|
|
(envelope-from pgsql-hackers-owner+M10005@postgresql.org)
|
|
Received: from mail.greatbridge.com (mail.greatbridge.com [65.196.68.36])
|
|
by postgresql.org (8.11.3/8.11.1) with ESMTP id f5DEk7E38930
|
|
for <pgsql-hackers@postgresql.org>; Wed, 13 Jun 2001 10:46:07 -0400 (EDT)
|
|
(envelope-from djohnson@greatbridge.com)
|
|
Received: from j2.us.greatbridge.com (djohnsonpc.us.greatbridge.com [65.196.69.70])
|
|
by mail.greatbridge.com (8.11.2/8.11.2) with SMTP id f5DEhfQ22566;
|
|
Wed, 13 Jun 2001 10:43:41 -0400
|
|
From: Darren Johnson <djohnson@greatbridge.com>
|
|
Date: Wed, 13 Jun 2001 14:44:11 GMT
|
|
Message-ID: <20010613.14441100@j2.us.greatbridge.com>
|
|
Subject: Re: AW: AW: AW: [HACKERS] Postgres Replication
|
|
To: Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at>
|
|
cc: "'reinoud@xs4all.nl'" <reinoud@xs4all.nl>, pgsql-hackers@postgresql.org
|
|
Reply-To: Darren Johnson <djohnson@greatbridge.com>
|
|
<11C1E6749A55D411A9670001FA687963368322@sdexcsrv1.f000.d0188.sd.spardat.at>
|
|
References: <11C1E6749A55D411A9670001FA687963368322@sdexcsrv1.f000.d0188.sd.spardat.at>
|
|
X-Mailer: Mozilla/3.0 (compatible; StarOffice/5.2;Linux)
|
|
X-Priority: 3 (Normal)
|
|
MIME-Version: 1.0
|
|
Content-Type: text/plain; charset=ISO-8859-1
|
|
Content-Transfer-Encoding: 8bit
|
|
X-MIME-Autoconverted: from quoted-printable to 8bit by postgresql.org id f5DEk8E38931
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
|
|
> > - only full replication possible

> I do not understand that point, if it is trigger based, you
> have all the flexibility you need. (only some tables, only some rows,
> different rows to different targets ....),
> (or do you mean not all targets, that could also be achieved with
> triggers)

Currently with Postgres-R, it is one database replicating all tables to
all servers in the group communication system. There are some ways
around this by invoking the -r option when a SQL statement should be
replicated, and leaving the -r option off for non-replicated scenarios.
IMHO this is not a good solution.
|
|
|
|
A better solution will need to be implemented, which involves one or more
subscription tables with relation/server information. There are two
ideas for subscribing and receiving replicated data.

1) Receiver driven propagation - A simple solution where all
transactions are propagated and the receiving servers reference
the subscription information before applying updates.

2) Sender driven propagation - A more efficient but more complex solution
where servers do not receive any messages regarding data items for
which they have not subscribed.
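A minimal sketch of the two propagation styles above, in Python. The subscription table layout, the server names, and both function names are assumptions made for illustration; none of this is Postgres-R code. It only shows that both styles apply the same changes, while the sender driven style sends fewer messages.

```python
from collections import defaultdict

# Hypothetical subscription table: server name -> relations it subscribes to.
SUBSCRIPTIONS = {
    "server_a": {"orders", "customers"},
    "server_b": {"orders"},
    "server_c": set(),
}


def receiver_driven(changes, subscriptions):
    """Broadcast every change; each receiver filters against its subscriptions."""
    applied = defaultdict(list)
    messages_sent = 0
    for server, relations in subscriptions.items():
        for relation, payload in changes:
            messages_sent += 1              # every server receives every message
            if relation in relations:       # the receiver decides whether to apply
                applied[server].append((relation, payload))
    return dict(applied), messages_sent


def sender_driven(changes, subscriptions):
    """The sender consults the subscriptions and sends only what is wanted."""
    applied = defaultdict(list)
    messages_sent = 0
    for relation, payload in changes:
        for server, relations in subscriptions.items():
            if relation in relations:       # the sender decides whether to send
                messages_sent += 1
                applied[server].append((relation, payload))
    return dict(applied), messages_sent


if __name__ == "__main__":
    changes = [("orders", "row 1"), ("customers", "row 2"), ("audit", "row 3")]
    print(receiver_driven(changes, SUBSCRIPTIONS))
    print(sender_driven(changes, SUBSCRIPTIONS))
```

The trade-off is where the subscription lookup happens: the receiver driven variant keeps the sender simple at the cost of network traffic, which matters most over a WAN link.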
|
|
|
|
|
|
> > - what happens if one server is down? (or the network between) are
> > commits still possible

> No, updates are not possible if one target is not reachable,

AFAIK, Postgres-R can still replicate if one target is not reachable,
but only to the remaining servers ;).

There is a scenario that could arise if a server issues a lock
request and then fails or goes offline. There is code that checks
for this condition, which needs to be merged with the branch we have.

> that would not be synchronous and would again need business rules
> to resolve conflicts.

Yes, the failed server would not be synchronized, and getting this
failed server back in sync needs to be addressed.

> Allowing updates when a target is not reachable would require admin
> intervention.

In its current state yes, but our goal would be to eliminate this
requirement as well.
|
|
|
|
|
|
|
|
Darren
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 3: if posting/reading through Usenet, please send an appropriate
|
|
subscribe-nomail command to majordomo@postgresql.org so that your
|
|
message can get through to the mailing list cleanly
|
|
|
|
From pgsql-hackers-owner+M18443=candle.pha.pa.us=pgman@postgresql.org Mon Feb 4 19:16:17 2002
|
|
Return-path: <pgsql-hackers-owner+M18443=candle.pha.pa.us=pgman@postgresql.org>
|
|
Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
|
|
by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g150GGP03822
|
|
for <pgman@candle.pha.pa.us>; Mon, 4 Feb 2002 19:16:16 -0500 (EST)
|
|
Received: (qmail 77444 invoked by alias); 5 Feb 2002 00:16:11 -0000
|
|
Received: from unknown (HELO postgresql.org) (64.49.215.8)
|
|
by www.postgresql.org with SMTP; 5 Feb 2002 00:16:11 -0000
|
|
Received: from snoopy.mohawksoft.com (h0050bf7a618d.ne.mediaone.net [24.147.138.78])
|
|
by postgresql.org (8.11.3/8.11.4) with ESMTP id g150Esl77040
|
|
for <pgsql-hackers@postgresql.org>; Mon, 4 Feb 2002 19:14:54 -0500 (EST)
|
|
(envelope-from markw@mohawksoft.com)
|
|
Received: from mohawksoft.com (localhost [127.0.0.1])
|
|
by snoopy.mohawksoft.com (8.11.6/8.11.6) with ESMTP id g150AWh08676
|
|
for <pgsql-hackers@postgresql.org>; Mon, 4 Feb 2002 19:10:33 -0500
|
|
Message-ID: <3C5F22F8.C9B958F0@mohawksoft.com>
|
|
Date: Mon, 04 Feb 2002 19:10:32 -0500
|
|
From: mlw <markw@mohawksoft.com>
|
|
X-Mailer: Mozilla 4.78 [en] (X11; U; Linux 2.4.17 i686)
|
|
X-Accept-Language: en
|
|
MIME-Version: 1.0
|
|
To: PostgreSQL-development <pgsql-hackers@postgresql.org>
|
|
Subject: [HACKERS] Replication
|
|
Content-Type: text/plain; charset=us-ascii
|
|
Content-Transfer-Encoding: 7bit
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
I re-wrote RServ.pm in C, and wrote a replication daemon. It works, but it
works like the whole rserv project. I don't like it.

OK, what the hell do we need to do to get PostgreSQL replicating?
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 4: Don't 'kill -9' the postmaster
|
|
|
|
From pgsql-hackers-owner+M18445=candle.pha.pa.us=pgman@postgresql.org Mon Feb 4 19:57:01 2002
|
|
Return-path: <pgsql-hackers-owner+M18445=candle.pha.pa.us=pgman@postgresql.org>
|
|
Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
|
|
by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g150v0P06518
|
|
for <pgman@candle.pha.pa.us>; Mon, 4 Feb 2002 19:57:00 -0500 (EST)
|
|
Received: (qmail 90440 invoked by alias); 5 Feb 2002 00:56:59 -0000
|
|
Received: from unknown (HELO postgresql.org) (64.49.215.8)
|
|
by www.postgresql.org with SMTP; 5 Feb 2002 00:56:59 -0000
|
|
Received: from www1.navtechinc.com ([192.234.226.140])
|
|
by postgresql.org (8.11.3/8.11.4) with ESMTP id g150rMl89885
|
|
for <pgsql-hackers@postgresql.org>; Mon, 4 Feb 2002 19:53:22 -0500 (EST)
|
|
(envelope-from ssinger@navtechinc.com)
|
|
Received: from pcNavYkfAdm1.ykf.navtechinc.com (wall [192.234.226.190])
|
|
by www1.navtechinc.com (8.9.3/8.9.3) with ESMTP id AAA06047;
|
|
Tue, 5 Feb 2002 00:53:22 GMT
|
|
Received: from localhost (ssinger@localhost)
|
|
by pcNavYkfAdm1.ykf.navtechinc.com (8.9.3/8.9.3) with ESMTP id AAA10675;
|
|
Tue, 5 Feb 2002 00:52:43 GMT
|
|
Date: Tue, 5 Feb 2002 00:52:43 +0000 (GMT)
|
|
From: Steven <ssinger@navtechinc.com>
|
|
X-X-Sender: <ssinger@pcNavYkfAdm1.ykf.navtechinc.com>
|
|
To: mlw <markw@mohawksoft.com>
|
|
cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
|
|
Subject: Re: [HACKERS] Replication
|
|
In-Reply-To: <3C5F22F8.C9B958F0@mohawksoft.com>
|
|
Message-ID: <Pine.LNX.4.33.0202050040190.24027-100000@pcNavYkfAdm1.ykf.navtechinc.com>
|
|
MIME-Version: 1.0
|
|
Content-Type: TEXT/PLAIN; charset=US-ASCII
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
On Mon, 4 Feb 2002, mlw wrote:
|
|
|
|
I've developed a replacement for Rserv and we are planning on releasing
it as open source (i.e. as a contrib module).

Like Rserv it's trigger based, but it's much more flexible.
The key advantages it has over Rserv are:
- Support for multiple slaves
- It preserves transactions while doing the mirroring, i.e. if rows A and B
are originally added in the same transaction, they will be mirrored in the
same transaction.

We have plans to add filtering based on data (selective mirroring) as
well (e.g. only rows with COUNTRY='Canada' go to
slave A, and rows with COUNTRY='China' go to slave B).
But I'm not sure when I'll get to that.
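A minimal sketch of what such data-based filtering could look like, in Python. The feature is only planned in the message above, so the rule table, the slave names, and route_rows() are all hypothetical and say nothing about how the real tool would do it.

```python
# Hypothetical routing rules: slave name -> predicate over a row (a dict).
ROUTES = {
    "slave_a": lambda row: row.get("COUNTRY") == "Canada",
    "slave_b": lambda row: row.get("COUNTRY") == "China",
}


def route_rows(rows, routes):
    """Return {slave: [rows]}; a row goes only to slaves whose rule matches."""
    outbound = {slave: [] for slave in routes}
    for row in rows:
        for slave, wants in routes.items():
            if wants(row):
                outbound[slave].append(row)
    return outbound


if __name__ == "__main__":
    rows = [
        {"id": 1, "COUNTRY": "Canada"},
        {"id": 2, "COUNTRY": "China"},
        {"id": 3, "COUNTRY": "Sweden"},   # matches no rule, mirrored nowhere
    ]
    print(route_rows(rows, ROUTES))
```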
|
|
|
|
Support for conflict resolution (if edits are allowed on the slaves)
would be nice.

I hope to be able to send a tarball with the source to the pgpatches list
within the next few days.

We've been using the system operationally for a number of months and have
been happy with it.
|
|
|
|
> I re-wrote RServ.pm to C, and wrote a replication daemon. It works, but it
> works like the whole rserv project. I don't like it.
> OK, what the hell do we need to do to get PostgreSQL replicating?
|
|
|
|
--
|
|
Steven Singer ssinger@navtechinc.com
|
|
Aircraft Performance Systems Phone: 519-747-1170 ext 282
|
|
Navtech Systems Support Inc. AFTN: CYYZXNSX SITA: YYZNSCR
|
|
Waterloo, Ontario ARINC: YKFNSCR
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 2: you can get off all lists at once with the unregister command
|
|
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
|
|
|
|
From pgsql-hackers-owner+M18447=candle.pha.pa.us=pgman@postgresql.org Mon Feb 4 20:06:57 2002
|
|
Return-path: <pgsql-hackers-owner+M18447=candle.pha.pa.us=pgman@postgresql.org>
|
|
Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
|
|
by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g1516vP07508
|
|
for <pgman@candle.pha.pa.us>; Mon, 4 Feb 2002 20:06:57 -0500 (EST)
|
|
Received: (qmail 92753 invoked by alias); 5 Feb 2002 01:06:55 -0000
|
|
Received: from unknown (HELO postgresql.org) (64.49.215.8)
|
|
by www.postgresql.org with SMTP; 5 Feb 2002 01:06:55 -0000
|
|
Received: from inflicted.crimelabs.net (crimelabs.net [66.92.101.112])
|
|
by postgresql.org (8.11.3/8.11.4) with ESMTP id g150vhl91978
|
|
for <pgsql-hackers@postgresql.org>; Mon, 4 Feb 2002 19:57:44 -0500 (EST)
|
|
(envelope-from bpalmer@crimelabs.net)
|
|
Received: from mizer.crimelabs.net (mizer.crimelabs.net [192.168.88.10])
|
|
by inflicted.crimelabs.net (Postfix) with ESMTP
|
|
id 9D6EE8779; Mon, 4 Feb 2002 19:57:46 -0500 (EST)
|
|
Date: Mon, 4 Feb 2002 19:57:34 -0500 (EST)
|
|
From: bpalmer <bpalmer@crimelabs.net>
|
|
To: mlw <markw@mohawksoft.com>
|
|
cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
|
|
Subject: Re: [HACKERS] Replication
|
|
In-Reply-To: <3C5F22F8.C9B958F0@mohawksoft.com>
|
|
Message-ID: <Pine.BSO.4.43.0202041955420.17121-100000@mizer.crimelabs.net>
|
|
MIME-Version: 1.0
|
|
Content-Type: TEXT/PLAIN; charset=US-ASCII
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
>
> OK, what the hell do we need to do to get PostgreSQL replicating?

I hope you understand that replication, done right, is a massive
project. I know that Darren and myself (and the rest of the pg-repl
folks) have been waiting till 7.2 went gold before we did any more work. I
think we hope to have master / slave replication working for 7.3 and then
target multimaster for 7.4. At least that's the hope.
|
|
|
|
- Brandon
|
|
|
|
----------------------------------------------------------------------------
|
|
c: 646-456-5455 h: 201-798-4983
|
|
b. palmer, bpalmer@crimelabs.net pgp:crimelabs.net/bpalmer.pgp5
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 2: you can get off all lists at once with the unregister command
|
|
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
|
|
|
|
From pgsql-hackers-owner+M18449=candle.pha.pa.us=pgman@postgresql.org Mon Feb 4 21:16:56 2002
|
|
Return-path: <pgsql-hackers-owner+M18449=candle.pha.pa.us=pgman@postgresql.org>
|
|
Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
|
|
by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g152GtP10503
|
|
for <pgman@candle.pha.pa.us>; Mon, 4 Feb 2002 21:16:55 -0500 (EST)
|
|
Received: (qmail 6711 invoked by alias); 5 Feb 2002 02:16:53 -0000
|
|
Received: from unknown (HELO postgresql.org) (64.49.215.8)
|
|
by www.postgresql.org with SMTP; 5 Feb 2002 02:16:53 -0000
|
|
Received: from snoopy.mohawksoft.com (h0050bf7a618d.ne.mediaone.net [24.147.138.78])
|
|
by postgresql.org (8.11.3/8.11.4) with ESMTP id g151qSl99469
|
|
for <pgsql-hackers@postgresql.org>; Mon, 4 Feb 2002 20:52:28 -0500 (EST)
|
|
(envelope-from markw@mohawksoft.com)
|
|
Received: from mohawksoft.com (localhost [127.0.0.1])
|
|
by snoopy.mohawksoft.com (8.11.6/8.11.6) with ESMTP id g151lph09147;
|
|
Mon, 4 Feb 2002 20:47:51 -0500
|
|
Message-ID: <3C5F39C7.970F4549@mohawksoft.com>
|
|
Date: Mon, 04 Feb 2002 20:47:51 -0500
|
|
From: mlw <markw@mohawksoft.com>
|
|
X-Mailer: Mozilla 4.78 [en] (X11; U; Linux 2.4.17 i686)
|
|
X-Accept-Language: en
|
|
MIME-Version: 1.0
|
|
To: Steven <ssinger@navtechinc.com>
|
|
cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
|
|
Subject: Re: [HACKERS] Replication
|
|
References: <Pine.LNX.4.33.0202050040190.24027-100000@pcNavYkfAdm1.ykf.navtechinc.com>
|
|
Content-Type: text/plain; charset=us-ascii
|
|
Content-Transfer-Encoding: 7bit
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
Steven wrote:
>
> On Mon, 4 Feb 2002, mlw wrote:
>
> I've developed a replacement for Rserv and we are planning on releasing
> it as open source (i.e. as a contrib module).
>
> Like Rserv it's trigger based, but it's much more flexible.
> The key advantages it has over Rserv are:
> - Support for multiple slaves
> - It preserves transactions while doing the mirroring, i.e. if rows A and B
> are originally added in the same transaction, they will be mirrored in the
> same transaction.
|
|
|
|
I did a similar thing. I took the rserv trigger "as is," but rewrote the
replication support code. What I eventually did was write a "snapshot daemon"
which created snapshot files, and then a "slave daemon" which would check the
last snapshot applied and apply all the later snapshots, in order, as needed.
One would run one of these daemons per slave server.
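A minimal sketch of the catch-up step a slave daemon like this might perform, in Python. The message does not describe the snapshot file format, so the integer snapshot ids, the (key, value) change lists, and the Slave class are assumptions for illustration only.

```python
class Slave:
    """Stand-in for one slave server; remembers the last snapshot it applied."""

    def __init__(self):
        self.last_applied = 0
        self.rows = {}

    def apply(self, snapshot_id, changes):
        # Snapshots must be applied strictly in order, with no gaps.
        expected = self.last_applied + 1
        if snapshot_id != expected:
            raise ValueError(f"expected snapshot {expected}, got {snapshot_id}")
        for key, value in changes:
            if value is None:
                self.rows.pop(key, None)    # None marks a deleted row
            else:
                self.rows[key] = value
        self.last_applied = snapshot_id


def catch_up(slave, snapshots):
    """Apply every snapshot newer than the slave's last one, in id order."""
    pending = sorted(sid for sid in snapshots if sid > slave.last_applied)
    for sid in pending:
        slave.apply(sid, snapshots[sid])
    return pending


if __name__ == "__main__":
    snapshots = {
        1: [("a", 1)],
        2: [("a", 2), ("b", 5)],
        3: [("b", None)],
    }
    slave = Slave()
    print(catch_up(slave, snapshots), slave.rows)
```

Because each slave tracks only its own last applied id, a slave that was offline for a while catches up by itself the next time its daemon runs.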
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 5: Have you checked our extensive FAQ?
|
|
|
|
http://www.postgresql.org/users-lounge/docs/faq.html
|
|
|
|
From pgsql-hackers-owner+M18448=candle.pha.pa.us=pgman@postgresql.org Mon Feb 4 20:57:25 2002
|
|
Return-path: <pgsql-hackers-owner+M18448=candle.pha.pa.us=pgman@postgresql.org>
|
|
Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
|
|
by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g151vOP09239
|
|
for <pgman@candle.pha.pa.us>; Mon, 4 Feb 2002 20:57:24 -0500 (EST)
|
|
Received: (qmail 99828 invoked by alias); 5 Feb 2002 01:57:19 -0000
|
|
Received: from unknown (HELO postgresql.org) (64.49.215.8)
|
|
by www.postgresql.org with SMTP; 5 Feb 2002 01:57:19 -0000
|
|
Received: from snoopy.mohawksoft.com (h0050bf7a618d.ne.mediaone.net [24.147.138.78])
|
|
by postgresql.org (8.11.3/8.11.4) with ESMTP id g151s0l99529
|
|
for <pgsql-hackers@postgresql.org>; Mon, 4 Feb 2002 20:54:00 -0500 (EST)
|
|
(envelope-from markw@mohawksoft.com)
|
|
Received: from mohawksoft.com (localhost [127.0.0.1])
|
|
by snoopy.mohawksoft.com (8.11.6/8.11.6) with ESMTP id g151nah09156;
|
|
Mon, 4 Feb 2002 20:49:37 -0500
|
|
Message-ID: <3C5F3A30.A4C46FB8@mohawksoft.com>
|
|
Date: Mon, 04 Feb 2002 20:49:36 -0500
|
|
From: mlw <markw@mohawksoft.com>
|
|
X-Mailer: Mozilla 4.78 [en] (X11; U; Linux 2.4.17 i686)
|
|
X-Accept-Language: en
|
|
MIME-Version: 1.0
|
|
To: bpalmer <bpalmer@crimelabs.net>
|
|
cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
|
|
Subject: Re: [HACKERS] Replication
|
|
References: <Pine.BSO.4.43.0202041955420.17121-100000@mizer.crimelabs.net>
|
|
Content-Type: text/plain; charset=us-ascii
|
|
Content-Transfer-Encoding: 7bit
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
bpalmer wrote:
>
> >
> > OK, what the hell do we need to do to get PostgreSQL replicating?
>
> I hope you understand that replication, done right, is a massive
> project. I know that Darren and myself (and the rest of the pg-repl
> folks) have been waiting till 7.2 went gold before we did any more work. I
> think we hope to have master / slave replication working for 7.3 and then
> target multimaster for 7.4. At least that's the hope.

I do know how hard replication is. I also understand how important it is.

If you guys have a project going, and need developers, I am more than willing.
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 5: Have you checked our extensive FAQ?
|
|
|
|
http://www.postgresql.org/users-lounge/docs/faq.html
|
|
|
|
From pgsql-hackers-owner+M18450=candle.pha.pa.us=pgman@postgresql.org Mon Feb 4 21:42:13 2002
|
|
Return-path: <pgsql-hackers-owner+M18450=candle.pha.pa.us=pgman@postgresql.org>
|
|
Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
|
|
by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g152gCP11957
|
|
for <pgman@candle.pha.pa.us>; Mon, 4 Feb 2002 21:42:13 -0500 (EST)
|
|
Received: (qmail 14229 invoked by alias); 5 Feb 2002 02:42:09 -0000
|
|
Received: from unknown (HELO postgresql.org) (64.49.215.8)
|
|
by www.postgresql.org with SMTP; 5 Feb 2002 02:42:09 -0000
|
|
Received: from www1.navtechinc.com ([192.234.226.140])
|
|
by postgresql.org (8.11.3/8.11.4) with ESMTP id g152SBl10682
|
|
for <pgsql-hackers@postgresql.org>; Mon, 4 Feb 2002 21:28:11 -0500 (EST)
|
|
(envelope-from ssinger@navtechinc.com)
|
|
Received: from pcNavYkfAdm1.ykf.navtechinc.com (wall [192.234.226.190])
|
|
by www1.navtechinc.com (8.9.3/8.9.3) with ESMTP id CAA06384;
|
|
Tue, 5 Feb 2002 02:28:13 GMT
|
|
Received: from localhost (ssinger@localhost)
|
|
by pcNavYkfAdm1.ykf.navtechinc.com (8.9.3/8.9.3) with ESMTP id CAA10682;
|
|
Tue, 5 Feb 2002 02:27:35 GMT
|
|
Date: Tue, 5 Feb 2002 02:27:35 +0000 (GMT)
|
|
From: Steven <ssinger@navtechinc.com>
|
|
X-X-Sender: <ssinger@pcNavYkfAdm1.ykf.navtechinc.com>
|
|
To: mlw <markw@mohawksoft.com>
|
|
cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
|
|
Subject: Re: [HACKERS] Replication
|
|
In-Reply-To: <3C5F39C7.970F4549@mohawksoft.com>
|
|
Message-ID: <Pine.LNX.4.33.0202050159591.26756-100000@pcNavYkfAdm1.ykf.navtechinc.com>
|
|
MIME-Version: 1.0
|
|
Content-Type: TEXT/PLAIN; charset=US-ASCII
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
|
|
DBMirror doesn't use snapshots; instead it records a log of the transactions
that are committed to the database in a pair of tables.
In the case of an INSERT, this is the row that is being added.
In the case of a DELETE, the primary key of the row being deleted.
And in the case of an UPDATE, the primary key before the update along with
all of the data the row should have after the update.

Then for each slave database a perl script walks through the transactions
that are pending for that host and reconstructs SQL to send the row edits
to that host. A record of the fact that transaction Y has been sent to
host X is also kept.

When a transaction has been sent to all of the hosts that are in the
system, it is then deleted from the Pending tables.
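A minimal in-memory sketch of the bookkeeping described above, in Python rather than perl. The PendingLog class, its field names, and the rendered statement strings are assumptions for illustration; they are not DBMirror's actual table layout or SQL, and the strings are not safely quoted.

```python
def render(op, pk, data):
    """Render one row edit as an illustrative (not safely quoted) statement."""
    if op == "INSERT":
        return f"INSERT {data}"
    if op == "DELETE":
        return f"DELETE WHERE pk = {pk}"
    if op == "UPDATE":
        return f"UPDATE SET {data} WHERE pk = {pk}"
    raise ValueError(f"unknown operation: {op}")


class PendingLog:
    """In-memory stand-in for the pair of Pending tables."""

    def __init__(self, hosts):
        self.hosts = set(hosts)
        self.pending = {}      # xid -> list of (op, primary_key, row_data)
        self.sent = {}         # xid -> set of hosts that already received it
        self.next_xid = 1

    def record(self, edits):
        """Log one committed transaction; edits are (op, primary_key, row_data)."""
        xid = self.next_xid
        self.next_xid += 1
        self.pending[xid] = list(edits)
        self.sent[xid] = set()
        return xid

    def ship(self, host):
        """Build the statements still owed to `host`, oldest transaction first."""
        statements = []
        for xid in sorted(self.pending):       # sorted() copies the keys
            if host in self.sent[xid]:
                continue
            statements.append("BEGIN")         # keep each transaction together
            for op, pk, data in self.pending[xid]:
                statements.append(render(op, pk, data))
            statements.append("COMMIT")
            self.sent[xid].add(host)
            if self.sent[xid] >= self.hosts:   # every host has it: forget it
                del self.pending[xid], self.sent[xid]
        return statements


if __name__ == "__main__":
    log = PendingLog(["host_x", "host_y"])
    log.record([("INSERT", 1, {"id": 1}), ("DELETE", 2, None)])
    print(log.ship("host_x"))
    print(log.ship("host_y"), log.pending)
```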
|
|
|
|
I suspect that all of the information I'm storing in the Pending tables is
also being stored by Postgres in its log, but I haven't investigated how
the information could be extracted (or how long it is kept for). That
would reduce the extra storage overhead that the replication system
imposes.

As I remember (it's been a while since I've looked at it), RServ uses OIDs
in its tables to point to the data that needs to be replicated. We tried
a similar approach but found difficulties with doing partial updates.

On Mon, 4 Feb 2002, mlw wrote:

> I did a similar thing. I took the rserv trigger "as is," but rewrote the
> replication support code. What I eventually did was write a "snapshot daemon"
> which created snapshot files. Then a "slave daemon" which would check the last
> snapshot applied and apply all the snapshots, in order, as needed. One would
> run one of these daemons per slave server.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
--
|
|
Steven Singer ssinger@navtechinc.com
|
|
Aircraft Performance Systems Phone: 519-747-1170 ext 282
|
|
Navtech Systems Support Inc. AFTN: CYYZXNSX SITA: YYZNSCR
|
|
Waterloo, Ontario ARINC: YKFNSCR
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 2: you can get off all lists at once with the unregister command
|
|
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
|
|
|
|
From pgsql-hackers-owner+M18554=candle.pha.pa.us=pgman@postgresql.org Thu Feb 7 02:49:48 2002
|
|
Return-path: <pgsql-hackers-owner+M18554=candle.pha.pa.us=pgman@postgresql.org>
|
|
Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
|
|
by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g177nlP04347
|
|
for <pgman@candle.pha.pa.us>; Thu, 7 Feb 2002 02:49:47 -0500 (EST)
|
|
Received: (qmail 22556 invoked by alias); 7 Feb 2002 07:49:49 -0000
|
|
Received: from unknown (HELO postgresql.org) (64.49.215.8)
|
|
by www.postgresql.org with SMTP; 7 Feb 2002 07:49:49 -0000
|
|
Received: from linuxworld.com.au (www.linuxworld.com.au [203.34.46.50])
|
|
by postgresql.org (8.11.3/8.11.4) with ESMTP id g177QfE19572
|
|
for <pgsql-hackers@postgresql.org>; Thu, 7 Feb 2002 02:26:42 -0500 (EST)
|
|
(envelope-from swm@linuxworld.com.au)
|
|
Received: from localhost (swm@localhost)
|
|
by linuxworld.com.au (8.11.4/8.11.4) with ESMTP id g177RiU06086;
|
|
Thu, 7 Feb 2002 18:27:45 +1100
|
|
Date: Thu, 7 Feb 2002 18:27:44 +1100 (EST)
|
|
From: Gavin Sherry <swm@linuxworld.com.au>
|
|
To: mlw <markw@mohawksoft.com>
|
|
cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
|
|
Subject: Re: [HACKERS] Replication
|
|
In-Reply-To: <3C5F22F8.C9B958F0@mohawksoft.com>
|
|
Message-ID: <Pine.LNX.4.21.0202071751240.5160-100000@linuxworld.com.au>
|
|
MIME-Version: 1.0
|
|
Content-Type: TEXT/PLAIN; charset=US-ASCII
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
On Mon, 4 Feb 2002, mlw wrote:

> I re-wrote RServ.pm to C, and wrote a replication daemon. It works, but it
> works like the whole rserv project. I don't like it.
>
> OK, what the hell do we need to do to get PostgreSQL replicating?

The trigger model is not a very sophisticated one. I think I have a better
-- though more complicated -- one. This model would be able to handle
multiple masters and master->slave.
|
|
|
|
First of all, all machines in the cluster would have to be aware of all the
machines in the cluster. This would have to be stored in a new system
table.

The FE/BE protocol would need to be modified to accept parsed node trees
generated by pg_analyze_and_rewrite(). These could then be dispatched by
the executing server, inside of pg_exec_query_string(), to all other servers
in the cluster (excluding itself). Naturally, this dispatch would need to
be non-blocking.

pg_exec_query_string() would need to check the nodetags to make sure
selects and perhaps some commands are not dispatched.

Before the executing server runs finish_xact_command(), it would check
that the query was successfully executed on all machines, and otherwise
abort. Such a system would need a few configuration options: whether or
not you abort on failed replication to slaves, the ability to replicate
only certain tables, etc.
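A minimal sketch of the dispatch-then-check flow described above, in Python. It is not backend code: statements are plain strings rather than parsed node trees, the SELECT test by string prefix stands in for the nodetag check, and replicate_statement(), its abort_on_failure option, and the callable peers are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def _succeeded(future):
    """Treat an exception on a peer the same as a reported failure."""
    try:
        return bool(future.result())
    except Exception:
        return False


def replicate_statement(statement, local, peers, abort_on_failure=True):
    """Run `statement` locally and on every peer; return True if it may commit.

    `local` and each peer are hypothetical callables that execute the
    statement and return True on success.
    """
    if statement.lstrip().upper().startswith("SELECT"):
        return local(statement)            # reads are never dispatched

    with ThreadPoolExecutor(max_workers=max(1, len(peers))) as pool:
        # Dispatch to all peers without waiting, then do the local work.
        futures = [pool.submit(peer, statement) for peer in peers]
        local_ok = local(statement)
        peers_ok = all([_succeeded(f) for f in futures])

    if not local_ok:
        return False                       # abort: local execution failed
    if abort_on_failure and not peers_ok:
        return False                       # abort: at least one peer failed
    return True                            # safe to commit


if __name__ == "__main__":
    ok = lambda stmt: True
    bad = lambda stmt: False
    print(replicate_statement("UPDATE t SET x = 1", ok, [ok, ok]))
    print(replicate_statement("UPDATE t SET x = 1", ok, [ok, bad]))
```

The write still waits for the slowest peer before it can commit, which is the write slowdown the message goes on to acknowledge.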
|
|
|
|
Naturally, this would slow down writes to the system (possibly a lot,
depending on the performance difference between the executing machine and
the least powerful machine in the cluster), but most usages of PostgreSQL
are read intensive, not write intensive.

Any reason this model would not work?
|
|
|
|
Gavin
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 4: Don't 'kill -9' the postmaster
|
|
|
|
From pgsql-hackers-owner+M18558=candle.pha.pa.us=pgman@postgresql.org Thu Feb 7 08:31:00 2002
|
|
Return-path: <pgsql-hackers-owner+M18558=candle.pha.pa.us=pgman@postgresql.org>
|
|
Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
|
|
by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g17DUxP13923
|
|
for <pgman@candle.pha.pa.us>; Thu, 7 Feb 2002 08:30:59 -0500 (EST)
|
|
Received: (qmail 91796 invoked by alias); 7 Feb 2002 13:30:55 -0000
|
|
Received: from unknown (HELO postgresql.org) (64.49.215.8)
|
|
by www.postgresql.org with SMTP; 7 Feb 2002 13:30:55 -0000
|
|
Received: from snoopy.mohawksoft.com (h0050bf7a618d.ne.mediaone.net [24.147.138.78])
|
|
by postgresql.org (8.11.3/8.11.4) with ESMTP id g17Cw0E87782
|
|
for <pgsql-hackers@postgresql.org>; Thu, 7 Feb 2002 07:58:01 -0500 (EST)
|
|
(envelope-from markw@mohawksoft.com)
|
|
Received: from mohawksoft.com (localhost [127.0.0.1])
|
|
by snoopy.mohawksoft.com (8.11.6/8.11.6) with ESMTP id g17CqNt16887;
|
|
Thu, 7 Feb 2002 07:52:24 -0500
|
|
Message-ID: <3C627887.CC9FF837@mohawksoft.com>
|
|
Date: Thu, 07 Feb 2002 07:52:23 -0500
|
|
From: mlw <markw@mohawksoft.com>
|
|
X-Mailer: Mozilla 4.78 [en] (X11; U; Linux 2.4.17 i686)
|
|
X-Accept-Language: en
|
|
MIME-Version: 1.0
|
|
To: Gavin Sherry <swm@linuxworld.com.au>
|
|
cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
|
|
Subject: Re: [HACKERS] Replication
|
|
References: <Pine.LNX.4.21.0202071751240.5160-100000@linuxworld.com.au>
|
|
Content-Type: text/plain; charset=us-ascii
|
|
Content-Transfer-Encoding: 7bit
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
Gavin Sherry wrote:
|
|
> Naturally, this would slow down writes to the system (possibly a lot
|
|
> depending on the performance difference between the executing machine and
|
|
> the least powerful machine in the cluster), but most usages of postgresql
|
|
> are read intensive, not write.
|
|
>
|
|
> Any reason this model would not work?
|
|
|
|
What, then is the purpose of replication to multiple masters?
|
|
|
|
I can think of only two reasons why you want replication. (1) Redundancy, make
|
|
sure that if one server dies, then another server has the same data and is used
|
|
seamlessly. (2) Increase performance over one system.
|
|
|
|
For reason (1), I submit that a server load balancer which sits on top of
|
|
PostgreSQL, and executes writes on both servers while distributing reads would
|
|
be best. This is a HUGE project. The load balancer must know EXACTLY how the
|
|
system is configured, which includes all functions and everything.
|
|
|
|
In reason (2) your system would fail to provide the scalability that would be
|
|
needed. If writes take a long time, but reads are fine, what is the difference
|
|
between that and the trigger-based replicator?
|
|
|
|
I have in the back of my mind, an idea of patching into the WAL stuff, and
|
|
using that mechanism to push changes out to the slaves.
|
|
|
|
Where one machine is still the master, but no trigger stuff, just a WAL patch.
|
|
Perhaps some shared memory paradigm to manage WAL visibility? I'm not sure
|
|
exactly, the idea hasn't completely formed yet.
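
To make the WAL idea above concrete, here is a minimal Python sketch of a
single master pushing its log to a slave. It is only an illustration under
stated assumptions: the Master and Replica classes and their methods are
hypothetical, and nothing here reflects the actual PostgreSQL WAL format or
any existing patch.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A read-only copy that applies log records in order (hypothetical)."""
    data: dict = field(default_factory=dict)
    applied: int = 0  # number of log records applied so far

    def apply(self, records):
        for key, value in records:
            self.data[key] = value
            self.applied += 1


@dataclass
class Master:
    """Single writer: every change is logged first, then applied locally."""
    data: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def write(self, key, value):
        self.log.append((key, value))  # log before touching the data
        self.data[key] = value

    def push(self, replica):
        # Send only the records the replica has not applied yet.
        replica.apply(self.log[replica.applied:])


master = Master()
replica = Replica()
master.write("a", 1)
master.write("b", 2)
master.push(replica)
master.write("a", 3)
master.push(replica)
print(replica.data == master.data)
```

No triggers are involved: the slave only ever sees the log, and it can fall
behind and catch up later because it remembers how far it has applied.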
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 5: Have you checked our extensive FAQ?
|
|
|
|
http://www.postgresql.org/users-lounge/docs/faq.html
|
|
|
|
From pgsql-hackers-owner+M18574=candle.pha.pa.us=pgman@postgresql.org Thu Feb 7 12:51:42 2002
|
|
Return-path: <pgsql-hackers-owner+M18574=candle.pha.pa.us=pgman@postgresql.org>
|
|
Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
|
|
by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g17HpfP16661
|
|
for <pgman@candle.pha.pa.us>; Thu, 7 Feb 2002 12:51:41 -0500 (EST)
|
|
Received: (qmail 62955 invoked by alias); 7 Feb 2002 17:50:42 -0000
|
|
Received: from unknown (HELO postgresql.org) (64.49.215.8)
|
|
by www.postgresql.org with SMTP; 7 Feb 2002 17:50:42 -0000
|
|
Received: from www1.navtechinc.com ([192.234.226.140])
|
|
by postgresql.org (8.11.3/8.11.4) with ESMTP id g17HnTE62256
|
|
for <pgsql-hackers@postgresql.org>; Thu, 7 Feb 2002 12:49:29 -0500 (EST)
|
|
(envelope-from ssinger@navtechinc.com)
|
|
Received: from pcNavYkfAdm1.ykf.navtechinc.com (wall [192.234.226.190])
|
|
by www1.navtechinc.com (8.9.3/8.9.3) with ESMTP id RAA07908;
|
|
Thu, 7 Feb 2002 17:49:31 GMT
|
|
Received: from localhost (ssinger@localhost)
|
|
by pcNavYkfAdm1.ykf.navtechinc.com (8.9.3/8.9.3) with ESMTP id RAA05687;
|
|
Thu, 7 Feb 2002 17:48:52 GMT
|
|
Date: Thu, 7 Feb 2002 17:48:51 +0000 (GMT)
|
|
From: Steven Singer <ssinger@navtechinc.com>
|
|
X-X-Sender: <ssinger@pcNavYkfAdm1.ykf.navtechinc.com>
|
|
To: Gavin Sherry <swm@linuxworld.com.au>
|
|
cc: mlw <markw@mohawksoft.com>,
|
|
PostgreSQL-development <pgsql-hackers@postgresql.org>
|
|
Subject: Re: [HACKERS] Replication
|
|
In-Reply-To: <Pine.LNX.4.21.0202071751240.5160-100000@linuxworld.com.au>
|
|
Message-ID: <Pine.LNX.4.33.0202071735360.6435-100000@pcNavYkfAdm1.ykf.navtechinc.com>
|
|
MIME-Version: 1.0
|
|
Content-Type: TEXT/PLAIN; charset=US-ASCII
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
|
|
|
|
What you describe sounds like a form of a two-stage commit protocol.
|
|
|
|
If the command worked on two of the replicated databases but failed on a
|
|
third then the executing server would have to be able to undo the command
|
|
on the replicated databases as well as itself.
|
|
|
|
The problems with two stage commit type approaches to replication are
|
|
1) Speed as you mentioned. Write speed isn't a concern for some
|
|
applications but it is very important in others.
|
|
|
|
and
|
|
2) All of the databases must be able to communicate with each other at
|
|
all times in order for any edits to work. If the servers are
|
|
connected over some sort of WAN that periodically has short outages this
|
|
is a problem. Also, if you're using replication because you want to be able
|
|
to take down one of the databases for short periods of time without
|
|
bringing down the others, you're in trouble.
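
A minimal Python sketch of the commit-everywhere-or-undo behaviour described
above, showing why one unreachable server blocks every edit. The Node class
and replicated_write() function are hypothetical names used only for
illustration; this is not PostgreSQL code.

```python
class Node:
    """Hypothetical replica that can stage, commit, or discard one change."""

    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.data = {}
        self.staged = None

    def prepare(self, key, value):
        if not self.reachable:
            return False  # an unreachable node cannot vote yes
        self.staged = (key, value)
        return True

    def commit(self):
        key, value = self.staged
        self.data[key] = value
        self.staged = None

    def rollback(self):
        self.staged = None


def replicated_write(nodes, key, value):
    """Return True if the write committed everywhere, False if it was undone."""
    prepared = []
    for node in nodes:
        if node.prepare(key, value):
            prepared.append(node)
        else:
            # One failure is enough: undo on every node that already staged.
            for staged_node in prepared:
                staged_node.rollback()
            return False
    for node in nodes:
        node.commit()
    return True


nodes = [Node("a"), Node("b"), Node("c")]
ok = replicated_write(nodes, "x", 1)
nodes[2].reachable = False          # simulate a short WAN outage
failed = replicated_write(nodes, "x", 2)
print(ok, failed, nodes[0].data)
```

With the third node unreachable the second write is refused on all nodes,
which is exactly the availability problem in point 2.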
|
|
|
|
|
|
btw: I posted the alternative to Rserv that I mentioned the other day to
|
|
the pg-patches mailing list. If anyone is interested, you should be able
|
|
to grab it off the archives.
|
|
|
|
On Thu, 7 Feb 2002, Gavin Sherry wrote:
|
|
|
|
>
|
|
> First of all, all machines in the cluster would have to be aware of all the
|
|
> machines in the cluster. This would have to be stored in a new system
|
|
> table.
|
|
>
|
|
> The FE/BE protocol would need to be modified to accept parsed node trees
|
|
> generated by pg_analyze_and_rewrite(). These could then be dispatched by
|
|
> the executing server, inside of pg_exec_query_string, to all other servers
|
|
> in the cluster (excluding itself). Naturally, this dispatch would need to
|
|
> be non-blocking.
|
|
>
|
|
> pg_exec_query_string() would need to check the nodetags to make sure
|
|
> selects and perhaps some commands are not dispatched.
|
|
>
|
|
> Before the executing server runs finish_xact_command(), it would check
|
|
> that the query was successfully executed on all machines otherwise
|
|
> abort. Such a system would need a few configuration options: whether or
|
|
> not you abort on failed replication to slaves, the ability to replicate
|
|
> only certain tables, etc.
|
|
>
|
|
> Naturally, this would slow down writes to the system (possibly a lot
|
|
> depending on the performance difference between the executing machine and
|
|
> the least powerful machine in the cluster), but most usages of postgresql
|
|
> are read intensive, not write.
|
|
>
|
|
> Any reason this model would not work?
|
|
>
|
|
> Gavin
|
|
>
|
|
>
|
|
> ---------------------------(end of broadcast)---------------------------
|
|
> TIP 4: Don't 'kill -9' the postmaster
|
|
>
|
|
|
|
--
|
|
Steven Singer ssinger@navtechinc.com
|
|
Aircraft Performance Systems Phone: 519-747-1170 ext 282
|
|
Navtech Systems Support Inc. AFTN: CYYZXNSX SITA: YYZNSCR
|
|
Waterloo, Ontario ARINC: YKFNSCR
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
|
|
|
|
From pgsql-hackers-owner+M18590=candle.pha.pa.us=pgman@postgresql.org Thu Feb 7 17:50:42 2002
|
|
Return-path: <pgsql-hackers-owner+M18590=candle.pha.pa.us=pgman@postgresql.org>
|
|
Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
|
|
by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g17MoeP27121
|
|
for <pgman@candle.pha.pa.us>; Thu, 7 Feb 2002 17:50:40 -0500 (EST)
|
|
Received: (qmail 39930 invoked by alias); 7 Feb 2002 22:50:17 -0000
|
|
Received: from unknown (HELO postgresql.org) (64.49.215.8)
|
|
by www.postgresql.org with SMTP; 7 Feb 2002 22:50:17 -0000
|
|
Received: from odin.fts.net (wall.icgate.net [209.26.177.2])
|
|
by postgresql.org (8.11.3/8.11.4) with ESMTP id g17Ma4E38041
|
|
for <pgsql-hackers@postgresql.org>; Thu, 7 Feb 2002 17:36:04 -0500 (EST)
|
|
(envelope-from fharvell@odin.fts.net)
|
|
Received: from odin.fts.net (fharvell@localhost)
|
|
by odin.fts.net (8.11.6/8.11.6) with ESMTP id g17MZhR17707;
|
|
Thu, 7 Feb 2002 17:35:43 -0500
|
|
Message-ID: <200202072235.g17MZhR17707@odin.fts.net>
|
|
X-Mailer: exmh version 2.2 06/23/2000 with nmh-1.0.4
|
|
From: F Harvell <fharvell@fts.net>
|
|
To: mlw <markw@mohawksoft.com>
|
|
cc: Gavin Sherry <swm@linuxworld.com.au>,
|
|
PostgreSQL-development <pgsql-hackers@postgresql.org>
|
|
Subject: Re: [HACKERS] Replication
|
|
In-Reply-To: Message from mlw
|
|
of "Thu, 07 Feb 2002 07:52:23 EST."
|
|
<3C627887.CC9FF837@mohawksoft.com>
|
|
MIME-Version: 1.0
|
|
Content-Type: text/plain; charset=us-ascii
|
|
Date: Thu, 07 Feb 2002 17:35:43 -0500
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
I'm not that familiar with the replication issues in PostgreSQL,
|
|
however, I would be partial to replication that was based upon the
|
|
playback of the (a?) journal file. (I believe that the WAL is a
|
|
journal file.)
|
|
|
|
By being based upon a journal file, it would be possible to accomplish
|
|
two significant items. First, it would be possible to "restore" a
|
|
database to an exact state just before a failure. Most commercial
|
|
databases provide the ability to do this. Banks, etc. log the journal
|
|
files directly to tape to provide a complete transaction history such
|
|
that they can rebuild their database from any given snapshot. (Note
|
|
that the journal file needs to be "editable" as a failure may be
|
|
"delete from x" with a missing where clause.)
|
|
|
|
This leads directly into the second advantage, the ability to have a
|
|
replicated database operating anywhere, over any connection on any
|
|
server. Speed of writes would not be a factor. In essence, as long
|
|
as the replicated database had a snapshot of the database and then was
|
|
provided with all journal files since the snapshot, it would be
|
|
possible to build a current database. If the replicant got behind in
|
|
the processing, it would catch up when things slowed down.
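
As a rough illustration of the two points above (restore to a state just
before a failure, and build a current copy from a snapshot plus journal
files), here is a small Python sketch. The journal layout and the replay()
function are assumptions made for the example, not an existing PostgreSQL
interface.

```python
import copy


def replay(snapshot, journal, stop_before=None, skip=()):
    """Rebuild a database state from a snapshot plus journal entries.

    journal     -- list of (sequence_number, operation, key, value)
    stop_before -- replay only entries with a lower sequence number
    skip        -- sequence numbers to leave out (the "editable" journal)
    """
    state = copy.deepcopy(snapshot)
    for seq, op, key, value in journal:
        if stop_before is not None and seq >= stop_before:
            break
        if seq in skip:
            continue
        if op == "set":
            state[key] = value
        elif op == "delete":
            state.pop(key, None)
    return state


snapshot = {"alice": 100}
journal = [
    (1, "set", "bob", 50),
    (2, "delete", "alice", None),  # the mistaken statement
    (3, "set", "carol", 75),
]
print(replay(snapshot, journal))                 # full replay: a replica
print(replay(snapshot, journal, stop_before=2))  # state just before the mistake
print(replay(snapshot, journal, skip={2}))       # mistake edited out
```

The same routine serves both purposes: a replica is a full replay that never
stops, and a restore is a replay that stops early or skips an entry.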
|
|
|
|
In my opinion, the first advantage is in many ways most important.
|
|
Replication becomes simply the restoration of the database in realtime
|
|
on a second server. The "replication" task becomes the definition of
|
|
a protocol for distributing the journal file. At least one major
|
|
database vendor does replication (shadowing) in exactly this manner.
|
|
|
|
Maybe I'm all wet and the journal file and journal playback already
|
|
exist. If so, IMHO, basing replication on this would be the
|
|
right direction.
|
|
|
|
|
|
On Thu, 07 Feb 2002 07:52:23 EST, mlw wrote:
|
|
>
|
|
> I have in the back of my mind, an idea of patching into the WAL stuff, and
|
|
> using that mechanism to push changes out to the slaves.
|
|
>
|
|
> Where one machine is still the master, but no trigger stuff, just a WAL patch.
|
|
> Perhaps some shared memory paradigm to manage WAL visibility? I'm not sure
|
|
> exactly, the idea hasn't completely formed yet.
|
|
>
|
|
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 4: Don't 'kill -9' the postmaster
|
|
|
|
From pgsql-hackers-owner+M18605=candle.pha.pa.us=pgman@postgresql.org Fri Feb 8 00:50:08 2002
|
|
Return-path: <pgsql-hackers-owner+M18605=candle.pha.pa.us=pgman@postgresql.org>
|
|
Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
|
|
by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g185o7P27878
|
|
for <pgman@candle.pha.pa.us>; Fri, 8 Feb 2002 00:50:07 -0500 (EST)
|
|
Received: (qmail 17348 invoked by alias); 8 Feb 2002 05:50:03 -0000
|
|
Received: from unknown (HELO postgresql.org) (64.49.215.8)
|
|
by www.postgresql.org with SMTP; 8 Feb 2002 05:50:03 -0000
|
|
Received: from lakemtao03.mgt.cox.net (mtao3.east.cox.net [68.1.17.242])
|
|
by postgresql.org (8.11.3/8.11.4) with ESMTP id g185cTE15241
|
|
for <pgsql-hackers@postgresql.org>; Fri, 8 Feb 2002 00:38:29 -0500 (EST)
|
|
(envelope-from darren.johnson@cox.net)
|
|
Received: from cox.net ([68.10.181.230]) by lakemtao03.mgt.cox.net
|
|
(InterMail vM.5.01.04.05 201-253-122-122-105-20011231) with ESMTP
|
|
id <20020208053833.YKTV6710.lakemtao03.mgt.cox.net@cox.net>
|
|
for <pgsql-hackers@postgresql.org>;
|
|
Fri, 8 Feb 2002 00:38:33 -0500
|
|
Message-ID: <3C636232.6060206@cox.net>
|
|
Date: Fri, 08 Feb 2002 00:29:22 -0500
|
|
From: Darren Johnson <darren.johnson@cox.net>
|
|
User-Agent: Mozilla/5.0 (Windows; U; WinNT4.0; en-US; m18) Gecko/20001108 Netscape6/6.0
|
|
X-Accept-Language: en
|
|
MIME-Version: 1.0
|
|
To: PostgreSQL-development <pgsql-hackers@postgresql.org>
|
|
Subject: Re: [HACKERS] Replication
|
|
References: <Pine.LNX.4.33.0202071735360.6435-100000@pcNavYkfAdm1.ykf.navtechinc.com>
|
|
Content-Type: text/plain; charset=us-ascii; format=flowed
|
|
Content-Transfer-Encoding: 7bit
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
|
|
>
|
|
> The problems with two stage commit type approaches to replication are
|
|
|
|
IMHO the biggest problem with two-phase commit is that it doesn't scale.
The more servers you add to the replica, the slower it goes. Also, there's
the potential for deadlocks across server boundaries.
|
|
|
|
>
|
|
> 2) All of the databases must be able to communicate with each other at
|
|
> all times in order for any edits to work. If the servers are
|
|
> connected over some sort of WAN that periodically has short outages this
|
|
> is a problem. Also, if you're using replication because you want to be able
|
|
> to take down one of the databases for short periods of time without
|
|
> bringing down the others, you're in trouble.
|
|
|
|
All true for the two-phase commit protocol. To have multi-master
replication, you must have all systems communicating, but you can use a
multicast group communication system instead of 2PC. Using total order
messaging, you can ensure all changes are delivered to all servers in the
replica in the same order. This group communication system also allows
failures to be detected while other servers in the replica continue
processing.
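
A minimal Python sketch of the total order idea, assuming a single sequencer
that numbers every change. The Sequencer and Server classes are hypothetical
stand-ins; a real group communication system determines the order without a
central process and also handles membership changes.

```python
import itertools


class Sequencer:
    """Hypothetical stand-in for a group communication system."""

    def __init__(self, members):
        self.members = list(members)
        self.counter = itertools.count(1)

    def broadcast(self, change):
        seq = next(self.counter)  # one global sequence number per change
        for member in self.members:
            member.deliver(seq, change)


class Server:
    def __init__(self):
        self.data = {}
        self.next_seq = 1
        self.pending = {}

    def deliver(self, seq, change):
        # Buffer out-of-order arrivals; apply strictly in sequence order.
        self.pending[seq] = change
        while self.next_seq in self.pending:
            key, value = self.pending.pop(self.next_seq)
            self.data[key] = value
            self.next_seq += 1


servers = [Server(), Server(), Server()]
group = Sequencer(servers)
group.broadcast(("x", 1))   # originated at one server
group.broadcast(("x", 2))   # a conflicting write from another server
print(all(s.data == {"x": 2} for s in servers))
```

Because every server applies the same changes in the same order, conflicting
writes resolve identically everywhere without a prepare/commit round trip.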
|
|
|
|
A few of us are working with this theory and trying to integrate it with
7.2. There is a working model for 6.4, but it's very limited (inserts,
updates, and deletes). We are currently hosted at

http://gborg.postgresql.org/project/pgreplication/projdisplay.php

but the site has been down for the last 2 days. I've contacted the
webmaster, but haven't seen any results yet. If anyone knows what's going
on with gborg, I'd appreciate a status.
|
|
|
|
Darren
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 2: you can get off all lists at once with the unregister command
|
|
(send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
|
|
|
|
From pgsql-hackers-owner+M18617=candle.pha.pa.us=pgman@postgresql.org Fri Feb 8 06:20:44 2002
|
|
Return-path: <pgsql-hackers-owner+M18617=candle.pha.pa.us=pgman@postgresql.org>
|
|
Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
|
|
by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g18BKhP06132
|
|
for <pgman@candle.pha.pa.us>; Fri, 8 Feb 2002 06:20:43 -0500 (EST)
|
|
Received: (qmail 90815 invoked by alias); 8 Feb 2002 11:20:40 -0000
|
|
Received: from unknown (HELO postgresql.org) (64.49.215.8)
|
|
by www.postgresql.org with SMTP; 8 Feb 2002 11:20:40 -0000
|
|
Received: from laptop.kieser.demon.co.uk (kieser.demon.co.uk [62.49.6.72])
|
|
by postgresql.org (8.11.3/8.11.4) with ESMTP id g18B9ZE89589
|
|
for <pgsql-hackers@postgresql.org>; Fri, 8 Feb 2002 06:09:36 -0500 (EST)
|
|
(envelope-from brad@kieser.net)
|
|
Received: from laptop.kieser.demon.co.uk (localhost.localdomain [127.0.0.1])
|
|
by laptop.kieser.demon.co.uk (Postfix) with SMTP
|
|
id 598393A132; Fri, 8 Feb 2002 11:09:36 +0000 (GMT)
|
|
From: Bradley Kieser <brad@kieser.net>
|
|
Date: Fri, 08 Feb 2002 11:09:36 GMT
|
|
Message-ID: <20020208.11093600@laptop.kieser.demon.co.uk>
|
|
Subject: Re: [HACKERS] Replication
|
|
To: Darren Johnson <darren.johnson@cox.net>
|
|
cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
|
|
In-Reply-To: <3C636232.6060206@cox.net>
|
|
References: <Pine.LNX.4.33.0202071735360.6435-100000@pcNavYkfAdm1.ykf.navtechinc.com> <3C636232.6060206@cox.net>
|
|
X-Mailer: Mozilla/3.0 (compatible; StarOffice/5.2;Linux)
|
|
X-Priority: 3 (Normal)
|
|
MIME-Version: 1.0
|
|
Content-Type: text/plain; charset=ISO-8859-1
|
|
Content-Transfer-Encoding: 8bit
|
|
X-MIME-Autoconverted: from quoted-printable to 8bit by postgresql.org id g18BJoF90352
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
Darren,
|
|
Given that different replication strategies will probably be developed
|
|
for PG, do you envisage DBAs being able to select the type of replication
|
|
for their installation? I.e. Replication being selectable rather like
|
|
storage structures?
|
|
|
|
Would be a killer bit of flexibility, given how enormous the impact of
|
|
replication will be to corporate adoption of PG.
|
|
|
|
Brad
|
|
|
|
|
|
>>>>>>>>>>>>>>>>>> Original Message <<<<<<<<<<<<<<<<<<
|
|
|
|
On 2/8/02, 5:29:22 AM, Darren Johnson <darren.johnson@cox.net> wrote
|
|
regarding Re: [HACKERS] Replication:
|
|
|
|
|
|
> >
|
|
> > The problems with two stage commit type approaches to replication are
|
|
|
|
> IMHO the biggest problem with two phased commit is it doesn't scale.
|
|
> The more servers
|
|
> you add to the replica the slower it goes. Also there's the potential
|
|
> for dead locks across
|
|
> server boundaries.
|
|
|
|
> >
|
|
> > 2) All of the databases must be able to communicate with each other at
|
|
> > all times in order for any edits to work. If the servers are
|
|
> > connected over some sort of WAN that periodically has short outages this
|
|
> > is a problem. Also, if you're using replication because you want to be able
|
|
> > to take down one of the databases for short periods of time without
|
|
> > bringing down the others, you're in trouble.
|
|
|
|
> All true for two phased commit protocol. To have multi master
|
|
> replication, you must have all
|
|
> systems communicating, but you can use a multicast group communication
|
|
> system instead of
|
|
> 2PC. Using total order messaging, you can ensure all changes are
|
|
> delivered to all servers in the
|
|
> replica in the same order. This group communication system also allows
|
|
> failures to be detected
|
|
> while other servers in the replica continue processing.
|
|
|
|
> A few of us are working with this theory, and trying to integrate with
|
|
> 7.2. There is a working
|
|
> model for 6.4, but it's very limited. (insert, update, and deletes) We
|
|
> are currently hosted at
|
|
|
|
> http://gborg.postgresql.org/project/pgreplication/projdisplay.php
|
|
> But the site has been down the last 2 days. I've contacted the web
|
|
> master, but haven't seen
|
|
> any results yet. If anyone knows what's going on with gborg, I'd
|
|
> appreciate a status.
|
|
|
|
> Darren
|
|
|
|
|
|
> ---------------------------(end of broadcast)---------------------------
|
|
> TIP 2: you can get off all lists at once with the unregister command
|
|
> (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
|
|
|
|
From pgsql-hackers-owner+M18642=candle.pha.pa.us=pgman@postgresql.org Fri Feb 8 12:40:36 2002
|
|
Return-path: <pgsql-hackers-owner+M18642=candle.pha.pa.us=pgman@postgresql.org>
|
|
Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
|
|
by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g18HeZP08450
|
|
for <pgman@candle.pha.pa.us>; Fri, 8 Feb 2002 12:40:35 -0500 (EST)
|
|
Received: (qmail 74089 invoked by alias); 8 Feb 2002 17:40:30 -0000
|
|
Received: from unknown (HELO postgresql.org) (64.49.215.8)
|
|
by www.postgresql.org with SMTP; 8 Feb 2002 17:40:30 -0000
|
|
Received: from lakemtao03.mgt.cox.net (mtao3.east.cox.net [68.1.17.242])
|
|
by postgresql.org (8.11.3/8.11.4) with ESMTP id g18HbwE73437
|
|
for <pgsql-hackers@postgresql.org>; Fri, 8 Feb 2002 12:37:58 -0500 (EST)
|
|
(envelope-from darren.johnson@cox.net)
|
|
Received: from cox.net ([68.10.181.230]) by lakemtao03.mgt.cox.net
|
|
(InterMail vM.5.01.04.05 201-253-122-122-105-20011231) with ESMTP
|
|
id <20020208173804.DKQS6710.lakemtao03.mgt.cox.net@cox.net>;
|
|
Fri, 8 Feb 2002 12:38:04 -0500
|
|
Message-ID: <3C63FB71.206@cox.net>
|
|
Date: Fri, 08 Feb 2002 11:23:13 -0500
|
|
From: Darren Johnson <darren.johnson@cox.net>
|
|
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; m18) Gecko/20010131 Netscape6/6.01
|
|
X-Accept-Language: en
|
|
MIME-Version: 1.0
|
|
To: Bradley Kieser <brad@kieser.net>
|
|
cc: pgsql-hackers@postgresql.org
|
|
Subject: Re: [HACKERS] Replication
|
|
References: <Pine.LNX.4.33.0202071735360.6435-100000@pcNavYkfAdm1.ykf.navtechinc.com> <3C636232.6060206@cox.net> <20020208.11093600@laptop.kieser.demon.co.uk>
|
|
Content-Type: text/plain; charset=us-ascii; format=flowed
|
|
Content-Transfer-Encoding: 7bit
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
>
|
|
> Given that different replication strategies will probably be developed
|
|
> for PG, do you envisage DBAs to be able to select the type of replication
|
|
> for their installation? I.e. Replication being selectable rather like
|
|
> storage structures?
|
|
|
|
I can't speak for other replication solutions, but we are using the
--with-replication or -r parameter when starting postmaster. Some day I
hope there will be parameters for master/slave, partial/full, and
sync/async, but it will be some time before we cross those bridges.
|
|
|
|
Darren
|
|
|
|
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 6: Have you searched our list archives?
|
|
|
|
http://archives.postgresql.org
|
|
|
|
From pgsql-hackers-owner+M18658=candle.pha.pa.us=pgman@postgresql.org Fri Feb 8 14:42:40 2002
|
|
Return-path: <pgsql-hackers-owner+M18658=candle.pha.pa.us=pgman@postgresql.org>
|
|
Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
|
|
by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g18JgdP28166
|
|
for <pgman@candle.pha.pa.us>; Fri, 8 Feb 2002 14:42:39 -0500 (EST)
|
|
Received: (qmail 18650 invoked by alias); 8 Feb 2002 19:42:39 -0000
|
|
Received: from unknown (HELO postgresql.org) (64.49.215.8)
|
|
by www.postgresql.org with SMTP; 8 Feb 2002 19:42:39 -0000
|
|
Received: from enigma.trueimpact.net (enigma.trueimpact.net [209.82.45.201])
|
|
by postgresql.org (8.11.3/8.11.4) with ESMTP id g18JYBE17341
|
|
for <pgsql-hackers@postgresql.org>; Fri, 8 Feb 2002 14:34:11 -0500 (EST)
|
|
(envelope-from rjonasz@trueimpact.com)
|
|
Received: from nietzsche.trueimpact.net (unknown [209.82.45.200])
|
|
by enigma.trueimpact.net (Postfix) with ESMTP id A785066B04
|
|
for <pgsql-hackers@postgresql.org>; Fri, 8 Feb 2002 14:33:28 -0500 (EST)
|
|
Date: Fri, 8 Feb 2002 14:34:34 -0500 (EST)
|
|
From: Randall Jonasz <rjonasz@trueimpact.com>
|
|
X-X-Sender: <rjonasz@nietzsche.trueimpact.net>
|
|
To: PostgreSQL-development <pgsql-hackers@postgresql.org>
|
|
Subject: Re: [HACKERS] Replication
|
|
In-Reply-To: <3C627887.CC9FF837@mohawksoft.com>
|
|
Message-ID: <20020208142932.H6545-100000@nietzsche.trueimpact.net>
|
|
MIME-Version: 1.0
|
|
Content-Type: TEXT/PLAIN; charset=US-ASCII
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
I've been looking into database replication theory lately and have found
|
|
some interesting papers discussing various approaches. (Here's
|
|
one paper that struck me as being very helpful,
|
|
http://citeseer.nj.nec.com/460405.html ) So far I favour an
|
|
eager replication system which is predicated on a read local/write all
|
|
available. The system should not depend on two phase commit or primary
|
|
copy algorithms. The former leads to the whole system being as quick as
|
|
the slowest machine. In addition, 2 phase commit involves 2n messages for
|
|
each transaction which does not scale well at all. This idea will also
|
|
have to take into account a crashed node which did not ack a transaction.
|
|
The primary copy algorithms I've seen suffer from a single point of
|
|
failure and potential bottlenecks at the primary node.
|
|
|
|
Instead I like the master to master or peer to peer algorithm as discussed
|
|
in the above paper. This approach accounts for network partitions, nodes
|
|
leaving and joining a cluster and the ability to commit a transaction once
|
|
the communication module has determined the total order of the said
|
|
transaction, i.e. no need for waiting for acks. This scales well and
|
|
research has shown it to increase the number of transactions/second a
|
|
database cluster can handle over a single node.
|
|
|
|
Postgres-R is another interesting approach which I think should be taken
|
|
seriously. Anyone interested can read a paper on this at
|
|
http://citeseer.nj.nec.com/330257.html
|
|
|
|
Anyways, my two cents
|
|
|
|
Randall Jonasz
|
|
Software Engineer
|
|
Click2net Inc.
|
|
|
|
|
|
On Thu, 7 Feb 2002, mlw wrote:
|
|
|
|
> Gavin Sherry wrote:
|
|
> > Naturally, this would slow down writes to the system (possibly a lot
|
|
> > depending on the performance difference between the executing machine and
|
|
> > the least powerful machine in the cluster), but most usages of postgresql
|
|
> > are read intensive, not write.
|
|
> >
|
|
> > Any reason this model would not work?
|
|
>
|
|
> What, then is the purpose of replication to multiple masters?
|
|
>
|
|
> I can think of only two reasons why you want replication. (1) Redundancy, make
|
|
> sure that if one server dies, then another server has the same data and is used
|
|
> seamlessly. (2) Increase performance over one system.
|
|
>
|
|
> In reason (1) I submit that a server load balance which sits on top of
|
|
> PostgreSQL, and executes writes on both servers while distributing reads would
|
|
> be best. This is a HUGE project. The load balancer must know EXACTLY how the
|
|
> system is configured, which includes all functions and everything.
|
|
>
|
|
> In reason (2) your system would fail to provide the scalability that would be
|
|
> needed. If writes take a long time, but reads are fine, what is the difference
|
|
> between the trigger based replicator?
|
|
>
|
|
> I have in the back of my mind, an idea of patching into the WAL stuff, and
|
|
> using that mechanism to push changes out to the slaves.
|
|
>
|
|
> Where one machine is still the master, but no trigger stuff, just a WAL patch.
|
|
> Perhaps some shared memory paradigm to manage WAL visibility? I'm not sure
|
|
> exactly, the idea hasn't completely formed yet.
|
|
>
|
|
> ---------------------------(end of broadcast)---------------------------
|
|
> TIP 5: Have you checked our extensive FAQ?
|
|
>
|
|
> http://www.postgresql.org/users-lounge/docs/faq.html
|
|
>
|
|
>
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 5: Have you checked our extensive FAQ?
|
|
|
|
http://www.postgresql.org/users-lounge/docs/faq.html
|
|
|
|
From pgsql-hackers-owner+M18660=candle.pha.pa.us=pgman@postgresql.org Fri Feb 8 15:20:32 2002
|
|
Return-path: <pgsql-hackers-owner+M18660=candle.pha.pa.us=pgman@postgresql.org>
|
|
Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
|
|
by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g18KKSP03731
|
|
for <pgman@candle.pha.pa.us>; Fri, 8 Feb 2002 15:20:29 -0500 (EST)
|
|
Received: (qmail 28961 invoked by alias); 8 Feb 2002 20:20:27 -0000
|
|
Received: from unknown (HELO postgresql.org) (64.49.215.8)
|
|
by www.postgresql.org with SMTP; 8 Feb 2002 20:20:27 -0000
|
|
Received: from inflicted.crimelabs.net (crimelabs.net [66.92.101.112])
|
|
by postgresql.org (8.11.3/8.11.4) with ESMTP id g18KC7E27667
|
|
for <pgsql-hackers@postgresql.org>; Fri, 8 Feb 2002 15:12:07 -0500 (EST)
|
|
(envelope-from bpalmer@crimelabs.net)
|
|
Received: from mizer.crimelabs.net (mizer.crimelabs.net [192.168.88.10])
|
|
by inflicted.crimelabs.net (Postfix) with ESMTP
|
|
id 1066F8787; Fri, 8 Feb 2002 15:12:08 -0500 (EST)
|
|
Date: Fri, 8 Feb 2002 15:12:00 -0500 (EST)
|
|
From: bpalmer <bpalmer@crimelabs.net>
|
|
To: Randall Jonasz <rjonasz@trueimpact.com>
|
|
cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
|
|
Subject: Re: [HACKERS] Replication
|
|
In-Reply-To: <20020208142932.H6545-100000@nietzsche.trueimpact.net>
|
|
Message-ID: <Pine.BSO.4.43.0202081510130.21860-100000@mizer.crimelabs.net>
|
|
MIME-Version: 1.0
|
|
Content-Type: TEXT/PLAIN; charset=US-ASCII
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
I've not looked at the first paper, but I will.
|
|
|
|
> Postgres-R is another interesting approach which I think should be taken
|
|
> seriously. Anyone interested can read a paper on this at
|
|
> http://citeseer.nj.nec.com/330257.html
|
|
|
|
I would point you to the info on gborg, but it seems to be down at the
|
|
moment.
|
|
|
|
- Brandon
|
|
|
|
----------------------------------------------------------------------------
|
|
c: 646-456-5455 h: 201-798-4983
|
|
b. palmer, bpalmer@crimelabs.net pgp:crimelabs.net/bpalmer.pgp5
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 3: if posting/reading through Usenet, please send an appropriate
|
|
subscribe-nomail command to majordomo@postgresql.org so that your
|
|
message can get through to the mailing list cleanly
|
|
|
|
From pgsql-hackers-owner+M18666=candle.pha.pa.us=pgman@postgresql.org Fri Feb 8 17:41:03 2002
|
|
Return-path: <pgsql-hackers-owner+M18666=candle.pha.pa.us=pgman@postgresql.org>
|
|
Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
|
|
by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g18Mf2P18046
|
|
for <pgman@candle.pha.pa.us>; Fri, 8 Feb 2002 17:41:03 -0500 (EST)
|
|
Received: (qmail 63057 invoked by alias); 8 Feb 2002 22:41:02 -0000
|
|
Received: from unknown (HELO postgresql.org) (64.49.215.8)
|
|
by www.postgresql.org with SMTP; 8 Feb 2002 22:41:02 -0000
|
|
Received: from lakemtao03.mgt.cox.net (mtao3.east.cox.net [68.1.17.242])
|
|
by postgresql.org (8.11.3/8.11.4) with ESMTP id g18MR9E60361
|
|
for <pgsql-hackers@postgresql.org>; Fri, 8 Feb 2002 17:27:11 -0500 (EST)
|
|
(envelope-from darren.johnson@cox.net)
|
|
Received: from cox.net ([68.10.181.230]) by lakemtao03.mgt.cox.net
|
|
(InterMail vM.5.01.04.05 201-253-122-122-105-20011231) with ESMTP
|
|
id <20020208222634.GTRG6710.lakemtao03.mgt.cox.net@cox.net>;
|
|
Fri, 8 Feb 2002 17:26:34 -0500
|
|
Message-ID: <3C643F0F.70303@cox.net>
|
|
Date: Fri, 08 Feb 2002 16:11:43 -0500
|
|
From: Darren Johnson <darren.johnson@cox.net>
|
|
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; m18) Gecko/20010131 Netscape6/6.01
|
|
X-Accept-Language: en
|
|
MIME-Version: 1.0
|
|
To: Randall Jonasz <rjonasz@trueimpact.com>
|
|
cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
|
|
Subject: Re: [HACKERS] Replication
|
|
References: <20020208142932.H6545-100000@nietzsche.trueimpact.net>
|
|
Content-Type: text/plain; charset=us-ascii; format=flowed
|
|
Content-Transfer-Encoding: 7bit
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Status: OR
|
|
|
|
|
|
> I've been looking into database replication theory lately and have found
|
|
> some interesting papers discussing various approaches. (Here's
|
|
> one paper that struck me as being very helpful,
|
|
> http://citeseer.nj.nec.com/460405.html )
|
|
|
|
|
|
Here is another one from that same group, that addresses the WAN issues.
|
|
|
|
> http://www.cnds.jhu.edu/pub/papers/cnds-2002-1.pdf
|
|
|
|
|
|
enjoy,
|
|
|
|
Darren
|
|
|
|
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
|
|
|
|
From pgsql-hackers-owner+M18674=candle.pha.pa.us=pgman@postgresql.org Fri Feb 8 19:20:30 2002
|
|
Return-path: <pgsql-hackers-owner+M18674=candle.pha.pa.us=pgman@postgresql.org>
Received: from server1.pgsql.org (www.postgresql.org [64.49.215.9])
by candle.pha.pa.us (8.11.6/8.10.1) with SMTP id g190KTP26980
for <pgman@candle.pha.pa.us>; Fri, 8 Feb 2002 19:20:29 -0500 (EST)
Received: (qmail 88124 invoked by alias); 9 Feb 2002 00:20:27 -0000
Received: from unknown (HELO postgresql.org) (64.49.215.8)
by www.postgresql.org with SMTP; 9 Feb 2002 00:20:27 -0000
Received: from localhost.localdomain (bgp01077650bgs.wanarb01.mi.comcast.net [68.40.135.112])
by postgresql.org (8.11.3/8.11.4) with ESMTP id g190H3E87489
for <pgsql-hackers@postgresql.org>; Fri, 8 Feb 2002 19:17:03 -0500 (EST)
(envelope-from camber@ais.org)
Received: from localhost (camber@localhost)
by localhost.localdomain (8.11.6/8.11.6) with ESMTP id g190H0P18427;
Fri, 8 Feb 2002 19:17:00 -0500
X-Authentication-Warning: localhost.localdomain: camber owned process doing -bs
Date: Fri, 8 Feb 2002 19:17:00 -0500 (EST)
From: Brian Bruns <camber@ais.org>
X-X-Sender: <camber@localhost.localdomain>
To: Randall Jonasz <rjonasz@trueimpact.com>
cc: PostgreSQL-development <pgsql-hackers@postgresql.org>
Subject: Re: [HACKERS] Replication
In-Reply-To: <20020208142932.H6545-100000@nietzsche.trueimpact.net>
Message-ID: <Pine.LNX.4.33.0202081904190.18420-100000@localhost.localdomain>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Status: OR

> > I have in the back of my mind, an idea of patching into the WAL stuff, and
> > using that mechanism to push changes out to the slaves.
> >
> > Where one machine is still the master, but no trigger stuff, just a WAL patch.
> > Perhaps some shared memory paradigm to manage WAL visibility? I'm not sure
> > exactly, the idea hasn't completely formed yet.
> >

FWIW, Sybase Replication Server does just such a thing.

They have a secondary log marker (it prevents the log from being truncated
past the oldest unreplicated transaction). A thread within the system called
the "rep agent" (it used to be a separate process called the LTM) reads
the log and forwards it to the rep server. Once the rep server has the
whole transaction and has written it to a stable device (i.e. synced it to
disk), it responds to the LTM, telling it that it's OK to move the
log marker forward.
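
[Editor's sketch] The log-marker handshake described above can be modelled
in a few lines. This is only an illustration, not Sybase's or PostgreSQL's
code: every name here (LogRecord, RepServer, TransactionLog, RepAgent,
rep_marker) is hypothetical, the "sync to disk" step is a plain list
append, and the log is an in-memory list.

```python
from dataclasses import dataclass, field


@dataclass
class LogRecord:
    lsn: int    # position of the record in the log
    xid: int    # transaction the record belongs to
    kind: str   # "change" or "commit"


@dataclass
class RepServer:
    """Hypothetical stand-in for the rep server's stable queue."""
    stable: list = field(default_factory=list)

    def store(self, records):
        # A real server would sync to disk here before acknowledging.
        self.stable.extend(records)
        return True  # the acknowledgement


@dataclass
class TransactionLog:
    records: list = field(default_factory=list)
    rep_marker: int = 0  # secondary marker: records below it are replicated

    def truncate(self, upto_lsn):
        # Never discard records at or beyond the replication marker.
        limit = min(upto_lsn, self.rep_marker)
        self.records = [r for r in self.records if r.lsn >= limit]
        return limit


class RepAgent:
    """Reads the log, forwards whole transactions, moves the marker."""

    def __init__(self, log, server):
        self.log = log
        self.server = server
        self.next_lsn = 0  # first log position not yet scanned
        self.open = {}     # xid -> records of transactions still uncommitted

    def run_once(self):
        for rec in self.log.records:
            if rec.lsn < self.next_lsn:
                continue  # already scanned on an earlier pass
            self.next_lsn = rec.lsn + 1
            self.open.setdefault(rec.xid, []).append(rec)
            if rec.kind == "commit":
                txn = self.open.pop(rec.xid)
                if not self.server.store(txn):
                    raise RuntimeError("rep server did not acknowledge")
        # The marker sits at the first record of the oldest transaction
        # that has not been forwarded yet, or at the scan position if
        # every transaction seen so far has been acknowledged.
        self.log.rep_marker = min(
            (recs[0].lsn for recs in self.open.values()),
            default=self.next_lsn,
        )
```

With one transaction still open, truncate() refuses to go past that
transaction's first record no matter how far the caller asks it to go.
Once the commit has been forwarded and acknowledged, the marker moves up
and the space can be reclaimed.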

Anyway, once the replication server proper has the transaction, it uses a
publish/subscribe methodology to see who wants to get the update.

Bidirectional replication is done by setting up two one-way replications.
The whole thing is table based: tables are marked as replicated or not in
the database, which saves the trip to the rep server for unreplicated tables.

Plus you can take parts of a database (replicate all rows where the
country is "us" to this server and all the rows with "uk" to that server).
Or, going the other way, you can roll up smaller regional databases into
bigger ones. It's very flexible.
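
[Editor's sketch] The table flag and the row-level subscriptions described
above could look roughly like this. It is a toy model under stated
assumptions, not Sybase's interface: REPLICATED_TABLES, Subscription,
SUBSCRIPTIONS and route are hypothetical names, and the "central" entry is
a catch-all that only loosely stands in for the roll-up case.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical catalog flag: only these tables are marked as replicated.
REPLICATED_TABLES = {"customers"}


@dataclass
class Subscription:
    site: str                      # destination server
    table: str                     # table the subscription applies to
    wants: Callable[[dict], bool]  # row predicate, like a WHERE clause


SUBSCRIPTIONS = [
    Subscription("us_server", "customers", lambda row: row["country"] == "us"),
    Subscription("uk_server", "customers", lambda row: row["country"] == "uk"),
    Subscription("central", "customers", lambda row: True),
]


def route(table, row, subscriptions=SUBSCRIPTIONS):
    """Return the sites that should receive this row change."""
    if table not in REPLICATED_TABLES:
        # Unreplicated table: skip the trip to the rep server entirely.
        return []
    return [s.site for s in subscriptions
            if s.table == table and s.wants(row)]
```

A change to a "us" customer row goes to us_server and central, while a
change to a table that is not flagged goes nowhere.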

Cheers,

Brian

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster
|
|
|