postgres/doc/TODO.detail/replication

From goran@kirra.net Mon Dec 20 14:30:54 1999
Received: from villa.bildbasen.se (villa.bildbasen.se [193.45.225.97])
	by candle.pha.pa.us (8.9.0/8.9.0) with SMTP id PAA29058
	for <pgman@candle.pha.pa.us>; Mon, 20 Dec 1999 15:30:17 -0500 (EST)
Received: (qmail 2485 invoked from network); 20 Dec 1999 20:29:53 -0000
Received: from a112.dial.kiruna.se (HELO kirra.net) (193.45.238.12)
  by villa.bildbasen.se with SMTP; 20 Dec 1999 20:29:53 -0000
Sender: goran
Message-ID: <385E9192.226CC37D@kirra.net>
Date: Mon, 20 Dec 1999 21:29:06 +0100
From: Goran Thyni <goran@kirra.net>
Organization: kirra.net
X-Mailer: Mozilla 4.6 [en] (X11; U; Linux 2.2.13 i586)
X-Accept-Language: sv, en
MIME-Version: 1.0
To: Bruce Momjian <pgman@candle.pha.pa.us>
CC: "neil d. quiogue" <nquiogue@ieee.org>,
        PostgreSQL-development <pgsql-hackers@postgreSQL.org>
Subject: Re: [HACKERS] Re: QUESTION: Replication
References: <199912201508.KAA20572@candle.pha.pa.us>
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
Status: OR

Bruce Momjian wrote:
> We need major work in this area, or at least a plan and an FAQ item.
> We are getting major questions on this, and I don't know enough even to
> make an FAQ item telling people their options.

My 2 cents, or 2 ören since I'm a Swede, on this:

It is pretty simple to build replication with pg_dump: dump, transfer,
empty the replica, and reload.
But if we want "live replicas" we had better base our efforts on a
mechanism that uses the WAL logs to roll the replicas forward.

regards,
-----------------
Göran Thyni
On quiet nights you can hear Windows NT reboot!
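
The dump-and-reload approach mentioned above can be sketched roughly as
follows.  This is only an illustration, not a tested tool: the host and
database names are hypothetical, and it assumes the pg_dump and psql
client programs with their --clean, -h and -d options.  The function only
builds the command line; running it is left to the caller.

```python
import shlex


def dump_and_reload_command(source_host, replica_host, dbname):
    """Build a shell pipeline that refreshes a replica from a full dump.

    pg_dump --clean emits DROP statements before the CREATE statements,
    which corresponds to "empty the replica and reload".
    Nothing is executed here; the command string is returned.
    """
    dump = ["pg_dump", "--clean", "-h", source_host, dbname]
    load = ["psql", "-h", replica_host, "-d", dbname]
    # Quote every argument so unusual names cannot break the shell command.
    return " | ".join(
        " ".join(shlex.quote(arg) for arg in cmd) for cmd in (dump, load)
    )
```

A full dump moves the whole database every time, which is why this only
suits small databases or infrequent refreshes.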

From owner-pgsql-hackers@hub.org Fri Dec 24 10:01:18 1999
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA11295
	for <pgman@candle.pha.pa.us>; Fri, 24 Dec 1999 11:01:17 -0500 (EST)
Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.6 $) with ESMTP id KAA20310 for <pgman@candle.pha.pa.us>; Fri, 24 Dec 1999 10:39:18 -0500 (EST)
Received: from localhost (majordom@localhost)
	by hub.org (8.9.3/8.9.3) with SMTP id KAA61760;
	Fri, 24 Dec 1999 10:31:13 -0500 (EST)
	(envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Fri, 24 Dec 1999 10:30:48 -0500
Received: (from majordom@localhost)
	by hub.org (8.9.3/8.9.3) id KAA58879
	for pgsql-hackers-outgoing; Fri, 24 Dec 1999 10:29:51 -0500 (EST)
	(envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from bocs170n.black-oak.COM ([38.149.137.131])
	by hub.org (8.9.3/8.9.3) with ESMTP id KAA58795
	for <pgsql-hackers@postgreSQL.org>; Fri, 24 Dec 1999 10:29:00 -0500 (EST)
	(envelope-from DWalker@black-oak.com)
From: DWalker@black-oak.com
To: pgsql-hackers@postgreSQL.org
Subject: [HACKERS] database replication
Date: Fri, 24 Dec 1999 10:27:59 -0500
Message-ID: <OFD38C9424.B391F434-ON85256851.0054F41A@black-oak.COM>
X-Priority: 3 (Normal)
X-MIMETrack: Serialize by Router on notes01n/BOCS(Release 5.0.1|July 16, 1999) at 12/24/99
	10:28:01 AM
MIME-Version: 1.0
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Sender: owner-pgsql-hackers@postgreSQL.org
Status: OR

I've been toying with the idea of implementing database replication
for the last few days.  The system I'm proposing will be a separate
program which can be run on any machine and will most likely be
implemented in Python.  What I'm looking for at this point are gaping
holes in my thinking/logic/etc.  Here's what I'm thinking...

1) I want to make this program an additional layer over PostgreSQL.  I
really don't want to hack server code if I can get away with it.  At
this point I don't feel I need to.

2) The replication system will need to add at least one field to each
table in each database that needs to be replicated.  This field will be
a date/time stamp which identifies the "last update" of the record.
This field will be called PGR_TIME for lack of a better name.
Because this field will be used from within programs and triggers it
can be longer so as to not mistake it for a user field.

3) For each table to be replicated the replication system will
programmatically add one plpgsql function and trigger to modify the
PGR_TIME field on both UPDATEs and INSERTs.  The name of this function
and trigger will be along the lines of
<table_name>_replication_update_trigger and
<table_name>_replication_update_function.  The function is a simple
two-line chunk of code to set the field PGR_TIME equal to NOW.  The
trigger is called before each insert/update.  When looking at the Docs
I see that times are stored in Zulu (GMT) time.  Because of this I
don't have to worry about time zones and the like.  I need direction
on this part (such as "hey dummy, look at page N of file X.").

4) At this point we have tables which can, at a basic level, tell the
replication system when they were last updated.

5) The replication system will have a database of its own to record
the last replication event, hold configuration, logs, etc.  I'd prefer
to store the configuration in a PostgreSQL table but it could just as
easily be stored in a text file on the filesystem somewhere.

6) To handle replication I basically check the local "last
replication time" and compare it against the remote PGR_TIME
fields.  If the remote PGR_TIME is greater than the last replication
time then change the local copy of the database, otherwise, change
the remote end of the database.  At this point I don't have a way to
know WHICH field changed between the two replicas so either I do ROW
level replication or I check each field.  I check PGR_TIME to
determine which field is the most current.  Some fine tuning of this
process will have to occur no doubt.

7) The command-line utility, fired off by something like cron, could
run several times during the day -- command line parameters can be
implemented to say PUSH ALL CHANGES TO SERVER A, or PULL ALL CHANGES
FROM SERVER B.

Questions/Concerns:

1) How far do I go with this?  Do I start manhandling the system
catalogs (pg_* tables)?

2) As to #2 and #3 above, I really don't like tools automagically
changing my tables but at this point I don't see a way around it.  I
guess this is where the testing comes into play.

3) Security: the replication app will have to have pretty good rights
to the database so it can add the necessary functions and triggers,
modify table schema, etc.

  So, any "you're insane and should run home to momma" comments?

                Damond
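
Points 2 and 3 of the proposal could be sketched along these lines.  This
is a rough illustration only: it generates the DDL as text rather than
running it, it assumes a PostgreSQL release that supports dollar quoting
and RETURNS trigger, and it assumes the table name is a plain, trusted
identifier (no quoting is done).  The helper name is made up for the
example; the function and trigger names follow the naming scheme in the
proposal.

```python
def replication_trigger_sql(table_name, column="pgr_time"):
    """Return DDL that adds a timestamp column and keeps it current.

    The generated trigger fires before every INSERT or UPDATE and
    stamps the row with the current time.
    """
    func = f"{table_name}_replication_update_function"
    trig = f"{table_name}_replication_update_trigger"
    return (
        f"ALTER TABLE {table_name}\n"
        f"    ADD COLUMN {column} timestamp with time zone;\n"
        f"CREATE FUNCTION {func}() RETURNS trigger AS $$\n"
        f"BEGIN\n"
        f"    NEW.{column} := now();\n"
        f"    RETURN NEW;\n"
        f"END;\n"
        f"$$ LANGUAGE plpgsql;\n"
        f"CREATE TRIGGER {trig}\n"
        f"    BEFORE INSERT OR UPDATE ON {table_name}\n"
        f"    FOR EACH ROW EXECUTE PROCEDURE {func}();\n"
    )
```

Calling replication_trigger_sql("orders") yields statements that could be
fed to psql by whoever sets up replication for the hypothetical table
"orders".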

************

From owner-pgsql-hackers@hub.org Fri Dec 24 18:31:03 1999
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA26244
	for <pgman@candle.pha.pa.us>; Fri, 24 Dec 1999 19:31:02 -0500 (EST)
Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.6 $) with ESMTP id TAA12730 for <pgman@candle.pha.pa.us>; Fri, 24 Dec 1999 19:30:05 -0500 (EST)
Received: from localhost (majordom@localhost)
	by hub.org (8.9.3/8.9.3) with SMTP id TAA57851;
	Fri, 24 Dec 1999 19:23:31 -0500 (EST)
	(envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Fri, 24 Dec 1999 19:22:54 -0500
Received: (from majordom@localhost)
	by hub.org (8.9.3/8.9.3) id TAA57710
	for pgsql-hackers-outgoing; Fri, 24 Dec 1999 19:21:56 -0500 (EST)
	(envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from Mail.austin.rr.com (sm2.texas.rr.com [24.93.35.55])
	by hub.org (8.9.3/8.9.3) with ESMTP id TAA57680
	for <pgsql-hackers@postgresql.org>; Fri, 24 Dec 1999 19:21:25 -0500 (EST)
	(envelope-from ELOEHR@austin.rr.com)
Received: from austin.rr.com ([24.93.40.248]) by Mail.austin.rr.com  with Microsoft SMTPSVC(5.5.1877.197.19);
  Fri, 24 Dec 1999 18:12:50 -0600
Message-ID: <38640E2D.75136600@austin.rr.com>
Date: Fri, 24 Dec 1999 18:22:05 -0600
From: Ed Loehr <ELOEHR@austin.rr.com>
X-Mailer: Mozilla 4.7 [en] (X11; U; Linux 2.2.12-20smp i686)
X-Accept-Language: en
MIME-Version: 1.0
To: DWalker@black-oak.com
CC: pgsql-hackers@postgreSQL.org
Subject: Re: [HACKERS] database replication
References: <OFD38C9424.B391F434-ON85256851.0054F41A@black-oak.COM>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-pgsql-hackers@postgreSQL.org
Status: OR

DWalker@black-oak.com wrote:

> 6) To handle replication I basically check the local "last
> replication time" and compare it against the remote PGR_TIME
> fields.  If the remote PGR_TIME is greater than the last replication
> time then change the local copy of the database, otherwise, change
> the remote end of the database.  At this point I don't have a way to
> know WHICH field changed between the two replicas so either I do ROW
> level replication or I check each field.  I check PGR_TIME to
> determine which field is the most current.  Some fine tuning of this
> process will have to occur no doubt.

Interesting idea.  I can see how this might sync up two databases
somehow.  For true replication, however, I would always want every
replicated database to be, at the very least, internally consistent
(i.e., referential integrity), even if it was a little behind on
processing transactions.  In this method, it's not clear how
consistency is ever achieved/guaranteed at any point in time if the
input stream of changes is continuous.  If the input stream ceased,
then I can see how this approach might eventually catch up and totally
resync everything, but it looks *very* computationally expensive.

But I might have missed something.  How would internal consistency be
maintained?


> 7) The commandline utility, fired off by something like cron, could
> run several times during the day -- command line parameters can be
> implemented to say PUSH ALL CHANGES TO SERVER A, or PULL ALL CHANGES
> FROM SERVER B.

My two cents is that, while I can see this kind of database syncing as
valuable, this is not the kind of "replication" I had in mind.  This
may already be possible by simply copying the database.  What replication
means to me is a live, continuously streaming sequence of updates from
one database to another where the replicated database is always
internally consistent, available for read-only queries, and never "too
far" out of sync with the source/primary database.

What does replication mean to others?

Cheers,
Ed Loehr


************

From owner-pgsql-hackers@hub.org Fri Dec 24 21:31:10 1999
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA02578
	for <pgman@candle.pha.pa.us>; Fri, 24 Dec 1999 22:31:09 -0500 (EST)
Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.6 $) with ESMTP id WAA16641 for <pgman@candle.pha.pa.us>; Fri, 24 Dec 1999 22:18:56 -0500 (EST)
Received: from localhost (majordom@localhost)
	by hub.org (8.9.3/8.9.3) with SMTP id WAA89135;
	Fri, 24 Dec 1999 22:11:12 -0500 (EST)
	(envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Fri, 24 Dec 1999 22:10:56 -0500
Received: (from majordom@localhost)
	by hub.org (8.9.3/8.9.3) id WAA89019
	for pgsql-hackers-outgoing; Fri, 24 Dec 1999 22:09:59 -0500 (EST)
	(envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from bocs170n.black-oak.COM ([38.149.137.131])
	by hub.org (8.9.3/8.9.3) with ESMTP id WAA88957;
	Fri, 24 Dec 1999 22:09:11 -0500 (EST)
	(envelope-from dwalker@black-oak.com)
Received: from gcx80 ([151.196.99.113])
          by bocs170n.black-oak.COM (Lotus Domino Release 5.0.1)
          with SMTP id 1999122422080835:6 ;
          Fri, 24 Dec 1999 22:08:08 -0500
Message-ID: <001b01bf4e9e$647287d0$af63a8c0@walkers.org>
From: "Damond Walker" <dwalker@black-oak.com>
To: <owner-pgsql-hackers@postgreSQL.org>
Cc: <pgsql-hackers@postgreSQL.org>
References: <OFD38C9424.B391F434-ON85256851.0054F41A@black-oak.COM> <38640E2D.75136600@austin.rr.com>
Subject: Re: [HACKERS] database replication
Date: Fri, 24 Dec 1999 22:07:55 -0800
MIME-Version: 1.0
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.00.2314.1300
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
X-MIMETrack: Itemize by SMTP Server on notes01n/BOCS(Release 5.0.1|July 16, 1999) at 12/24/99
	10:08:09 PM,
	Serialize by Router on notes01n/BOCS(Release 5.0.1|July 16, 1999) at 12/24/99
	10:08:11 PM,
	Serialize complete at 12/24/99 10:08:11 PM
Content-Transfer-Encoding: 7bit
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-pgsql-hackers@postgreSQL.org
Status: OR

>
> Interesting idea.  I can see how this might sync up two databases
> somehow.  For true replication, however, I would always want every
> replicated database to be, at the very least, internally consistent
> (i.e., referential integrity), even if it was a little behind on
> processing transactions.  In this method, it's not clear how
> consistency is ever achieved/guaranteed at any point in time if the
> input stream of changes is continuous.  If the input stream ceased,
> then I can see how this approach might eventually catch up and totally
> resync everything, but it looks *very* computationally expensive.
>

    What's the typical unit of work for the database?  Are we talking about
update transactions which span the entire DB?  Or are we talking about
updating maybe 1% or less of the database everyday?  I'd think it would be
more towards the latter than the former.  So, yes, this process would be
computationally expensive but how many records would actually have to be
sent back and forth?

> But I might have missed something.  How would internal consistency be
> maintained?
>

    Updates that occur at site A will be moved to site B and vice versa.
Consistency would be maintained.  The only problem that I can see right off
the bat would be what if site A and site B made changes to a row and then
site C was brought into the picture?  Which one wins?

    Someone *has* to win when it comes to this type of thing.  You really
DON'T want to start merging row changes...

>
> My two cents is that, while I can see this kind of database syncing as
> valuable, this is not the kind of "replication" I had in mind.  This
> may already be possible by simply copying the database.  What replication
> means to me is a live, continuously streaming sequence of updates from
> one database to another where the replicated database is always
> internally consistent, available for read-only queries, and never "too
> far" out of sync with the source/primary database.
>

    Sounds like you're talking about distributed transactions to me.  That's
an entirely different subject altogether.  What you describe can be done
by copying a database...but as you say, this would only work in a read-only
situation.


                Damond


************

From owner-pgsql-hackers@hub.org Sat Dec 25 16:35:07 1999
Received: from hub.org (hub.org [216.126.84.1])
	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA28890
	for <pgman@candle.pha.pa.us>; Sat, 25 Dec 1999 17:35:05 -0500 (EST)
Received: from localhost (majordom@localhost)
	by hub.org (8.9.3/8.9.3) with SMTP id RAA86997;
	Sat, 25 Dec 1999 17:29:10 -0500 (EST)
	(envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Sat, 25 Dec 1999 17:28:09 -0500
Received: (from majordom@localhost)
	by hub.org (8.9.3/8.9.3) id RAA86863
	for pgsql-hackers-outgoing; Sat, 25 Dec 1999 17:27:11 -0500 (EST)
	(envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from mtiwmhc08.worldnet.att.net (mtiwmhc08.worldnet.att.net [204.127.131.19])
	by hub.org (8.9.3/8.9.3) with ESMTP id RAA86798
	for <pgsql-hackers@postgreSQL.org>; Sat, 25 Dec 1999 17:26:34 -0500 (EST)
	(envelope-from pgsql@rkirkpat.net)
Received: from [192.168.3.100] ([12.74.72.219])
          by mtiwmhc08.worldnet.att.net (InterMail v03.02.07.07 118-134)
          with ESMTP id <19991225222554.VIOL28505@[12.74.72.219]>;
          Sat, 25 Dec 1999 22:25:54 +0000
Date: Sat, 25 Dec 1999 15:25:47 -0700 (MST)
From: Ryan Kirkpatrick <pgsql@rkirkpat.net>
X-Sender: rkirkpat@excelsior.rkirkpat.net
To: DWalker@black-oak.com
cc: pgsql-hackers@postgreSQL.org
Subject: Re: [HACKERS] database replication
In-Reply-To: <OFD38C9424.B391F434-ON85256851.0054F41A@black-oak.COM>
Message-ID: <Pine.LNX.4.10.9912251433310.1551-100000@excelsior.rkirkpat.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-pgsql-hackers@postgreSQL.org
Status: OR

On Fri, 24 Dec 1999 DWalker@black-oak.com wrote:

> I've been toying with the idea of implementing database replication
> for the last few days.

	I too have been thinking about this some over the last year or
two, just trying to find a quick and easy way to do it. I am less
interested in replication than in synchronization, such as between a desktop
machine and a laptop, so I can keep the databases on each in sync with
each other. For this sort of purpose, both the local and remote databases
would be "idle" at the time of syncing.

> 2) The replication system will need to add at least one field to each
> table in each database that needs to be replicated. This field will be
> a date/time stamp which identifies the "last update" of the record.
> This field will be called PGR_TIME for lack of a better name.
> Because this field will be used from within programs and triggers it
> can be longer so as to not mistake it for a user field.

	How about a single, separate table with the fields 'database',
'tablename', 'oid', 'last_changed', that would store the same data as your
PGR_TIME field? It would be separate from the actual data tables, and
therefore would be totally transparent to any database interface
applications. The 'oid' field would hold each row's OID, a nice, unique
identification number for the row, while the other fields would tell which
table and database the oid is in. Then this table can be compared with the
same table on a remote machine to quickly find updates and changes, and
each difference can be dealt with in turn.
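
	Comparing two such tracking tables could look roughly like the
sketch below. It is only an illustration: each table is assumed to have
been read into a dictionary keyed by (database, tablename, oid), the
timestamps are assumed to be comparable values that are never None, and
the function name is made up for the example.

```python
def changed_keys(local, remote):
    """Return the keys whose last_changed differs between the two sides.

    `local` and `remote` map (database, tablename, oid) -> last_changed.
    A key present on only one side also counts as a difference.
    """
    all_keys = set(local) | set(remote)
    # dict.get returns None for a missing key, so one-sided rows differ.
    return sorted(k for k in all_keys if local.get(k) != remote.get(k))
```

	Only the rows named by the returned keys would then need to be
fetched and compared in full.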

> 3) For each table to be replicated the replication system will
> programmatically add one plpgsql function and trigger to modify the
> PGR_TIME field on both UPDATEs and INSERTs.  The name of this function
> and trigger will be along the lines of
> <table_name>_replication_update_trigger and
> <table_name>_replication_update_function.  The function is a simple
> two-line chunk of code to set the field PGR_TIME equal to NOW.  The
> trigger is called before each insert/update.  When looking at the Docs
> I see that times are stored in Zulu (GMT) time.  Because of this I
> don't have to worry about time zones and the like.  I need direction
> on this part (such as "hey dummy, look at page N of file X.").

	I like this idea, better than any I have come up with yet. Though,
how are you going to handle DELETEs?

> 6) To handle replication I basically check the local "last replication
> time" and compare it against the remote PGR_TIME fields.  If the
> remote PGR_TIME is greater than the last replication time then change
> the local copy of the database, otherwise, change the remote end of
> the database.  At this point I don't have a way to know WHICH field
> changed between the two replicas so either I do ROW level replication
> or I check each field.  I check PGR_TIME to determine which field is
> the most current.  Some fine tuning of this process will have to occur
> no doubt.

	Yea, this is indeed the sticky part, and would indeed require some
fine-tuning. Basically, the way I see it, if the two timestamps for a
single row do not match (or even if the row and therefore timestamp is
missing on one side or the other altogether):
	local ts > remote ts => Local row is exported to remote.
	remote ts > local ts => Remote row is exported to local.
	local ts > last sync time && no remote ts =>
		Local row is inserted on remote.
	local ts < last sync time && no remote ts =>
		Local row is deleted.
	remote ts > last sync time && no local ts =>
		Remote row is inserted on local.
	remote ts < last sync time && no local ts =>
		Remote row is deleted.
where the synchronization process is running on the local machine. By
exported, I mean the local values are sent to the remote machine, and the
row on that remote machine is updated to the local values. How does this
sound?
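
	The six rules above can be written down as one small function. This
is a sketch under a few assumptions: a timestamp of None stands for a row
that is missing on that side, timestamps are any comparable values, a
timestamp exactly equal to the last sync time is treated as "not newer"
(the rules above do not say), and the action names are invented for the
example.

```python
def sync_action(local_ts, remote_ts, last_sync):
    """Decide what to do with one row during a sync run on the local side.

    None means the row does not exist on that side.
    """
    if local_ts is None and remote_ts is None:
        return "nothing"
    if remote_ts is None:
        # Row exists only locally: new since last sync, or deleted remotely.
        return "insert on remote" if local_ts > last_sync else "delete local"
    if local_ts is None:
        # Row exists only remotely: new since last sync, or deleted locally.
        return "insert on local" if remote_ts > last_sync else "delete remote"
    if local_ts > remote_ts:
        return "export local to remote"
    if remote_ts > local_ts:
        return "export remote to local"
    return "nothing"
```

	Note that the "missing row" cases only work if no row older than
the last sync time can be missing for any reason other than a delete.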

> 7) The commandline utility, fired off by something like cron, could
> run several times during the day -- command line parameters can be
> implemented to say PUSH ALL CHANGES TO SERVER A, or PULL ALL CHANGES
> FROM SERVER B.

	Or run manually for my purposes. Also, maybe follow it
with a vacuum run on both sides for all databases, as this is going to
potentially cause lots of table changes that could do with a cleanup.

> 1) How far do I go with this?  Do I start manhandling the system catalogs (pg_* tables)?

	Initially, I would just stick to user table data... If you have
changes in triggers and other meta-data/executable code, you are going to
want to make syncs of that stuff manually anyway. At least I would want
to.

> 2) As to #2 and #3 above, I really don't like tools automagically
> changing my tables but at this point I don't see a way around it.  I
> guess this is where the testing comes into play.

	Hence the reason for the separate table with just a row's
identification and last update time. The only modification to the synced
database is the update trigger, which should be pretty harmless.

> 3) Security: the replication app will have to have pretty good rights
> to the database so it can add the necessary functions and triggers,
> modify table schema, etc.

	Just run the sync program as the postgres super user, and there
are no problems. :)

>   So, any "you're insane and should run home to momma" comments?

	No, not at all. Though it probably should be renamed from
replication to synchronization. The former is usually associated with a
continuous stream of updates between the local and remote databases, so
they are almost always in sync, and have a queuing ability if their
connection is lost for a span of time as well. Very complex and difficult to
implement, and would require hacking server code. :( Something only Sybase
and Oracle have (as far as I know), and from what I have seen of Sybase's
replication server support (dated by 5yrs) it was a pain to set up and get
running correctly.
	The latter, synchronization, is much more manageable, and can still
be useful, especially when you have a large database you want in two
places, mainly for read only purposes at one end or the other, but don't
want to waste the time/bandwidth to move and load the entire database each
time it changes on one end or the other. Same idea as mirroring software
for FTP sites, just transfers the changes, and nothing more.
	I also like the idea of using Python. I have been using it
recently for some database interfaces (to PostgreSQL of course :), and it
is a very nice language to work with. Some worries about performance of
the program though, as Python is only an interpreted language, and I have
yet to be really impressed with the speed of execution of my database
interfaces.
	Anyway, it sounds like a good project, and finally one where I
actually have a clue of what is going on, and the skills to help. So, if
you are interested in pursuing this project, I would be more than glad to
help. TTYL.

---------------------------------------------------------------------------
|   "For to me to live is Christ, and to die is gain."                    |
|                                            --- Philippians 1:21 (KJV)   |
---------------------------------------------------------------------------
|   Ryan Kirkpatrick  |  Boulder, Colorado  |  http://www.rkirkpat.net/   |
---------------------------------------------------------------------------


************

From owner-pgsql-hackers@hub.org Sun Dec 26 08:31:09 1999
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id JAA17976
	for <pgman@candle.pha.pa.us>; Sun, 26 Dec 1999 09:31:07 -0500 (EST)
Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.6 $) with ESMTP id JAA23337 for <pgman@candle.pha.pa.us>; Sun, 26 Dec 1999 09:28:36 -0500 (EST)
Received: from localhost (majordom@localhost)
	by hub.org (8.9.3/8.9.3) with SMTP id JAA90738;
	Sun, 26 Dec 1999 09:21:58 -0500 (EST)
	(envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Sun, 26 Dec 1999 09:19:19 -0500
Received: (from majordom@localhost)
	by hub.org (8.9.3/8.9.3) id JAA90498
	for pgsql-hackers-outgoing; Sun, 26 Dec 1999 09:18:21 -0500 (EST)
	(envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from bocs170n.black-oak.COM ([38.149.137.131])
	by hub.org (8.9.3/8.9.3) with ESMTP id JAA90452
	for <pgsql-hackers@postgreSQL.org>; Sun, 26 Dec 1999 09:17:54 -0500 (EST)
	(envelope-from dwalker@black-oak.com)
Received: from vmware98 ([151.196.99.113])
          by bocs170n.black-oak.COM (Lotus Domino Release 5.0.1)
          with SMTP id 1999122609164808:7 ;
          Sun, 26 Dec 1999 09:16:48 -0500
Message-ID: <002201bf4fb3$623f0220$b263a8c0@vmware98.walkers.org>
From: "Damond Walker" <dwalker@black-oak.com>
To: "Ryan Kirkpatrick" <pgsql@rkirkpat.net>
Cc: <pgsql-hackers@postgreSQL.org>
Subject: Re: [HACKERS] database replication
Date: Sun, 26 Dec 1999 10:10:41 -0500
MIME-Version: 1.0
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 4.72.3110.1
X-MimeOLE: Produced By Microsoft MimeOLE V4.72.3110.3
X-MIMETrack: Itemize by SMTP Server on notes01n/BOCS(Release 5.0.1|July 16, 1999) at 12/26/99
	09:16:51 AM,
	Serialize by Router on notes01n/BOCS(Release 5.0.1|July 16, 1999) at 12/26/99
	09:16:54 AM,
	Serialize complete at 12/26/99 09:16:54 AM
Content-Transfer-Encoding: 7bit
Content-Type: text/plain;
	charset="iso-8859-1"
Sender: owner-pgsql-hackers@postgreSQL.org
Status: OR

>
>     I too have been thinking about this some over the last year or
>two, just trying to find a quick and easy way to do it. I am less
>interested in replication than in synchronization, such as between a desktop
>machine and a laptop, so I can keep the databases on each in sync with
>each other. For this sort of purpose, both the local and remote databases
>would be "idle" at the time of syncing.
>

    I don't think it would matter if the databases are idle or not to be
honest with you.  At any single point in time when you replicate I'd figure
that the database would be in a consistent state.  So, you should be able to
replicate (or sync) a remote database that is in use.  After all, you're
getting a snapshot of the database as it stands at 8:45 PM.  At 8:46 PM it
may be totally different...but the next time syncing takes place those
changes would appear in your local copy.

    The one problem you may run into is if the remote host is running a
large batch process.  It's very likely that you will get 50% of their
changes when you replicate...but then again, that's why you can schedule the
event to work around such things.

>     How about a single, separate table with the fields 'database',
>'tablename', 'oid', 'last_changed', that would store the same data as your
>PGR_TIME field? It would be separate from the actual data tables, and
>therefore would be totally transparent to any database interface
>applications. The 'oid' field would hold each row's OID, a nice, unique
>identification number for the row, while the other fields would tell which
>table and database the oid is in. Then this table can be compared with the
>same table on a remote machine to quickly find updates and changes, and
>each difference can be dealt with in turn.
>

    The problem with OIDs is that they are unique at the local level but if
you try and use them between servers you can run into overlap.  Also, if a
database is under heavy use this table could quickly become VERY large.  Add
indexes to this table to help performance and you're taking up even more
disk space.

    Using the PGR_TIME field with an index will allow us to find rows which
have changed VERY quickly.  All we need to do now is somehow programmatically
find the primary key for a table so the person setting up replication (or
syncing) doesn't have to have an in-depth knowledge of the schema in order to
set up a syncing schedule.
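
    One possible way to find the primary key is to ask the system catalogs.
The sketch below is an assumption-laden illustration, not part of any
existing tool: it relies on the pg_index and pg_attribute catalogs and the
regclass cast as found in later PostgreSQL releases, it expects a DB-API
cursor whose driver uses %s placeholders, and it makes no promise about the
order of the returned columns.

```python
PRIMARY_KEY_QUERY = """
SELECT a.attname
FROM pg_index i
JOIN pg_attribute a
  ON a.attrelid = i.indrelid AND a.attnum = ANY(i.indkey)
WHERE i.indrelid = %s::regclass
  AND i.indisprimary
"""


def primary_key_columns(cursor, table_name):
    """Return the names of the primary-key columns of `table_name`.

    `cursor` is any DB-API cursor connected to the database in question.
    An empty list means the table has no primary key.
    """
    cursor.execute(PRIMARY_KEY_QUERY, (table_name,))
    return [row[0] for row in cursor.fetchall()]
```

    A table without a primary key would have to be handled some other
way, for instance by asking the person setting up the sync to name the
identifying columns.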

>
>     I like this idea, better than any I have come up with yet. Though,
>how are you going to handle DELETEs?
>

    Oops...how about defining a trigger for this?  With deletion I guess we
would have to move a flag into another table saying we deleted record 'X'
with this primary key from this table.
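
    The delete flag described above might be recorded by a trigger like
the one generated below.  This is a sketch only: the log table pgr_deleted
and its three columns are hypothetical and would have to be created
beforehand, a single-column primary key is assumed, the table and column
names are assumed to be plain trusted identifiers, and the syntax assumes a
PostgreSQL release with dollar quoting and RETURNS trigger.

```python
def delete_log_trigger_sql(table_name, pk_column, log_table="pgr_deleted"):
    """Return DDL for a trigger that logs every deleted row's key.

    Each DELETE on `table_name` adds one row to `log_table` holding the
    table name, the primary-key value as text, and the deletion time.
    """
    func = f"{table_name}_replication_delete_function"
    trig = f"{table_name}_replication_delete_trigger"
    return (
        f"CREATE FUNCTION {func}() RETURNS trigger AS $$\n"
        f"BEGIN\n"
        f"    INSERT INTO {log_table} (table_name, pk_value, deleted_at)\n"
        f"    VALUES ('{table_name}', OLD.{pk_column}::text, now());\n"
        f"    RETURN OLD;\n"
        f"END;\n"
        f"$$ LANGUAGE plpgsql;\n"
        f"CREATE TRIGGER {trig}\n"
        f"    AFTER DELETE ON {table_name}\n"
        f"    FOR EACH ROW EXECUTE PROCEDURE {func}();\n"
    )
```

    The sync program could then replay the logged deletes on the other
side and clear the log entries it has processed.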

>
>     Yea, this is indeed the sticky part, and would indeed require some
>fine-tuning. Basically, the way I see it, if the two timestamps for a
>single row do not match (or even if the row and therefore timestamp is
>missing on one side or the other altogether):
>     local ts > remote ts => Local row is exported to remote.
>     remote ts > local ts => Remote row is exported to local.
>     local ts > last sync time && no remote ts =>
>          Local row is inserted on remote.
>     local ts < last sync time && no remote ts =>
>          Local row is deleted.
>     remote ts > last sync time && no local ts =>
>          Remote row is inserted on local.
>     remote ts < last sync time && no local ts =>
>          Remote row is deleted.
>where the synchronization process is running on the local machine. By
>exported, I mean the local values are sent to the remote machine, and the
>row on that remote machine is updated to the local values. How does this
>sound?
>

    The replication part will be the most complex...that much is for
certain...

    I've been writing systems in Lotus Notes/Domino for the last year or so
and I've grown quite spoiled with what it can do in regards to replication.
It's not real-time but you have to gear your applications to this type of
thing (it's possible to create documents, fire off email to notify people of
changes and have the email arrive before the replicated documents do).
Replicating large Notes/Domino databases takes quite a while....I don't see
any kind of replication or syncing running in a blink of an eye.

    Having said that, a good algo will have to be written to cut down on
network traffic and to keep database conversations down to a minimum.  This
will be appreciated by people with low bandwidth connections I'm sure
(dial-ups, fractional T1's, etc).

>     Or run manually for my purposes. Also, maybe follow it
>with a vacuum run on both sides for all databases, as this is going to
>potentially cause lots of table changes that could do with a cleanup.
>

    What would a vacuum do to a system being used by many people?

>     No, not at all. Though it probably should be renamed from
>replication to synchronization. The former is usually associated with a
>continuous stream of updates between the local and remote databases, so
>they are almost always in sync, and have a queuing ability if their
>connection is lost for a span of time as well. Very complex and difficult to
>implement, and would require hacking server code. :( Something only Sybase
>and Oracle have (as far as I know), and from what I have seen of Sybase's
>replication server support (dated by 5yrs) it was a pain to set up and get
>running correctly.

    It could probably be named either way...but the one thing I really don't
want to do is start hacking server code.  The PostgreSQL people have enough
to do without worrying about trying to meld anything I've done to their
server.   :)

    Besides, I like the idea of having it operate as a stand-alone product.
The only PostgreSQL feature we would require would be triggers and
plpgsql...what was the earliest version of PostgreSQL that supported
plpgsql?  Even then I don't see the triggers being that complex to boot.

>     I also like the idea of using Python. I have been using it
>recently for some database interfaces (to PostgreSQL of course :), and it
>is a very nice language to work with. Some worries about performance of
>the program though, as Python is only an interpreted language, and I have
>yet to be really impressed with the speed of execution of my database
>interfaces.

    The only thing we'd need for Python is the Python extensions for
PostgreSQL...which in turn requires libpq and that's about it.  So, it
should be able to run on any platform supported by Python and libpq.  Using
TK for the interface components will require NT people to get additional
software from the 'net.  At least it did with older versions of Windows
Python.  Unix folks should be happy....assuming they have X running on the
machine doing the replication or syncing.  Even then I wrote a curses based
Python interface awhile back which allows buttons, progress bars, input
fields, etc (I called it tinter and it's available at
http://iximd.com/~dwalker).  It's a simple interface and could probably be
cleaned up a bit but it works.  :)

>     Anyway, it sounds like a good project, and finally one where I
>actually have a clue of what is going on, and the skills to help. So, if
>you are interested in pursuing this project, I would be more than glad to
>help. TTYL.
>


    That would be a Good Thing.  Have webspace somewhere?  If I can get
permission from the "powers that be" at the office I could host a website on
our (Domino) webserver.

                Damond


************

From owner-pgsql-hackers@hub.org Sun Dec 26 19:11:48 1999
Received: from hub.org (hub.org [216.126.84.1])
	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA26661
	for <pgman@candle.pha.pa.us>; Sun, 26 Dec 1999 20:11:46 -0500 (EST)
Received: from localhost (majordom@localhost)
	by hub.org (8.9.3/8.9.3) with SMTP id UAA14959;
	Sun, 26 Dec 1999 20:08:15 -0500 (EST)
	(envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Sun, 26 Dec 1999 20:07:27 -0500
Received: (from majordom@localhost)
	by hub.org (8.9.3/8.9.3) id UAA14820
	for pgsql-hackers-outgoing; Sun, 26 Dec 1999 20:06:28 -0500 (EST)
	(envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from mtiwmhc02.worldnet.att.net (mtiwmhc02.worldnet.att.net [204.127.131.37])
	by hub.org (8.9.3/8.9.3) with ESMTP id UAA14749
	for <pgsql-hackers@postgreSQL.org>; Sun, 26 Dec 1999 20:05:39 -0500 (EST)
	(envelope-from rkirkpat@rkirkpat.net)
Received: from [192.168.3.100] ([12.74.72.56])
          by mtiwmhc02.worldnet.att.net (InterMail v03.02.07.07 118-134)
          with ESMTP id <19991227010506.WJVW1914@[12.74.72.56]>;
          Mon, 27 Dec 1999 01:05:06 +0000
Date: Sun, 26 Dec 1999 18:05:02 -0700 (MST)
From: Ryan Kirkpatrick <pgsql@rkirkpat.net>
X-Sender: rkirkpat@excelsior.rkirkpat.net
To: Damond Walker <dwalker@black-oak.com>
cc: pgsql-hackers@postgreSQL.org
Subject: Re: [HACKERS] database replication
In-Reply-To: <002201bf4fb3$623f0220$b263a8c0@vmware98.walkers.org>
Message-ID: <Pine.LNX.4.10.9912261742550.7666-100000@excelsior.rkirkpat.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-pgsql-hackers@postgreSQL.org
Status: OR

On Sun, 26 Dec 1999, Damond Walker wrote:

> >     How about a single, separate table with the fields of 'database',
> >'tablename', 'oid', 'last_changed', that would store the same data as your
> >PGR_TIME field. It would be separated from the actual data tables, and
...
>     The problem with OID's is that they are unique at the local level but if
> you try and use them between servers you can run into overlap.

	Yea, forgot about that point, but became dead obvious once you
mentioned it. Boy, I feel stupid now. :)

>     Using the PGR_TIME field with an index will allow us to find rows which
> have changed VERY quickly.  All we need to do now is somehow programmatically
> find the primary key for a table so the person setting up replication (or
> syncing) doesn't have to have an in-depth knowledge of the schema in order to
> set up a syncing schedule.

	Hmm... Yea, maybe look to see which field(s) have a primary, unique
index on them? Then use those field(s) as a primary key. Just require that
any table to be synchronized have some set of fields that uniquely
identify each row. Either that, or add another field to each table with
our own cross-system-consistent identification system. Don't know which
would be more efficient and easier to work with.
	The former could potentially get sticky if it takes a lot of
fields to generate a unique key value, but has the smallest effect on the
table to be synced. The latter could be difficult to keep straight between
systems (local vs. remote), and would require a trigger on inserts to
generate a new, unique id number, that does not exist locally or
remotely (nasty issue there), but would remove the uniqueness
requirement.

>     Oops...how about defining a trigger for this?  With deletion I guess we
> would have to move a flag into another table saying we deleted record 'X'
> with this primary key from this table.

	Or, according to my logic below, if a row is missing on one side
or the other, then just compare the remaining row's timestamp to the last
synchronization time (stored in a separate table/db elsewhere). The
result of the comparison and whether the row exists on each side tell one
if the row was inserted or deleted since the last sync, and what should be
done to perform the sync.

> >     Yea, this is indeed the sticky part, and would indeed require some
> >fine-tuning. Basically, the way I see it, is if the two timestamps for a
> >single row do not match (or even if the row and therefore timestamp is
> >missing on one side or the other altogether):
> >     local ts > remote ts => Local row is exported to remote.
> >     remote ts > local ts => Remote row is exported to local.
> >     local ts > last sync time && no remote ts =>
> >          Local row is inserted on remote.
> >     local ts < last sync time && no remote ts =>
> >          Local row is deleted.
> >     remote ts > last sync time && no local ts =>
> >          Remote row is inserted on local.
> >     remote ts < last sync time && no local ts =>
> >          Remote row is deleted.
> >where the synchronization process is running on the local machine. By
> >exported, I mean the local values are sent to the remote machine, and the
> >row on that remote machine is updated to the local values. How does this
> >sound?
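
The timestamp rules quoted above can be sketched in Python, the language
suggested in this thread for the sync program. This is only a minimal sketch
under stated assumptions: the function name sync_action, the returned labels,
and the use of None for a missing row are all hypothetical. The rules do not
say what to do when a timestamp equals the last sync time, so the sketch
reports that case as undecided rather than guessing.

```python
from datetime import datetime
from typing import Optional


def sync_action(local_ts: Optional[datetime],
                remote_ts: Optional[datetime],
                last_sync: datetime) -> str:
    """Decide what to do with one row; the sync runs on the local machine.

    A timestamp of None means the row does not exist on that side.
    The returned labels are invented for this sketch.
    """
    if local_ts is not None and remote_ts is not None:
        if local_ts > remote_ts:
            return "export_local_to_remote"
        if remote_ts > local_ts:
            return "export_remote_to_local"
        return "in_sync"                  # timestamps match, nothing to do

    if local_ts is not None:              # row is missing on the remote side
        if local_ts > last_sync:
            return "insert_on_remote"     # created locally since last sync
        if local_ts < last_sync:
            return "delete_local"         # was deleted remotely
        return "undecided"                # equality is not covered by the rules

    if remote_ts is not None:             # row is missing on the local side
        if remote_ts > last_sync:
            return "insert_on_local"      # created remotely since last sync
        if remote_ts < last_sync:
            return "delete_remote"        # was deleted locally
        return "undecided"

    return "nothing_to_do"                # row absent on both sides


last = datetime(1999, 12, 26, 12, 0)
print(sync_action(datetime(1999, 12, 26, 18, 0), None, last))  # insert_on_remote
print(sync_action(None, datetime(1999, 12, 25, 9, 0), last))   # delete_remote
```

Keeping the decision in one pure function means the rules can be checked
without a database connection; the part that actually moves rows between
servers would sit around it.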

>     Having said that, a good algo will have to be written to cut down on
> network traffic and to keep database conversations down to a minimum.  This
> will be appreciated by people with low bandwidth connections I'm sure
> (dial-ups, fractional T1's, etc).

	Of course! On reflection, the assigned identification number I
mentioned above might be the best then, instead of having to transfer the
entire set of key fields back and forth.

>     What would a vacuum do to a system being used by many people?

	Probably lock them out of tables while they are vacuumed... Maybe
not really required in the end, possibly optional?

>     It could probably be named either way...but the one thing I really don't
> want to do is start hacking server code.  The PostgreSQL people have enough
> to do without worrying about trying to meld anything I've done to their
> server.   :)

	Yea, they probably would appreciate that. They already have enough
on their plate for 7.x as it is! :)

>     Besides, I like the idea of having it operate as a stand-alone product.
> The only PostgreSQL feature we would require would be triggers and
> plpgsql...what was the earliest version of PostgreSQL that supported
> plpgsql?  Even then I don't see the triggers being that complex to boot.

	No, provided that we don't do the identification number idea
(which the more I think about it, probably will not work). As for which
version supports plpgsql, I don't know; one of the more hard-core pgsql
hackers can probably tell us that.

>     The only thing we'd need for Python is the Python extensions for
> PostgreSQL...which in turn requires libpq and that's about it.  So, it
> should be able to run on any platform supported by Python and libpq.

	Of course. If it ran on NT as well as Linux/Unix, that would be
even better. :)

> Unix folks should be happy....assuming they have X running on the
> machine doing the replication or syncing.  Even then I wrote a curses
> based Python interface awhile back which allows buttons, progress
> bars, input fields, etc (I called it tinter and it's available at
> http://iximd.com/~dwalker).  It's a simple interface and could
> probably be cleaned up a bit but it works.  :)

	Why would we want any type of GUI (X11 or curses) for this sync
program? I imagine just a command line program with a few options (local
machine, remote machine, db name, etc...), and nothing else.
	Though I will take a look at your curses interface, as I have been
wanting to make a curses interface to a few db interfaces I have, in as
simple a manner as possible.

>     That would be a Good Thing.  Have webspace somewhere?  If I can get
> permission from the "powers that be" at the office I could host a website on
> our (Domino) webserver.

	Yea, I got my own web server (www.rkirkpat.net) with 1GB+ of disk
space available, sitting on a decent speed DSL. Can even set up a
virtual server if we want (i.e. pgsync.rkirkpat.net :). CVS repository,
email lists, etc... possible with some effort (and time).
	So, where should we start? TTYL.

	PS. The current pages on my web site are very out of date at the
moment (save for the pgsql information). I hope to have updated ones up
within the week.

---------------------------------------------------------------------------
|   "For to me to live is Christ, and to die is gain."                    |
|                                            --- Philippians 1:21 (KJV)   |
---------------------------------------------------------------------------
|   Ryan Kirkpatrick  |  Boulder, Colorado  |  http://www.rkirkpat.net/   |
---------------------------------------------------------------------------


************

From owner-pgsql-hackers@hub.org Mon Dec 27 12:33:32 1999
Received: from hub.org (hub.org [216.126.84.1])
	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA24817
	for <pgman@candle.pha.pa.us>; Mon, 27 Dec 1999 13:33:29 -0500 (EST)
Received: from localhost (majordom@localhost)
	by hub.org (8.9.3/8.9.3) with SMTP id NAA53391;
	Mon, 27 Dec 1999 13:29:02 -0500 (EST)
	(envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Mon, 27 Dec 1999 13:28:38 -0500
Received: (from majordom@localhost)
	by hub.org (8.9.3/8.9.3) id NAA53248
	for pgsql-hackers-outgoing; Mon, 27 Dec 1999 13:27:40 -0500 (EST)
	(envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from gtv.ca (h139-142-238-17.cg.fiberone.net [139.142.238.17])
	by hub.org (8.9.3/8.9.3) with ESMTP id NAA53170
	for <pgsql-hackers@hub.org>; Mon, 27 Dec 1999 13:26:40 -0500 (EST)
	(envelope-from aaron@genisys.ca)
Received: from stilborne (24.67.90.252.ab.wave.home.com [24.67.90.252])
	by gtv.ca (8.9.3/8.8.7) with SMTP id MAA01200
	for <pgsql-hackers@hub.org>; Mon, 27 Dec 1999 12:36:39 -0700
From: "Aaron J. Seigo" <aaron@gtv.ca>
To: pgsql-hackers@hub.org
Subject: Re: [HACKERS] database replication
Date: Mon, 27 Dec 1999 11:23:19 -0700
X-Mailer: KMail [version 1.0.28]
Content-Type: text/plain
References: <199912271135.TAA10184@netrinsics.com>
In-Reply-To: <199912271135.TAA10184@netrinsics.com>
MIME-Version: 1.0
Message-Id: <99122711245600.07929@stilborne>
Content-Transfer-Encoding: 8bit
Sender: owner-pgsql-hackers@postgreSQL.org
Status: OR

hi..

> Before anyone starts implementing any database replication, I'd strongly
> suggest doing some research, first:
>
> http://sybooks.sybase.com:80/onlinebooks/group-rs/rsg1150e/rs_admin/@Generic__BookView;cs=default;ts=default

good idea, but perhaps sybase isn't the best study case.. here's some extremely
detailed online coverage of Oracle 8i's replication, from the oracle online
library:

http://bach.towson.edu/oracledocs/DOC/server803/A54651_01/toc.htm

--
Aaron J. Seigo
Sys Admin

************

From owner-pgsql-hackers@hub.org Thu Dec 30 08:01:09 1999
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id JAA10317
	for <pgman@candle.pha.pa.us>; Thu, 30 Dec 1999 09:01:08 -0500 (EST)
Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.6 $) with ESMTP id IAA02365 for <pgman@candle.pha.pa.us>; Thu, 30 Dec 1999 08:37:10 -0500 (EST)
Received: from localhost (majordom@localhost)
	by hub.org (8.9.3/8.9.3) with SMTP id IAA87902;
	Thu, 30 Dec 1999 08:34:22 -0500 (EST)
	(envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Thu, 30 Dec 1999 08:32:24 -0500
Received: (from majordom@localhost)
	by hub.org (8.9.3/8.9.3) id IAA85771
	for pgsql-hackers-outgoing; Thu, 30 Dec 1999 08:31:27 -0500 (EST)
	(envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from sandman.acadiau.ca (dcurrie@sandman.acadiau.ca [131.162.129.111])
	by hub.org (8.9.3/8.9.3) with ESMTP id IAA85234
	for <pgsql-hackers@postgresql.org>; Thu, 30 Dec 1999 08:31:10 -0500 (EST)
	(envelope-from dcurrie@sandman.acadiau.ca)
Received: (from dcurrie@localhost)
	by sandman.acadiau.ca (8.8.8/8.8.8/Debian/GNU) id GAA18698;
	Thu, 30 Dec 1999 06:30:58 -0400
From: Duane Currie <dcurrie@sandman.acadiau.ca>
Message-Id: <199912301030.GAA18698@sandman.acadiau.ca>
Subject: Re: [HACKERS] database replication
In-Reply-To: <OFD38C9424.B391F434-ON85256851.0054F41A@black-oak.COM> from "DWalker@black-oak.com" at "Dec 24, 99 10:27:59 am"
To: DWalker@black-oak.com
Date: Thu, 30 Dec 1999 10:30:58 +0000 (AST)
Cc: pgsql-hackers@postgresql.org
X-Mailer: ELM [version 2.4ME+ PL39 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-pgsql-hackers@postgresql.org
Status: OR

Hi Guys,

Now for one of my REALLY rare posts.
Having done a little bit of distributed data systems, I figured I'd
pitch in a couple cents worth.

> 2) The replication system will need to add at least one field to each
>    table in each database that needs to be replicated.  This
>    field will be a date/time stamp which identifies the "last
>    update" of the record.  This field will be called PGR_TIME
>    for lack of a better name.  Because this field will be used
>    from within programs and triggers it can be longer so as to not
>    mistake it for a user field.

I just started reading this thread, but I figured I'd throw in a couple
suggestions for distributed data control  (a few idioms I've had to
deal with b4):
	- Never use time (not reliable from system to system).  Use
	  a version number of some sort that can stay consistent across
	  all replicas

	  This way, if a system's time is or goes out of whack, it doesn't
	  cause your database to disintegrate, and it's easier to track
	  conflicts (see below.  If using time, the algorithm gets
	  nightmarish)

	- On an insert, set to version 1

	- On an update, version++

	- On a delete, mark deleted, and add a delete stub somewhere for the
	  replicator process to deal with in sync'ing the databases.

	- If two records have the same version but different data, there's
	  a conflict.  A few choices:
	  	1.  Pick one as the correct one (yuck!! invisible data loss)
		2.  Store both copies, pick one as current, and alert
		    database owner of the conflict, so they can deal with
		    it "manually."
		3.  If possible, some conflicts can be merged.  If a disjoint
		    set of fields were changed in each instance, these changes
		    may both be applied and the record merged.  (Problem:
		    takes a lot more space.  Requires a version number for
		    every field, or persistent storage of some old records.
		    However, this might help the "which fields changed" issue
		    you were talking about in #6)

	- A unique id across all systems should exist (or something that
	  effectively simulates a unique id).  Maybe a composition of the
	  originating oid (from the insert) and the originating database
	  (oid of the database's record?) might do it.  Store this as
	  an extra field in every record.

	  (Two extra fields so far: 'unique id' and 'version')
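
The version-number idioms listed above can be sketched in Python, the
language proposed elsewhere in this thread. This is only a rough sketch,
not a definitive implementation. All names (make_uid, Record, update,
delete, compare, merge_disjoint) are hypothetical. It assumes that a
delete also bumps the version, that both replicas carry the same set of
fields, and that the old copy of the record is available as the base for
a merge (the "persistent storage of some old records" option).

```python
from dataclasses import dataclass
from typing import Optional


def make_uid(db_oid: int, row_oid: int) -> str:
    """Compose a cross-system id from the originating database and row oids."""
    return f"{db_oid}:{row_oid}"


@dataclass
class Record:
    uid: str
    data: dict
    version: int = 1          # an insert starts at version 1
    deleted: bool = False     # a delete only marks the record


def update(rec: Record, **changes) -> Record:
    """Apply field changes and increment the version."""
    return Record(rec.uid, {**rec.data, **changes}, rec.version + 1, rec.deleted)


def delete(rec: Record, stubs: list) -> Record:
    """Mark the record deleted and leave a stub for the replicator.

    Bumping the version here is an assumption of this sketch.
    """
    stubs.append(rec.uid)
    return Record(rec.uid, rec.data, rec.version + 1, True)


def compare(a: Record, b: Record) -> str:
    """Compare two replicas of the same record by version number."""
    if a.version != b.version:
        return "a_newer" if a.version > b.version else "b_newer"
    if a.data == b.data and a.deleted == b.deleted:
        return "same"
    return "conflict"         # same version, different contents


def merge_disjoint(base: dict, a: dict, b: dict) -> Optional[dict]:
    """Merge two edits of base if they touched disjoint sets of fields."""
    changed_a = {k for k in a if a[k] != base.get(k)}
    changed_b = {k for k in b if b[k] != base.get(k)}
    if changed_a & changed_b:
        return None           # both sides changed the same field
    merged = dict(base)
    merged.update({k: a[k] for k in changed_a})
    merged.update({k: b[k] for k in changed_b})
    return merged


base = Record(make_uid(16384, 20001), {"name": "Ann", "city": "Oslo"})
a = update(base, city="Bergen")   # edited on one replica
b = update(base, name="Anna")     # edited on another replica
print(compare(a, b))                               # conflict
print(merge_disjoint(base.data, a.data, b.data))   # {'name': 'Anna', 'city': 'Bergen'}
```

When merge_disjoint returns None, the sketch leaves the choice between
options 1 and 2 (pick one copy, or store both and alert the owner) to the
caller.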

I do like your approach:  triggers and a separate process. (Maintainable!! :)

Anyway, just figured I'd throw in a few suggestions,
Duane

************

From owner-pgsql-patches@hub.org Sun Jan  2 23:01:38 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id AAA16274
	for <pgman@candle.pha.pa.us>; Mon, 3 Jan 2000 00:01:28 -0500 (EST)
Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.6 $) with ESMTP id XAA02655 for <pgman@candle.pha.pa.us>; Sun, 2 Jan 2000 23:45:55 -0500 (EST)
Received: from hub.org (hub.org [216.126.84.1])
	by hub.org (8.9.3/8.9.3) with ESMTP id XAA13828;
	Sun, 2 Jan 2000 23:40:47 -0500 (EST)
	(envelope-from owner-pgsql-patches@hub.org)
Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sun, 02 Jan 2000 23:38:34 +0000 (EST)
Received: (from majordom@localhost)
	by hub.org (8.9.3/8.9.3) id XAA13624
	for pgsql-patches-outgoing; Sun, 2 Jan 2000 23:37:36 -0500 (EST)
	(envelope-from owner-pgsql-patches@postgreSQL.org)
Received: from falla.videotron.net (falla.videotron.net [205.151.222.106])
	by hub.org (8.9.3/8.9.3) with ESMTP id XAA13560
	for <pgsql-patches@postgresql.org>; Sun, 2 Jan 2000 23:37:02 -0500 (EST)
	(envelope-from P.Marchesso@Videotron.ca)
Received: from Videotron.ca ([207.253.210.234])
	by falla.videotron.net (Sun Internet Mail Server sims.3.5.1999.07.30.00.05.p8)
	with ESMTP id <0FNQ000TEST8VI@falla.videotron.net> for pgsql-patches@postgresql.org; Sun,
	2 Jan 2000 23:37:01 -0500 (EST)
Date: Sun, 02 Jan 2000 23:39:23 -0500
From: Philippe Marchesseault <P.Marchesso@Videotron.ca>
Subject: [PATCHES] Distributed PostgreSQL!
To: pgsql-patches@postgreSQL.org
Message-id: <387027FB.EB88D757@Videotron.ca>
MIME-version: 1.0
X-Mailer: Mozilla 4.51 [en] (X11; I; Linux 2.2.11 i586)
Content-type: MULTIPART/MIXED; BOUNDARY="Boundary_(ID_GeYGc69fE1/bkYLTPwOGFg)"
X-Accept-Language: en
Sender: owner-pgsql-patches@postgreSQL.org
Precedence: bulk
Status: ORr

This is a multi-part message in MIME format.

--Boundary_(ID_GeYGc69fE1/bkYLTPwOGFg)
Content-type: text/plain; charset=us-ascii
Content-transfer-encoding: 7bit

Hi all!

Here is a small patch to make postgres a distributed database. By
distributed I mean that you can have the same copy of the database on N
different machines and keep them all in sync.
It does not improve performance unless you distribute your clients in a
sensible manner. It does not allow you to do parallel selects.

The support page is: pages.infinit.net/daemon and will soon be in
English.

The patch was tested with RedHat Linux 6.0 on Intel with kernel 2.2.11.
Only two machines were used so I'm not completely sure that it works
with more than two. -But it should-

I would like to know if somebody else is interested in this; otherwise
I'm probably not going to keep developing it. So please reply to my e-mail
(P.Marchesso@videotron.ca) to give me an idea of the number of people
interested in this.

Thanks all.

Philippe Marchesseault

--
It's not the size of the dog in the fight,
but the size of the fight in the dog.
                        -Archie Griffen


--Boundary_(ID_GeYGc69fE1/bkYLTPwOGFg)
Content-type: application/octet-stream; name=replicator-0.1.tgz
Content-disposition: attachment; filename=replicator-0.1.tgz
Content-transfer-encoding: base64

H4sIAOeZbzgAA+w8a3PayLL7Nara/9Dx3iRAME/brO11tjDGDjc2OIBvTm5OihLSAFoLiUgi
LGfX//12z0MaAX7UOU72nLpW7cagmenp6enp9xCwmetYZuQHxR++1QM7pVq5DD8AQG2vxv+W
d3b4X/mUsKVUrdbKO7XKLrZWdkqVH2D3m2GkPfMwMgOAH2Z+GI0DFt7ejwW3Nv7nPkGy/xfm
NRs5Lnv0Ocql0l6pdOv+l8u1XbH/e7W90m6NWsu7pR+g9OiYbHj+n+9/q904vzppwhFst6A4
D4NiGFjF2Tj84vJPjme5c5vFra5vma5sl21FwzBdFw7AGwb+NQtgKv8mvAXYZWw6HnayXGZ6
d3cVXQ7AeBZMYXukOquv0/TXZKhhGLJrjEvBAvlhYjwbW5b2/r8ycu1Z2B6P7SFs+8lMYRQ4
s/irodA8UJOvwZ3eCXeahqu+GtqyD7SFrEFPNW2aQAO07c6+wPb52m65ztB49pNAQKfZzIys
CR7DA+OZ7YxGm5hgaFrXzLOLkeXPiuqgFHzXfnhnC96A+rJX2C3wWf9q5n96dPn/zbaH5L+U
95vkf6VUS/T/Xq2Crbt7ldqT/P8eD+6NWS6V8+VSzXgDxdzlBfTnDE6YBZWfcSMOqtWDnX0o
7+/v54rY4yelELaKE3/KYqYpaoykZNeW3v+XcBkWo+UMZcHkzVqDM7M2vZ6GY/G6mHuOmHEM
jL3arrn3cxU/PnO8SJmQan72fs6C5SH22sNeu/m9/T3qWcwBAuiZEfw36hYoQ7l6UKrg6pD9
SiXqATlojSCaMECOiNiUeVFIHwP8s3CiCZjw6vWrPPXw0t3AQkqEMAr8qQREzZpIngW+xcKw
IFppEepxRpD5QggPSDB740+lz3B0RBNl/+C9Ya0VXsGrQ9GWXjM2ndbPe03eeEP/MDdkf9zS
td+90npy+iJme7V9a29/3/gFnskH4CVHIQoYG7hOGOXBZvSvabmdrywIHJtlD43t7W0x0V3P
gyDVSjsmegH5WnWfID7j/6wxZvWgXDso7SSMSX0APuC2oPKRG4DThRHR2IleheD5Ee5hyFxm
Rcnm8aEgQcTbQThuv8F9nZqe3UeuhedH0Lg4GfSa581GH16+XKGo2C5cf69/0R9c9M4AefdQ
viM2nhLvI93L6mU4wy2NRhnRyH5HQmy9fhFu5VNbnlXd5R+HDwg9m/6MWZRpd06ag3fNj3lo
XTYGjW6z3oc/obS3t5fNw0vslAfEZtBr/W8zD6Us/FKKUQWYIdX9ILN1EY57SLVTE81/+wC2
4lk5cwgWURshGOWvlltPz+M8mtiOTdjHnuNu/69cre0I/69Sq5XLpSq1Vit7T/r/ezzFnP4Y
/YkTQivk8nOGhrw5ZvgZtSZ6eP4ijIWe43vKnO+9PwfbjMyhGTJAF880bIdk13AeMRuY99UJ
fI8kbcEwGv5sGTjjSQSZRpbLboDLieM6M5SNF2ZgTVBNMnPuRoZABTXnODCn4JB6ZSjc/VG0
MAN2CEt/Dhaq8oAl06GcB5TXRXRrpj46M0t6Mfds9NxoQRELpiH4Qsmfta/gjHksMF24nA9x
VXDuWMwLmWHivPQmnOAChkve/ZRm78nZ4dRHqJwKh8DQNsAJUIGFRJWKISeQ0PKABkAGKYgI
B+DPaFAWsVyCi7ojHldYX7BORkeYHBN/JvcDF7Zw0OseMkDOHM3dvIE94UOr/7Zz1Yd6+yN8
qHe79Xb/4yE3X3xsZV+l6eJMcRsRLC4mML1oiUQxLprdxlvsXz9unbf6Hwnt01a/3ez14LTT
hTpc1rv9VuPqvN6Fy6vuZafXLKC6Y4QQM24nJ4wQ0tRHqtksQgWDZpDxETcvRJTQg5yYX0lV
W8z5igiZaEvNlg/ZI9f3xsIuizS6HZK2R0Wfh0XgIENE/vruGcnuodL0rEIedvehz5AmDC5d
02KwDb05Da9WS3k4Rj6nrhd1gFKlXC5vl6ulWh6uevWCkTo+RcNAIyQJBRxAf0J8HdJGyxCC
Iw+XMAlxBeh8R7gJ4RS3lO+NhwSbiiNGJiXSxPNtZqBZqT8CXKhW6HMmpI6qpUCTI9Vxwvbr
8goWwdzz0LoA31uFS8Cmoi9KPOROvn9tDlrABIQmV4FNBI0vwrQsNkNb2fI9Dy0sRB93GnIE
/wOZzAsmrGkSJfTF9gnKwsRl0wwEThuKJ5060QmnSQiKJBla2pmLeq/f7MJxt/MO/1x2Ow1k
02Yva5Ahp7kQke345ECkXrnOcPUdWVrpd2gYeatD0SEJfeuaRen3Hosc/L/oeOv9aXn01rjF
D9roBW30gZK3W4l/ZRhffQeF7O9ONPDwZLjLDLlEhOWpjXYeWnjCyNxquH5IG56QmIw8AAvf
s4wcQC8IVqaEH2+ImXsT349WvB3JcZp/E02CuWAc3B9SGWjBzhluBceuPpu5y54anrEmZpBD
cFM8pN3mWavX79b7rU4bN3gQsDHHmhbxxbHzgIZul4WoDwi1tGmNL4q5fhoRjzGbH4lrz1+I
07NANzWF2wLFO/oIEaKXtsz59NtvWic0F/KENVvqxjlhzClEExdPjq/OFG0z8sRnIV7lAbxA
Nn1hZ//uoUGfghK4zEvgZhVIXC7iII16wd+3mPWHdyDw/qp1cnAEL2w+L8JU4BO/gdP1NsdA
dwq0ZZFLNeLOgeAaDpID5eeEXNZm5bh1lo35jX9VnVP96g08rFpH/r23setp/eq8r3Xl3zf2
bJ10L7SO9HVzv3a/q/fDr7f0+5/6eaojft/Ys91BKmo9+XfVkx+mMv9yIw6UMLNQmnFtfzr3
+FnkTqsQqa0TMmHAM6dMqUEu1ddOHXq05ERzkcfPWZeNHZLbbeyeuedkrR8nvqX6cUAIg/7H
yyY1Tdl05TRwiMjOzj+YP0rN9k9y9L/Ko2L1gdCbTwz7CAyLNOMsC2/JjOD6GY1G5B+p/Ikt
j4UqSpiwIfRL32/HrasqKeY7cjAYl+w6/yCLJzt5KFnWm0+PlxELlfzNdYXFmOrMJTquNaN6
I/uhZflVqjfkKL33ZuYlNruTzxrcaqVYUrABhY0HH9JHM4WF4v0FekEMR4hp9RWmI5J8jXJP
b1+noGy8QkXx1dU9cIVqUSsLo+fGkCeLMG6gsYD2/oL8ExRXaOBJ+5zJrY6RD7HBmmTEWyFx
NJQs8iTPO413g8t6412zf4BWKzOvD1c7nMXtsgPi0PFc9Kz8cXrWZ3wEJ4MaYjyDVcskRodL
uNVtegYb0Wj+rZUA1Q0xZVMhUk18jQRFq95dHnC6MImjE/4aEwXQQRqR73uQkGKzpJPO0tC0
5RIpQqnvEmzEJG6+0XYP/yczihzsWBkJz8xjX3lmNpoHdLQS9bX5KRp0UKem42X4Xl6z5QDP
PVp7zH7HlofSWsPXSpeJNlIufrBEFTieCrNMnXm2uERNBZufYu4qFO45+QeZrDj8ZJ6awwYe
Jjtg3qdy6fOhWp6UWqQzSbGSN27JfvFYLx6KR6q0OnUaEB6+IdIHQXEwtKfclw1n5sJjNoeJ
7DO3hPAzbTsYoBvPU7J1/KJEGzWOUCELg7aH5J6tOEO4yejaFnJFgYYUcmIc4il8kkz9dNBq
N/t56NHJQYHWrF/I8w7Jgb/ruFvI3hGT8BJ2Sh35G6VM43UU0K8YjMyp41JaQWJxuNZj5geU
qzmCCTrTYeYcRW6zPbjsdPvZ9c4m/8D/xCNa7frJSXdQb3/kZ+qcBqApwek8/AcL/MzLTBoK
vUSR93NiASDdhg7aF4J2ecisbFAuCy9jGLH4XOn0UP1AM+n6QD+OI1vqBoEUMSV6BQqtYxQn
A5Rv+tbdNZMY/pC5ijkZBzj3/VlIR4cOOadhWgNt4Fyh/RXrxme0x6VL8gop30O6EWNuJB+C
ynLvjbD5oNx/buNqnlzC/VI+cn2nZkPgItxw504mCKM0l2il1d9dRBUTrDnMd1CY20xSHxZz
ZzzdlAx/LheiVi6iJxTdYYs4HCS7o2CTMjIxstJ0uOTGtRB9Dz/h1P8hy9ARbY1ItJnKBOSy
TqKboKQwOtIQ4fLeU0uiBZHM1UJU4HsrsatCoRCrw02WZMwA2cM1raWrq8RM2kSNNbixxuM2
DwF4/vxuMpGkNkcsWuK0UYK1IJxIuyoifCCdx2YU5LOuVzVPgqiuthI19Po1ZXylJkzvzQ0Z
FujmIMQPdHQ5pI1benNH/H9D2v7Rcwz31P9VK3s7qv6vWqtWeP3fTvUp//M9nqf8z1P+5yn/
83j5H+MnZ4QMN5K5icHbH42f8KvjMe1N/Oqi/rdBo9NuNxsUfOhRLVLcplnIXEzu7lSrCSxp
ISrXpFJKA20fJ63VSnm/koBFt/ld86MaCLs/7yRDVTGJatwraQglITw5Eo9v0ijDdGpkuVRB
uCLWf3Lcrl80j/7Ymi7t4dbNITmSypMKtYyCTDvRN4asTSzOQwkUjyS9ibpcSB6HDKUppTMc
Mg+pA5GcoUsGf2iBg7wWJMjr7n9ed9vzcajzBujfQeeyKaJBiKmCLczLP7ingvPykAXPn+AK
gUcMPikSfBbRMxV0wW+x4xwHfvG/eZhIl8TK48HfdaLIxsiX0SYWGLLMjKxHbjrBBZ0xPXtI
4hHlA5rDOJOQNsrz5quZB4xH7mDDKosS0dYJODai44wcFgIzrQmX+3KrkjQintwpNhIrZBx+
egOmEJiitPs1cdBFgoUTboKg2uaUfdpFR924ScUBkfh4dpiHchyxUYcnV/zxqfboER/N/vP+
mvof2N2t7Kj6H5Rxu7z+p7z7ZP99j+fJ/nuy/57sv0et/7mj2sXxnMgxXQqP3RoA/5E0O1kB
vDRGi55oJpKQ1NQT1eG/Y71Lqre9Mn/kTNm3LIrR8hA/bg7CCzKqUKbsQMYIGV453Lcx43aJ
nibPqTz5vclKFZN0WfxKptyFYXVpBiFTMf6RM55Ls5YuX8a1UEJSiPiQIeNIVN4iA4ZyY2Tk
6RHyAd6/kg8oFvudkw78Znrz0BDRLEVCRAg/EW2HSypqyGzxXgpcshO35hHWOt2TSljrz9MI
1D+XUWFi+TaX1TDdfjPhLyUQlVNYAbaeVNictFH7It/cGahOprgr57CeCti0Z4m8EFBvyUhL
vCl+OA/Ws+hyy/iG6Y0FZbFz7halSDleFIiOgvI3vCWKZJTOc3QbuPAzw+vVkjFicx+dBREV
D1EL2f40Q5Ih0746PxdFJKmZcYIjkP14KxKXymZiyj4wu/9AQvJrJA9I7ce05NKeRtHSabVk
FKXlNl9rqizmjtsbh2s5mdWCnY2lFEmCQaVWzLj0TZaTKlP3C5oP4rJmEsEWNTiB9fWWGpx0
Gc4mEqq6BVF+A4nc0LLnWoJBrwFAumhO+qFKRFHpUTo3n5TPpYt70vxwVxXEg4UggbxD9tH/
SQ3lveH2f7tH8//0S8aPOsd9/l9V+n90/3Nnl+5/7JbK5Sf/73s838gES16i9Tn7sj1i/HUx
d+b6Q9Mls+byjHRVDsYDSn3hx08UM/x8aKRLF3mzXtynuol6DUrihTyiCOl0oFSEFGcjISys
JinuVOIxQMM7EtG3cK2A7Vj0lfXKswGOp8m40ODBKzlDT1zN3C1XPitrz7nXTDTimsoVrRXP
c0dtJVU5kd9IMTkqtUZVnJoAvFd073GGfqzNfjPR0QRunl3URbUEKoWMc1RCX+mXmILw+rUj
5WEx1/ahzUQOWoUbxbVXzxc2KDbI1cdqAzLrO+V8pnJqygqvaPKsFKcy26rp0nUgEkPKg6ar
WrOoVV3XtzK3EkqS+HaYDzIaOKjXr2NekwpH3d9MsUEetshCOnqBHilaqUe7O9UK2EOyo474
3c5bjCkRIt9Q2v0i5JXcqUl4t+TYOESay/eyiz3MrHc2xA5dvqdCwnmY0QdnaX+SJMTguH4i
GUHEtsPxp3Ll589SGyeXVnGpf/cyidKIM+vEgcgg8vzAi/AAqSGLcR9CgFgf4yR31lHH0fT4
0JszimuE6xcWeOBann4aFqLVazHun6hyLq3gWomJxcQXsfj5Csik3pWXEMbCIswk3uJsQPgT
JS/PAn6HIYeLD1My4iGH8dQJ8DBbE2Zdg7NyQR1N7qk063gUJEacAg7eqwgWplg/t2adSNWg
FIsOdzGs6Uz5Y2SBEjccHaFxhByEPv6cxeXDfDHbb6SVdrThmIqzvjpUzpbiaHE3QYcodzng
layX79nvzEqxqNY7sfmIoZ/TkD//xDGCwj3B3vglSxe2L8+6zd6g0bm4qLdPBp13scEXc3al
tKNY+x7mjstEgY6kNG7JvCYeBnFMdSzzt9BIsXxsUq4xO9BhdpkZ8HXoBrNeWLK5zw0SWxaF
IG4rNZv8qlcH3b3f/CFnexLvps1DVNoNmVAxVBxu4gqSx6QUK/EYFQ7z6W69cnKUCaXYMOR5
IQk8uT8m7qT5VKiLHiBBcjzkaM+S50+Lgmr5pQQdylmlq01/TEdZ5DECPFXGP3ET4T6n6y/x
r1a48R4nS3xedbUc9WMD8hCrtKdWsfVBFrGmNT/uiG49pau/YN1kWnXPbimLEjPp8qwg5Knk
KV1oJ3VhqxL3ZXJyhEP2H+V//dWP5v/hOTi5aH6DOe75/Z+9cq0W+3/lCv3+G358uv//XZ5u
EpL7WiqUDcM4b120+tzy7cmUlMwD0sVJLrUj9C4oeUTJFxd1/e+Abp/HXKgUKoVymaR1l9lv
zUi27hVKBaPuhn4qrUjgdEgC9II8i+lsHtHdZin2PYavg+sC/T5P6E8ZaY0JjqYK3ZCrMZ5e
kvUHiDAqxlD+EAyaRzzxRj/lQvdnCpChlFX9qv+208Uu3G40hsz1F1nDaIEZhvMpaSuyRma+
F5pDx3UiMiqTZJPKQhbgCueJc3sLDwInvMY5WrCgAJLhOtdSlgpx7Xi4skS9cp5aVXp5uvqs
sptkTZkWKjpmmOIermkHjJSx+MaErufjbd+aCyHa15EEDpptm+5sYj7nqTeOHIVqIz8ykeoX
9ROO5dwTkCiLS+jYc2FW60lcIABccg8Z8wycwWN2gX7/Rz2G8eEtatRWD1r9X9Ms5JAVQQUp
28JHSFLISLM4yUz1IiKlzHeRLqGHwEOlvayBo4gZcPR4wsgaoIEhMnEkbPTIxB0Ac8qrdAiQ
MF5wmDPFJSHeqFJ5jQwaHGEhjZ/tM/E7QUuGQOcznmmg2xeui+zNf5cHKM3KE3NEQwph2+YS
9TtZNgbn4YW5jPdvw5x5osJwmaR3ycARPMrr1YntDfolPjSUKMQh7Ka6NKPk7w9pP1FFW6yu
xseRDX6JQZ2IgsFvRdH7dqe9fSuImTk2tdLzpCybUrbG284HOOk0aVfhQ6f77leDJzrp1wm8
NZi0RpXUzSeXWMmhinO9aY+KZ7P5kU3ngFNuHP8dJ7E2vY48idvym+ho1dG+slQfHr9Irtmu
XWfXTF2+l15ee21wSPEvS8WG8vpVeFXIv2K5huSnye0wMglgDWPQadx/2wTyW67arYaIINUv
Ou0zXqDXk+Tvo9xzPGeqsslobwUmUsfkXMCNJ48Hktbse9pvrcbMSGWV11eP7hJd7OPL5dfp
nWidFgU4UTxrqHKQUJ6SuUx3ciqIWfJJMQJ/q1yAAjQV4kAyPvKN2BDFI87cUXKpMLVlHQHM
JikcOrTsib+gKwF55TXrW2mjQBdLxAW5hB1tVhj5MzprkAkZo/pCd5mFse/bVANnamKTV2CM
jERLMJ3hN/pPYqkyPSQYk4nfjdM8mwIdNC491Ulrd+izsU1JYyGFxA+YhbLjqx78X3vX2tzG
dWQ/79T+iAlSFYlVEEyKlmRbqexCJCQhoUiZDzv6tBkSQ3BsAIOdAUhjf/32Od195w4ethMn
dtUuUXFsAjP32befp/ueDN5eppdn8th/yHOXEZAy+z4v71WDn43lt2+9dJ09IGP+Xo4d8CUL
2Rd5qNYFDKadHLeQqiAN9MGg00tznd5nfEoZrBH67WTVCw8Ko8dshd6XdbqcA3MxG+9ZQZOe
DTa7Lu+1WFsB2xECXv6vYpLepByPeYoc+omXPoihXzyTcwPblSJxHkBJpFFKJ7cvXazizeEm
N3Y2s6VaX/omv8mWTGjNZ2sPweAUTQFnqc1KMjtzsqBTUS3b9U1aT97K2a3vxJhNP+TZjEZn
d53v8qSOSiUZ7qIJ9xsnxbqc3CtKSWaNXU7Xp8OF2CiKF6UXyUHIyHbrnMvcFKdBdZiGkRj9
Z7TQ1sw2DmqKsr3ZTMfvW+ioQll+1CCxc0BIAaZRyaFZgGaS5M3VuwsnUlPzfJE7VonvtS2O
/9lBa3Xuf+t6jblewjhSLBCcSYRQLR4KsJchZVPGCiEzEeZjTn74ZEq5/9+E/dTLClirB92m
H7CZohD04hMU0RxfUm6yAs6HYXifalH5ZHvpRankkKXXxUJdstATqijR2z0JHSQPutM0dIyn
R6CxDoBEo+sOv6HkbKz2abnWFouxQBdayrFe9Ri+gDNR8Qq2cJ1iflN31MeDrvB3Ne3YIa00
AQzrIntPtYobUf8OUkuV6WQnlE9+6Pk35X/eCy8tF6JO9m6yRyjvb/2J7P/l/F/Uh9j/r37E
/k+fvzww+3//4OWrA9j/+4fPH+3/X+Pz+999Jqzms/ouSX5vduNNVcyDOrCcwxosq1FefSVP
HOzFCoar355bKQ881wdOt2nn8vPhnmkPQTzFbs2k50Uo/yD/OWv+s3nmD7/1gv0f+0TnH5rf
v6SPnzr/6SvD/7/Yf7XP+t+Hh4eP+P9f5dOcf1gC0Pv9cgL/e7b2d0Mxv/XYHz+//BOd/6Oz
j5+Gp+/++X3I+f989/n//HD/1cs0ffn54YsXL/afHwL/dfji5WP9/1/lo3XAmWgxOB2c90/S
j1dvToZHqfwzOL0YJP/mNd2/8QSXbvrnpZhmB19+eZAk6XpKzxdfdvnTzowZzblI0p2fl69e
oFppnfbvxSI+yoQhFaNxzgSM/ecHh18y9SJJB/d5tVIjERb9tFgsHJo0X9GSiTKD5Nlr6X6K
H4u8ToLXfGJZKu4979LBe3OXzeh5KGgush4IXBpwdSf/pmvyscplbJMctUwuWVmILdWW8FIv
Ghc8zfi8LsYzc69m38uX5iiuEqQ2jeAyKjUlhIPnEJDS1EvTNytGAqoM5fK3Z8Yknm5D//1C
7H3tarzMkOKTWxzkx7rCb4mP+dkzQsi/N2tYrfYmosBsLEa/malbayyjl1KHTHZkAgV4Tqnr
Y76XXdk5T6IghoEPguUZIAIPd7C0s+XirqyYFsw6umWyrHX7ZEhPL+BT0td2UWVrcjfwmtGr
kvhinxTXVVatduU4wX2ZZ6PeXsr4CFz/WRPJTrj0NmJ4CsqyB6oJmULzPGNpkFaOW9frrlT5
LW5HoFPDN7ALmkzmFX0aBHVsH1m9QXvxnmroKgmB+Ig6orOjR2ZjfOlTo51qTFJINDaWV/fS
tTs3Hor6bq8bugrODgVgSdNiJ6DcjyzYOEdgJfEXgVwqFtGrjIcppbaoUV6Hl07GeKOjRCMz
IhY4Xl/31+bdsOZYItbbHZXmSaKTrebuXJZ4dYFsCu4fuVzNXXEQJNdShIaslJUtZvOyGNfF
KBFiBXvCYuYzjQlpJ9oSBg6Srr/Xn0rsSpWHdEV9ipGMer0XFIxG/iDZXV4tcLeWuaKLELDE
8UTLuqLJ1h2NV5Kpirb8IX+SS/FWfsh/yJAh1/UntjZXL1EJugllPtzlOHbJGO5bzlgRM7e5
NMR+4CQdu/tLqKOYaxDMgxa2VlhXHCM6unp6yvjuGjnDCccD1g2kFpEXgh0R5Uk7fSGJMI6a
Pr+7fOrEwIRRjVuvlGDo0kt8axLeOLKFSixI8SB7usjn9Vfp04M9yiUVle1VF7JMnorhXCJ+
YmQSSaaHO5TXxhrV/HGSj+WYU+LVtYEt0XQ33mFNg/VtjPvjqBGN73Iv6N5V9vmk9qkQS6m5
QUrwjiPkahvBJVzw3KUw82wRxBnVYSuUnc7KJtFUXeEmQJImkB7u3onYMAdfqHfbHcIAEXFo
88zinRhfYtyijinIy0BzMA9OHOopNZlOd7tsSTHLJoi765QgZGQhRLRPKUsZCNdhqNdTXa5a
LeqW18l4JStrKzF59ATW0nKRWTV0nCT8PFl12UnMnhSHyow7MGoR91jLhYgQzt6E4xw/w5cM
ugNvJQchElVTjoQ7VjrjKDG6VJRBposeJCcmUcxGxX0xoms4La/JSLSToM904QHKhTZveNos
vSk0g8BxVeSiRa96xjQR2ltwm7uhLOI0GzE/m6jF1NfZJqTH7zroUIqsdNJ6YuoGuDxLxS2a
5zQ1vec62Bz7H04u5VMpM1Suecu6cvMVcdpN+F1pXTPG3aV/W0Lb6yX/nvykgiy/Xg7OP1yk
/dNj4KiPh1rLBUnTZlJ102PAyYdvrjSUKw9+ODsevrXYLga/bzGULaqSkSMXG/Ec6jHEQChn
IEZEFJAEsacFZO8cScwhdb1hO3flBMKlzlam2k5FA73O40zzZGu6vAxsu3rR02XvfNTxdUR7
BpSmm1BnCcOnWIjmgNF71KnDqSBcHNAL3lqCiFLtGfbRL2iDIJG8Ku5lx+5zXRAdfDPhSfbw
lZ5phbvKzKVbfdaWzbP14paJ3QcZUJnoJk2Kv9oQmAH4e0wytbPcIJuRtc/5c8eSiZzNZTbG
kj1FRWVhBLcLhKf9BatsY5kzo5C1PylEpbWfZ4nvTNqJe+9A82Tk3E6GYm1Go0phIVmddkR2
dOSg9IW936uCUNq6Emi041y0JkllEopnoyErdRg5vFYWS61suUAYnky5ltadVIBmKm8Tx0fE
S29M2TUdIEcsHoXWWsgpeyWJlHXGOXnXF8sokk8qGy0WlIjpBqEl3vNTYYP5HKrXjFaJxlKJ
cFLGJfPcMuK9HjAwvohKZNUS6vbc4c8ud8IkATEiuzpQKBWO4M8wWF1Xs2ae1LEeg+2NlWuo
zcBwyQmZihRYiiKGwHdBk9CVfizNvLhZlst6or0LzyEvF9qVb6wERsCS2CDjp5LmpBnnsUnc
TLJiqsVxXfK/1iqIheLZTLtL9LXaJdatlyiKOeEsIJIACLOYPeYWmk7wDJXIxj6MFIH20rXw
RU0/7YoS4WkrVsFdUkuHymsI6c/vVnUBPJLStR5mN9e0J1XwVtZKuxKI6XxBPYr0LwjdH9wy
d6WZlPO8oRzT7wziiFlV2wnGOaZxtkQ5W0pkBrAQOtydrLhrslTpNFY0ydrbjNAY/LbKKxc2
uYOEsI8tdCmkIQr3NM8XjkLxEL/L8a+0eGy21xgBitNg0Nt1RhRTIM+/kbXlwsocFVBHkrPk
YkJSIhuT6608R1twDjSCtWWEp0/1dBzXG+MgbRKj6M1G6wWAiZ4sM22LGZtRHBIMMNZ7XgSx
zu9qFXUxtqi9sZZLXZnaXd7CCGppVADMWi8ZVsHpGSKKp7GoRqEVENAuTcBFv07/Zs9V97D0
Lui14rmwEiB16gZrAPdUlUEMCZ+xyQujFQYb2YS6lKBR/shiKtK2c2GcCJCeJjY0DVJJtGo5
6mOqRiJpK8IyuSZM8sGqyaaIogSCVnqazcqlcBeFL1II81C0OF66leNlbMC+2G37PIVOO0FJ
INPAAn3YKdBxhBf2GoeFIoVx4lv1jsiBbLW5XWxh/cCYGM0nE5dfaM4BLvdF/rDGE9lKo+E9
HfyAUsjS1FeOpQsiWyF5RWtT4UFlEwFOGyhBF1+9BLPWkneVibU4kM9mU0MgWGzUbOdaY709
0dzdb8JnDeiq2KWyTa/sszkeNEYTr3qXiRnoOHE7AzAn+YoqQzuPZpdyyapiyBmoy5m0Rlcu
VKOKGmKjd2jlP6CNF6rO1qbvTWWN7w2vOGudQd1ZQp9xRLsOP4rmWYpoC8N/CCiiFg1NU96J
FHcNp/NyEV5I1oiOmM3QLApmZQr3chajpgmRVs2WJutChYw1VjhNaGkbbhTaW86FkvYKqAO4
cYeonadKgCvDNcCxeVAkEmxtZd24kql54uoOkS9ofOq0qnycVaMJEXG3xDA/QEyrc+xSXuxG
YQKMlP73RWCYdZMUS8Uo8v9RUa0XSew6coQ7KlhJO6kOVh0B8tzrVHbpjoZD0xXNmyT/Ia/U
/HXHmWVqL6pysnWxIwOqrESdm8Cb4eZUvVUVkDkPieEvNJozJTpsPMYqebMOAuQ8WAt/S0PJ
uq5FBmlXLe/URPa02OR9OVkyYTcBALOsAHRTnt7MT3XfhgtdV87/otEp2yRNw0rZKuUOf1xV
X5/C+uhhQqowdfXn+R5RNdffwafiPnBFKpLfQCPbIn+TCz9xBxzDcwPP7lCihBnAZWZnSl0a
sgKN/tS/Qa4N1BWWk7PdwHeTnLKuUp8yBaGh159BmGOQqkA1RkjXzryf2rh83W5NUGVNezpa
TVQ370ZaK6dZhapQS3cMNU5CCB3Vxl7LEnaDRrY5syycJ6rc3fQ+mxTaHBKVhDsv6H/Tea3y
rGKgpjErqCCRIay6ppCbBjWzBCQa0gzoUTGyCJdbCJB+ijsm1FwXLqbXLqWwrj1bWF/xVl5E
e3Na+0DFTwXwz9uD3euvM/kH9uBmF3XFBWojm5X6qSeeYINU9q/FoXZMGToKvWfZRMYyU35m
aoyFbdU9oDdNzJgMJpxSzLYNd4e7ESD08H4rXSX4an/y8HK+QUHNAtXBLJd1qSzp7WJ57dLh
WldfVBfmrsTm/W3DVNQjpmNhWFC3YxokJx5CMM48tW3LjCk0PY3nZLN40OqRC0dfe0/Yu3bp
8ZiNcQEyVI6WsJWKxmoRy26yrGmZZHVd3hTuEJMjgAqLLBtdhLJ3/rzyYQAQrUCySDWXXxhc
YX4yqj3wkE8mWaw4NDOSWb73BBDodkk9z7njuSuz3Y35xMeFIT5IDfPHJaGsmbt6glIbv/YU
Zru6C61lWaNrWiAJ9mkvTl/5jhrAVCia2ulTnSFGrAmVqprUYON7NsPEUiFwAFb1QlQ3Opl4
i2xr/rCUZFWXM+otHHPoKjG1PbMT6iUzotUTIX+7oS1ErUPFik4AojXmJyOhIxOhSc8k37Lk
eA1Fkxropjat1t9Koa4La8Yo1xrYoD5Xt6mMsjHkRVDPr5NtamWLS1r+Ubkc30W8vbCIuTo5
p/Oc6TFbhrDmLooWA1GDNP280Rm0Lh8cQequEfuPTvRQHna7LpEopYJ68x/mcOTSgDJR7+w8
UlUQzYSDSahivkio4zxQGyx3dr+7d/BPxJWUBhkrypYQAwsTZpAiBTayFffcMqwknENfYKjQ
7bqv6rPiYniYndsLCeEaWuQTDPE3Ry4UVQO/CQPj0eE2wbwBL/YBiD2IQJf873Y5Uc4yKZh+
Bun1QrfOzbvY2rRLgdoWSF3AKenBaZKOwS3IbMP0LYVHY5hjmPjqtm2Hcs2lJyx8x8aUzIRb
j30o9gYWb+ZWWcUg3V1xXSzUVT/JHkL0vowvOVojo0t4L3A3VBcQGB0Q07CqdtbqhuLrjq9t
TvY9de4g4HgTqEb7b9X+8D3WGwQRpobH0WFGf09gT0cchp+sLeKaiWNQh5eWgYhSlKag/Jiq
/xMzXsSghrUDZMQPE9lPo7O0xAPJ9osiRfQQt32JUYDfxyWn+zstAZosIhhDe80cQmHsqRDJ
YJ7L22XFeFULcBIKprlT/UkajE3PoFMGQLqWpbhjiKuXtE+SIVSszm0OXfBGr6X2E2ghpYgd
cx5rFtkrFANQwU53CrLaPDIAISBW+3fLEespp6qkRNapxpwT0UQhcXJ/6Nb20+MH8NekTzXa
PC0MW2jxas1L2+smERVSGeY6khBAO08N/4JJ6aiYSSEDF3PZO2449Z7LaUD95JgsTNMPXayd
kVbOGcQFnJ/oN4jG3e8q5MLwT3g99umXpo3XQO0IedXFdDmRY5prsEgDGCJDxqZXNlw/icM2
EVovrxbqfo9eM9G/sYlQvZ0wd5w9C/tvIpMy392AngmVPBUjmlblSsyE1TNCCqLDHekJ3osw
P1V79VrTpo6DhVhGhWYvqts+/CVmJLUKmYdOkZynXRuAo/LlvZZFstr1bR4YFWpARL2C0Aru
IG7yjwxfdbgo6LPhkJL/vMsn0KTVGAaSbqaHMqeWp6KXTeAw3iwnqHtYVDfLqaZrK4e7ziZx
Lm3UfIRETdQp6fEUfygKS6whVw1AOVMSSuJuEUEdtlxu82VFDrbF5yY7szT5zL/01Efok7qB
VcDRL6S6Mu8Z3XUO1DNfnToOChbNZyP0ZuuTr9udM4GdKuOkNUKP8hmSBpMeV9biwmCYjYHd
2mJV+rvBv5qglD85iYr4ucIznPrndMljwdL0A/cxL1HmPkBykjFwHXKsletYN8EUf0AIv2IM
Eui+jSHlo8SpnazLbBKiEY2fay3OutDi6nHerDnTE33ptTlRUUnbwr0EUX02Kme6ASORPiMi
Swm1Sus70gyUQYr3lrMgjNXH1zAjG6TCTwJewtigSUJlxHdlQZ3wcu3UxGRKSBwGil7g3SfA
6cGMxGtZhvxeDwBqY69LK5Wq9WKDPdOI+KLnwbV1P8Vnhnpd41isQuDwCYQPHBxKw6jiHT6F
S6SI+q9XTWQrttOVRzfqyAaWCFyRplfdGsemGUCOno1G6ncAEch2j3M8Pr9jBL01xQj0InJN
Y3GJMuIwla5CM7NF+9VWOoC6c2ZUAlATJmkWQlnHsrYO8hFE4kyDU7jaudvmxaLkl3KCESKp
ydCjIco5F6p0B6OFH6/L0QbKgMrLlyyxsRuKjpVy9EWV3xeM3uqWA9RsN4bUftHIrms3qANA
i8VxwpULaXqBucVt8PCAMEXCF2DuMvZ6XlRRUT+hJxxce0PTIzBCLVGDF/ReD7J4BRz5laaK
oNQwhxAiIZBUrv2yFFkY+Ffhb8QWyh4vZdLgi/6E3rLc4EPdNqY3h3dSZWvPbhgSyiojQJ1J
2g6Yd+sSlk63seIosh2j0TjPIwdqW6F2kJhHCH1QZeWogVZX22+YSbaQw8bcm4CGLsJq2xKs
BclWAcNSup7vr/Da1B+/7ybKyVDo0n7PlUfHoEang7rCBv6EWDjlvzEKtbb4XesErynVoRjS
2q0uKh8Sw9BDfW8saVMNgxQI8ciYzf3Eyv/YJTKtK4CQwlFOcxyyOqE8CE7GOiCeLU0DQozr
7tf3CMmPmrEAMj4uswlPN89ede9kp2oBS5ySpuT9xglQ+9WybeUBtrNqHNMy2OzI/FFsA644
MDESXhkrP2FRDkt1Oj0Ltwlx/w966ZvBUf/qYsBKRR/Pz96d9z+g5JehYo/Tt+eDQXr2Nj16
3z9/N+jiufMBnojbAkY2aqCLMjb4e/DXy8HpZfpxcP5heHkprb35lPY/fpTG+29OBulJ/1tZ
zcFfjwYfL9Nv3w9OkzM0/+1QxnNx2ccLw9P02/Ph5fD0ndVS+vjpfPju/WX6/uzkeHBOtO5n
0jtf1KuNBheJjOOb4XF7Up3+hQy7E65W8sFjcrhm6S/D0+NuOhiyocFfP54PLmT+ibQ9/CAj
HsiPw9Ojk6tjAoHfXGlNH1bZk3FennFp/FlvXQYj7W/cyQTk8M+4lIlLKI3Igp8PL/6S9i8S
W9ivr/qhIVldaeND//SIG7W2kZhu+unsClJD5n1yjAcSfwALNUiPB29RNPob2V55Urq5uPow
sPW+uOQCnZykp4MjGW///FN6MTj/ZniEdUjOBx/7Q1l+YKTPz7X0tPKW5z1snlDJ4BvQwNXp
CWZ7Pvj6SuazhRLQRv+dUBsWM9r35NuhdI4dWt/8Ll+RH5rN/yRkhALpnxSY/cnIQ4YZkNtt
qhCiaKiz/+YMa/BGxjPksGQgWBBs0XH/Q//d4KKbBCJg1wYm76YXHwdHQ/yH/C6kJ3t9oqsi
p+jrK+yifGGNpH3ZTkwNdGhbhjMIWjt1GpG+18/l06bvNfoDXZycXYDYpJPLfsoRy7/fDPD0
+eBU1ovHqX90dHUuRwtP4A0ZzcWVHLbhKTclwXx5mofnx36euM7p2/7w5Op8g8akZ71oc6C0
FjbEiexir0saSIdvpauj97Z7aevUfkrfy1a8Gchj/eNvhuA82k8iZ+FiaGtyZi3YOpKxMddU
5sfntwD4gf3vzwHOKX74Ck5cyAEtT6t+1ktqAfLlJ7DdU1F5TNbVoGOTjyMRr5Ny3lzz3qAp
oyw3w+qZyBwzC6ReJGKJqLNsWQcppAae2d05K0yt1DN9B0NDVR9Fu1MSFYukLRFUEoa0nY07
9KKE0BAydiei58W5Y3axyCzw1ChIAdLr+qM6I1KrvFRnt5gaRhzenjb3khqMiDAci7TwWixP
GdU8FEUOippwn68sciUqfG3KWgM5JpAHTbGN+MY5j/lTk+8EpaCDkqXmvErnJe0gLXuXWxIs
AwYG9UMaE9QAg0L+EevJ9x03EC3Ak1pLzGvT12KB3GpVOUKK9EY/YsP/xLbW06pXK2nfa9Sr
4vMn7fUX3ZNo/mNFUf4DdyWma3clBsTez78v0TuJ7ktkKz/rzsRtC/D33ptIyMgvuzsRTfyy
+xNDktHPvkMRb/zyexQVIvGP36WI9zfvU/x5KfxIRgFOCV6BGBYCz5myWy9/CyIWBZmVD6ty
JuPXFEDR91EDfqKuzhZCo4VI7Tov9ESSDMtWBRCv1umF+5qAR9S3XNCIYRZFC9sqByY3BNW7
mSjV96rNOzm/lKltObvtk7vxthaC5A7031ycnYi2cfIp1pRfkwJs8/VG7L8xW/XhSa85BOun
v5EzZPz5BP1olbgWM2ALljsV/EVugr2Ou7t5Eg+kp1CVu9Uchh3jWg3K28fHMYS3jVo907aV
TdKyG3fmm53dMpRi0Y+mP4aKa3g1V3BoIMbGCLDYZfQoRMlOW4dmuUvqmedpv86TaSlNPruR
EXxPR8Y0ny1lwfJp/ewZuDaN5xr1/9I4x799xSnBeEg/5iO4Z7RcoS6WZ7oH+LG9PUVdWM3d
rpIaJvtEYxszRbAjuIzEucYZ16TcdJrMFNc1UKgUqfG1Zmi+N2R6BtzEfCIigqgpvgMy1fyK
T+WqHK1muZ9oyD8tWmw+8WwSewN5QqCNGMO1zqWhv0V0/gQBMWIE5TTWmsLLUtYOfKn3ghNN
OvszRpO+R6H/igzvjwodQbK3UMnlSk5aOftTNz0QvawqJqw+AgVFf+iiQkddeE7XN0JB5snd
wWSDX8UiRY1PI5RonkfejCTKfA1FBkJYrYpZUYagbFUiJg1mw1ISwSmTOB6cGZlg8iqZGG7U
kYhSQTRX3GOr6qnjUBJr3J1GyhQeHBbqadwjUd48Y2ZLdYtke3WLTWfmb1295vHzSz9R/afh
qZhzJyf//D5+4v6H588P9v3+h8NXuAvw4PDF4eP9f7/KZ+P+h6+vhkd/SY0WNi+AiG9siC6A
D+pP7wWFqvy7dwhFbiiCYWLXQSThOohu+8YIYe4He2l6NUNPUYG5ZzKi3mL8P+o8Vqba9Gnm
4aJiWuRzaeBDA4ahwed7itHMs8XNXagOrY9UdYB3qrEXCrpDqBmCGkH9EhyVuup0rYvm0kNZ
kygZn0/opWahloNDmsF5P51dnadeJ51yt5ce57eolwuO3ZmuRtedXnoUKvV+EJGJHnvJ4Z5d
gqOS1ufF5bHQZpSu3NEHnn3cT/+4uSCd5HNp7Tw3ARwe6KVPj3mBmGgTLJxSsjg664QTGbaX
vJAXj5pyPPPxf91dZz3caG0phaJVBS+Dgrn4p5WxfvpJhO3QK3QYWLteip6117NyWvLeUPP5
6jUfgzT5FV/g5Vv+gUoizONLIamXXwjl7IO5vHjR83/2+dSikiOfvOS01d8Q7q1SRO+qQXU+
KwDoBaY8X3gd5730dpKNheJeSRNUjTs0LqT3jtosupSRGlQnX+BZSw1O14C6pKNZKJbaVE7d
CwlJ/NIFuJORQjIKRHFd7fKbUpIvMb2lftlZzjsGqpZzts+rzDDrCZMQUXn70nZX94couspf
RlVQf113revbxvLwSxTJlzZ21H7eWfn5QlSSP4sunh6mz/f3fxVW//h5/Dx+Hj+Pn8fP4+fx
8/h5/Dx+Hj+Pn8fP/9fP/wKykq3cAMgAAA==

--Boundary_(ID_GeYGc69fE1/bkYLTPwOGFg)--

************


From owner-pgsql-hackers@hub.org Mon Jan  3 13:47:07 2000
Received: from hub.org (hub.org [216.126.84.1])
	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA23987
	for <pgman@candle.pha.pa.us>; Mon, 3 Jan 2000 14:47:06 -0500 (EST)
Received: from localhost (majordom@localhost)
	by hub.org (8.9.3/8.9.3) with SMTP id OAA03234;
	Mon, 3 Jan 2000 14:39:56 -0500 (EST)
	(envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Mon, 3 Jan 2000 14:39:49 -0500
Received: (from majordom@localhost)
	by hub.org (8.9.3/8.9.3) id OAA03050
	for pgsql-hackers-outgoing; Mon, 3 Jan 2000 14:38:50 -0500 (EST)
	(envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from ara.zf.jcu.cz (zakkr@ara.zf.jcu.cz [160.217.161.4])
	by hub.org (8.9.3/8.9.3) with ESMTP id OAA02975
	for <pgsql-hackers@postgreSQL.org>; Mon, 3 Jan 2000 14:38:05 -0500 (EST)
	(envelope-from zakkr@zf.jcu.cz)
Received: from localhost (zakkr@localhost)
	by ara.zf.jcu.cz (8.9.3/8.9.3/Debian/GNU) with SMTP id UAA19297;
	Mon, 3 Jan 2000 20:23:35 +0100
Date: Mon, 3 Jan 2000 20:23:35 +0100 (CET)
From: Karel Zak - Zakkr <zakkr@zf.jcu.cz>
To: P.Marchesso@videotron.ca
cc: pgsql-hackers <pgsql-hackers@postgresql.org>
Subject: [HACKERS] replicator
Message-ID: <Pine.LNX.3.96.1000103194931.19115A-100000@ara.zf.jcu.cz>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-pgsql-hackers@postgresql.org
Status: OR


Hi,

I looked at your (Philippe's) replicator, but I don't fully understand
your replication concept.


    node1:  SQL --IPC--> node-broker
                       |
                      TCP/IP
                       |
                    master-node --IPC--> replicator
                                         |   |   |
                                           libpq
                                         |   |   |
                                       node2 node..n

(Is this the right picture?)

If I understand correctly, all nodes connect to the master node, and the
"replicator" on this master node replicates the data. But the master node
is a critical point in this concept: if the master node does not work,
replication for *all* nodes is lost. Hmm.. but I want to use replication
for high-availability applications...

IMHO there is a problem with node registration / authentication on the
master node. Why isn't the concept more direct? For example:

	SQL --IPC--> node-replicator
			|  |  |
		     via libpq send data to all nodes with
                     current client/backend auth.

	(no master node exists; every node has a connection to every other node)


Using the replicator as an external process and copying data from SQL to
this replicator via IPC is a very good idea (of yours).

							Karel


----------------------------------------------------------------------
Karel Zak <zakkr@zf.jcu.cz>              http://home.zf.jcu.cz/~zakkr/

Docs:        http://docs.linux.cz                    (big docs archive)
Kim Project: http://home.zf.jcu.cz/~zakkr/kim/        (process manager)
FTP:         ftp://ftp2.zf.jcu.cz/users/zakkr/        (C/ncurses/PgSQL)
-----------------------------------------------------------------------


************

From owner-pgsql-hackers@hub.org Tue Jan  4 10:31:01 2000
Received: from renoir.op.net (root@renoir.op.net [207.29.195.4])
	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id LAA17522
	for <pgman@candle.pha.pa.us>; Tue, 4 Jan 2000 11:31:00 -0500 (EST)
Received: from hub.org (hub.org [216.126.84.1]) by renoir.op.net (o1/$Revision: 1.6 $) with ESMTP id LAA01541 for <pgman@candle.pha.pa.us>; Tue, 4 Jan 2000 11:27:30 -0500 (EST)
Received: from localhost (majordom@localhost)
	by hub.org (8.9.3/8.9.3) with SMTP id LAA09992;
	Tue, 4 Jan 2000 11:18:07 -0500 (EST)
	(envelope-from owner-pgsql-hackers)
Received: by hub.org (bulk_mailer v1.5); Tue, 4 Jan 2000 11:17:58 -0500
Received: (from majordom@localhost)
	by hub.org (8.9.3/8.9.3) id LAA09856
	for pgsql-hackers-outgoing; Tue, 4 Jan 2000 11:17:17 -0500 (EST)
	(envelope-from owner-pgsql-hackers@postgreSQL.org)
Received: from ara.zf.jcu.cz (zakkr@ara.zf.jcu.cz [160.217.161.4])
	by hub.org (8.9.3/8.9.3) with ESMTP id LAA09763
	for <pgsql-hackers@postgreSQL.org>; Tue, 4 Jan 2000 11:16:43 -0500 (EST)
	(envelope-from zakkr@zf.jcu.cz)
Received: from localhost (zakkr@localhost)
	by ara.zf.jcu.cz (8.9.3/8.9.3/Debian/GNU) with SMTP id RAA31673;
	Tue, 4 Jan 2000 17:02:06 +0100
Date: Tue, 4 Jan 2000 17:02:06 +0100 (CET)
From: Karel Zak - Zakkr <zakkr@zf.jcu.cz>
To: Philippe Marchesseault <P.Marchesso@Videotron.ca>
cc: pgsql-hackers <pgsql-hackers@postgreSQL.org>
Subject: Re: [HACKERS] replicator
In-Reply-To: <38714B6F.2DECAEC0@Videotron.ca>
Message-ID: <Pine.LNX.3.96.1000104162226.27234D-100000@ara.zf.jcu.cz>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-pgsql-hackers@postgreSQL.org
Status: OR


On Mon, 3 Jan 2000, Philippe Marchesseault wrote:

> So it could become:
>
> SQL --IPC--> node-replicator
>                            |   |   |
>       via TCP send statements to each node
>                       replicator (on local node)
>                            |
>          via libpq send data to
>         current (local) backend.
>
> >  (not exist any master node, all nodes have connection to all nodes)
>
> Exactly, if the replicator dies only the node dies, everything else keeps
> working.


 Hi,

 I explored the replication concepts of Oracle and Sybase a little (in the
manuals). (Does anyone know of interesting links or publications about it?)

 First, I am sure it is premature to write replication for PgSQL now, while
we don't have an exact concept for it. It needs more suggestions from more
developers. We first need answers to the following questions:

	1/ Which replication concept should we choose for PG?
	2/ How do we manage transactions across nodes? (We need to define a
           replication protocol for this.)
	3/ How do we integrate replication into the current PG transaction code?

My idea (dream :-) is replication that allows full read-write use of all
nodes and that uses the current transaction method in PG, so that there is
no difference between several backends on one host and several backends on
several hosts. That gives "global transaction consistency".

Transactions are now managed via IPC (one host); my dream is to manage
them in a similar way, but between several hosts via TCP. (And to optimize
this: transfer committed data/commands only.)


Any suggestions?


-------------------
Note:

(transaction oriented replication)

 Sybase - I. model (only one node is read-write)

	 primary SQL data (READ-WRITE)
                |
	 replication agent (transaction log monitoring)
		|
	 primary distribution server (one or more repl. servers)
	        |               /  |  \
                |            nodes (READ-ONLY)
                |
         secondary dist. server
                          /  |  \
                       nodes (READ-ONLY)


       If the primary SQL is read-write and the other nodes are *read-only*
       => the system keeps working well while the connection is down (data
          are saved to the replication log, and once the connection is
          available again the log is written to the node).
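
The store-and-forward behaviour described above can be sketched roughly as
follows. This is only an illustration under stated assumptions: the
ReplicationLog class and all of its method names are hypothetical and do not
come from Sybase or PostgreSQL. Committed changes are appended to a log, and
each read-only node remembers how far it has applied that log, so a node that
was unreachable catches up when its connection returns.

```python
class ReplicationLog:
    """Hypothetical in-memory replication log for one read-write primary."""

    def __init__(self):
        self.entries = []    # committed changes, in commit order
        self.applied = {}    # node name -> number of log entries already applied
        self.online = {}     # node name -> True while the connection is up
        self.replicas = {}   # node name -> list standing in for the node's data

    def add_node(self, name):
        self.applied[name] = 0
        self.online[name] = True
        self.replicas[name] = []

    def set_online(self, name, is_online):
        """Mark a connection up or down; a returning node replays what it missed."""
        self.online[name] = is_online
        if is_online:
            self._catch_up(name)

    def commit(self, change):
        """Record a committed change and push it to every reachable node."""
        self.entries.append(change)
        for name in self.replicas:
            if self.online[name]:
                self._catch_up(name)

    def _catch_up(self, name):
        # Apply every log entry the node has not seen yet, in order.
        start = self.applied[name]
        self.replicas[name].extend(self.entries[start:])
        self.applied[name] = len(self.entries)
```

Because only the primary accepts writes in this model, replaying the log in
order is enough; no conflict handling is needed.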


 Sybase - II. model (all nodes read-write)

     	    SQL data 1 --->--+                        NODE I.
                |            |
                ^            |
	        |     replication agent 1 (transaction log monitoring)
                V        |
		|        V
                |        |
         replication server 1
                |
		^
                V
                |
         replication server 2                        NODE II.
                |         |
                ^         +-<-->--- SQL data 2
                |                    |
               replication agent 2 -<--


Sorry, I am not sure I redrew the previous picture entirely correctly.

								Karel


************

From pgsql-hackers-owner+M3133@hub.org Fri Jun  9 15:02:25 2000
Received: from hub.org (root@hub.org [216.126.84.1])
	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id PAA22319
	for <pgman@candle.pha.pa.us>; Fri, 9 Jun 2000 15:02:24 -0400 (EDT)
Received: from hub.org (majordom@localhost [127.0.0.1])
	by hub.org (8.10.1/8.10.1) with SMTP id e59IsET81137;
	Fri, 9 Jun 2000 14:54:14 -0400 (EDT)
Received: from ultra2.quiknet.com (ultra2.quiknet.com [207.183.249.4])
	by hub.org (8.10.1/8.10.1) with SMTP id e59IrQT80458
	for <pgsql-hackers@postgresql.org>; Fri, 9 Jun 2000 14:53:26 -0400 (EDT)
Received: (qmail 13302 invoked from network); 9 Jun 2000 18:53:21 -0000
Received: from 18.67.tc1.oro.pmpool.quiknet.com (HELO quiknet.com) (pecondon@207.231.67.18)
  by ultra2.quiknet.com with SMTP; 9 Jun 2000 18:53:21 -0000
Message-ID: <39413D08.A6BDC664@quiknet.com>
Date: Fri, 09 Jun 2000 11:52:57 -0700
From: Paul Condon <pecondon@quiknet.com>
X-Mailer: Mozilla 4.73 [en] (X11; U; Linux 2.2.14-5.0 i686)
X-Accept-Language: en
MIME-Version: 1.0
To: ohp@pyrenet.fr, pgsql-hackers@postgresql.org
Subject: [HACKERS] Re: Big project, please help
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: OR

Two way replication on a single "table" is available in Lotus Notes. In
Notes, every record has a time-stamp, which contains the time of the
last update. (It also has a creation timestamp.) During replication,
timestamps are compared at the row/record level, and compared with the
timestamp of the last replication. If, for corresponding rows in two
replicas, the timestamp of one row is newer than the last replication,
the contents of this newer row are copied to the other replica. But if
both of the corresponding rows have newer timestamps, there is a
problem. The Lotus Notes solution is to:
  1. send a replication conflict message to the Notes Administrator,
which message contains full copies of both rows.
  2. copy the newest row over the older row in the replicas.
  3. there is a mechanism for the Administrator to reverse the default
decision in 2, if the semantics of the message history, or off-line
investigation indicates that the wrong decision was made.
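
The timestamp comparison and the default "newest row wins" rule above can be
sketched as follows. This is a minimal illustration, not Notes code: the
replicate function, the dictionary layout, and the tuple format are all
assumptions made for the example. Rows that exist in only one replica are
ignored here, since the description covers corresponding rows only.

```python
def replicate(replica_a, replica_b, last_replication):
    """Synchronise two replicas in place and return the list of conflicts.

    Each replica maps a row key to a (timestamp, contents) tuple
    (a hypothetical layout chosen for this sketch).
    """
    conflicts = []
    for key in sorted(set(replica_a) & set(replica_b)):
        row_a, row_b = replica_a[key], replica_b[key]
        a_changed = row_a[0] > last_replication
        b_changed = row_b[0] > last_replication
        if a_changed and b_changed:
            # Step 1: both rows changed, so keep full copies of both
            # for the administrator's conflict message.
            conflicts.append((key, row_a, row_b))
        if a_changed or b_changed:
            # Step 2 (and the no-conflict case): the newest row wins.
            newest = max(row_a, row_b, key=lambda row: row[0])
            replica_a[key] = replica_b[key] = newest
    return conflicts
```

Since each conflict entry keeps both original rows, an administrator could
still reverse the default decision later (step 3) by writing the losing row
back to both replicas.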

In practice, the Administrator is not overwhelmed with replication
conflict messages because updates usually only originate at the site
that originally created the row. Or updates fill only fields that were
originally 'TBD'. The full logic is perhaps more complicated than I have
described here, but it is already complicated enough to give you an idea
of what you're really being asked to do. I am not aware of a supplier of
relational database who really supports two way replication at the level
that Notes supports it, but Notes isn't a relational database.

The difficulty of the position that you appear to be in is that
management might believe that the full problem is solved in brand X
RDBMS, and you will have trouble convincing management that this is not
really true.


From pgsql-hackers-owner+M2401@hub.org Tue May 23 12:19:54 2000
Received: from news.tht.net (news.hub.org [216.126.91.242])
	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA28410
	for <pgman@candle.pha.pa.us>; Tue, 23 May 2000 12:19:53 -0400 (EDT)
Received: from hub.org (majordom@hub.org [216.126.84.1])
	by news.tht.net (8.9.3/8.9.3) with ESMTP id MAB53304;
	Tue, 23 May 2000 12:00:08 -0400 (EDT)
	(envelope-from pgsql-hackers-owner+M2401@hub.org)
Received: from gwineta.repas.de (gwineta.repas.de [193.101.49.1])
	by hub.org (8.9.3/8.9.3) with ESMTP id LAA39896
	for <pgsql-hackers@postgresql.org>; Tue, 23 May 2000 11:57:31 -0400 (EDT)
	(envelope-from kardos@repas-aeg.de)
Received: (from smap@localhost)
	by gwineta.repas.de (8.8.8/8.8.8) id RAA27154
	for <pgsql-hackers@postgresql.org>; Tue, 23 May 2000 17:57:23 +0200
Received: from dragon.dr.repas.de(172.30.48.206) by gwineta.repas.de via smap (V2.1)
	id xma027101; Tue, 23 May 00 17:56:20 +0200
Received: from kardos.dr.repas.de ([172.30.48.153])
  by dragon.dr.repas.de (UCX V4.2-21C, OpenVMS V6.2 Alpha);
	Tue, 23 May 2000 17:57:24 +0200
Message-ID: <010201bfc4cf$7334d5a0$99301eac@Dr.repas.de>
From: "Kardos, Dr. Andreas" <kardos@repas-aeg.de>
To: "Todd M. Shrider" <tshrider@varesearch.com>,
        <pgsql-hackers@postgresql.org>
References: <Pine.LNX.4.04.10005180846290.15739-100000@silicon.su.valinux.com>
Subject: Re: [HACKERS] failing over with postgresql
Date: Tue, 23 May 2000 17:56:20 +0200
Organization: repas AEG Automation GmbH
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: 8bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.00.2314.1300
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2314.1300
X-Mailing-List: pgsql-hackers@postgresql.org
Precedence: bulk
Sender: pgsql-hackers-owner@hub.org
Status: OR

For a SCADA system (Supervisory Control and Data Acquisition) which consists
of one master and one hot-standby server I have implemented such a
solution. Client workstations (NT and/or UNIX) are connected to these UNIX
servers. The database client programs run on both the client and the server
side.

When developing this approach I had two goals in mind:
1) Not to become dependent on the PostgreSQL sources, since they change very
dynamically.
2) Not to become dependent on the fe/be protocol, since there are
discussions about changing it.

So the approach is quite simple: Forward all database requests to the
standby server on TCP/IP level.

On both servers the postmaster listens on port 5433 and not on 5432. On
standard port 5432 my program listens instead. This program forks twice for
every incoming connection. The first instance forwards all packets from the
frontend to both backends. The second instance receives the packets from all
backends and forwards the packets from the master backend to the frontend.
So a frontend running on a server machine connects to port 5432 of
localhost.
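
The two forwarding directions can be sketched as below. This is a simplified
illustration, not the program described here: the function names are
hypothetical, each call moves a single chunk of data rather than looping in
its own forked process, and the master is assumed to be the first backend in
the list. Like the approach above, it copies bytes without interpreting the
fe/be protocol.

```python
import socket


def forward_request(frontend, backends, bufsize=4096):
    """Copy one chunk from the frontend to every backend; return the bytes."""
    data = frontend.recv(bufsize)
    for backend in backends:
        backend.sendall(data)
    return data


def forward_reply(backends, frontend, master=0, bufsize=4096):
    """Read one chunk from every backend; pass only the master's on."""
    replies = [backend.recv(bufsize) for backend in backends]
    # The standby's reply is read and discarded so its socket buffer drains.
    frontend.sendall(replies[master])
    return replies[master]
```

A real forwarder would run both directions continuously and would have to
cope with a backend that closes its connection or falls behind.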

Another program runs on the client machine (on NT as a service). This
program forks twice for every incoming connection. The first instance
forwards all packets to port 5432 of the current master server and the
second instance forwards the packets from the master server to the frontend.

During standby computer startup the database of the master computer is
dumped, zipped, copied to the standby computer, unzipped and loaded into
that database.
If a standby startup took place, all client connections are aborted to allow
a login into the standby database. The frontends need to reconnect in this
case. So the database of the standby computer is always in sync.

The disadvantage of this method is that a query cannot be canceled in the
standby server since the request key of this connection gets lost. But we
can live with that.

Both programs are able to run on Unix and on (native!) NT. On NT threads
are created instead of forked processes.

This approach is simple, but it is effective and it works.

We hope to survive this way until real replication is implemented in
PostgreSQL.

Andreas Kardos

-----Original Message-----
From: Todd M. Shrider <tshrider@varesearch.com>
To: <pgsql-hackers@postgresql.org>
Sent: Thursday, 18 May 2000 17:48
Subject: [HACKERS] failing over with postgresql


>
> is anyone working on or have working a fail-over implementation for the
> postgresql stuff. i'd be interested in seeing if and how any might be
> dealing with just general issues as well as the database syncing issues.
>
> we are looking to do this with heartbeat and lvs in mind. also if anyone
> is load balancing their databases that would be cool to talk about too.
>
> ---
> Todd M. Shrider VA Linux Systems
> Systems Engineer
> tshrider@valinux.com www.valinux.com
>


From pgsql-hackers-owner+M3662@postgresql.org Tue Jan 23 16:23:34 2001
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id QAA04456
	for <pgman@candle.pha.pa.us>; Tue, 23 Jan 2001 16:23:34 -0500 (EST)
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
	by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0NLKf004705;
	Tue, 23 Jan 2001 16:20:41 -0500 (EST)
	(envelope-from pgsql-hackers-owner+M3662@postgresql.org)
Received: from sectorbase2.sectorbase.com ([208.48.122.131])
	by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0NLAe003753
	for <pgsql-hackers@postgresql.org>; Tue, 23 Jan 2001 16:10:40 -0500 (EST)
	(envelope-from vmikheev@SECTORBASE.COM)
Received: by sectorbase2.sectorbase.com with Internet Mail Service (5.5.2653.19)
	id <DG1W4Q8F>; Tue, 23 Jan 2001 12:49:07 -0800
Message-ID: <8F4C99C66D04D4118F580090272A7A234D32AF@sectorbase1.sectorbase.com>
From: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
To: "'dom@idealx.com'" <dom@idealx.com>, pgsql-hackers@postgresql.org
Subject: RE: [HACKERS] Re: AW: Re: MySQL and BerkleyDB (fwd)
Date: Tue, 23 Jan 2001 13:10:34 -0800
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2653.19)
Content-Type: text/plain;
	charset="iso-8859-1"
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Status: ORr

>   I had thought that the pre-commit information could be stored in an
> auxiliary table by the middleware program ; we would then have
> to re-implement some sort of higher-level WAL (I thought of the list
> of the commands performed in the current transaction, with a sequence
> number for each of them that would guarantee correct ordering between
> concurrent transactions in case of a REDO). But I fear I am missing

This wouldn't work for READ COMMITTED isolation level.
But why do you want to log commands into WAL where each modification
is already logged in, hm, correct order?
Well, it makes sense if you're looking for async replication, but
you don't need two-phase commit for that, and you should be aware of
the problems with READ COMMITTED isolation level.

Back to two-phase commit - it's the easiest part of the work required for
distributed transaction processing.
Currently we place a single commit record in the log, and the transaction
is committed when this record (and so all other transaction records)
is on disk.
Two-phase commit:

1. For the 1st phase we'll place a "prepared-to-commit" record into the log,
   and this phase will be complete once the record is flushed to disk.
   At this point the transaction may be committed at any time, because
   all its modifications are logged. But it may still be rolled back
   if this phase failed on other sites of the distributed system.

2. When all sites are prepared to commit we'll place a "committed"
   record into the log. No need to flush it, because in the event of a
   crash the recoverer will have to contact the other sites to learn
   the status of all "prepared" transactions anyway.

That's all! It is really hard to implement distributed lock- and
communication- managers but there is no problem with logging two
records instead of one. Period.
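
The two log records can be illustrated with a small sketch. It is not
PostgreSQL code: the TransactionLog class, its fields, and the
two_phase_commit function are hypothetical names chosen for the example, and
failure handling in the first phase (rolling back when some site cannot
prepare) is deliberately left out.

```python
class TransactionLog:
    """Hypothetical per-site log; `durable` holds records that survive a crash."""

    def __init__(self):
        self.buffer = []     # records written but not yet flushed to disk
        self.durable = []    # records that have been flushed to disk

    def append(self, xid, record, flush):
        self.buffer.append((xid, record))
        if flush:
            self.durable.extend(self.buffer)
            self.buffer.clear()

    def crash(self):
        """Simulate a crash: everything that was not flushed is lost."""
        self.buffer.clear()

    def in_doubt(self):
        """XIDs with a durable "prepared" record but no durable "committed" one."""
        prepared = {xid for xid, rec in self.durable if rec == "prepared"}
        committed = {xid for xid, rec in self.durable if rec == "committed"}
        return prepared - committed


def two_phase_commit(xid, sites):
    # Phase 1: every site flushes a "prepared" record to disk.
    for site in sites:
        site.append(xid, "prepared", flush=True)
    # Phase 2: every site logs "committed"; no flush is forced.
    for site in sites:
        site.append(xid, "committed", flush=False)
```

After a crash, a site's in_doubt() set is exactly the list of transactions
whose outcome it has to ask the other sites about.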

Vadim

From pgsql-hackers-owner+M3665@postgresql.org Tue Jan 23 17:05:26 2001
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA05972
	for <pgman@candle.pha.pa.us>; Tue, 23 Jan 2001 17:05:24 -0500 (EST)
Received: from mail.postgresql.org (webmail.postgresql.org [216.126.85.28])
	by mail.postgresql.org (8.11.1/8.11.1) with SMTP id f0NM31008120;
	Tue, 23 Jan 2001 17:03:01 -0500 (EST)
	(envelope-from pgsql-hackers-owner+M3665@postgresql.org)
Received: from candle.pha.pa.us (candle.navpoint.com [162.33.245.46])
	by mail.postgresql.org (8.11.1/8.11.1) with ESMTP id f0NLsU007188
	for <pgsql-hackers@postgresql.org>; Tue, 23 Jan 2001 16:54:30 -0500 (EST)
	(envelope-from pgman@candle.pha.pa.us)
Received: (from pgman@localhost)
	by candle.pha.pa.us (8.9.0/8.9.0) id QAA05300;
	Tue, 23 Jan 2001 16:53:53 -0500 (EST)
From: Bruce Momjian <pgman@candle.pha.pa.us>
Message-Id: <200101232153.QAA05300@candle.pha.pa.us>
Subject: Re: [HACKERS] Re: AW: Re: MySQL and BerkleyDB (fwd)
In-Reply-To: <8F4C99C66D04D4118F580090272A7A234D32AF@sectorbase1.sectorbase.com>
	"from Mikheev, Vadim at Jan 23, 2001 01:10:34 pm"
To: "Mikheev, Vadim" <vmikheev@SECTORBASE.COM>
Date: Tue, 23 Jan 2001 16:53:53 -0500 (EST)
CC: "'dom@idealx.com'" <dom@idealx.com>, pgsql-hackers@postgresql.org
X-Mailer: ELM [version 2.4ME+ PL77 (25)]
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=US-ASCII
Precedence: bulk
Sender: pgsql-hackers-owner@postgresql.org
Status: OR

> >   I had thought that the pre-commit information could be stored in an
> > auxiliary table by the middleware program ; we would then have
> > to re-implement some sort of higher-level WAL (I thought of the list
> > of the commands performed in the current transaction, with a sequence
> > number for each of them that would guarantee correct ordering between
> > concurrent transactions in case of a REDO). But I fear I am missing
>
> This wouldn't work for READ COMMITTED isolation level.
> But why do you want to log commands into WAL where each modification
> is already logged in, hm, correct order?
> Well, it makes sense if you're looking for async replication, but
> you don't need two-phase commit for that, and you should be aware of
> the problems with READ COMMITTED isolation level.
>

I believe the issue here is that while SERIALIZABLE ISOLATION means all
queries can be run serially, our default is READ COMMITTED, meaning that
open transactions see committed transactions, even if the transaction
committed after our transaction started.  (FYI, see my chapter on
transactions for help,  http://www.postgresql.org/docs/awbook.html.)

To do higher-level WAL, you would have to record not only the queries,
but the other queries that were committed at the start of each command
in your transaction.

Ideally, you could number every commit by its XID in your log, and then
when processing the query, pass the "committed" transaction ids that
were visible at the time each command began.

In other words, you can replay the queries in transaction commit order,
except that you have to have some transactions committed at specific
points while other transactions are open, i.e.:

XID	Open XIDS	Query
500			UPDATE t SET col = 3;
501	500		BEGIN;
501	500		UPDATE t SET col = 4;
501			UPDATE t SET col = 5;
501			COMMIT;

This is a silly example, but it shows that 500 must commit after the
first command in transaction 501, but before the second command in the
transaction.  This is because UPDATE t SET col = 5 actually sees the
changes made by transaction 500 in READ COMMITTED isolation level.
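
One way to read the table is that a transaction has to commit just before the
first entry in which it no longer appears in the "Open XIDS" column. The
sketch below derives those points from such a log. It is only an
illustration: the commit_points function and the (xid, open_xids, query)
tuple layout are assumptions made for the example, not an existing PostgreSQL
log format.

```python
def commit_points(log):
    """Map each xid to the index of the first entry that must run after it commits.

    `log` is a list of (xid, open_xids, query) tuples in execution order,
    where open_xids is the set of other transactions still open when the
    command began.
    """
    points = {}
    previously_open = set()
    for index, (xid, open_xids, query) in enumerate(log):
        # Any transaction that was open for the previous command but is
        # not open for this one must have committed in between.
        for closed in previously_open - set(open_xids):
            points.setdefault(closed, index)
        previously_open = set(open_xids)
    return points


log = [
    (500, set(), "UPDATE t SET col = 3;"),
    (501, {500}, "BEGIN;"),
    (501, {500}, "UPDATE t SET col = 4;"),
    (501, set(), "UPDATE t SET col = 5;"),
    (501, set(), "COMMIT;"),
]
```

For the log above, commit_points(log) reports that transaction 500 must be
committed before entry 3, the second UPDATE of transaction 501.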

I am not advocating this.  I think WAL is a better choice.  I just
wanted to outline how replaying the queries in commit order is
insufficient.

> Back to two-phase commit - it's the easiest part of the work required for
> distributed transaction processing.
> Currently we place a single commit record in the log, and the transaction
> is committed when this record (and so all other transaction records)
> is on disk.
> Two-phase commit:
>
> 1. For the 1st phase we'll place a "prepared-to-commit" record into the log,
>    and this phase will be complete once the record is flushed to disk.
>    At this point the transaction may be committed at any time, because
>    all its modifications are logged. But it may still be rolled back
>    if this phase failed on other sites of the distributed system.
>
> 2. When all sites are prepared to commit we'll place a "committed"
>    record into the log. No need to flush it, because in the event of a
>    crash the recoverer will have to contact the other sites to learn
>    the status of all "prepared" transactions anyway.
>
> That's all! It is really hard to implement distributed lock- and
> communication- managers but there is no problem with logging two
> records instead of one. Period.

Great.


--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026