Add TODO detail directory.

1999-09-20 15:40:12 +00:00 · 957e6a6921
commit 957e6a6921
parent 7559677551
19 changed files with 12082 additions and 0 deletions
--- a/doc/TODO.detail/README
+++ b/doc/TODO.detail/README
@@ -0,0 +1,2 @@
+These files are in standard Unix mailbox format, and are detail
+information related to the TODO list.
--- a/doc/TODO.detail/alpha
+++ b/doc/TODO.detail/alpha
@@ -0,0 +1,107 @@
+From owner-pgsql-hackers@hub.org Fri May 14 16:00:46 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id QAA02173
+	for <maillist@candle.pha.pa.us>; Fri, 14 May 1999 16:00:44 -0400 (EDT)
+Received: from hub.org (hub.org [209.167.229.1]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id QAA02824 for <maillist@candle.pha.pa.us>; Fri, 14 May 1999 16:00:45 -0400 (EDT)
+Received: from hub.org (hub.org [209.167.229.1])
+	by hub.org (8.9.3/8.9.3) with ESMTP id PAA47798;
+	Fri, 14 May 1999 15:57:54 -0400 (EDT)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 14 May 1999 15:54:30 +0000 (EDT)
+Received: (from majordom@localhost)
+	by hub.org (8.9.3/8.9.3) id PAA47191
+	for pgsql-hackers-outgoing; Fri, 14 May 1999 15:54:28 -0400 (EDT)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from thelab.hub.org (nat194.147.mpoweredpc.net [142.177.194.147])
+	by hub.org (8.9.3/8.9.3) with ESMTP id PAA46457
+	for <pgsql-hackers@postgresql.org>; Fri, 14 May 1999 15:49:35 -0400 (EDT)
+	(envelope-from scrappy@hub.org)
+Received: from localhost (scrappy@localhost)
+	by thelab.hub.org (8.9.3/8.9.1) with ESMTP id QAA16128;
+	Fri, 14 May 1999 16:49:44 -0300 (ADT)
+	(envelope-from scrappy@hub.org)
+X-Authentication-Warning: thelab.hub.org: scrappy owned process doing -bs
+Date: Fri, 14 May 1999 16:49:44 -0300 (ADT)
+From: The Hermit Hacker <scrappy@hub.org>
+To: pgsql-hackers@postgreSQL.org
+cc: Jack Howarth <howarth@nitro.med.uc.edu>
+Subject: [HACKERS] postgresql bug report (fwd)
+Message-ID: <Pine.BSF.4.05.9905141649150.47191-100000@thelab.hub.org>
+MIME-Version: 1.0
+Content-Type: TEXT/PLAIN; charset=US-ASCII
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: ROr
+
+
+Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
+Systems Administrator @ hub.org 
+primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 
+
+---------- Forwarded message ----------
+Date: Fri, 14 May 1999 14:50:58 -0400
+From: Jack Howarth <howarth@nitro.med.uc.edu>
+To: scrappy@hub.org
+Subject: postgresql bug report
+
+Marc,
+      In porting the RedHat 6.0 srpm set for a linuxppc release we
+believe a bug has been identified in the postgresql source for
+6.5-0.beta1. Our development tools are as follows...
+
+glibc 2.1.1 pre 2
+linux 2.2.6
+egcs 1.1.2
+the latest binutils snapshot
+
+The bug that we see is that when egcs compiles postgresql at -O1 or
+higher (-O0 is fine), postgresql creates incorrectly formed databases
+such that when the user does a destroydb the database can not be
+destroyed. Franz Sirl has identified the problem as follows...
+
+    it seems that this problem is a type casting/promotion bug in the
+    source. The routine _bt_checkkeys() in
+    backend/access/nbtree/nbtutils.c calls int2eq() in
+    backend/utils/adt/int.c via a function pointer
+    *fmgr_faddr(&key[0].sk_func). As the type information for int2eq is
+    lost via the function pointer, the compiler passes 2 ints, but
+    int2eq expects 2 (preformatted in a 32bit reg) int16's.
+    This particular bug goes away, if I for example change int2eq to:
+
+    bool
+    int2eq(int32 arg1, int32 arg2)
+    {
+            return (int16)arg1 == (int16)arg2;
+    }
+
+    This moves away the type casting/promotion "work" from caller to the
+    callee and is probably the right thing to do for functions used via
+    function pointers.
+
+...because of the large number of changes required to do this, Franz
+thought we should pass this on to the postgresql maintainers for
+correction. Please feel free to contact Franz Sirl
+(Franz.Sirl-kernel@lauterbach.com) if you have any questions on this
+bug report.
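As a rough illustration of the callee-side fix Franz describes, the sketch below calls a comparison function through a generic pointer that only knows about full-width integers. The names generic_cmp_fn, int2eq_narrowing and call_cmp are hypothetical stand-ins, not the backend's actual fmgr interface, and the fixed-width types come from <stdint.h> rather than the PostgreSQL headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical generic pointer type: the caller only knows that it is
 * passing two full-width integers, as in a call through a function
 * pointer whose real prototype is not visible. */
typedef bool (*generic_cmp_fn)(int32_t, int32_t);

/* Callee-side narrowing: accept full-width arguments and truncate to
 * 16 bits here, so the result does not depend on what the caller left
 * in the upper bits. Converting an out-of-range value to int16_t is
 * implementation-defined; common compilers keep the low 16 bits. */
static bool int2eq_narrowing(int32_t arg1, int32_t arg2)
{
    return (int16_t) arg1 == (int16_t) arg2;
}

/* Hypothetical helper standing in for the call through the pointer. */
static bool call_cmp(generic_cmp_fn fn, int32_t a, int32_t b)
{
    assert(fn != NULL);
    return fn(a, b);
}
```

With this shape, call_cmp(int2eq_narrowing, 0x10005, 5) compares equal on common compilers because only the low 16 bits take part in the comparison, which is the behaviour the callee expects regardless of how the caller promoted its arguments.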
+
+--
+------------------------------------------------------------------------------
+Jack W. Howarth, Ph.D.                                     231 Bethesda Avenue
+NMR Facility Director                              Cincinnati, Ohio 45267-0524
+Dept. of Molecular Genetics                              phone: (513) 558-4420
+Univ. of Cincinnati College of Medicine                    fax: (513) 558-8474
+
+
+
+
+
+
--- a/doc/TODO.detail/arrays
+++ b/doc/TODO.detail/arrays
@@ -0,0 +1,94 @@
+From owner-pgsql-hackers@hub.org Wed Nov 25 19:01:02 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id TAA16399
+	for <maillist@candle.pha.pa.us>; Wed, 25 Nov 1998 19:01:01 -0500 (EST)
+Received: from hub.org (majordom@hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id SAA05250 for <maillist@candle.pha.pa.us>; Wed, 25 Nov 1998 18:53:12 -0500 (EST)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.9.1/8.9.1) with SMTP id SAA17798;
+	Wed, 25 Nov 1998 18:49:38 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 25 Nov 1998 18:49:07 +0000 (EST)
+Received: (from majordom@localhost)
+	by hub.org (8.9.1/8.9.1) id SAA17697
+	for pgsql-hackers-outgoing; Wed, 25 Nov 1998 18:49:06 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from mail.enterprise.net (root@mail.enterprise.net [194.72.192.18])
+	by hub.org (8.9.1/8.9.1) with ESMTP id SAA17650;
+	Wed, 25 Nov 1998 18:48:55 -0500 (EST)
+	(envelope-from olly@lfix.co.uk)
+Received: from linda.lfix.co.uk (root@max01-040.enterprise.net [194.72.197.40])
+	by mail.enterprise.net (8.8.5/8.8.5) with ESMTP id XAA20539;
+	Wed, 25 Nov 1998 23:48:52 GMT
+Received: from linda.lfix.co.uk (olly@localhost [127.0.0.1])
+	by linda.lfix.co.uk (8.9.1a/8.9.1/Debian/GNU) with ESMTP id XAA12089;
+	Wed, 25 Nov 1998 23:48:52 GMT
+Message-Id: <199811252348.XAA12089@linda.lfix.co.uk>
+X-Mailer: exmh version 2.0.2 2/24/98 (debian) 
+X-URL: http://www.lfix.co.uk/oliver
+X-face: "xUFVDj+ZJtL_IbURmI}!~xAyPC"Mrk=MkAm&tPQnNq(FWxv49R}\>0oI8VM?O2VY+N7@F-
+	KMLl*!h}B)u@TW|B}6<X<J|}QsVlTi:RA:O7Abc(@D2Y/"J\S,b1!<&<B/J}b.Ii9@B]H6V!+#sE0Q
+	_+=`K$5TI|4I0-=Cp%pt~L#QYydO'iBXR~\tT?uftep9n9AF`@SzTwsw6uqJ}pL,h(cZi}T#PB"#!k
+	p^e=Z.K~fuw$l?]lUV)?R]U}l;f*~Ol)#fpKR)Yt}XOr6BI\_Jjr0!@GMnpCTnTym4f;c{;Ms=0{`D
+	Lq9MO6{wj%s-*N"G,g
+To: bugs@postgreSQL.org, hackers@postgreSQL.org
+Subject: [HACKERS] Failures with arrays
+Mime-Version: 1.0
+Content-Type: text/plain; charset=us-ascii
+Date: Wed, 25 Nov 1998 23:48:51 +0000
+From: "Oliver Elphick" <olly@lfix.co.uk>
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: ROr
+
+This was reported as a bug with the Debian package of 6.3.2; the same
+behaviour is still present in 6.4. 
+
+bray=> create table foo ( t text[]);
+CREATE
+bray=> insert into foo values ( '{"a"}');
+INSERT 201354 1
+bray=> insert into foo values ( '{"a","b"}');
+INSERT 201355 1
+bray=>  insert into foo values ( '{"a","b","c"}');
+INSERT 201356 1
+bray=>  select * from foo;
+t            
+-------------
+{"a"}        
+{"a","b"}    
+{"a","b","c"}
+(3 rows)
+
+bray=> select t[1] from foo;
+ERROR:  type name lookup of t failed
+bray=> select * from foo;
+t            
+-------------
+{"a"}        
+{"a","b"}    
+{"a","b","c"}
+(3 rows)
+
+bray=> select foo.t[1] from foo;
+t
+-
+a
+a
+a
+(3 rows)
+
+bray=> select count(foo.t[1]) from foo;
+pqReadData() -- backend closed the channel unexpectedly.
+
+-- 
+Oliver Elphick                                Oliver.Elphick@lfix.co.uk
+Isle of Wight                              http://www.lfix.co.uk/oliver
+               PGP key from public servers; key ID 32B8FAA1
+                 ========================================
+     "Let us therefore come boldly unto the throne of grace,
+      that we may obtain mercy, and find grace to help in 
+      time of need."             Hebrews 4:16 
+
+
+
+
--- a/doc/TODO.detail/cnfify
+++ b/doc/TODO.detail/cnfify
--- a/doc/TODO.detail/flock
+++ b/doc/TODO.detail/flock
@@ -0,0 +1,351 @@
+From tgl@sss.pgh.pa.us Sun Aug 30 11:25:23 1998
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id LAA12607
+	for <maillist@candle.pha.pa.us>; Sun, 30 Aug 1998 11:25:20 -0400 (EDT)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id LAA15788;
+	Sun, 30 Aug 1998 11:23:38 -0400 (EDT)
+To: Bruce Momjian <maillist@candle.pha.pa.us>
+cc: dz@cs.unitn.it (Massimo Dal Zotto), hackers@postgreSQL.org
+Subject: Re: [HACKERS] flock patch breaks things here 
+In-reply-to: Your message of Sun, 30 Aug 1998 08:19:52 -0400 (EDT) 
+             <199808301219.IAA08821@candle.pha.pa.us> 
+Date: Sun, 30 Aug 1998 11:23:38 -0400
+Message-ID: <15786.904490618@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Status: RO
+
+Bruce Momjian <maillist@candle.pha.pa.us> writes:
+> Can't we just have configure check for flock().  Another idea is to
+> create a 'pid' file in the pgsql/data/base directory, and do a kill -0
+> to see if it is still running before removing the lock.
+
+The latter approach is what I was going to suggest.  Writing a pid file
+would be a fine idea anyway --- for one thing, it makes it a lot easier
+to write a "kill the postmaster" script.  Given that the postmaster
+should write a pid file, a new postmaster should look for an existing
+pid file, and try to do a kill(pid, 0) on the number contained therein.
+If this doesn't return an error, then you figure there is already a
+postmaster running, complain, and exit.  Otherwise you figure you is it,
+(re)write the pid file and away you go.  Then pqcomm.c can just
+unconditionally delete any old file that's in the way of making the
+pipe.
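The pid-file interlock described above could be sketched roughly as follows. This is only an illustration under stated assumptions: process_exists and claim_pidfile are hypothetical names, treating EPERM as "process exists" is an addition not spelled out in the message, and the read-then-write sequence is not atomic, so two processes starting at the same instant could both succeed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns true if a process with this pid appears to exist.
 * kill(pid, 0) delivers no signal; it only runs the existence and
 * permission checks. EPERM means the process exists but is owned by
 * another user (an assumption added here, not from the message). */
static bool process_exists(pid_t pid)
{
    if (pid <= 0)
        return false;
    if (kill(pid, 0) == 0)
        return true;
    return errno == EPERM;
}

/* Hypothetical helper: returns 0 if the pid file was (re)written with
 * our pid, -1 if it names a live process or an I/O error occurred. */
static int claim_pidfile(const char *path)
{
    assert(path != NULL);

    FILE *fp = fopen(path, "r");
    if (fp != NULL) {
        long old = 0;
        int got = fscanf(fp, "%ld", &old);
        fclose(fp);
        if (got == 1 && process_exists((pid_t) old))
            return -1;          /* someone is already running */
    }

    fp = fopen(path, "w");      /* stale or missing: take it over */
    if (fp == NULL)
        return -1;
    fprintf(fp, "%ld\n", (long) getpid());
    return fclose(fp) == 0 ? 0 : -1;
}
```

A caller would run claim_pidfile once at startup, complain and exit on -1, and remove the file on clean shutdown.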
+
+The pidfile checking and creation probably ought to go in postmaster.c,
+not down inside pqcomm.c.  I never liked the fact that a critical
+interlock function was being done by a low-level library that one might
+not even want to invoke (if all your clients are using TCP, opening up
+the Unix-domain socket is a waste of time, no?).
+
+BTW, there is another problem with relying on flock on the socket file
+for this purpose: it opens up a hole for a denial-of-service attack.
+Anyone who can write the file can flock it.  (We already had a problem
+with DOS via creating a dummy file at /tmp/.s.PGSQL.5432, but it would
+be harder to spot the culprit with an flock-based interference.)
+
+			regards, tom lane
+
+From owner-pgsql-hackers@hub.org Sun Aug 30 12:27:41 1998
+Received: from hub.org (hub.org [209.47.148.200])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id MAA12976
+	for <maillist@candle.pha.pa.us>; Sun, 30 Aug 1998 12:27:37 -0400 (EDT)
+Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id MAA09234; Sun, 30 Aug 1998 12:24:51 -0400 (EDT)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sun, 30 Aug 1998 12:23:26 +0000 (EDT)
+Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id MAA09167 for pgsql-hackers-outgoing; Sun, 30 Aug 1998 12:23:25 -0400 (EDT)
+Received: from mambo.cs.unitn.it (mambo.cs.unitn.it [193.205.199.204]) by hub.org (8.8.8/8.7.5) with SMTP id MAA09150 for <hackers@postgreSQL.org>; Sun, 30 Aug 1998 12:23:08 -0400 (EDT)
+Received: from boogie.cs.unitn.it (dz@boogie [193.205.199.79]) by mambo.cs.unitn.it (8.6.12/8.6.12) with ESMTP id SAA29572; Sun, 30 Aug 1998 18:21:42 +0200
+Received: (from dz@localhost) by boogie.cs.unitn.it (8.8.5/8.6.9) id SAA05993; Sun, 30 Aug 1998 18:21:41 +0200
+From: Massimo Dal Zotto <dz@cs.unitn.it>
+Message-Id: <199808301621.SAA05993@boogie.cs.unitn.it>
+Subject: Re: [HACKERS] flock patch breaks things here
+To: hackers@postgreSQL.org (PostgreSQL Hackers)
+Date: Sun, 30 Aug 1998 18:21:41 +0200 (MET DST)
+Cc: tgl@sss.pgh.pa.us (Tom Lane)
+In-Reply-To: <15786.904490618@sss.pgh.pa.us> from "Tom Lane" at Aug 30, 98 11:23:38 am
+X-Mailer: ELM [version 2.4 PL24 ME4]
+MIME-Version: 1.0
+Content-Type: text/plain; charset=iso-8859-1
+Content-Transfer-Encoding: 8bit
+Sender: owner-pgsql-hackers@hub.org
+Precedence: bulk
+Status: ROr
+
+> 
+> Bruce Momjian <maillist@candle.pha.pa.us> writes:
+> > Can't we just have configure check for flock().  Another idea is to
+> > create a 'pid' file in the pgsql/data/base directory, and do a kill -0
+> > to see if it is still running before removing the lock.
+> 
+> The latter approach is what I was going to suggest.  Writing a pid file
+> would be a fine idea anyway --- for one thing, it makes it a lot easier
+> to write a "kill the postmaster" script.  Given that the postmaster
+> should write a pid file, a new postmaster should look for an existing
+> pid file, and try to do a kill(pid, 0) on the number contained therein.
+> If this doesn't return an error, then you figure there is already a
+> postmaster running, complain, and exit.  Otherwise you figure you is it,
+> (re)write the pid file and away you go.  Then pqcomm.c can just
+> unconditionally delete any old file that's in the way of making the
+> pipe.
+> 
+> The pidfile checking and creation probably ought to go in postmaster.c,
+> not down inside pqcomm.c.  I never liked the fact that a critical
+> interlock function was being done by a low-level library that one might
+> not even want to invoke (if all your clients are using TCP, opening up
+> the Unix-domain socket is a waste of time, no?).
+> 
+> BTW, there is another problem with relying on flock on the socket file
+> for this purpose: it opens up a hole for a denial-of-service attack.
+> Anyone who can write the file can flock it.  (We already had a problem
+> with DOS via creating a dummy file at /tmp/.s.PGSQL.5432, but it would
+> be harder to spot the culprit with an flock-based interference.)
+
+This came to my mind, but I didn't think this would have happened so
+quickly. In my opinion the socket and the pidfile should be created in a
+directory owned by postgres, for example /tmp/.Pgsql-unix, as X does.
+
+-- 
+Massimo Dal Zotto
+
+----------------------------------------------------------------------+
+|  Massimo Dal Zotto                email:  dz@cs.unitn.it             |
+|  Via Marconi, 141                 phone:  ++39-461-534251            |
+|  38057 Pergine Valsugana (TN)     www:  http://www.cs.unitn.it/~dz/  |
+|  Italy                            pgp:  finger dz@tango.cs.unitn.it  |
+----------------------------------------------------------------------+
+
+
+From owner-pgsql-hackers@hub.org Sun Aug 30 13:01:10 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id NAA13785
+	for <maillist@candle.pha.pa.us>; Sun, 30 Aug 1998 13:01:09 -0400 (EDT)
+Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id MAA29386 for <maillist@candle.pha.pa.us>; Sun, 30 Aug 1998 12:58:24 -0400 (EDT)
+Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id MAA11406; Sun, 30 Aug 1998 12:54:48 -0400 (EDT)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sun, 30 Aug 1998 12:52:22 +0000 (EDT)
+Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id MAA11310 for pgsql-hackers-outgoing; Sun, 30 Aug 1998 12:52:20 -0400 (EDT)
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by hub.org (8.8.8/8.7.5) with ESMTP id MAA11296 for <hackers@postgreSQL.org>; Sun, 30 Aug 1998 12:52:13 -0400 (EDT)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id MAA16094;
+	Sun, 30 Aug 1998 12:50:55 -0400 (EDT)
+To: Massimo Dal Zotto <dz@cs.unitn.it>
+cc: hackers@postgreSQL.org (PostgreSQL Hackers)
+Subject: Re: [HACKERS] flock patch breaks things here 
+In-reply-to: Your message of Sun, 30 Aug 1998 18:21:41 +0200 (MET DST) 
+             <199808301621.SAA05993@boogie.cs.unitn.it> 
+Date: Sun, 30 Aug 1998 12:50:55 -0400
+Message-ID: <16092.904495855@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@hub.org
+Precedence: bulk
+Status: RO
+
+Massimo Dal Zotto <dz@cs.unitn.it> writes:
+> In my opinion the socket and the pidfile should be created in a
+> directory owned by postgres, for example /tmp/.Pgsql-unix, as X does.
+
+The pidfile belongs at the top level of the database directory (eg,
+/usr/local/pgsql/data/postmaster.pid), because what it actually
+represents is that there is a postmaster running *for that database
+group*.
+
+If you want to support multiple database sets on one machine (which I
+do), then the interlock has to be per database directory.  Putting the
+pidfile into a common directory would mean we'd have to invent some
+kind of pidfile naming convention to keep multiple postmasters from
+tromping on each other.  This is unnecessarily complex.
+
+I agree with you that putting the socket file into a less easily munged
+directory than /tmp would be a good idea for security.  But that's a
+separate issue.  On machines that understand stickybits for directories,
+the security hole is not really very big.
+
+At this point, the fact that /tmp/.s.PGSQL.port# is the socket path is
+effectively a version-independent aspect of the FE/BE protocol, and so
+we can't change it without breaking old applications.  I'm not sure that
+that's worth the security improvement.
+
+What I'd like to see someday is a postmaster command line switch to tell
+it to use *only* TCP connections and not create a Unix socket at all.
+That hasn't been possible so far, because we were relying on the socket
+file to provide a safety interlock against starting multiple
+postmasters.  But an interlock using a pidfile would be much better.
+(Look around; *every* other Unix daemon I know of that wants to ensure
+that there's only one of it uses a pidfile interlock.  Not file locking.
+There's a reason why that's the well-trodden path.)
+
+			regards, tom lane
+
+
+From owner-pgsql-hackers@hub.org Sun Aug 30 15:31:13 1998
+Received: from hub.org (hub.org [209.47.148.200])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id PAA15275
+	for <maillist@candle.pha.pa.us>; Sun, 30 Aug 1998 15:31:11 -0400 (EDT)
+Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id PAA22194; Sun, 30 Aug 1998 15:27:20 -0400 (EDT)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sun, 30 Aug 1998 15:23:58 +0000 (EDT)
+Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id PAA21800 for pgsql-hackers-outgoing; Sun, 30 Aug 1998 15:23:57 -0400 (EDT)
+Received: from thelab.hub.org (nat0118.mpoweredpc.net [142.177.188.118]) by hub.org (8.8.8/8.7.5) with ESMTP id PAA21696 for <hackers@postgreSQL.org>; Sun, 30 Aug 1998 15:22:51 -0400 (EDT)
+Received: from localhost (scrappy@localhost)
+	by thelab.hub.org (8.9.1/8.8.8) with SMTP id QAA18542;
+	Sun, 30 Aug 1998 16:21:29 -0300 (ADT)
+	(envelope-from scrappy@hub.org)
+X-Authentication-Warning: thelab.hub.org: scrappy owned process doing -bs
+Date: Sun, 30 Aug 1998 16:21:28 -0300 (ADT)
+From: The Hermit Hacker <scrappy@hub.org>
+To: Tom Lane <tgl@sss.pgh.pa.us>
+cc: Massimo Dal Zotto <dz@cs.unitn.it>,
+        PostgreSQL Hackers <hackers@postgreSQL.org>
+Subject: Re: [HACKERS] flock patch breaks things here 
+In-Reply-To: <16092.904495855@sss.pgh.pa.us>
+Message-ID: <Pine.BSF.4.02.9808301618350.343-100000@thelab.hub.org>
+MIME-Version: 1.0
+Content-Type: TEXT/PLAIN; charset=US-ASCII
+Sender: owner-pgsql-hackers@hub.org
+Precedence: bulk
+Status: RO
+
+On Sun, 30 Aug 1998, Tom Lane wrote:
+
+> Massimo Dal Zotto <dz@cs.unitn.it> writes:
+> > In my opinion the socket and the pidfile should be created in a
+> > directory owned by postgres, for example /tmp/.Pgsql-unix, as X does.
+> 
+> The pidfile belongs at the top level of the database directory (eg,
+> /usr/local/pgsql/data/postmaster.pid), because what it actually
+> represents is that there is a postmaster running *for that database
+> group*.
+
+	I have to agree with this one...but then it also negates the
+argument about the flock() DoS...*grin*
+
+	BTW...I like the kill(pid,0) solution myself, primarily because it
+is, I think, the most portable solution.
+
+	I would not consider a patch to remove the flock() solution and
+replace it with the kill(pid,0) solution a new feature, just an
+improvement of an existing one...either way, moving the pid file (or
+socket, for that matter) from /tmp should be listed as a security related
+requirement for v6.4 :)
+
+Marc G. Fournier                                
+Systems Administrator @ hub.org 
+primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 
+
+
+
+From owner-pgsql-hackers@hub.org Sun Aug 30 22:41:10 1998
+Received: from hub.org (hub.org [209.47.148.200])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id WAA01526
+	for <maillist@candle.pha.pa.us>; Sun, 30 Aug 1998 22:41:08 -0400 (EDT)
+Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id WAA29298; Sun, 30 Aug 1998 22:38:18 -0400 (EDT)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sun, 30 Aug 1998 22:35:05 +0000 (EDT)
+Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id WAA29203 for pgsql-hackers-outgoing; Sun, 30 Aug 1998 22:35:03 -0400 (EDT)
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by hub.org (8.8.8/8.7.5) with ESMTP id WAA29017 for <hackers@postgreSQL.org>; Sun, 30 Aug 1998 22:34:55 -0400 (EDT)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id WAA20075;
+	Sun, 30 Aug 1998 22:34:41 -0400 (EDT)
+To: The Hermit Hacker <scrappy@hub.org>
+cc: PostgreSQL Hackers <hackers@postgreSQL.org>
+Subject: Re: [HACKERS] flock patch breaks things here 
+In-reply-to: Your message of Sun, 30 Aug 1998 16:21:28 -0300 (ADT) 
+             <Pine.BSF.4.02.9808301618350.343-100000@thelab.hub.org> 
+Date: Sun, 30 Aug 1998 22:34:40 -0400
+Message-ID: <20073.904530880@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@hub.org
+Precedence: bulk
+Status: ROr
+
+The Hermit Hacker <scrappy@hub.org> writes:
+> either way, moving the pid file (or
+> socket, for that matter) from /tmp should be listed as a security related
+> requirement for v6.4 :)
+
+Huh?  There is no pid file being generated in /tmp (or anywhere else)
+at the moment.  If we do add one, it should not go into /tmp for the
+reasons I gave before.
+
+Where the Unix-domain socket file lives is an entirely separate issue.
+
+If we move the socket out of /tmp then we have just kicked away all the
+work we did to preserve backwards compatibility of the FE/BE protocol
+with existing clients.  Being able to talk to a 1.0 client isn't much
+good if you aren't listening where he's going to try to contact you.
+So I think I have to vote in favor of leaving the socket where it is.
+
+			regards, tom lane
+
+
+From owner-pgsql-hackers@hub.org Mon Aug 31 11:31:19 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id LAA21195
+	for <maillist@candle.pha.pa.us>; Mon, 31 Aug 1998 11:31:13 -0400 (EDT)
+Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id LAA06827 for <maillist@candle.pha.pa.us>; Mon, 31 Aug 1998 11:17:41 -0400 (EDT)
+Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id LAA24792; Mon, 31 Aug 1998 11:12:18 -0400 (EDT)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 31 Aug 1998 11:10:31 +0000 (EDT)
+Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id LAA24742 for pgsql-hackers-outgoing; Mon, 31 Aug 1998 11:10:29 -0400 (EDT)
+Received: from trillium.nmsu.edu (trillium.NMSU.Edu [128.123.5.15]) by hub.org (8.8.8/8.7.5) with ESMTP id LAA24725 for <hackers@postgreSQL.org>; Mon, 31 Aug 1998 11:10:22 -0400 (EDT)
+Received: (from brook@localhost)
+	by trillium.nmsu.edu (8.8.8/8.8.8) id JAA03282;
+	Mon, 31 Aug 1998 09:09:01 -0600 (MDT)
+Date: Mon, 31 Aug 1998 09:09:01 -0600 (MDT)
+Message-Id: <199808311509.JAA03282@trillium.nmsu.edu>
+From: Brook Milligan <brook@trillium.NMSU.Edu>
+To: tgl@sss.pgh.pa.us
+CC: dg@informix.com, hackers@postgreSQL.org
+In-reply-to: <23042.904573041@sss.pgh.pa.us> (message from Tom Lane on Mon, 31
+	Aug 1998 10:17:21 -0400)
+Subject: Re: [HACKERS] flock patch breaks things here
+References:  <23042.904573041@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@hub.org
+Precedence: bulk
+Status: ROr
+
+   I just came up with an idea that might help alleviate the /tmp security
+   exposure without creating a backwards-compatibility problem.  It works
+   like this:
+
+   1. During installation, create a subdirectory of /tmp to hold Postgres'
+   socket files and associated pid lockfiles.  This subdirectory should be
+   owned by the Postgres superuser and have permissions 755
+   (world-readable, writable only by Postgres superuser).  Maybe call it
+   /tmp/.pgsql --- the name should start with a dot to keep it out of the
+   way.  (Bruce points out that some systems clear /tmp during reboot, so
+   it might be that a postmaster will have to be prepared to recreate this
+   directory at startup --- anyone know if subdirectories of /tmp are
+   zapped too?  My system doesn't do that...)
+
+   ...
+
+   I notice that on my system, the X11 socket files in /tmp/.X11-unix are
+   actually symlinks to socket files in /usr/spool/sockets/X11.  I dunno if
+   it's worth our trouble to get into putting our sockets under /usr/spool
+   or /var/spool or whatever --- seems like another configuration choice to
+   mess up.  It'd be nice if the socket directory lived somewhere where the
+   parent dirs weren't world-writable, but this would mean one more thing
+   that you have to have root permissions for in order to install pgsql.
+
+It seems like we need a directory for locks (= pid files) and one for
+sockets (perhaps the same one).  I strongly suggest that the location
+for these be configurable.  By default, it might make sense to put
+them in ~pgsql/locks and ~pgsql/sockets.  It is easy (i.e., I'll be
+glad to do it) to modify configure.in to take options like
+
+	     --lock-dir=/var/spool/lock
+	     --socket-dir=/var/spool/sockets
+
+that set cc defines and have the code respond accordingly.  This way,
+those who don't care (or don't have root access) can use the defaults,
+whereas those with root access who like to keep locks and sockets in a
+common place can do so easily.  Either way, multiple postmasters (all
+compiled with the same options of course) can check the appropriate
+locks in the well-known places.  Finally, drop the link into /tmp for
+the old socket and document that it will be disappearing at some
+point, and all is fine.  
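One way the configure options above might surface in the code is as compile-time defines with fallbacks. The sketch below is only a guess at that shape: PG_SOCKET_DIR, PG_LOCK_DIR, socket_path and lock_path are hypothetical names, the "/tmp" defaults are placeholders, and the ".lock" suffix is an invented naming convention; only the ".s.PGSQL.<port>" socket name comes from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical defines that configure could pass with -D;
 * the fallbacks here are placeholders. */
#ifndef PG_SOCKET_DIR
#define PG_SOCKET_DIR "/tmp"
#endif
#ifndef PG_LOCK_DIR
#define PG_LOCK_DIR "/tmp"
#endif

/* Builds "<socket dir>/.s.PGSQL.<port>"; returns 0 on success and
 * -1 if the result does not fit in buf. */
static int socket_path(char *buf, size_t len, int port)
{
    assert(buf != NULL && len > 0);
    int n = snprintf(buf, len, "%s/.s.PGSQL.%d", PG_SOCKET_DIR, port);
    return (n >= 0 && (size_t) n < len) ? 0 : -1;
}

/* Builds a per-port lock file name so that several postmasters can
 * share one lock directory; the suffix is an assumption. */
static int lock_path(char *buf, size_t len, int port)
{
    assert(buf != NULL && len > 0);
    int n = snprintf(buf, len, "%s/.s.PGSQL.%d.lock", PG_LOCK_DIR, port);
    return (n >= 0 && (size_t) n < len) ? 0 : -1;
}
```

Keeping the directories behind two macros means the rest of the code never hard-codes /tmp, so sites without root access can simply build with the defaults.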
+
+If someone wants to give me some guidance on what preprocessor
+variables should be set in response to the above options (or something
+like them), I'll do the configure stuff.
+
+Cheers,
+Brook
+
+
--- a/doc/TODO.detail/fsync
+++ b/doc/TODO.detail/fsync
@@ -0,0 +1,69 @@
+From owner-pgsql-general@hub.org Fri Dec 18 06:31:23 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id GAA05554
+	for <maillist@candle.pha.pa.us>; Fri, 18 Dec 1998 06:31:21 -0500 (EST)
+Received: from hub.org (majordom@hub.org [209.47.145.100]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id EAA21127 for <maillist@candle.pha.pa.us>; Fri, 18 Dec 1998 04:46:38 -0500 (EST)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.9.1/8.9.1) with SMTP id EAA01409;
+	Fri, 18 Dec 1998 04:44:19 -0500 (EST)
+	(envelope-from owner-pgsql-general@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 18 Dec 1998 04:43:22 +0000 (EST)
+Received: (from majordom@localhost)
+	by hub.org (8.9.1/8.9.1) id EAA01093
+	for pgsql-general-outgoing; Fri, 18 Dec 1998 04:43:18 -0500 (EST)
+	(envelope-from owner-pgsql-general@postgreSQL.org)
+Received: from dune.krs.ru (dune.krs.ru [195.161.16.38])
+	by hub.org (8.9.1/8.9.1) with ESMTP id EAA01067
+	for <pgsql-general@postgreSQL.org>; Fri, 18 Dec 1998 04:43:09 -0500 (EST)
+	(envelope-from vadim@krs.ru)
+Received: from krs.ru (localhost.krs.ru [127.0.0.1])
+	by dune.krs.ru (8.8.8/8.8.7) with ESMTP id QAA16201;
+	Fri, 18 Dec 1998 16:41:44 +0700 (KRS)
+	(envelope-from vadim@krs.ru)
+Message-ID: <367A2354.E998763@krs.ru>
+Date: Fri, 18 Dec 1998 16:41:40 +0700
+From: Vadim Mikheev <vadim@krs.ru>
+Organization: OJSC Rostelecom (Krasnoyarsk)
+X-Mailer: Mozilla 4.5 [en] (X11; I; FreeBSD 2.2.6-RELEASE i386)
+X-Accept-Language: ru, en
+MIME-Version: 1.0
+To: Anton de Wet <adw@obsidian.co.za>
+CC: pgsql-general@postgreSQL.org
+Subject: Re: [GENERAL] Why PostgreSQL is better than other commerial softwares?
+References: <Pine.LNX.4.04.9812181046030.9458-100000@ra.obsidian.co.za>
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Sender: owner-pgsql-general@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+Anton de Wet wrote:
+> 
+> >
+> > Often quick mailing list support?
+> 
+> :-)
+> 
+> While on the subject I finally found the solution to a problem I (and one
+> or two other people) posted about without answer. (So sometimes it's slow
+> mailing list support).
+> 
+> In importing about 5 million records (which I copy in blocks of 10000) the
+> copy became linearly slower. After a friend RTFM and referred me, I used
+> the -F switch (passed by the postmaster to the backend processes) and the
+> time became linear and a LOT shorter. Import time for the 5000000 records
+> is now the same (or maybe even slightly faster, I didn't accurately time
+> them) as importing the data into oracle on the same machine.
+
+"While on the subject..." -:)
+
+This is a problem in the buffer manager, known for a very long time:
+when copy eats all the buffers, the manager begins to write/fsync each
+dirty buffer to free a buffer for new data. All updated relations
+should be fsynced _once_ at transaction commit. Then you would get
+the same results without -F...
+I still have no time to implement this -:(
+
+Vadim
+
+
--- a/doc/TODO.detail/lex
+++ b/doc/TODO.detail/lex
@@ -0,0 +1,332 @@
+From selkovjr@mcs.anl.gov Sat Jul 25 05:31:05 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id FAA16564
+	for <maillist@candle.pha.pa.us>; Sat, 25 Jul 1998 05:31:03 -0400 (EDT)
+Received: from antares.mcs.anl.gov (mcs.anl.gov [140.221.9.6]) by renoir.op.net (o1/$Revision: 1.1 $) with SMTP id FAA01775 for <maillist@candle.pha.pa.us>; Sat, 25 Jul 1998 05:28:22 -0400 (EDT)
+Received: from mcs.anl.gov (wit.mcs.anl.gov [140.221.5.148]) by antares.mcs.anl.gov (8.6.10/8.6.10)  with ESMTP
+	id EAA28698 for <maillist@candle.pha.pa.us>; Sat, 25 Jul 1998 04:27:05 -0500
+Sender: selkovjr@mcs.anl.gov
+Message-ID: <35B9968D.21CF60A2@mcs.anl.gov>
+Date: Sat, 25 Jul 1998 08:25:49 +0000
+From: "Gene Selkov, Jr." <selkovjr@mcs.anl.gov>
+Organization: MCS, Argonne Natl. Lab
+X-Mailer: Mozilla 4.03 [en] (X11; I; Linux 2.0.32 i586)
+MIME-Version: 1.0
+To: Bruce Momjian <maillist@candle.pha.pa.us>
+Subject: position-aware scanners
+References: <199807250524.BAA07296@candle.pha.pa.us>
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Status: RO
+
+Bruce,
+
+I attached here (through the web links) a couple of examples, totally
+irrelevant to postgres but good enough to discuss token locations. I
+might as well try to patch the backend parser, though not sure how soon.
+
+
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+1. 
+
+The first C parser I wrote,
+http://wit.mcs.anl.gov/~selkovjr/unit-troff.tgz, is not very
+sophisticated, so token locations reported by yyerror() may be slightly
+incorrect (+/- one position, depending on the existence and type of the
+lookahead token). It is a filter used to typeset the units of measurement
+with eqn. To use it, unpack the tar file and run make. The Makefile is
+not too generic but I built it on various systems including linux,
+freebsd and sunos 4.3. The invocation can be something like this:
+
+./check 0 parse "l**3/(mmoll*min)"
+parse error, expecting `BASIC_UNIT' or `INTEGER' or `POSITIVE_NUMBER' or
+`'(''
+
+l**3/(mmoll*min)
+      ^^^^^
+
+Now to the guts. As far as I can imagine, the only way to consistently
+keep track of each character read by the scanner (regardless of the
+length of expressions it will match) is to redefine its YY_INPUT like
+this:
+
+#undef YY_INPUT
+#define YY_INPUT(buf,result,max_size) \
+{ \
+	int c	= (int) buffer[pos++]; \
+	result = (c == '\0') ?	YY_NULL	: (buf[0] = c, 1); \
+}
+
+Here, buffer is the pointer to the origin of the string being scanned
+and pos is a global variable, similar in usage to a file pointer (you
+can both read and manipulate it at will). The buffer and the pointer are
+initialized by the function 
+
+void setString(char *s)
+{
+   buffer = s;
+   pos = 0;
+}
+
+each time a new string is to be parsed. This (exportable) function is
+part of the interface.
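
The same bookkeeping can be sketched in plain C without flex. This is only an illustration, assuming a NUL-terminated input string; `read_char` and `current_pos` are hypothetical names that do not come from the tarball, and only `setString`, `buffer` and `pos` mirror the text above.

```c
#include <assert.h>
#include <stddef.h>

static const char *buffer = NULL;  /* origin of the string being scanned */
static int pos = 0;                /* index of the next character to read */

/* Point the reader at a new string and rewind the position. */
void setString(const char *s)
{
    assert(s != NULL);
    buffer = s;
    pos = 0;
}

/*
 * Hypothetical stand-in for YY_INPUT: store one character in *out and
 * return 1, or return 0 at the end of the string. Unlike the macro above,
 * pos is not advanced past the terminating NUL.
 */
int read_char(char *out)
{
    char c = buffer[pos];

    if (c == '\0')
        return 0;
    *out = c;
    pos++;
    return 1;
}

/* Position of the next unread character, usable by an error reporter. */
int current_pos(void)
{
    return pos;
}
```

Because every character goes through one function, `pos` always equals the number of characters consumed so far, regardless of how long the matched expressions are.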
+
+In this simplistic design, yyerror() is part of the scanner module and
+it uses the pos variable to report the location of unexpected tokens.
+The downside of such an arrangement is that, in an error condition, you
+can't easily tell whether your context is the current or the lookahead
+token; it just reports the position of the last token read (be it $ (end
+of buffer) or something else):
+
+./check 0 convert "mol/foo"
+parse error, expecting `BASIC_UNIT' or `INTEGER' or `POSITIVE_NUMBER' or
+`'(''
+
+mol/foo
+       ^^^
+
+(should be at the beginning of "foo")
+
+./check 0 convert "mmol//l"        
+parse error, expecting `BASIC_UNIT' or `INTEGER' or `POSITIVE_NUMBER' or
+`'(''
+
+mmol//l
+    ^
+
+(should be at the second '/')
+
+
+I believe this is why most simple parsers made with yacc would report
+parse errors being "at or near" some token, which is fair enough if the
+expression is not too complex.
+
+
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+2. The second version of the same scanner,
+http://wit.mcs.anl.gov/~selkovjr/scanner-example.tgz, addresses this
+problem by recording exact locations of the tokens in each instance of
+the token semantic data structure. The global,
+
+UNIT_YYSTYPE unit_yylval;
+
+would be normally used to export the token semantics (including its
+original or modified text and location data) to the parser.
+Unfortunately, I cannot show you the parser part in C, because that's
+about when I stopped writing parsers in C. Instead, I included a small
+test program, test.c, that mimics the parser's expectations for the
+scanner data pretty well. I am assuming here that you are not interested
+in digging through someone else's ugly guts for a relatively small bit of
+information; let me know if I am wrong and I will send you the complete
+perl code (also generated with bison).
+
+To run this example, unpack the tar file and run make. Then do
+
+  gcc test.c scanner.o
+
+and run a.out
+
+Note the line
+
+    yylval = unit_getyylval();
+
+in test.c. You will not normally need it in a C parser. It is enough to
+define yylval as an external variable and link it to yylval in yylex().
+
+In the bison-generated parser, yylloc gets pushed onto a stack (pointed
+to by yylsp) each time a new token is read. For each syntax rule, the
+bison macros @1, @2, ... are just shortcuts to locations in the stack 1,
+2, ... levels deep. In the following code fragment, @3 refers to the
+location info for the third term in the rule (INTEGER):
+
+(sorry about perl, but I think you can do the same things in c without
+significant changes to your existing parser)
+
+term:           base    {
+                        $$ = $1;
+                        $$->{'order'} = 1;
+                }
+        |       base EXP INTEGER {
+                        $$ = $1;
+                        $$->{'order'} = @3->{'text'};
+                        $$->{'scale'} = $$->{'scale'} ** $$->{'order'};
+                        if ( $$->{'order'} == 0 ) {
+                                yyerror("Error: expecting a non-zero integer exponent");
+                                YYERROR;
+                        }
+                }
+
+
+which translates to:
+
+  ($yyn == 10)    && do {
+          $yyval = $yyvsa[-1];
+          $yyval->{'order'} = 1;
+          last SWITCH;
+  };
+
+  ($yyn == 11)    && do {
+          $yyval = $yyvsa[-3];
+          $yyval->{'order'} = $yylsa[-1]->{'text'};
+          $yyval->{'scale'} = $yyval->{'scale'} ** $yyval->{'order'};
+          if ( $yyval->{'order'} == 0 ) {
+                   yyerror("Error: expecting a non-zero integer exponent");
+                   goto yyerrlab1 ;
+          }
+          last SWITCH;
+  };
+
+In C, you will have somewhat more complicated pointer arithmetic to
+address the stack, but the usage of objects will be the same. Note here
+that it is convenient to keep all information about the token in its
+location info (yylsa, yylsp, yylloc, @n), while everything relating to
+the value of the expression, or to the parse tree, is better placed in
+the semantic stack (yyvsa, yyvsp, yylval, $n). Also note that in some
+cases you can do semantic checks inside rules and report useful messages
+before or instead of invoking yyerror().
+
+Finally, it is useful to make the following wrapper function around
+external yylex() in order to maintain your own token stack. Unlike the
+parser's internal stack which is only as deep as the rule being reduced,
+this one can hold all tokens recognized during the current run, and that
+can be extremely helpful for error reporting and any transformations you
+may need. In this way, you can even scan (tokenize) the whole buffer
+before handing it off to the parser (who knows, you may need a token
+ahead of what is currently seen by the parser):
+
+
+sub tokenize {
+    undef @tokenTable;
+    my ($tok, $text, $name, $unit, $first_line, $first_column,
+        $last_line, $last_column);
+
+    # this is where the c-coded yylex is called;
+    # UnitLex is the perl extension encapsulating it
+    while ( ($tok = &UnitLex::yylex()) > 0 ) {
+       ( $text, $name, $unit, $first_line, $first_column, $last_line,
+         $last_column ) = &UnitLex::getyylval;
+       push(@tokenTable,
+           Unit::yyltype->new (
+              'token'         => $tok,
+              'text'          => $text,
+              'name'          => $name,
+              'unit'          => $unit,
+              'first_line'    => $first_line,
+              'first_column'  => $first_column,
+              'last_line'     => $last_line,
+              'last_column'   => $last_column,
+           )
+       );
+    }
+
+}
+
+
+It is now a lot easier to handle various state-related problems, such as
+backtracking and error reporting. The yylex() function as seen by the
+parser might be constructed somewhat like this:
+
+sub yylex {
+    # $tokenNo is a global; now instead of a "file pointer",
+    # as in the first example, we have a "token pointer"
+    $yylloc = $tokenTable[$tokenNo];
+    undef $yylval;
+
+
+    # disregard this; name this block "computing semantic values"
+    if ( $yylloc->{'token'} == UNIT) {
+        $yylval = Unit::Operand->new(
+        'unit'  => Unit::Dict::unit($yylloc->{'unit'}),
+        'base'  => Unit::Dict::base($yylloc->{'unit'}),
+        'scale' => Unit::Dict::scale($yylloc->{'unit'}),
+        'scaleToBase' => Unit::Dict::scaleToBase($yylloc->{'unit'}),
+        'loc'   => $yylloc,
+       );
+    }
+    elsif ( ($yylloc->{'token'} == INTEGER ) ||
+            ($yylloc->{'token'} == POSITIVE_NUMBER) ) {
+        $yylval = Unit::Operand->new(
+          'unit' => '1',
+          'base' => '1',
+          'scale' => 1,
+          'scaleToBase' => 1,
+          'loc'   => $yylloc,
+        );
+    }
+
+    $tokenNo++;
+    # This is all the parser needs to know about this token.
+    # But we already made sure we saved everything we need to know.
+    return($yylloc->{'token'});
+}
+
+
+Now the most interesting part, the error reporting routine:
+
+
+sub yyerror {
+    my ($str) = @_;
+    my ($message, $start, $end, $loc);
+
+    # This is the same as to say,
+    # "obtain the location info for the current token"
+    $loc = $tokenTable[$tokenNo-1];
+
+    # You may use this routine for your own purposes or let parser use it
+    if( $str ne 'parse error' ) {
+        $message = "$str instead of `" . $loc->{'name'} . "' <" .
+            $loc->{'text'} . ">,  at line " . $loc->{'first_line'} . ":\n\n";
+    }
+    else {
+        $message = "unexpected token `" . $loc->{'name'} . "' <" .
+            $loc->{'text'} . ">,  at line " . $loc->{'first_line'} . ":\n\n";
+    }
+
+    # that's the original string that was used to set the parser buffer
+    $message .= $parseBuffer . "\n";
+
+    $message .= ( ' ' x ($loc->{'first_column'} + 1) ) .
+        ( '^' x length($loc->{'text'}) ) . "\n";
+    if( $str ne 'parse error' ) {
+        print STDERR "$str instead of `", $loc->{'name'}, "' {",
+            $loc->{'text'}, "},  at line ", $loc->{'first_line'}, ":\n\n";
+    }
+    else {
+        print STDERR "unexpected token `", $loc->{'name'}, "' {",
+            $loc->{'text'}, "},  at line ", $loc->{'first_line'}, ":\n\n";
+    }
+
+    print STDERR "$parseBuffer\n";
+    print STDERR ' ' x ($loc->{'first_column'} + 1),
+        '^' x length($loc->{'text'}), "\n";
+}
+
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Scanners used in these examples assume there is a single line of text on
+the input (the first_line and last_line elements of yylloc are simply
+ignored). If you want to be able to parse multi-line buffers, just add a
+lex rule for '\n' that will increment the line count and reset the pos
+variable to zero.
+
+
+Ugly as it may seem, I find this approach extremely liberating. If the
+grammar becomes too complicated for a LALR(1) parser, I can cascade
+multiple parsers. The token table can then be used to reassemble parts
+of original expression for subordinate parsers, preserving the location
+info all the way down, so that subordinate parsers can report their
+problems consistently. You probably don't need this, as SQL is very well
+thought out and has a parsable grammar. But it may be of some help for
+error reporting.
+
+
+--Gene
+
--- a/doc/TODO.detail/limit
+++ b/doc/TODO.detail/limit
--- a/doc/TODO.detail/logging
+++ b/doc/TODO.detail/logging
@ -0,0 +1,207 @@
+From owner-pgsql-hackers@hub.org Fri Nov 13 13:24:37 1998
+Received: from hub.org (majordom@hub.org [209.47.148.200])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA13457
+	for <maillist@candle.pha.pa.us>; Fri, 13 Nov 1998 13:24:35 -0500 (EST)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.9.1/8.9.1) with SMTP id NAA02464;
+	Fri, 13 Nov 1998 13:22:52 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 13 Nov 1998 13:21:14 +0000 (EST)
+Received: (from majordom@localhost)
+	by hub.org (8.9.1/8.9.1) id NAA02331
+	for pgsql-hackers-outgoing; Fri, 13 Nov 1998 13:21:12 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from orion.SAPserv.Hamburg.dsh.de (Tpolaris2.sapham.debis.de [53.2.131.8])
+	by hub.org (8.9.1/8.9.1) with SMTP id NAA02316
+	for <pgsql-hackers@postgreSQL.org>; Fri, 13 Nov 1998 13:21:06 -0500 (EST)
+	(envelope-from wieck@sapserv.debis.de)
+Received: by orion.SAPserv.Hamburg.dsh.de 
+	for pgsql-hackers@postgreSQL.org 
+	id m0zeOEf-000EBPC; Fri, 13 Nov 98 19:46 MET
+Message-Id: <m0zeOEf-000EBPC@orion.SAPserv.Hamburg.dsh.de>
+From: jwieck@debis.com (Jan Wieck)
+Subject: [HACKERS] shmem limits and redolog
+To: pgsql-hackers@postgreSQL.org (PostgreSQL HACKERS)
+Date: Fri, 13 Nov 1998 19:46:20 +0100 (MET)
+Reply-To: jwieck@debis.com (Jan Wieck)
+X-Mailer: ELM [version 2.4 PL25]
+Content-Type: text
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: ROr
+
+Hi,
+
+    I'm  currently  hacking  around on a solution for logging all
+    database operations at query level that can recover a crashed
+    database  from  the last successful backup by redoing all the
+    commands.
+
+    Well, I wanted it to be as flexible as possible. So I  decided  to
+    make  it  per  database  configurable.  One  could  say which
+    databases are logged and if a database is, if  it  is  logged
+    sync  or async (in sync mode, every COMMIT forces an fsync of
+    the actual logfile and controlfiles).
+
+    To make async mode as fast as possible, I'm using a shared  memory
+    of  32K per database (not per backend) that is used as a wrap
+    around  buffer  from  the  backends  to  place  their   query
+    information.  So  the  log writer can fall a little behind if
+    there are many backends doing  different  things  that  don't
+    lock each other.
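
A rough C sketch of such a wrap-around buffer, for illustration only. The struct layout and function names are assumptions rather than the redolog code, the buffer is an ordinary static object instead of a shared memory segment, and the semaphore locking that concurrent backends would need is left out.

```c
#include <assert.h>
#include <stddef.h>

#define LOG_BUF_SIZE (32 * 1024)  /* 32K per database, as described above */

/* Hypothetical layout of the per-database buffer. */
typedef struct {
    size_t head;                /* next byte the log writer will read */
    size_t used;                /* bytes written but not yet consumed */
    char   data[LOG_BUF_SIZE];
} LogRing;

void ring_init(LogRing *r)
{
    assert(r != NULL);
    r->head = 0;
    r->used = 0;
}

/* Backend side: append len bytes; returns 0 if there is not enough room. */
int ring_put(LogRing *r, const char *src, size_t len)
{
    size_t tail, i;

    if (len > LOG_BUF_SIZE - r->used)
        return 0;               /* the writer has fallen too far behind */
    tail = (r->head + r->used) % LOG_BUF_SIZE;
    for (i = 0; i < len; i++)
        r->data[(tail + i) % LOG_BUF_SIZE] = src[i];
    r->used += len;
    return 1;
}

/* Log writer side: consume up to max bytes; returns the number copied. */
size_t ring_get(LogRing *r, char *dst, size_t max)
{
    size_t n = r->used < max ? r->used : max;
    size_t i;

    for (i = 0; i < n; i++)
        dst[i] = r->data[(r->head + i) % LOG_BUF_SIZE];
    r->head = (r->head + n) % LOG_BUF_SIZE;
    r->used -= n;
    return n;
}
```

The log writer can lag behind by at most LOG_BUF_SIZE bytes; once the buffer is full, `ring_put` fails and a backend would have to wait.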
+
+    Now  I'm  a  little  in  doubt about the shared memory limits
+    reported.  Was it a good decision to use shared memory? Am  I
+    better off using sockets?
+
+    The  bad  thing  in  what  I  have  up  to now (it's far from
+    complete) is, that even if a database isn't currently logged,
+    a redolog writer is started and creates the 32K shmem segment
+    (plus a semaphore set with 5 semaphores). This is  because  I
+    plan to create commands like
+
+        ALTER DATABASE LOG MODE=ASYNC LOGDIR='/somewhere/dbname';
+
+    and the like that can be used at runtime (while more than one
+    backend is connected to the database) to turn logging on/off,
+    switch  to/from  backup  mode (all other activity is stopped)
+    etc.
+
+    So every 32 databases will require another megabyte of shared
+    memory.  The  logging  master  controls  which databases have
+    activity  and  kills  redolog  writers  after  some  time  of
+    inactivity,  and  the shmem is freed then. But it can hurt if
+    someone really has many many databases that are all  used  at
+    the same time.
+
+    What do the others say?
+
+
+Jan
+
+--
+
+#======================================================================#
+# It's easier to get forgiveness for being wrong than for being right. #
+# Let's break this rule - forgive me.                                  #
+#======================================== jwieck@debis.com (Jan Wieck) #
+
+
+
+
+From owner-pgsql-hackers@hub.org Wed Dec 16 15:46:41 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id PAA00521
+	for <maillist@candle.pha.pa.us>; Wed, 16 Dec 1998 15:46:40 -0500 (EST)
+Received: from hub.org (majordom@hub.org [209.47.145.100]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id PAA08772 for <maillist@candle.pha.pa.us>; Wed, 16 Dec 1998 15:10:01 -0500 (EST)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.9.1/8.9.1) with SMTP id PAA01254;
+	Wed, 16 Dec 1998 15:06:56 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 16 Dec 1998 14:58:11 +0000 (EST)
+Received: (from majordom@localhost)
+	by hub.org (8.9.1/8.9.1) id OAA00660
+	for pgsql-hackers-outgoing; Wed, 16 Dec 1998 14:58:10 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from orion.SAPserv.Hamburg.dsh.de (Tpolaris2.sapham.debis.de [53.2.131.8])
+	by hub.org (8.9.1/8.9.1) with SMTP id OAA00643
+	for <pgsql-hackers@postgreSQL.org>; Wed, 16 Dec 1998 14:58:05 -0500 (EST)
+	(envelope-from wieck@sapserv.debis.de)
+Received: by orion.SAPserv.Hamburg.dsh.de 
+	for pgsql-hackers@postgreSQL.org 
+	id m0zqNDo-000EBTC; Wed, 16 Dec 98 21:07 MET
+Message-Id: <m0zqNDo-000EBTC@orion.SAPserv.Hamburg.dsh.de>
+From: jwieck@debis.com (Jan Wieck)
+Subject: Re: [HACKERS] redolog - for discussion
+To: vadim@krs.ru (Vadim Mikheev)
+Date: Wed, 16 Dec 1998 21:07:00 +0100 (MET)
+Cc: jwieck@debis.com, pgsql-hackers@postgreSQL.org
+Reply-To: jwieck@debis.com (Jan Wieck)
+In-Reply-To: <3677B71D.C67462B3@krs.ru> from "Vadim Mikheev" at Dec 16, 98 08:35:25 pm
+X-Mailer: ELM [version 2.4 PL25]
+Content-Type: text
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+Vadim wrote:
+
+>
+> Jan Wieck wrote:
+> >
+> >     RECOVER DATABASE {ALL | UNTIL 'datetime' | RESET};
+> >
+> ...
+> >
+> >         For  the  others, the backend starts the recovery program
+> >         which  reads  the  redolog  files,  establishes  database
+> >         connections  as  required  and reruns all the commands in
+>                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^
+> >         them. If a required logfile isn't  found,  it  tells  the
+>           ^^^^^
+>
+> I foresee problems with using _commands_ logging for
+> recovery/replication -:((
+>
+> Let's consider two concurrent updates in READ COMMITTED mode:
+>
+> update test set x = 2 where y = 1;
+>
+>    and
+>
+> update test set x = 3 where y = 1;
+>
+> The result of both committed transaction will be x = 2
+> if the 1st transaction updated row _after_ 2nd transaction
+> and x = 3 if the 2nd transaction gets row after 1st one.
+> Order of updates is not defined by order in which commands
+> begun and so order in which commands should be rerun
+> will be unknown...
+
+    Yepp,  the order in which commands began is absolutely not of
+    interest. Locking could already delay the  execution  of  one
+    command  until  another  one  started  later has finished and
+    released the lock.  It's a classic race condition.
+
+    Thus, my plan was to log the queries just before the call  to
+    CommitTransactionCommand()  in  tcop. This has the  advantage
+    that queries which bail out with errors don't  get  into  the
+    log  at  all  and  need not be rerun. And I can set a  static
+    flag to false before starting the command, which  is  set  to
+    true  in  the buffer manager when a buffer is written (marked
+    dirty), so filtering out queries that do no updates at all is
+    easy.
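
A small C sketch of that flag, purely illustrative. All names here are hypothetical and the "log" is just a counter; the real hook points would be in tcop and the buffer manager as described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool buffer_written = false;  /* set when any buffer is marked dirty */
static int  logged_queries = 0;      /* stand-in for the redo log */

/* Reset the flag before a command starts. */
void command_start(void)
{
    buffer_written = false;
}

/* Would be called from the buffer manager when a buffer is written. */
void buffer_mark_dirty(void)
{
    buffer_written = true;
}

/*
 * Would be called just before the commit. Queries that aborted never get
 * here, and queries that changed nothing are filtered out by the flag.
 * Returns true if the query was logged.
 */
bool log_before_commit(const char *query)
{
    assert(query != NULL);
    if (!buffer_written)
        return false;
    logged_queries++;
    return true;
}
```

Under these assumptions a plain SELECT leaves the flag false and is skipped, while an UPDATE that touched a buffer is logged once at commit.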
+
+    Unfortunately  query  level  logging gets hit by the  current
+    implementation of sequence numbers. If  a  query  that   gets
+    aborted  somewhere  in the middle (maybe by a trigger) called
+    nextval() for rows processed  earlier,  the  sequence  number
+    isn't  advanced  at  recovery  time,  because  the  query  is
+    suppressed entirely. And  sequences  aren't  locked,  so  for
+    concurrently  running  queries  getting numbers from the same
+    sequence,  the  results   aren't   reproducible.   If    some
+    application  selects  a  value  resulting from a sequence and
+    uses that later in another query, how could the redolog  know
+    that  this has changed? It's a Const in the query logged, and
+    all that corrupts the whole thing.
+
+    All that is painful and I don't see another solution yet than
+    to  hook  into  nextval(),  log  the  numbers  generated   in
+    normal operation and get back  the  same  numbers   in   redo
+    mode.
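
One way to picture that hook, as a C sketch under heavy simplification. The `LoggedSequence` type and its functions are hypothetical, the log is an in-memory array, and the sketch ignores concurrency and the question of which query consumed which number.

```c
#include <assert.h>
#include <stddef.h>

#define SEQ_LOG_MAX 1024

/* Hypothetical sequence that records every number it hands out. */
typedef struct {
    long   next;              /* next number in normal operation */
    long   log[SEQ_LOG_MAX];  /* numbers handed out, in order */
    size_t logged;            /* entries recorded so far */
    size_t replayed;          /* entries already returned in redo mode */
    int    redo_mode;         /* nonzero while recovering */
} LoggedSequence;

void seq_init(LoggedSequence *s, long start)
{
    assert(s != NULL);
    s->next = start;
    s->logged = 0;
    s->replayed = 0;
    s->redo_mode = 0;
}

/* Switch to redo mode: nextval now replays the recorded numbers. */
void seq_begin_redo(LoggedSequence *s)
{
    assert(s != NULL);
    s->redo_mode = 1;
    s->replayed = 0;
}

long seq_nextval(LoggedSequence *s)
{
    long v;

    if (s->redo_mode) {
        assert(s->replayed < s->logged);  /* the log must cover the replay */
        return s->log[s->replayed++];
    }
    assert(s->logged < SEQ_LOG_MAX);
    v = s->next++;
    s->log[s->logged++] = v;              /* remember it for recovery */
    return v;
}
```

In redo mode the caller gets back exactly the numbers generated during normal operation, in the same order, without advancing the sequence again.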
+
+    The whole thing gets more and more complicated :-(
+
+
+Jan
+
+--
+
+#======================================================================#
+# It's easier to get forgiveness for being wrong than for being right. #
+# Let's break this rule - forgive me.                                  #
+#======================================== jwieck@debis.com (Jan Wieck) #
+
+
+
+
--- a/doc/TODO.detail/memory
+++ b/doc/TODO.detail/memory
--- a/doc/TODO.detail/nulls
+++ b/doc/TODO.detail/nulls
@ -0,0 +1,119 @@
+From owner-pgsql-general@hub.org Fri Oct  9 18:22:09 1998
+Received: from hub.org (majordom@hub.org [209.47.148.200])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id SAA04220
+	for <maillist@candle.pha.pa.us>; Fri, 9 Oct 1998 18:22:08 -0400 (EDT)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.8.8/8.8.8) with SMTP id SAA26960;
+	Fri, 9 Oct 1998 18:18:29 -0400 (EDT)
+	(envelope-from owner-pgsql-general@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Fri, 09 Oct 1998 18:18:07 +0000 (EDT)
+Received: (from majordom@localhost)
+	by hub.org (8.8.8/8.8.8) id SAA26917
+	for pgsql-general-outgoing; Fri, 9 Oct 1998 18:18:04 -0400 (EDT)
+	(envelope-from owner-pgsql-general@postgreSQL.org)
+X-Authentication-Warning: hub.org: majordom set sender to owner-pgsql-general@postgreSQL.org using -f
+Received: from gecko.statsol.com (gecko.statsol.com [198.11.51.133])
+	by hub.org (8.8.8/8.8.8) with ESMTP id SAA26904
+	for <pgsql-general@postgresql.org>; Fri, 9 Oct 1998 18:17:46 -0400 (EDT)
+	(envelope-from statsol@statsol.com)
+Received: from gecko (gecko [198.11.51.133])
+	by gecko.statsol.com (8.9.0/8.9.0) with SMTP id SAA00557
+	for <pgsql-general@postgresql.org>; Fri, 9 Oct 1998 18:18:00 -0400 (EDT)
+Date: Fri, 9 Oct 1998 18:18:00 -0400 (EDT)
+From: Steve Doliov <statsol@statsol.com>
+X-Sender: statsol@gecko
+To: pgsql-general@postgreSQL.org
+Subject: Re: [GENERAL] Making NULLs visible.
+Message-ID: <Pine.GSO.3.96.981009181716.545B-100000@gecko>
+MIME-Version: 1.0
+Content-Type: TEXT/PLAIN; charset=US-ASCII
+Sender: owner-pgsql-general@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+On Fri, 9 Oct 1998, Bruce Momjian wrote:
+
+> [Charset iso-8859-1 unsupported, filtering to ASCII...]
+> > > Yes, \ always outputs as \\, except someone changed it last week, and I
+> > > am requesting a reversal.  Do you like the \N if it is unique?
+> > 
+> > Well, it's certainly clear, but could be confused with \n (newline). Can we
+> > have \0 instead?
+> 
+> Yes, but it is uppercase.  \0 looks like an octal number to me, and I
+> think we even output octals sometimes, don't we?
+> 
+
+my first suggestion may have been hare-brained, but why not just make the
+specifics of the output user-configurable.  So if the user chooses \0, so
+be it, if the user chooses \N so be it, if the user likes NULL so be it.
+but the option would only have one value per database at any given point
+in time.  so database x could use \N on tuesday and NULL on wednesday, but
+database x could never have two references to the character(s) used to
+represent a null value.
+
+steve
+
+
+
+
+From owner-pgsql-general@hub.org Sun Oct 11 17:31:08 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA20043
+	for <maillist@candle.pha.pa.us>; Sun, 11 Oct 1998 17:31:02 -0400 (EDT)
+Received: from hub.org (majordom@hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id RAA03069 for <maillist@candle.pha.pa.us>; Sun, 11 Oct 1998 17:10:34 -0400 (EDT)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.8.8/8.8.8) with SMTP id QAA10856;
+	Sun, 11 Oct 1998 16:57:34 -0400 (EDT)
+	(envelope-from owner-pgsql-general@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sun, 11 Oct 1998 16:53:35 +0000 (EDT)
+Received: (from majordom@localhost)
+	by hub.org (8.8.8/8.8.8) id QAA10393
+	for pgsql-general-outgoing; Sun, 11 Oct 1998 16:53:34 -0400 (EDT)
+	(envelope-from owner-pgsql-general@postgreSQL.org)
+X-Authentication-Warning: hub.org: majordom set sender to owner-pgsql-general@postgreSQL.org using -f
+Received: from mail1.panix.com (mail1.panix.com [166.84.0.212])
+	by hub.org (8.8.8/8.8.8) with ESMTP id QAA10378
+	for <pgsql-general@postgreSQL.org>; Sun, 11 Oct 1998 16:53:28 -0400 (EDT)
+	(envelope-from tomg@admin.nrnet.org)
+Received: from mailhost.nrnet.org (root@mailhost.nrnet.org [166.84.192.39])
+	by mail1.panix.com (8.8.8/8.8.8/PanixM1.3) with ESMTP id QAA16311
+	for <pgsql-general@postgreSQL.org>; Sun, 11 Oct 1998 16:53:24 -0400 (EDT)
+Received: from admin.nrnet.org (uucp@localhost)
+          by mailhost.nrnet.org (8.8.7/8.8.4) with UUCP
+   id QAA16345 for pgsql-general@postgreSQL.org; Sun, 11 Oct 1998 16:28:47 -0400
+Received: from localhost (tomg@localhost)
+	by admin.nrnet.org (8.8.7/8.8.7) with SMTP id QAA11569
+	for <pgsql-general@postgreSQL.org>; Sun, 11 Oct 1998 16:28:41 -0400
+Date: Sun, 11 Oct 1998 16:28:41 -0400 (EDT)
+From: Thomas Good <tomg@admin.nrnet.org>
+To: pgsql-general@postgreSQL.org
+Subject: Re: [GENERAL] Making NULLs visible.
+In-Reply-To: <Pine.GSO.3.96.981009181716.545B-100000@gecko>
+Message-ID: <Pine.LNX.3.96.981011161908.11556A-100000@admin.nrnet.org>
+MIME-Version: 1.0
+Content-Type: TEXT/PLAIN; charset=US-ASCII
+Sender: owner-pgsql-general@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+Watching all this go by...as a guy who has to move a lot of data
+from legacy dbs to postgres, I've gotten used to \N being a null.
+
+My vote, if I were allowed to cast one, would be to have one null 
+and that would be the COPY command null.  I have no difficulty
+distinguishing a null from a newline...
+
+At the pgsql command prompt I would find seeing \N rather reassuring.
+I've seen a lot of these little guys.
+
+    ---------- Sisters of Charity Medical Center ----------
+                   Department of Psychiatry
+                            ----     
+    Thomas Good                          <tomg@q8.nrnet.org>
+    Coordinator, North Richmond C.M.H.C. Information Systems
+    75 Vanderbilt Ave, Quarters 8        Phone: 718-354-5528
+    Staten Island, NY   10304            Fax:   718-354-5056
+
+
+
--- a/doc/TODO.detail/optimizer
+++ b/doc/TODO.detail/optimizer
@ -0,0 +1,987 @@
+From owner-pgsql-hackers@hub.org Mon Mar 22 18:43:41 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id SAA23978
+	for <maillist@candle.pha.pa.us>; Mon, 22 Mar 1999 18:43:39 -0500 (EST)
+Received: from hub.org (majordom@hub.org [209.47.145.100]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id SAA06472 for <maillist@candle.pha.pa.us>; Mon, 22 Mar 1999 18:36:44 -0500 (EST)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.9.2/8.9.1) with SMTP id SAA92604;
+	Mon, 22 Mar 1999 18:34:23 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 22 Mar 1999 18:33:50 +0000 (EST)
+Received: (from majordom@localhost)
+	by hub.org (8.9.2/8.9.1) id SAA92469
+	for pgsql-hackers-outgoing; Mon, 22 Mar 1999 18:33:47 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from po8.andrew.cmu.edu (PO8.ANDREW.CMU.EDU [128.2.10.108])
+	by hub.org (8.9.2/8.9.1) with ESMTP id SAA92456
+	for <pgsql-hackers@postgresql.org>; Mon, 22 Mar 1999 18:33:41 -0500 (EST)
+	(envelope-from er1p+@andrew.cmu.edu)
+Received: (from postman@localhost) by po8.andrew.cmu.edu (8.8.5/8.8.2) id SAA12894 for pgsql-hackers@postgresql.org; Mon, 22 Mar 1999 18:33:38 -0500 (EST)
+Received: via switchmail; Mon, 22 Mar 1999 18:33:38 -0500 (EST)
+Received: from cloudy.me.cmu.edu via qmail
+          ID </afs/andrew.cmu.edu/service/mailqs/q007/QF.Aqxh7Lu00gNtQ0TZE5>;
+          Mon, 22 Mar 1999 18:27:20 -0500 (EST)
+Received: from cloudy.me.cmu.edu via qmail
+          ID </afs/andrew.cmu.edu/usr2/er1p/.Outgoing/QF.Uqxh7JS00gNtMmTJFk>;
+          Mon, 22 Mar 1999 18:27:17 -0500 (EST)
+Received: from mms.4.60.Jun.27.1996.03.05.56.sun4.41.EzMail.2.0.CUILIB.3.45.SNAP.NOT.LINKED.cloudy.me.cmu.edu.sun4m.412
+          via MS.5.6.cloudy.me.cmu.edu.sun4_41;
+          Mon, 22 Mar 1999 18:27:15 -0500 (EST)
+Message-ID: <sqxh7H_00gNtAmTJ5Q@andrew.cmu.edu>
+Date: Mon, 22 Mar 1999 18:27:15 -0500 (EST)
+From: Erik Riedel <riedel+@CMU.EDU>
+To: pgsql-hackers@postgreSQL.org
+Subject: [HACKERS] optimizer and type question
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+
+[last week aggregation, this week, the optimizer]
+
+I have a somewhat general optimizer question/problem that I would like
+to get some input on - i.e. I'd like to know what is "supposed" to
+work here and what I should be expecting.  Sadly, I think the patch
+for this is more involved than my last message.
+
+Using my favorite table these days:
+
+Table    = lineitem
+------------------------+----------------------------------+-------+
+|              Field     |              Type                | Length|
+------------------------+----------------------------------+-------+
+| l_orderkey             | int4 not null                    |     4 |
+| l_partkey              | int4 not null                    |     4 |
+| l_suppkey              | int4 not null                    |     4 |
+| l_linenumber           | int4 not null                    |     4 |
+| l_quantity             | float4 not null                  |     4 |
+| l_extendedprice        | float4 not null                  |     4 |
+| l_discount             | float4 not null                  |     4 |
+| l_tax                  | float4 not null                  |     4 |
+| l_returnflag           | char() not null                  |     1 |
+| l_linestatus           | char() not null                  |     1 |
+| l_shipdate             | date                             |     4 |
+| l_commitdate           | date                             |     4 |
+| l_receiptdate          | date                             |     4 |
+| l_shipinstruct         | char() not null                  |    25 |
+| l_shipmode             | char() not null                  |    10 |
+| l_comment              | char() not null                  |    44 |
+------------------------+----------------------------------+-------+
+Index:    lineitem_index_
+
+and the query:
+
+--
+-- Query 1
+--
+explain select l_returnflag, l_linestatus, sum(l_quantity) as sum_qty, 
+sum(l_extendedprice) as sum_base_price, 
+sum(l_extendedprice*(1-l_discount)) as sum_disc_price, 
+sum(l_extendedprice*(1-l_discount)*(1+l_tax)) as sum_charge, 
+avg(l_quantity) as avg_qty, avg(l_extendedprice) as avg_price, 
+avg(l_discount) as avg_disc, count(*) as count_order 
+from lineitem 
+where l_shipdate <= '1998-09-02'::date 
+group by l_returnflag, l_linestatus 
+order by l_returnflag, l_linestatus;
+
+
+note that I have eliminated the date calculation in my query of last
+week and manually replaced it with a constant (since this wasn't
+happening automatically - but let's not worry about that for now).
+And this is only an explain, we care about the optimizer.  So we get:
+
+Sort  (cost=34467.88 size=0 width=0)
+ ->  Aggregate  (cost=34467.88 size=0 width=0)
+   ->  Group  (cost=34467.88 size=0 width=0)
+     ->  Sort  (cost=34467.88 size=0 width=0)
+       ->  Seq Scan on lineitem  (cost=34467.88 size=200191 width=44)
+
+so let's think about the selectivity that is being chosen for the
+seq scan (the where l_shipdate <= '1998-09-02').
+
+Turns out the optimizer is choosing "33%", even though the real answer
+is somewhere in 90+% (that's how the query is designed).  So, why does
+it do that?
+
+Turns out that selectivity in this case is determined via
+plancat::restriction_selectivity() which calls into functionOID = 103
+(intltsel) for operatorOID = 1096 (date "<=") on relation OID = 18663
+(my lineitem).
+
+This all follows because of the description of 1096 (date "<=") in
+pg_operator.  Looking at local1_template1.bki.source near line 1754
+shows:
+
+insert OID = 1096 ( "<=" PGUID 0 <...> date_le intltsel intltjoinsel )
+
+where we see that indeed, it thinks "intltsel" is the right function
+to use for "oprrest" in the case of dates.
+
+Question 1 - is intltsel the right thing for selectivity on dates?
+
+Hope someone is still with me.
+
+So now we're running selfuncs::intltsel() where we make a further call
+to selfuncs::gethilokey().  The job of gethilokey is to determine the
+min and max values of a particular attribute in the table, which will
+then be used with the constant in my where clause to estimate the
+selectivity.  It is going to search the pg_statistic relation with
+three key values:
+
+Anum_pg_statistic_starelid     18663  (lineitem)
+Anum_pg_statistic_staattnum       11  (l_shipdate)
+Anum_pg_statistic_staop         1096  (date "<=")
+
+this finds no tuples in pg_statistic.  Why is that?  The only nearby
+tuple in pg_statistic is:
+
+starelid|staattnum|staop|stalokey        |stahikey       
+--------+---------+-----+----------------+----------------
+   18663|       11|    0|01-02-1992      |12-01-1998
+
+and the reason the query doesn't match anything?  Because 1096 != 0.
+But why is it 0 in pg_statistic?  Statistics are determined near line
+1844 in vacuum.c (assuming a 'vacuum analyze' run at some point):
+
+             i = 0;
+             values[i++] = (Datum) relid;            /* 1 */
+             values[i++] = (Datum) attp->attnum; /* 2 */
+====>        values[i++] = (Datum) InvalidOid;       /* 3 */
+             fmgr_info(stats->outfunc, &out_function);
+             out_string = <...min...>
+             values[i++] = (Datum) fmgr(F_TEXTIN, out_string);
+             pfree(out_string);
+             out_string = <...max...>
+             values[i++] = (Datum) fmgr(F_TEXTIN, out_string);
+             pfree(out_string);
+             stup = heap_formtuple(sd->rd_att, values, nulls);
+
+the "offending" line is setting the staop to InvalidOid (i.e. 0).
+
+Question 2 - is this right?  Is the intent for 0 to serve as a
+"wildcard", or should it be inserting an entry for each operation
+individually?
+
+If 0 is meant as a "wildcard", then gethilokey() should allow a match for
+
+Anum_pg_statistic_staop         0
+
+instead of requiring the more restrictive 1096.  In the current code,
+what happens next is gethilokey() returns "not found" and intltsel()
+returns the default 1/3 which I see in the resultant query plan (size
+= 200191 is 1/3 of the number of lineitem tuples).
+
+Question 3 - is there any inherent reason it couldn't get this right?
+The statistic is in the table 1992 to 1998, so the '1998-09-02' date
+should be 90-some% selectivity, a much better guess than 33%.
+
+Doesn't make a difference for this particular query, of course,
+because the seq scan must proceed anyhow, but it could easily affect
+other queries where selectivities matter (and it affects the
+modifications I am trying to test in the optimizer to be "smarter"
+about selectivities - my overall context is to understand/improve the
+behavior that the underlying storage system sees from queries like this).
+
+OK, so let's say we treat 0 as a "wildcard" and stop checking for
+1096.  Now we let gethilokey() return the two dates from the statistic
+table.  The immediate next thing that intltsel() does, near line 122
+in selfuncs.c, is call atol() on the strings from gethilokey().  And
+guess what it comes up with?
+
+low = 1
+high = 12
+
+because it calls atol() on '01-02-1992' and '12-01-1998'.  This
+clearly isn't right, it should get some large integer that includes
+the year and day in the result.  Then it should compare reasonably
+with my constant from the where clause and give a decent selectivity
+value.  This leads to a re-visit of Question 1.
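+
+To make the atol() behaviour concrete, here is a minimal standalone
+sketch.  It is not code from selfuncs.c, and key_as_seen_by_atol() is
+a made-up name used only for illustration.  It just shows that atol()
+stops at the first '-' in a date string, so only the leading month
+digits survive:
+

```c
#include <stdlib.h>

/*
 * Hypothetical helper (not part of PostgreSQL): returns what a caller
 * gets when it runs atol() directly on a key string from pg_statistic.
 */
static long
key_as_seen_by_atol(const char *key)
{
	/*
	 * atol() parses an optional sign and leading digits, then stops at
	 * the first character that cannot be part of an integer.  For a
	 * string such as "01-02-1992" that is the first '-', so the day and
	 * the year never contribute to the result.
	 */
	return atol(key);
}
```

+
+Calling it on '01-02-1992' and '12-01-1998' gives 1 and 12, the same
+low and high values shown above.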
+
+Question 4 - should date "<=" use a dateltsel() function instead of
+intltsel() as oprrest?
+
+If anyone is still with me, could you tell me if this makes sense, or
+if there is some other location where the appropriate type conversion
+could take place so that intltsel() gets something reasonable when it
+does the atol() calls?
+
+Could someone also give me a sense for how far out-of-whack the whole
+current selectivity-handling structure is?  It seems that most of the
+operators in pg_operator actually use intltsel() and would have
+type-specific problems like that described.  Or is the problem in the
+way attribute values are stored in pg_statistic by vacuum analyze?  Or
+is there another layer where type conversion belongs?
+
+Phew.  Enough typing, hope someone can follow this and address at
+least some of the questions.
+
+Thanks.
+
+Erik Riedel
+Carnegie Mellon University
+www.cs.cmu.edu/~riedel
+
+
+
+From owner-pgsql-hackers@hub.org Mon Mar 22 20:31:11 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id UAA00802
+	for <maillist@candle.pha.pa.us>; Mon, 22 Mar 1999 20:31:09 -0500 (EST)
+Received: from hub.org (majordom@hub.org [209.47.145.100]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id UAA13231 for <maillist@candle.pha.pa.us>; Mon, 22 Mar 1999 20:15:20 -0500 (EST)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.9.2/8.9.1) with SMTP id UAA01981;
+	Mon, 22 Mar 1999 20:14:04 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 22 Mar 1999 20:13:32 +0000 (EST)
+Received: (from majordom@localhost)
+	by hub.org (8.9.2/8.9.1) id UAA01835
+	for pgsql-hackers-outgoing; Mon, 22 Mar 1999 20:13:28 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6])
+	by hub.org (8.9.2/8.9.1) with ESMTP id UAA01822
+	for <pgsql-hackers@postgreSQL.org>; Mon, 22 Mar 1999 20:13:21 -0500 (EST)
+	(envelope-from tgl@sss.pgh.pa.us)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id UAA23294;
+	Mon, 22 Mar 1999 20:12:43 -0500 (EST)
+To: Erik Riedel <riedel+@CMU.EDU>
+cc: pgsql-hackers@postgreSQL.org
+Subject: Re: [HACKERS] optimizer and type question 
+In-reply-to: Your message of Mon, 22 Mar 1999 18:27:15 -0500 (EST) 
+             <sqxh7H_00gNtAmTJ5Q@andrew.cmu.edu> 
+Date: Mon, 22 Mar 1999 20:12:43 -0500
+Message-ID: <23292.922151563@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: ROr
+
+Erik Riedel <riedel+@CMU.EDU> writes:
+> [ optimizer doesn't find relevant pg_statistic entry ]
+
+It's clearly a bug that the selectivity code is not finding this tuple.
+If your analysis is correct, then selectivity estimation has *never*
+worked properly, or at least not in recent memory :-(.  Yipes.
+Bruce and I found a bunch of other problems in the optimizer recently,
+so it doesn't faze me to assume that this is broken too.
+
+> the "offending" line is setting the staop to InvalidOid (i.e. 0).
+> Question 2 - is this right?  Is the intent for 0 to serve as a
+> "wildcard",
+
+My thought is that what the staop column ought to be is the OID of the
+comparison function that was used to determine the sort order of the
+column.  Without a sort op the lowest and highest keys in the column are
+not well defined, so it makes no sense to assert "these are the lowest
+and highest values" without providing the sort op that determined that.
+(For sufficiently complex data types one could reasonably have multiple
+ordering operators.  A crude example is sorting on "circumference" and
+"area" for polygons.)  But typically the sort op will be the "<"
+operator for the column data type.
+
+So, the vacuum code is definitely broken --- it's not storing the sort
+op that it used.  The code in gethilokey might be broken too, depending
+on how it is producing the operator it's trying to match against the
+tuple.  For example, if the actual operator in the query is any of
+< <= > >= on int4, then int4lt ought to be used to probe the pg_statistic
+table.  I'm not sure if we have adequate info in pg_operator or pg_type
+to let the optimizer code determine the right thing to probe with :-(
+
+> The immediate next thing that intltsel() does, near line 122
+> in selfuncs.c, is call atol() on the strings from gethilokey().  And
+> guess what it comes up with?
+> low = 1
+> high = 12
+> because it calls atol() on '01-02-1992' and '12-01-1998'.  This
+> clearly isn't right, it should get some large integer that includes
+> the year and day in the result.  Then it should compare reasonably
+> with my constant from the where clause and give a decent selectivity
+> value.  This leads to a re-visit of Question 1.
+> Question 4 - should date "<=" use a dateltsel() function instead of
+> intltsel() as oprrest?
+
+This is clearly busted as well.  I'm not sure that creating dateltsel()
+is the right fix, however, because if you go down that path then every
+single datatype needs its own selectivity function; that's more than we
+need.
+
+What we really want here is to be able to map datatype values into
+some sort of numeric range so that we can compute what fraction of the
+low-key-to-high-key range is on each side of the probe value (the
+constant taken from the query).  This general concept will apply to
+many scalar types, so what we want is a type-specific mapping function
+and a less-specific fraction-computing-function.  Offhand I'd say that
+we want intltsel() and floatltsel(), plus conversion routines that can
+produce either int4 or float8 from a data type as seems appropriate.
+Anything that couldn't map to one or the other would have to supply its
+own selectivity function.
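+
+A rough sketch of that split, as standalone C rather than backend
+code.  Everything here is an assumption made for illustration: the
+names date_text_to_days() and range_fraction_le(), the "MM-DD-YYYY"
+text format, the 1970-01-01 epoch, and the simple clamping at the ends
+of the range (the existing intltsel() treats out-of-range probes
+differently):
+

```c
#include <stdio.h>

/*
 * Hypothetical type-specific mapping: convert "MM-DD-YYYY" text into a
 * day number (days since 1970-01-01, proleptic Gregorian calendar).
 * Returns 0 on success, -1 if the string does not parse.
 */
static int
date_text_to_days(const char *s, long *days)
{
	int			m, d, y;
	long		era, yoe, doy, doe;

	if (sscanf(s, "%d-%d-%d", &m, &d, &y) != 3)
		return -1;
	if (m < 1 || m > 12 || d < 1 || d > 31)
		return -1;

	/* Count days with the year shifted to start in March. */
	if (m <= 2)
		y -= 1;
	era = (y >= 0 ? y : y - 399) / 400;
	yoe = y - era * 400;
	doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;
	doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
	*days = era * 146097 + doe - 719468;
	return 0;
}

/*
 * Hypothetical less-specific fraction routine: the share of the
 * [low, high] range that lies at or below the probe value.
 */
static double
range_fraction_le(long low, long high, long probe)
{
	if (high <= low)
		return 1.0 / 3.0;		/* degenerate range: keep the default guess */
	if (probe <= low)
		return 0.0;
	if (probe >= high)
		return 1.0;
	return (double) (probe - low) / (double) (high - low);
}
```

+
+Only date_text_to_days() knows anything about dates; any other type
+that can be mapped onto an integer could reuse range_fraction_le()
+unchanged.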
+
+> Or is the problem in the
+> way attribute values are stored in pg_statistic by vacuum analyze?
+
+Looks like it converts the low and high values to text and stores them
+that way.  Ugly as can be :-( but I'm not sure there is a good
+alternative.  We have no "wild card" column type AFAIK, which is what
+these columns of pg_statistic would have to be to allow storage of
+unconverted min and max values.
+
+I think you've found a can of worms here.  Congratulations ;-)
+
+			regards, tom lane
+
+
+From owner-pgsql-hackers@hub.org Mon Mar 22 23:31:00 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id XAA03384
+	for <maillist@candle.pha.pa.us>; Mon, 22 Mar 1999 23:30:58 -0500 (EST)
+Received: from hub.org (majordom@hub.org [209.47.145.100]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id XAA25586 for <maillist@candle.pha.pa.us>; Mon, 22 Mar 1999 23:18:25 -0500 (EST)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.9.2/8.9.1) with SMTP id XAA17955;
+	Mon, 22 Mar 1999 23:17:24 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 22 Mar 1999 23:16:49 +0000 (EST)
+Received: (from majordom@localhost)
+	by hub.org (8.9.2/8.9.1) id XAA17764
+	for pgsql-hackers-outgoing; Mon, 22 Mar 1999 23:16:46 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from po8.andrew.cmu.edu (PO8.ANDREW.CMU.EDU [128.2.10.108])
+	by hub.org (8.9.2/8.9.1) with ESMTP id XAA17745
+	for <pgsql-hackers@postgreSQL.org>; Mon, 22 Mar 1999 23:16:39 -0500 (EST)
+	(envelope-from er1p+@andrew.cmu.edu)
+Received: (from postman@localhost) by po8.andrew.cmu.edu (8.8.5/8.8.2) id XAA04273; Mon, 22 Mar 1999 23:16:37 -0500 (EST)
+Received: via switchmail; Mon, 22 Mar 1999 23:16:37 -0500 (EST)
+Received: from hazy.adsl.net.cmu.edu via qmail
+          ID </afs/andrew.cmu.edu/service/mailqs/q000/QF.kqxlJ:S00anI00p040>;
+          Mon, 22 Mar 1999 23:15:09 -0500 (EST)
+Received: from hazy.adsl.net.cmu.edu via qmail
+          ID </afs/andrew.cmu.edu/usr2/er1p/.Outgoing/QF.MqxlJ3q00anI01hKE0>;
+          Mon, 22 Mar 1999 23:15:00 -0500 (EST)
+Received: from mms.4.60.Jun.27.1996.03.02.53.sun4.51.EzMail.2.0.CUILIB.3.45.SNAP.NOT.LINKED.hazy.adsl.net.cmu.edu.sun4m.54
+          via MS.5.6.hazy.adsl.net.cmu.edu.sun4_51;
+          Mon, 22 Mar 1999 23:14:55 -0500 (EST)
+Message-ID: <4qxlJ0200anI01hK40@andrew.cmu.edu>
+Date: Mon, 22 Mar 1999 23:14:55 -0500 (EST)
+From: Erik Riedel <riedel+@CMU.EDU>
+To: Tom Lane <tgl@sss.pgh.pa.us>
+Subject: Re: [HACKERS] optimizer and type question
+Cc: pgsql-hackers@postgreSQL.org
+In-Reply-To: <23292.922151563@sss.pgh.pa.us>
+References: <23292.922151563@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: ROr
+
+
+OK, building on your high-level explanation, I am attaching a patch that
+attempts to do something "better" than the current code.  Note that I
+have only tested this with the date type and my particular query.  I
+haven't run it through the regression, so consider it "proof of concept"
+at best.  Although hopefully it will serve my purposes.
+
+> My thought is that what the staop column ought to be is the OID of the
+> comparison function that was used to determine the sort order of the
+> column.  Without a sort op the lowest and highest keys in the column are
+> not well defined, so it makes no sense to assert "these are the lowest
+> and highest values" without providing the sort op that determined that.
+>
+> (For sufficiently complex data types one could reasonably have multiple
+> ordering operators.  A crude example is sorting on "circumference" and
+> "area" for polygons.)  But typically the sort op will be the "<"
+> operator for the column data type.
+>  
+I changed vacuum.c to do exactly that: it now stores the oid of the lt sort op.
+
+> So, the vacuum code is definitely broken --- it's not storing the sort
+> op that it used.  The code in gethilokey might be broken too, depending
+> on how it is producing the operator it's trying to match against the
+> tuple.  For example, if the actual operator in the query is any of
+> < <= > >= on int4, then int4lt ought to be used to probe the pg_statistic
+> table.  I'm not sure if we have adequate info in pg_operator or pg_type
+> to let the optimizer code determine the right thing to probe with :-(
+>  
+This indeed seems like a bigger problem.  I thought about somehow using
+type-matching from the sort op and the actual operator in the query - if
+both the left and right type match, then consider them the same for
+purposes of this probe.  That seemed complicated, so I punted in my
+example - it just does the search with relid and attnum and assumes that
+only returns one tuple.  This works in my case (maybe in all cases,
+because of the way vacuum is currently written - ?).
+
+> What we really want here is to be able to map datatype values into
+> some sort of numeric range so that we can compute what fraction of the
+> low-key-to-high-key range is on each side of the probe value (the
+> constant taken from the query).  This general concept will apply to
+> many scalar types, so what we want is a type-specific mapping function
+> and a less-specific fraction-computing-function.  Offhand I'd say that
+> we want intltsel() and floatltsel(), plus conversion routines that can
+> produce either int4 or float8 from a data type as seems appropriate.
+> Anything that couldn't map to one or the other would have to supply its
+> own selectivity function.
+>  
+This is what my example then does.  It uses the stored sort op to get the
+type and then uses typinput to convert from the string to an int4.
+
+It then puts the int4 back into string format because that's what everyone
+was expecting.
+
+It seems to work for my particular query.  I now get:
+
+(selfuncs) gethilokey() obj 18663 attr 11 opid 1096 (ignored)
+(selfuncs) gethilokey() found op 1087 in pg_proc
+(selfuncs) gethilokey() found type 1082 in pg_type
+(selfuncs) gethilokey() going to use 1084 to convert type 1082
+(selfuncs) gethilokey() have low -2921 high -396
+(selfuncs) intltsel() high -396 low -2921 val -486
+(plancat) restriction_selectivity() for func 103 op 1096 rel 18663 attr 11 const -486 flag 3 returns 0.964356
+NOTICE:  QUERY PLAN:
+
+Sort  (cost=34467.88 size=0 width=0)
+ ->  Aggregate  (cost=34467.88 size=0 width=0)
+  ->  Group  (cost=34467.88 size=0 width=0)
+   ->  Sort  (cost=34467.88 size=0 width=0)
+    ->  Seq Scan on lineitem  (cost=34467.88 size=579166 width=44)
+
+including my printfs, which exist in the patch as well.
+
+Selectivity is now the expected 96% and the size estimate for the seq
+scan is much closer to correct.
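+
+As a sanity check on those numbers, here is a small standalone sketch
+(not part of the patch).  fraction_below() and estimated_rows() are
+names I made up, and the total of 600573 tuples is an assumption: it
+is simply three times the earlier size=200191 estimate.
+

```c
/*
 * Hypothetical helper: fraction of the [low, high] range that lies
 * below val, i.e. (val - low) / (high - low).
 */
static double
fraction_below(long low, long high, long val)
{
	return (double) (val - low) / (double) (high - low);
}

/*
 * Hypothetical helper: scale a selectivity by a total tuple count,
 * truncating to a whole number of rows.
 */
static long
estimated_rows(double selectivity, long total_tuples)
{
	return (long) (selectivity * (double) total_tuples);
}
```

+
+With low -2921, high -396 and val -486 from the printfs above, the
+fraction is 2435/2525, about 0.964356.  Scaled by the assumed 600573
+tuples, that comes to 579166 rows, matching the new size in the plan.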
+
+Again, not tested with anything besides date, so caveat not-tested.
+
+Hope this helps.
+
+Erik
+
+----------------------[optimizer_fix.sh]------------------------
+
+#! /bin/sh
+# This is a shell archive, meaning:
+# 1. Remove everything above the #! /bin/sh line.
+# 2. Save the resulting text in a file.
+# 3. Execute the file with /bin/sh (not csh) to create:
+#	selfuncs.c.diff
+#	vacuum.c.diff
+# This archive created: Mon Mar 22 22:58:14 1999
+export PATH; PATH=/bin:/usr/bin:$PATH
+if test -f 'selfuncs.c.diff'
+then
+	echo shar: "will not over-write existing file 'selfuncs.c.diff'"
+else
+cat << \SHAR_EOF > 'selfuncs.c.diff'
+*** /afs/ece.cmu.edu/project/lcs/lcs-004/er1p/postgres/611/src/backend/utils/adt/selfuncs.c	Thu Mar 11 23:59:35 1999
+--- /afs/ece.cmu.edu/project/lcs/lcs-004/er1p/postgres/615/src/backend/utils/adt/selfuncs.c	Mon Mar 22 22:57:25 1999
+***************
+*** 32,37 ****
+--- 32,40 ----
+  #include "utils/lsyscache.h"	/* for get_oprrest() */
+  #include "catalog/pg_statistic.h"
+  
+ #include "catalog/pg_proc.h"    /* for Form_pg_proc */
+ #include "catalog/pg_type.h"    /* for Form_pg_type */
+ 
+  /* N is not a valid var/constant or relation id */
+  #define NONVALUE(N)		((N) == -1)
+  
+***************
+*** 103,110 ****
+  				bottom;
+  
+  	result = (float64) palloc(sizeof(float64data));
+! 	if (NONVALUE(attno) || NONVALUE(relid))
+  		*result = 1.0 / 3;
+  	else
+  	{
+  		/* XXX			val = atol(value); */
+--- 106,114 ----
+  				bottom;
+  
+  	result = (float64) palloc(sizeof(float64data));
+! 	if (NONVALUE(attno) || NONVALUE(relid)) {
+  		*result = 1.0 / 3;
+ 	}
+  	else
+  	{
+  		/* XXX			val = atol(value); */
+***************
+*** 117,130 ****
+  		}
+  		high = atol(highchar);
+  		low = atol(lowchar);
+  		if ((flag & SEL_RIGHT && val < low) ||
+  			(!(flag & SEL_RIGHT) && val > high))
+  		{
+  			float32data nvals;
+  
+  			nvals = getattdisbursion(relid, (int) attno);
+! 			if (nvals == 0)
+  				*result = 1.0 / 3.0;
+  			else
+  			{
+  				*result = 3.0 * (float64data) nvals;
+--- 121,136 ----
+  		}
+  		high = atol(highchar);
+  		low = atol(lowchar);
+ 		printf("(selfuncs) intltsel() high %d low %d val %d\n",high,low,val);
+  		if ((flag & SEL_RIGHT && val < low) ||
+  			(!(flag & SEL_RIGHT) && val > high))
+  		{
+  			float32data nvals;
+  
+  			nvals = getattdisbursion(relid, (int) attno);
+! 			if (nvals == 0) {
+  				*result = 1.0 / 3.0;
+ 			}
+  			else
+  			{
+  				*result = 3.0 * (float64data) nvals;
+***************
+*** 336,341 ****
+--- 342,353 ----
+  {
+  	Relation	rel;
+  	HeapScanDesc scan;
+ 	/* this assumes there is only one row in the statistics table for any particular */
+ 	/* relid, attnum pair - could be more complicated if staop is also used.         */
+ 	/* at the moment, if there are multiple rows, this code ends up picking the      */
+ 	/* "first" one                                                          - er1p  */
+ 	/* the actual "ignoring" is done in the call to heap_beginscan() below, where    */
+ 	/* we only mention 2 of the 3 keys in this array                        - er1p  */
+  	static ScanKeyData key[3] = {
+  		{0, Anum_pg_statistic_starelid, F_OIDEQ, {0, 0, F_OIDEQ}},
+  		{0, Anum_pg_statistic_staattnum, F_INT2EQ, {0, 0, F_INT2EQ}},
+***************
+*** 344,355 ****
+  	bool		isnull;
+  	HeapTuple	tuple;
+  
+  	rel = heap_openr(StatisticRelationName);
+  
+  	key[0].sk_argument = ObjectIdGetDatum(relid);
+  	key[1].sk_argument = Int16GetDatum((int16) attnum);
+  	key[2].sk_argument = ObjectIdGetDatum(opid);
+! 	scan = heap_beginscan(rel, 0, SnapshotNow, 3, key);
+  	tuple = heap_getnext(scan, 0);
+  	if (!HeapTupleIsValid(tuple))
+  	{
+--- 356,377 ----
+  	bool		isnull;
+  	HeapTuple	tuple;
+  
+ 	HeapTuple tup;
+ 	Form_pg_proc proc;
+ 	Form_pg_type typ;
+ 	Oid which_op;
+ 	Oid which_type;
+ 	int32 low_value;
+ 	int32 high_value;
+ 
+  	rel = heap_openr(StatisticRelationName);
+  
+  	key[0].sk_argument = ObjectIdGetDatum(relid);
+  	key[1].sk_argument = Int16GetDatum((int16) attnum);
+  	key[2].sk_argument = ObjectIdGetDatum(opid);
+! 	printf("(selfuncs) gethilokey() obj %d attr %d opid %d (ignored)\n",
+! 	       key[0].sk_argument,key[1].sk_argument,key[2].sk_argument);
+! 	scan = heap_beginscan(rel, 0, SnapshotNow, 2, key);
+  	tuple = heap_getnext(scan, 0);
+  	if (!HeapTupleIsValid(tuple))
+  	{
+***************
+*** 376,383 ****
+--- 398,461 ----
+  								&isnull));
+  	if (isnull)
+  		elog(DEBUG, "gethilokey: low key is null");
+ 
+  	heap_endscan(scan);
+  	heap_close(rel);
+ 
+ 	/* now we deal with type conversion issues                                    */
+ 	/* when intltsel() calls this routine (who knows what other callers might do)  */
+ 	/* it assumes that it can call atol() on the strings and then use integer      */
+ 	/* comparison from there.  what we are going to do here, then, is try to use   */
+ 	/* the type information from Anum_pg_statistic_staop to convert the high       */
+ 	/* and low values                                                    - er1p    */
+ 
+ 	/* WARNING: this code has only been tested with the date type and has NOT      */
+ 	/* been regression tested.  consider it "sample" code of what might be the     */
+ 	/* right kind of thing to do                                         - er1p    */
+ 
+ 	/* get the 'op' from pg_statistic and look it up in pg_proc */
+ 	which_op = heap_getattr(tuple,
+ 				Anum_pg_statistic_staop,
+ 				RelationGetDescr(rel),
+ 				&isnull);
+ 	if (InvalidOid == which_op) {
+ 	  /* ignore all this stuff, try conversion only if we have a valid staop */
+ 	  /* note that there is an accompanying change to 'vacuum analyze' that  */
+ 	  /* gets this set to something useful.                                  */
+ 	} else {
+ 	  /* staop looks valid, so let's see what we can do about conversion */
+ 	  tup = SearchSysCacheTuple(PROOID, ObjectIdGetDatum(which_op), 0, 0, 0);
+ 	  if (!HeapTupleIsValid(tup)) {
+ 	    elog(ERROR, "selfuncs: unable to find op in pg_proc %d", which_op);
+ 	  }
+ 	  printf("(selfuncs) gethilokey() found op %d in pg_proc\n",which_op);
+ 	  
+ 	  /* use that to determine the type of stahikey and stalokey via pg_type */
+ 	  proc = (Form_pg_proc) GETSTRUCT(tup);
+ 	  which_type = proc->proargtypes[0]; /* XXX - use left and right separately? */
+ 	  tup = SearchSysCacheTuple(TYPOID, ObjectIdGetDatum(which_type), 0, 0, 0);
+ 	  if (!HeapTupleIsValid(tup)) {
+ 	    elog(ERROR, "selfuncs: unable to find type in pg_type %d", which_type);
+ 	  }
+ 	  printf("(selfuncs) gethilokey() found type %d in pg_type\n",which_type);
+ 	  
+ 	  /* and use that type to get the conversion function to int4 */
+ 	  typ = (Form_pg_type) GETSTRUCT(tup);
+ 	  printf("(selfuncs) gethilokey() going to use %d to convert type %d\n",typ->typinput,which_type);
+ 	  
+ 	  /* and convert the low and high strings */
+ 	  low_value = (int32) fmgr(typ->typinput, *low, -1);
+ 	  high_value = (int32) fmgr(typ->typinput, *high, -1);
+ 	  printf("(selfuncs) gethilokey() have low %d high %d\n",low_value,high_value);
+ 	  
+ 	  /* now we have int4's, which we put back into strings because that's what our  */
+ 	  /* callers (intltsel() at least) expect                                - er1p */
+ 	  pfree(*low); pfree(*high); /* let's not leak the old strings */
+ 	  *low = int4out(low_value);
+ 	  *high = int4out(high_value);
+ 
+ 	  /* XXX - this probably leaks the two tups we got from SearchSysCacheTuple() - er1p */
+ 	}
+  }
+  
+  float64
+SHAR_EOF
+fi
+if test -f 'vacuum.c.diff'
+then
+	echo shar: "will not over-write existing file 'vacuum.c.diff'"
+else
+cat << \SHAR_EOF > 'vacuum.c.diff'
+*** /afs/ece.cmu.edu/project/lcs/lcs-004/er1p/postgres/611/src/backend/commands/vacuum.c	Thu Mar 11 23:59:09 1999
+--- /afs/ece.cmu.edu/project/lcs/lcs-004/er1p/postgres/615/src/backend/commands/vacuum.c	Mon Mar 22 21:23:15 1999
+***************
+*** 1842,1848 ****
+  					i = 0;
+  					values[i++] = (Datum) relid;		/* 1 */
+  					values[i++] = (Datum) attp->attnum; /* 2 */
+! 					values[i++] = (Datum) InvalidOid;	/* 3 */
+  					fmgr_info(stats->outfunc, &out_function);
+  					out_string = (*fmgr_faddr(&out_function)) (stats->min, stats->attr->atttypid);
+  					values[i++] = (Datum) fmgr(F_TEXTIN, out_string);
+--- 1842,1848 ----
+  					i = 0;
+  					values[i++] = (Datum) relid;		/* 1 */
+  					values[i++] = (Datum) attp->attnum; /* 2 */
+! 					values[i++] = (Datum) stats->f_cmplt.fn_oid;	/* 3 */ /* get the '<' oid, instead of 'invalid' - er1p */
+  					fmgr_info(stats->outfunc, &out_function);
+  					out_string = (*fmgr_faddr(&out_function)) (stats->min, stats->attr->atttypid);
+  					values[i++] = (Datum) fmgr(F_TEXTIN, out_string);
+SHAR_EOF
+fi
+exit 0
+#	End of shell archive
+
+
+
+From owner-pgsql-hackers@hub.org Tue Mar 23 12:31:05 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA17491
+	for <maillist@candle.pha.pa.us>; Tue, 23 Mar 1999 12:31:04 -0500 (EST)
+Received: from hub.org (majordom@hub.org [209.47.145.100]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id MAA08839 for <maillist@candle.pha.pa.us>; Tue, 23 Mar 1999 12:08:14 -0500 (EST)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.9.2/8.9.1) with SMTP id MAA93649;
+	Tue, 23 Mar 1999 12:04:57 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Tue, 23 Mar 1999 12:03:00 +0000 (EST)
+Received: (from majordom@localhost)
+	by hub.org (8.9.2/8.9.1) id MAA93355
+	for pgsql-hackers-outgoing; Tue, 23 Mar 1999 12:02:55 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6])
+	by hub.org (8.9.2/8.9.1) with ESMTP id MAA93336
+	for <pgsql-hackers@postgreSQL.org>; Tue, 23 Mar 1999 12:02:43 -0500 (EST)
+	(envelope-from tgl@sss.pgh.pa.us)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id MAA24455;
+	Tue, 23 Mar 1999 12:01:57 -0500 (EST)
+To: Erik Riedel <riedel+@CMU.EDU>
+cc: pgsql-hackers@postgreSQL.org
+Subject: Re: [HACKERS] optimizer and type question 
+In-reply-to: Your message of Mon, 22 Mar 1999 23:14:55 -0500 (EST) 
+             <4qxlJ0200anI01hK40@andrew.cmu.edu> 
+Date: Tue, 23 Mar 1999 12:01:57 -0500
+Message-ID: <24453.922208517@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+Erik Riedel <riedel+@CMU.EDU> writes:
+> OK, building on your high-level explanation, I am attaching a patch that
+> attempts to do something "better" than the current code.  Note that I
+> have only tested this with the date type and my particular query.
+
+Glad to see you working on this.  I don't like the details of your
+patch too much though ;-).  Here are some suggestions for making it
+better.
+
+1. I think just removing staop from the lookup in gethilokey is OK for
+now, though I'm dubious about Bruce's thought that we could delete that
+field entirely.  As you observe, vacuum will not currently put more
+than one tuple for a column into pg_statistic, so we can just do the
+lookup with relid and attno and leave it at that.  But I think we ought
+to leave the field there, with the idea that vacuum might someday
+compute more than one statistic for a data column.  Fixing vacuum to
+put its sort op into the field is a good idea in the meantime.
+
+2. The type conversion you're doing in gethilokey is a mess; I think
+what you ought to make it do is simply the inbound conversion of the
+string from pg_statistic into the internal representation for the
+column's datatype, and return that value as a Datum.  It also needs
+a cleaner success/failure return convention --- this business with
+"n" return is ridiculously type-specific.  Also, the best and easiest
+way to find the type to convert to is to look up the column type in
+the info for the given relid, not search pg_proc with the staop value.
+(I'm not sure that will even work, since there are pg_proc entries
+with wildcard argument types.)
+
+3. The atol() calls currently found in intltsel are a type-specific
+cheat on what is conceptually a two-step process:
+  * Convert the string stored in pg_statistic back to the internal
+    form for the column data type.
+  * Generate a numeric representation of the data value that can be
+    used as an estimate of the range of values in the table.
+The second step is trivial for integers, which may obscure the fact
+that there are two steps involved, but nonetheless there are.  If
+you think about applying selectivity logic to strings, say, it
+becomes clear that the second step is a necessary component of the
+process.  Furthermore, the second step must also be applied to the
+probe value that's being passed into the selectivity operator.
+(The probe value is already in internal form, of course; but it is
+not necessarily in a useful numeric form.)
+
+We can do the first of these steps by applying the appropriate "XXXin"
+conversion function for the column data type, as you have done.  The
+interesting question is how to do the second one.  A really clean
+solution would require adding a column to pg_type that points to a
+function that will do the appropriate conversion.  I'd be inclined to
+make all of these functions return "double" (float8) and just have one
+top-level selectivity routine for all data types that can use
+range-based selectivity logic.
+
+We could probably hack something together that would not use an explicit
+conversion function for each data type, but instead would rely on
+type-specific assumptions inside the selectivity routines.  We'd need many
+more selectivity routines though (at least one for each of int, float4,
+float8, and text data types) so I'm not sure we'd really save any work
+compared to doing it right.
+
+BTW, now that I look at this issue it's real clear that the selectivity
+entries in pg_operator are horribly broken.  The intltsel/intgtsel
+selectivity routines are currently applied to 32 distinct data types:
+
+regression=> select distinct typname,oprleft from pg_operator, pg_type
+regression-> where pg_type.oid = oprleft
+regression-> and oprrest in (103,104);
+typname  |oprleft
+---------+-------
+_aclitem |   1034
+abstime  |    702
+bool     |     16
+box      |    603
+bpchar   |   1042
+char     |     18
+cidr     |    650
+circle   |    718
+date     |   1082
+datetime |   1184
+float4   |    700
+float8   |    701
+inet     |    869
+int2     |     21
+int4     |     23
+int8     |     20
+line     |    628
+lseg     |    601
+macaddr  |    829
+money    |    790
+name     |     19
+numeric  |   1700
+oid      |     26
+oid8     |     30
+path     |    602
+point    |    600
+polygon  |    604
+text     |     25
+time     |   1083
+timespan |   1186
+timestamp|   1296
+varchar  |   1043
+(32 rows)
+
+many of which are very obviously not compatible with integer for *any*
+purpose.  It looks to me like a lot of data types were added to
+pg_operator just by copy-and-paste, without paying attention to whether
+the selectivity routines were actually correct for the data type.
+
+As the code stands today, the bogus entries don't matter because
+gethilokey always fails, so we always get 1/3 as the selectivity
+estimate for any comparison operator (except = and != of course).
+I had actually noticed that fact and assumed that it was supposed
+to work that way :-(.  But, clearly, there is code in here that
+is *trying* to be smarter.
+
+As soon as we fix gethilokey so that it can succeed, we will start
+getting essentially-random selectivity estimates for those data types
+that aren't actually binary-compatible with integer.  That will not do;
+we have to do something about the issue.
+
+			regards, tom lane
+
+
+From tgl@sss.pgh.pa.us Tue Mar 23 12:31:02 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id MAA17484
+	for <maillist@candle.pha.pa.us>; Tue, 23 Mar 1999 12:31:01 -0500 (EST)
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id MAA09042 for <maillist@candle.pha.pa.us>; Tue, 23 Mar 1999 12:10:55 -0500 (EST)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id MAA24474;
+	Tue, 23 Mar 1999 12:09:52 -0500 (EST)
+To: Bruce Momjian <maillist@candle.pha.pa.us>
+cc: riedel+@CMU.EDU, pgsql-hackers@postgreSQL.org
+Subject: Re: [HACKERS] optimizer and type question 
+In-reply-to: Your message of Mon, 22 Mar 1999 21:25:45 -0500 (EST) 
+             <199903230225.VAA01641@candle.pha.pa.us> 
+Date: Tue, 23 Mar 1999 12:09:52 -0500
+Message-ID: <24471.922208992@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Status: RO
+
+Bruce Momjian <maillist@candle.pha.pa.us> writes:
+> What we really need is some way to determine how far the requested value
+> is from the min/max values.  With int, we just do (val-min)/(max-min). 
+> That works, but how do we do that for types that don't support division.
+> Strings come to mind in this case.
+
+What I'm envisioning is that we still apply the (val-min)/(max-min)
+logic, but apply it to numeric values that are produced in a
+type-dependent way.
+
+For ints and floats the conversion is trivial, of course.
+
+For strings, the first thing that comes to mind is to return 0 for a
+null string and the value of the first byte for a non-null string.
+This would give you one-part-in-256 selectivity which is plenty good
+enough for what the selectivity code needs to do.  (Actually, it's
+only that good if the strings' first bytes are pretty well spread out.
+If you have a table containing English words, for example, you might
+only get about one part in 26 this way, since the first bytes will
+probably only run from A to Z.  Might be better to use the first two
+characters of the string to compute the selectivity representation.)
+
+In general, you can apply this logic as long as you can come up with
+some numerical approximation to the data type's sorting order.  It
+doesn't have to be exact.
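+
+[Illustration, not part of the original message.]  A minimal sketch of the
+idea above in Python.  The helper names to_scalar() and
+less_than_selectivity() are hypothetical and are not backend routines.
+Strings are reduced to their first two bytes, as suggested, and the 1/3
+fallback is the default estimate mentioned earlier in this thread.

```python
def to_scalar(value):
    """Map a value to a float that roughly follows its sort order."""
    if isinstance(value, (int, float)):
        return float(value)          # ints and floats convert trivially
    if isinstance(value, str):
        # Empty string -> 0; otherwise treat the first two bytes as a
        # two-digit base-256 number (a missing second byte counts as 0).
        head = value.encode("utf-8")[:2].ljust(2, b"\0")
        return head[0] * 256.0 + head[1]
    raise TypeError(f"no scalar conversion for {type(value).__name__}")


def less_than_selectivity(val, lo, hi, default=1.0 / 3.0):
    """Estimate the fraction of rows with column < val, given column min/max."""
    v, lo_s, hi_s = to_scalar(val), to_scalar(lo), to_scalar(hi)
    if hi_s <= lo_s:
        return default               # no usable range: fall back to the default
    frac = (v - lo_s) / (hi_s - lo_s)
    return min(1.0, max(0.0, frac))  # clamp values outside [min, max]


print(less_than_selectivity(25, 0, 100))     # integer column
print(less_than_selectivity("m", "a", "z"))  # text column, first bytes only
```

+The conversion only has to be monotonic with respect to the type's sort
+order; any precision beyond that is a bonus.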
+
+			regards, tom lane
+
+
--- a/doc/TODO.detail/outer
+++ b/doc/TODO.detail/outer
@ -0,0 +1,313 @@
+From lockhart@alumni.caltech.edu Thu Jan  7 13:31:08 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA07771
+	for <maillist@candle.pha.pa.us>; Thu, 7 Jan 1999 13:31:06 -0500 (EST)
+Received: from golem.jpl.nasa.gov (IDENT:root@hectic-2.jpl.nasa.gov [128.149.68.204]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id NAA14597 for <maillist@candle.pha.pa.us>; Thu, 7 Jan 1999 13:27:37 -0500 (EST)
+Received: from alumni.caltech.edu (localhost [127.0.0.1])
+	by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id SAA13416;
+	Thu, 7 Jan 1999 18:26:56 GMT
+Sender: tgl@mythos.jpl.nasa.gov
+Message-ID: <3694FC70.FAD67BC3@alumni.caltech.edu>
+Date: Thu, 07 Jan 1999 18:26:56 +0000
+From: "Thomas G. Lockhart" <lockhart@alumni.caltech.edu>
+Organization: Caltech/JPL
+X-Mailer: Mozilla 4.07 [en] (X11; I; Linux 2.0.30 i686)
+MIME-Version: 1.0
+To: Bruce Momjian <maillist@candle.pha.pa.us>
+CC: Postgres Hackers List <hackers@postgresql.org>
+Subject: Outer Joins (and need CASE help)
+References: <199901071747.MAA07054@candle.pha.pa.us>
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Status: RO
+
+> Thomas, do you need help on outer joins?
+
+Yes. I'm going slowly partly because I get distracted with other
+Postgres stuff like docs, and partly because I don't understand all of
+the pieces I'm working with.
+
+I've identified the place in the MergeJoin code where the null filling
+for outer joins needs to happen, and have the "merge walk" code done.
+But I don't have the supporting code which actually would know how to
+null-fill a result tuple from the left or right. I thought you might be
+interested in that?
+
+I've done some work in the parser, and can now do things like:
+
+postgres=> select * from t1 join t2 using (i);
+NOTICE:  JOIN not yet implemented
+i|j|i|k
+-+-+-+-
+1|2|1|3
+(1 row)
+
+But this is just an inner join, and the result isn't quite right since
+the second "i" column should probably be omitted. At the moment I
+transform it from the syntax above into existing parse nodes, and
+everything from there on works.
+
+I don't yet pass an explicit join node into the planner/optimizer, and
+that will be the hardest part I assume. Perhaps we can work on that
+together.
+
+So, what I'll try to do (soon, in the next few days?) is put in
+
+  #ifdef ENABLE_OUTER_JOINS
+
+conditional code into the parser area (already there for the executor)
+and commit everything to the development tree. Does that sound OK?
+
+Oh, and if anyone is looking for something to do, I've got a couple of
+CASE statements in the case.sql regression test which are commented out
+because they crash the backend. They involve references to multiple
+tables within a single result column, and in other contexts that
+construct works. It would be great if someone had time to track it
+down...
+
+                     - Tom
+
+From lockhart@alumni.caltech.edu Mon Feb 22 02:01:13 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id CAA22073
+	for <maillist@candle.pha.pa.us>; Mon, 22 Feb 1999 02:01:12 -0500 (EST)
+Received: from golem.jpl.nasa.gov (IDENT:root@hectic-2.jpl.nasa.gov [128.149.68.204]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id BAA26054 for <maillist@candle.pha.pa.us>; Mon, 22 Feb 1999 01:57:00 -0500 (EST)
+Received: from alumni.caltech.edu (localhost [127.0.0.1])
+	by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id GAA04715;
+	Mon, 22 Feb 1999 06:56:36 GMT
+Sender: tgl@mythos.jpl.nasa.gov
+Message-ID: <36D0FFA4.32ADB75C@alumni.caltech.edu>
+Date: Mon, 22 Feb 1999 06:56:36 +0000
+From: "Thomas G. Lockhart" <lockhart@alumni.caltech.edu>
+Organization: Caltech/JPL
+X-Mailer: Mozilla 4.07 [en] (X11; I; Linux 2.0.36 i686)
+MIME-Version: 1.0
+To: Bruce Momjian <maillist@candle.pha.pa.us>
+CC: hackers@postgreSQL.org
+Subject: Re: start on outer join
+References: <199902220304.WAA10066@candle.pha.pa.us>
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Status: ROr
+
+Bruce Momjian wrote:
+> 
+> > Will apply ... some other changes laying a bit of
+> > groundwork for outer joins so you can start on the planner/optimizer
+> > parts :)
+> Those will be a cinch now that I understand the optimizer.  In fact, I
+> think it all will happen in the executor.
+
+I've modified executor/nodeMergeJoin.c to walk a left/right/both outer
+join, but didn't fill in the part which actually creates the result
+tuple (which will be the current left- or right-side tuple plus nulls
+for filler). I hope this is up your alley :)
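+
+[Illustration, not part of the original message.]  A rough Python sketch of
+a left outer merge walk with null filling.  It is not the code in
+executor/nodeMergeJoin.c.  It assumes that rows are tuples with the join key
+in column 0 and that both inputs are already sorted on that key; the function
+name and the right_width parameter are hypothetical.

```python
def left_outer_merge_join(left, right, right_width):
    """Merge-join two key-sorted lists of tuples, keeping unmatched left rows."""
    out = []
    j = 0                                   # first right row not yet passed
    for lrow in left:
        # Skip right rows whose key sorts before the current left key.
        while j < len(right) and right[j][0] < lrow[0]:
            j += 1
        # Emit one joined row per right row with an equal key.
        k = j
        matched = False
        while k < len(right) and right[k][0] == lrow[0]:
            out.append(lrow + right[k])
            matched = True
            k += 1
        # No partner found: emit the left row plus nulls for filler.
        if not matched:
            out.append(lrow + (None,) * right_width)
    return out


t1 = [(1, 11), (2, 12), (3, 13), (4, 14)]
t2 = [(1, 21), (3, 23)]
for row in left_outer_merge_join(t1, t2, right_width=2):
    print(row)
```

+A right or full outer join would need the mirror-image check for right rows
+that were never matched, which this sketch leaves out.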
+
+So far, I'm not certain what to pass to the planner. The syntax leads me
+to pass a select structure from gram.y with a "JoinExpr" structure in
+the "fromClause" list. I need to expand that with a combination of
+column names and qualifications, but at the time I see the JoinExpr I
+don't have access to the top query structure itself. So I may just keep
+a modestly transformed JoinExpr to expand later or to pass to the
+planner.
+
+btw, the EXCEPT/INTERSECT stuff from Stefan has some ugliness in gram.y
+which needs to be fixed (the shift/reduce conflict is not acceptable for
+our release version) and some of that code clearly needs to move to
+analyze.c or some other module.
+
+                     - Tom
+
+From maillist Wed Feb 24 05:27:08 1999
+Received: (from maillist@localhost)
+	by candle.pha.pa.us (8.9.0/8.9.0) id FAA09648;
+	Wed, 24 Feb 1999 05:27:08 -0500 (EST)
+From: Bruce Momjian <maillist>
+Message-Id: <199902241027.FAA09648@candle.pha.pa.us>
+Subject: Re: [HACKERS] OUTER joins
+In-Reply-To: <199902240953.EAA08561@candle.pha.pa.us> from Bruce Momjian at "Feb 24, 1999  4:53:21 am"
+To: maillist@candle.pha.pa.us (Bruce Momjian)
+Date: Wed, 24 Feb 1999 05:27:07 -0500 (EST)
+Cc: lockhart@alumni.caltech.edu, hackers@postgreSQL.org
+X-Mailer: ELM [version 2.4ME+ PL47 (25)]
+MIME-Version: 1.0
+Content-Type: text/plain; charset=US-ASCII
+Content-Transfer-Encoding: 7bit
+Status: RO
+
+> 
+> How do you propose doing outer joins in non-mergejoin situations?
+> Mergejoins can only be used currently in equal joins.
+
+Is your solution going to be to make sure the OUTER table is always a
+MergeJoin, or on the outside of a join loop?  That could work.
+
+That could get tricky if the table is joined to _two_ other tables. 
+With the cleaned-up optimizer, we can disable non-merge joins in certain
+circumstances, and prevent OUTER tables from being inner in the others. 
+Is that the plan?
+
+-- 
+  Bruce Momjian                        |  http://www.op.net/~candle
+  maillist@candle.pha.pa.us            |  (610) 853-3000
+  +  If your life is a hard drive,     |  830 Blythe Avenue
+  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
+
+From lockhart@alumni.caltech.edu Mon Mar  1 13:01:08 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id NAA21672
+	for <maillist@candle.pha.pa.us>; Mon, 1 Mar 1999 13:01:06 -0500 (EST)
+Received: from golem.jpl.nasa.gov (IDENT:root@hectic-2.jpl.nasa.gov [128.149.68.204]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id MAA12756 for <maillist@candle.pha.pa.us>; Mon, 1 Mar 1999 12:14:16 -0500 (EST)
+Received: from alumni.caltech.edu (localhost [127.0.0.1])
+	by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id RAA09406;
+	Mon, 1 Mar 1999 17:10:49 GMT
+Sender: tgl@mythos.jpl.nasa.gov
+Message-ID: <36DACA19.E6DBE7D8@alumni.caltech.edu>
+Date: Mon, 01 Mar 1999 17:10:49 +0000
+From: "Thomas G. Lockhart" <lockhart@alumni.caltech.edu>
+Organization: Caltech/JPL
+X-Mailer: Mozilla 4.07 [en] (X11; I; Linux 2.0.36 i686)
+MIME-Version: 1.0
+To: Bruce Momjian <maillist@candle.pha.pa.us>
+CC: PostgreSQL-development <hackers@postgreSQL.org>
+Subject: Re: OUTER joins
+References: <199902240953.EAA08561@candle.pha.pa.us>
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Status: ROr
+
+(back from a short vacation...)
+
+> How do you propose doing outer joins in non-mergejoin situations?
+> Mergejoins can only be used currently in equal joins.
+
+Hadn't thought about it, other than figuring that implementing the
+equi-join first was a good start. There is a class of outer join syntax
+(the USING clause) which is implicitly an equi-join...
+
+                        - Tom
+
+From lockhart@alumni.caltech.edu Mon Mar  8 21:55:02 1999
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id VAA15978
+	for <maillist@candle.pha.pa.us>; Mon, 8 Mar 1999 21:54:57 -0500 (EST)
+Received: from golem.jpl.nasa.gov (IDENT:root@hectic-1.jpl.nasa.gov [128.149.68.203]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id VAA15837 for <maillist@candle.pha.pa.us>; Mon, 8 Mar 1999 21:48:33 -0500 (EST)
+Received: from alumni.caltech.edu (localhost [127.0.0.1])
+	by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id CAA06996;
+	Tue, 9 Mar 1999 02:46:40 GMT
+Sender: tgl@mythos.jpl.nasa.gov
+Message-ID: <36E48B90.F3E902B7@alumni.caltech.edu>
+Date: Tue, 09 Mar 1999 02:46:40 +0000
+From: "Thomas G. Lockhart" <lockhart@alumni.caltech.edu>
+Organization: Caltech/JPL
+X-Mailer: Mozilla 4.07 [en] (X11; I; Linux 2.0.36 i686)
+MIME-Version: 1.0
+To: Bruce Momjian <maillist@candle.pha.pa.us>
+CC: hackers@postgreSQL.org
+Subject: Re: OUTER joins
+References: <199903070325.WAA10357@candle.pha.pa.us>
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Status: ROr
+
+> > Hadn't thought about it, other than figuring that implementing the
+> > equi-join first was a good start. There is a class of outer join 
+> > syntax (the USING clause) which is implicitly an equi-join...
+> Not that easy.  You don't automatically get a mergejoin from an
+> equijoin.  I will have to force outer's to be either mergejoins, or
+> inners of non-merge joins.  Can you add code to non-merge joins in the
+> executor to throw out a null row if it does not find an inner match 
+> for the outer row, and I will handle the optimizer so it doesn't throw 
+> a non-conforming plan to the executor.
+
+So far I don't have enough info in the parser to get the
+planner/optimizer going. Should we work from the front to the back, or
+should I go ahead and look at the non-merge joins? It's painfully
+obvious that I don't know anything about the middle parts of this to
+proceed without lots more research.
+
+                        - Tom
+
+From lockhart@alumni.caltech.edu Tue Mar  9 22:47:57 1999
+Received: from golem.jpl.nasa.gov (IDENT:root@hectic-1.jpl.nasa.gov [128.149.68.203])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id WAA07869
+	for <maillist@candle.pha.pa.us>; Tue, 9 Mar 1999 22:47:54 -0500 (EST)
+Received: from alumni.caltech.edu (localhost [127.0.0.1])
+	by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id DAA14761;
+	Wed, 10 Mar 1999 03:46:43 GMT
+Sender: tgl@mythos.jpl.nasa.gov
+Message-ID: <36E5EB23.F5CD959B@alumni.caltech.edu>
+Date: Wed, 10 Mar 1999 03:46:43 +0000
+From: "Thomas G. Lockhart" <lockhart@alumni.caltech.edu>
+Organization: Caltech/JPL
+X-Mailer: Mozilla 4.07 [en] (X11; I; Linux 2.0.36 i686)
+MIME-Version: 1.0
+To: Bruce Momjian <maillist@candle.pha.pa.us>, tgl@mythos.jpl.nasa.gov
+Subject: Re: SQL outer
+References: <199903100112.UAA05772@candle.pha.pa.us>
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Status: RO
+
+>         select  *
+>         from    outer tab1, tab2, tab3
+>         where   tab1.col1 = tab2.col1 and
+>                 tab1.col1 = tab3.col1
+
+select *
+from t1 left join t2 using (c1)
+        join t3 on (c1 = t3.c1)
+
+Result:
+t1.c1	t1.c2	t2.c2	t3.c2
+2	12	NULL	32
+
+t1:
+c1	c2
+1	11
+2	12
+3	13
+4	14
+
+t2:
+c1	c2
+1	21
+3	23
+
+t3:
+c1	c2
+2	32
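+
+[Illustration, not part of the original message.]  The example can be checked
+by hand with a small Python sketch, assuming the usual SQL semantics for
+LEFT JOIN ... USING and for an inner JOIN ... ON.  The helper names are
+hypothetical.  Each result tuple holds c1, t1.c2, t2.c2 and t3.c2, in that
+order.

```python
t1 = [(1, 11), (2, 12), (3, 13), (4, 14)]
t2 = [(1, 21), (3, 23)]
t3 = [(2, 32)]


def left_join_using_c1(left, right):
    """LEFT JOIN ... USING (c1): keep every left row, null-fill missing matches."""
    by_key = {}
    for c1, c2 in right:
        by_key.setdefault(c1, []).append(c2)
    return [(c1, c2, r_c2)
            for c1, c2 in left
            for r_c2 in by_key.get(c1, [None])]


def inner_join_on_c1(rows, right):
    """JOIN ... ON (c1 = right.c1): keep only rows with a matching key."""
    return [row + (r_c2,)
            for row in rows
            for r_c1, r_c2 in right
            if row[0] == r_c1]


result = inner_join_on_c1(left_join_using_c1(t1, t2), t3)
print(result)
```

+Only c1 = 2 survives the inner join with t3, and since t2 has no row for
+that key its column comes back as a null.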
+
+From lockhart@alumni.caltech.edu Wed Mar 10 10:48:54 1999
+Received: from golem.jpl.nasa.gov (IDENT:root@hectic-1.jpl.nasa.gov [128.149.68.203])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id KAA16741
+	for <maillist@candle.pha.pa.us>; Wed, 10 Mar 1999 10:48:51 -0500 (EST)
+Received: from alumni.caltech.edu (localhost [127.0.0.1])
+	by golem.jpl.nasa.gov (8.8.5/8.8.5) with ESMTP id PAA17723;
+	Wed, 10 Mar 1999 15:48:31 GMT
+Sender: tgl@mythos.jpl.nasa.gov
+Message-ID: <36E6944F.1F93B08@alumni.caltech.edu>
+Date: Wed, 10 Mar 1999 15:48:31 +0000
+From: "Thomas G. Lockhart" <lockhart@alumni.caltech.edu>
+Organization: Caltech/JPL
+X-Mailer: Mozilla 4.07 [en] (X11; I; Linux 2.0.36 i686)
+MIME-Version: 1.0
+To: Bruce Momjian <maillist@candle.pha.pa.us>
+CC: Thomas Lockhart <lockhart@alumni.caltech.edu>
+Subject: Re: SQL outer
+References: <199903100112.UAA05772@candle.pha.pa.us> <36E5EB23.F5CD959B@alumni.caltech.edu>
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Status: ROr
+
+Just thinking...
+
+If the initial RelOptInfo groupings are derived from the WHERE clause
+expressions, how about marking the "outer" property in those expressions
+in the parser? istm that is where the parser knows about two tables in
+one place, and I'm generating those expressions anyway. We could add a
+field(s) to the expression structure, or pass along a slightly different
+structure...
+
+                         - Tom
+
--- a/doc/TODO.detail/performance
+++ b/doc/TODO.detail/performance
@ -0,0 +1,343 @@
+From owner-pgsql-hackers@hub.org Sun Jun 14 18:45:04 1998
+Received: from hub.org (hub.org [209.47.148.200])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id SAA03690
+	for <maillist@candle.pha.pa.us>; Sun, 14 Jun 1998 18:45:00 -0400 (EDT)
+Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id SAA28049; Sun, 14 Jun 1998 18:39:42 -0400 (EDT)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sun, 14 Jun 1998 18:36:06 +0000 (EDT)
+Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id SAA27943 for pgsql-hackers-outgoing; Sun, 14 Jun 1998 18:36:04 -0400 (EDT)
+Received: from angular.illustra.com (ifmxoak.illustra.com [206.175.10.34]) by hub.org (8.8.8/8.7.5) with ESMTP id SAA27925 for <pgsql-hackers@postgresql.org>; Sun, 14 Jun 1998 18:35:47 -0400 (EDT)
+Received: from hawk.illustra.com (hawk.illustra.com [158.58.61.70]) by angular.illustra.com (8.7.4/8.7.3) with SMTP id PAA21293 for <pgsql-hackers@postgresql.org>; Sun, 14 Jun 1998 15:35:12 -0700 (PDT)
+Received: by hawk.illustra.com (5.x/smail2.5/06-10-94/S)
+	id AA07922; Sun, 14 Jun 1998 15:35:13 -0700
+From: dg@illustra.com (David Gould)
+Message-Id: <9806142235.AA07922@hawk.illustra.com>
+Subject: [HACKERS] performance tests, initial results
+To: pgsql-hackers@postgreSQL.org
+Date: Sun, 14 Jun 1998 15:35:13 -0700 (PDT)
+Mime-Version: 1.0
+Content-Type: text/plain; charset=US-ASCII
+Content-Transfer-Encoding: 7bit
+Sender: owner-pgsql-hackers@hub.org
+Precedence: bulk
+Status: RO
+
+
+I have been playing a little with the performance tests found in
+pgsql/src/tests/performance and have a few observations that might be of
+minor interest.
+
+The tests themselves are simple enough although the result parsing in the
+driver did not work on Linux. I am enclosing a patch below to fix this. I
+think it will also work better on the other systems.
+
+A summary of results from my testing is below. Details are at the bottom
+of this message.
+
+My test system is 'leslie':
+
+ linux 2.0.32, gcc version 2.7.2.3
+ P133, HX chipset, 512K L2, 32MB mem
+ NCR810 fast scsi, Quantum Atlas 2GB drive (7200 rpm).
+
+
+                     Results Summary (times in seconds)
+
+                    Single txn 8K txn    Create 8K idx 8K random Simple
+Case Description    8K insert  8K insert Index  Insert Scans     Orderby
+=================== ========== ========= ====== ====== ========= =======
+1 From Distribution
+  P90 FreeBsd -B256      39.56   1190.98   3.69  46.65     65.49    2.27
+  IDE
+
+2 Running on leslie
+  P133 Linux 2.0.32      15.48    326.75   2.99  20.69     35.81    1.68
+  SCSI 32M
+
+3 leslie, -o -F
+  no forced writes       15.90     24.98   2.63  20.46     36.43    1.69
+
+4 leslie, -o -F
+  no ASSERTS             14.92     23.23   1.38  18.67     33.79    1.58
+
+5 leslie, -o -F -B2048
+  more buffers           21.31     42.28   2.65  25.74     42.26    1.72
+
+6 leslie, -o -F -B2048
+  more bufs, no ASSERT   20.52     39.79   1.40  24.77     39.51    1.55
+
+
+
+
+                 Case to Case Difference Factors (+ is faster)
+
+                    Single txn 8K txn    Create 8K idx 8K random Simple
+Case Description    8K insert  8K insert Index  Insert Scans     Orderby
+=================== ========== ========= ====== ====== ========= =======
+
+leslie vs BSD P90.        2.56      3.65   1.23   2.25      1.83    1.35
+
+(noflush -F) vs no -F    -1.03     13.08   1.14   1.01     -1.02    1.00
+
+No Assert vs Assert       1.05      1.07   1.90   1.06      1.07    1.09
+
+-B256 vs -B2048           1.34      1.69   1.01   1.26      1.16    1.02
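+
+[Illustration, not part of the original message.]  The factors above appear
+to be ratios of elapsed times, written as a negative number when the second
+configuration is the slower one.  That sign convention is inferred from the
+table, not stated in the message.  A small Python sketch with a hypothetical
+helper name:

```python
def difference_factor(baseline_secs, candidate_secs):
    """Ratio of elapsed times; positive if the candidate is faster, else negative."""
    if candidate_secs <= baseline_secs:
        return baseline_secs / candidate_secs
    return -(candidate_secs / baseline_secs)


# Values taken from the results summary (cases 1, 2 and 3).
print(round(difference_factor(39.56, 15.48), 2))   # P90 BSD vs leslie, single txn
print(round(difference_factor(326.75, 24.98), 2))  # no -F vs -F, 8K txn insert
print(round(difference_factor(15.48, 15.90), 2))   # no -F vs -F, single txn
```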
+
+
+Observations:
+
+ - leslie (P133 linux) appears to be about 1.8 times faster than the
+   P90 BSD system used for the test result distributed with the source, not
+   counting the 8K txn insert case which was completely disk bound.
+
+ - SCSI disks make a big (factor of 3.6) difference. During this test the
+   disk was hammering and cpu utilization was < 10%.
+
+ - Assertion checking seems to cost about 7% except for create index where
+   it costs 90%
+
+ - the -F option to avoid flushing buffers has tremendous effect if there are
+   many very small transactions. Or, another way, flushing at the end of the
+   transaction is a major disaster for performance.
+
+ - Something is very wrong with our buffer cache implementation. Going from
+   256 buffers to 2048 buffers costs an average of 25%. In the 8K txn case
+   it costs about 70%. I see looking at the code and profiling that in the 8K
+   txn case this is in BufferSync() which examines all the buffers at commit
+   time. I don't quite understand why it is so costly for the single 8K row
+   txn (35%) though.
+
+It would be nice to have some more tests. Maybe the Wisconsin stuff will
+be useful.
+
+
+
+----------------- patch to test harness. apply from pgsql ------------
+*** src/test/performance/runtests.pl.orig	Sun Jun 14 11:34:04 1998
+--- src/test/performance/runtests.pl	Sun Jun 14 12:07:30 1998
+***************
+*** 84,123 ****
+  open (STDERR, ">$TmpFile") or die;
+  select (STDERR); $| = 1;
+  
+! for ($i = 0; $i <= $#perftests; $i++)
+! {
+  	$test = $perftests[$i];
+  	($test, $XACTBLOCK) = split (/ /, $test);
+  	$runtest = $test;
+! 	if ( $test =~ /\.ntm/ )
+! 	{
+! 		# 
+  		# No timing for this queries
+- 		# 
+  		close (STDERR);		# close $TmpFile
+  		open (STDERR, ">/dev/null") or die;
+  		$runtest =~ s/\.ntm//;
+  	}
+! 	else
+! 	{
+  		close (STDOUT);
+  		open(STDOUT, ">&SAVEOUT");
+  		print STDOUT "\nRunning: $perftests[$i+1] ...";
+  		close (STDOUT);
+  		open (STDOUT, ">/dev/null") or die;
+  		select (STDERR); $| = 1;
+! 		printf "$perftests[$i+1]: ";
+  	}
+  
+  	do "sqls/$runtest";
+  
+  	# Restore STDERR to $TmpFile
+! 	if ( $test =~ /\.ntm/ )
+! 	{
+  		close (STDERR);
+  		open (STDERR, ">>$TmpFile") or die;
+  	}
+- 
+  	select (STDERR); $| = 1;
+  	$i++;
+  }
+--- 84,116 ----
+  open (STDERR, ">$TmpFile") or die;
+  select (STDERR); $| = 1;
+  
+! for ($i = 0; $i <= $#perftests; $i++) {
+  	$test = $perftests[$i];
+  	($test, $XACTBLOCK) = split (/ /, $test);
+  	$runtest = $test;
+! 	if ( $test =~ /\.ntm/ ) {
+  		# No timing for this queries
+  		close (STDERR);		# close $TmpFile
+  		open (STDERR, ">/dev/null") or die;
+  		$runtest =~ s/\.ntm//;
+  	}
+! 	else {
+  		close (STDOUT);
+  		open(STDOUT, ">&SAVEOUT");
+  		print STDOUT "\nRunning: $perftests[$i+1] ...";
+  		close (STDOUT);
+  		open (STDOUT, ">/dev/null") or die;
+  		select (STDERR); $| = 1;
+! 		print "$perftests[$i+1]: ";
+  	}
+  
+  	do "sqls/$runtest";
+  
+  	# Restore STDERR to $TmpFile
+! 	if ( $test =~ /\.ntm/ ) {
+  		close (STDERR);
+  		open (STDERR, ">>$TmpFile") or die;
+  	}
+  	select (STDERR); $| = 1;
+  	$i++;
+  }
+***************
+*** 128,138 ****
+  open (TMPF, "<$TmpFile") or die;
+  open (RESF, ">$ResFile") or die;
+  
+! while (<TMPF>)
+! {
+! 	$str = $_;
+! 	($test, $rtime) = split (/:/, $str);
+! 	($tmp, $rtime, $rest) = split (/[ 	]+/, $rtime);
+! 	print RESF "$test: $rtime\n";
+  }
+  
+--- 121,130 ----
+  open (TMPF, "<$TmpFile") or die;
+  open (RESF, ">$ResFile") or die;
+  
+! while (<TMPF>) {
+!         if (m/^(.*: ).* ([0-9:.]+) *elapsed/) {
+! 	    ($test, $rtime) = ($1, $2);
+! 	     print RESF $test, $rtime, "\n";
+!         }
+  }
+
+------------------------------------------------------------------------
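+
+[Illustration, not part of the original message.]  The new result parsing in
+the patch relies on one regular expression.  The same pattern can be tried
+out in Python, as sketched here.  The sample line is an assumption modelled
+on GNU time style output and is not taken from an actual test run;
+parse_timing() is a hypothetical name.

```python
import re

# Same pattern as in the patched runtests.pl: a label ending in ": ",
# then anything, then the number immediately before "elapsed".
ELAPSED = re.compile(r"^(.*: ).* ([0-9:.]+) *elapsed")


def parse_timing(line):
    """Return (label, elapsed) for a timing line, or None if it does not match."""
    m = ELAPSED.match(line)
    if m is None:
        return None
    return m.group(1), m.group(2)


sample = ("8192 INSERTs INTO SIMPLE (1 xact): "
          "14.42user 0.72system 0:15.48elapsed 97%CPU")
print(parse_timing(sample))
print(parse_timing("a line without timing data"))
```

+Lines that do not match are skipped, so stray output in the temporary file
+no longer ends up in the result file.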
+
+  
+------------------------- testcase detail --------------------------
+   
+1. from distribution
+   DBMS:		PostgreSQL 6.2b10
+   OS:		FreeBSD 2.1.5-RELEASE
+   HardWare:	i586/90, 24M RAM, IDE
+   StartUp:	postmaster -B 256 '-o -S 2048' -S
+   Compiler:	gcc 2.6.3
+   Compiled:	-O, without CASSERT checking, with
+   		-DTBL_FREE_CMD_MEMORY (to free memory
+   		if BEGIN/END after each query execution)
+   DB connection startup: 0.20
+   8192 INSERTs INTO SIMPLE (1 xact): 39.58
+   8192 INSERTs INTO SIMPLE (8192 xacts): 1190.98
+   Create INDEX on SIMPLE: 3.69
+   8192 INSERTs INTO SIMPLE with INDEX (1 xact): 46.65
+   8192 random INDEX scans on SIMPLE (1 xact): 65.49
+   ORDER BY SIMPLE: 2.27
+   
+   
+2. run on leslie with asserts
+   DBMS:		PostgreSQL 6.3.2 (plus changes to 98/06/01)
+   OS:		Linux 2.0.32 leslie
+   HardWare:	i586/133 HX 512, 32M RAM, fast SCSI, 7200rpm
+   StartUp:	postmaster -B 256 '-o -S 2048' -S
+   Compiler:	gcc 2.7.2.3
+   Compiled:	-O, WITH CASSERT checking, with
+   		-DTBL_FREE_CMD_MEMORY (to free memory
+   		if BEGIN/END after each query execution)
+   DB connection startup: 0.10
+   8192 INSERTs INTO SIMPLE (1 xact): 15.48
+   8192 INSERTs INTO SIMPLE (8192 xacts): 326.75
+   Create INDEX on SIMPLE: 2.99
+   8192 INSERTs INTO SIMPLE with INDEX (1 xact): 20.69
+   8192 random INDEX scans on SIMPLE (1 xact): 35.81
+   ORDER BY SIMPLE: 1.68
+   
+   
+3. with -F to avoid forced i/o
+   DBMS:		PostgreSQL 6.3.2 (plus changes to 98/06/01)
+   OS:		Linux 2.0.32 leslie
+   HardWare:	i586/133 HX 512, 32M RAM, fast SCSI, 7200rpm
+   StartUp:	postmaster -B 256 '-o -S 2048 -F' -S
+   Compiler:	gcc 2.7.2.3
+   Compiled:	-O, WITH CASSERT checking, with
+   		-DTBL_FREE_CMD_MEMORY (to free memory
+   		if BEGIN/END after each query execution)
+   DB connection startup: 0.10
+   8192 INSERTs INTO SIMPLE (1 xact): 15.90
+   8192 INSERTs INTO SIMPLE (8192 xacts): 24.98
+   Create INDEX on SIMPLE: 2.63
+   8192 INSERTs INTO SIMPLE with INDEX (1 xact): 20.46
+   8192 random INDEX scans on SIMPLE (1 xact): 36.43
+   ORDER BY SIMPLE: 1.69
+   
+   
+4. no asserts, -F to avoid forced I/O
+   DBMS:		PostgreSQL 6.3.2 (plus changes to 98/06/01)
+   OS:		Linux 2.0.32 leslie
+   HardWare:	i586/133 HX 512, 32M RAM, fast SCSI, 7200rpm
+   StartUp:	postmaster -B 256 '-o -S 2048 -F' -S
+   Compiler:	gcc 2.7.2.3
+   Compiled:	-O, No CASSERT checking, with
+   		-DTBL_FREE_CMD_MEMORY (to free memory
+   		if BEGIN/END after each query execution)
+   DB connection startup: 0.10
+   8192 INSERTs INTO SIMPLE (1 xact): 14.92
+   8192 INSERTs INTO SIMPLE (8192 xacts): 23.23
+   Create INDEX on SIMPLE: 1.38
+   8192 INSERTs INTO SIMPLE with INDEX (1 xact): 18.67
+   8192 random INDEX scans on SIMPLE (1 xact): 33.79
+   ORDER BY SIMPLE: 1.58
+   
+   
+5. with more buffers (2048 vs 256) and -F to avoid forced i/o
+   DBMS:		PostgreSQL 6.3.2 (plus changes to 98/06/01)
+   OS:		Linux 2.0.32 leslie
+   HardWare:	i586/133 HX 512, 32M RAM, fast SCSI, 7200rpm
+   StartUp:	postmaster -B 2048 '-o -S 2048 -F' -S
+   Compiler:	gcc 2.7.2.3
+   Compiled:	-O, WITH CASSERT checking, with
+   		-DTBL_FREE_CMD_MEMORY (to free memory
+   		if BEGIN/END after each query execution)
+   DB connection startup: 0.11
+   8192 INSERTs INTO SIMPLE (1 xact): 21.31
+   8192 INSERTs INTO SIMPLE (8192 xacts): 42.28
+   Create INDEX on SIMPLE: 2.65
+   8192 INSERTs INTO SIMPLE with INDEX (1 xact): 25.74
+   8192 random INDEX scans on SIMPLE (1 xact): 42.26
+   ORDER BY SIMPLE: 1.72
+   
+   
+6. No Asserts, more buffers (2048 vs 256) and -F to avoid forced i/o
+   DBMS:		PostgreSQL 6.3.2 (plus changes to 98/06/01)
+   OS:		Linux 2.0.32 leslie
+   HardWare:	i586/133 HX 512, 32M RAM, fast SCSI, 7200rpm
+   StartUp:	postmaster -B 2048 '-o -S 2048 -F' -S
+   Compiler:	gcc 2.7.2.3
+   Compiled:	-O, No CASSERT checking, with
+   		-DTBL_FREE_CMD_MEMORY (to free memory
+   		if BEGIN/END after each query execution)
+   DB connection startup: 0.11
+   8192 INSERTs INTO SIMPLE (1 xact): 20.52
+   8192 INSERTs INTO SIMPLE (8192 xacts): 39.79
+   Create INDEX on SIMPLE: 1.40
+   8192 INSERTs INTO SIMPLE with INDEX (1 xact): 24.77
+   8192 random INDEX scans on SIMPLE (1 xact): 39.51
+   ORDER BY SIMPLE: 1.55
+---------------------------------------------------------------------
+
+-dg
+
+David Gould            dg@illustra.com           510.628.3783 or 510.305.9468 
+Informix Software  (No, really)         300 Lakeside Drive  Oakland, CA 94612
+"Don't worry about people stealing your ideas.  If your ideas are any
+ good, you'll have to ram them down people's throats." -- Howard Aiken
+
+
--- a/doc/TODO.detail/persistent
+++ b/doc/TODO.detail/persistent
@ -0,0 +1,102 @@
+From owner-pgsql-hackers@hub.org Mon May 11 11:31:09 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id LAA03006
+	for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:31:07 -0400 (EDT)
+Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id LAA01663 for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:24:42 -0400 (EDT)
+Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id LAA21841; Mon, 11 May 1998 11:15:25 -0400 (EDT)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 11 May 1998 11:15:12 +0000 (EDT)
+Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id LAA21683 for pgsql-hackers-outgoing; Mon, 11 May 1998 11:15:09 -0400 (EDT)
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by hub.org (8.8.8/8.7.5) with ESMTP id LAA21451 for <hackers@postgreSQL.org>; Mon, 11 May 1998 11:15:03 -0400 (EDT)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.8.5/8.8.5) with ESMTP id LAA24915;
+	Mon, 11 May 1998 11:14:43 -0400 (EDT)
+To: Brett McCormick <brett@work.chicken.org>
+cc: hackers@postgreSQL.org
+Subject: Re: [HACKERS] Re: [PATCHES] Try again: S_LOCK reduced contentionh] 
+In-reply-to: Your message of Mon, 11 May 1998 07:57:23 -0700 (PDT) 
+             <13655.4384.345723.466046@abraxas.scene.com> 
+Date: Mon, 11 May 1998 11:14:43 -0400
+Message-ID: <24913.894899683@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@hub.org
+Precedence: bulk
+Status: RO
+
+Brett McCormick <brett@work.chicken.org> writes:
+> same way that the current network socket is passed -- through an execv
+> argument.  hopefully, however, the non-execv()ing fork will be in 6.4.
+
+Um, you missed the point, Brett.  David was hoping to transfer a client
+connection from the postmaster to an *already existing* backend process.
+Fork, with or without exec, solves the problem for a backend that's
+started after the postmaster has accepted the client socket.
+
+This does lead to a different line of thought, however.  Pre-started
+backends would have access to the "master" connection socket on which
+the postmaster listens for client connections, right?  Suppose that we
+fire the postmaster as postmaster, and demote it to being simply a
+manufacturer of new backend processes as old ones get used up.  Have
+one of the idle backend processes be the one doing the accept() on the
+master socket.  Once it has a client connection, it performs the
+authentication handshake and then starts serving the client (or just
+quits if authentication fails).  Meanwhile the next idle backend process
+has executed accept() on the master socket and is waiting for the next
+client; and shortly the postmaster/factory/whateverwecallitnow notices
+that it needs to start another backend to add to the idle-backend pool.
+
+This'd probably need some interlocking among the backends.  I have no
+idea whether it'd be safe to have all the idle backends trying to
+do accept() on the master socket simultaneously, but it sounds risky.
+Better to use a mutex so that only one gets to do it while the others
+sleep.
+
+			regards, tom lane
+
+
+From owner-pgsql-hackers@hub.org Mon May 11 11:35:55 1998
+Received: from hub.org (hub.org [209.47.148.200])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id LAA03043
+	for <maillist@candle.pha.pa.us>; Mon, 11 May 1998 11:35:53 -0400 (EDT)
+Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id LAA23494; Mon, 11 May 1998 11:27:10 -0400 (EDT)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Mon, 11 May 1998 11:27:02 +0000 (EDT)
+Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id LAA23473 for pgsql-hackers-outgoing; Mon, 11 May 1998 11:27:01 -0400 (EDT)
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by hub.org (8.8.8/8.7.5) with ESMTP id LAA23462 for <hackers@postgreSQL.org>; Mon, 11 May 1998 11:26:56 -0400 (EDT)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.8.5/8.8.5) with ESMTP id LAA25006;
+	Mon, 11 May 1998 11:26:44 -0400 (EDT)
+To: Brett McCormick <brett@work.chicken.org>
+cc: hackers@postgreSQL.org
+Subject: Re: [HACKERS] Re: [PATCHES] Try again: S_LOCK reduced contentionh] 
+In-reply-to: Your message of Mon, 11 May 1998 07:57:23 -0700 (PDT) 
+             <13655.4384.345723.466046@abraxas.scene.com> 
+Date: Mon, 11 May 1998 11:26:44 -0400
+Message-ID: <25004.894900404@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@hub.org
+Precedence: bulk
+Status: RO
+
+Meanwhile, *I* missed the point about Brett's second comment :-(
+
+Brett McCormick <brett@work.chicken.org> writes:
+> There will have to be some sort of arg parsing in any case,
+> considering that you can pass configurable arguments to the backend..
+
+If we do the sort of change David and I were just discussing, then the
+pre-spawned backend would become responsible for parsing and dealing
+with the PGOPTIONS portion of the client's connection request message.
+That's just part of shifting the authentication handshake code from
+postmaster to backend, so it shouldn't be too hard.
+
+BUT: the whole point is to be able to initialize the backend before it
+is connected to a client.  How much of the expensive backend startup
+work depends on having the client connection options available?
+Any work that needs to know the options will have to wait until after
+the client connects.  If that means most of the startup work can't
+happen in advance anyway, then we're out of luck; a pre-started backend
+won't save enough time to be worth the effort.  (Unless we are willing
+to eliminate or redefine the troublesome options...)
+
+			regards, tom lane
+
+
--- a/doc/TODO.detail/pg_shadow
+++ b/doc/TODO.detail/pg_shadow
@ -0,0 +1,55 @@
+From owner-pgsql-hackers@hub.org Sun Aug  2 20:01:13 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id UAA15937
+	for <maillist@candle.pha.pa.us>; Sun, 2 Aug 1998 20:01:11 -0400 (EDT)
+Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id TAA01026 for <maillist@candle.pha.pa.us>; Sun, 2 Aug 1998 19:33:53 -0400 (EDT)
+Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id TAA19878; Sun, 2 Aug 1998 19:30:59 -0400 (EDT)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sun, 02 Aug 1998 19:28:23 +0000 (EDT)
+Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id TAA19534 for pgsql-hackers-outgoing; Sun, 2 Aug 1998 19:28:22 -0400 (EDT)
+Received: from sss.sss.pgh.pa.us (sss.pgh.pa.us [206.210.65.6]) by hub.org (8.8.8/8.7.5) with ESMTP id TAA19521 for <pgsql-hackers@postgreSQL.org>; Sun, 2 Aug 1998 19:28:15 -0400 (EDT)
+Received: from sss.sss.pgh.pa.us (localhost [127.0.0.1])
+	by sss.sss.pgh.pa.us (8.9.1/8.9.1) with ESMTP id TAA22594
+	for <pgsql-hackers@postgreSQL.org>; Sun, 2 Aug 1998 19:28:13 -0400 (EDT)
+To: pgsql-hackers@postgreSQL.org
+Subject: [HACKERS] TODO item: make pg_shadow updates more robust
+Date: Sun, 02 Aug 1998 19:28:13 -0400
+Message-ID: <22591.902100493@sss.pgh.pa.us>
+From: Tom Lane <tgl@sss.pgh.pa.us>
+Sender: owner-pgsql-hackers@hub.org
+Precedence: bulk
+Status: ROr
+
+I learned the hard way last night that the postmaster's password
+authentication routines don't look at the pg_shadow table.  They
+look at a separate file named pg_pwd, which certain backend operations
+will update from pg_shadow.  (This is not documented in any user
+documentation that I could find; I had to burrow into
+src/backend/commands/user.c to discover it.)
+
+Unfortunately, if a clueless dbadmin (like me ;-)) tries to update
+password data with the obvious thing,
+	update pg_shadow set passwd = 'xxxxx' where usename = 'yyyy';
+pg_pwd doesn't get fixed.
+
+A more drastic problem is that pg_dump believes it can save and
+restore pg_shadow data using "copy".  Following an initdb and restore
+from a pg_dump -z script, pg_shadow will look just fine, but only
+the database admin will be listed in pg_pwd.  This is likely to provoke
+some confusion, IMHO.
+
+As a short-term thing, the fact that you *must* set passwords with
+ALTER USER ought to be documented, preferably someplace where a
+dbadmin who's never heard of ALTER USER is likely to find it.
+
+As a longer-term thing, I think it would be far better if ordinary
+SQL operations on pg_shadow just did the right thing.  Wouldn't it
+be possible to implement copying to pg_pwd by means of a trigger on
+pg_shadow updates, or something like that?
+
+(I'm afraid that pg_dump -z is pretty well broken for operations on
+a password-protected database, btw.  Has anyone used it successfully
+in that situation?)
+
+			regards, tom lane
+
+
--- a/doc/TODO.detail/prepare
+++ b/doc/TODO.detail/prepare
@ -0,0 +1,98 @@
+From owner-pgsql-hackers@hub.org Wed Nov 18 14:40:49 1998
+Received: from hub.org (majordom@hub.org [209.47.148.200])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id OAA29743
+	for <maillist@candle.pha.pa.us>; Wed, 18 Nov 1998 14:40:36 -0500 (EST)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.9.1/8.9.1) with SMTP id OAA03716;
+	Wed, 18 Nov 1998 14:37:04 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 18 Nov 1998 14:34:39 +0000 (EST)
+Received: (from majordom@localhost)
+	by hub.org (8.9.1/8.9.1) id OAA03395
+	for pgsql-hackers-outgoing; Wed, 18 Nov 1998 14:34:37 -0500 (EST)
+	(envelope-from owner-pgsql-hackers@postgreSQL.org)
+Received: from orion.SAPserv.Hamburg.dsh.de (Tpolaris2.sapham.debis.de [53.2.131.8])
+	by hub.org (8.9.1/8.9.1) with SMTP id OAA03381
+	for <pgsql-hackers@hub.org>; Wed, 18 Nov 1998 14:34:31 -0500 (EST)
+	(envelope-from wieck@sapserv.debis.de)
+Received: by orion.SAPserv.Hamburg.dsh.de 
+	for pgsql-hackers@hub.org 
+	id m0zgDnj-000EBTC; Wed, 18 Nov 98 21:02 MET
+Message-Id: <m0zgDnj-000EBTC@orion.SAPserv.Hamburg.dsh.de>
+From: jwieck@debis.com (Jan Wieck)
+Subject: Re: [HACKERS] PREPARE
+To: meskes@usa.net (Michael Meskes)
+Date: Wed, 18 Nov 1998 21:02:06 +0100 (MET)
+Cc: pgsql-hackers@hub.org
+Reply-To: jwieck@debis.com (Jan Wieck)
+In-Reply-To: <19981118084843.B869@usa.net> from "Michael Meskes" at Nov 18, 98 08:48:43 am
+X-Mailer: ELM [version 2.4 PL25]
+Content-Type: text
+Sender: owner-pgsql-hackers@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+Michael Meskes wrote:
+
+>
+> On Wed, Nov 18, 1998 at 03:23:30AM +0000, Thomas G. Lockhart wrote:
+> > > I didn't get this one completely. What input do you mean?
+> >
+> > Just the original string/query to be prepared...
+>
+> I see. But wouldn't it be more useful to preprocess the query and store the
+> resulting nodes instead? We don't want to parse the statement every time a
+> variable binding comes in.
+
+    Right.  The only real improvement would be to keep the prepared
+    execution plan in the backend and just pass in the parameter
+    values.
+
+    I can think of the following construct:
+
+        PREPARE optimizable-statement;
+
+    That one will run parser/rewrite/planner, create a new memory
+    context with a unique identifier and save the querytrees and
+    plans in it. Parameter values are identified by the usual $n
+    notation. The command returns the identifier.
+
+        EXECUTE QUERY identifier [value [, ...]];
+
+    then gets back the prepared plan and querytree by the id,
+    creates an executor context with the given values in the
+    parameter array and calls ExecutorRun() for them.
+
+    The PREPARE needs to analyze the resulting parsetrees to get
+    the datatypes (and maybe atttypmods) of the parameters, so
+    EXECUTE QUERY can convert the values into Datums using the
+    types' input functions. And the EXECUTE has to be handled
+    specially in tcop (it's something between a regular query and
+    a utility statement). But it's not too hard to implement.
+
+    Finally a
+
+        FORGET QUERY identifier;
+
+    (don't  remember  how  the  others  named it) will remove the
+    prepared plan etc. simply by destroying  the  memory  context
+    and dropping the identifier from the id->mcontext+prepareinfo
+    mapping.
+
+    This all  restricts  the  usage  of  PREPARE  to  optimizable
+    statements.  Is  it  required  to  be able to prepare utility
+    statements (like CREATE TABLE or so) too?
+
+
+Jan
+
+--
+
+#======================================================================#
+# It's easier to get forgiveness for being wrong than for being right. #
+# Let's break this rule - forgive me.                                  #
+#======================================== jwieck@debis.com (Jan Wieck) #
+
+
+
+
--- a/doc/TODO.detail/primary
+++ b/doc/TODO.detail/primary
@ -0,0 +1,159 @@
+From owner-pgsql-hackers@hub.org Fri Sep  4 00:47:06 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id AAA01047
+	for <maillist@candle.pha.pa.us>; Fri, 4 Sep 1998 00:47:05 -0400 (EDT)
+Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id XAA02044 for <maillist@candle.pha.pa.us>; Thu, 3 Sep 1998 23:11:07 -0400 (EDT)
+Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id XAA27418; Thu, 3 Sep 1998 23:06:16 -0400 (EDT)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Thu, 03 Sep 1998 23:04:11 +0000 (EDT)
+Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id XAA27185 for pgsql-hackers-outgoing; Thu, 3 Sep 1998 23:04:09 -0400 (EDT)
+Received: from dune.krs.ru (dune.krs.ru [195.161.16.38]) by hub.org (8.8.8/8.7.5) with ESMTP id XAA27169 for <hackers@postgreSQL.org>; Thu, 3 Sep 1998 23:03:59 -0400 (EDT)
+Received: from krs.ru (localhost.krs.ru [127.0.0.1])
+	by dune.krs.ru (8.8.8/8.8.8) with ESMTP id LAA10059;
+	Fri, 4 Sep 1998 11:03:00 +0800 (KRSS)
+	(envelope-from vadim@krs.ru)
+Message-ID: <35EF5864.E5142D35@krs.ru>
+Date: Fri, 04 Sep 1998 11:03:00 +0800
+From: Vadim Mikheev <vadim@krs.ru>
+Organization: OJSC Rostelecom (Krasnoyarsk)
+X-Mailer: Mozilla 4.05 [en] (X11; I; FreeBSD 2.2.6-RELEASE i386)
+MIME-Version: 1.0
+To: "D'Arcy J.M. Cain" <darcy@druid.net>
+CC: "Thomas G. Lockhart" <lockhart@alumni.caltech.edu>, hackers@postgreSQL.org
+Subject: Re: [HACKERS] Adding PRIMARY KEY info
+References: <m0zEaoV-00006JC@druid.net>
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Sender: owner-pgsql-hackers@hub.org
+Precedence: bulk
+Status: RO
+
+D'Arcy J.M. Cain wrote:
+> 
+> Thus spake Vadim Mikheev
+> > Imho, indices should be used/created for FOREIGN keys and so pg_index
+> > is good place for both PRIMARY and FOREIGN keys infos.
+> 
+> Are you sure?  I don't know about implementing it but it seems more
+> like an attribute thing rather than an index thing.  Certainly from a
+> database design viewpoint you want to refer to the fields, not the
+> index on them.  If you put it into the index then you have to do
+> an extra join to get the information.
+> 
+> Perhaps you have to do the extra join anyway for other purposes so it
+> may not matter.  All I want is to be able to extract the
+> field that the designer specified as the key.  As long as I can design
+> a select statement that gives me that I don't much care how it is
+> implemented.  I'll cache the information anyway so it won't have a
+> huge impact on my programs.
+
+First, let me note that you have to add an int28 field to pg_class,
+not just an oid field, to know what attributeS are in the primary key
+(we support multi-attribute primary keys).
+This could be done...
+But what about foreign and unique (!) keys ?
+There may be _many_ foreign/unique keys defined for one table!
+And so foreign/unique key info has to be stored somewhere else,
+not in pg_class.
+
+pg_index is a good place for all _3_ key types because:
+
+1. index should be created for each foreign key - 
+   just for performance.
+2. pg_index already has int28 field for key attributes.
+3. pg_index already has indisunique (note that foreign keys 
+   may reference unique keys, not just primary ones).
+
+- so we just have to add two fields to pg_index:
+
+bool indisprimary;
+oid  indreferenced; 
+^^^^^^^^^^^^^^^^^^
+this is for foreign keys: oid of referenced relation's
+primary/unique key index.
+
+I agree that indices are just implementation...
+If you don't want to store key info in pg_index then
+a new pg_key relation has to be added...
+
+Comments ?
+
+Vadim
+
+
+From owner-pgsql-hackers@hub.org Sat Sep  5 02:01:13 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.8.5/8.8.5) with ESMTP id CAA14437
+	for <maillist@candle.pha.pa.us>; Sat, 5 Sep 1998 02:01:11 -0400 (EDT)
+Received: from hub.org (hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id BAA09928 for <maillist@candle.pha.pa.us>; Sat, 5 Sep 1998 01:48:32 -0400 (EDT)
+Received: from localhost (majordom@localhost) by hub.org (8.8.8/8.7.5) with SMTP id BAA18282; Sat, 5 Sep 1998 01:43:16 -0400 (EDT)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Sat, 05 Sep 1998 01:41:40 +0000 (EDT)
+Received: (from majordom@localhost) by hub.org (8.8.8/8.7.5) id BAA18241 for pgsql-hackers-outgoing; Sat, 5 Sep 1998 01:41:38 -0400 (EDT)
+Received: from dune.krs.ru (dune.krs.ru [195.161.16.38]) by hub.org (8.8.8/8.7.5) with ESMTP id BAA18211; Sat, 5 Sep 1998 01:41:21 -0400 (EDT)
+Received: from krs.ru (localhost.krs.ru [127.0.0.1])
+	by dune.krs.ru (8.8.8/8.8.8) with ESMTP id NAA20555;
+	Sat, 5 Sep 1998 13:40:44 +0800 (KRSS)
+	(envelope-from vadim@krs.ru)
+Message-ID: <35F0CEDB.AD721090@krs.ru>
+Date: Sat, 05 Sep 1998 13:40:43 +0800
+From: Vadim Mikheev <vadim@krs.ru>
+Organization: OJSC Rostelecom (Krasnoyarsk)
+X-Mailer: Mozilla 4.05 [en] (X11; I; FreeBSD 2.2.6-RELEASE i386)
+MIME-Version: 1.0
+To: "D'Arcy J.M. Cain" <darcy@druid.net>
+CC: hackers@postgreSQL.org, pgsql-core@postgreSQL.org
+Subject: Re: [HACKERS] Adding PRIMARY KEY info
+References: <m0zEvLK-00006FC@druid.net>
+Content-Type: text/plain; charset=us-ascii
+Content-Transfer-Encoding: 7bit
+Sender: owner-pgsql-hackers@hub.org
+Precedence: bulk
+Status: ROr
+
+D'Arcy J.M. Cain wrote:
+> 
+> >
+> > pg_index is a good place for all _3_ key types because:
+> >
+> > 1. index should be created for each foreign key -
+> >    just for performance.
+> > 2. pg_index already has int28 field for key attributes.
+> > 3. pg_index already has indisunique (note that foreign keys
+> >    may reference unique keys, not just primary ones).
+> >
+> > - so we just have to add two fields to pg_index:
+> >
+> > bool indisprimary;
+> > oid  indreferenced;
+> > ^^^^^^^^^^^^^^^^^^
+> > this is for foreign keys: oid of referenced relation's
+> > primary/unique key index.
+> 
+> Sounds fine to me.  Any chance of seeing this in 6.4?
+
+I could add this (and FOREIGN key implementation) before
+11-13 Sep... But not the ALTER TABLE ADD/DROP CONSTRAINT
+stuff (ok for Entry SQL).
+But we are in beta...
+
+Comments?
+
+> Nope, pg_index is fine by me.  Now, once we have this, how do we find
+> the index for a particular attribute?  I can't seem to figure out the
+> relationship between pg_attribute and pg_index.  The chart in the docs
+> suggests that indkey is the relation but I can't see any useful info
+> there for joining the tables.
+
+pg_index:
+	indrelid - oid of indexed relation
+	indkey   - up to 8 attnums
+
+pg_attribute:
+	attrelid - oid of relation
+	attnum   - ...
+
+Without an outer join you have to query pg_attribute for each
+valid attnum from pg_index->indkey -:(
+
+Vadim
+
+
--- a/doc/TODO.detail/tcl_arrays
+++ b/doc/TODO.detail/tcl_arrays
@ -0,0 +1,240 @@
+From owner-pgsql-patches@hub.org Wed Oct 14 17:31:26 1998
+Received: from renoir.op.net (root@renoir.op.net [209.152.193.4])
+	by candle.pha.pa.us (8.9.0/8.9.0) with ESMTP id RAA01594
+	for <maillist@candle.pha.pa.us>; Wed, 14 Oct 1998 17:31:24 -0400 (EDT)
+Received: from hub.org (majordom@hub.org [209.47.148.200]) by renoir.op.net (o1/$Revision: 1.1 $) with ESMTP id RAA01745 for <maillist@candle.pha.pa.us>; Wed, 14 Oct 1998 17:12:28 -0400 (EDT)
+Received: from localhost (majordom@localhost)
+	by hub.org (8.8.8/8.8.8) with SMTP id RAA06607;
+	Wed, 14 Oct 1998 17:10:43 -0400 (EDT)
+	(envelope-from owner-pgsql-patches@hub.org)
+Received: by hub.org (TLB v0.10a (1.23 tibbs 1997/01/09 00:29:32)); Wed, 14 Oct 1998 17:10:27 +0000 (EDT)
+Received: (from majordom@localhost)
+	by hub.org (8.8.8/8.8.8) id RAA06562
+	for pgsql-patches-outgoing; Wed, 14 Oct 1998 17:10:26 -0400 (EDT)
+	(envelope-from owner-pgsql-patches@postgreSQL.org)
+X-Authentication-Warning: hub.org: majordom set sender to owner-pgsql-patches@postgreSQL.org using -f
+Received: from mambo.cs.unitn.it (mambo.cs.unitn.it [193.205.199.204])
+	by hub.org (8.8.8/8.8.8) with SMTP id RAA06494
+	for <pgsql-patches@postgreSQL.org>; Wed, 14 Oct 1998 17:10:01 -0400 (EDT)
+	(envelope-from dz@cs.unitn.it)
+Received: from nikita.wizard.net (ts-slip31.gelso.unitn.it [193.205.200.31]) by mambo.cs.unitn.it (8.6.12/8.6.12) with ESMTP id XAA20316 for <pgsql-patches@postgreSQL.org>; Wed, 14 Oct 1998 23:09:52 +0200
+Received: (from dz@localhost) by nikita.wizard.net (8.8.5/8.6.9) id WAA00489 for pgsql-patches@postgreSQL.org; Wed, 14 Oct 1998 22:56:58 +0200
+From: Massimo Dal Zotto <dz@cs.unitn.it>
+Message-Id: <199810142056.WAA00489@nikita.wizard.net>
+Subject: [PATCHES] TCL_ARRAYS
+To: pgsql-patches@postgreSQL.org (Pgsql Patches)
+Date: Wed, 14 Oct 1998 22:56:58 +0200 (MET DST)
+X-Mailer: ELM [version 2.4 PL24 ME4]
+MIME-Version: 1.0
+Content-Type: text/plain; charset=iso-8859-1
+Content-Transfer-Encoding: 8bit
+Sender: owner-pgsql-patches@postgreSQL.org
+Precedence: bulk
+Status: RO
+
+Hi,
+
+I have written this patch which fixes some problems with TCL_ARRAYS.
+The new array code uses a temporary buffer and is disabled by default
+because it depends on contrib/string-io which most of you don't use.
+This raises once again the problem of backslashes/escapes and various
+ambiguities in pgsql output. I hope this will be solved in 6.5.
+
+*** src/interfaces/libpgtcl/pgtclCmds.c.orig	Mon Sep 21 09:00:19 1998
+--- src/interfaces/libpgtcl/pgtclCmds.c	Wed Oct 14 15:32:21 1998
+***************
+*** 602,616 ****
+  		{
+  			for (i = 0; i < PQnfields(result); i++)
+  			{
+  				sprintf(nameBuffer, "%d,%.200s", tupno, PQfname(result, i));
+  				if (Tcl_SetVar2(interp, arrVar, nameBuffer,
+! #ifdef TCL_ARRAYS
+! 								tcl_value(PQgetvalue(result, tupno, i)),
+  #else
+  								PQgetvalue(result, tupno, i),
+- #endif
+  								TCL_LEAVE_ERR_MSG) == NULL)
+  					return TCL_ERROR;
+  			}
+  		}
+  		Tcl_AppendResult(interp, arrVar, 0);
+--- 602,624 ----
+  		{
+  			for (i = 0; i < PQnfields(result); i++)
+  			{
+ #ifdef TCL_ARRAYS
+ 				char *buff = strdup(PQgetvalue(result, tupno, i));
+  				sprintf(nameBuffer, "%d,%.200s", tupno, PQfname(result, i));
+  				if (Tcl_SetVar2(interp, arrVar, nameBuffer,
+! 								tcl_value(buff),
+! 								TCL_LEAVE_ERR_MSG) == NULL) {
+! 					free(buff);
+! 					return TCL_ERROR;
+! 				}
+! 				free(buff);
+  #else
+ 				sprintf(nameBuffer, "%d,%.200s", tupno, PQfname(result, i));
+ 				if (Tcl_SetVar2(interp, arrVar, nameBuffer,
+  								PQgetvalue(result, tupno, i),
+  								TCL_LEAVE_ERR_MSG) == NULL)
+  					return TCL_ERROR;
+ #endif
+  			}
+  		}
+  		Tcl_AppendResult(interp, arrVar, 0);
+***************
+*** 636,643 ****
+  		 */
+  		for (tupno = 0; tupno < PQntuples(result); tupno++)
+  		{
+  			const char *field0 = PQgetvalue(result, tupno, 0);
+! 			char * workspace = malloc(strlen(field0) + strlen(appendstr) + 210);
+  
+  			for (i = 1; i < PQnfields(result); i++)
+  			{
+--- 644,674 ----
+  		 */
+  		for (tupno = 0; tupno < PQntuples(result); tupno++)
+  		{
+ #ifdef TCL_ARRAYS
+ 			char *buff = strdup(PQgetvalue(result, tupno, 0));
+ 			const char *field0 = tcl_value(buff);
+ 			char *workspace = malloc(strlen(field0) + 210 + strlen(appendstr));
+ 
+ 			for (i = 1; i < PQnfields(result); i++)
+ 			{
+ 				free(buff);
+ 				buff = strdup(PQgetvalue(result, tupno, i));
+ 				sprintf(workspace, "%s,%.200s%s", field0, PQfname(result,i),
+ 						appendstr);
+ 				if (Tcl_SetVar2(interp, arrVar, workspace,
+ 								tcl_value(buff),
+ 								TCL_LEAVE_ERR_MSG) == NULL)
+ 				{
+ 					free(buff);
+ 					free(workspace);
+ 					return TCL_ERROR;
+ 				}
+ 			}
+ 			free(buff);
+ 			free(workspace);
+ #else
+  			const char *field0 = PQgetvalue(result, tupno, 0);
+! 			char *workspace = malloc(strlen(field0) + 210 + strlen(appendstr));
+  
+  			for (i = 1; i < PQnfields(result); i++)
+  			{
+***************
+*** 652,657 ****
+--- 683,689 ----
+  				}
+  			}
+  			free(workspace);
+ #endif
+  		}
+  		Tcl_AppendResult(interp, arrVar, 0);
+  		return TCL_OK;
+***************
+*** 669,676 ****
+--- 701,716 ----
+  			Tcl_AppendResult(interp, "argument to getTuple cannot exceed number of tuples - 1", 0);
+  			return TCL_ERROR;
+  		}
+ #ifdef TCL_ARRAYS
+ 		for (i = 0; i < PQnfields(result); i++) {
+ 			char *buff = strdup(PQgetvalue(result, tupno, i));
+ 			Tcl_AppendElement(interp, tcl_value(buff));
+ 			free(buff);
+ 		}
+ #else
+  		for (i = 0; i < PQnfields(result); i++)
+  			Tcl_AppendElement(interp, PQgetvalue(result, tupno, i));
+ #endif
+  		return TCL_OK;
+  	}
+  	else if (strcmp(opt, "-tupleArray") == 0)
+***************
+*** 688,697 ****
+--- 728,748 ----
+  		}
+  		for (i = 0; i < PQnfields(result); i++)
+  		{
+ #ifdef TCL_ARRAYS
+ 			char *buff = strdup(PQgetvalue(result, tupno, i));
+ 			if (Tcl_SetVar2(interp, argv[4], PQfname(result, i),
+ 							tcl_value(buff),
+ 							TCL_LEAVE_ERR_MSG) == NULL) {
+ 				free(buff);
+ 				return TCL_ERROR;
+ 			}
+ 			free(buff);
+ #else
+  			if (Tcl_SetVar2(interp, argv[4], PQfname(result, i),
+  							PQgetvalue(result, tupno, i),
+  							TCL_LEAVE_ERR_MSG) == NULL)
+  				return TCL_ERROR;
+ #endif
+  		}
+  		return TCL_OK;
+  	}
+***************
+*** 1303,1310 ****
+  		sprintf(buffer, "%d", tupno);
+  		Tcl_SetVar2(interp, argv[3], ".tupno", buffer, 0);
+  
+  		for (column = 0; column < ncols; column++)
+! 			Tcl_SetVar2(interp, argv[3], info[column].cname, PQgetvalue(result, tupno, column), 0);
+  
+  		Tcl_SetVar2(interp, argv[3], ".command", "update", 0);
+  
+--- 1354,1371 ----
+  		sprintf(buffer, "%d", tupno);
+  		Tcl_SetVar2(interp, argv[3], ".tupno", buffer, 0);
+  
+ #ifdef TCL_ARRAYS
+ 		for (column = 0; column < ncols; column++) {
+ 			char *buff = strdup(PQgetvalue(result, tupno, column));
+ 			Tcl_SetVar2(interp, argv[3], info[column].cname,
+ 						tcl_value(buff), 0);
+ 			free(buff);
+ 		}
+ #else
+  		for (column = 0; column < ncols; column++)
+! 			Tcl_SetVar2(interp, argv[3], info[column].cname,
+! 						PQgetvalue(result, tupno, column), 0);
+! #endif
+  
+  		Tcl_SetVar2(interp, argv[3], ".command", "update", 0);
+  
+*** src/include/config.h.in.orig	Wed Aug 26 09:01:16 1998
+--- src/include/config.h.in	Wed Oct 14 22:44:00 1998
+***************
+*** 312,318 ****
+   * of postgres C-like arrays, for example {{"a1" "a2"} {"b1" "b2"}} instead 
+   * of {{"a1","a2"},{"b1","b2"}}.
+   */
+! #define TCL_ARRAYS
+  
+  /*
+   * The following flag allows limiting the number of rows returned by a query.
+--- 312,318 ----
+   * of postgres C-like arrays, for example {{"a1" "a2"} {"b1" "b2"}} instead 
+   * of {{"a1","a2"},{"b1","b2"}}.
+   */
+! /* #define TCL_ARRAYS */
+  
+  /*
+   * The following flag allows limiting the number of rows returned by a query.
+
+-- 
+Massimo Dal Zotto
+
+----------------------------------------------------------------------+
+|  Massimo Dal Zotto                email:  dz@cs.unitn.it             |
+|  Via Marconi, 141                 phone:  ++39-461-534251            |
+|  38057 Pergine Valsugana (TN)     www:  http://www.cs.unitn.it/~dz/  |
+|  Italy                            pgp:  finger dz@tango.cs.unitn.it  |
+----------------------------------------------------------------------+
+
+