3380 lines
149 KiB
Plaintext
3380 lines
149 KiB
Plaintext
From pgsql-hackers-owner+M77861=pgman=candle.pha.pa.us@postgresql.org Fri Dec 23 05:19:20 2005
|
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
|
X-Greylist: from auto-whitelisted by SQLgrey-
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
From: Simon Riggs <simon@2ndquadrant.com>
|
|
To: Stephen Frost <sfrost@snowman.net>
|
|
cc: Martijn van Oosterhout <kleptog@svana.org>,
|
|
Jim C. Nasby <jnasby@pervasive.com>,
|
|
bizgres-general <bizgres-general@pgfoundry.org>,
|
|
pgsql-hackers@postgresql.org
|
|
In-Reply-To: <20051222223625.GC6026@ns.snowman.net>
|
|
References: <1135261893.2964.502.camel@localhost.localdomain>
|
|
<20051222183751.GG72143@pervasive.com> <20051222201826.GH21783@svana.org>
|
|
<1135289583.2964.536.camel@localhost.localdomain>
|
|
<20051222223625.GC6026@ns.snowman.net>
|
|
Date: Fri, 23 Dec 2005 10:18:43 +0000
|
|
Message-ID: <1135333123.2964.589.camel@localhost.localdomain>
|
|
X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4)
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Status: No, score=0.1 required=5 tests=[AWL=0.100]
|
|
X-Spam-Score: 0.1
|
|
X-Spam-Level:
|
|
X-Mailing-List: pgsql-hackers
|
|
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
|
|
List-Help: <mailto:majordomo@postgresql.org?body=help>
|
|
List-Id: <pgsql-hackers.postgresql.org>
|
|
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
|
|
List-Post: <mailto:pgsql-hackers@postgresql.org>
|
|
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
|
|
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Content-Length: 4728
|
|
|
|
On Thu, 2005-12-22 at 17:36 -0500, Stephen Frost wrote:
|
|
> * Simon Riggs (simon@2ndquadrant.com) wrote:
|
|
> > On Thu, 2005-12-22 at 21:18 +0100, Martijn van Oosterhout wrote:
|
|
> > > Considering "WAL bypass" is code for "breaks PITR"
|
|
> >
|
|
> > No it isn't. All of the WAL bypass logic does *not* operate when PITR is
|
|
> > active. The WAL bypass logic is aimed at Data Warehouses, which
|
|
> > typically never operate in PITR mode for performance reasons, however
|
|
> > the choice is yours.
|
|
|
|
OK, thanks for saying all of that; you probably speak for many in
|
|
raising these concerns. I'll answer each bit as we come to it. Suffice
|
|
to say, your concerns are good and so are the answers:
|
|
|
|
> Eh? PITR mode is bad for performance? Maybe I missed something but I
|
|
> wouldn't have thought PITR would degrade regular performance all that
|
|
> badly.
|
|
|
|
PITR mode is *not* bad for performance. On a very heavily loaded
|
|
write-intensive test system, the general PITR overhead on regular
|
|
performance was around 1% - so almost negligible.
|
|
|
|
We have been discussing a number of optimizations to specific commands
|
|
that would allow them to avoid writing WAL and thus speed up their
|
|
performance. If archive_command is set then WAL will always be written;
|
|
if it is not set then these commands will (or could) go faster:
|
|
|
|
- CREATE TABLE AS SELECT (in 8.1)
|
|
- COPY LOCK (patch submitted)
|
|
- COPY in same transaction as CREATE TABLE (patch submitted)
|
|
- INSERT SELECT in same transaction as CREATE TABLE (this discussion)
|
|
|
|
(There are a number of other conditions also, such as there must be no
|
|
indexes on a table. All of which now documented with the patch)
|
|
|
|
> So long as it doesn't take 15 minutes or some such to move the
|
|
> WAL to somewhere else (and I'm not sure that'd even slow things down..).
|
|
> For a Data Warehouse, have you got a better way of doing backups such
|
|
> that you don't lose at minimum most of a day's work?
|
|
|
|
Yes. Don't just use the backup facilities on their own. Think about how
|
|
the architecture of your systems will work and see if there is a better
|
|
way when you look at very large systems.
|
|
|
|
> I'm not exactly a
|
|
> big fan do doing a pg_dump every night either given that the database is
|
|
> 360GB. Much nicer to take a weekly dump of the database and then do
|
|
> PITR for a week or two before taking another dump of the db.
|
|
|
|
e.g. Keep your reference data (low volume) in an Operational Data Store
|
|
(ODS) database, protected by archiving. Keep your main fact data (high
|
|
volume) in the Data Warehouse, but save the data in slices as you load
|
|
it, so that a recovery is simply a reload of the database: no PITR or
|
|
pg_dump required, so high performance data transformation and load work
|
|
is possible. This is a commonly used architectural design pattern.
|
|
|
|
> I like the idea of making COPY go faster, but please don't break my
|
|
> backup system while you're at it.
|
|
|
|
On a personal note, I would only add that I spent a long time working on
|
|
PITR and I would never design anything that would intentionally break it
|
|
(nor would patches be accepted that did that). That probably gives me
|
|
the confidence to approach designs that might look like I'm doing that,
|
|
but without actually straying over the edge.
|
|
|
|
> I'm honestly kind of nervous about
|
|
> what you mean by checking it PITR is active- how is that done, exactly?
|
|
> Check if you have a script set to rotate the logs elsewhere? Or is it
|
|
> checking if you're in the taking-a-full-database-backup stage? Or what?
|
|
|
|
Internally, we use XLogArchivingActive(). Externally this will be set
|
|
when the admin sets archive_command to a particular value.
|
|
|
|
My original preference was for a parameter called archive_mode= ON | OFF
|
|
which would allow us to more easily discuss this, but this does not
|
|
currently exist.
|
|
|
|
> What's the performance decrease when using PITR, and what's it from? Is
|
|
> it just that COPY isn't as fast? Honestly, I could live with COPY being
|
|
> not as fast as it could be if my backups work. :)
|
|
|
|
These commands will not be optimized for speed when archive_command is set:
|
|
- CREATE TABLE AS SELECT (in 8.1)
|
|
- COPY LOCK (patch submitted)
|
|
|
|
> Sorry for sounding concerned but, well, backups are very important and
|
|
> so is performance and I'm afraid either I've not read all the
|
|
> documentation about the issues being discussed here or there isn't
|
|
> enough out there to make sense of it all yet. :)
|
|
|
|
If you choose PITR, then you are safe. If you do not, the crash recovery
|
|
of the database is not endangered by these optimizations.
|
|
|
|
Hope that covers all of your concerns?
|
|
|
|
I'm just writing a course that explains many of these techniques,
|
|
available in the New Year.
|
|
|
|
Best Regards, Simon Riggs
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 2: Don't 'kill -9' the postmaster
|
|
|
|
From pgsql-hackers-owner+M78004=pgman=candle.pha.pa.us@postgresql.org Wed Dec 28 20:59:03 2005
|
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
|
X-Greylist: from auto-whitelisted by SQLgrey-
|
|
From: Bruce Momjian <pgman@candle.pha.pa.us>
|
|
Message-ID: <200512290158.jBT1wEK28785@candle.pha.pa.us>
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
In-Reply-To: <20051226122206.GA12934@svana.org>
|
|
To: Martijn van Oosterhout <kleptog@svana.org>
|
|
Date: Wed, 28 Dec 2005 20:58:14 -0500 (EST)
|
|
cc: Simon Riggs <simon@2ndquadrant.com>, Tom Lane <tgl@sss.pgh.pa.us>,
|
|
Greg Stark <gsstark@mit.edu>, Rod Taylor <pg@rbt.ca>,
|
|
Qingqing Zhou <zhouqq@cs.toronto.edu>, pgsql-hackers@postgresql.org
|
|
X-Mailer: ELM [version 2.4ME+ PL121 (25)]
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Status: No, score=0.122 required=5 tests=[AWL=0.122]
|
|
X-Spam-Score: 0.122
|
|
X-Spam-Level:
|
|
X-Mailing-List: pgsql-hackers
|
|
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
|
|
List-Help: <mailto:majordomo@postgresql.org?body=help>
|
|
List-Id: <pgsql-hackers.postgresql.org>
|
|
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
|
|
List-Post: <mailto:pgsql-hackers@postgresql.org>
|
|
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
|
|
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Content-Length: 3461
|
|
|
|
|
|
Having read through this thread, I would like to propose a
|
|
syntax/behavior.
|
|
|
|
I think we all now agree that the logging is more part of the table than
|
|
the command itself. Right now we have a COPY LOCK patch, but people are
|
|
going to want to control logging for INSERT INTO ... SELECT, and UPDATE,
|
|
and all sorts of other things, so I think we are best adding an ALTER
|
|
TABLE capability. I am thinking of this syntax:
|
|
|
|
ALTER TABLE name RELIABILITY option
|
|
|
|
where "option" is:
|
|
|
|
DROP [ TABLE ON CRASH ]
|
|
DELETE [ ROWS ON CRASH ]
|
|
EXCLUSIVE
|
|
SHARE
|
|
|
|
Let me explain each option. DROP would drop the table on a restart
|
|
after a non-clean shutdown. It would do _no_ logging on the table and
|
|
allow concurrent access, plus index access. DELETE is the same as DROP,
|
|
but it just truncates the table (perhaps TRUNCATE is a better word).
|
|
|
|
EXCLUSIVE would allow only a single session to modify the table, and
|
|
would do all changes by appending to the table, similar to COPY LOCK.
|
|
EXCLUSIVE would also not allow indexes because those can not be isolated
|
|
like appending to the heap. EXCLUSIVE would write all dirty shared
|
|
buffers for the table and fsync them before committing. SHARE is the
|
|
functionality we have now, with full logging.
|
|
|
|
Does this get us any closer to a TODO item? It isn't great, but I think
|
|
it is pretty clear, and I assume pg_dump would use ALTER to load each
|
|
table. The advanage is that the COPY statements themselves are
|
|
unchanged so they would work in loading into older versions of
|
|
PostgreSQL.
|
|
|
|
---------------------------------------------------------------------------
|
|
|
|
Martijn van Oosterhout wrote:
|
|
-- Start of PGP signed section.
|
|
> On Mon, Dec 26, 2005 at 12:03:27PM +0000, Simon Riggs wrote:
|
|
> > I would not be against such a table-level switch, but the exact
|
|
> > behaviour would need to be specified more closely before this became a
|
|
> > TODO item, IMHO.
|
|
>
|
|
> Well, I think at a per table level is the only sensible level. If a
|
|
> table isn't logged, neither are the indexes. After an unclean shutdown
|
|
> the data could be anywhere between OK and rubbish, with no way of
|
|
> finding out which way.
|
|
>
|
|
> > If someone has a 100 GB table, they would not appreciate the table being
|
|
> > truncated if a transaction to load 1 GB of data aborts, forcing recovery
|
|
> > of the 100 GB table.
|
|
>
|
|
> Ah, but wouldn't such a large table be partitioned in such a way that
|
|
> you could have the most recent partition having the loaded data.
|
|
> Personally, I think these "shared temp tables" have more applications
|
|
> than meet the eye. I've had systems with cache tables which could be
|
|
> wiped on boot. Though I think my preference would be to TRUNCATE rather
|
|
> than DROP on unclean shutdown.
|
|
>
|
|
> Have a nice day,
|
|
> --
|
|
> Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
|
|
> > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
|
|
> > tool for doing 5% of the work and then sitting around waiting for someone
|
|
> > else to do the other 95% so you can sue them.
|
|
-- End of PGP section, PGP failed!
|
|
|
|
--
|
|
Bruce Momjian | http://candle.pha.pa.us
|
|
pgman@candle.pha.pa.us | (610) 359-1001
|
|
+ If your life is a hard drive, | 13 Roberts Road
|
|
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 3: Have you checked our extensive FAQ?
|
|
|
|
http://www.postgresql.org/docs/faq
|
|
|
|
From pgsql-hackers-owner+M78007=pgman=candle.pha.pa.us@postgresql.org Wed Dec 28 22:06:13 2005
|
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
|
X-Greylist: from auto-whitelisted by SQLgrey-
|
|
Message-ID: <43B3527A.4040709@commandprompt.com>
|
|
Date: Wed, 28 Dec 2005 19:05:30 -0800
|
|
From: Joshua D. Drake <jd@commandprompt.com>
|
|
Organization: Command Prompt, Inc.
|
|
User-Agent: Mozilla Thunderbird 1.0.2 (Windows/20050317)
|
|
X-Accept-Language: en-us, en
|
|
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
|
cc: Martijn van Oosterhout <kleptog@svana.org>,
|
|
Simon Riggs <simon@2ndquadrant.com>, Tom Lane <tgl@sss.pgh.pa.us>,
|
|
Greg Stark <gsstark@mit.edu>, Rod Taylor <pg@rbt.ca>,
|
|
Qingqing Zhou <zhouqq@cs.toronto.edu>, pgsql-hackers@postgresql.org
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
References: <200512290158.jBT1wEK28785@candle.pha.pa.us>
|
|
In-Reply-To: <200512290158.jBT1wEK28785@candle.pha.pa.us>
|
|
X-Greylist: Sender succeded SMTP AUTH authentication, not delayed by milter-greylist-1.6 (hosting.commandprompt.com [192.168.1.101]); Wed, 28 Dec 2005 18:57:25 -0800 (PST)
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Status: No, score=0.05 required=5 tests=[AWL=0.050, UPPERCASE_25_50=0]
|
|
X-Spam-Score: 0.05
|
|
X-Spam-Level:
|
|
X-Mailing-List: pgsql-hackers
|
|
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
|
|
List-Help: <mailto:majordomo@postgresql.org?body=help>
|
|
List-Id: <pgsql-hackers.postgresql.org>
|
|
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
|
|
List-Post: <mailto:pgsql-hackers@postgresql.org>
|
|
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
|
|
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Content-Length: 725
|
|
|
|
now agree that the logging is more part of the table than
|
|
> the command itself. Right now we have a COPY LOCK patch, but people are
|
|
> going to want to control logging for INSERT INTO ... SELECT, and UPDATE,
|
|
> and all sorts of other things, so I think we are best adding an ALTER
|
|
> TABLE capability. I am thinking of this syntax:
|
|
>
|
|
> ALTER TABLE name RELIABILITY option
|
|
>
|
|
> where "option" is:
|
|
>
|
|
> DROP [ TABLE ON CRASH ]
|
|
> DELETE [ ROWS ON CRASH ]
|
|
> EXCLUSIVE
|
|
> SHARE
|
|
|
|
I would say ON FAILURE (Crash just seems way to scary :))
|
|
|
|
Joshua D. Drake
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 3: Have you checked our extensive FAQ?
|
|
|
|
http://www.postgresql.org/docs/faq
|
|
|
|
From pgsql-hackers-owner+M78008=pgman=candle.pha.pa.us@postgresql.org Wed Dec 28 23:09:58 2005
|
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
|
X-Greylist: from auto-whitelisted by SQLgrey-
|
|
From: Bruce Momjian <pgman@candle.pha.pa.us>
|
|
Message-ID: <200512290409.jBT49LD13611@candle.pha.pa.us>
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
In-Reply-To: <43B3527A.4040709@commandprompt.com>
|
|
To: Joshua D. Drake <jd@commandprompt.com>
|
|
Date: Wed, 28 Dec 2005 23:09:21 -0500 (EST)
|
|
cc: Martijn van Oosterhout <kleptog@svana.org>,
|
|
Simon Riggs <simon@2ndquadrant.com>, Tom Lane <tgl@sss.pgh.pa.us>,
|
|
Greg Stark <gsstark@mit.edu>, Rod Taylor <pg@rbt.ca>,
|
|
Qingqing Zhou <zhouqq@cs.toronto.edu>, pgsql-hackers@postgresql.org
|
|
X-Mailer: ELM [version 2.4ME+ PL121 (25)]
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Status: No, score=0.122 required=5 tests=[AWL=0.122, UPPERCASE_25_50=0]
|
|
X-Spam-Score: 0.122
|
|
X-Spam-Level:
|
|
X-Mailing-List: pgsql-hackers
|
|
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
|
|
List-Help: <mailto:majordomo@postgresql.org?body=help>
|
|
List-Id: <pgsql-hackers.postgresql.org>
|
|
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
|
|
List-Post: <mailto:pgsql-hackers@postgresql.org>
|
|
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
|
|
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Content-Length: 1111
|
|
|
|
Joshua D. Drake wrote:
|
|
> now agree that the logging is more part of the table than
|
|
> > the command itself. Right now we have a COPY LOCK patch, but people are
|
|
> > going to want to control logging for INSERT INTO ... SELECT, and UPDATE,
|
|
> > and all sorts of other things, so I think we are best adding an ALTER
|
|
> > TABLE capability. I am thinking of this syntax:
|
|
> >
|
|
> > ALTER TABLE name RELIABILITY option
|
|
> >
|
|
> > where "option" is:
|
|
> >
|
|
> > DROP [ TABLE ON CRASH ]
|
|
> > DELETE [ ROWS ON CRASH ]
|
|
> > EXCLUSIVE
|
|
> > SHARE
|
|
>
|
|
> I would say ON FAILURE (Crash just seems way to scary :))
|
|
|
|
Agreed, maybe ON RECOVERY.
|
|
|
|
--
|
|
Bruce Momjian | http://candle.pha.pa.us
|
|
pgman@candle.pha.pa.us | (610) 359-1001
|
|
+ If your life is a hard drive, | 13 Roberts Road
|
|
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 9: In versions below 8.0, the planner will ignore your desire to
|
|
choose an index scan if your joining column's datatypes do not
|
|
match
|
|
|
|
From simon@2ndquadrant.com Thu Dec 29 08:19:47 2005
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
From: Simon Riggs <simon@2ndquadrant.com>
|
|
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
|
cc: Martijn van Oosterhout <kleptog@svana.org>, Tom Lane <tgl@sss.pgh.pa.us>,
|
|
Greg Stark <gsstark@mit.edu>, Rod Taylor <pg@rbt.ca>,
|
|
Qingqing Zhou <zhouqq@cs.toronto.edu>, pgsql-hackers@postgresql.org
|
|
In-Reply-To: <200512290158.jBT1wEK28785@candle.pha.pa.us>
|
|
References: <200512290158.jBT1wEK28785@candle.pha.pa.us>
|
|
Date: Thu, 29 Dec 2005 13:19:45 +0000
|
|
Message-ID: <1135862385.2964.804.camel@localhost.localdomain>
|
|
X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4)
|
|
Content-Length: 7026
|
|
|
|
On Wed, 2005-12-28 at 20:58 -0500, Bruce Momjian wrote:
|
|
> Having read through this thread, I would like to propose a
|
|
> syntax/behavior.
|
|
>
|
|
> I think we all now agree that the logging is more part of the table than
|
|
> the command itself. Right now we have a COPY LOCK patch, but people are
|
|
> going to want to control logging for INSERT INTO ... SELECT, and UPDATE,
|
|
> and all sorts of other things, so I think we are best adding an ALTER
|
|
> TABLE capability. I am thinking of this syntax:
|
|
>
|
|
> ALTER TABLE name RELIABILITY option
|
|
>
|
|
> where "option" is:
|
|
>
|
|
> DROP [ TABLE ON CRASH ]
|
|
> DELETE [ ROWS ON CRASH ]
|
|
> EXCLUSIVE
|
|
> SHARE
|
|
>
|
|
> Let me explain each option. DROP would drop the table on a restart
|
|
> after a non-clean shutdown. It would do _no_ logging on the table and
|
|
> allow concurrent access, plus index access. DELETE is the same as DROP,
|
|
> but it just truncates the table (perhaps TRUNCATE is a better word).
|
|
>
|
|
> EXCLUSIVE would allow only a single session to modify the table, and
|
|
> would do all changes by appending to the table, similar to COPY LOCK.
|
|
> EXCLUSIVE would also not allow indexes because those can not be isolated
|
|
> like appending to the heap. EXCLUSIVE would write all dirty shared
|
|
> buffers for the table and fsync them before committing. SHARE is the
|
|
> functionality we have now, with full logging.
|
|
>
|
|
> Does this get us any closer to a TODO item? It isn't great, but I think
|
|
> it is pretty clear, and I assume pg_dump would use ALTER to load each
|
|
> table. The advanage is that the COPY statements themselves are
|
|
> unchanged so they would work in loading into older versions of
|
|
> PostgreSQL.
|
|
|
|
First off, thanks for summarising a complex thread.
|
|
|
|
My view would be that this thread has been complex because everybody has
|
|
expressed a somewhat different requirement, which could be broken down
|
|
as:
|
|
1. The need for a multi-user-accessible yet temporary table
|
|
2. Loading data into a table immediately after it is created (i.e. in
|
|
same transaction), including but not limited to a reload from pg_dump
|
|
3. How to load data quickly into an existing table (COPY)
|
|
4. How to add/modify data quickly in an existing table (INSERT SELECT,
|
|
UPDATE)
|
|
|
|
I can see the need for all of those individually; my existing patch
|
|
submission covers (2) and (3) only. I very much like your thought to
|
|
coalesce these various requirements into a single coherent model.
|
|
|
|
For requirement (1), table level options make sense. We would:
|
|
- CREATE TABLE ALLTHINGS
|
|
- ALTER TABLE ALLTHINGS RELIABILITY DELETE ROWS ON RECOVERY
|
|
- lots of SQL, all fast because not logged
|
|
|
|
(2) is catered for adequately by the existing COPY patch i.e. it will
|
|
detect whether a table has just been created and then avoid writing WAL.
|
|
In the patch, pg_dump has *not* been altered to use COPY LOCK, so a
|
|
pg_dump *will* work with any other version of PostgreSQL, which *would
|
|
not* be the case if we added ALTER TABLE ... RELIABILITY statements into
|
|
it. Also, a pg_dump created at an earlier version could also be loaded
|
|
faster using the patch. The only requirement is to issue all SQL as part
|
|
of the same transaction - which is catered for by the
|
|
--single-transaction option on pg_restore and psql. So (2) is catered
|
|
for fully without the need for an ALTER TABLE ... RELIABILITY statement
|
|
or COPY LOCK.
|
|
|
|
For requirement (3), I would use table level options like this:
|
|
(the table already exists and is reasonably big; we should not assume
|
|
that everybody can and does use partitioning)
|
|
- ALTER TABLE RELIABILITY ALLTHINGS2 EXCLUSIVE
|
|
- COPY
|
|
- ALTER TABLE RELIABILITY ALLTHINGS2 SHARE
|
|
|
|
For a load into an existing table I would always do all three actions
|
|
together. COPY LOCK does exactly that *and* does it atomically.
|
|
|
|
The two ways of doing (3) have a few pros/cons either way:
|
|
Pro for ALTER TABLE:
|
|
- same syntax as req (1)
|
|
- doesn't need the keyword LOCK
|
|
- allows INSERT SELECT, UPDATE operations also (req 4)
|
|
Cons:
|
|
- existing programs have to add additional statements to take advantage
|
|
of this; with COPY LOCK we would add just a single keyword
|
|
- operation is not atomic, which might lead to some operations waiting
|
|
for a lock to operate as unlogged, since they would execute before the
|
|
second ALTER TABLE gets there
|
|
- operation will be understood by some, but not others. They will forget
|
|
to switch the RELIABILITY back on and then lose their whole table when
|
|
the database crashes. (watch...)
|
|
|
|
...but would it be a problem to have both?
|
|
|
|
|
|
So, my thinking would be to separate things into two:
|
|
a) Add a TODO item "shared temp tables" that caters for (1) and (4)
|
|
|
|
ALTER TABLE name RELIABILITY
|
|
{DELETE ROWS AT RECOVERY | FULL RECOVERY}
|
|
(syntax TBD)
|
|
|
|
which would
|
|
- truncate all rows and remove all index entries during recovery
|
|
- use shared_buffers, not temp_buffers
|
|
- never write xlog records, even when in PITR mode
|
|
- would avoid writing WAL for both heap *and* index tuples
|
|
|
|
b) Leave the COPY patch as is, since it caters for reqs (2) and (3) as
|
|
*separate* optimizations (but using a common infrastructure in code).
|
|
[This work was based upon discussions on -hackers only 6 months ago, so
|
|
its not like its been snuck in or anything
|
|
http://archives.postgresql.org/pgsql-hackers/2005-06/msg00069.php
|
|
http://archives.postgresql.org/pgsql-hackers/2005-06/msg00075.php ]
|
|
|
|
These two thoughts are separable. There is no need to
|
|
have-both-or-neither within PostgreSQL.
|
|
|
|
Eventually, I'd like all of these options, as a database designer.
|
|
|
|
Best Regards, Simon Riggs
|
|
|
|
> --------------------------------------------------------------------------
|
|
>
|
|
> Martijn van Oosterhout wrote:
|
|
> -- Start of PGP signed section.
|
|
> > On Mon, Dec 26, 2005 at 12:03:27PM +0000, Simon Riggs wrote:
|
|
> > > I would not be against such a table-level switch, but the exact
|
|
> > > behaviour would need to be specified more closely before this became a
|
|
> > > TODO item, IMHO.
|
|
> >
|
|
> > Well, I think at a per table level is the only sensible level. If a
|
|
> > table isn't logged, neither are the indexes. After an unclean shutdown
|
|
> > the data could be anywhere between OK and rubbish, with no way of
|
|
> > finding out which way.
|
|
> >
|
|
> > > If someone has a 100 GB table, they would not appreciate the table being
|
|
> > > truncated if a transaction to load 1 GB of data aborts, forcing recovery
|
|
> > > of the 100 GB table.
|
|
> >
|
|
> > Ah, but wouldn't such a large table be partitioned in such a way that
|
|
> > you could have the most recent partition having the loaded data.
|
|
> > Personally, I think these "shared temp tables" have more applications
|
|
> > than meet the eye. I've had systems with cache tables which could be
|
|
> > wiped on boot. Though I think my preference would be to TRUNCATE rather
|
|
> > than DROP on unclean shutdown.
|
|
> >
|
|
> > Have a nice day,
|
|
> > --
|
|
> > Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
|
|
> > > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
|
|
> > > tool for doing 5% of the work and then sitting around waiting for someone
|
|
> > > else to do the other 95% so you can sue them.
|
|
> -- End of PGP section, PGP failed!
|
|
>
|
|
|
|
From pgsql-hackers-owner+M78019=pgman=candle.pha.pa.us@postgresql.org Thu Dec 29 08:20:11 2005
|
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
|
X-Greylist: from auto-whitelisted by SQLgrey-
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
From: Simon Riggs <simon@2ndquadrant.com>
|
|
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
|
cc: Martijn van Oosterhout <kleptog@svana.org>, Tom Lane <tgl@sss.pgh.pa.us>,
|
|
Greg Stark <gsstark@mit.edu>, Rod Taylor <pg@rbt.ca>,
|
|
Qingqing Zhou <zhouqq@cs.toronto.edu>, pgsql-hackers@postgresql.org
|
|
In-Reply-To: <200512290158.jBT1wEK28785@candle.pha.pa.us>
|
|
References: <200512290158.jBT1wEK28785@candle.pha.pa.us>
|
|
Date: Thu, 29 Dec 2005 13:19:45 +0000
|
|
Message-ID: <1135862385.2964.804.camel@localhost.localdomain>
|
|
X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4)
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Status: No, score=0.112 required=5 tests=[AWL=0.112]
|
|
X-Spam-Score: 0.112
|
|
X-Spam-Level:
|
|
X-Mailing-List: pgsql-hackers
|
|
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
|
|
List-Help: <mailto:majordomo@postgresql.org?body=help>
|
|
List-Id: <pgsql-hackers.postgresql.org>
|
|
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
|
|
List-Post: <mailto:pgsql-hackers@postgresql.org>
|
|
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
|
|
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Content-Length: 7139
|
|
|
|
On Wed, 2005-12-28 at 20:58 -0500, Bruce Momjian wrote:
|
|
> Having read through this thread, I would like to propose a
|
|
> syntax/behavior.
|
|
>
|
|
> I think we all now agree that the logging is more part of the table than
|
|
> the command itself. Right now we have a COPY LOCK patch, but people are
|
|
> going to want to control logging for INSERT INTO ... SELECT, and UPDATE,
|
|
> and all sorts of other things, so I think we are best adding an ALTER
|
|
> TABLE capability. I am thinking of this syntax:
|
|
>
|
|
> ALTER TABLE name RELIABILITY option
|
|
>
|
|
> where "option" is:
|
|
>
|
|
> DROP [ TABLE ON CRASH ]
|
|
> DELETE [ ROWS ON CRASH ]
|
|
> EXCLUSIVE
|
|
> SHARE
|
|
>
|
|
> Let me explain each option. DROP would drop the table on a restart
|
|
> after a non-clean shutdown. It would do _no_ logging on the table and
|
|
> allow concurrent access, plus index access. DELETE is the same as DROP,
|
|
> but it just truncates the table (perhaps TRUNCATE is a better word).
|
|
>
|
|
> EXCLUSIVE would allow only a single session to modify the table, and
|
|
> would do all changes by appending to the table, similar to COPY LOCK.
|
|
> EXCLUSIVE would also not allow indexes because those can not be isolated
|
|
> like appending to the heap. EXCLUSIVE would write all dirty shared
|
|
> buffers for the table and fsync them before committing. SHARE is the
|
|
> functionality we have now, with full logging.
|
|
>
|
|
> Does this get us any closer to a TODO item? It isn't great, but I think
|
|
> it is pretty clear, and I assume pg_dump would use ALTER to load each
|
|
> table. The advanage is that the COPY statements themselves are
|
|
> unchanged so they would work in loading into older versions of
|
|
> PostgreSQL.
|
|
|
|
First off, thanks for summarising a complex thread.
|
|
|
|
My view would be that this thread has been complex because everybody has
|
|
expressed a somewhat different requirement, which could be broken down
|
|
as:
|
|
1. The need for a multi-user-accessible yet temporary table
|
|
2. Loading data into a table immediately after it is created (i.e. in
|
|
same transaction), including but not limited to a reload from pg_dump
|
|
3. How to load data quickly into an existing table (COPY)
|
|
4. How to add/modify data quickly in an existing table (INSERT SELECT,
|
|
UPDATE)
|
|
|
|
I can see the need for all of those individually; my existing patch
|
|
submission covers (2) and (3) only. I very much like your thought to
|
|
coalesce these various requirements into a single coherent model.
|
|
|
|
For requirement (1), table level options make sense. We would:
|
|
- CREATE TABLE ALLTHINGS
|
|
- ALTER TABLE ALLTHINGS RELIABILITY DELETE ROWS ON RECOVERY
|
|
- lots of SQL, all fast because not logged
|
|
|
|
(2) is catered for adequately by the existing COPY patch i.e. it will
|
|
detect whether a table has just been created and then avoid writing WAL.
|
|
In the patch, pg_dump has *not* been altered to use COPY LOCK, so a
|
|
pg_dump *will* work with any other version of PostgreSQL, which *would
|
|
not* be the case if we added ALTER TABLE ... RELIABILITY statements into
|
|
it. Also, a pg_dump created at an earlier version could also be loaded
|
|
faster using the patch. The only requirement is to issue all SQL as part
|
|
of the same transaction - which is catered for by the
|
|
--single-transaction option on pg_restore and psql. So (2) is catered
|
|
for fully without the need for an ALTER TABLE ... RELIABILITY statement
|
|
or COPY LOCK.
|
|
|
|
For requirement (3), I would use table level options like this:
|
|
(the table already exists and is reasonably big; we should not assume
|
|
that everybody can and does use partitioning)
|
|
- ALTER TABLE RELIABILITY ALLTHINGS2 EXCLUSIVE
|
|
- COPY
|
|
- ALTER TABLE RELIABILITY ALLTHINGS2 SHARE
|
|
|
|
For a load into an existing table I would always do all three actions
|
|
together. COPY LOCK does exactly that *and* does it atomically.
|
|
|
|
The two ways of doing (3) have a few pros/cons either way:
|
|
Pro for ALTER TABLE:
|
|
- same syntax as req (1)
|
|
- doesn't need the keyword LOCK
|
|
- allows INSERT SELECT, UPDATE operations also (req 4)
|
|
Cons:
|
|
- existing programs have to add additional statements to take advantage
|
|
of this; with COPY LOCK we would add just a single keyword
|
|
- operation is not atomic, which might lead to some operations waiting
|
|
for a lock to operate as unlogged, since they would execute before the
|
|
second ALTER TABLE gets there
|
|
- operation will be understood by some, but not others. They will forget
|
|
to switch the RELIABILITY back on and then lose their whole table when
|
|
the database crashes. (watch...)
|
|
|
|
...but would it be a problem to have both?
|
|
|
|
|
|
So, my thinking would be to separate things into two:
|
|
a) Add a TODO item "shared temp tables" that caters for (1) and (4)
|
|
|
|
ALTER TABLE name RELIABILITY
|
|
{DELETE ROWS AT RECOVERY | FULL RECOVERY}
|
|
(syntax TBD)
|
|
|
|
which would
|
|
- truncate all rows and remove all index entries during recovery
|
|
- use shared_buffers, not temp_buffers
|
|
- never write xlog records, even when in PITR mode
|
|
- would avoid writing WAL for both heap *and* index tuples
|
|
|
|
b) Leave the COPY patch as is, since it caters for reqs (2) and (3) as
|
|
*separate* optimizations (but using a common infrastructure in code).
|
|
[This work was based upon discussions on -hackers only 6 months ago, so
|
|
its not like its been snuck in or anything
|
|
http://archives.postgresql.org/pgsql-hackers/2005-06/msg00069.php
|
|
http://archives.postgresql.org/pgsql-hackers/2005-06/msg00075.php ]
|
|
|
|
These two thoughts are separable. There is no need to
|
|
have-both-or-neither within PostgreSQL.
|
|
|
|
Eventually, I'd like all of these options, as a database designer.
|
|
|
|
Best Regards, Simon Riggs
|
|
|
|
> --------------------------------------------------------------------------
|
|
>
|
|
> Martijn van Oosterhout wrote:
|
|
> -- Start of PGP signed section.
|
|
> > On Mon, Dec 26, 2005 at 12:03:27PM +0000, Simon Riggs wrote:
|
|
> > > I would not be against such a table-level switch, but the exact
|
|
> > > behaviour would need to be specified more closely before this became a
|
|
> > > TODO item, IMHO.
|
|
> >
|
|
> > Well, I think at a per table level is the only sensible level. If a
|
|
> > table isn't logged, neither are the indexes. After an unclean shutdown
|
|
> > the data could be anywhere between OK and rubbish, with no way of
|
|
> > finding out which way.
|
|
> >
|
|
> > > If someone has a 100 GB table, they would not appreciate the table being
|
|
> > > truncated if a transaction to load 1 GB of data aborts, forcing recovery
|
|
> > > of the 100 GB table.
|
|
> >
|
|
> > Ah, but wouldn't such a large table be partitioned in such a way that
|
|
> > you could have the most recent partition having the loaded data.
|
|
> > Personally, I think these "shared temp tables" have more applications
|
|
> > than meet the eye. I've had systems with cache tables which could be
|
|
> > wiped on boot. Though I think my preference would be to TRUNCATE rather
|
|
> > than DROP on unclean shutdown.
|
|
> >
|
|
> > Have a nice day,
|
|
> > --
|
|
> > Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
|
|
> > > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
|
|
> > > tool for doing 5% of the work and then sitting around waiting for someone
|
|
> > > else to do the other 95% so you can sue them.
|
|
> -- End of PGP section, PGP failed!
|
|
>
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 6: explain analyze is your friend
|
|
|
|
From pgsql-hackers-owner+M78021=pgman=candle.pha.pa.us@postgresql.org Thu Dec 29 09:35:58 2005
|
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
|
X-Greylist: from auto-whitelisted by SQLgrey-
|
|
From: Rod Taylor <pg@rbt.ca>
|
|
To: Simon Riggs <simon@2ndquadrant.com>
|
|
cc: Bruce Momjian <pgman@candle.pha.pa.us>,
|
|
Martijn van Oosterhout <kleptog@svana.org>, Tom Lane <tgl@sss.pgh.pa.us>,
|
|
Greg Stark <gsstark@mit.edu>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
|
|
pgsql-hackers@postgresql.org
|
|
In-Reply-To: <1135862385.2964.804.camel@localhost.localdomain>
|
|
References: <200512290158.jBT1wEK28785@candle.pha.pa.us>
|
|
<1135862385.2964.804.camel@localhost.localdomain>
|
|
Date: Thu, 29 Dec 2005 09:35:27 -0500
|
|
Message-ID: <1135866927.61038.13.camel@home>
|
|
X-Mailer: Evolution 2.4.2.1 FreeBSD GNOME Team Port
|
|
X-SA-Exim-Mail-From: pg@rbt.ca
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
X-SA-Exim-Version: 3.1 (built Tue Feb 24 05:09:27 GMT 2004)
|
|
X-SA-Exim-Scanned: Yes
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Status: No, score=0.024 required=5 tests=[AWL=0.024, UPPERCASE_25_50=0]
|
|
X-Spam-Score: 0.024
|
|
X-Spam-Level:
|
|
X-Mailing-List: pgsql-hackers
|
|
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
|
|
List-Help: <mailto:majordomo@postgresql.org?body=help>
|
|
List-Id: <pgsql-hackers.postgresql.org>
|
|
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
|
|
List-Post: <mailto:pgsql-hackers@postgresql.org>
|
|
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
|
|
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Content-Length: 506
|
|
|
|
|
|
> So, my thinking would be to separate things into two:
|
|
> a) Add a TODO item "shared temp tables" that caters for (1) and (4)
|
|
>
|
|
> ALTER TABLE name RELIABILITY
|
|
> {DELETE ROWS AT RECOVERY | FULL RECOVERY}
|
|
> (syntax TBD)
|
|
|
|
DELETE ROWS AT RECOVERY would need to be careful or disallowed when
|
|
referenced via a foreign key to ensure the database is not restored in
|
|
an inconsistent state.
|
|
|
|
--
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 2: Don't 'kill -9' the postmaster
|
|
|
|
From pg@rbt.ca Thu Dec 29 09:35:35 2005
|
|
From: Rod Taylor <pg@rbt.ca>
|
|
To: Simon Riggs <simon@2ndquadrant.com>
|
|
cc: Bruce Momjian <pgman@candle.pha.pa.us>,
|
|
Martijn van Oosterhout <kleptog@svana.org>, Tom Lane <tgl@sss.pgh.pa.us>,
|
|
Greg Stark <gsstark@mit.edu>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
|
|
pgsql-hackers@postgresql.org
|
|
In-Reply-To: <1135862385.2964.804.camel@localhost.localdomain>
|
|
References: <200512290158.jBT1wEK28785@candle.pha.pa.us>
|
|
<1135862385.2964.804.camel@localhost.localdomain>
|
|
Date: Thu, 29 Dec 2005 09:35:27 -0500
|
|
Message-ID: <1135866927.61038.13.camel@home>
|
|
X-Mailer: Evolution 2.4.2.1 FreeBSD GNOME Team Port
|
|
X-SA-Exim-Mail-From: pg@rbt.ca
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
X-Spam-Checker-Version: SpamAssassin 2.63 (2004-01-11) on psi.look.ca
|
|
X-Spam-Level:
|
|
X-Spam-Status: No, hits=0.7 required=9.0 tests=UPPERCASE_25_50 autolearn=no
|
|
version=2.63
|
|
X-SA-Exim-Version: 3.1 (built Tue Feb 24 05:09:27 GMT 2004)
|
|
X-SA-Exim-Scanned: Yes
|
|
Content-Length: 393
|
|
|
|
|
|
> So, my thinking would be to separate things into two:
|
|
> a) Add a TODO item "shared temp tables" that caters for (1) and (4)
|
|
>
|
|
> ALTER TABLE name RELIABILITY
|
|
> {DELETE ROWS AT RECOVERY | FULL RECOVERY}
|
|
> (syntax TBD)
|
|
|
|
DELETE ROWS AT RECOVERY would need to be careful or disallowed when
|
|
referenced via a foreign key to ensure the database is not restored in
|
|
an inconsistent state.
|
|
|
|
--
|
|
|
|
From pgsql-hackers-owner+M78022=pgman=candle.pha.pa.us@postgresql.org Thu Dec 29 10:10:57 2005
|
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
|
X-Greylist: from auto-whitelisted by SQLgrey-
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
From: Simon Riggs <simon@2ndquadrant.com>
|
|
To: Rod Taylor <pg@rbt.ca>
|
|
cc: Bruce Momjian <pgman@candle.pha.pa.us>,
|
|
Martijn van Oosterhout <kleptog@svana.org>, Tom Lane <tgl@sss.pgh.pa.us>,
|
|
Greg Stark <gsstark@mit.edu>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
|
|
pgsql-hackers@postgresql.org
|
|
In-Reply-To: <1135866927.61038.13.camel@home>
|
|
References: <200512290158.jBT1wEK28785@candle.pha.pa.us>
|
|
<1135862385.2964.804.camel@localhost.localdomain>
|
|
<1135866927.61038.13.camel@home>
|
|
Date: Thu, 29 Dec 2005 15:10:40 +0000
|
|
Message-ID: <1135869040.2964.824.camel@localhost.localdomain>
|
|
X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4)
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Status: No, score=0.113 required=5 tests=[AWL=0.113]
|
|
X-Spam-Score: 0.113
|
|
X-Spam-Level:
|
|
X-Mailing-List: pgsql-hackers
|
|
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
|
|
List-Help: <mailto:majordomo@postgresql.org?body=help>
|
|
List-Id: <pgsql-hackers.postgresql.org>
|
|
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
|
|
List-Post: <mailto:pgsql-hackers@postgresql.org>
|
|
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
|
|
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Content-Length: 888
|
|
|
|
On Thu, 2005-12-29 at 09:35 -0500, Rod Taylor wrote:
|
|
> > So, my thinking would be to separate things into two:
|
|
> > a) Add a TODO item "shared temp tables" that caters for (1) and (4)
|
|
> >
|
|
> > ALTER TABLE name RELIABILITY
|
|
> > {DELETE ROWS AT RECOVERY | FULL RECOVERY}
|
|
> > (syntax TBD)
|
|
>
|
|
> DELETE ROWS AT RECOVERY would need to be careful or disallowed when
|
|
> referenced via a foreign key to ensure the database is not restored in
|
|
> an inconsistent state.
|
|
|
|
I think we'd need to apply the same rule as we do for temp tables: they
|
|
cannot be referenced by a permanent table.
|
|
|
|
There are possibly some other restrictions also. Anyone?
|
|
|
|
Best Regards, Simon Riggs
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 9: In versions below 8.0, the planner will ignore your desire to
|
|
choose an index scan if your joining column's datatypes do not
|
|
match
|
|
|
|
From tgl@sss.pgh.pa.us Thu Dec 29 11:12:13 2005
|
|
To: Simon Riggs <simon@2ndquadrant.com>
|
|
cc: Bruce Momjian <pgman@candle.pha.pa.us>,
|
|
Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>,
|
|
Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
|
|
pgsql-hackers@postgresql.org
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
In-Reply-To: <1135862385.2964.804.camel@localhost.localdomain>
|
|
References: <200512290158.jBT1wEK28785@candle.pha.pa.us> <1135862385.2964.804.camel@localhost.localdomain>
|
|
Comments: In-reply-to Simon Riggs <simon@2ndquadrant.com>
|
|
message dated "Thu, 29 Dec 2005 13:19:45 +0000"
|
|
Date: Thu, 29 Dec 2005 11:12:11 -0500
|
|
Message-ID: <7273.1135872731@sss.pgh.pa.us>
|
|
From: Tom Lane <tgl@sss.pgh.pa.us>
|
|
Content-Length: 1963
|
|
|
|
Simon Riggs <simon@2ndquadrant.com> writes:
|
|
> My view would be that this thread has been complex because everybody has
|
|
> expressed a somewhat different requirement, which could be broken down
|
|
> as:
|
|
> 1. The need for a multi-user-accessible yet temporary table
|
|
> 2. Loading data into a table immediately after it is created (i.e. in
|
|
> same transaction), including but not limited to a reload from pg_dump
|
|
> 3. How to load data quickly into an existing table (COPY)
|
|
> 4. How to add/modify data quickly in an existing table (INSERT SELECT,
|
|
> UPDATE)
|
|
|
|
> I can see the need for all of those individually; my existing patch
|
|
> submission covers (2) and (3) only. I very much like your thought to
|
|
> coalesce these various requirements into a single coherent model.
|
|
|
|
However, you then seem to be arguing for still using the COPY LOCK
|
|
syntax, which I think Bruce intended would go away in favor of using
|
|
these ALTER commands. Certainly that's what I'd prefer --- COPY has
|
|
got too darn many options already.
|
|
|
|
> In the patch, pg_dump has *not* been altered to use COPY LOCK, so a
|
|
> pg_dump *will* work with any other version of PostgreSQL, which *would
|
|
> not* be the case if we added ALTER TABLE ... RELIABILITY statements into
|
|
> it.
|
|
|
|
Wrong --- the good thing about ALTER TABLE is that an old version of
|
|
Postgres would simply reject it and keep going. Therefore we could get
|
|
the speedup in dumps without losing compatibility, which is not true
|
|
of COPY LOCK.
|
|
|
|
BTW, this is a perfect example of the use-case for not abandoning a
|
|
dump-file load simply because one command fails. (We have relied on
|
|
this sort of reasoning many times before, too, for example by using
|
|
"SET default_with_oids" in preference to CREATE TABLE WITH/WITHOUT OIDS.)
|
|
I don't think that "wrap the whole load into begin/end" is really a very
|
|
workable answer, because there are far too many scenarios where you
|
|
can't do that. Another one where it doesn't help is a data-only dump.
|
|
|
|
regards, tom lane
|
|
|
|
From pgsql-hackers-owner+M78028=pgman=candle.pha.pa.us@postgresql.org Thu Dec 29 11:12:41 2005
|
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
|
X-Greylist: from auto-whitelisted by SQLgrey-
|
|
To: Simon Riggs <simon@2ndquadrant.com>
|
|
cc: Bruce Momjian <pgman@candle.pha.pa.us>,
|
|
Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>,
|
|
Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
|
|
pgsql-hackers@postgresql.org
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
In-Reply-To: <1135862385.2964.804.camel@localhost.localdomain>
|
|
References: <200512290158.jBT1wEK28785@candle.pha.pa.us> <1135862385.2964.804.camel@localhost.localdomain>
|
|
Comments: In-reply-to Simon Riggs <simon@2ndquadrant.com>
|
|
message dated "Thu, 29 Dec 2005 13:19:45 +0000"
|
|
Date: Thu, 29 Dec 2005 11:12:11 -0500
|
|
Message-ID: <7273.1135872731@sss.pgh.pa.us>
|
|
From: Tom Lane <tgl@sss.pgh.pa.us>
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Status: No, score=0.053 required=5 tests=[AWL=0.053]
|
|
X-Spam-Score: 0.053
|
|
X-Spam-Level:
|
|
X-Mailing-List: pgsql-hackers
|
|
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
|
|
List-Help: <mailto:majordomo@postgresql.org?body=help>
|
|
List-Id: <pgsql-hackers.postgresql.org>
|
|
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
|
|
List-Post: <mailto:pgsql-hackers@postgresql.org>
|
|
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
|
|
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Content-Length: 2075
|
|
|
|
Simon Riggs <simon@2ndquadrant.com> writes:
|
|
> My view would be that this thread has been complex because everybody has
|
|
> expressed a somewhat different requirement, which could be broken down
|
|
> as:
|
|
> 1. The need for a multi-user-accessible yet temporary table
|
|
> 2. Loading data into a table immediately after it is created (i.e. in
|
|
> same transaction), including but not limited to a reload from pg_dump
|
|
> 3. How to load data quickly into an existing table (COPY)
|
|
> 4. How to add/modify data quickly in an existing table (INSERT SELECT,
|
|
> UPDATE)
|
|
|
|
> I can see the need for all of those individually; my existing patch
|
|
> submission covers (2) and (3) only. I very much like your thought to
|
|
> coalesce these various requirements into a single coherent model.
|
|
|
|
However, you then seem to be arguing for still using the COPY LOCK
|
|
syntax, which I think Bruce intended would go away in favor of using
|
|
these ALTER commands. Certainly that's what I'd prefer --- COPY has
|
|
got too darn many options already.
|
|
|
|
> In the patch, pg_dump has *not* been altered to use COPY LOCK, so a
|
|
> pg_dump *will* work with any other version of PostgreSQL, which *would
|
|
> not* be the case if we added ALTER TABLE ... RELIABILITY statements into
|
|
> it.
|
|
|
|
Wrong --- the good thing about ALTER TABLE is that an old version of
|
|
Postgres would simply reject it and keep going. Therefore we could get
|
|
the speedup in dumps without losing compatibility, which is not true
|
|
of COPY LOCK.
|
|
|
|
BTW, this is a perfect example of the use-case for not abandoning a
|
|
dump-file load simply because one command fails. (We have relied on
|
|
this sort of reasoning many times before, too, for example by using
|
|
"SET default_with_oids" in preference to CREATE TABLE WITH/WITHOUT OIDS.)
|
|
I don't think that "wrap the whole load into begin/end" is really a very
|
|
workable answer, because there are far too many scenarios where you
|
|
can't do that. Another one where it doesn't help is a data-only dump.
|
|
|
|
regards, tom lane
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 2: Don't 'kill -9' the postmaster
|
|
|
|
From pgsql-hackers-owner+M78025=pgman=candle.pha.pa.us@postgresql.org Thu Dec 29 10:57:46 2005
|
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
|
X-Greylist: from auto-whitelisted by SQLgrey-
|
|
Message-ID: <51082.68.143.134.146.1135872877.squirrel@www.dunslane.net>
|
|
Date: Thu, 29 Dec 2005 10:14:37 -0600 (CST)
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
From: Andrew Dunstan <andrew@dunslane.net>
|
|
To: <pgman@candle.pha.pa.us>
|
|
In-Reply-To: <200512290158.jBT1wEK28785@candle.pha.pa.us>
|
|
References: <200512290158.jBT1wEK28785@candle.pha.pa.us>
|
|
X-Priority: 3
|
|
Importance: Normal
|
|
X-MSMail-Priority: Normal
|
|
cc: <kleptog@svana.org>, <simon@2ndquadrant.com>, <tgl@sss.pgh.pa.us>,
|
|
<gsstark@mit.edu>, <pg@rbt.ca>, <zhouqq@cs.toronto.edu>,
|
|
<pgsql-hackers@postgresql.org>
|
|
X-Mailer: SquirrelMail (version 1.2.5)
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Status: No, score=0.082 required=5 tests=[AWL=0.082]
|
|
X-Spam-Score: 0.082
|
|
X-Spam-Level:
|
|
X-Mailing-List: pgsql-hackers
|
|
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
|
|
List-Help: <mailto:majordomo@postgresql.org?body=help>
|
|
List-Id: <pgsql-hackers.postgresql.org>
|
|
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
|
|
List-Post: <mailto:pgsql-hackers@postgresql.org>
|
|
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
|
|
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Content-Length: 1185
|
|
|
|
Bruce Momjian said:
|
|
> DROP would drop the table on a restart
|
|
> after a non-clean shutdown. It would do _no_ logging on the table and
|
|
> allow concurrent access, plus index access. DELETE is the same as
|
|
> DROP, but it just truncates the table (perhaps TRUNCATE is a better
|
|
> word).
|
|
>
|
|
> EXCLUSIVE would allow only a single session to modify the table, and
|
|
> would do all changes by appending to the table, similar to COPY LOCK.
|
|
> EXCLUSIVE would also not allow indexes because those can not be
|
|
> isolated like appending to the heap. EXCLUSIVE would write all dirty
|
|
> shared buffers for the table and fsync them before committing. SHARE
|
|
> is the functionality we have now, with full logging.
|
|
|
|
|
|
I an horribly scared that this will be used as a "performance boost" for
|
|
normal use. I would at least like to see some restrictions that make it
|
|
harder to mis-use. Perhaps restrict to superuser?
|
|
|
|
cheers
|
|
|
|
andrew
|
|
|
|
|
|
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 1: if posting/reading through Usenet, please send an appropriate
|
|
subscribe-nomail command to majordomo@postgresql.org so that your
|
|
message can get through to the mailing list cleanly
|
|
|
|
From tgl@sss.pgh.pa.us Thu Dec 29 11:24:30 2005
|
|
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
|
cc: Andrew Dunstan <andrew@dunslane.net>, kleptog@svana.org,
|
|
simon@2ndquadrant.com, gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu,
|
|
pgsql-hackers@postgresql.org
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
In-Reply-To: <200512291605.jBTG5gi00396@candle.pha.pa.us>
|
|
References: <200512291605.jBTG5gi00396@candle.pha.pa.us>
|
|
Comments: In-reply-to Bruce Momjian <pgman@candle.pha.pa.us>
|
|
message dated "Thu, 29 Dec 2005 11:05:42 -0500"
|
|
Date: Thu, 29 Dec 2005 11:24:28 -0500
|
|
Message-ID: <7966.1135873468@sss.pgh.pa.us>
|
|
From: Tom Lane <tgl@sss.pgh.pa.us>
|
|
Content-Length: 612
|
|
|
|
Bruce Momjian <pgman@candle.pha.pa.us> writes:
|
|
> Andrew Dunstan wrote:
|
|
>> I an horribly scared that this will be used as a "performance boost" for
|
|
>> normal use. I would at least like to see some restrictions that make it
|
|
>> harder to mis-use. Perhaps restrict to superuser?
|
|
|
|
> Certainly restrict to table owner.
|
|
|
|
I can see the argument for superuser-only: decisions about data
|
|
integrity tradeoffs should be reserved to the DBA, who is the one who
|
|
will get blamed if the database loses data, no matter how stupid his
|
|
users are.
|
|
|
|
But I'm not wedded to that. I could live with table-owner.
|
|
|
|
regards, tom lane
|
|
|
|
From pgsql-hackers-owner+M78031=pgman=candle.pha.pa.us@postgresql.org Thu Dec 29 11:38:17 2005
|
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
|
X-Greylist: from auto-whitelisted by SQLgrey-
|
|
From: Bruce Momjian <pgman@candle.pha.pa.us>
|
|
Message-ID: <200512291637.jBTGbdC03848@candle.pha.pa.us>
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
In-Reply-To: <7273.1135872731@sss.pgh.pa.us>
|
|
To: Tom Lane <tgl@sss.pgh.pa.us>
|
|
Date: Thu, 29 Dec 2005 11:37:39 -0500 (EST)
|
|
cc: Simon Riggs <simon@2ndquadrant.com>,
|
|
Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>,
|
|
Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
|
|
pgsql-hackers@postgresql.org
|
|
X-Mailer: ELM [version 2.4ME+ PL121 (25)]
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Status: No, score=0.122 required=5 tests=[AWL=0.122]
|
|
X-Spam-Score: 0.122
|
|
X-Spam-Level:
|
|
X-Mailing-List: pgsql-hackers
|
|
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
|
|
List-Help: <mailto:majordomo@postgresql.org?body=help>
|
|
List-Id: <pgsql-hackers.postgresql.org>
|
|
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
|
|
List-Post: <mailto:pgsql-hackers@postgresql.org>
|
|
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
|
|
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Content-Length: 3932
|
|
|
|
Tom Lane wrote:
|
|
> Simon Riggs <simon@2ndquadrant.com> writes:
|
|
> > My view would be that this thread has been complex because everybody has
|
|
> > expressed a somewhat different requirement, which could be broken down
|
|
> > as:
|
|
> > 1. The need for a multi-user-accessible yet temporary table
|
|
> > 2. Loading data into a table immediately after it is created (i.e. in
|
|
> > same transaction), including but not limited to a reload from pg_dump
|
|
> > 3. How to load data quickly into an existing table (COPY)
|
|
> > 4. How to add/modify data quickly in an existing table (INSERT SELECT,
|
|
> > UPDATE)
|
|
>
|
|
> > I can see the need for all of those individually; my existing patch
|
|
> > submission covers (2) and (3) only. I very much like your thought to
|
|
> > coalesce these various requirements into a single coherent model.
|
|
>
|
|
> However, you then seem to be arguing for still using the COPY LOCK
|
|
> syntax, which I think Bruce intended would go away in favor of using
|
|
> these ALTER commands. Certainly that's what I'd prefer --- COPY has
|
|
> got too darn many options already.
|
|
>
|
|
> > In the patch, pg_dump has *not* been altered to use COPY LOCK, so a
|
|
> > pg_dump *will* work with any other version of PostgreSQL, which *would
|
|
> > not* be the case if we added ALTER TABLE ... RELIABILITY statements into
|
|
> > it.
|
|
>
|
|
> Wrong --- the good thing about ALTER TABLE is that an old version of
|
|
> Postgres would simply reject it and keep going. Therefore we could get
|
|
> the speedup in dumps without losing compatibility, which is not true
|
|
> of COPY LOCK.
|
|
>
|
|
> BTW, this is a perfect example of the use-case for not abandoning a
|
|
> dump-file load simply because one command fails. (We have relied on
|
|
> this sort of reasoning many times before, too, for example by using
|
|
> "SET default_with_oids" in preference to CREATE TABLE WITH/WITHOUT OIDS.)
|
|
> I don't think that "wrap the whole load into begin/end" is really a very
|
|
> workable answer, because there are far too many scenarios where you
|
|
> can't do that. Another one where it doesn't help is a data-only dump.
|
|
|
|
Yep, Tom is echoing my reaction. There is a temptation to add things up
|
|
onto existing commands, e.g. LOCK, and while it works, it makes for some
|
|
very complex user API's. Having COPY behave differently because it is
|
|
in a transaction is fine as long as it is user-invisible, but once you
|
|
require users to do that to get the speedup, it isn't user-invisible
|
|
anymore.
|
|
|
|
(I can see it now, "Why is pg_dump putting things in transactions?",
|
|
"Because it prevents it from being logged." "Oh, should I be doing that
|
|
in my code?" "Perhaps, if you want ..." You can see where that
|
|
discussion is going. Having them see "ATER TABLE ... RELIBILITY
|
|
TRUNCATE" is very clear, and very clear on how it can be used in user
|
|
code.)
|
|
|
|
I think there is great utility in giving users one API, namely
|
|
RELIABILITY (or some other keyword), and telling them that is where they
|
|
control logging. I realize adding one keyword, LOCK, to an existing
|
|
command isn't a big deal, but once you decentralize your API enough
|
|
times, you end up with a terribly complex database system. It is this
|
|
design rigidity that helps make PostgreSQL so much easier to use than
|
|
other database systems.
|
|
|
|
I do think it is valid concern about someone use the table between the
|
|
CREATE and the ALTER TABLE RELIABILITY. One solution would be to allow
|
|
the RELIABILITY as part of the CREATE TABLE, another is to tell users to
|
|
create the table inside a transaction.
|
|
|
|
--
|
|
Bruce Momjian | http://candle.pha.pa.us
|
|
pgman@candle.pha.pa.us | (610) 359-1001
|
|
+ If your life is a hard drive, | 13 Roberts Road
|
|
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 9: In versions below 8.0, the planner will ignore your desire to
|
|
choose an index scan if your joining column's datatypes do not
|
|
match
|
|
|
|
From pgsql-hackers-owner+M78036=pgman=candle.pha.pa.us@postgresql.org Thu Dec 29 12:21:12 2005
|
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
|
X-Greylist: from auto-whitelisted by SQLgrey-
|
|
To: Andrew Dunstan <andrew@dunslane.net>
|
|
cc: <pgman@candle.pha.pa.us>, <kleptog@svana.org>, <simon@2ndquadrant.com>,
|
|
<tgl@sss.pgh.pa.us>, <gsstark@mit.edu>, <pg@rbt.ca>,
|
|
<zhouqq@cs.toronto.edu>, <pgsql-hackers@postgresql.org>
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
References: <200512290158.jBT1wEK28785@candle.pha.pa.us>
|
|
<51082.68.143.134.146.1135872877.squirrel@www.dunslane.net>
|
|
In-Reply-To: <51082.68.143.134.146.1135872877.squirrel@www.dunslane.net>
|
|
From: Greg Stark <gsstark@mit.edu>
|
|
Organization: The Emacs Conspiracy; member since 1992
|
|
Date: 29 Dec 2005 12:20:32 -0500
|
|
Message-ID: <87vex74y73.fsf@stark.xeocode.com>
|
|
Lines: 42
|
|
User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.4
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Status: No, score=0.112 required=5 tests=[AWL=0.112]
|
|
X-Spam-Score: 0.112
|
|
X-Spam-Level:
|
|
X-Mailing-List: pgsql-hackers
|
|
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
|
|
List-Help: <mailto:majordomo@postgresql.org?body=help>
|
|
List-Id: <pgsql-hackers.postgresql.org>
|
|
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
|
|
List-Post: <mailto:pgsql-hackers@postgresql.org>
|
|
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
|
|
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Content-Length: 1983
|
|
|
|
"Andrew Dunstan" <andrew@dunslane.net> writes:
|
|
|
|
> Bruce Momjian said:
|
|
> > DROP would drop the table on a restart
|
|
> > after a non-clean shutdown. It would do _no_ logging on the table and
|
|
> > allow concurrent access, plus index access. DELETE is the same as
|
|
> > DROP, but it just truncates the table (perhaps TRUNCATE is a better
|
|
> > word).
|
|
> >
|
|
> > EXCLUSIVE would allow only a single session to modify the table, and
|
|
> > would do all changes by appending to the table, similar to COPY LOCK.
|
|
> > EXCLUSIVE would also not allow indexes because those can not be
|
|
> > isolated like appending to the heap. EXCLUSIVE would write all dirty
|
|
> > shared buffers for the table and fsync them before committing. SHARE
|
|
> > is the functionality we have now, with full logging.
|
|
>
|
|
> I an horribly scared that this will be used as a "performance boost" for
|
|
> normal use. I would at least like to see some restrictions that make it
|
|
> harder to mis-use. Perhaps restrict to superuser?
|
|
|
|
Well that's its whole purpose. At least you can hardly argue that you didn't
|
|
realize the consequences of "DELETE ROWS ON RECOVERY"... :)
|
|
|
|
Some thoughts:
|
|
|
|
a) I'm not sure I understand the purpose of EXCLUSIVE. When would I ever want to
|
|
use it instead of DELETE ROWS?
|
|
|
|
b) It seems like the other feature people were talking about of not logging
|
|
for a table created within the same transaction should be handled by
|
|
having this flag implicitly set for any such newly created table.
|
|
Ie, the test for whether to log would look like:
|
|
|
|
if (!table->logged && table->xid != myxid) ...
|
|
|
|
c) Every option in ALTER TABLE should be in CREATE TABLE as well.
|
|
|
|
d) Yes as someone else mentioned, this should only be allowable on a table
|
|
with no foreign keys referencing it.
|
|
|
|
--
|
|
greg
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 9: In versions below 8.0, the planner will ignore your desire to
|
|
choose an index scan if your joining column's datatypes do not
|
|
match
|
|
|
|
From pgsql-hackers-owner+M78037=pgman=candle.pha.pa.us@postgresql.org Thu Dec 29 12:31:40 2005
|
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
|
X-Greylist: from auto-whitelisted by SQLgrey-
|
|
From: Bruce Momjian <pgman@candle.pha.pa.us>
|
|
Message-ID: <200512291730.jBTHUnn09840@candle.pha.pa.us>
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
In-Reply-To: <87vex74y73.fsf@stark.xeocode.com>
|
|
To: Greg Stark <gsstark@mit.edu>
|
|
Date: Thu, 29 Dec 2005 12:30:49 -0500 (EST)
|
|
cc: Andrew Dunstan <andrew@dunslane.net>, kleptog@svana.org,
|
|
simon@2ndquadrant.com, tgl@sss.pgh.pa.us, pg@rbt.ca, zhouqq@cs.toronto.edu,
|
|
pgsql-hackers@postgresql.org
|
|
X-Mailer: ELM [version 2.4ME+ PL121 (25)]
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Status: No, score=0.122 required=5 tests=[AWL=0.122]
|
|
X-Spam-Score: 0.122
|
|
X-Spam-Level:
|
|
X-Mailing-List: pgsql-hackers
|
|
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
|
|
List-Help: <mailto:majordomo@postgresql.org?body=help>
|
|
List-Id: <pgsql-hackers.postgresql.org>
|
|
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
|
|
List-Post: <mailto:pgsql-hackers@postgresql.org>
|
|
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
|
|
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Content-Length: 3304
|
|
|
|
Greg Stark wrote:
|
|
> "Andrew Dunstan" <andrew@dunslane.net> writes:
|
|
>
|
|
> > Bruce Momjian said:
|
|
> > > DROP would drop the table on a restart
|
|
> > > after a non-clean shutdown. It would do _no_ logging on the table and
|
|
> > > allow concurrent access, plus index access. DELETE is the same as
|
|
> > > DROP, but it just truncates the table (perhaps TRUNCATE is a better
|
|
> > > word).
|
|
> > >
|
|
> > > EXCLUSIVE would allow only a single session to modify the table, and
|
|
> > > would do all changes by appending to the table, similar to COPY LOCK.
|
|
> > > EXCLUSIVE would also not allow indexes because those can not be
|
|
> > > isolated like appending to the heap. EXCLUSIVE would write all dirty
|
|
> > > shared buffers for the table and fsync them before committing. SHARE
|
|
> > > is the functionality we have now, with full logging.
|
|
> >
|
|
> > I an horribly scared that this will be used as a "performance boost" for
|
|
> > normal use. I would at least like to see some restrictions that make it
|
|
> > harder to mis-use. Perhaps restrict to superuser?
|
|
>
|
|
> Well that's its whole purpose. At least you can hardly argue that you didn't
|
|
> realize the consequences of "DELETE ROWS ON RECOVERY"... :)
|
|
|
|
True. I think we are worried about non-owners using it, but the owner
|
|
had to grant permissions for others to modify it, so we might be OK.
|
|
|
|
> Some thoughts:
|
|
>
|
|
> a) I'm not sure I understand the purpose of EXCLUSIVE. When would I ever want to
|
|
> use it instead of DELETE ROWS?
|
|
|
|
Good question. The use case is doing COPY into a table that already had
|
|
data. EXCLUSIVE allows additions to the table but preserves the
|
|
existing data on a crash.
|
|
|
|
> b) It seems like the other feature people were talking about of not logging
|
|
> for a table created within the same transaction should be handled by
|
|
> having this flag implicitly set for any such newly created table.
|
|
> Ie, the test for whether to log would look like:
|
|
>
|
|
> if (!table->logged && table->xid != myxid) ...
|
|
|
|
Yes, the question is whether we want to limit users to having this
|
|
optimization _only_ when they have created the table in the same
|
|
transaction, and the short answer is we don't.
|
|
|
|
> c) Every option in ALTER TABLE should be in CREATE TABLE as well.
|
|
|
|
I looked into that and see that things like:
|
|
|
|
ALTER [ COLUMN ] column SET STATISTICS integer
|
|
ALTER [ COLUMN ] column SET STORAGE { PLAIN | EXTERNAL | EXTENDED | MAIN }
|
|
|
|
are not supported by CREATE TABLE, and probably shouldn't be because the
|
|
value can be changed after the table is created. I think the only
|
|
things we usually support in CREATE TABLE are those that cannot be
|
|
altered.
|
|
|
|
> d) Yes as someone else mentioned, this should only be allowable on a table
|
|
> with no foreign keys referencing it.
|
|
|
|
Right, and EXCLUSIVE can not have an index either.
|
|
|
|
--
|
|
Bruce Momjian | http://candle.pha.pa.us
|
|
pgman@candle.pha.pa.us | (610) 359-1001
|
|
+ If your life is a hard drive, | 13 Roberts Road
|
|
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 1: if posting/reading through Usenet, please send an appropriate
|
|
subscribe-nomail command to majordomo@postgresql.org so that your
|
|
message can get through to the mailing list cleanly
|
|
|
|
From simon@2ndquadrant.com Fri Dec 30 08:10:53 2005
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
From: Simon Riggs <simon@2ndquadrant.com>
|
|
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
|
cc: Andrew Dunstan <andrew@dunslane.net>, Tom Lane <tgl@sss.pgh.pa.us>,
|
|
Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>,
|
|
Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
|
|
pgsql-hackers@postgresql.org
|
|
In-Reply-To: <200512291637.jBTGbdC03848@candle.pha.pa.us>
|
|
References: <200512291637.jBTGbdC03848@candle.pha.pa.us>
|
|
Date: Fri, 30 Dec 2005 13:09:12 +0000
|
|
Message-ID: <1135948152.2862.113.camel@localhost.localdomain>
|
|
X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4)
|
|
Content-Length: 6343
|
|
|
|
On Thu, 2005-12-29 at 11:37 -0500, Bruce Momjian wrote:
|
|
> Tom Lane wrote:
|
|
> > Simon Riggs <simon@2ndquadrant.com> writes:
|
|
> > > My view would be that this thread has been complex because everybody has
|
|
> > > expressed a somewhat different requirement, which could be broken down
|
|
> > > as:
|
|
> > > 1. The need for a multi-user-accessible yet temporary table
|
|
> > > 2. Loading data into a table immediately after it is created (i.e. in
|
|
> > > same transaction), including but not limited to a reload from pg_dump
|
|
> > > 3. How to load data quickly into an existing table (COPY)
|
|
> > > 4. How to add/modify data quickly in an existing table (INSERT SELECT,
|
|
> > > UPDATE)
|
|
|
|
> > However, you then seem to be arguing for still using the COPY LOCK
|
|
> > syntax, which I think Bruce intended would go away in favor of using
|
|
> > these ALTER commands. Certainly that's what I'd prefer --- COPY has
|
|
> > got too darn many options already.
|
|
|
|
COPY LOCK was Tom's suggestion at the end of a long discussion thread on
|
|
this precise issue. Nobody objected to it at that point; I implemented
|
|
it *exactly* that way because I wanted to very visibly follow the
|
|
consensus of the community, after informed debate.
|
|
http://archives.postgresql.org/pgsql-hackers/2005-06/msg00068.php
|
|
|
|
Please re-read the links to previous discussions.
|
|
http://archives.postgresql.org/pgsql-hackers/2005-06/msg00069.php
|
|
There are points there, not made by me, that still apply and need to be
|
|
considered here, yet have not been.
|
|
|
|
Just to restate my current thinking:
|
|
- agree we should have ALTER TABLE ... RELIABILITY DELETE ROWS
|
|
- we should have COPY LOCK rather than
|
|
ALTER TABLE .... RELIABILITY EXCLUSIVE
|
|
(Though I welcome better wording and syntax in either case; it is the
|
|
behaviour only that I discuss).
|
|
|
|
It seems now that we have agreed approaches for (1), (2) and (4). Please
|
|
note that I have listened to the needs of others with regard to
|
|
requirement (1), as espoused by earlier by Hannu and again now by
|
|
Martijn. Some of the points about requirement (3) I made in my previous
|
|
post have not yet been addressed, IMHO.
|
|
|
|
My mind is not fixed. AFAICS there are valid points remaining on both
|
|
sides of the discussion about loading data quickly into an existing
|
|
table.
|
|
|
|
> I do think it is valid concern about someone use the table between the
|
|
> CREATE and the ALTER TABLE RELIABILITY. One solution would be to allow
|
|
> the RELIABILITY as part of the CREATE TABLE, another is to tell users to
|
|
> create the table inside a transaction.
|
|
|
|
Neither solution works for this use case:
|
|
|
|
> > 3. How to load data quickly into an existing table (COPY)
|
|
|
|
This is the only use case for which ALTER TABLE ... EXCLUSIVE makes
|
|
sense. That option means that any write lock held upon the table would
|
|
be an EXCLUSIVE table lock, so would never be a performance gain with
|
|
single row INSERT, UPDATE or DELETEs.
|
|
|
|
Following Andrew's concerns, I'd also note that ALTER TABLE requires a
|
|
much higher level of privilege to operate than does COPY. That sounds
|
|
like it will make things more secure, but all it does is open up the
|
|
administrative rights, since full ownership rights must be obtained
|
|
merely to load data.
|
|
|
|
> Having COPY behave differently because it is
|
|
> in a transaction is fine as long as it is user-invisible
|
|
|
|
Good
|
|
|
|
> I think there is great utility in giving users one API, namely
|
|
> RELIABILITY (or some other keyword), and telling them that is where they
|
|
> control logging. I realize adding one keyword, LOCK, to an existing
|
|
> command isn't a big deal, but once you decentralize your API enough
|
|
> times, you end up with a terribly complex database system. It is this
|
|
> design rigidity that helps make PostgreSQL so much easier to use than
|
|
> other database systems.
|
|
|
|
I do see the appeal of your suggestion...
|
|
|
|
TRUNCATE is a special command to delete quickly. There is no requirement
|
|
to do an ALTER TABLE statement before that command executes.
|
|
|
|
Balance would suggest that a special command to load data quickly would
|
|
be reasonably accepted by users.
|
|
|
|
|
|
|
|
|
|
Minor points below:
|
|
|
|
> > > In the patch, pg_dump has *not* been altered to use COPY LOCK, so a
|
|
> > > pg_dump *will* work with any other version of PostgreSQL, which *would
|
|
> > > not* be the case if we added ALTER TABLE ... RELIABILITY statements into
|
|
> > > it.
|
|
> >
|
|
> > Wrong --- the good thing about ALTER TABLE is that an old version of
|
|
> > Postgres would simply reject it and keep going. Therefore we could get
|
|
> > the speedup in dumps without losing compatibility, which is not true
|
|
> > of COPY LOCK.
|
|
|
|
That was pointing out one of Bruce's objections was not relevant because
|
|
it assumed COPY LOCK was required to make pg_restore go faster; that was
|
|
not the case - so there is no valid objection either way now.
|
|
|
|
> > BTW, this is a perfect example of the use-case for not abandoning a
|
|
> > dump-file load simply because one command fails. (We have relied on
|
|
> > this sort of reasoning many times before, too, for example by using
|
|
> > "SET default_with_oids" in preference to CREATE TABLE WITH/WITHOUT OIDS.)
|
|
> > I don't think that "wrap the whole load into begin/end" is really a very
|
|
> > workable answer, because there are far too many scenarios where you
|
|
> > can't do that. Another one where it doesn't help is a data-only dump.
|
|
|
|
Which is why --single-transaction is not the default, per the earlier
|
|
discussion on that point (on -patches).
|
|
|
|
> Yep, Tom is echoing my reaction. There is a temptation to add things up
|
|
> onto existing commands, e.g. LOCK, and while it works, it makes for some
|
|
> very complex user API's. Having COPY behave differently because it is
|
|
> in a transaction is fine as long as it is user-invisible, but once you
|
|
> require users to do that to get the speedup, it isn't user-invisible
|
|
> anymore.
|
|
>
|
|
> (I can see it now, "Why is pg_dump putting things in transactions?",
|
|
> "Because it prevents it from being logged." "Oh, should I be doing that
|
|
> in my code?" "Perhaps, if you want ..." You can see where that
|
|
> discussion is going. Having them see "ATER TABLE ... RELIBILITY
|
|
> TRUNCATE" is very clear, and very clear on how it can be used in user
|
|
> code.)
|
|
|
|
The above case is not an argument against COPY LOCK. Exactly what you
|
|
say above would still occur even when we have ALTER TABLE ...
|
|
RELIABILITY statement, since COPY LOCK and
|
|
COPY-optimized-within-same-transaction are different things.
|
|
|
|
Best Regards, Simon Riggs
|
|
|
|
From pgsql-hackers-owner+M78064=pgman=candle.pha.pa.us@postgresql.org Fri Dec 30 11:50:49 2005
|
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
|
X-Greylist: from auto-whitelisted by SQLgrey-
|
|
From: Bruce Momjian <pgman@candle.pha.pa.us>
|
|
Message-ID: <200512301649.jBUGnxn21488@candle.pha.pa.us>
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
In-Reply-To: <1135948152.2862.113.camel@localhost.localdomain>
|
|
To: Simon Riggs <simon@2ndquadrant.com>
|
|
Date: Fri, 30 Dec 2005 11:49:59 -0500 (EST)
|
|
cc: Andrew Dunstan <andrew@dunslane.net>, Tom Lane <tgl@sss.pgh.pa.us>,
|
|
Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>,
|
|
Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
|
|
pgsql-hackers@postgresql.org
|
|
X-Mailer: ELM [version 2.4ME+ PL121 (25)]
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Status: No, score=0.12 required=5 tests=[AWL=0.120]
|
|
X-Spam-Score: 0.12
|
|
X-Spam-Level:
|
|
X-Mailing-List: pgsql-hackers
|
|
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
|
|
List-Help: <mailto:majordomo@postgresql.org?body=help>
|
|
List-Id: <pgsql-hackers.postgresql.org>
|
|
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
|
|
List-Post: <mailto:pgsql-hackers@postgresql.org>
|
|
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
|
|
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Content-Length: 8888
|
|
|
|
Simon Riggs wrote:
|
|
> On Thu, 2005-12-29 at 11:37 -0500, Bruce Momjian wrote:
|
|
> > Tom Lane wrote:
|
|
> > > Simon Riggs <simon@2ndquadrant.com> writes:
|
|
> > > > My view would be that this thread has been complex because everybody has
|
|
> > > > expressed a somewhat different requirement, which could be broken down
|
|
> > > > as:
|
|
> > > > 1. The need for a multi-user-accessible yet temporary table
|
|
> > > > 2. Loading data into a table immediately after it is created (i.e. in
|
|
> > > > same transaction), including but not limited to a reload from pg_dump
|
|
> > > > 3. How to load data quickly into an existing table (COPY)
|
|
> > > > 4. How to add/modify data quickly in an existing table (INSERT SELECT,
|
|
> > > > UPDATE)
|
|
>
|
|
> > > However, you then seem to be arguing for still using the COPY LOCK
|
|
> > > syntax, which I think Bruce intended would go away in favor of using
|
|
> > > these ALTER commands. Certainly that's what I'd prefer --- COPY has
|
|
> > > got too darn many options already.
|
|
>
|
|
> COPY LOCK was Tom's suggestion at the end of a long discussion thread on
|
|
> this precise issue. Nobody objected to it at that point; I implemented
|
|
> it *exactly* that way because I wanted to very visibly follow the
|
|
> consensus of the community, after informed debate.
|
|
> http://archives.postgresql.org/pgsql-hackers/2005-06/msg00068.php
|
|
>
|
|
> Please re-read the links to previous discussions.
|
|
> http://archives.postgresql.org/pgsql-hackers/2005-06/msg00069.php
|
|
> There are points there, not made by me, that still apply and need to be
|
|
> considered here, yet have not been.
|
|
|
|
Yes, I know we agreed to the COPY LOCK, but new features now being
|
|
requested, so we have to re-evaluate where we are going with COPY LOCK
|
|
to get a more consistent solution.
|
|
|
|
> Just to restate my current thinking:
|
|
> - agree we should have ALTER TABLE ... RELIABILITY DELETE ROWS
|
|
> - we should have COPY LOCK rather than
|
|
> ALTER TABLE .... RELIABILITY EXCLUSIVE
|
|
> (Though I welcome better wording and syntax in either case; it is the
|
|
> behaviour only that I discuss).
|
|
>
|
|
> It seems now that we have agreed approaches for (1), (2) and (4). Please
|
|
> note that I have listened to the needs of others with regard to
|
|
> requirement (1), as espoused by earlier by Hannu and again now by
|
|
> Martijn. Some of the points about requirement (3) I made in my previous
|
|
> post have not yet been addressed, IMHO.
|
|
>
|
|
> My mind is not fixed. AFAICS there are valid points remaining on both
|
|
> sides of the discussion about loading data quickly into an existing
|
|
> table.
|
|
>
|
|
> > I do think it is valid concern about someone use the table between the
|
|
> > CREATE and the ALTER TABLE RELIABILITY. One solution would be to allow
|
|
> > the RELIABILITY as part of the CREATE TABLE, another is to tell users to
|
|
> > create the table inside a transaction.
|
|
>
|
|
> Neither solution works for this use case:
|
|
>
|
|
> > > 3. How to load data quickly into an existing table (COPY)
|
|
>
|
|
> This is the only use case for which ALTER TABLE ... EXCLUSIVE makes
|
|
> sense. That option means that any write lock held upon the table would
|
|
> be an EXCLUSIVE table lock, so would never be a performance gain with
|
|
> single row INSERT, UPDATE or DELETEs.
|
|
|
|
Ah, but people wanted fast INSERT INTO ... SELECT, and that would use
|
|
EXCLUSIVE too. What about a massive UPDATE? Perhaps that could use
|
|
EXCLUSIVE? We don't want to add "LOCK" to every command that might use
|
|
EXCLUSIVE. ALTER is much better for this.
|
|
|
|
I agree if we thought EXCLUSIVE would only be used for COPY, we could
|
|
use LOCK, but I am thinking it will be used for other commands as well.
|
|
|
|
> Following Andrew's concerns, I'd also note that ALTER TABLE requires a
|
|
> much higher level of privilege to operate than does COPY. That sounds
|
|
> like it will make things more secure, but all it does is open up the
|
|
> administrative rights, since full ownership rights must be obtained
|
|
> merely to load data.
|
|
|
|
True, but as pointed out by others, I don't see that happening too
|
|
often.
|
|
|
|
> > Having COPY behave differently because it is
|
|
> > in a transaction is fine as long as it is user-invisible
|
|
>
|
|
> Good
|
|
>
|
|
> > I think there is great utility in giving users one API, namely
|
|
> > RELIABILITY (or some other keyword), and telling them that is where they
|
|
> > control logging. I realize adding one keyword, LOCK, to an existing
|
|
> > command isn't a big deal, but once you decentralize your API enough
|
|
> > times, you end up with a terribly complex database system. It is this
|
|
> > design rigidity that helps make PostgreSQL so much easier to use than
|
|
> > other database systems.
|
|
>
|
|
> I do see the appeal of your suggestion...
|
|
>
|
|
> TRUNCATE is a special command to delete quickly. There is no requirement
|
|
> to do an ALTER TABLE statement before that command executes.
|
|
|
|
The TRUNCATE happens during recovery. There is no user interaction. It
|
|
happens because we can't restore the contents of the table in a
|
|
consistent state because no logging was used. Basically, a table marked
|
|
RELIABILITY TRUNCATE would be truncated on a recovery start of the
|
|
postmaster.
|
|
|
|
> Balance would suggest that a special command to load data quickly would
|
|
> be reasonably accepted by users.
|
|
>
|
|
>
|
|
>
|
|
>
|
|
> Minor points below:
|
|
>
|
|
> > > > In the patch, pg_dump has *not* been altered to use COPY LOCK, so a
|
|
> > > > pg_dump *will* work with any other version of PostgreSQL, which *would
|
|
> > > > not* be the case if we added ALTER TABLE ... RELIABILITY statements into
|
|
> > > > it.
|
|
> > >
|
|
> > > Wrong --- the good thing about ALTER TABLE is that an old version of
|
|
> > > Postgres would simply reject it and keep going. Therefore we could get
|
|
> > > the speedup in dumps without losing compatibility, which is not true
|
|
> > > of COPY LOCK.
|
|
>
|
|
> That was pointing out one of Bruce's objections was not relevant because
|
|
> it assumed COPY LOCK was required to make pg_restore go faster; that was
|
|
> not the case - so there is no valid objection either way now.
|
|
|
|
I don't consider the single-transaction to be a no-cost solution. You
|
|
are adding flags to commands, and you are using a dump layout for
|
|
performance where the purpose for the layout is not clear. The ALTER is
|
|
clear to the user, and it allows nologging operations to happen after
|
|
the table is created.
|
|
|
|
In fact, for use in pg_dump, I think DROP is the proper operation for
|
|
loading, not your transaction wrapping solution. We already agree we
|
|
need DROP (or TRUNCATE), so why not use that rather than the transaction
|
|
wrap idea?
|
|
|
|
> > > BTW, this is a perfect example of the use-case for not abandoning a
|
|
> > > dump-file load simply because one command fails. (We have relied on
|
|
> > > this sort of reasoning many times before, too, for example by using
|
|
> > > "SET default_with_oids" in preference to CREATE TABLE WITH/WITHOUT OIDS.)
|
|
> > > I don't think that "wrap the whole load into begin/end" is really a very
|
|
> > > workable answer, because there are far too many scenarios where you
|
|
> > > can't do that. Another one where it doesn't help is a data-only dump.
|
|
>
|
|
> Which is why --single-transaction is not the default, per the earlier
|
|
> discussion on that point (on -patches).
|
|
|
|
Right, but why not use DROP/TRUNCATE? That works for old dumps too, and
|
|
has no downsides, meaning it can be always on.
|
|
|
|
> > Yep, Tom is echoing my reaction. There is a temptation to add things up
|
|
> > onto existing commands, e.g. LOCK, and while it works, it makes for some
|
|
> > very complex user API's. Having COPY behave differently because it is
|
|
> > in a transaction is fine as long as it is user-invisible, but once you
|
|
> > require users to do that to get the speedup, it isn't user-invisible
|
|
> > anymore.
|
|
> >
|
|
> > (I can see it now, "Why is pg_dump putting things in transactions?",
|
|
> > "Because it prevents it from being logged." "Oh, should I be doing that
|
|
> > in my code?" "Perhaps, if you want ..." You can see where that
|
|
> > discussion is going. Having them see "ATER TABLE ... RELIBILITY
|
|
> > TRUNCATE" is very clear, and very clear on how it can be used in user
|
|
> > code.)
|
|
>
|
|
> The above case is not an argument against COPY LOCK. Exactly what you
|
|
> say above would still occur even when we have ALTER TABLE ...
|
|
> RELIABILITY statement, since COPY LOCK and
|
|
> COPY-optimized-within-same-transaction are different things.
|
|
|
|
See my posting above that we might want EXCLUSIVE for other commands,
|
|
meaning ALTER makes more sense.
|
|
|
|
So, to summarize, I think we should add DROP/TRUNCATE, and use that by
|
|
default (or optionally off?) in pg_dump, and, assuming we want EXCLUSIVE
|
|
for more than just COPY, we need to add ALTER TABLE EXCLUSIVE.
|
|
|
|
--
|
|
Bruce Momjian | http://candle.pha.pa.us
|
|
pgman@candle.pha.pa.us | (610) 359-1001
|
|
+ If your life is a hard drive, | 13 Roberts Road
|
|
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 5: don't forget to increase your free space map settings
|
|
|
|
From pgsql-hackers-owner+M78065=pgman=candle.pha.pa.us@postgresql.org Fri Dec 30 12:40:48 2005
|
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
|
X-Greylist: from auto-whitelisted by SQLgrey-
|
|
Message-ID: <43B570C9.6060406@dunslane.net>
|
|
Date: Fri, 30 Dec 2005 12:39:21 -0500
|
|
From: Andrew Dunstan <andrew@dunslane.net>
|
|
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050922 Fedora/1.7.12-1.3.1
|
|
X-Accept-Language: en-us, en
|
|
To: Tom Lane <tgl@sss.pgh.pa.us>
|
|
cc: simon@2ndquadrant.com, pgman@candle.pha.pa.us, kleptog@svana.org,
|
|
gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu,
|
|
pgsql-hackers@postgresql.org
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
References: <1135948152.2862.113.camel@localhost.localdomain> <56737.68.143.134.146.1135954413.squirrel@www.dunslane.net> <11876.1135954626@sss.pgh.pa.us>
|
|
In-Reply-To: <11876.1135954626@sss.pgh.pa.us>
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Status: No, score=0.041 required=5 tests=[AWL=0.041]
|
|
X-Spam-Score: 0.041
|
|
X-Spam-Level:
|
|
X-Mailing-List: pgsql-hackers
|
|
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
|
|
List-Help: <mailto:majordomo@postgresql.org?body=help>
|
|
List-Id: <pgsql-hackers.postgresql.org>
|
|
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
|
|
List-Post: <mailto:pgsql-hackers@postgresql.org>
|
|
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
|
|
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Content-Length: 1815
|
|
|
|
|
|
|
|
Tom Lane wrote:
|
|
|
|
>"Andrew Dunstan" <andrew@dunslane.net> writes:
|
|
>
|
|
>
|
|
>>Simon Riggs said:
|
|
>>
|
|
>>
|
|
>>>Following Andrew's concerns, I'd also note that ALTER TABLE requires a
|
|
>>>much higher level of privilege to operate than does COPY. That sounds
|
|
>>>like it will make things more secure, but all it does is open up the
|
|
>>>administrative rights, since full ownership rights must be obtained
|
|
>>>merely to load data.
|
|
>>>
|
|
>>>
|
|
>
|
|
>
|
|
>
|
|
>>My concern is more about making plain that this is for special operations,
|
|
>>not normal operations. Or maybe I have misunderstood the purpose.
|
|
>>
|
|
>>
|
|
>
|
|
>Rephrase that as "full ownership rights must be obtained to load data in
|
|
>a way that requires dropping any existing indexes and locking out other
|
|
>users of the table". I don't think the use-case for this will be very
|
|
>large for non-owners, or indeed even for owners except during initial
|
|
>table creation; and so I don't think the above argument is strong.
|
|
>
|
|
>
|
|
>
|
|
>
|
|
|
|
Those restrictions aren't true of Bruce's proposed drop and
|
|
delete/truncate recovery modes, are they?
|
|
|
|
People do crazy things in pursuit of performance. Illustration: a few
|
|
months ago I was instrumenting an app (based on MySQL/ISAM) and I
|
|
noticed that under load it simply didn't update the inventory properly -
|
|
of 1000 orders placed within a few seconds it might reduce inventory by
|
|
3 or 4. I reported this and they shrugged their shoulders and said
|
|
"well, we'd have to lock the table and that would slow everything down
|
|
...".
|
|
|
|
I just want to be sure we aren't providing a footgun. "Oh, just set
|
|
recovery mode to delete. It won't make any difference unless you crash
|
|
and you'll run faster."
|
|
|
|
cheers
|
|
|
|
andrew
|
|
|
|
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 5: don't forget to increase your free space map settings
|
|
|
|
From pgsql-hackers-owner+M78066=pgman=candle.pha.pa.us@postgresql.org Fri Dec 30 12:58:52 2005
|
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
|
X-Greylist: from auto-whitelisted by SQLgrey-
|
|
From: Bruce Momjian <pgman@candle.pha.pa.us>
|
|
Message-ID: <200512301758.jBUHwFv03107@candle.pha.pa.us>
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
In-Reply-To: <43B570C9.6060406@dunslane.net>
|
|
To: Andrew Dunstan <andrew@dunslane.net>
|
|
Date: Fri, 30 Dec 2005 12:58:15 -0500 (EST)
|
|
cc: Tom Lane <tgl@sss.pgh.pa.us>, simon@2ndquadrant.com, kleptog@svana.org,
|
|
gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu,
|
|
pgsql-hackers@postgresql.org
|
|
X-Mailer: ELM [version 2.4ME+ PL121 (25)]
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Status: No, score=0.12 required=5 tests=[AWL=0.120]
|
|
X-Spam-Score: 0.12
|
|
X-Spam-Level:
|
|
X-Mailing-List: pgsql-hackers
|
|
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
|
|
List-Help: <mailto:majordomo@postgresql.org?body=help>
|
|
List-Id: <pgsql-hackers.postgresql.org>
|
|
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
|
|
List-Post: <mailto:pgsql-hackers@postgresql.org>
|
|
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
|
|
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Content-Length: 1996
|
|
|
|
Andrew Dunstan wrote:
|
|
> >>My concern is more about making plain that this is for special operations,
|
|
> >>not normal operations. Or maybe I have misunderstood the purpose.
|
|
> >>
|
|
> >>
|
|
> >
|
|
> >Rephrase that as "full ownership rights must be obtained to load data in
|
|
> >a way that requires dropping any existing indexes and locking out other
|
|
> >users of the table". I don't think the use-case for this will be very
|
|
> >large for non-owners, or indeed even for owners except during initial
|
|
> >table creation; and so I don't think the above argument is strong.
|
|
> >
|
|
> >
|
|
> >
|
|
> >
|
|
>
|
|
> Those restrictions aren't true of Bruce's proposed drop and
|
|
> delete/truncate recovery modes, are they?
|
|
|
|
Only the owner could do the ALTER, for sure, but once the owner sets it,
|
|
any user with permission to write to the table would have those
|
|
characteristics.
|
|
|
|
> People do crazy things in pursuit of performance. Illustration: a few
|
|
> months ago I was instrumenting an app (based on MySQL/ISAM) and I
|
|
> noticed that under load it simply didn't update the inventory properly -
|
|
> of 1000 orders placed within a few seconds it might reduce inventory by
|
|
> 3 or 4. I reported this and they shrugged their shoulders and said
|
|
> "well, we'd have to lock the table and that would slow everything down
|
|
> ...".
|
|
>
|
|
> I just want to be sure we aren't providing a footgun. "Oh, just set
|
|
> recovery mode to delete. It won't make any difference unless you crash
|
|
> and you'll run faster."
|
|
|
|
I think we have to trust the object owner in this case. I don't know of
|
|
any super-user-only ALTER commands, but I suppose we could set it up
|
|
that way if we wanted.
|
|
|
|
--
|
|
Bruce Momjian | http://candle.pha.pa.us
|
|
pgman@candle.pha.pa.us | (610) 359-1001
|
|
+ If your life is a hard drive, | 13 Roberts Road
|
|
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 2: Don't 'kill -9' the postmaster
|
|
|
|
From pgsql-hackers-owner+M78070=pgman=candle.pha.pa.us@postgresql.org Fri Dec 30 14:29:06 2005
|
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
|
X-Greylist: from auto-whitelisted by SQLgrey-
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
From: Simon Riggs <simon@2ndquadrant.com>
|
|
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
|
cc: Andrew Dunstan <andrew@dunslane.net>, Tom Lane <tgl@sss.pgh.pa.us>,
|
|
Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>,
|
|
Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
|
|
pgsql-hackers@postgresql.org
|
|
In-Reply-To: <200512301649.jBUGnxn21488@candle.pha.pa.us>
|
|
References: <200512301649.jBUGnxn21488@candle.pha.pa.us>
|
|
Date: Fri, 30 Dec 2005 19:28:41 +0000
|
|
Message-ID: <1135970921.5052.68.camel@localhost.localdomain>
|
|
X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4)
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Status: No, score=0.034 required=5 tests=[AWL=0.034]
|
|
X-Spam-Score: 0.034
|
|
X-Spam-Level:
|
|
X-Mailing-List: pgsql-hackers
|
|
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
|
|
List-Help: <mailto:majordomo@postgresql.org?body=help>
|
|
List-Id: <pgsql-hackers.postgresql.org>
|
|
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
|
|
List-Post: <mailto:pgsql-hackers@postgresql.org>
|
|
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
|
|
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Content-Length: 3112
|
|
|
|
On Fri, 2005-12-30 at 11:49 -0500, Bruce Momjian wrote:
|
|
|
|
> Yes, I know we agreed to the COPY LOCK, but new features now being
|
|
> requested, so we have to re-evaluate where we are going with COPY LOCK
|
|
> to get a more consistent solution.
|
|
|
|
Thank you.
|
|
|
|
> Ah, but people wanted fast INSERT INTO ... SELECT, and that would use
|
|
> EXCLUSIVE too. What about a massive UPDATE? Perhaps that could use
|
|
> EXCLUSIVE? We don't want to add "LOCK" to every command that might use
|
|
> EXCLUSIVE. ALTER is much better for this.
|
|
|
|
> I agree if we thought EXCLUSIVE would only be used for COPY, we could
|
|
> use LOCK, but I am thinking it will be used for other commands as well.
|
|
|
|
Agreed, I will look to implement this.
|
|
|
|
Could the internals of my recent patch be reviewed? Changing the user
|
|
interface is less of a problem than changing the internals, which is
|
|
where the hard work takes place. I do not want to extend this work
|
|
further only to have that part rejected later.
|
|
|
|
The implications of EXCLUSIVE are:
|
|
- there will be a check on each and every I, U, D to check the state of
|
|
the relation
|
|
- *every* operation that attempts a write lock will attempt to acquire
|
|
an EXCLUSIVE full table lock instead
|
|
- following successful completion of *each* DML statement, the relation
|
|
will be heap_sync'd involving a full scan of the buffer cache
|
|
|
|
Can I clarify the wording of the syntax? Is EXCLUSIVE the right word?
|
|
How about FASTLOAD or BULKLOAD? Those words seem less likely to be
|
|
misused in the future - i.e. we are invoking a special mode, rather than
|
|
invoking a special "go faster" option.
|
|
|
|
> I don't consider the single-transaction to be a no-cost solution. You
|
|
> are adding flags to commands, and you are using a dump layout for
|
|
> performance where the purpose for the layout is not clear. The ALTER is
|
|
> clear to the user, and it allows nologging operations to happen after
|
|
> the table is created.
|
|
>
|
|
> In fact, for use in pg_dump, I think DROP is the proper operation for
|
|
> loading, not your transaction wrapping solution. We already agree we
|
|
> need DROP (or TRUNCATE), so why not use that rather than the transaction
|
|
> wrap idea?
|
|
|
|
This was discussed on-list by 2 core team members, a committer and
|
|
myself, but I see no requirements change here. You even accepted the
|
|
invisible COPY optimization in your last post - why unpick that now?
|
|
Please forgive my tone, but I am lost for reasonable yet expressive
|
|
words.
|
|
|
|
The --single-transaction mode would apply even if the dump was created
|
|
using an earlier version of pg_dump. pg_dump has *not* been altered at
|
|
all. (And I would again add that the idea was not my own)
|
|
|
|
> So, to summarize, I think we should add DROP/TRUNCATE, and use that by
|
|
> default (or optionally off?) in pg_dump, and, assuming we want EXCLUSIVE
|
|
> for more than just COPY, we need to add ALTER TABLE EXCLUSIVE.
|
|
|
|
Would you mind stating again what you mean, just so I can understand
|
|
this? Your summary isn't enough.
|
|
|
|
Best Regards, Simon Riggs
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 3: Have you checked our extensive FAQ?
|
|
|
|
http://www.postgresql.org/docs/faq
|
|
|
|
From pgsql-hackers-owner+M78072=pgman=candle.pha.pa.us@postgresql.org Fri Dec 30 16:15:30 2005
|
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
|
X-Greylist: from auto-whitelisted by SQLgrey-
|
|
From: Bruce Momjian <pgman@candle.pha.pa.us>
|
|
Message-ID: <200512302114.jBULEno02301@candle.pha.pa.us>
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
In-Reply-To: <1135970921.5052.68.camel@localhost.localdomain>
|
|
To: Simon Riggs <simon@2ndquadrant.com>
|
|
Date: Fri, 30 Dec 2005 16:14:49 -0500 (EST)
|
|
cc: Andrew Dunstan <andrew@dunslane.net>, Tom Lane <tgl@sss.pgh.pa.us>,
|
|
Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>,
|
|
Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
|
|
pgsql-hackers@postgresql.org
|
|
X-Mailer: ELM [version 2.4ME+ PL121 (25)]
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Status: No, score=0.12 required=5 tests=[AWL=0.120]
|
|
X-Spam-Score: 0.12
|
|
X-Spam-Level:
|
|
X-Mailing-List: pgsql-hackers
|
|
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
|
|
List-Help: <mailto:majordomo@postgresql.org?body=help>
|
|
List-Id: <pgsql-hackers.postgresql.org>
|
|
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
|
|
List-Post: <mailto:pgsql-hackers@postgresql.org>
|
|
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
|
|
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Content-Length: 7285
|
|
|
|
Simon Riggs wrote:
|
|
> On Fri, 2005-12-30 at 11:49 -0500, Bruce Momjian wrote:
|
|
>
|
|
> > Yes, I know we agreed to the COPY LOCK, but new features now being
|
|
> > requested, so we have to re-evaluate where we are going with COPY LOCK
|
|
> > to get a more consistent solution.
|
|
>
|
|
> Thank you.
|
|
|
|
Good. I think we can be happy that COPY LOCK didn't get into a release,
|
|
so we don't have to support it forever. When we are adding features, we
|
|
have to consider not only the current release, but future releases and
|
|
what people will ask for in the future so the syntax can be expanded
|
|
without breaking previous usage.
|
|
|
|
> > Ah, but people wanted fast INSERT INTO ... SELECT, and that would use
|
|
> > EXCLUSIVE too. What about a massive UPDATE? Perhaps that could use
|
|
> > EXCLUSIVE? We don't want to add "LOCK" to every command that might use
|
|
> > EXCLUSIVE. ALTER is much better for this.
|
|
>
|
|
> > I agree if we thought EXCLUSIVE would only be used for COPY, we could
|
|
> > use LOCK, but I am thinking it will be used for other commands as well.
|
|
>
|
|
> Agreed, I will look to implement this.
|
|
>
|
|
> Could the internals of my recent patch be reviewed? Changing the user
|
|
> interface is less of a problem than changing the internals, which is
|
|
> where the hard work takes place. I do not want to extend this work
|
|
> further only to have that part rejected later.
|
|
|
|
OK, I will look it over this week or next.
|
|
|
|
> The implications of EXCLUSIVE are:
|
|
> - there will be a check on each and every I, U, D to check the state of
|
|
> the relation
|
|
> - *every* operation that attempts a write lock will attempt to acquire
|
|
> an EXCLUSIVE full table lock instead
|
|
> - following successful completion of *each* DML statement, the relation
|
|
> will be heap_sync'd involving a full scan of the buffer cache
|
|
|
|
Yes, I think that is it. What we can do is implement EXCLUSIVE to
|
|
affect only COPY at this point, and document that, and later add other
|
|
commands.
|
|
|
|
> Can I clarify the wording of the syntax? Is EXCLUSIVE the right word?
|
|
> How about FASTLOAD or BULKLOAD? Those words seem less likely to be
|
|
> misused in the future - i.e. we are invoking a special mode, rather than
|
|
> invoking a special "go faster" option.
|
|
|
|
The problem with the FASTLOAD/BULKLOAD words is that EXCLUSIVE mode is
|
|
probably not the best for loading. I would think TRUNCATE would be a
|
|
better option.
|
|
|
|
In fact, in loading a table, I think both EXCLUSIVE and TRUNCATE would be
|
|
the same, mostly. You would create the table, set its RELIABILITY to
|
|
TRUNCATE, COPY into the table, then set the RELIABILITY to SHARE or
|
|
DEFAULT. The second ALTER has to sync all the dirty data blocks, which
|
|
the same thing EXCLUSIVE does at the conclusion of COPY.
|
|
|
|
So, we need a name for EXCLUSIVE mode that suggests how it is different
|
|
from TRUNCATE, and in this case, the difference is that EXCLUSIVE
|
|
preserves the previous contents of the table on recovery, while TRUNCATE
|
|
does not. Do you want to call the mode PRESERVE, or EXCLUSIVE WRITER?
|
|
Anyway, the keywords are easy to modify, even after the patch is
|
|
submitted. FYI, I usually go through keywords.c looking for a keyword
|
|
we already use.
|
|
|
|
> > I don't consider the single-transaction to be a no-cost solution. You
|
|
> > are adding flags to commands, and you are using a dump layout for
|
|
> > performance where the purpose for the layout is not clear. The ALTER is
|
|
> > clear to the user, and it allows nologging operations to happen after
|
|
> > the table is created.
|
|
> >
|
|
> > In fact, for use in pg_dump, I think DROP is the proper operation for
|
|
> > loading, not your transaction wrapping solution. We already agree we
|
|
> > need DROP (or TRUNCATE), so why not use that rather than the transaction
|
|
> > wrap idea?
|
|
>
|
|
> This was discussed on-list by 2 core team members, a committer and
|
|
> myself, but I see no requirements change here. You even accepted the
|
|
> invisible COPY optimization in your last post - why unpick that now?
|
|
> Please forgive my tone, but I am lost for reasonable yet expressive
|
|
> words.
|
|
|
|
Do you think you are the only one who has rewritten a patch multiple
|
|
times? We all have. The goal is to get the functionality into the
|
|
system in the most seamless way possible. Considering the number of
|
|
people who use PostgreSQL, if it takes use 10 tries, it is worth it
|
|
considering the thousands of people who will use it. Would you have us
|
|
include a sub-optimal patch and have thousands of people adjust to its
|
|
non-optimal functionality? I am sure you would not. Perhaps a company
|
|
would say, "Oh, just ship it", but we don't.
|
|
|
|
> The --single-transaction mode would apply even if the dump was created
|
|
> using an earlier version of pg_dump. pg_dump has *not* been altered at
|
|
> all. (And I would again add that the idea was not my own)
|
|
|
|
I assume you mean this:
|
|
|
|
http://archives.postgresql.org/pgsql-patches/2005-12/msg00257.php
|
|
|
|
I guess with the ALTER commands I don't see much value in the
|
|
--single-transaction flag. I am sure others suggested it, but would
|
|
they suggest it now given our current direction. The fact that the
|
|
patch was submitted does not give it any more weight --- the question is
|
|
does this feature make sense for 8.2. The goal is not to cram as many
|
|
optimizations into PostgreSQL as possible, the goal is to present a
|
|
consistent usable system to users.
|
|
|
|
> > So, to summarize, I think we should add DROP/TRUNCATE, and use that by
|
|
> > default (or optionally off?) in pg_dump, and, assuming we want EXCLUSIVE
|
|
> > for more than just COPY, we need to add ALTER TABLE EXCLUSIVE.
|
|
>
|
|
> Would you mind stating again what you mean, just so I can understand
|
|
> this? Your summary isn't enough.
|
|
|
|
New ALTER TABLE mode, perhaps call it PERSISTENCE:
|
|
|
|
ALTER TABLE tab PERSISTENCE DROP ON RECOVERY
|
|
ALTER TABLE tab PERSISTENCE TRUNCATE ON RECOVERY
|
|
|
|
These would drop or truncate all tables with this flag on a non-clean
|
|
start of the postmaster, and write something in the server logs.
|
|
However, I don't know that we have the code in place to DROP/TRUNCATE in
|
|
recovery mode, and it would affect all databases, so it could be quite
|
|
complex to implement. In this mode, no WAL logs would be written for
|
|
table modifications, though DDL commands would have to be logged.
|
|
|
|
ALTER TABLE tab PERSISTENCE PRESERVE (or STABLE?)
|
|
|
|
Table contents are preserved across recoveries, but data modifications
|
|
can happen only one at a time. I don't think we have a lock mode that
|
|
does this, so I am worried a new lock mode will have to be created. A
|
|
simplified solution at this stage would be to take an exclusive lock on
|
|
the table, but really we just need a single-writer table lock, which I
|
|
don't think we have. initially this can implemented to only affect COPY
|
|
but later can be done for other commands.
|
|
|
|
ALTER TABLE tab PERSISTENCE DEFAULT
|
|
|
|
This would be our current default mode, which is full concurrency and
|
|
persistence.
|
|
|
|
It took me over an hour to write this, but I feel the time is worth it
|
|
because of the number of users who use our software.
|
|
|
|
--
|
|
Bruce Momjian | http://candle.pha.pa.us
|
|
pgman@candle.pha.pa.us | (610) 359-1001
|
|
+ If your life is a hard drive, | 13 Roberts Road
|
|
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 2: Don't 'kill -9' the postmaster
|
|
|
|
From pgsql-hackers-owner+M78076=pgman=candle.pha.pa.us@postgresql.org Fri Dec 30 17:37:00 2005
|
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
|
X-Greylist: from auto-whitelisted by SQLgrey-
|
|
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
|
cc: Simon Riggs <simon@2ndquadrant.com>, Andrew Dunstan <andrew@dunslane.net>,
|
|
Tom Lane <tgl@sss.pgh.pa.us>, Martijn van Oosterhout <kleptog@svana.org>,
|
|
Greg Stark <gsstark@mit.edu>, Rod Taylor <pg@rbt.ca>,
|
|
Qingqing Zhou <zhouqq@cs.toronto.edu>, pgsql-hackers@postgresql.org
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
References: <200512302114.jBULEno02301@candle.pha.pa.us>
|
|
In-Reply-To: <200512302114.jBULEno02301@candle.pha.pa.us>
|
|
From: Greg Stark <gsstark@mit.edu>
|
|
Organization: The Emacs Conspiracy; member since 1992
|
|
Date: 30 Dec 2005 17:36:24 -0500
|
|
Message-ID: <87mzii8b6f.fsf@stark.xeocode.com>
|
|
Lines: 28
|
|
User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.4
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Status: No, score=0.113 required=5 tests=[AWL=0.113]
|
|
X-Spam-Score: 0.113
|
|
X-Spam-Level:
|
|
X-Mailing-List: pgsql-hackers
|
|
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
|
|
List-Help: <mailto:majordomo@postgresql.org?body=help>
|
|
List-Id: <pgsql-hackers.postgresql.org>
|
|
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
|
|
List-Post: <mailto:pgsql-hackers@postgresql.org>
|
|
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
|
|
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Content-Length: 1424
|
|
|
|
|
|
As far as EXCLUSIVE or COPY LOCK goes, I think this would be useful
|
|
functionality but perhaps there doesn't have to be any proprietary user
|
|
interface to it at all. Why not just check if the conditions are already
|
|
present to allow the optimization and if so go ahead.
|
|
|
|
That is, if the current transaction already has an exclusive lock on the table
|
|
and there are no indexes (and PITR isn't active) then Postgres could go ahead
|
|
and use the same WAL skipping logic as the other operations that already so
|
|
so. This would work for inserts whether coming from COPY or plain SQL INSERTs.
|
|
|
|
The nice thing about this is that the user's SQL wouldn't need any proprietary
|
|
extensions at all. Just tell people to do
|
|
|
|
BEGIN;
|
|
LOCK TABLE foo;
|
|
COPY foo from ...
|
|
COMMIT;
|
|
|
|
There could be a COPY LOCK option to obtain a lock, but it would be purely for
|
|
user convenience so they don't have to bother with BEGIN and COMMIt.
|
|
|
|
The only downside is a check to see if an exclusive table lock is present on
|
|
every copy and insert. That might be significant but perhaps there are ways to
|
|
finess that. If not perhaps only doing it on COPY would be a good compromise.
|
|
|
|
--
|
|
greg
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 1: if posting/reading through Usenet, please send an appropriate
|
|
subscribe-nomail command to majordomo@postgresql.org so that your
|
|
message can get through to the mailing list cleanly
|
|
|
|
From pgsql-hackers-owner+M78077=pgman=candle.pha.pa.us@postgresql.org Fri Dec 30 17:47:18 2005
|
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
|
X-Greylist: from auto-whitelisted by SQLgrey-
|
|
From: Bruce Momjian <pgman@candle.pha.pa.us>
|
|
Message-ID: <200512302246.jBUMkjF25196@candle.pha.pa.us>
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
In-Reply-To: <87mzii8b6f.fsf@stark.xeocode.com>
|
|
To: Greg Stark <gsstark@mit.edu>
|
|
Date: Fri, 30 Dec 2005 17:46:45 -0500 (EST)
|
|
cc: Simon Riggs <simon@2ndquadrant.com>, Andrew Dunstan <andrew@dunslane.net>,
|
|
Tom Lane <tgl@sss.pgh.pa.us>, Martijn van Oosterhout <kleptog@svana.org>,
|
|
Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
|
|
pgsql-hackers@postgresql.org
|
|
X-Mailer: ELM [version 2.4ME+ PL121 (25)]
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Status: No, score=0.12 required=5 tests=[AWL=0.120]
|
|
X-Spam-Score: 0.12
|
|
X-Spam-Level:
|
|
X-Mailing-List: pgsql-hackers
|
|
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
|
|
List-Help: <mailto:majordomo@postgresql.org?body=help>
|
|
List-Id: <pgsql-hackers.postgresql.org>
|
|
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
|
|
List-Post: <mailto:pgsql-hackers@postgresql.org>
|
|
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
|
|
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Content-Length: 2135
|
|
|
|
Greg Stark wrote:
|
|
>
|
|
> As far as EXCLUSIVE or COPY LOCK goes, I think this would be useful
|
|
> functionality but perhaps there doesn't have to be any proprietary user
|
|
> interface to it at all. Why not just check if the conditions are already
|
|
> present to allow the optimization and if so go ahead.
|
|
>
|
|
> That is, if the current transaction already has an exclusive lock on the table
|
|
> and there are no indexes (and PITR isn't active) then Postgres could go ahead
|
|
> and use the same WAL skipping logic as the other operations that already so
|
|
> so. This would work for inserts whether coming from COPY or plain SQL INSERTs.
|
|
>
|
|
> The nice thing about this is that the user's SQL wouldn't need any proprietary
|
|
> extensions at all. Just tell people to do
|
|
>
|
|
> BEGIN;
|
|
> LOCK TABLE foo;
|
|
> COPY foo from ...
|
|
> COMMIT;
|
|
>
|
|
> There could be a COPY LOCK option to obtain a lock, but it would be purely for
|
|
> user convenience so they don't have to bother with BEGIN and COMMIt.
|
|
>
|
|
> The only downside is a check to see if an exclusive table lock is present on
|
|
> every copy and insert. That might be significant but perhaps there are ways to
|
|
> finess that. If not perhaps only doing it on COPY would be a good compromise.
|
|
|
|
Well, again, if we wanted to use EXCLUSIVE only for COPY, this might
|
|
make sense. However, also consider that the idea for EXCLUSIVE was that
|
|
users could continue read-only queries on the table while it is being
|
|
loaded (like COPY allows now), and that in EXCLUSIVE mode, we are only
|
|
going to write into new pages.
|
|
|
|
If someone has an exclusive lock on the table and does a COPY or SELECT
|
|
INTO do we want to assume we are only going to write into new pages, and
|
|
do we want to force an exclusive lock rather than a single-writer lock?
|
|
I don't think so.
|
|
|
|
--
|
|
Bruce Momjian | http://candle.pha.pa.us
|
|
pgman@candle.pha.pa.us | (610) 359-1001
|
|
+ If your life is a hard drive, | 13 Roberts Road
|
|
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 6: explain analyze is your friend
|
|
|
|
From mpaesold@gmx.at Sat Dec 31 06:59:51 2005
|
|
Date: Sat, 31 Dec 2005 12:59:44 +0100 (MET)
|
|
From: Michael Paesold <mpaesold@gmx.at>
|
|
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
|
cc: simon@2ndquadrant.com, andrew@dunslane.net, tgl@sss.pgh.pa.us,
|
|
kleptog@svana.org, gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu,
|
|
pgsql-hackers@postgresql.org
|
|
References: <200512302114.jBULEno02301@candle.pha.pa.us>
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
X-Priority: 3 (Normal)
|
|
X-Authenticated: #1946847
|
|
Message-ID: <14969.1136030384@www6.gmx.net>
|
|
X-Mailer: WWW-Mail 1.6 (Global Message Exchange)
|
|
X-Flags: 0001
|
|
Content-Length: 1305
|
|
|
|
Bruce Momjian wrote:
|
|
|
|
> > The --single-transaction mode would apply even if the dump was created
|
|
> > using an earlier version of pg_dump. pg_dump has *not* been altered at
|
|
> > all. (And I would again add that the idea was not my own)
|
|
>
|
|
> I assume you mean this:
|
|
>
|
|
> http://archives.postgresql.org/pgsql-patches/2005-12/msg00257.php
|
|
>
|
|
> I guess with the ALTER commands I don't see much value in the
|
|
> --single-transaction flag. I am sure others suggested it, but would
|
|
> they suggest it now given our current direction.
|
|
|
|
I just want to add that --single-transaction has a value of it's own. There
|
|
were times when I wanted to restore parts of a dump all-or-nothing.
|
|
|
|
This is possible with PostgreSQL, unlike many other DBM systems, because
|
|
people like Tom Lane have invested in ensuring that all DDL is working
|
|
without implicitly committing an enclosing transaction.
|
|
|
|
Using pg_restore directly into a database, it is not possible to get a
|
|
single transaction right now. One has to restore to a file and manually
|
|
added BEGIN/COMMIT. Just for that I think --single-transaction is a great
|
|
addition and a missing feature.
|
|
|
|
I think more people have a use-case for that.
|
|
|
|
Best Regards,
|
|
Michael Paesold
|
|
|
|
--
|
|
Telefonieren Sie schon oder sparen Sie noch?
|
|
NEU: GMX Phone_Flat http://www.gmx.net/de/go/telefonie
|
|
|
|
From pgsql-hackers-owner+M78213=pgman=candle.pha.pa.us@postgresql.org Tue Jan 3 12:08:43 2006
|
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
|
X-Greylist: from auto-whitelisted by SQLgrey-
|
|
From: Bruce Momjian <pgman@candle.pha.pa.us>
|
|
Message-ID: <200601031708.k03H85j27170@candle.pha.pa.us>
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
In-Reply-To: <17173.1136306881@sss.pgh.pa.us>
|
|
To: Tom Lane <tgl@sss.pgh.pa.us>
|
|
Date: Tue, 3 Jan 2006 12:08:05 -0500 (EST)
|
|
cc: Jim C. Nasby <jnasby@pervasive.com>,
|
|
Andrew Dunstan <andrew@dunslane.net>, simon@2ndquadrant.com,
|
|
kleptog@svana.org, gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu,
|
|
pgsql-hackers@postgresql.org
|
|
X-Mailer: ELM [version 2.4ME+ PL121 (25)]
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Status: No, score=0.121 required=5 tests=[AWL=0.121]
|
|
X-Spam-Score: 0.121
|
|
X-Spam-Level:
|
|
X-Mailing-List: pgsql-hackers
|
|
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
|
|
List-Help: <mailto:majordomo@postgresql.org?body=help>
|
|
List-Id: <pgsql-hackers.postgresql.org>
|
|
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
|
|
List-Post: <mailto:pgsql-hackers@postgresql.org>
|
|
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
|
|
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Content-Length: 1125
|
|
|
|
Tom Lane wrote:
|
|
> "Jim C. Nasby" <jnasby@pervasive.com> writes:
|
|
> > On Tue, Jan 03, 2006 at 11:26:51AM -0500, Tom Lane wrote:
|
|
> >> Such an ALTER would certainly require exclusive lock on the table,
|
|
> >> so I'm not sure that I see much use-case for doing it like that.
|
|
> >> You'd want to do the ALTER and commit so as not to lock other people
|
|
> >> out of the table entirely while doing the bulk data-pushing.
|
|
>
|
|
> > Maybe this just isn't clear, but would EXCLUSIVE block writes from all
|
|
> > other sessions then?
|
|
>
|
|
> I don't think it should (which implies that EXCLUSIVE is a bad name).
|
|
|
|
Agreed, EXCLUSIVE was used to mean an _exclusive_ writer. The new words
|
|
I proposed were PRESERVE or STABLE.
|
|
|
|
--
|
|
Bruce Momjian | http://candle.pha.pa.us
|
|
pgman@candle.pha.pa.us | (610) 359-1001
|
|
+ If your life is a hard drive, | 13 Roberts Road
|
|
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 4: Have you searched our list archives?
|
|
|
|
http://archives.postgresql.org
|
|
|
|
From tgl@sss.pgh.pa.us Tue Jan 3 12:37:34 2006
|
|
To: Stephen Frost <sfrost@snowman.net>
|
|
cc: Jim C. Nasby <jnasby@pervasive.com>,
|
|
Bruce Momjian <pgman@candle.pha.pa.us>,
|
|
Andrew Dunstan <andrew@dunslane.net>, kleptog@svana.org,
|
|
simon@2ndquadrant.com, gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu,
|
|
pgsql-hackers@postgresql.org
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
In-Reply-To: <20060103165359.GP6026@ns.snowman.net>
|
|
References: <200512291605.jBTG5gi00396@candle.pha.pa.us> <7966.1135873468@sss.pgh.pa.us> <20060103154521.GC82560@pervasive.com> <20060103162137.GO6026@ns.snowman.net> <16856.1136305742@sss.pgh.pa.us> <20060103165359.GP6026@ns.snowman.net>
|
|
Comments: In-reply-to Stephen Frost <sfrost@snowman.net>
|
|
message dated "Tue, 03 Jan 2006 11:54:01 -0500"
|
|
Date: Tue, 03 Jan 2006 12:37:32 -0500
|
|
Message-ID: <17841.1136309852@sss.pgh.pa.us>
|
|
From: Tom Lane <tgl@sss.pgh.pa.us>
|
|
Content-Length: 976
|
|
|
|
Stephen Frost <sfrost@snowman.net> writes:
|
|
> The problem is that you might want to grant 'truncate' to people who
|
|
> *aren't* particularly trusted. For truncate, at least I have a
|
|
> real-world use-case for it.
|
|
|
|
I don't find this use-case particularly convincing. If the users are
|
|
allowed to delete all data in a given table, then that table must be
|
|
dedicated to them anyway; so it's not that easy to see why you can't
|
|
risk giving them ownership rights on it. The worst they can do is
|
|
screw up their own data, no?
|
|
|
|
In any case, I don't see what's so wrong with the model of using
|
|
SECURITY DEFINER interface functions when you want a security
|
|
restriction that's finer-grain than the system provides. I really
|
|
*don't* want to see us trying to, say, categorize every variety of
|
|
ALTER TABLE as a separately grantable privilege. I could live with
|
|
something like a catchall "ADMIN" privilege ... except it's not
|
|
clear how that would differ from ownership.
|
|
|
|
regards, tom lane
|
|
|
|
From pgsql-hackers-owner+M78221=pgman=candle.pha.pa.us@postgresql.org Tue Jan 3 13:30:34 2006
|
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
|
X-Greylist: from auto-whitelisted by SQLgrey-
|
|
Date: Tue, 3 Jan 2006 13:30:56 -0500
|
|
From: Stephen Frost <sfrost@snowman.net>
|
|
To: Tom Lane <tgl@sss.pgh.pa.us>
|
|
cc: Jim C. Nasby <jnasby@pervasive.com>,
|
|
Bruce Momjian <pgman@candle.pha.pa.us>,
|
|
Andrew Dunstan <andrew@dunslane.net>, kleptog@svana.org,
|
|
simon@2ndquadrant.com, gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu,
|
|
pgsql-hackers@postgresql.org
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
Message-ID: <20060103183056.GR6026@ns.snowman.net>
|
|
Mail-Followup-To: Tom Lane <tgl@sss.pgh.pa.us>,
|
|
"Jim C. Nasby" <jnasby@pervasive.com>,
|
|
Bruce Momjian <pgman@candle.pha.pa.us>,
|
|
Andrew Dunstan <andrew@dunslane.net>, kleptog@svana.org,
|
|
simon@2ndquadrant.com, gsstark@mit.edu, pg@rbt.ca,
|
|
zhouqq@cs.toronto.edu, pgsql-hackers@postgresql.org
|
|
References: <200512291605.jBTG5gi00396@candle.pha.pa.us> <7966.1135873468@sss.pgh.pa.us> <20060103154521.GC82560@pervasive.com> <20060103162137.GO6026@ns.snowman.net> <16856.1136305742@sss.pgh.pa.us> <20060103165359.GP6026@ns.snowman.net> <17841.1136309852@sss.pgh.pa.us>
|
|
Content-Disposition: inline
|
|
In-Reply-To: <17841.1136309852@sss.pgh.pa.us>
|
|
X-Editor: Vim http://www.vim.org/
|
|
X-Info: http://www.snowman.net
|
|
X-Operating-System: Linux/2.4.24ns.3.0 (i686)
|
|
X-Uptime: 12:39:16 up 206 days, 9:50, 11 users, load average: 0.02, 0.05, 0.05
|
|
User-Agent: Mutt/1.5.9i
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Status: No, score=0.105 required=5 tests=[AWL=0.105]
|
|
X-Spam-Score: 0.105
|
|
X-Spam-Level:
|
|
X-Mailing-List: pgsql-hackers
|
|
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
|
|
List-Help: <mailto:majordomo@postgresql.org?body=help>
|
|
List-Id: <pgsql-hackers.postgresql.org>
|
|
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
|
|
List-Post: <mailto:pgsql-hackers@postgresql.org>
|
|
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
|
|
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Content-Length: 2666
|
|
|
|
-- Start of PGP signed section.
|
|
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
|
|
> I don't find this use-case particularly convincing. If the users are
|
|
> allowed to delete all data in a given table, then that table must be
|
|
> dedicated to them anyway; so it's not that easy to see why you can't
|
|
> risk giving them ownership rights on it. The worst they can do is
|
|
> screw up their own data, no?
|
|
|
|
Being able to delete all data in a given table in no way implies
|
|
ownership rights. The tables are part of a specification which the
|
|
users are being asked to respond to. Being able to change the table
|
|
types or remove the constraints put on the tables would allow the
|
|
users to upload garbage which would then affect downstream processing.
|
|
|
|
We can't guarentee this won't happen anyway but we try to confine the
|
|
things they can mess up to a reasonable set which we can check for (and
|
|
do, through a rather involved error checking system). There are *alot*
|
|
of things built on top of the table structures and having them change
|
|
would basically break the whole system (without the appropriate changes
|
|
being made to the other parts of the system).
|
|
|
|
> In any case, I don't see what's so wrong with the model of using
|
|
> SECURITY DEFINER interface functions when you want a security
|
|
> restriction that's finer-grain than the system provides. I really
|
|
> *don't* want to see us trying to, say, categorize every variety of
|
|
> ALTER TABLE as a separately grantable privilege. I could live with
|
|
> something like a catchall "ADMIN" privilege ... except it's not
|
|
> clear how that would differ from ownership.
|
|
|
|
I don't think anyone's asked for 'ALTER TABLE' privileges to be
|
|
seperately grantable. It seems to me that the privileges which *need*
|
|
to be grantable are ones associated with DML statements. I would
|
|
classify TRUNCATE, VACUUM and ANALYZE as DML statements (along with
|
|
select, insert, update, and delete). They're PostgreSQL-specific DML
|
|
statements but they still fall into that category. I don't think
|
|
it's a coincidence that the SQL-defined DML statements are all,
|
|
individually, grantable.
|
|
|
|
That doesn't mean I think we should get rid of RULE, REFERENCES or
|
|
TRIGGER, though honestly I've very rarely needed to grant any of them
|
|
(I don't think I've ever granted RULE or TRIGGER...). References is
|
|
DDL-oriented, but for *other* tables; RULE and TRIGGER are DDL and I
|
|
can't really justify why someone other than the owner would need them
|
|
but I'm guessing someone's using them. I don't think their existance
|
|
should imply that if we ever change the grants again we have to include
|
|
all types of 'ALTER TABLE', etc, though.
|
|
|
|
Thanks,
|
|
|
|
Stephen
|
|
-- End of PGP section, PGP failed!
|
|
|
|
From sfrost@snowman.net Tue Jan 3 13:30:13 2006
|
|
Date: Tue, 3 Jan 2006 13:30:56 -0500
|
|
From: Stephen Frost <sfrost@snowman.net>
|
|
To: Tom Lane <tgl@sss.pgh.pa.us>
|
|
cc: Jim C. Nasby <jnasby@pervasive.com>,
|
|
Bruce Momjian <pgman@candle.pha.pa.us>,
|
|
Andrew Dunstan <andrew@dunslane.net>, kleptog@svana.org,
|
|
simon@2ndquadrant.com, gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu,
|
|
pgsql-hackers@postgresql.org
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
Message-ID: <20060103183056.GR6026@ns.snowman.net>
|
|
Mail-Followup-To: Tom Lane <tgl@sss.pgh.pa.us>,
|
|
"Jim C. Nasby" <jnasby@pervasive.com>,
|
|
Bruce Momjian <pgman@candle.pha.pa.us>,
|
|
Andrew Dunstan <andrew@dunslane.net>, kleptog@svana.org,
|
|
simon@2ndquadrant.com, gsstark@mit.edu, pg@rbt.ca,
|
|
zhouqq@cs.toronto.edu, pgsql-hackers@postgresql.org
|
|
References: <200512291605.jBTG5gi00396@candle.pha.pa.us> <7966.1135873468@sss.pgh.pa.us> <20060103154521.GC82560@pervasive.com> <20060103162137.GO6026@ns.snowman.net> <16856.1136305742@sss.pgh.pa.us> <20060103165359.GP6026@ns.snowman.net> <17841.1136309852@sss.pgh.pa.us>
|
|
Content-Disposition: inline
|
|
In-Reply-To: <17841.1136309852@sss.pgh.pa.us>
|
|
X-Editor: Vim http://www.vim.org/
|
|
X-Info: http://www.snowman.net
|
|
X-Operating-System: Linux/2.4.24ns.3.0 (i686)
|
|
X-Uptime: 12:39:16 up 206 days, 9:50, 11 users, load average: 0.02, 0.05, 0.05
|
|
User-Agent: Mutt/1.5.9i
|
|
Content-Length: 2666
|
|
|
|
-- Start of PGP signed section.
|
|
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
|
|
> I don't find this use-case particularly convincing. If the users are
|
|
> allowed to delete all data in a given table, then that table must be
|
|
> dedicated to them anyway; so it's not that easy to see why you can't
|
|
> risk giving them ownership rights on it. The worst they can do is
|
|
> screw up their own data, no?
|
|
|
|
Being able to delete all data in a given table in no way implies
|
|
ownership rights. The tables are part of a specification which the
|
|
users are being asked to respond to. Being able to change the table
|
|
types or remove the constraints put on the tables would allow the
|
|
users to upload garbage which would then affect downstream processing.
|
|
|
|
We can't guarentee this won't happen anyway but we try to confine the
|
|
things they can mess up to a reasonable set which we can check for (and
|
|
do, through a rather involved error checking system). There are *alot*
|
|
of things built on top of the table structures and having them change
|
|
would basically break the whole system (without the appropriate changes
|
|
being made to the other parts of the system).
|
|
|
|
> In any case, I don't see what's so wrong with the model of using
|
|
> SECURITY DEFINER interface functions when you want a security
|
|
> restriction that's finer-grain than the system provides. I really
|
|
> *don't* want to see us trying to, say, categorize every variety of
|
|
> ALTER TABLE as a separately grantable privilege. I could live with
|
|
> something like a catchall "ADMIN" privilege ... except it's not
|
|
> clear how that would differ from ownership.
|
|
|
|
I don't think anyone's asked for 'ALTER TABLE' privileges to be
|
|
seperately grantable. It seems to me that the privileges which *need*
|
|
to be grantable are ones associated with DML statements. I would
|
|
classify TRUNCATE, VACUUM and ANALYZE as DML statements (along with
|
|
select, insert, update, and delete). They're PostgreSQL-specific DML
|
|
statements but they still fall into that category. I don't think
|
|
it's a coincidence that the SQL-defined DML statements are all,
|
|
individually, grantable.
|
|
|
|
That doesn't mean I think we should get rid of RULE, REFERENCES or
|
|
TRIGGER, though honestly I've very rarely needed to grant any of them
|
|
(I don't think I've ever granted RULE or TRIGGER...). References is
|
|
DDL-oriented, but for *other* tables; RULE and TRIGGER are DDL and I
|
|
can't really justify why someone other than the owner would need them
|
|
but I'm guessing someone's using them. I don't think their existance
|
|
should imply that if we ever change the grants again we have to include
|
|
all types of 'ALTER TABLE', etc, though.
|
|
|
|
Thanks,
|
|
|
|
Stephen
|
|
-- End of PGP section, PGP failed!
|
|
|
|
From pgsql-hackers-owner+M78233=pgman=candle.pha.pa.us@postgresql.org Tue Jan 3 17:39:06 2006
|
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
|
X-Greylist: from auto-whitelisted by SQLgrey-
|
|
From: Bruce Momjian <pgman@candle.pha.pa.us>
|
|
Message-ID: <200601032238.k03McP804163@candle.pha.pa.us>
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
In-Reply-To: <20060103212750.GT82560@pervasive.com>
|
|
To: Jim C. Nasby <jnasby@pervasive.com>
|
|
Date: Tue, 3 Jan 2006 17:38:25 -0500 (EST)
|
|
cc: Tom Lane <tgl@sss.pgh.pa.us>, Andrew Dunstan <andrew@dunslane.net>,
|
|
simon@2ndquadrant.com, kleptog@svana.org, gsstark@mit.edu, pg@rbt.ca,
|
|
zhouqq@cs.toronto.edu, pgsql-hackers@postgresql.org
|
|
X-Mailer: ELM [version 2.4ME+ PL121 (25)]
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Status: No, score=0.121 required=5 tests=[AWL=0.121]
|
|
X-Spam-Score: 0.121
|
|
X-Spam-Level:
|
|
X-Mailing-List: pgsql-hackers
|
|
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
|
|
List-Help: <mailto:majordomo@postgresql.org?body=help>
|
|
List-Id: <pgsql-hackers.postgresql.org>
|
|
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
|
|
List-Post: <mailto:pgsql-hackers@postgresql.org>
|
|
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
|
|
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Content-Length: 1714
|
|
|
|
Jim C. Nasby wrote:
|
|
> > We would be creating a new lock type for this.
|
|
>
|
|
> Sorry if I've just missed this in the thread, but what would the new
|
|
> lock type do? My impression is that as it stands you can either do:
|
|
>
|
|
> BEGIN;
|
|
> ALTER TABLE EXCLUSIVE;
|
|
> ...
|
|
> ALTER TABLE SHARE; --fsync
|
|
> COMMIT;
|
|
>
|
|
> Which would block all other access to the table as soon as the first
|
|
> ALTER TABLE happens. Or you can:
|
|
>
|
|
> ALTER TABLE EXCLUSIVE;
|
|
> ...
|
|
> ALTER TABLE SHARE;
|
|
>
|
|
> Which means that between the two ALTER TABLES every backend that does
|
|
> DML on that table will not have that DML logged, but because there's no
|
|
> exclusive lock that DML would be allowed to occur.
|
|
|
|
Right, the DML will be single-threaded and fsync of all dirty pages will
|
|
happen before commit of each transaction.
|
|
|
|
> BTW, there might be some usecase for the second scenario, in which case
|
|
> it would probably be better to tell the user to aquire a table-lock on
|
|
> their own rather than do it automatically as part of the update...
|
|
|
|
> > Basically meaning your idea of update while EXCLUSIVE/PRESERVE/STABLE is
|
|
> > happening is never going to be implemented because it is just too hard
|
|
> > to do, and too prone to error.
|
|
>
|
|
> What I figured. Never hurts to ask though. :)
|
|
|
|
Actually, it does hurt because it generates discussion volume for no
|
|
purpose.
|
|
|
|
--
|
|
Bruce Momjian | http://candle.pha.pa.us
|
|
pgman@candle.pha.pa.us | (610) 359-1001
|
|
+ If your life is a hard drive, | 13 Roberts Road
|
|
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 5: don't forget to increase your free space map settings
|
|
|
|
From pgsql-hackers-owner+M78234=pgman=candle.pha.pa.us@postgresql.org Tue Jan 3 17:54:16 2006
|
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
|
X-Greylist: from auto-whitelisted by SQLgrey-
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
From: Simon Riggs <simon@2ndquadrant.com>
|
|
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
|
cc: Andrew Dunstan <andrew@dunslane.net>, Tom Lane <tgl@sss.pgh.pa.us>,
|
|
Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>,
|
|
Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
|
|
pgsql-hackers@postgresql.org
|
|
In-Reply-To: <200512302114.jBULEno02301@candle.pha.pa.us>
|
|
References: <200512302114.jBULEno02301@candle.pha.pa.us>
|
|
Date: Tue, 03 Jan 2006 22:53:53 +0000
|
|
Message-ID: <1136328833.5052.223.camel@localhost.localdomain>
|
|
X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4)
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Status: No, score=0.04 required=5 tests=[AWL=0.040]
|
|
X-Spam-Score: 0.04
|
|
X-Spam-Level:
|
|
X-Mailing-List: pgsql-hackers
|
|
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
|
|
List-Help: <mailto:majordomo@postgresql.org?body=help>
|
|
List-Id: <pgsql-hackers.postgresql.org>
|
|
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
|
|
List-Post: <mailto:pgsql-hackers@postgresql.org>
|
|
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
|
|
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Content-Length: 5373
|
|
|
|
On Fri, 2005-12-30 at 16:14 -0500, Bruce Momjian wrote:
|
|
> Simon Riggs wrote:
|
|
> > The implications of EXCLUSIVE are:
|
|
> > - there will be a check on each and every I, U, D to check the state of
|
|
> > the relation
|
|
> > - *every* operation that attempts a write lock will attempt to acquire
|
|
> > an EXCLUSIVE full table lock instead
|
|
> > - following successful completion of *each* DML statement, the relation
|
|
> > will be heap_sync'd involving a full scan of the buffer cache
|
|
>
|
|
> Yes, I think that is it. What we can do is implement EXCLUSIVE to
|
|
> affect only COPY at this point, and document that, and later add other
|
|
> commands.
|
|
>
|
|
> > Can I clarify the wording of the syntax? Is EXCLUSIVE the right word?
|
|
> > How about FASTLOAD or BULKLOAD? Those words seem less likely to be
|
|
> > misused in the future - i.e. we are invoking a special mode, rather than
|
|
> > invoking a special "go faster" option.
|
|
>
|
|
> The problem with the FASTLOAD/BULKLOAD words is that EXCLUSIVE mode is
|
|
> probably not the best for loading. I would think TRUNCATE would be a
|
|
> better option.
|
|
>
|
|
> In fact, in loading a table, I think both EXCLUSIVE and TRUNCATE would be
|
|
> the same, mostly. You would create the table, set its RELIABILITY to
|
|
> TRUNCATE, COPY into the table, then set the RELIABILITY to SHARE or
|
|
> DEFAULT. The second ALTER has to sync all the dirty data blocks, which
|
|
> the same thing EXCLUSIVE does at the conclusion of COPY.
|
|
>
|
|
> So, we need a name for EXCLUSIVE mode that suggests how it is different
|
|
> from TRUNCATE, and in this case, the difference is that EXCLUSIVE
|
|
> preserves the previous contents of the table on recovery, while TRUNCATE
|
|
> does not. Do you want to call the mode PRESERVE, or EXCLUSIVE WRITER?
|
|
> Anyway, the keywords are easy to modify, even after the patch is
|
|
> submitted. FYI, I usually go through keywords.c looking for a keyword
|
|
> we already use.
|
|
|
|
I'm very happy for suggestions on what these new modes are called.
|
|
|
|
> > > So, to summarize, I think we should add DROP/TRUNCATE, and use that by
|
|
> > > default (or optionally off?) in pg_dump, and, assuming we want EXCLUSIVE
|
|
> > > for more than just COPY, we need to add ALTER TABLE EXCLUSIVE.
|
|
> >
|
|
> > Would you mind stating again what you mean, just so I can understand
|
|
> > this? Your summary isn't enough.
|
|
>
|
|
> New ALTER TABLE mode, perhaps call it PERSISTENCE:
|
|
>
|
|
> ALTER TABLE tab PERSISTENCE DROP ON RECOVERY
|
|
> ALTER TABLE tab PERSISTENCE TRUNCATE ON RECOVERY
|
|
>
|
|
> These would drop or truncate all tables with this flag on a non-clean
|
|
> start of the postmaster, and write something in the server logs.
|
|
> However, I don't know that we have the code in place to DROP/TRUNCATE in
|
|
> recovery mode, and it would affect all databases, so it could be quite
|
|
> complex to implement. In this mode, no WAL logs would be written for
|
|
> table modifications, though DDL commands would have to be logged.
|
|
|
|
Right now, this will be a TODO item... it looks like it will take some
|
|
thought to implement correctly.
|
|
|
|
> ALTER TABLE tab PERSISTENCE PRESERVE (or STABLE?)
|
|
>
|
|
> Table contents are preserved across recoveries, but data modifications
|
|
> can happen only one at a time. I don't think we have a lock mode that
|
|
> does this, so I am worried a new lock mode will have to be created. A
|
|
> simplified solution at this stage would be to take an exclusive lock on
|
|
> the table, but really we just need a single-writer table lock, which I
|
|
> don't think we have. initially this can implemented to only affect COPY
|
|
> but later can be done for other commands.
|
|
|
|
ExclusiveLock locks out everything apart from readers, no new lock mode
|
|
AFAICS. Implementing that is little additional work for COPY.
|
|
|
|
Tom had a concern about setting this for I, U, D commands via the
|
|
executor. Not sure what the details of that are, as yet.
|
|
|
|
We can use either of the unlogged modes for pg_dump, so I'd suggest its
|
|
this one. Everybody happy with this being the new default in pg_dump, or
|
|
should it be an option?
|
|
|
|
> ALTER TABLE tab PERSISTENCE DEFAULT
|
|
>
|
|
> This would be our current default mode, which is full concurrency and
|
|
> persistence.
|
|
|
|
I'm thinking whether the ALTER TABLE statement might be better with two
|
|
bool flags rather than a 3-state char.
|
|
|
|
flag 1: ENABLE LOGGING | DISABLE LOGGING
|
|
|
|
flag 2: FULL RECOVERY | TRUNCATE ON RECOVERY
|
|
|
|
Giving 3 possible sets of options:
|
|
|
|
-- the default
|
|
ALTER TABLE mytable ENABLE LOGGING FULL RECOVERY; (default)
|
|
|
|
-- EXCLUSIVE mode
|
|
ALTER TABLE mytable DISABLE LOGGING FULL RECOVERY;
|
|
...which would be used like this
|
|
ALTER TABLE mytable DISABLE LOGGING;
|
|
COPY or other bulk data manipulation SQL
|
|
ALTER TABLE mytable ENABLE LOGGING;
|
|
...since FULL RECOVERY is the default.
|
|
|
|
-- multiuser temp table mode
|
|
ALTER TABLE mytable DISABLE LOGGING TRUNCATE ON RECOVERY;
|
|
...which would usually be left on all the time
|
|
|
|
which only uses one new keyword LOGGING and yet all the modes are fairly
|
|
explicit as to what they do.
|
|
|
|
An alternative might be the slightly more verbose:
|
|
ALTER TABLE mytable DISABLE LOGGING FORCE EXCLUSIVE TABLE LOCK;
|
|
which would be turned off by
|
|
ALTER TABLE mytable ENABLE LOGGING;
|
|
|
|
Comments?
|
|
|
|
Best Regards, Simon Riggs
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 1: if posting/reading through Usenet, please send an appropriate
|
|
subscribe-nomail command to majordomo@postgresql.org so that your
|
|
message can get through to the mailing list cleanly
|
|
|
|
From simon@2ndquadrant.com Tue Jan 3 18:10:32 2006
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
From: Simon Riggs <simon@2ndquadrant.com>
|
|
To: Jim C. Nasby <jnasby@pervasive.com>,
|
|
Bruce Momjian <pgman@candle.pha.pa.us>
|
|
cc: Tom Lane <tgl@sss.pgh.pa.us>, Andrew Dunstan <andrew@dunslane.net>,
|
|
kleptog@svana.org, gsstark@mit.edu, pg@rbt.ca, zhouqq@cs.toronto.edu,
|
|
pgsql-hackers@postgresql.org
|
|
In-Reply-To: <200601032120.k03LKl609990@candle.pha.pa.us>
|
|
References: <200601032120.k03LKl609990@candle.pha.pa.us>
|
|
Date: Tue, 03 Jan 2006 23:10:16 +0000
|
|
Message-ID: <1136329816.5052.239.camel@localhost.localdomain>
|
|
X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4)
|
|
Content-Length: 2118
|
|
|
|
On Tue, 2006-01-03 at 16:20 -0500, Bruce Momjian wrote:
|
|
> Jim C. Nasby wrote:
|
|
|
|
> > Idealistically, if EXCLUSIVE/PRESERVE/STABLE does it's thing by only
|
|
> > appending new pages, it would be nice if other backends could continue
|
|
> > performing updates at the same time, assuming there's free space
|
|
> > available elsewhere within the table (and that you'd be able to recover
|
|
> > those logged changes regardless of the non-logged operations). But
|
|
> > that's a pretty lofty goal...
|
|
>
|
|
> "Idealistically", yep. It would be great if we could put a helmet on
|
|
> and the computer would read your mind. :-)
|
|
>
|
|
> Basically meaning your idea of update while EXCLUSIVE/PRESERVE/STABLE is
|
|
> happening is never going to be implemented because it is just too hard
|
|
> to do, and too prone to error.
|
|
|
|
The reason for locking the whole table was to ensure that we do not have
|
|
a mixture of logged and non-logged writers writing to the same data
|
|
blocks, since that could damage blocks unrecoverably in the event of a
|
|
crash. (Though perhaps only if full_block_writes is on)
|
|
|
|
The ALTER TABLE .. EXCLUSIVE/(insert name) mode would mean that *any*
|
|
backend who took a write lock on the table, would lock out the whole
|
|
table. So this new mode is not restricted to the job/user who ran the
|
|
ALTER TABLE command. (I would note that that is how Oracle and Teradata
|
|
do this for pre-load utility table locking, but why should we follow
|
|
them on that?)
|
|
|
|
Currently, when we add a new row when the FSM is empty, we check the
|
|
last block of the table. That would cause multiple writers to access the
|
|
same blocks and so we would be in danger. The only way to avoid that
|
|
would be for logged writers (who would use the FSM if it were not empty)
|
|
to notify back to the FSM that they have just added a block - and remove
|
|
the behaviour to look for the last block.
|
|
|
|
Anyway, one step at a time. *Maybe* we can do that in the future, but
|
|
right now I'd like to add the basic fast write/load functionality.
|
|
|
|
Also, I think I will do the docs first this time, just so everyone can
|
|
read what we're getting ahead of time, to ensure we all agree.
|
|
|
|
Best Regards, Simon Riggs
|
|
|
|
From pgsql-hackers-owner+M78236=pgman=candle.pha.pa.us@postgresql.org Tue Jan 3 18:24:20 2006
|
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
|
X-Greylist: from auto-whitelisted by SQLgrey-
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
From: Simon Riggs <simon@2ndquadrant.com>
|
|
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
|
cc: Jim C. Nasby <jnasby@pervasive.com>, Tom Lane <tgl@sss.pgh.pa.us>,
|
|
Andrew Dunstan <andrew@dunslane.net>, kleptog@svana.org, gsstark@mit.edu,
|
|
pg@rbt.ca, zhouqq@cs.toronto.edu, pgsql-hackers@postgresql.org
|
|
In-Reply-To: <200601032238.k03McP804163@candle.pha.pa.us>
|
|
References: <200601032238.k03McP804163@candle.pha.pa.us>
|
|
Date: Tue, 03 Jan 2006 23:23:54 +0000
|
|
Message-ID: <1136330634.5052.247.camel@localhost.localdomain>
|
|
X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4)
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Status: No, score=0.043 required=5 tests=[AWL=0.043]
|
|
X-Spam-Score: 0.043
|
|
X-Spam-Level:
|
|
X-Mailing-List: pgsql-hackers
|
|
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
|
|
List-Help: <mailto:majordomo@postgresql.org?body=help>
|
|
List-Id: <pgsql-hackers.postgresql.org>
|
|
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
|
|
List-Post: <mailto:pgsql-hackers@postgresql.org>
|
|
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
|
|
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Content-Length: 725
|
|
|
|
On Tue, 2006-01-03 at 17:38 -0500, Bruce Momjian wrote:
|
|
|
|
> Right, the DML will be single-threaded and fsync of all dirty pages will
|
|
> happen before commit of each transaction.
|
|
|
|
heap_sync() would occur at end of statement, as it does with CTAS. We
|
|
could delay until EOT but I'm not sure I see why; in most cases they'd
|
|
be the same point anyway.
|
|
|
|
I'd been toying with the idea of making the freshly added blocks live
|
|
only in temp_buffers to avoid the shared_buffers overhead, but that was
|
|
starting to sounds too wierd for my liking.
|
|
|
|
Best Regards, Simon Riggs
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 4: Have you searched our list archives?
|
|
|
|
http://archives.postgresql.org
|
|
|
|
From simon@2ndquadrant.com Tue Jan 3 18:58:13 2006
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
From: Simon Riggs <simon@2ndquadrant.com>
|
|
To: Michael Paesold <mpaesold@gmx.at>
|
|
cc: Bruce Momjian <pgman@candle.pha.pa.us>, andrew@dunslane.net,
|
|
tgl@sss.pgh.pa.us, kleptog@svana.org, gsstark@mit.edu, pg@rbt.ca,
|
|
zhouqq@cs.toronto.edu, pgsql-hackers@postgresql.org
|
|
In-Reply-To: <14969.1136030384@www6.gmx.net>
|
|
References: <200512302114.jBULEno02301@candle.pha.pa.us>
|
|
<14969.1136030384@www6.gmx.net>
|
|
Date: Tue, 03 Jan 2006 23:58:09 +0000
|
|
Message-ID: <1136332689.5052.263.camel@localhost.localdomain>
|
|
X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4)
|
|
Content-Length: 1493
|
|
|
|
On Sat, 2005-12-31 at 12:59 +0100, Michael Paesold wrote:
|
|
> Bruce Momjian wrote:
|
|
>
|
|
> > > The --single-transaction mode would apply even if the dump was created
|
|
> > > using an earlier version of pg_dump. pg_dump has *not* been altered at
|
|
> > > all. (And I would again add that the idea was not my own)
|
|
> >
|
|
> > I assume you mean this:
|
|
> >
|
|
> > http://archives.postgresql.org/pgsql-patches/2005-12/msg00257.php
|
|
> >
|
|
> > I guess with the ALTER commands I don't see much value in the
|
|
> > --single-transaction flag. I am sure others suggested it, but would
|
|
> > they suggest it now given our current direction.
|
|
>
|
|
> I just want to add that --single-transaction has a value of it's own. There
|
|
> were times when I wanted to restore parts of a dump all-or-nothing.
|
|
>
|
|
> This is possible with PostgreSQL, unlike many other DBM systems, because
|
|
> people like Tom Lane have invested in ensuring that all DDL is working
|
|
> without implicitly committing an enclosing transaction.
|
|
>
|
|
> Using pg_restore directly into a database, it is not possible to get a
|
|
> single transaction right now. One has to restore to a file and manually
|
|
> added BEGIN/COMMIT. Just for that I think --single-transaction is a great
|
|
> addition and a missing feature.
|
|
>
|
|
> I think more people have a use-case for that.
|
|
|
|
I did originally separate the --single-transaction patch for this
|
|
reason. I think its a valid patch on its own and its wrapped and ready
|
|
to go, with some deletions from the doc patch.
|
|
|
|
Best Regards, Simon Riggs
|
|
|
|
From pgsql-hackers-owner+M78239=pgman=candle.pha.pa.us@postgresql.org Tue Jan 3 19:12:18 2006
|
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
|
X-Greylist: from auto-whitelisted by SQLgrey-
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
From: Simon Riggs <simon@2ndquadrant.com>
|
|
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
|
cc: Tom Lane <tgl@sss.pgh.pa.us>, Martijn van Oosterhout <kleptog@svana.org>,
|
|
Greg Stark <gsstark@mit.edu>, Rod Taylor <pg@rbt.ca>,
|
|
Qingqing Zhou <zhouqq@cs.toronto.edu>, pgsql-hackers@postgresql.org
|
|
In-Reply-To: <200512291637.jBTGbdC03848@candle.pha.pa.us>
|
|
References: <200512291637.jBTGbdC03848@candle.pha.pa.us>
|
|
Date: Wed, 04 Jan 2006 00:11:55 +0000
|
|
Message-ID: <1136333515.5052.273.camel@localhost.localdomain>
|
|
X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4)
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Status: No, score=0.045 required=5 tests=[AWL=0.045]
|
|
X-Spam-Score: 0.045
|
|
X-Spam-Level:
|
|
X-Mailing-List: pgsql-hackers
|
|
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
|
|
List-Help: <mailto:majordomo@postgresql.org?body=help>
|
|
List-Id: <pgsql-hackers.postgresql.org>
|
|
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
|
|
List-Post: <mailto:pgsql-hackers@postgresql.org>
|
|
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
|
|
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Content-Length: 1200
|
|
|
|
On Thu, 2005-12-29 at 11:37 -0500, Bruce Momjian wrote:
|
|
> Having COPY behave differently because it is
|
|
> in a transaction is fine as long as it is user-invisible, but once you
|
|
> require users to do that to get the speedup, it isn't user-invisible
|
|
> anymore.
|
|
|
|
Since we're agreed on adding ALTER TABLE rather than COPY LOCK, we have
|
|
our explicit mechanism for speedup.
|
|
|
|
However, it costs a single line of code and very very little execution
|
|
time to add in the optimization to COPY to make it bypass WAL when
|
|
executed in the same transaction that created the table. Everything else
|
|
is already there.
|
|
|
|
As part of the use_wal test:
|
|
+ if (resultRelInfo->ri_NumIndices == 0 &&
|
|
+ !XLogArchivingActive() &&
|
|
>> (cstate->rel->rd_createSubid != InvalidSubTransactionId ))
|
|
+ use_wal = false;
|
|
|
|
the value is already retrieved from cache...
|
|
|
|
Can anyone see a reason *not* to put that change in also? We just don't
|
|
advertise it as the "suggested" route to gaining performance, nor would
|
|
we rely on it for pg_dump/restore performance.
|
|
|
|
Best Regards, Simon Riggs
|
|
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 2: Don't 'kill -9' the postmaster
|
|
|
|
From pgsql-hackers-owner+M78303=pgman=candle.pha.pa.us@postgresql.org Thu Jan 5 12:23:39 2006
|
|
X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org
|
|
X-Greylist: from auto-whitelisted by SQLgrey-
|
|
From: Bruce Momjian <pgman@candle.pha.pa.us>
|
|
Message-ID: <200601051722.k05HMSM02052@candle.pha.pa.us>
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
In-Reply-To: <1136328833.5052.223.camel@localhost.localdomain>
|
|
To: Simon Riggs <simon@2ndquadrant.com>
|
|
Date: Thu, 5 Jan 2006 12:22:28 -0500 (EST)
|
|
cc: Andrew Dunstan <andrew@dunslane.net>, Tom Lane <tgl@sss.pgh.pa.us>,
|
|
Martijn van Oosterhout <kleptog@svana.org>, Greg Stark <gsstark@mit.edu>,
|
|
Rod Taylor <pg@rbt.ca>, Qingqing Zhou <zhouqq@cs.toronto.edu>,
|
|
pgsql-hackers@postgresql.org
|
|
X-Mailer: ELM [version 2.4ME+ PL121 (25)]
|
|
X-Virus-Scanned: by amavisd-new at hub.org
|
|
X-Spam-Status: No, score=0.12 required=5 tests=[AWL=0.120]
|
|
X-Spam-Score: 0.12
|
|
X-Spam-Level:
|
|
X-Mailing-List: pgsql-hackers
|
|
List-Archive: <http://archives.postgresql.org/pgsql-hackers>
|
|
List-Help: <mailto:majordomo@postgresql.org?body=help>
|
|
List-Id: <pgsql-hackers.postgresql.org>
|
|
List-Owner: <mailto:pgsql-hackers-owner@postgresql.org>
|
|
List-Post: <mailto:pgsql-hackers@postgresql.org>
|
|
List-Subscribe: <mailto:majordomo@postgresql.org?body=sub%20pgsql-hackers>
|
|
List-Unsubscribe: <mailto:majordomo@postgresql.org?body=unsub%20pgsql-hackers>
|
|
Precedence: bulk
|
|
Sender: pgsql-hackers-owner@postgresql.org
|
|
Content-Length: 6020
|
|
|
|
Simon Riggs wrote:
|
|
> > So, we need a name for EXCLUSIVE mode that suggests how it is different
|
|
> > from TRUNCATE, and in this case, the difference is that EXCLUSIVE
|
|
> > preserves the previous contents of the table on recovery, while TRUNCATE
|
|
> > does not. Do you want to call the mode PRESERVE, or EXCLUSIVE WRITER?
|
|
> > Anyway, the keywords are easy to modify, even after the patch is
|
|
> > submitted. FYI, I usually go through keywords.c looking for a keyword
|
|
> > we already use.
|
|
>
|
|
> I'm very happy for suggestions on what these new modes are called.
|
|
>
|
|
> > > > So, to summarize, I think we should add DROP/TRUNCATE, and use that by
|
|
> > > > default (or optionally off?) in pg_dump, and, assuming we want EXCLUSIVE
|
|
> > > > for more than just COPY, we need to add ALTER TABLE EXCLUSIVE.
|
|
> > >
|
|
> > > Would you mind stating again what you mean, just so I can understand
|
|
> > > this? Your summary isn't enough.
|
|
> >
|
|
> > New ALTER TABLE mode, perhaps call it PERSISTENCE:
|
|
> >
|
|
> > ALTER TABLE tab PERSISTENCE DROP ON RECOVERY
|
|
> > ALTER TABLE tab PERSISTENCE TRUNCATE ON RECOVERY
|
|
> >
|
|
> > These would drop or truncate all tables with this flag on a non-clean
|
|
> > start of the postmaster, and write something in the server logs.
|
|
> > However, I don't know that we have the code in place to DROP/TRUNCATE in
|
|
> > recovery mode, and it would affect all databases, so it could be quite
|
|
> > complex to implement. In this mode, no WAL logs would be written for
|
|
> > table modifications, though DDL commands would have to be logged.
|
|
>
|
|
> Right now, this will be a TODO item... it looks like it will take some
|
|
> thought to implement correctly.
|
|
|
|
OK, I know my suggestions have made it more complicated.
|
|
|
|
TODO added:
|
|
|
|
* Allow control over which tables are WAL-logged
|
|
|
|
Allow tables to bypass WAL writes and just fsync() dirty pages on
|
|
commit. To do this, only a single writer can modify the table, and
|
|
writes must happen only on new pages. Readers can continue accessing
|
|
the table. This would affect COPY, and perhaps INSERT/UPDATE too.
|
|
Another option is to avoid transaction logging entirely and truncate
|
|
or drop the table on crash recovery. These should be implemented
|
|
using ALTER TABLE, e.g. ALTER TABLE PERSISTENCE [ DROP | TRUNCATE |
|
|
STABLE | DEFAULT ]. Tables using non-default logging should not use
|
|
referential integrity with default-logging tables, and tables using
|
|
stable logging probably can not have indexes. [walcontrol]
|
|
|
|
|
|
> > ALTER TABLE tab PERSISTENCE PRESERVE (or STABLE?)
|
|
> >
|
|
> > Table contents are preserved across recoveries, but data modifications
|
|
> > can happen only one at a time. I don't think we have a lock mode that
|
|
> > does this, so I am worried a new lock mode will have to be created. A
|
|
> > simplified solution at this stage would be to take an exclusive lock on
|
|
> > the table, but really we just need a single-writer table lock, which I
|
|
> > don't think we have. initially this can implemented to only affect COPY
|
|
> > but later can be done for other commands.
|
|
>
|
|
> ExclusiveLock locks out everything apart from readers, no new lock mode
|
|
> AFAICS. Implementing that is little additional work for COPY.
|
|
|
|
Nice.
|
|
|
|
> Tom had a concern about setting this for I, U, D commands via the
|
|
> executor. Not sure what the details of that are, as yet.
|
|
|
|
That is much more complicated than the COPY-only idea, for sure. I am
|
|
thinking we could add the ALTER syntax and just do COPY at this stage,
|
|
meaning that I/U/D still do full logging until we get to improving them.
|
|
The big benefit is that the user API doesn't need to change when we
|
|
improve the code. In fact I think we could do the TRUNCATE/DROP easily
|
|
for I/U/D, but the STABLE option would require work and we don't need to
|
|
implement it in the first patch.
|
|
|
|
> We can use either of the unlogged modes for pg_dump, so I'd suggest its
|
|
> this one. Everybody happy with this being the new default in pg_dump, or
|
|
> should it be an option?
|
|
>
|
|
> > ALTER TABLE tab PERSISTENCE DEFAULT
|
|
> >
|
|
> > This would be our current default mode, which is full concurrency and
|
|
> > persistence.
|
|
>
|
|
> I'm thinking whether the ALTER TABLE statement might be better with two
|
|
> bool flags rather than a 3-state char.
|
|
>
|
|
> flag 1: ENABLE LOGGING | DISABLE LOGGING
|
|
>
|
|
> flag 2: FULL RECOVERY | TRUNCATE ON RECOVERY
|
|
>
|
|
> Giving 3 possible sets of options:
|
|
>
|
|
> -- the default
|
|
> ALTER TABLE mytable ENABLE LOGGING FULL RECOVERY; (default)
|
|
>
|
|
> -- EXCLUSIVE mode
|
|
> ALTER TABLE mytable DISABLE LOGGING FULL RECOVERY;
|
|
> ...which would be used like this
|
|
> ALTER TABLE mytable DISABLE LOGGING;
|
|
> COPY or other bulk data manipulation SQL
|
|
> ALTER TABLE mytable ENABLE LOGGING;
|
|
> ...since FULL RECOVERY is the default.
|
|
>
|
|
> -- multiuser temp table mode
|
|
> ALTER TABLE mytable DISABLE LOGGING TRUNCATE ON RECOVERY;
|
|
> ...which would usually be left on all the time
|
|
>
|
|
> which only uses one new keyword LOGGING and yet all the modes are fairly
|
|
> explicit as to what they do.
|
|
>
|
|
> An alternative might be the slightly more verbose:
|
|
> ALTER TABLE mytable DISABLE LOGGING FORCE EXCLUSIVE TABLE LOCK;
|
|
> which would be turned off by
|
|
> ALTER TABLE mytable ENABLE LOGGING;
|
|
>
|
|
> Comments?
|
|
|
|
I had the same idea originally, but avoided it because the logging
|
|
really does affect what other options you can use. For example, if you
|
|
want truncate on recovery, you certainly do not want logging, so it
|
|
seems the options are not really independent. In fact if someone asks
|
|
for truncate on recovery, do we automatically turn off logging for them,
|
|
or throw an error, or a warning. It just seemed too error-prone and
|
|
confusing, though perhaps more logical. Of course, if others like the
|
|
above, we can do it.
|
|
|
|
--
|
|
Bruce Momjian | http://candle.pha.pa.us
|
|
pgman@candle.pha.pa.us | (610) 359-1001
|
|
+ If your life is a hard drive, | 13 Roberts Road
|
|
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
|
|
|
|
---------------------------(end of broadcast)---------------------------
|
|
TIP 2: Don't 'kill -9' the postmaster
|
|
|
|
From simon@2ndquadrant.com Thu Jan 5 16:56:25 2006
|
|
Return-path: <simon@2ndquadrant.com>
|
|
Received: from smtp.nildram.co.uk (smtp.nildram.co.uk [195.112.4.54])
|
|
by candle.pha.pa.us (8.11.6/8.11.6) with ESMTP id k05LuPb02246
|
|
for <pgman@candle.pha.pa.us>; Thu, 5 Jan 2006 16:56:25 -0500 (EST)
|
|
Received: from [192.168.0.3] (unknown [84.12.184.6])
|
|
by smtp.nildram.co.uk (Postfix) with ESMTP
|
|
id A9F0F268C4E; Thu, 5 Jan 2006 21:56:18 +0000 (GMT)
|
|
Subject: Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and
|
|
From: Simon Riggs <simon@2ndquadrant.com>
|
|
To: Bruce Momjian <pgman@candle.pha.pa.us>
|
|
cc: Tom Lane <tgl@sss.pgh.pa.us>, Martijn van Oosterhout <kleptog@svana.org>,
|
|
Greg Stark <gsstark@mit.edu>, Rod Taylor <pg@rbt.ca>,
|
|
Qingqing Zhou <zhouqq@cs.toronto.edu>, pgsql-hackers@postgresql.org
|
|
In-Reply-To: <200601051727.k05HR5p02803@candle.pha.pa.us>
|
|
References: <200601051727.k05HR5p02803@candle.pha.pa.us>
|
|
Content-Type: text/plain
|
|
Date: Thu, 05 Jan 2006 21:56:21 +0000
|
|
Message-ID: <1136498181.21025.285.camel@localhost.localdomain>
|
|
MIME-Version: 1.0
|
|
X-Mailer: Evolution 2.2.3 (2.2.3-2.fc4)
|
|
Content-Transfer-Encoding: 7bit
|
|
Status: OR
|
|
|
|
On Thu, 2006-01-05 at 12:27 -0500, Bruce Momjian wrote:
|
|
|
|
> Seems like a nice optimization.
|
|
|
|
Negative thoughts: Toast tables have a toast index on them, yes? We have
|
|
agreed that we cannot use the optimization if we have indexes on the
|
|
main table. It follows that we cannot use the optimization if we have
|
|
*any* toasted data, since that would require a pointer between two
|
|
blocks, which would not be correctly recovered following a crash. If we
|
|
log the toast table then there could be a mismatch between heap and
|
|
toast table; if we don't log the toast table there could be a mismatch
|
|
between toast table and toast index.
|
|
|
|
We can test to see if the toast table is empty when we do ALTER TABLE,
|
|
but loading operations may try to create toasted data rows.
|
|
|
|
Presumably that means we must either:
|
|
i) abort a COPY if we get a toastable value
|
|
ii) if we get a toastable value, insert the row into a new block, which
|
|
we do logging of, then also log the toast insert and the toast index
|
|
insert - i.e. some blocks we log, others not
|
|
|
|
This is still useful for many applications, IMHO, but the list of
|
|
restrictions seems to be growing. Worse, we wouldn't know that the toast
|
|
tables were empty until after we did the COPY TO for a pg_dump, so we
|
|
wouldn't be able to retrospectively add an ALTER TABLE command ahead of
|
|
the COPY.
|
|
|
|
Thoughts? Hopefully there are some flaws in my thinking here,
|
|
|
|
Best Regards, Simon Riggs
|
|
|
|
|
|
|