Improve the manual's discussion of partitioning. Recommend using a
trigger instead of a rule to redirect insertions, use NEW.* notation
where appropriate, some other updates and adjustments.  David Fetter
and Tom Lane
Tom Lane 2007-12-02 19:20:32 +00:00
parent 8ee076325f
commit 43da837eda
1 changed files with 193 additions and 127 deletions


@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/ddl.sgml,v 1.77 2007/11/28 15:42:31 petere Exp $ --> <!-- $PostgreSQL: pgsql/doc/src/sgml/ddl.sgml,v 1.78 2007/12/02 19:20:32 tgl Exp $ -->
<chapter id="ddl"> <chapter id="ddl">
<title>Data Definition</title> <title>Data Definition</title>
@ -2383,8 +2383,8 @@ CHECK ( outletID BETWEEN 200 AND 300 )
<listitem> <listitem>
<para> <para>
Optionally, define a rule or trigger to redirect modifications Optionally, define a trigger or rule to redirect data inserted into
of the master table to the appropriate partition. the master table to the appropriate partition.
</para> </para>
</listitem> </listitem>
@ -2443,16 +2443,16 @@ CREATE TABLE measurement (
Next we create one partition for each active month: Next we create one partition for each active month:
<programlisting> <programlisting>
CREATE TABLE measurement_y2004m02 ( ) INHERITS (measurement); CREATE TABLE measurement_y2006m02 ( ) INHERITS (measurement);
CREATE TABLE measurement_y2004m03 ( ) INHERITS (measurement); CREATE TABLE measurement_y2006m03 ( ) INHERITS (measurement);
... ...
CREATE TABLE measurement_y2005m11 ( ) INHERITS (measurement); CREATE TABLE measurement_y2007m11 ( ) INHERITS (measurement);
CREATE TABLE measurement_y2005m12 ( ) INHERITS (measurement); CREATE TABLE measurement_y2007m12 ( ) INHERITS (measurement);
CREATE TABLE measurement_y2006m01 ( ) INHERITS (measurement); CREATE TABLE measurement_y2008m01 ( ) INHERITS (measurement);
</programlisting> </programlisting>
The partitions are complete tables in their own right, The partitions are complete tables in their own right,
but they inherit their definition from the but they inherit their definitions from the
<structname>measurement</> table. <structname>measurement</> table.
</para> </para>
@ -2470,21 +2470,21 @@ CREATE TABLE measurement_y2006m01 ( ) INHERITS (measurement);
table creation script becomes: table creation script becomes:
<programlisting> <programlisting>
CREATE TABLE measurement_y2004m02 ( CREATE TABLE measurement_y2006m02 (
CHECK ( logdate &gt;= DATE '2004-02-01' AND logdate &lt; DATE '2004-03-01' ) CHECK ( logdate &gt;= DATE '2006-02-01' AND logdate &lt; DATE '2006-03-01' )
) INHERITS (measurement); ) INHERITS (measurement);
CREATE TABLE measurement_y2004m03 ( CREATE TABLE measurement_y2006m03 (
CHECK ( logdate &gt;= DATE '2004-03-01' AND logdate &lt; DATE '2004-04-01' ) CHECK ( logdate &gt;= DATE '2006-03-01' AND logdate &lt; DATE '2006-04-01' )
) INHERITS (measurement); ) INHERITS (measurement);
... ...
CREATE TABLE measurement_y2005m11 ( CREATE TABLE measurement_y2007m11 (
CHECK ( logdate &gt;= DATE '2005-11-01' AND logdate &lt; DATE '2005-12-01' ) CHECK ( logdate &gt;= DATE '2007-11-01' AND logdate &lt; DATE '2007-12-01' )
) INHERITS (measurement); ) INHERITS (measurement);
CREATE TABLE measurement_y2005m12 ( CREATE TABLE measurement_y2007m12 (
CHECK ( logdate &gt;= DATE '2005-12-01' AND logdate &lt; DATE '2006-01-01' ) CHECK ( logdate &gt;= DATE '2007-12-01' AND logdate &lt; DATE '2008-01-01' )
) INHERITS (measurement); ) INHERITS (measurement);
CREATE TABLE measurement_y2006m01 ( CREATE TABLE measurement_y2008m01 (
CHECK ( logdate &gt;= DATE '2006-01-01' AND logdate &lt; DATE '2006-02-01' ) CHECK ( logdate &gt;= DATE '2008-01-01' AND logdate &lt; DATE '2008-02-01' )
) INHERITS (measurement); ) INHERITS (measurement);
</programlisting> </programlisting>
</para> </para>
@ -2495,12 +2495,12 @@ CREATE TABLE measurement_y2006m01 (
We probably need indexes on the key columns too: We probably need indexes on the key columns too:
<programlisting> <programlisting>
CREATE INDEX measurement_y2004m02_logdate ON measurement_y2004m02 (logdate); CREATE INDEX measurement_y2006m02_logdate ON measurement_y2006m02 (logdate);
CREATE INDEX measurement_y2004m03_logdate ON measurement_y2004m03 (logdate); CREATE INDEX measurement_y2006m03_logdate ON measurement_y2006m03 (logdate);
... ...
CREATE INDEX measurement_y2005m11_logdate ON measurement_y2005m11 (logdate); CREATE INDEX measurement_y2007m11_logdate ON measurement_y2007m11 (logdate);
CREATE INDEX measurement_y2005m12_logdate ON measurement_y2005m12 (logdate); CREATE INDEX measurement_y2007m12_logdate ON measurement_y2007m12 (logdate);
CREATE INDEX measurement_y2006m01_logdate ON measurement_y2006m01 (logdate); CREATE INDEX measurement_y2008m01_logdate ON measurement_y2008m01 (logdate);
</programlisting> </programlisting>
We choose not to add further indexes at this time. We choose not to add further indexes at this time.
@ -2509,56 +2509,72 @@ CREATE INDEX measurement_y2006m01_logdate ON measurement_y2006m01 (logdate);
<listitem> <listitem>
<para> <para>
We want our application to be able to say <literal>INSERT INTO
measurement ...</> and have the data be redirected into the
appropriate partition table. We can arrange that by attaching
a suitable trigger function to the master table.
If data will be added only to the latest partition, we can If data will be added only to the latest partition, we can
set up a very simple rule to insert data. We must use a very simple trigger function:
redefine this each month so that it always points to the
current partition:
<programlisting> <programlisting>
CREATE OR REPLACE RULE measurement_current_partition AS CREATE OR REPLACE FUNCTION measurement_insert_trigger()
ON INSERT TO measurement RETURNS TRIGGER AS $$
DO INSTEAD BEGIN
INSERT INTO measurement_y2006m01 VALUES ( NEW.city_id, INSERT INTO measurement_y2008m01 VALUES (NEW.*);
NEW.logdate, RETURN NULL;
NEW.peaktemp, END;
NEW.unitsales ); $$
LANGUAGE plpgsql;
</programlisting> </programlisting>
After creating the function, we create a trigger which
calls the trigger function:
<programlisting>
CREATE TRIGGER insert_measurement_trigger
BEFORE INSERT ON measurement
FOR EACH ROW EXECUTE PROCEDURE measurement_insert_trigger();
</programlisting>
We must redefine the trigger function each month so that it always
points to the current partition. The trigger definition does
not need to be updated, however.
</para>
<para>
We might want to insert data and have the server automatically We might want to insert data and have the server automatically
locate the partition into which the row should be added. We locate the partition into which the row should be added. We
could do this with a more complex set of rules as shown below: could do this with a more complex trigger function, for example:
<programlisting> <programlisting>
CREATE RULE measurement_insert_y2004m02 AS CREATE OR REPLACE FUNCTION measurement_insert_trigger()
ON INSERT TO measurement WHERE RETURNS TRIGGER AS $$
( logdate &gt;= DATE '2004-02-01' AND logdate &lt; DATE '2004-03-01' ) BEGIN
DO INSTEAD IF ( NEW.logdate &gt;= DATE '2006-02-01' AND NEW.logdate &lt; DATE '2006-03-01' ) THEN
INSERT INTO measurement_y2004m02 VALUES ( NEW.city_id, INSERT INTO measurement_y2006m02 VALUES (NEW.*);
NEW.logdate, ELSIF ( NEW.logdate &gt;= DATE '2006-03-01' AND NEW.logdate &lt; DATE '2006-04-01' ) THEN
NEW.peaktemp, INSERT INTO measurement_y2006m03 VALUES (NEW.*);
NEW.unitsales ); ...
... ELSIF ( NEW.logdate &gt;= DATE '2008-01-01' AND NEW.logdate &lt; DATE '2008-02-01' ) THEN
CREATE RULE measurement_insert_y2005m12 AS INSERT INTO measurement_y2008m01 VALUES (NEW.*);
ON INSERT TO measurement WHERE ELSE
( logdate &gt;= DATE '2005-12-01' AND logdate &lt; DATE '2006-01-01' ) RAISE EXCEPTION 'Date out of range. Fix the measurement_insert_trigger() function!';
DO INSTEAD END IF;
INSERT INTO measurement_y2005m12 VALUES ( NEW.city_id, RETURN NULL;
NEW.logdate, END;
NEW.peaktemp, $$
NEW.unitsales ); LANGUAGE plpgsql;
CREATE RULE measurement_insert_y2006m01 AS
ON INSERT TO measurement WHERE
( logdate &gt;= DATE '2006-01-01' AND logdate &lt; DATE '2006-02-01' )
DO INSTEAD
INSERT INTO measurement_y2006m01 VALUES ( NEW.city_id,
NEW.logdate,
NEW.peaktemp,
NEW.unitsales );
</programlisting> </programlisting>
Note that the <literal>WHERE</literal> clause in each rule The trigger definition is the same as before.
exactly matches the <literal>CHECK</literal> Note that each <literal>IF</literal> test must exactly match the
constraint for its partition. <literal>CHECK</literal> constraint for its partition.
</para>
<para>
While this function is more complex than the single-month case,
it doesn't need to be updated as often, since branches can be
added in advance of being needed.
</para> </para>
</listitem> </listitem>
</orderedlist> </orderedlist>
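Because each branch of the multi-month trigger function has to match its partition's CHECK constraint exactly, the function text could be generated rather than typed. The following is a minimal sketch, not part of the manual: the helper name trigger_function_sql and its parameters are hypothetical, and it assumes monthly partitions named as in the example above. It qualifies the column as NEW.logdate, which is how a PL/pgSQL trigger function refers to the row being inserted.

```python
from datetime import date


def trigger_function_sql(months, master="measurement"):
    """Build CREATE OR REPLACE FUNCTION text that routes rows by NEW.logdate.

    months: list of (start_date, end_date) half-open ranges, one per partition.
    (Hypothetical helper; the naming scheme follows the example partitions.)
    """
    if not months:
        raise ValueError("at least one partition range is required")
    lines = [
        f"CREATE OR REPLACE FUNCTION {master}_insert_trigger()",
        "RETURNS TRIGGER AS $$",
        "BEGIN",
    ]
    for i, (start, end) in enumerate(months):
        # The first branch opens the IF; every later branch is an ELSIF.
        keyword = "IF" if i == 0 else "ELSIF"
        child = f"{master}_y{start.year}m{start.month:02d}"
        lines.append(
            f"    {keyword} ( NEW.logdate >= DATE '{start.isoformat()}' AND "
            f"NEW.logdate < DATE '{end.isoformat()}' ) THEN"
        )
        lines.append(f"        INSERT INTO {child} VALUES (NEW.*);")
    lines += [
        "    ELSE",
        f"        RAISE EXCEPTION 'Date out of range.  Fix the {master}_insert_trigger() function!';",
        "    END IF;",
        "    RETURN NULL;",
        "END;",
        "$$",
        "LANGUAGE plpgsql;",
    ]
    return "\n".join(lines)
```

Calling trigger_function_sql([(date(2007, 12, 1), date(2008, 1, 1)), (date(2008, 1, 1), date(2008, 2, 1))]) returns the function definition as a string, which can then be fed to psql. The CREATE TRIGGER statement itself does not need to be regenerated.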
@ -2571,24 +2587,6 @@ DO INSTEAD
script that generates the required DDL automatically. script that generates the required DDL automatically.
</para> </para>
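One possible shape for such a DDL-generating script is sketched below. It is an illustration only: the function names next_month and partition_ddl are hypothetical, and the sketch assumes one partition per calendar month, keyed on logdate and named as in the examples above.

```python
from datetime import date


def next_month(d):
    """Return the first day of the month following d."""
    return date(d.year + (d.month == 12), d.month % 12 + 1, 1)


def partition_ddl(first, last, master="measurement"):
    """Build CREATE TABLE and CREATE INDEX statements for each month
    from the month containing `first` through the month containing `last`.
    (Hypothetical helper; table and index names follow the example.)
    """
    statements = []
    start = date(first.year, first.month, 1)
    while start <= last:
        end = next_month(start)
        child = f"{master}_y{start.year}m{start.month:02d}"
        # Half-open range [start, end) so adjacent partitions never overlap.
        statements.append(
            f"CREATE TABLE {child} (\n"
            f"    CHECK ( logdate >= DATE '{start.isoformat()}' AND "
            f"logdate < DATE '{end.isoformat()}' )\n"
            f") INHERITS ({master});"
        )
        statements.append(f"CREATE INDEX {child}_logdate ON {child} (logdate);")
        start = end
    return statements
```

Printing "\n".join(partition_ddl(date(2006, 2, 1), date(2008, 1, 1))) would emit the statements for the whole range used in the example, ready to be piped into psql.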
<para>
Partitioning can also be arranged using a <literal>UNION ALL</literal>
view:
<programlisting>
CREATE VIEW measurement AS
SELECT * FROM measurement_y2004m02
UNION ALL SELECT * FROM measurement_y2004m03
...
UNION ALL SELECT * FROM measurement_y2005m11
UNION ALL SELECT * FROM measurement_y2005m12
UNION ALL SELECT * FROM measurement_y2006m01;
</programlisting>
However, the need to
recreate the view adds an extra step to adding and dropping
individual partitions of the data set.
</para>
</sect2> </sect2>
<sect2 id="ddl-partitioning-managing-partitions"> <sect2 id="ddl-partitioning-managing-partitions">
@ -2609,7 +2607,7 @@ UNION ALL SELECT * FROM measurement_y2006m01;
The simplest option for removing old data is simply to drop the partition The simplest option for removing old data is simply to drop the partition
that is no longer necessary: that is no longer necessary:
<programlisting> <programlisting>
DROP TABLE measurement_y2003m02; DROP TABLE measurement_y2006m02;
</programlisting> </programlisting>
This can very quickly delete millions of records because it doesn't have This can very quickly delete millions of records because it doesn't have
to individually delete every record. to individually delete every record.
@ -2620,7 +2618,7 @@ DROP TABLE measurement_y2003m02;
the partitioned table but retain access to it as a table in its own the partitioned table but retain access to it as a table in its own
right: right:
<programlisting> <programlisting>
ALTER TABLE measurement_y2003m02 NO INHERIT measurement; ALTER TABLE measurement_y2006m02 NO INHERIT measurement;
</programlisting> </programlisting>
This allows further operations to be performed on the data before This allows further operations to be performed on the data before
it is dropped. For example, this is often a useful time to back up it is dropped. For example, this is often a useful time to back up
@ -2636,8 +2634,8 @@ ALTER TABLE measurement_y2003m02 NO INHERIT measurement;
were created above: were created above:
<programlisting> <programlisting>
CREATE TABLE measurement_y2006m02 ( CREATE TABLE measurement_y2008m02 (
CHECK ( logdate &gt;= DATE '2006-02-01' AND logdate &lt; DATE '2006-03-01' ) CHECK ( logdate &gt;= DATE '2008-02-01' AND logdate &lt; DATE '2008-03-01' )
) INHERITS (measurement); ) INHERITS (measurement);
</programlisting> </programlisting>
@ -2647,13 +2645,13 @@ CREATE TABLE measurement_y2006m02 (
transformed prior to it appearing in the partitioned table: transformed prior to it appearing in the partitioned table:
<programlisting> <programlisting>
CREATE TABLE measurement_y2006m02 CREATE TABLE measurement_y2008m02
(LIKE measurement INCLUDING DEFAULTS INCLUDING CONSTRAINTS); (LIKE measurement INCLUDING DEFAULTS INCLUDING CONSTRAINTS);
ALTER TABLE measurement_y2006m02 ADD CONSTRAINT y2006m02 ALTER TABLE measurement_y2008m02 ADD CONSTRAINT y2008m02
CHECK ( logdate &gt;= DATE '2006-02-01' AND logdate &lt; DATE '2006-03-01' ); CHECK ( logdate &gt;= DATE '2008-02-01' AND logdate &lt; DATE '2008-03-01' );
\copy measurement_y2006m02 from 'measurement_y2006m02' \copy measurement_y2008m02 from 'measurement_y2008m02'
-- possibly some other data preparation work -- possibly some other data preparation work
ALTER TABLE measurement_y2006m02 INHERIT measurement; ALTER TABLE measurement_y2008m02 INHERIT measurement;
</programlisting> </programlisting>
</para> </para>
</sect2> </sect2>
@ -2672,7 +2670,7 @@ ALTER TABLE measurement_y2006m02 INHERIT measurement;
<programlisting> <programlisting>
SET constraint_exclusion = on; SET constraint_exclusion = on;
SELECT count(*) FROM measurement WHERE logdate &gt;= DATE '2006-01-01'; SELECT count(*) FROM measurement WHERE logdate &gt;= DATE '2008-01-01';
</programlisting> </programlisting>
Without constraint exclusion, the above query would scan each of Without constraint exclusion, the above query would scan each of
@ -2691,23 +2689,23 @@ SELECT count(*) FROM measurement WHERE logdate &gt;= DATE '2006-01-01';
<programlisting> <programlisting>
SET constraint_exclusion = off; SET constraint_exclusion = off;
EXPLAIN SELECT count(*) FROM measurement WHERE logdate &gt;= DATE '2006-01-01'; EXPLAIN SELECT count(*) FROM measurement WHERE logdate &gt;= DATE '2008-01-01';
QUERY PLAN QUERY PLAN
----------------------------------------------------------------------------------------------- -----------------------------------------------------------------------------------------------
Aggregate (cost=158.66..158.68 rows=1 width=0) Aggregate (cost=158.66..158.68 rows=1 width=0)
-&gt; Append (cost=0.00..151.88 rows=2715 width=0) -&gt; Append (cost=0.00..151.88 rows=2715 width=0)
-&gt; Seq Scan on measurement (cost=0.00..30.38 rows=543 width=0) -&gt; Seq Scan on measurement (cost=0.00..30.38 rows=543 width=0)
Filter: (logdate &gt;= '2006-01-01'::date) Filter: (logdate &gt;= '2008-01-01'::date)
-&gt; Seq Scan on measurement_y2004m02 measurement (cost=0.00..30.38 rows=543 width=0) -&gt; Seq Scan on measurement_y2006m02 measurement (cost=0.00..30.38 rows=543 width=0)
Filter: (logdate &gt;= '2006-01-01'::date) Filter: (logdate &gt;= '2008-01-01'::date)
-&gt; Seq Scan on measurement_y2004m03 measurement (cost=0.00..30.38 rows=543 width=0) -&gt; Seq Scan on measurement_y2006m03 measurement (cost=0.00..30.38 rows=543 width=0)
Filter: (logdate &gt;= '2006-01-01'::date) Filter: (logdate &gt;= '2008-01-01'::date)
... ...
-&gt; Seq Scan on measurement_y2005m12 measurement (cost=0.00..30.38 rows=543 width=0) -&gt; Seq Scan on measurement_y2007m12 measurement (cost=0.00..30.38 rows=543 width=0)
Filter: (logdate &gt;= '2006-01-01'::date) Filter: (logdate &gt;= '2008-01-01'::date)
-&gt; Seq Scan on measurement_y2006m01 measurement (cost=0.00..30.38 rows=543 width=0) -&gt; Seq Scan on measurement_y2008m01 measurement (cost=0.00..30.38 rows=543 width=0)
Filter: (logdate &gt;= '2006-01-01'::date) Filter: (logdate &gt;= '2008-01-01'::date)
</programlisting> </programlisting>
Some or all of the partitions might use index scans instead of Some or all of the partitions might use index scans instead of
@ -2718,15 +2716,15 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate &gt;= DATE '2006-01-01';
<programlisting> <programlisting>
SET constraint_exclusion = on; SET constraint_exclusion = on;
EXPLAIN SELECT count(*) FROM measurement WHERE logdate &gt;= DATE '2006-01-01'; EXPLAIN SELECT count(*) FROM measurement WHERE logdate &gt;= DATE '2008-01-01';
QUERY PLAN QUERY PLAN
----------------------------------------------------------------------------------------------- -----------------------------------------------------------------------------------------------
Aggregate (cost=63.47..63.48 rows=1 width=0) Aggregate (cost=63.47..63.48 rows=1 width=0)
-&gt; Append (cost=0.00..60.75 rows=1086 width=0) -&gt; Append (cost=0.00..60.75 rows=1086 width=0)
-&gt; Seq Scan on measurement (cost=0.00..30.38 rows=543 width=0) -&gt; Seq Scan on measurement (cost=0.00..30.38 rows=543 width=0)
Filter: (logdate &gt;= '2006-01-01'::date) Filter: (logdate &gt;= '2008-01-01'::date)
-&gt; Seq Scan on measurement_y2006m01 measurement (cost=0.00..30.38 rows=543 width=0) -&gt; Seq Scan on measurement_y2008m01 measurement (cost=0.00..30.38 rows=543 width=0)
Filter: (logdate &gt;= '2006-01-01'::date) Filter: (logdate &gt;= '2008-01-01'::date)
</programlisting> </programlisting>
</para> </para>
@ -2739,6 +2737,69 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate &gt;= DATE '2006-01-01';
a large part of the partition or just a small part. An index will a large part of the partition or just a small part. An index will
be helpful in the latter case but not the former. be helpful in the latter case but not the former.
</para> </para>
</sect2>
<sect2 id="ddl-partitioning-alternatives">
<title>Alternative Partitioning Methods</title>
<para>
A different approach to redirecting inserts into the appropriate
partition table is to set up rules, instead of a trigger, on the
master table. For example:
<programlisting>
CREATE RULE measurement_insert_y2006m02 AS
ON INSERT TO measurement WHERE
( logdate &gt;= DATE '2006-02-01' AND logdate &lt; DATE '2006-03-01' )
DO INSTEAD
INSERT INTO measurement_y2006m02 VALUES (NEW.*);
...
CREATE RULE measurement_insert_y2008m01 AS
ON INSERT TO measurement WHERE
( logdate &gt;= DATE '2008-01-01' AND logdate &lt; DATE '2008-02-01' )
DO INSTEAD
INSERT INTO measurement_y2008m01 VALUES (NEW.*);
</programlisting>
A rule has significantly more overhead than a trigger, but the overhead
is paid once per query rather than once per row, so this method might be
advantageous for bulk-insert situations. In most cases, however, the
trigger method will offer better performance.
</para>
<para>
Be aware that <command>COPY</> ignores rules. If you want to
use <command>COPY</> to insert data, you'll need to copy into the correct
partition table rather than into the master. <command>COPY</> does fire
triggers, so you can use it normally if you use the trigger approach.
</para>
<para>
Another disadvantage of the rule approach is that there is no simple
way to force an error if the set of rules doesn't cover the insertion
date; the data will silently go into the master table instead.
</para>
<para>
Partitioning can also be arranged using a <literal>UNION ALL</literal>
view, instead of table inheritance. For example,
<programlisting>
CREATE VIEW measurement AS
SELECT * FROM measurement_y2006m02
UNION ALL SELECT * FROM measurement_y2006m03
...
UNION ALL SELECT * FROM measurement_y2007m11
UNION ALL SELECT * FROM measurement_y2007m12
UNION ALL SELECT * FROM measurement_y2008m01;
</programlisting>
However, the need to recreate the view adds an extra step to adding and
dropping individual partitions of the data set. In practice this
method has little to recommend it compared to using inheritance.
</para>
</sect2> </sect2>
<sect2 id="ddl-partitioning-caveats"> <sect2 id="ddl-partitioning-caveats">
@ -2749,24 +2810,38 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate &gt;= DATE '2006-01-01';
<itemizedlist> <itemizedlist>
<listitem> <listitem>
<para> <para>
There is currently no way to verify that all of the There is no automatic way to verify that all of the
<literal>CHECK</literal> constraints are mutually <literal>CHECK</literal> constraints are mutually
exclusive. Care is required by the database designer. exclusive. It is safer to create code that generates
partitions and creates and/or modifies associated objects than
to write each by hand.
</para> </para>
</listitem> </listitem>
<listitem> <listitem>
<para> <para>
There is currently no simple way to specify that rows must not be The schemes shown here assume that the partition key column(s)
inserted into the master table. A <literal>CHECK (false)</literal> of a row never change, or at least do not change enough to require
constraint on the master table would be inherited by all child it to move to another partition. An <command>UPDATE</> that attempts
tables, so that cannot be used for this purpose. One possibility is to do that will fail because of the <literal>CHECK</> constraints.
to set up an <literal>ON INSERT</> trigger on the master table that If you need to handle such cases, you can put suitable update triggers
always raises an error. (Alternatively, such a trigger could be on the partition tables, but it makes management of the structure
used to redirect the data into the proper child table, instead of much more complicated.
using a set of rules as suggested above.)
</para> </para>
</listitem> </listitem>
<listitem>
<para>
If you are using manual <command>VACUUM</command> or
<command>ANALYZE</command> commands, don't forget that
you need to run them on each partition individually. A command like
<programlisting>
ANALYZE measurement;
</programlisting>
will only process the master table.
</para>
</listitem>
</itemizedlist> </itemizedlist>
</para> </para>
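Since the server does not verify that the CHECK constraints are mutually exclusive, a generating script can at least check the ranges it is about to emit. The sketch below is one way to do that, under stated assumptions: find_problems is a hypothetical helper, the ranges are half-open [start, end) intervals as in the examples, and it inspects only the ranges known to the script, not the constraints actually stored in the database.

```python
from datetime import date


def find_problems(ranges):
    """Report overlaps and gaps among half-open [start, end) partition ranges.

    ranges: iterable of (partition_name, start_date, end_date) tuples.
    After sorting by start, any overlapping pair implies an overlapping
    adjacent pair, so comparing neighbours is enough to detect that an
    overlap exists.
    """
    problems = []
    ordered = sorted(ranges, key=lambda r: r[1])
    for (name_a, _, end_a), (name_b, start_b, _) in zip(ordered, ordered[1:]):
        if start_b < end_a:
            problems.append(f"overlap: {name_a} and {name_b}")
        elif start_b > end_a:
            problems.append(f"gap between {name_a} and {name_b}")
    return problems
```

An empty result means the ranges tile the covered period without overlap. A gap is not necessarily an error, but with the trigger approach any row falling into it would hit the ELSE branch and raise an exception.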
@ -2801,18 +2876,9 @@ EXPLAIN SELECT count(*) FROM measurement WHERE logdate &gt;= DATE '2006-01-01';
<para> <para>
All constraints on all partitions of the master table are examined All constraints on all partitions of the master table are examined
during constraint exclusion, so large numbers of partitions are likely during constraint exclusion, so large numbers of partitions are likely
to increase query planning time considerably. to increase query planning time considerably. Partitioning using
</para> these techniques will work well with up to perhaps a hundred partitions;
</listitem> don't try to use many thousands of partitions.
<listitem>
<para>
Don't forget that you still need to run <command>ANALYZE</command>
on each partition individually. A command like:
<programlisting>
ANALYZE measurement;
</programlisting>
will only process the master table.
</para> </para>
</listitem> </listitem>