Rip out table expression section from SQL syntax chapter and develop it
into new chapter on query (SELECT) syntax. In the end this should become a narrative and example-filled counterpart to the SELECT reference page.
This commit is contained in:
parent
e9c936ff38
commit
21a3857f1f
@ -1,4 +1,4 @@
|
||||
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/filelist.sgml,v 1.4 2001/01/06 11:58:56 petere Exp $ -->
|
||||
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/filelist.sgml,v 1.5 2001/01/22 23:34:32 petere Exp $ -->
|
||||
|
||||
<!entity about SYSTEM "about.sgml">
|
||||
<!entity history SYSTEM "history.sgml">
|
||||
@ -31,7 +31,7 @@
|
||||
<!entity plsql SYSTEM "plsql.sgml">
|
||||
<!entity pltcl SYSTEM "pltcl.sgml">
|
||||
<!entity psql SYSTEM "psql.sgml">
|
||||
<!entity query-ug SYSTEM "query-ug.sgml">
|
||||
<!entity queries SYSTEM "queries.sgml">
|
||||
<!entity storage SYSTEM "storage.sgml">
|
||||
<!entity syntax SYSTEM "syntax.sgml">
|
||||
<!entity typeconv SYSTEM "typeconv.sgml">
|
||||
|
819
doc/src/sgml/queries.sgml
Normal file
819
doc/src/sgml/queries.sgml
Normal file
@ -0,0 +1,819 @@
|
||||
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/queries.sgml,v 1.1 2001/01/22 23:34:33 petere Exp $ -->
|
||||
|
||||
<chapter id="queries">
|
||||
<title>Queries</title>
|
||||
|
||||
<para>
|
||||
A <firstterm>query</firstterm> is the process of or the command to
|
||||
retrieve data from a database. In SQL the <command>SELECT</command>
|
||||
command is used to specify queries. The general syntax of the
|
||||
<command>SELECT</command> command is
|
||||
<synopsis>
|
||||
SELECT <replaceable>select_list</replaceable> FROM <replaceable>table_expression</replaceable> <optional><replaceable>sort_specification</replaceable></optional>
|
||||
</synopsis>
|
||||
The following sections describe the details of the select list, the
|
||||
table expression, and the sort specification. The simplest kind of
|
||||
query has the form
|
||||
<programlisting>
|
||||
SELECT * FROM table1;
|
||||
</programlisting>
|
||||
Assuming that there is a table called table1, this command would
|
||||
retrieve all rows and all columns from table1. (The method of
|
||||
retrieval depends on the client application. For example, the
|
||||
<application>psql</application> program will display an ASCII-art
|
||||
table on the screen, client libraries will offer functions to
|
||||
retrieve individual rows and columns.) The select list
|
||||
specification <literal>*</literal> means all columns that the table
|
||||
expression happens to provide. A select list can also select a
|
||||
subset of the available columns or even make calculations on the
|
||||
columns before retrieving them; see <xref
|
||||
linkend="queries-select-lists">. For example, if table1 has columns
|
||||
named a, b, and c (and perhaps others) you can make the following
|
||||
query:
|
||||
<programlisting>
|
||||
SELECT a, b + c FROM table1;
|
||||
</programlisting>
|
||||
(assuming that b and c are of a numeric data type).
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<literal>FROM table1</literal> is a particularly simple kind of
|
||||
table expression. In general, table expressions can be complex
|
||||
constructs of base tables, joins, and subqueries. But you can also
|
||||
omit the table expression entirely and use the SELECT command as a
|
||||
calculator:
|
||||
<programlisting>
|
||||
SELECT 3 * 4;
|
||||
</programlisting>
|
||||
This is more useful if the expressions in the select list return
|
||||
varying results. For example, you could call a function this way.
|
||||
<programlisting>
|
||||
SELECT random();
|
||||
</programlisting>
|
||||
</para>
|
||||
|
||||
<sect1 id="queries-table-expressions">
|
||||
<title>Table Expressions</title>
|
||||
|
||||
<para>
|
||||
A <firstterm>table expression</firstterm> specifies a table. The
|
||||
table expression contains a FROM clause that is optionally followed
|
||||
by WHERE, GROUP BY, and HAVING clauses. Trivial table expressions
|
||||
simply refer to a table on disk, a so-called base table, but more
|
||||
complex expressions can be used to modify or combine base tables in
|
||||
various ways.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The WHERE, GROUP BY, and HAVING clauses in the table expression
|
||||
specify a pipeline of successive transformations performed on the
|
||||
table derived in the FROM clause. The final transformed table that
|
||||
is derived provides the input rows used to derive output rows as
|
||||
specified by the select list of derived column value expressions.
|
||||
</para>
|
||||
|
||||
<sect2 id="queries-from">
|
||||
<title>FROM clause</title>
|
||||
|
||||
<para>
|
||||
The FROM clause derives a table from one or more other tables
|
||||
given in a comma-separated table reference list.
|
||||
<synopsis>
|
||||
FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_reference</replaceable> <optional>, ...</optional></optional>
|
||||
</synopsis>
|
||||
|
||||
A table reference may be a table name or a derived table such as a
|
||||
subquery, a table join, or complex combinations of these. If more
|
||||
than one table reference is listed in the FROM clause they are
|
||||
CROSS JOINed (see below) to form the derived table that may then
|
||||
be subject to transformations by the WHERE, GROUP BY, and HAVING
|
||||
clauses and is finally the result of the overall table expression.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
If a table reference is a simple table name and it is the
|
||||
supertable in a table inheritance hierarchy, rows of the table
|
||||
include rows from all of its subtable successors unless the
|
||||
keyword ONLY precedes the table name.
|
||||
</para>
|
||||
|
||||
<sect3 id="queries-join">
|
||||
<title>Joined Tables</title>
|
||||
|
||||
<para>
|
||||
A joined table is a table derived from two other (real or
|
||||
derived) tables according to the rules of the particular join
|
||||
type. INNER, OUTER, NATURAL, and CROSS JOIN are supported.
|
||||
</para>
|
||||
|
||||
<variablelist>
|
||||
<title>Join Types</title>
|
||||
|
||||
<varlistentry>
|
||||
<term>CROSS JOIN</term>
|
||||
|
||||
<listitem>
|
||||
<synopsis>
|
||||
<replaceable>T1</replaceable> CROSS JOIN <replaceable>T2</replaceable>
|
||||
</synopsis>
|
||||
|
||||
<para>
|
||||
For each combination of rows from
|
||||
<replaceable>T1</replaceable> and
|
||||
<replaceable>T2</replaceable> the derived table will contain a
|
||||
row consisting of all columns in <replaceable>T1</replaceable>
|
||||
followed by all columns in <replaceable>T2</replaceable>. If
|
||||
the tables have have N and M rows respectively, the joined
|
||||
table will have N * M rows. A cross join is essentially an
|
||||
<literal>INNER JOIN ON TRUE</literal>.
|
||||
</para>
|
||||
|
||||
<tip>
|
||||
<para>
|
||||
<literal>FROM <replaceable>T1</replaceable> CROSS JOIN
|
||||
<replaceable>T2</replaceable></literal> is equivalent to
|
||||
<literal>FROM <replaceable>T1</replaceable>,
|
||||
<replaceable>T2</replaceable></literal>.
|
||||
</para>
|
||||
</tip>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>Qualified JOINs</term>
|
||||
<listitem>
|
||||
|
||||
<synopsis>
|
||||
<replaceable>T1</replaceable> { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> } JOIN <replaceable>T2</replaceable> ON <replaceable>boolean expression</replaceable>
|
||||
<replaceable>T1</replaceable> { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> } JOIN <replaceable>T2</replaceable> USING ( <replaceable>join column list</replaceable> )
|
||||
</synopsis>
|
||||
|
||||
<para>
|
||||
The words <token>INNER</token> and <token>OUTER</token> are
|
||||
optional for all JOINs. <token>INNER</token> is the default;
|
||||
<token>LEFT</token>, <token>RIGHT</token>, and
|
||||
<token>FULL</token> are for OUTER JOINs only.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The <firstterm>join condition</firstterm> is specified in the
|
||||
ON or USING clause. (The meaning of the join condition
|
||||
depends on the particular join type; see below.) The ON
|
||||
clause takes a Boolean value expression of the same kind as is
|
||||
used in a WHERE clause. The USING clause takes a
|
||||
comma-separated list of column names, which the joined tables
|
||||
must have in common, and joins the tables on the equality of
|
||||
those columns as a set, resulting in a joined table having one
|
||||
column for each common column listed and all of the other
|
||||
columns from both tables. Thus, <literal>USING (a, b,
|
||||
c)</literal> is equivalent to <literal>ON (t1.a = t2.a AND
|
||||
t1.b = t2.b AND t1.c = t2.c)</literal> with the exception that
|
||||
if ON is used there will be two columns a, b, and c in the
|
||||
result, whereas with USING there will be only one of each.
|
||||
</para>
|
||||
|
||||
<variablelist>
|
||||
<varlistentry>
|
||||
<term>INNER JOIN</term>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
For each row R1 of T1, the joined table has a row for each
|
||||
row in T2 that satisfies the join condition with R1.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>LEFT OUTER JOIN</term>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
First, an INNER JOIN is performed. Then, for a row in T1
|
||||
that does not satisfy the join condition with any row in
|
||||
T2, a joined row is returned with NULL values in columns of
|
||||
T2. Thus, the joined table unconditionally has a row for each
|
||||
row in T1.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>RIGHT OUTER JOIN</term>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
This is like a left join, only that the result table will
|
||||
unconditionally have a row for each row in T2.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>FULL OUTER JOIN</term>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
First, an INNER JOIN is performed. Then, for each row in
|
||||
T1 that does not satisfy the join condition with any row in
|
||||
T2, a joined row is returned with null values in columns of
|
||||
T2. Also, for each row of T2 that does not satisfy the
|
||||
join condition with any row in T1, a joined row with null
|
||||
values in the columns of T1 is returned.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
</variablelist>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>NATURAL JOIN</term>
|
||||
|
||||
<listitem>
|
||||
<synopsis>
|
||||
<replaceable>T1</replaceable> NATURAL { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> JOIN <replaceable>T2</replaceable>
|
||||
</synopsis>
|
||||
<para>
|
||||
A natural join creates a joined table where every pair of matching
|
||||
column names between the two tables are merged into one column. The
|
||||
join specification is effectively a USING clause containing all the
|
||||
common column names and is otherwise like a Qualified JOIN.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
</variablelist>
|
||||
|
||||
<para>
|
||||
Joins of all types can be chained together or nested where either
|
||||
or both of <replaceable>T1</replaceable> and
|
||||
<replaceable>T2</replaceable> may be JOINed tables. Parenthesis
|
||||
can be used around JOIN clauses to control the join order which
|
||||
are otherwise left to right.
|
||||
</para>
|
||||
</sect3>
|
||||
|
||||
<sect3 id="queries-subqueries">
|
||||
<title>Subqueries</title>
|
||||
|
||||
<para>
|
||||
Subqueries specifying a derived table must be enclosed in
|
||||
parenthesis and <emphasis>must</emphasis> be named using an AS
|
||||
clause. (See <xref linkend="queries-table-aliases">.)
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
FROM (SELECT * FROM table1) AS alias_name
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
This example is equivalent to <literal>FROM table1 AS
|
||||
alias_name</literal>. Many subqueries can be written as table
|
||||
joins instead.
|
||||
</para>
|
||||
</sect3>
|
||||
|
||||
<sect3 id="queries-table-aliases">
|
||||
<title>Table and Column Aliases</title>
|
||||
|
||||
<para>
|
||||
A temporary name can be given to tables and complex table
|
||||
references to be used for references to the derived table in
|
||||
further processing. This is called a <firstterm>table
|
||||
alias</firstterm>.
|
||||
<synopsis>
|
||||
FROM <replaceable>table_reference</replaceable> AS <replaceable>alias</replaceable>
|
||||
</synopsis>
|
||||
Here, <replaceable>alias</replaceable> can be any regular
|
||||
identifier. The alias becomes the new name of the table
|
||||
reference for the current query -- it is no longer possible to
|
||||
refer to the table by the original name (if the table reference
|
||||
was an ordinary base table). Thus
|
||||
<programlisting>
|
||||
SELECT * FROM my_table AS m WHERE my_table.a > 5;
|
||||
</programlisting>
|
||||
is not valid SQL syntax. What will happen instead, as a
|
||||
<productname>Postgres</productname> extension, is that an implicit
|
||||
table reference is added to the FROM clause, so the query is
|
||||
processed as if it was written as
|
||||
<programlisting>
|
||||
SELECT * FROM my_table AS m, my_table WHERE my_table.a > 5;
|
||||
</programlisting>
|
||||
Table aliases are mainly for notational convenience, but it is
|
||||
necessary to use them when joining a table to itself, e.g.,
|
||||
<programlisting>
|
||||
SELECT * FROM my_table AS a CROSS JOIN my_table AS b ...
|
||||
</programlisting>
|
||||
Additionally, an alias is required if the table reference is a
|
||||
subquery.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Parenthesis are used to resolve ambiguities. The following
|
||||
statement will assign the alias <literal>b</literal> to the
|
||||
result of the join, unlike the previous example:
|
||||
<programlisting>
|
||||
SELECT * FROM (my_table AS a CROSS JOIN my_table) AS b ...
|
||||
</programlisting>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<synopsis>
|
||||
FROM <replaceable>table_reference</replaceable> <replaceable>alias</replaceable>
|
||||
</synopsis>
|
||||
This form is equivalent the previously treated one; the
|
||||
<token>AS</token> key word is noise.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<synopsis>
|
||||
FROM <replaceable>table_reference</replaceable> <optional>AS</optional> <replaceable>alias</replaceable> ( <replaceable>column1</replaceable> <optional>, <replaceable>column2</replaceable> <optional>, ...</optional></optional> )
|
||||
</synopsis>
|
||||
In addition to renaming the table as described above, the columns
|
||||
of the table are also given temporary names. If less column
|
||||
aliases are specified than the actual table has columns, the last
|
||||
columns are not renamed. This syntax is especially useful for
|
||||
self-joins or subqueries.
|
||||
</para>
|
||||
</sect3>
|
||||
|
||||
<sect3 id="queries-table-expression-examples">
|
||||
<title>Examples</title>
|
||||
|
||||
<para>
|
||||
<programlisting>
|
||||
FROM T1 INNER JOIN T2 USING (C)
|
||||
FROM T1 LEFT OUTER JOIN T2 USING (C)
|
||||
FROM (T1 RIGHT OUTER JOIN T2 ON (T1C1=T2C1)) AS DT1
|
||||
FROM (T1 FULL OUTER JOIN T2 USING (C)) AS DT1 (DT1C1, DT1C2)
|
||||
|
||||
FROM T1 NATURAL INNER JOIN T2
|
||||
FROM T1 NATURAL LEFT OUTER JOIN T2
|
||||
FROM T1 NATURAL RIGHT OUTER JOIN T2
|
||||
FROM T1 NATURAL FULL OUTER JOIN T2
|
||||
|
||||
FROM (SELECT * FROM T1) DT1 CROSS JOIN T2, T3
|
||||
FROM (SELECT * FROM T1) DT1, T2, T3
|
||||
</programlisting>
|
||||
|
||||
Above are some examples of joined tables and complex derived
|
||||
tables. Notice how the AS clause renames or names a derived
|
||||
table and how the optional comma-separated list of column names
|
||||
that follows gives names or renames the columns. The last two
|
||||
FROM clauses produce the same derived table from T1, T2, and T3.
|
||||
The AS keyword was omitted in naming the subquery as DT1. The
|
||||
keywords OUTER and INNER are noise that can be omitted also.
|
||||
</para>
|
||||
</sect3>
|
||||
|
||||
</sect2>
|
||||
|
||||
<sect2 id="queries-where">
|
||||
<title>WHERE clause</title>
|
||||
|
||||
<para>
|
||||
The syntax of the WHERE clause is
|
||||
<synopsis>
|
||||
WHERE <replaceable>search condition</replaceable>
|
||||
</synopsis>
|
||||
where <replaceable>search condition</replaceable> is any value
|
||||
expression as defined in <xref linkend="sql-expressions"> that
|
||||
returns a value of type <type>boolean</type>.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
After the processing of the FROM clause is done, each row of the
|
||||
derived table is checked against the search condition. If the
|
||||
result of the condition is true, the row is kept in the output
|
||||
table, otherwise (that is, if the result is false or NULL) it is
|
||||
discarded. The search condition typically references at least some
|
||||
column in the table generated in the FROM clause; this is not
|
||||
required, but otherwise the WHERE clause will be fairly useless.
|
||||
</para>
|
||||
|
||||
<note>
|
||||
<para>
|
||||
Before the implementation of the JOIN syntax, it was necessary to
|
||||
put the join condition of an inner join in the WHERE clause. For
|
||||
example, these table expressions are equivalent:
|
||||
<programlisting>
|
||||
FROM a, b WHERE a.id = b.id AND b.val > 5
|
||||
</programlisting>
|
||||
and
|
||||
<programlisting>
|
||||
FROM a INNER JOIN b ON (a.id = b.id) WHERE b.val > 5
|
||||
</programlisting>
|
||||
or perhaps even
|
||||
<programlisting>
|
||||
FROM a NATURAL JOIN b WHERE b.val > 5
|
||||
</programlisting>
|
||||
Which one of these you use is mainly a matter of style. The JOIN
|
||||
syntax in the FROM clause is probably not as portable to other
|
||||
products. For outer joins there is no choice in any case: they
|
||||
must be done in the FROM clause.
|
||||
</para>
|
||||
</note>
|
||||
|
||||
<programlisting>
|
||||
FROM FDT WHERE
|
||||
C1 > 5
|
||||
|
||||
FROM FDT WHERE
|
||||
C1 IN (1, 2, 3)
|
||||
FROM FDT WHERE
|
||||
C1 IN (SELECT C1 FROM T2)
|
||||
FROM FDT WHERE
|
||||
C1 IN (SELECT C3 FROM T2 WHERE C2 = FDT.C1 + 10)
|
||||
|
||||
FROM FDT WHERE
|
||||
C1 BETWEEN (SELECT C3 FROM T2 WHERE C2 = FDT.C1 + 10) AND 100
|
||||
|
||||
FROM FDT WHERE
|
||||
EXISTS (SELECT C1 FROM T2 WHERE C2 > FDT.C1)
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
In the examples above, FDT is the table derived in the FROM
|
||||
clause. Rows that do not meet the search condition of the where
|
||||
clause are eliminated from FDT. Notice the use of scalar
|
||||
subqueries as value expressions (C2 assumed UNIQUE). Just like
|
||||
any other query, the subqueries can employ complex table
|
||||
expressions. Notice how FDT is referenced in the subqueries.
|
||||
Qualifying C1 as FDT.C1 is only necessary if C1 is the name of a
|
||||
column in the derived input table of the subquery. Qualifying the
|
||||
column name adds clarity even when it is not needed. The column
|
||||
naming scope of an outer query extends into its inner queries.
|
||||
</para>
|
||||
</sect2>
|
||||
|
||||
|
||||
<sect2 id="queries-group">
|
||||
<title>GROUP BY and HAVING clauses</title>
|
||||
|
||||
<para>
|
||||
After passing the WHERE filter, the derived input table may be
|
||||
subject to grouping, using the GROUP BY clause, and elimination of
|
||||
group rows using the HAVING clause.
|
||||
</para>
|
||||
|
||||
<synopsis>
|
||||
SELECT <replaceable>select_list</replaceable> FROM ... <optional>WHERE ...</optional> GROUP BY <replaceable>grouping_column_reference</replaceable> <optional>, <replaceable>grouping_column_reference</replaceable></optional>...
|
||||
</synopsis>
|
||||
|
||||
<para>
|
||||
The GROUP BY clause is used to group together rows in a table that
|
||||
share the same values in all the columns listed. The order in
|
||||
which the columns are listed does not matter (as opposed to an
|
||||
ORDER BY clause). The purpose is to reduce each group of rows
|
||||
sharing common values into one group row that is representative of
|
||||
all rows in the group. This is done to eliminate redundancy in
|
||||
the output and/or obtain aggregates that apply to these groups.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Once a table is grouped, columns that are not included in the
|
||||
grouping cannot be referenced, except in aggregate expressions,
|
||||
since a specific value in those columns is ambiguous - which row
|
||||
in the group should it come from? The grouped-by columns can be
|
||||
referenced in select list column expressions since they have a
|
||||
known constant value per group. Aggregate functions on the
|
||||
ungrouped columns provide values that span the rows of a group,
|
||||
not of the whole table. For instance, a
|
||||
<function>sum(sales)</function> on a grouped table by product code
|
||||
gives the total sales for each product, not the total sales on all
|
||||
products. The aggregates of the ungrouped columns are
|
||||
representative of the group, whereas their individual values may
|
||||
not be.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Example:
|
||||
<programlisting>
|
||||
SELECT pid, p.name, (sum(s.units) * p.price) AS sales
|
||||
FROM products p LEFT JOIN sales s USING ( pid )
|
||||
GROUP BY pid, p.name, p.price;
|
||||
</programlisting>
|
||||
In this example, the columns pid, p.name, and p.price must be in
|
||||
the GROUP BY clause since they are referenced in the query select
|
||||
list. The column s.units does not have to be in the GROUP BY list
|
||||
since it is only used in an aggregate expression
|
||||
(<function>sum()</function>), which represents the group of sales
|
||||
of a product. For each product, a summary row is returned about
|
||||
all sales of the product.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
In strict SQL, GROUP BY can only group by columns of the source
|
||||
table but Postgres extends this to also allow GROUP BY to group by
|
||||
select columns in the query select list. Grouping by value
|
||||
expressions instead of simple column names is also allowed.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<synopsis>
|
||||
SELECT <replaceable>select_list</replaceable> FROM ... <optional>WHERE ...</optional> GROUP BY ... HAVING <replaceable>boolean_expression</replaceable>
|
||||
</synopsis>
|
||||
If a table has been grouped using a GROUP BY clause, but then only
|
||||
certain groups are of interest, the HAVING clause can be used,
|
||||
much like a WHERE clause, to eliminate groups from a grouped
|
||||
table. For some queries, Postgres allows a HAVING clause to be
|
||||
used without a GROUP BY and then it acts just like another WHERE
|
||||
clause, but the point in using HAVING that way is not clear. Since
|
||||
HAVING operates on groups, only grouped columns can be listed in
|
||||
the HAVING clause. If selection based on some ungrouped column is
|
||||
desired, it should be expressed in the WHERE clause.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Example:
|
||||
<programlisting>
|
||||
SELECT pid AS "Products",
|
||||
p.name AS "Over 5000",
|
||||
(sum(s.units) * (p.price - p.cost)) AS "Past Month Profit"
|
||||
FROM products p LEFT JOIN sales s USING ( pid )
|
||||
WHERE p.date > CURRENT_DATE - INTERVAL '4 weeks'
|
||||
GROUP BY pid, p.name, p.price, p.cost
|
||||
HAVING p.price > 5000;
|
||||
</programlisting>
|
||||
In the example above, the WHERE clause is selecting rows by a
|
||||
column that is not grouped, while the HAVING clause
|
||||
is selecting groups with a price greater than 5000.
|
||||
</para>
|
||||
</sect2>
|
||||
</sect1>
|
||||
|
||||
|
||||
<sect1 id="queries-select-lists">
|
||||
<title>Select Lists</title>
|
||||
|
||||
<para>
|
||||
The table expression in the <command>SELECT</command> command
|
||||
constructs an intermediate virtual table by possibly combining
|
||||
tables, views, eliminating rows, grouping, etc. This table is
|
||||
finally passed on to processing by the select list. The select
|
||||
list determines which <emphasis>columns</emphasis> of the
|
||||
intermediate table are retained. The simplest kind of select list
|
||||
is <literal>*</literal> which retains all columns that the table
|
||||
expression produces. Otherwise, a select list is a comma-separated
|
||||
list of value expressions (as defined in <xref
|
||||
linkend="sql-expressions">). For instance, it could be a list of
|
||||
column names:
|
||||
<programlisting>
|
||||
SELECT a, b, c FROM ...
|
||||
</programlisting>
|
||||
The columns names a, b, and c are either the actual names of the
|
||||
columns of table referenced in the FROM clause, or the aliases
|
||||
given to them as explained in <xref linkend="queries-table-aliases">.
|
||||
The name space available in the select list is the same as in the
|
||||
WHERE clause (unless grouping is used, in which case it is the same
|
||||
as in the HAVING clause). If more than one table has a column of
|
||||
the same name, the table name must also be given, as in
|
||||
<programlisting>
|
||||
SELECT tbl1.a, tbl2.b, tbl1.c FROM ...
|
||||
</programlisting>
|
||||
(see also <xref linkend="queries-where">).
|
||||
</para>
|
||||
|
||||
<para>
|
||||
If an arbitrary value expression is used in the select list, it
|
||||
conceptually adds a new virtual column to the returned table. The
|
||||
value expression is effectively evaluated once for each retrieved
|
||||
row with real values substituted for any column references. But
|
||||
the expressions in the select list do not have to reference any
|
||||
columns in the table expression of the FROM clause; they can be
|
||||
constant arithmetic expressions as well, for instance.
|
||||
</para>
|
||||
|
||||
<sect2 id="queries-column-labels">
|
||||
<title>Column Labels</title>
|
||||
|
||||
<para>
|
||||
The entries in the select list can be assigned names for further
|
||||
processing. The <quote>further processing</quote> in this case is
|
||||
an optional sort specification and the client application (e.g.,
|
||||
column headers for display). For example:
|
||||
<programlisting>
|
||||
SELECT a AS value, b + c AS sum FROM ...
|
||||
</programlisting>
|
||||
The AS key word can in fact be omitted.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
If no name is chosen, the system assigns a default. For simple
|
||||
column references, this is the name of the column. For function
|
||||
calls, this is the name of the function. For complex expressions,
|
||||
the system will generate a generic name.
|
||||
</para>
|
||||
|
||||
<note>
|
||||
<para>
|
||||
The naming of output columns here is different from that done in
|
||||
the FROM clause (see <xref linkend="queries-table-aliases">). This
|
||||
pipeline will in fact allow you to rename the same column twice,
|
||||
but the name chosen in the select list is the one that will be
|
||||
passed on.
|
||||
</para>
|
||||
</note>
|
||||
</sect2>
|
||||
|
||||
<sect2 id="queries-distinct">
|
||||
<title>DISTINCT</title>
|
||||
|
||||
<para>
|
||||
After the select list has been processed, the result table may
|
||||
optionally be subject to the elimination of duplicates. The
|
||||
<token>DISTINCT</token> key word is written directly after the
|
||||
<token>SELECT</token> to enable this:
|
||||
<synopsis>
|
||||
SELECT DISTINCT <replaceable>select_list</replaceable> ...
|
||||
</synopsis>
|
||||
(Instead of <token>DISTINCT</token> the word <token>ALL</token>
|
||||
can be used to select the default behavior of retaining all rows.)
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Obviously, two rows are considered distinct if they differ in at
|
||||
least one column value. NULLs are considered equal in this
|
||||
consideration.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Alternatively, an arbitrary expression can determine what rows are
|
||||
to be considered distinct:
|
||||
<synopsis>
|
||||
SELECT DISTINCT ON (<replaceable>expression</replaceable> <optional>, <replaceable>expression</replaceable> ...</optional>) <replaceable>select_list</replaceable> ...
|
||||
</synopsis>
|
||||
Here <replaceable>expression</replaceable> is an arbitrary value
|
||||
expression that is evaluated for all rows. A set of rows for
|
||||
which all the expressions is equal are considered duplicates and
|
||||
only the first row is kept in the output. Note that the
|
||||
<quote>first row</quote> of a set is unpredictable unless the
|
||||
query is sorted.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The DISTINCT ON clause is not part of the SQL standard and is
|
||||
sometimes considered bad style because of the indeterminate nature
|
||||
of its results. With judicious use of GROUP BY and subselects in
|
||||
FROM the construct can be avoided, but it is very often the much
|
||||
more convenient alternative.
|
||||
</para>
|
||||
</sect2>
|
||||
</sect1>
|
||||
|
||||
<sect1 id="queries-union">
|
||||
<title>Combining Queries</title>
|
||||
|
||||
<para>
|
||||
The results of two queries can be combined using the set operations
|
||||
union, intersection, and difference. The syntax is
|
||||
<synopsis>
|
||||
<replaceable>query1</replaceable> UNION <optional>ALL</optional> <replaceable>query2</replaceable>
|
||||
<replaceable>query1</replaceable> INTERSECT <optional>ALL</optional> <replaceable>query2</replaceable>
|
||||
<replaceable>query1</replaceable> EXCEPT <optional>ALL</optional> <replaceable>query2</replaceable>
|
||||
</synopsis>
|
||||
<replaceable>query1</replaceable> and
|
||||
<replaceable>query2</replaceable> are queries that can use any of
|
||||
the features discussed up to this point. Set operations can also
|
||||
be nested and chained, for example
|
||||
<synopsis>
|
||||
<replaceable>query1</replaceable> UNION <replaceable>query2</replaceable> UNION <replaceable>query3</replaceable>
|
||||
</synopsis>
|
||||
which really says
|
||||
<synopsis>
|
||||
(<replaceable>query1</replaceable> UNION <replaceable>query2</replaceable>) UNION <replaceable>query3</replaceable>
|
||||
</synopsis>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<command>UNION</command> effectively appends the result of
|
||||
<replaceable>query2</replaceable> to the result of
|
||||
<replaceable>query1</replaceable> (although there is no guarantee
|
||||
that this is the order in which the rows are actually returned) and
|
||||
eliminates all duplicate rows, in the sense of DISTINCT, unless ALL
|
||||
is specified.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<command>INTERSECT</command> returns all rows that are both in the
|
||||
result of <replaceable>query1</replaceable> and in the result of
|
||||
<replaceable>query2</replaceable>. Duplicate rows are eliminated
|
||||
unless ALL is specified.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<command>EXCEPT</command> returns all rows that are in the result
|
||||
of <replaceable>query1</replaceable> but not in the result of
|
||||
<replaceable>query2</replaceable>. Again, duplicates are
|
||||
eliminated unless ALL is specified.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
In order to calculate the union, intersection, or difference of two
|
||||
queries, the two queries must be <quote>union compatible</quote>,
|
||||
which means that they both return the same number of columns, and
|
||||
that the corresponding columns have compatible data types, as
|
||||
described in <xref linkend="typeconv-union-case">.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
|
||||
<sect1 id="queries-order">
|
||||
<title>Sorting Rows</title>
|
||||
|
||||
<para>
|
||||
After a query has produced an output table (after the select list
|
||||
has been processed) it can optionally be sorted. If sorting is not
|
||||
chosen, the rows will be returned in random order. The actual
|
||||
order in that case will depend on the scan and join plan types and
|
||||
the order on disk, but it must not be relied on. A particular
|
||||
ordering can only be guaranteed if the sort step is explicitly
|
||||
chosen.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The ORDER BY clause specifies the sort order:
|
||||
<synopsis>
|
||||
SELECT <replaceable>select_list</replaceable> FROM <replaceable>table_expression</replaceable> ORDER BY <replaceable>column1</replaceable> <optional>ASC | DESC</optional> <optional>, <replaceable>column2</replaceable> <optional>ASC | DESC</optional> ...</optional>
|
||||
</synopsis>
|
||||
<replaceable>column1</replaceable>, etc., refer to select list
|
||||
columns: It can either be the name of a column (either the
|
||||
explicit column label or default name, as explained in <xref
|
||||
linkend="queries-column-labels">) or the number of a column. Some
|
||||
examples:
|
||||
<programlisting>
|
||||
SELECT a, b FROM table1 ORDER BY a;
|
||||
SELECT a + b AS sum, c FROM table1 ORDER BY sum;
|
||||
SELECT a, sum(b) FROM table1 GROUP BY a ORDER BY 1;
|
||||
</programlisting>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
As an extension to the SQL standard, Postgres also allows ordering
|
||||
by arbitrary expressions:
|
||||
<programlisting>
|
||||
SELECT a, b FROM table1 ORDER BY a + b;
|
||||
</programlisting>
|
||||
References to column names in the FROM clause that are renamed in
|
||||
the select list are also allowed:
|
||||
<programlisting>
|
||||
SELECT a AS b FROM table1 ORDER BY a;
|
||||
</programlisting>
|
||||
But this does not work in queries involving UNION, INTERSECT, or
|
||||
EXCEPT, and is not portable.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Each column specification may be followed by an optional ASC or
|
||||
DESC to set the sort direction. ASC is default. Ascending order
|
||||
puts smaller values first, where <quote>smaller</quote> is defined
|
||||
in terms of the <literal><</literal> operator. Similarly,
|
||||
descending order is determined with the <literal>></literal>
|
||||
operator.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
If more than one sort column is specified the later entries are
|
||||
used to sort the rows that are equal under the order imposed by the
|
||||
earlier sort specifications.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
<sect1 id="queries-limit">
|
||||
<title>LIMIT and OFFSET</title>
|
||||
|
||||
<synopsis>
|
||||
SELECT <replaceable>select_list</replaceable> FROM <replaceable>table_expression</replaceable> <optional>ORDER BY <replaceable>sort_spec</replaceable></optional> <optional>LIMIT { <replaceable>number</replaceable> | ALL }</optional> <optional>OFFSET <replaceable>number</replaceable></optional>
|
||||
</synopsis>
|
||||
|
||||
<para>
|
||||
LIMIT allows you to retrieve just a portion of the rows that are
|
||||
generated by the rest of the query. If a limit count is given, no
|
||||
more than that many rows will be returned. If an offset is given,
|
||||
that many rows will be skipped before starting to return rows.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
When using LIMIT, it is a good idea to use an ORDER BY clause that
|
||||
constrains the result rows into a unique order. Otherwise you will
|
||||
get an unpredictable subset of the query's rows---you may be asking
|
||||
for the tenth through twentieth rows, but tenth through twentieth
|
||||
in what ordering? The ordering is unknown, unless you specified
|
||||
ORDER BY.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The query optimizer takes LIMIT into account when generating a
|
||||
query plan, so you are very likely to get different plans (yielding
|
||||
different row orders) depending on what you give for LIMIT and
|
||||
OFFSET. Thus, using different LIMIT/OFFSET values to select
|
||||
different subsets of a query result <emphasis>will give
|
||||
inconsistent results</emphasis> unless you enforce a predictable
|
||||
result ordering with ORDER BY. This is not a bug; it is an
|
||||
inherent consequence of the fact that SQL does not promise to
|
||||
deliver the results of a query in any particular order unless ORDER
|
||||
BY is used to constrain the order.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
</chapter>
|
@ -1,5 +1,5 @@
|
||||
<!--
|
||||
$Header: /cvsroot/pgsql/doc/src/sgml/syntax.sgml,v 1.36 2001/01/21 22:02:01 petere Exp $
|
||||
$Header: /cvsroot/pgsql/doc/src/sgml/syntax.sgml,v 1.37 2001/01/22 23:34:33 petere Exp $
|
||||
-->
|
||||
|
||||
<chapter id="sql-syntax">
|
||||
@ -743,7 +743,7 @@ CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> )
|
||||
</para>
|
||||
|
||||
<para>
|
||||
In addition to this list, there are a number of contructs that can
|
||||
In addition to this list, there are a number of constructs that can
|
||||
be classified as an expression but do not follow any general syntax
|
||||
rules. These generally have the semantics of a function or
|
||||
operator and are explained in the appropriate location in <xref
|
||||
@ -763,15 +763,15 @@ CAST ( '<replaceable>string</replaceable>' AS <replaceable>type</replaceable> )
|
||||
<para>
|
||||
A column can be referenced in the form:
|
||||
<synopsis>
|
||||
<replaceable>corelation</replaceable>.<replaceable>columnname</replaceable> `['<replaceable>subscript</replaceable>`]'
|
||||
<replaceable>correlation</replaceable>.<replaceable>columnname</replaceable> `['<replaceable>subscript</replaceable>`]'
|
||||
</synopsis>
|
||||
|
||||
<replaceable>corelation</replaceable> is either the name of a
|
||||
<replaceable>correlation</replaceable> is either the name of a
|
||||
table, an alias for a table defined by means of a FROM clause, or
|
||||
the keyword <literal>NEW</literal> or <literal>OLD</literal>.
|
||||
(NEW and OLD can only appear in the action portion of a rule,
|
||||
while other corelation names can be used in any SQL statement.)
|
||||
The corelation name can be omitted if the column name is unique
|
||||
while other correlation names can be used in any SQL statement.)
|
||||
The correlation name can be omitted if the column name is unique
|
||||
across all the tables being used in the current query. If
|
||||
<replaceable>column</replaceable> is of an array type, then the
|
||||
optional <replaceable>subscript</replaceable> selects a specific
|
||||
@ -895,8 +895,8 @@ sqrt(2)
|
||||
The precedence and associativity of the operators is hard-wired
|
||||
into the parser. Most operators have the same precedence and are
|
||||
left-associative. This may lead to non-intuitive behavior; for
|
||||
example the boolean operators "<" and ">" have a different
|
||||
precedence than the boolean operators "<=" and ">=". Also,
|
||||
example the Boolean operators "<" and ">" have a different
|
||||
precedence than the Boolean operators "<=" and ">=". Also,
|
||||
you will sometimes need to add parentheses when using combinations
|
||||
of binary and unary operators. For instance
|
||||
<programlisting>
|
||||
@ -917,7 +917,7 @@ SELECT (5 &) ~ 6;
|
||||
<tgroup cols="2">
|
||||
<thead>
|
||||
<row>
|
||||
<entry>OperatorElement</entry>
|
||||
<entry>Operator/Element</entry>
|
||||
<entry>Associativity</entry>
|
||||
<entry>Description</entry>
|
||||
</row>
|
||||
@ -1057,526 +1057,6 @@ SELECT (5 &) ~ 6;
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
|
||||
<sect1 id="sql-table-expressions">
|
||||
<title>Table Expressions</title>
|
||||
|
||||
<para>
|
||||
A <firstterm>table expression</firstterm> specifies a table. The
|
||||
table expression contains a FROM clause that is optionally followed
|
||||
by WHERE, GROUP BY, and HAVING clauses. Trivial table expressions
|
||||
simply refer to a table on disk, a so-called base table, but more
|
||||
complex expressions can be used to modify or combine base tables in
|
||||
various ways.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The general syntax of the <command>SELECT</command> command is
|
||||
<synopsis>
|
||||
SELECT <replaceable>select_list</replaceable> <replaceable>table_expression</replaceable>
|
||||
</synopsis>
|
||||
|
||||
The <replaceable>select_list</replaceable> is a comma separated
|
||||
list of <replaceable>value expressions</replaceable> as defined in
|
||||
<xref linkend="sql-expressions"> that specify the derived columns
|
||||
of the query output table. Column names in the derived table that
|
||||
is the result of the <replaceable>table_expression</replaceable>
|
||||
can be used in the <replaceable>value expression</replaceable>s of
|
||||
the <replaceable>select_list</replaceable>.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The WHERE, GROUP BY, and HAVING clauses in the table expression
|
||||
specify a pipeline of successive transformations performed on the
|
||||
table derived in the FROM clause. The final transformed table that
|
||||
is derived provides the input rows used to derive output rows as
|
||||
specified by the select list of derived column value expressions.
|
||||
</para>
|
||||
|
||||
<sect2>
|
||||
<title>FROM clause</title>
|
||||
|
||||
<para>
|
||||
The FROM clause derives a table from one or more other tables
|
||||
given in a comma-separated table reference list.
|
||||
<synopsis>
|
||||
FROM <replaceable>table_reference</replaceable> <optional>, <replaceable>table_reference</replaceable> <optional>, ...</optional></optional>
|
||||
</synopsis>
|
||||
|
||||
A table reference may be a table name or a derived table such as a
|
||||
subquery, a table join, or complex combinations of these. If more
|
||||
than one table reference is listed in the FROM clause they are
|
||||
CROSS JOINed (see below) to form the derived table that may then
|
||||
be subject to transformations by the WHERE, GROUP BY, and HAVING
|
||||
clauses and is finally the result of the overall table expression.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
If a table reference is a simple table name and it is the
|
||||
supertable in a table inheritance hierarchy, rows of the table
|
||||
include rows from all of its subtable successors unless the
|
||||
keyword ONLY precedes the table name.
|
||||
</para>
|
||||
|
||||
<sect3>
|
||||
<title>Joined Tables</title>
|
||||
|
||||
<para>
|
||||
A joined table is a table derived from two other (real or
|
||||
derived) tables according to the rules of the particular join
|
||||
type. INNER, OUTER, NATURAL, and CROSS JOIN are supported.
|
||||
</para>
|
||||
|
||||
<variablelist>
|
||||
<title>Join Types</title>
|
||||
|
||||
<varlistentry>
|
||||
<term>CROSS JOIN</term>
|
||||
|
||||
<listitem>
|
||||
<synopsis>
|
||||
<replaceable>T1</replaceable> CROSS JOIN <replaceable>T2</replaceable>
|
||||
</synopsis>
|
||||
|
||||
<para>
|
||||
For each combination of rows from
|
||||
<replaceable>T1</replaceable> and
|
||||
<replaceable>T2</replaceable> the derived table will contain a
|
||||
row consisting of all columns in <replaceable>T1</replaceable>
|
||||
followed by all columns in <replaceable>T2</replaceable>. If
|
||||
the tables have have N and M rows respectively, the joined
|
||||
table will have N * M rows. A cross join is essentially an
|
||||
<literal>INNER JOIN ON TRUE</literal>.
|
||||
</para>
|
||||
|
||||
<tip>
|
||||
<para>
|
||||
<literal>FROM <replaceable>T1</replaceable> CROSS JOIN
|
||||
<replaceable>T2</replaceable></literal> is equivalent to
|
||||
<literal>FROM <replaceable>T1</replaceable>,
|
||||
<replaceable>T2</replaceable></literal>.
|
||||
</para>
|
||||
</tip>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>Qualified JOINs</term>
|
||||
<listitem>
|
||||
|
||||
<synopsis>
|
||||
<replaceable>T1</replaceable> { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> } JOIN <replaceable>T2</replaceable> ON <replaceable>boolean expression</replaceable>
|
||||
<replaceable>T1</replaceable> { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> } JOIN <replaceable>T2</replaceable> USING ( <replaceable>join column list</replaceable> )
|
||||
</synopsis>
|
||||
|
||||
<para>
|
||||
The words <token>INNER</token> and <token>OUTER</token> are
|
||||
optional for all JOINs. <token>INNER</token> is the default;
|
||||
<token>LEFT</token>, <token>RIGHT</token>, and
|
||||
<token>FULL</token> are for OUTER JOINs only.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The <firstterm>join condition</firstterm> is specified in the
|
||||
ON or USING clause. (The meaning of the join condition
|
||||
depends on the particular join type; see below.) The ON
|
||||
clause takes a boolean value expression of the same kind as is
|
||||
used in a WHERE clause. The USING clause takes a
|
||||
comma-separated list of column names, which the joined tables
|
||||
must have in common, and joins the tables on the equality of
|
||||
those columns as a set, resulting in a joined table having one
|
||||
column for each common column listed and all of the other
|
||||
columns from both tables. Thus, <literal>USING (a, b,
|
||||
c)</literal> is equivalent to <literal>ON (t1.a = t2.a AND
|
||||
t1.b = t2.b AND t1.c = t2.c)</literal> with the exception that
|
||||
if ON is used there will be two columns a, b, and c in the
|
||||
result, whereas with USING there will be only one of each.
|
||||
</para>
|
||||
|
||||
<variablelist>
|
||||
<varlistentry>
|
||||
<term>INNER JOIN</term>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
For each row R1 of T1, the joined table has a row for each
|
||||
row in T2 that satisfies the join condition with R1.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>LEFT OUTER JOIN</term>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
First, an INNER JOIN is performed. Then, for a row in T1
|
||||
that does not satisfy the join condition with any row in
|
||||
T2, a joined row is returned with NULL values in columns of
|
||||
T2. Thus, the joined table unconditionally has a row for each
|
||||
row in T1.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>RIGHT OUTER JOIN</term>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
This is like a left join, only that the result table will
|
||||
unconditionally have a row for each row in T2.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>FULL OUTER JOIN</term>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
First, an INNER JOIN is performed. Then, for each row in
|
||||
T1 that does not satisfy the join condition with any row in
|
||||
T2, a joined row is returned with null values in columns of
|
||||
T2. Also, for each row of T2 that does not satisfy the
|
||||
join condition with any row in T1, a joined row with null
|
||||
values in the columns of T1 is returned.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
</variablelist>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>NATURAL JOIN</term>
|
||||
|
||||
<listitem>
|
||||
<synopsis>
|
||||
<replaceable>T1</replaceable> NATURAL { <optional>INNER</optional> | { LEFT | RIGHT | FULL } <optional>OUTER</optional> JOIN <replaceable>T2</replaceable>
|
||||
</synopsis>
|
||||
<para>
|
||||
A natural join creates a joined table where every pair of matching
|
||||
column names between the two tables are merged into one column. The
|
||||
join specification is effectively a USING clause containing all the
|
||||
common column names and is otherwise like a Qualified JOIN.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
</variablelist>
|
||||
|
||||
<para>
|
||||
Joins of all types can be chained together or nested where either
|
||||
or both of <replaceable>T1</replaceable> and
|
||||
<replaceable>T2</replaceable> may be JOINed tables. Parenthesis
|
||||
can be used around JOIN clauses to control the join order which
|
||||
are otherwise left to right.
|
||||
</para>
|
||||
</sect3>
|
||||
|
||||
<sect3 id="sql-subqueries">
|
||||
<title>Subqueries</title>
|
||||
|
||||
<para>
|
||||
Subqueries specifying a derived table must be enclosed in
|
||||
parenthesis and <emphasis>must</emphasis> be named using an AS
|
||||
clause. (See <xref linkend="sql-table-aliases">.)
|
||||
</para>
|
||||
|
||||
<programlisting>
|
||||
FROM (SELECT * FROM table1) AS alias_name
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
This example is equivalent to <literal>FROM table1 AS
|
||||
alias_name</literal>. Many subquieries can be written as table
|
||||
joins instead.
|
||||
</para>
|
||||
</sect3>
|
||||
|
||||
<sect3 id="sql-table-aliases">
|
||||
<title>Table and Column Aliases</title>
|
||||
|
||||
<para>
|
||||
A temporary name can be given to tables and complex table
|
||||
references to be used for references to the derived table in
|
||||
further processing. This is called a <firstterm>table
|
||||
alias</firstterm>.
|
||||
<synopsis>
|
||||
FROM <replaceable>table_reference</replaceable> AS <replaceable>alias</replaceable>
|
||||
</synopsis>
|
||||
Here, <replaceable>alias</replaceable> can be any regular
|
||||
identifier. The alias becomes the new name of the table
|
||||
reference for the current query -- it is no longer possible to
|
||||
refer to the table by the original name (if the table reference
|
||||
was an ordinary base table). Thus
|
||||
<programlisting>
|
||||
SELECT * FROM my_table AS m WHERE my_table.a > 5;
|
||||
</programlisting>
|
||||
is not valid SQL syntax. What will happen instead, as a
|
||||
<productname>Postgres</productname> extension, is that an implict
|
||||
table reference is added to the FROM clause, so the query is
|
||||
processed as if it was written as
|
||||
<programlisting>
|
||||
SELECT * FROM my_table AS m, my_table WHERE my_table.a > 5;
|
||||
</programlisting>
|
||||
Table aliases are mainly for notational convenience, but it is
|
||||
necessary to use them when joining a table to itself, e.g.,
|
||||
<programlisting>
|
||||
SELECT * FROM my_table AS a CROSS JOIN my_table AS b ...
|
||||
</programlisting>
|
||||
Additionally, an alias is required if the table reference is a
|
||||
subquery.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Parenthesis are used to resolve ambiguities. The following
|
||||
statement will assign the alias <literal>b</literal> to the
|
||||
result of the join, unlike the previous example:
|
||||
<programlisting>
|
||||
SELECT * FROM (my_table AS a CROSS JOIN my_table) AS b ...
|
||||
</programlisting>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<synopsis>
|
||||
FROM <replaceable>table_reference</replaceable> <replaceable>alias</replaceable>
|
||||
</synopsis>
|
||||
This form is equivalent the previously treated one; the
|
||||
<token>AS</token> key word is noise.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<synopsis>
|
||||
FROM <replaceable>table_reference</replaceable> <optional>AS</optional> <replaceable>alias</replaceable> ( <replaceable>column1</replaceable> <optional>, <replaceable>column2</replaceable> <optional>, ...</optional></optional> )
|
||||
</synopsis>
|
||||
In addition to renaming the table as described above, the columns
|
||||
of the table are also given temporary names. If less column
|
||||
aliases are specified than the actual table has columns, the last
|
||||
columns are not renamed. This syntax is especially useful for
|
||||
self-joins or subqueries.
|
||||
</para>
|
||||
</sect3>
|
||||
|
||||
<sect3>
|
||||
<title>Examples</title>
|
||||
|
||||
<para>
|
||||
<programlisting>
|
||||
FROM T1 INNER JOIN T2 USING (C)
|
||||
FROM T1 LEFT OUTER JOIN T2 USING (C)
|
||||
FROM (T1 RIGHT OUTER JOIN T2 ON (T1C1=T2C1)) AS DT1
|
||||
FROM (T1 FULL OUTER JOIN T2 USING (C)) AS DT1 (DT1C1, DT1C2)
|
||||
|
||||
FROM T1 NATURAL INNER JOIN T2
|
||||
FROM T1 NATURAL LEFT OUTER JOIN T2
|
||||
FROM T1 NATURAL RIGHT OUTER JOIN T2
|
||||
FROM T1 NATURAL FULL OUTER JOIN T2
|
||||
|
||||
FROM (SELECT * FROM T1) DT1 CROSS JOIN T2, T3
|
||||
FROM (SELECT * FROM T1) DT1, T2, T3
|
||||
</programlisting>
|
||||
|
||||
Above are some examples of joined tables and complex derived
|
||||
tables. Notice how the AS clause renames or names a derived
|
||||
table and how the optional comma-separated list of column names
|
||||
that follows gives names or renames the columns. The last two
|
||||
FROM clauses produce the same derived table from T1, T2, and T3.
|
||||
The AS keyword was omitted in naming the subquery as DT1. The
|
||||
keywords OUTER and INNER are noise that can be omitted also.
|
||||
</para>
|
||||
</sect3>
|
||||
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>WHERE clause</title>
|
||||
|
||||
<para>
|
||||
The syntax of the WHERE clause is
|
||||
<synopsis>
|
||||
WHERE <replaceable>search condition</replaceable>
|
||||
</synopsis>
|
||||
where <replaceable>search condition</replaceable> is any value
|
||||
expression as defined in <xref linkend="sql-expressions"> that
|
||||
returns a value of type <type>boolean</type>.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
After the processing of the FROM clause is done, each row of the
|
||||
derived table is checked against the search condition. If the
|
||||
result of the condition is true, the row is kept in the output
|
||||
table, otherwise (that is, if the result is false or NULL) it is
|
||||
discared. The search condition typically references at least some
|
||||
column in the table generated in the FROM clause; this is not
|
||||
required, but otherwise the WHERE clause will be fairly useless.
|
||||
</para>
|
||||
|
||||
<note>
|
||||
<para>
|
||||
Before the implementation of the JOIN syntax, it was necessary to
|
||||
put the join condition of an inner join in the WHERE clause. For
|
||||
example, these table expressions are equivalent:
|
||||
<programlisting>
|
||||
FROM a, b WHERE a.id = b.id AND b.val > 5
|
||||
</programlisting>
|
||||
and
|
||||
<programlisting>
|
||||
FROM a INNER JOIN b ON (a.id = b.id) WHERE b.val > 5
|
||||
</programlisting>
|
||||
or perhaps even
|
||||
<programlisting>
|
||||
FROM a NATURAL JOIN b WHERE b.val > 5
|
||||
</programlisting>
|
||||
Which one of these you use is mainly a matter of style. The JOIN
|
||||
syntax in the FROM clause is probably not as portable to other
|
||||
products. For outer joins there is no choice in any case: they
|
||||
must be done in the FROM clause.
|
||||
</para>
|
||||
</note>
|
||||
|
||||
<programlisting>
|
||||
FROM FDT WHERE
|
||||
C1 > 5
|
||||
|
||||
FROM FDT WHERE
|
||||
C1 IN (1, 2, 3)
|
||||
FROM FDT WHERE
|
||||
C1 IN (SELECT C1 FROM T2)
|
||||
FROM FDT WHERE
|
||||
C1 IN (SELECT C3 FROM T2 WHERE C2 = FDT.C1 + 10)
|
||||
|
||||
FROM FDT WHERE
|
||||
C1 BETWEEN (SELECT C3 FROM T2 WHERE C2 = FDT.C1 + 10) AND 100
|
||||
|
||||
FROM FDT WHERE
|
||||
EXISTS (SELECT C1 FROM T2 WHERE C2 > FDT.C1)
|
||||
</programlisting>
|
||||
|
||||
<para>
|
||||
In the examples above, FDT is the table derived in the FROM
|
||||
clause. Rows that do not meet the search condition of the where
|
||||
clause are eliminated from FDT. Notice the use of scalar
|
||||
subqueries as value expressions (C2 assumed UNIQUE). Just like
|
||||
any other query, the subqueries can employ complex table
|
||||
expressions. Notice how FDT is referenced in the subqueries.
|
||||
Qualifying C1 as FDT.C1 is only necessary if C1 is the name of a
|
||||
column in the derived input table of the subquery. Qualifying the
|
||||
column name adds clarity even when it is not needed. The column
|
||||
naming scope of an outer query extends into its inner queries.
|
||||
</para>
|
||||
</sect2>
|
||||
|
||||
<!-- This is confusing as heck. Make it simpler. -->
|
||||
|
||||
<![IGNORE[
|
||||
|
||||
<sect2>
|
||||
<title>GROUP BY and HAVING clauses</title>
|
||||
|
||||
<para>
|
||||
After passing the WHERE filter, the derived input table may be
|
||||
subject to grouping, using the GROUP BY clause, and elimination of
|
||||
group rows using the HAVING clause. (The HAVING clause can also
|
||||
be used without GROUP BY, but then it is equivalent to the WHERE
|
||||
clause.)
|
||||
</para>
|
||||
|
||||
<para>
|
||||
In standard SQL, the GROUP BY clause takes a list of column names,
|
||||
that specify a subrow, from the derived input table produced by
|
||||
the previous WHERE or FROM clause and partitions the table into
|
||||
groups with duplicate subrows such that within a column of the
|
||||
subrow, no column value is distinct from other column values. The
|
||||
resulting derived input table is a special type of table, called a
|
||||
grouped table, which still contains all columns but only
|
||||
references to columns of the grouped subrow, and group aggregates,
|
||||
derived from any of the columns, may appear in derived column
|
||||
value expressions in the query select list. When deriving an
|
||||
output table from a query using a grouped input table, each output
|
||||
row is derived from a corresponding group/partition of the grouped
|
||||
table. Aggregates computed in a derived output column are
|
||||
aggregates on the current partition/group of the grouped input
|
||||
table being processed. Only one output table row results per
|
||||
group/partition of the grouped input table.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Postgres has extended the GROUP BY clause to allow some
|
||||
non-standard, but useful behavior. Derived output columns, given
|
||||
names using an AS clause in the query select list, may appear in
|
||||
the GROUP BY clause in combination with, or instead of, the input
|
||||
table column names. Tables may also be grouped by arbitrary
|
||||
expressions. If output table column names appear in the GROUP BY
|
||||
list, then the input table is augmented with additional columns of
|
||||
the output table columns listed in the GROUP BY clause. The value
|
||||
for each row in the additional columns is computed from the value
|
||||
expression that defines the output column in the query select
|
||||
list. The augmented input table is grouped by the column names
|
||||
listed in the GROUP BY clause. The resulting grouped augmented
|
||||
input table is then treated according standard SQL GROUP BY
|
||||
semantics. Only the columns of the unaugmented input table in the
|
||||
grouped subrow (if any), and group aggregates, derived from any of
|
||||
the columns of the unaugmented input table, may be referenced in
|
||||
the value expressions of the derived output columns of the
|
||||
query. Output columns derived with an aggregate expression cannot
|
||||
be named in the GROUP BY clause.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
A HAVING clause may optionally follow a GROUP BY clause. The
|
||||
HAVING clause selects or eliminates, depending on which
|
||||
perspective is taken, groups from the grouped table derived in the
|
||||
GROUP BY clause that precedes it. The search condition is the
|
||||
same type of expression allowed in a WHERE clause and may
|
||||
reference any of the input table column names in the grouped
|
||||
subrow, but may not reference any others or any named output
|
||||
columns. When the search condition results in TRUE the group is
|
||||
retained, otherwise the group is eliminated.
|
||||
</para>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>ORDER BY and LIMIT clauses</title>
|
||||
|
||||
<para>
|
||||
ORDER BY and LIMIT clauses are not clauses of a table expression.
|
||||
They are optional clauses that may follow a query expression and
|
||||
are discussed here because they are commonly used with the
|
||||
clauses above.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
ORDER BY takes a comma-separated list of columns and performs a
|
||||
cascaded ordering of the table by the columns listed, in the
|
||||
order listed. The keyword DESC or ASC may follow any column name
|
||||
or expression in the list to specify descending or ascending
|
||||
ordering, respectively. Ascending order is the default. The
|
||||
ORDER BY clause conforms to the SQL standard but is extended in
|
||||
Postgres. Postgres allows ORDER BY to reference both output
|
||||
table columns, as named in the select list using the AS clause,
|
||||
and input table columns, as given by the table derived in the
|
||||
FROM clause and other previous clauses. Postgres also extends
|
||||
ORDER BY to allow ordering by arbitrary expressions. If used in a
|
||||
query with a GROUP BY clause, the ORDER BY clause can only
|
||||
reference output table column names and grouped input table
|
||||
columns.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
LIMIT is not a standard SQL clause. LIMIT is a Postgres
|
||||
extension that limits the number of rows that will be returned
|
||||
from a query. The rows returned by a query using the LIMIT
|
||||
clause are random if no ORDER BY clause is specified. A LIMIT
|
||||
clause may optionally be followed by an OFFSET clause which
|
||||
specifies a number of rows to be skipped in the output table
|
||||
before returning the number of rows specified in the LIMIT
|
||||
clause.
|
||||
</para>
|
||||
</sect2>
|
||||
]]>
|
||||
</sect1>
|
||||
|
||||
</chapter>
|
||||
|
||||
<!-- Keep this comment at the end of the file
|
||||
|
@ -1,5 +1,5 @@
|
||||
<!--
|
||||
$Header: /cvsroot/pgsql/doc/src/sgml/Attic/user.sgml,v 1.24 2001/01/13 23:58:55 petere Exp $
|
||||
$Header: /cvsroot/pgsql/doc/src/sgml/Attic/user.sgml,v 1.25 2001/01/22 23:34:33 petere Exp $
|
||||
-->
|
||||
|
||||
<book id="user">
|
||||
@ -44,6 +44,7 @@ $Header: /cvsroot/pgsql/doc/src/sgml/Attic/user.sgml,v 1.24 2001/01/13 23:58:55
|
||||
|
||||
&intro;
|
||||
&syntax;
|
||||
&queries;
|
||||
&datatype;
|
||||
&func;
|
||||
&typeconv;
|
||||
|
Loading…
x
Reference in New Issue
Block a user