If we restrict unique constraints on partitioned tables so that they
must always include the partition key, then our standard approach to
unique indexes already works --- each unique key is forced to exist
within a single partition, so enforcing the unique restriction in each
index individually is enough to have it enforced globally. Therefore we
can implement unique indexes on partitions by simply removing a few
restrictions (and adding others.)
Discussion: https://postgr.es/m/20171222212921.hi6hg6pem2w2t36z@alvherre.pgsql
Discussion: https://postgr.es/m/20171229230607.3iib6b62fn3uaf47@alvherre.pgsql
Reviewed-by: Simon Riggs, Jesper Pedersen, Peter Eisentraut, Jaime
Casanova, Amit Langote
Formerly, DTYPE_REC was used only for variables declared as "record";
variables of named composite types used DTYPE_ROW, which is faster for
some purposes but much less flexible. In particular, the ROW code paths
are entirely incapable of dealing with DDL-caused changes to the number
or data types of the columns of a row variable, once a particular plpgsql
function has been parsed for the first time in a session. And, since the
stored representation of a ROW isn't a tuple, there wasn't any easy way
to deal with variables of domain-over-composite types, since the domain
constraint checking code would expect the value to be checked to be a
tuple. A lesser, but still real, annoyance is that ROW format cannot
represent a true NULL composite value, only a row of per-field NULL
values, which is not exactly the same thing.
Hence, switch to using DTYPE_REC for all composite-typed variables,
whether "record", named composite type, or domain over named composite
type. DTYPE_ROW remains but is used only for its native purpose, to
represent a fixed-at-compile-time list of variables, for instance the
targets of an INTO clause.
To accomplish this without taking significant performance losses, introduce
infrastructure that allows storing composite-type variables as "expanded
objects", similar to the "expanded array" infrastructure introduced in
commit 1dc5ebc90. A composite variable's value is thereby kept (most of
the time) in the form of separate Datums, so that field accesses and
updates are not much more expensive than they were in the ROW format.
This holds the line, more or less, on performance of variables of named
composite types in field-access-intensive microbenchmarks, and makes
variables declared "record" perform much better than before in similar
tests. In addition, the logic involved with enforcing composite-domain
constraints against updates of individual fields is in the expanded
record infrastructure not plpgsql proper, so that it might be reusable
for other purposes.
In further support of this, introduce a typcache feature for assigning a
unique-within-process identifier to each distinct tuple descriptor of
interest; in particular, DDL alterations on composite types result in a new
identifier for that type. This allows very cheap detection of the need to
refresh tupdesc-dependent data. This improves on the "tupDescSeqNo" idea
I had in commit 687f096ea: that assigned identifying sequence numbers to
successive versions of individual composite types, but the numbers were not
unique across different types, nor was there support for assigning numbers
to registered record types.
In passing, allow plpgsql functions to accept as well as return type
"record". There was no good reason for the old restriction, and it
was out of step with most of the other PLs.
Tom Lane, reviewed by Pavel Stehule
Discussion: https://postgr.es/m/8962.1514399547@sss.pgh.pa.us
We can now create indexes more easily than before, so update this
chapter to use the simpler instructions.
After an idea of Amit Langote. I (Álvaro) opted to do more invasive
surgery and remove the previous suggestion to create per-partition
indexes, which his patch left in place.
Discussion: https://postgr.es/m/eafaaeb1-f0fd-d010-dd45-07db0300f645@lab.ntt.co.jp
Author: Amit Langote, Álvaro Herrera
Since we now support the server side handler for git over https (so
we're no longer using the "dumb protocol"), make https the primary
choice for cloning the repository, and the git protocol the secondary
choice.
In passing, also change the links to git-scm.com from http to https.
Reviewed by Stefan Kaltenbrunner and David G. Johnston
This patch adds the ability to use "RANGE offset PRECEDING/FOLLOWING"
frame boundaries in window functions. We'd punted on that back in the
original patch to add window functions, because it was not clear how to
do it in a reasonably data-type-extensible fashion. That problem is
resolved here by adding the ability for btree operator classes to provide
an "in_range" support function that defines how to add or subtract the
RANGE offset value. Factoring it this way also allows the operator class
to avoid overflow problems near the ends of the datatype's range, if it
wishes to expend effort on that. (In the committed patch, the integer
opclasses handle that issue, but it did not seem worth the trouble to
avoid overflow failures for datetime types.)
The patch includes in_range support for the integer_ops opfamily
(int2/int4/int8) as well as the standard datetime types. Support for
other numeric types has been requested, but that seems like suitable
material for a follow-on patch.
In addition, the patch adds GROUPS mode which counts the offset in
ORDER-BY peer groups rather than rows, and it adds the frame_exclusion
options specified by SQL:2011. As far as I can see, we are now fully
up to spec on window framing options.
Existing behaviors remain unchanged, except that I changed the errcode
for a couple of existing error reports to meet the SQL spec's expectation
that negative "offset" values should be reported as SQLSTATE 22013.
Internally and in relevant parts of the documentation, we now consistently
use the terminology "offset PRECEDING/FOLLOWING" rather than "value
PRECEDING/FOLLOWING", since the term "value" is confusingly vague.
Oliver Ford, reviewed and whacked around some by me
Discussion: https://postgr.es/m/CAGMVOdu9sivPAxbNN0X+q19Sfv9edEPv=HibOJhB14TJv_RCQg@mail.gmail.com
Up to now, useful info for writing a new btree opclass has been buried
in the backend's nbtree/README file. Let's move it into the SGML docs,
in preparation for extending it with info about "in_range" functions
in the upcoming window RANGE patch.
To do this, I chose to create a new chapter for btree indexes in Part VII
(Internals), parallel to the chapters that exist for the newer index AMs.
This is a pretty short chapter as-is. At some point somebody might care
to flesh it out with more detail about btree internals, but that is
beyond the scope of my ambition for today.
Discussion: https://postgr.es/m/23141.1517874668@sss.pgh.pa.us
To make this work, tuplesort.c and logtape.c must also support
parallelism, so this patch adds that infrastructure and then applies
it to the particular case of parallel btree index builds. Testing
to date shows that this can often be 2-3x faster than a serial
index build.
The model for deciding how many workers to use is fairly primitive
at present, but it's better than not having the feature. We can
refine it as we get more experience.
Peter Geoghegan with some help from Rushabh Lathia. While Heikki
Linnakangas is not an author of this patch, he wrote other patches
without which this feature would not have been possible, and
therefore the release notes should possibly credit him as an author
of this feature. Reviewed by Claudio Freire, Heikki Linnakangas,
Thomas Munro, Tels, Amit Kapila, me.
Discussion: http://postgr.es/m/CAM3SWZQKM=Pzc=CAHzRixKjp2eO5Q0Jg1SoFQqeXFQ647JiwqQ@mail.gmail.com
Discussion: http://postgr.es/m/CAH2-Wz=AxWqDoVvGU7dq856S4r6sJAj6DBn7VMtigkB33N5eyg@mail.gmail.com
Add into the ALTER TABLE synopsis the definition of
partition_bound_spec, column_constraint, index_parameters and
exclude_element.
Initial patch by Lætitia Avrot, with further improvements by Amit
Langote and Thomas Munro.
Discussion: https://postgr.es/m/flat/27ec4df3-d1ab-3411-f87f-647f944897e1%40lab.ntt.co.jp
Commit 9521ce4a7a1125385fb4de9689f345db594c516a from Sep 13, 2017 and
backpatched through 9.5 used rsync examples with datadir. The reporter
has pointed out, and testing has verified, that clusterdir must be used,
so update the docs accordingly.
Reported-by: Don Seiler
Discussion: https://postgr.es/m/CAHJZqBD0u9dCERpYzK6BkRv=663AmH==DFJpVC=M4Xg_rq2=CQ@mail.gmail.com
Backpatch-through: 9.5
Technically, pg_upgrade's --old-datadir and --new-datadir are
configuration directories, not necessarily data directories. This is
reflected in the 'postgres' manual page, so do the same for pg_upgrade.
Reported-by: Yves Goergen
Bug: 14898
Discussion: https://postgr.es/m/20171110220912.31513.13322@wrigleys.postgresql.org
Backpatch-through: 10
pg_hba_file_rules erroneously reported this as scram-sha256. Fix that.
To avoid future errors and confusion, also adjust documentation links
and internal symbols to have a separator between "sha" and "256".
Reported-by: Christophe Courtois <christophe.courtois@dalibo.com>
Author: Michael Paquier <michael.paquier@gmail.com>
Clarify that the restriction against reg* types only applies to table
columns using these types, not to the type appearing in any other way,
for example as a function argument.
We have a lot of code in which option names, which from the user's
viewpoint are logically keywords, are passed through the grammar as plain
identifiers, and then matched to string literals during command execution.
This approach avoids making words into lexer keywords unnecessarily. Some
places matched these strings using plain strcmp, some using pg_strcasecmp.
But the latter should be unnecessary since identifiers would have been
downcased on their way through the parser. Aside from any efficiency
concerns (probably not a big factor), the lack of consistency in this area
creates a hazard of subtle bugs due to different places coming to different
conclusions about whether two option names are the same or different.
Hence, standardize on using strcmp() to match any option names that are
expected to have been fed through the parser.
This does create a user-visible behavioral change, which is that while
formerly all of these would work:
alter table foo set (fillfactor = 50);
alter table foo set (FillFactor = 50);
alter table foo set ("fillfactor" = 50);
alter table foo set ("FillFactor" = 50);
now the last case will fail because that double-quoted identifier is
different from the others. However, none of our documentation says that
you can use a quoted identifier in such contexts at all, and we should
discourage doing so since it would break if we ever decide to parse such
constructs as true lexer keywords rather than poor man's substitutes.
So this shouldn't create a significant compatibility issue for users.
Daniel Gustafsson, reviewed by Michael Paquier, small changes by me
Discussion: https://postgr.es/m/29405B24-564E-476B-98C0-677A29805B84@yesql.se
This clause was superseded by SQL-standard syntax back in 7.3.
We've kept it around for backwards-compatibility purposes ever since;
but 15 years seems like long enough for that, especially seeing that
there are undocumented weirdnesses in how it interacts with the
SQL-standard syntax for specifying the same options.
Michael Paquier, per an observation by Daniel Gustafsson;
some small cosmetic adjustments to nearby code by me.
Discussion: https://postgr.es/m/20180115022748.GB1724@paquier.xyz
We have switches already to suppress other subsidiary object properties,
such as ACLs, security labels, ownership, and tablespaces, so just on
the grounds of symmetry we should allow suppressing comments as well.
Also, commit 0d4e6ed30 added a positive reason to have this feature,
i.e. to allow obtaining the old behavior of selective pg_restore should
anyone desire that.
Recent commits have removed the cases where pg_dump emitted comments on
built-in objects that the restoring user might not have privileges to
comment on, so the original primary motivation for this feature is gone,
but it still seems at least somewhat useful in its own right.
Robins Tharakan, reviewed by Fabrízio Mello
Discussion: https://postgr.es/m/CAEP4nAx22Z4ch74oJGzr5RyyjcyUSbpiFLyeYXX8pehfou92ug@mail.gmail.com
Ensure that CREATE DATABASE and related commands are issued when, and
only when, --create is specified. Previously there were scenarios
where using selective-dump switches would prevent --create from having
any effect. For example, it would fail to do anything in pg_restore
if the archive file had been made by a selective dump, because there
would be no TOC entry for the database.
Since we don't issue \connect either if we don't issue CREATE DATABASE,
this could result in unexpectedly restoring objects into the wrong
database.
Also fix pg_restore's selective restore logic so that when an object is
selected to be restored, we also restore its ACL, comment, and security
label if any. Previously there was no way to get the latter properties
except through tedious mucking about with a -L file. If, for some
reason, you don't want these properties, you can match the old behavior
by adding --no-acl etc.
While at it, try to make _tocEntryRequired() a little better organized
and better documented.
Discussion: https://postgr.es/m/32668.1516848577@sss.pgh.pa.us
Previously, the docs implied that only Linux and Windows could use huge
pages. That's not quite true: it's just that we only know how to
request them explicitly on those OSes. Be more explicit about what
huge_pages really does and mention that some OSes may use huge pages
automatically.
Author: Thomas Munro and Catalin Iacob
Reviewed-By: Justin Pryzby, Peter Eisentraut
Discussion: https://postgr.es/m/CAEepm=3qzR-hfjepymohuC4XO5phxoSoipOjm6BEhnJHjNR+jg@mail.gmail.com
Split the "Authentication and Security" section into two separate
sections "Authentication" and "SSL". The latter part has gotten much
longer over time, and doesn't primarily have to do with authentication.
Also, the row_security parameter was inconsistently categorized, so
clean that up while we're here.
This patch rearranges the division of labor between pg_dump and pg_dumpall
so that pg_dump itself handles all properties attached to a single
database. Notably, a database's ACL (GRANT/REVOKE status) and local GUC
settings established by ALTER DATABASE SET and ALTER ROLE IN DATABASE SET
can be dumped and restored by pg_dump. This is a long-requested
improvement.
"pg_dumpall -g" will now produce only role- and tablespace-related output,
nothing about individual databases. The total output of a regular
pg_dumpall run remains the same.
pg_dump (or pg_restore) will restore database-level properties only when
creating the target database with --create. This applies not only to
ACLs and GUCs but to the other database properties it already handled,
that is database comments and security labels. This is more consistent
and useful, but does represent an incompatibility in the behavior seen
without --create.
(This change makes the proposed patch to have pg_dump use "COMMENT ON
DATABASE CURRENT_DATABASE" unnecessary, since there is no case where
the command is issued that we won't know the true name of the database.
We might still want that patch as a feature in its own right, but pg_dump
no longer needs it.)
pg_dumpall with --clean will now drop and recreate the "postgres" and
"template1" databases in the target cluster, allowing their locale and
encoding settings to be changed if necessary, and providing a cleaner
way to set nondefault tablespaces for them than we had before. This
means that such a script must now always be started in the "postgres"
database; the order of drops and reconnects will not work otherwise.
Without --clean, the script will not adjust any database-level properties
of those two databases (including their comments, ACLs, and security
labels, which it formerly would try to set).
Another minor incompatibility is that the CREATE DATABASE commands in a
pg_dumpall script will now always specify locale and encoding settings.
Formerly those would be omitted if they matched the cluster's default.
While that behavior had some usefulness in some migration scenarios,
it also posed a significant hazard of unwanted locale/encoding changes.
To migrate to another locale/encoding, it's now necessary to use pg_dump
without --create to restore into a database with the desired settings.
Commit 4bd371f6f's hack to emit "SET default_transaction_read_only = off"
is gone: we now dodge that problem by the expedient of not issuing ALTER
DATABASE SET commands until after reconnecting to the target database.
Therefore, such settings won't apply during the restore session.
In passing, improve some shaky grammar in the docs, and add a note pointing
out that pg_dumpall's output can't be expected to load without any errors.
(Someday we might want to fix that, but this is not that patch.)
Haribabu Kommi, reviewed at various times by Andreas Karlsson,
Vaishnavi Prabakaran, and Robert Haas; further hacking by me.
Discussion: https://postgr.es/m/CAJrrPGcUurV0eWTeXODwsOYFN=Ekq36t1s0YnFYUNzsmRfdAyA@mail.gmail.com
In each of the supplied procedural languages (PL/pgSQL, PL/Perl,
PL/Python, PL/Tcl), add language-specific commit and rollback
functions/commands to control transactions in procedures in that
language. Add similar underlying functions to SPI. Some additional
cleanup so that transaction commit or abort doesn't blow away data
structures still used by the procedure call. Add execution context
tracking to CALL and DO statements so that transaction control commands
can only be issued in top-level procedure and block calls, not function
calls or other procedure or block calls.
- SPI
Add a new function SPI_connect_ext() that is like SPI_connect() but
allows passing option flags. The only option flag right now is
SPI_OPT_NONATOMIC. A nonatomic SPI connection can execute transaction
control commands, otherwise it's not allowed. This is meant to be
passed down from CALL and DO statements which themselves know in which
context they are called. A nonatomic SPI connection uses different
memory management. A normal SPI connection allocates its memory in
TopTransactionContext. For nonatomic connections we use PortalContext
instead. As the comment in SPI_connect_ext() (previously SPI_connect())
indicates, one could potentially use PortalContext in all cases, but it
seems safest to leave the existing uses alone, because this stuff is
complicated enough already.
SPI also gets new functions SPI_start_transaction(), SPI_commit(), and
SPI_rollback(), which can be used by PLs to implement their transaction
control logic.
- portalmem.c
Some adjustments were made in the code that cleans up portals at
transaction abort. The portal code could already handle a command
*committing* a transaction and continuing (e.g., VACUUM), but it was not
quite prepared for a command *aborting* a transaction and continuing.
In AtAbort_Portals(), remove the code that marks an active portal as
failed. As the comment there already predicted, this doesn't work if
the running command wants to keep running after transaction abort. And
it's actually not necessary, because pquery.c is careful to run all
portal code in a PG_TRY block and explicitly runs MarkPortalFailed() if
there is an exception. So the code in AtAbort_Portals() is never used
anyway.
In AtAbort_Portals() and AtCleanup_Portals(), we need to be careful not
to clean up active portals too much. This mirrors similar code in
PreCommit_Portals().
- PL/Perl
Gets new functions spi_commit() and spi_rollback()
- PL/pgSQL
Gets new commands COMMIT and ROLLBACK.
Update the PL/SQL porting example in the documentation to reflect that
transactions are now possible in procedures.
- PL/Python
Gets new functions plpy.commit and plpy.rollback.
- PL/Tcl
Gets new commands commit and rollback.
Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com>
Add support for huge pages (called large pages on Windows) to the
Windows build.
This (probably) breaks compatibility with Windows versions prior to
Windows 2003 or Windows Vista.
Authors: Takayuki Tsunakawa and Thomas Munro
Reviewed by: Magnus Hagander, Amit Kapila
Document how to properly create root and intermediate certificates using
v3_ca extensions and where to place intermediate certificates so they
are properly transferred to the remote side with the leaf certificate to
link to the remote root certificate. This corrects docs that used to
say that intermediate certificates must be stored with the root
certificate.
Also add instructions on how to create root, intermediate, and leaf
certificates.
Discussion: https://postgr.es/m/20180116002238.GC12724@momjian.us
Reviewed-by: Michael Paquier
Backpatch-through: 9.3