Generate a table_rewrite event when ALTER TABLE
attempts to rewrite a table. Provide helper
functions to identify table and reason.
Intended use case is to help assess or to react
to schema changes that might hold exclusive locks
for long periods.
Dimitri Fontaine, triggering an edit by Simon Riggs
Reviewed in detail by Michael Paquier
Transactions can now set their commit timestamp directly as they commit,
or an external transaction commit timestamp can be fed from an outside
system using the new function TransactionTreeSetCommitTsData(). This
data is crash-safe, and truncated at Xid freeze point, same as pg_clog.
This module is disabled by default because it causes a performance hit,
but can be enabled in postgresql.conf requiring only a server restart.
A new test in src/test/modules is included.
Catalog version bumped due to the new subdirectory within PGDATA and a
couple of new SQL functions.
Authors: Álvaro Herrera and Petr Jelínek
Reviewed to varying degrees by Michael Paquier, Andres Freund, Robert
Haas, Amit Kapila, Fujii Masao, Jaime Casanova, Simon Riggs, Steven
Singer, Peter Eisentraut
This is advance preparation for introducing even more test modules; the
easy solution is to add them to contrib, but that's bloated enough that
it seems a good time to think of something different.
Moved modules are dummy_seclabel, test_shm_mq, test_parser and
worker_spi.
(test_decoding was also a candidate, but there was too much opposition
to moving that one. We can always reconsider later.)
This reverts commit 9f80f4835a55a1cbffcda5d23a617917f3286c14. The
function returned the raw value of a connection parameter, a task served
by PQconninfo(). The next commit will reimplement the psql \conninfo
change that way. Back-patch to 9.4, where that commit first appeared.
As pointed out by Robert, we should really have named pg_rowsecurity
pg_policy, as the objects stored in that catalog are policies. This
patch fixes that and updates the column names to start with 'pol' to
match the new catalog name.
The security consideration for COPY with row level security, also
pointed out by Robert, has also been addressed by remembering and
re-checking the OID of the relation initially referenced during COPY
processing, to make sure it hasn't changed under us by the time we
finish planning out the query which has been built.
Robert and Alvaro also commented on missing OCLASS and OBJECT entries
for POLICY (formerly ROWSECURITY or POLICY, depending) in various
places. This patch fixes that too, which also happens to add the
ability to COMMENT on policies.
In passing, attempt to improve the consistency of messages, comments,
and documentation as well. This removes various incarnations of
'row-security', 'row-level security', 'Row-security', etc, in favor
of 'policy', 'row level security' or 'row_security' as appropriate.
Happy Thanksgiving!
These cases formerly failed with errors about "could not find array type
for data type". Now they yield arrays of the same element type and one
higher dimension.
The implementation involves creating functions with API similar to the
existing accumArrayResult() family. I (tgl) also extended the base family
by adding an initArrayResult() function, which allows callers to avoid
special-casing the zero-inputs case if they just want an empty array as
result. (Not all do, so the previous calling convention remains valid.)
This allowed simplifying some existing code in xml.c and plperl.c.
Ali Akbar, reviewed by Pavel Stehule, significantly modified by me
If the "dbname" attribute in PQconnectDBParams contained a connection string
or URI (and expand_dbname = TRUE), the database name from the connection
string could not be overridden by a subsequent "dbname" keyword in the
array. That was not intentional; all other options can be overridden.
Furthermore, any subsequent "dbname" caused the connection string from the
first dbname value to be processed again, overriding any values for the same
options that were given between the connection string and the second dbname
option.
In the passing, clarify in the docs that only the first dbname option in the
array is parsed as a connection string.
Alex Shulgin. Backpatch to all supported versions.
In bug #12000, Andreas Kunert complained that the documentation was
misleading in saying "FROM T1 CROSS JOIN T2 is equivalent to FROM T1, T2".
That's correct as far as it goes, but the equivalence doesn't hold when
you consider three or more tables, since JOIN binds more tightly than
comma. I added a <note> to explain this, and ended up rearranging some
of the existing text so that the note would make sense in context.
In passing, rewrite the description of JOIN USING, which was unnecessarily
vague, and hadn't been helped any by somebody's reliance on markup as a
substitute for clear writing. (Mostly this involved reintroducing a
concrete example that was unaccountably removed by commit 032f3b7e166cfa28.)
Back-patch to all supported branches.
Allows pg_dump to use a snapshot previously defined by a concurrent
session that has either used pg_export_snapshot() or obtained a
snapshot when creating a logical slot. When this option is used with
parallel pg_dump, the snapshot defined by this option is used and no
new snapshot is taken.
Simon Riggs and Michael Paquier
Previously pg_receivexlog flushed WAL data only when WAL file was switched.
Then 3dad73e added -F option to pg_receivexlog so that users could control
how frequently sync commands were issued to WAL files. It also allowed users
to make pg_receivexlog flush WAL data immediately after writing by
specifying 0 in -F option. However feedback messages were not sent back
immediately even after a flush location was updated. So even if WAL data
was flushed in real time, the server could not see that for a while.
This commit removes -F option from and adds --synchronous to pg_receivexlog.
If --synchronous is specified, like the standby's wal receiver, pg_receivexlog
flushes WAL data as soon as there is WAL data which has not been flushed yet.
Then it sends back the feedback message identifying the latest flush location
to the server. This option is useful to make pg_receivexlog behave as sync
standby by using replication slot, for example.
Original patch by Furuya Osamu, heavily rewritten by me.
Reviewed by Heikki Linnakangas, Alvaro Herrera and Sawada Masahiko.
The SELECT reference page didn't really address the question of when
aggregate function evaluation occurs, nor did the "expression evaluation
rules" documentation mention that CASE can't be used to control whether
an aggregate gets evaluated or not. Improve that.
Per discussion of bug #11661. Original text by Marti Raudsepp and Michael
Paquier, rewritten significantly by me.
When ALTER TABLESPACE MOVE ALL was changed to be ALTER TABLE ALL IN
TABLESPACE, the ALTER TABLESPACE summary should have been adjusted back
to its original definition.
Patch by Thom Brown (thanks!).
Previously the maximum size of GIN pending list was controlled only by
work_mem. But the reasonable value of work_mem and the reasonable size
of the list are basically not the same, so it was not appropriate to
control both of them by only one GUC, i.e., work_mem. This commit
separates new GUC, pending_list_cleanup_size, from work_mem to allow
users to control only the size of the list.
Also this commit adds pending_list_cleanup_size as new storage parameter
to allow users to specify the size of the list per index. This is useful,
for example, when users want to increase the size of the list only for
the GIN index which can be updated heavily, and decrease it otherwise.
Reviewed by Etsuro Fujita.
Besides a couple of typo fixes, per David Rowley, Thom Brown, and Amit
Langote, and mentions of BRIN in the general CREATE INDEX page again per
David, this includes silencing MSVC compiler warnings (thanks Microsoft)
and an additional variable initialization per Coverity scanner.
BRIN is a new index access method intended to accelerate scans of very
large tables, without the maintenance overhead of btrees or other
traditional indexes. They work by maintaining "summary" data about
block ranges. Bitmap index scans work by reading each summary tuple and
comparing them with the query quals; all pages in the range are returned
in a lossy TID bitmap if the quals are consistent with the values in the
summary tuple, otherwise not. Normal index scans are not supported
because these indexes do not store TIDs.
As new tuples are added into the index, the summary information is
updated (if the block range in which the tuple is added is already
summarized) or not; in the latter case, a subsequent pass of VACUUM or
the brin_summarize_new_values() function will create the summary
information.
For data types with natural 1-D sort orders, the summary info consists
of the maximum and the minimum values of each indexed column within each
page range. This type of operator class we call "Minmax", and we
supply a bunch of them for most data types with B-tree opclasses.
Since the BRIN code is generalized, other approaches are possible for
things such as arrays, geometric types, ranges, etc; even for things
such as enum types we could do something different than minmax with
better results. In this commit I only include minmax.
Catalog version bumped due to new builtin catalog entries.
There's more that could be done here, but this is a good step forwards.
Loosely based on ideas from Simon Riggs; code mostly by Álvaro Herrera,
with contribution by Heikki Linnakangas.
Patch reviewed by: Amit Kapila, Heikki Linnakangas, Robert Haas.
Testing help from Jeff Janes, Erik Rijkers, Emanuel Calvo.
PS:
The research leading to these results has received funding from the
European Union's Seventh Framework Programme (FP7/2007-2013) under
grant agreement n° 318633.
xlog.c is huge, this makes it a little bit smaller, which is nice. Functions
related to putting together the WAL record are in xloginsert.c, and the
lower level stuff for managing WAL buffers and such are in xlog.c.
Also move the definition of XLogRecord to a separate header file. This
causes churn in the #includes of all the files that write WAL records, and
redo routines, but it avoids pulling in xlog.h into most places.
Reviewed by Michael Paquier, Alvaro Herrera, Andres Freund and Amit Kapila.
Long ago we briefly had an "autocommit" GUC that turned server-side
autocommit on and off. That behavior was removed in 7.4 after concluding
that it broke far too much client-side logic, and making clients cope with
both behaviors was impractical. But the GUC variable was left behind, so
as not to break any client code that might be trying to read its value.
Enough time has now passed that we should remove the GUC completely.
Whatever vestigial backwards-compatibility benefit it had is outweighed by
the risk of confusion for newbies who assume it ought to do something,
as per a recent complaint from Wolfgang Wilhelm.
In passing, adjust what seemed to me a rather confusing documentation
reference to libpq's autocommit behavior. libpq as such knows nothing
about autocommit, so psql is probably what was meant.
pgp_sym_encrypt's option is spelled "sess-key", not "enable-session-key".
Spotted by Jeff Janes.
In passing, improve a comment in pgp-pgsql.c to make it clearer that
the debugging options are intentionally undocumented.
Building the documentation with XSLT does not check the DTD, like a
DSSSL build would. One can often get away with having invalid XML, but
the stylesheets might then create incorrect output, as they are not
designed to handle that. Therefore, check the validity of the XML
against the DTD, using xmllint, during the build.
Add xmllint detection to configure, and add some documentation.
xmllint comes with libxml2, which is already in use, but it might be in
a separate package, such as libxml2-utils on Debian.
Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr>