It seems pretty unacceptable to have no regression test coverage
for this aspect of the logical replication protocol, especially
given the bugs we've found in related code.
Discussion: https://postgr.es/m/16129-a0c0f48e71741e5f@postgresql.org
slot_modify_cstrings seriously abused the TupleTableSlot API by relying
on a slot's underlying data to stay valid across ExecClearTuple. Since
this abuse was also quite undocumented, it's little surprise that the
case got broken during the v12 slot rewrites. As reported in bug #16129
from Ondřej Jirman, this could lead to crashes or data corruption when
a logical replication subscriber processes a row update. Problems would
only arise if the subscriber's table contained columns of pass-by-ref
types that were not being copied from the publisher.
Fix by explicitly copying the datum/isnull arrays from the source slot
that the old row was in already. This ends up being about the same
thing that happened pre-v12, but hopefully in a less opaque and
fragile way.
We might've caught the problem sooner if there were any test cases
dealing with updates involving non-replicated or dropped columns.
Now there are.
Back-patch to v10 where this code came in. Even though the failure
does not manifest before v12, IMO this code is too fragile to leave
as-is. In any case we certainly want the additional test coverage.
Patch by me; thanks to Tomas Vondra for initial investigation.
Discussion: https://postgr.es/m/16129-a0c0f48e71741e5f@postgresql.org
While a self-referential view doesn't actually work, it's possible
to create one, and it turns out that this breaks some of the
information_schema views. Those views call relation_is_updatable(),
which neglected to consider the hazards of being recursive. In
older PG versions you get a "stack depth limit exceeded" error,
but since v10 it'd recurse to the point of stack overrun and crash,
because commit a4c35ea1c took out the expression_returns_set() call
that was incidentally checking the stack depth.
Since this function is only used by information_schema views, it
seems like it'd be better to return "not updatable" than suffer
an error. Hence, add tracking of what views we're examining,
in just the same way that the nearby fireRIRrules() code detects
self-referential views. I added a check_stack_depth() call too,
just to be defensive.
Per private report from Manuel Rigger. Back-patch to all
supported versions.
Trying to use hypothetical indexes with BRIN currently fails when trying
to access a relation that does not exist when looking for the
statistics. With the current API, it is not possible to easily pass
a value for pages_per_range down to the hypothetical index, so this
makes use of the default value of BRIN_DEFAULT_PAGES_PER_RANGE, which
should be fine enough in most cases.
Being able to refine or enforce the hypothetical costs in more
optimistic ways would require more refactoring by filling in the
statistics when building IndexOptInfo in plancat.c. This would involve
ABI breakages around the costing routines, something not fit for stable
branches.
This is broken since 7e534ad, so backpatch down to v10.
Author: Julien Rouhaud, Heikki Linnakangas
Reviewed-by: Álvaro Herrera, Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/CAOBaU_ZH0LKEA8VFCocr6Lpte1ab0b6FpvgS0y4way+RPSXfYg@mail.gmail.com
Backpatch-through: 10
When ReadFile() encounters the end of a file while reading from
a synchronous handle with an offset provided via OVERLAPPED, it
reports an error instead of returning 0. By not handling that
(undocumented) result correctly, we caused some noisy LOG
messages about an unknown error code. Repair.
Back-patch to 12, where we started using pread()/ReadFile() with
an offset.
Reported-by: ZhenHua Cai, Amit Kapila
Diagnosed-by: Juan Jose Santamaria Flecha
Tested-by: Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1LK3%2BWRtpz68TiRdpHwxxWm%3D%2Bt1BMf-G68hhQsAQ41PZg%40mail.gmail.com
The planner's optimization code for LIKE and regex operators could
error out with a complaint like "no = operator for opfamily NNN"
if someone created a binary-compatible index (for example, a
bpchar_ops index on a text column) on the LIKE's left argument.
This is a consequence of careless refactoring in commit 74dfe58a5.
The old code in match_special_index_operator only accepted specific
combinations of the pattern operator and the index opclass, thereby
indirectly guaranteeing that the opclass would have a comparison
operator with the same LHS input type as the pattern operator.
While moving the logic out to a planner support function, I simplified
that test in a way that no longer guarantees that. Really though we'd
like an altogether weaker dependency on the opclass, so rather than
put back exactly the old code, just allow lookup failure. I have in
mind now to rewrite this logic completely, but this is the minimum
change needed to fix the bug in v12.
Per report from Manuel Rigger. Back-patch to v12 where the mistake
came in.
Discussion: https://postgr.es/m/CA+u7OA7nnGYy8rY0vdTe811NuA+Frr9nbcBO9u2Z+JxqNaud+g@mail.gmail.com
By oversight 52ac6cd2d0 makes ginDeletePage() sets pd_prune_xid of page to be
deleted before entering the critical section. It appears that only versions 11
and later were affected by this oversight.
Backpatch-through: 11
We find GIN concurrency bugs from time to time. One of the problems here is
that concurrency of GIN isn't well-documented in README. So, it might be even
hard to distinguish design bugs from implementation bugs.
This commit revised concurrency section in GIN README providing more details.
Some examples are illustrated in ASCII art.
Also, this commit add the explanation of how is tuple layout in internal GIN
B-tree page different in comparison with nbtree.
Discussion: https://postgr.es/m/CAPpHfduXR_ywyaVN4%2BOYEGaw%3DcPLzWX6RxYLBncKw8de9vOkqw%40mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Peter Geoghegan
Backpatch-through: 9.4
Current GIN code appears to don't handle traversing to the deleted page via
downlink. This commit fixes that by stepping right from the delete page like
we do in nbtree.
This commit also fixes setting 'deleted' flag to the GIN pages. Now other page
flags are not erased once page is deleted. That helps to keep our assertions
true if we arrive deleted page via downlink.
Discussion: https://postgr.es/m/CAPpHfdvMvsw-NcE5bRS7R1BbvA4BxoDnVVjkXC5W0Czvy9LVrg%40mail.gmail.com
Author: Alexander Korotkov
Reviewed-by: Peter Geoghegan
Backpatch-through: 9.4
When ginDeletePage() is about to delete page it locks its left sibling to revise
the rightlink. So, it locks pages in right to left manner. Int he same time
ginStepRight() locks pages in left to right manner, and that could cause a
deadlock.
This commit makes ginScanToDelete() keep exclusive lock on left siblings of
currently investigated path. That elimites need to relock left sibling in
ginDeletePage(). Thus, deadlock with ginStepRight() can't happen anymore.
Reported-by: Chen Huajun
Discussion: https://postgr.es/m/5c332bd1.87b6.16d7c17aa98.Coremail.chjischj%40163.com
Author: Alexander Korotkov
Reviewed-by: Peter Geoghegan
Backpatch-through: 10
The existing text stated that "Default privileges that are specified
per-schema are added to whatever the global default privileges are for
the particular object type". However, that bare-bones observation is
not quite clear enough, as demonstrated by the complaint in bug #16124.
Flesh it out by stating explicitly that you can't revoke built-in
default privileges this way, and by providing an example to drive
the point home.
Back-patch to all supported branches, since it's been like this
from the beginning.
Discussion: https://postgr.es/m/16124-423d8ee4358421bc@postgresql.org
It turns out that commit e9f1c01b7 missed a case: we must print a
VALUES clause in long format if get_query_def is given a resultDesc
that would require the query's output column name(s) to be different
from what the bare VALUES clause would produce.
This applies in case an ALTER ... RENAME COLUMN has been done to
a view that formerly could be printed in simple format, as shown
in the added regression test case. It also explains bug #16119
from Dmitry Telpt, because it turns out that (unlike CREATE VIEW)
CREATE MATERIALIZED VIEW fails to apply any column aliases it's
given to the stored ON SELECT rule. So to get them to be printed,
we have to account for the resultDesc renaming. It might be worth
changing the matview code so that it creates the ON SELECT rule
with the correct aliases; but we'd still need these messy checks in
get_simple_values_rte to handle the case of a subsequent column
rename, so any such change would be just neatnik-ism not a bug fix.
Like the previous patch, back-patch to all supported branches.
Discussion: https://postgr.es/m/16119-e64823f30a45a754@postgresql.org
Concurrent autovacuums running with the main regression test suite
could cause the tests with VACUUM (SKIP_LOCKED) to generate randomly
WARNING messages. For these tests, set client_min_messages to ERROR to
get rid of those random failures, as disabling autovacuum for the
relations operated would not completely close the failure window.
For isolation tests, disable autovacuum for the relations vacuumed with
SKIP_LOCKED. The tests are designed so as LOCK commands are taken
in a first session before running a concurrent VACUUM (SKIP_LOCKED) in a
second to generate WARNING messages, but a concurrent autovacuum could
cause the tests to be slower.
Reported-by: Tom Lane
Author: Michael Paquier
Reviewed-by: Andres Freund, Tom Lane
Discussion: https://postgr.es/m/25294.1573077278@sss.pgh.pa.us
Backpatch-through: 12
When estimating number of distinct groups, we failed to ignore system
attributes when matching the group expressions to mvdistinct stats,
causing failures like
ERROR: negative bitmapset member not allowed
Fix that by simply skipping anything that is not a regular attribute.
Backpatch to PostgreSQL 10, where the extended stats were introduced.
Bug: #16111
Reported-by: Tuomas Leikola
Author: Tomas Vondra
Backpatch-through: 10
Discussion: https://postgr.es/m/16111-687799584c3a7e73@postgresql.org
Call ExecShutdownNode() after ExecutePlan()'s loop, rather than at each
break. We had forgotten to do that in one case. The omission caused
intermittent "temporary file leak" warnings from multi-batch parallel
hash joins with a LIMIT clause.
Back-patch to 11. Though the problem exists in theory in earlier
parallel query releases, nothing really depended on it.
Author: Kyotaro Horiguchi
Reviewed-by: Thomas Munro, Amit Kapila
Discussion: https://postgr.es/m/20191111.212418.2222262873417235945.horikyota.ntt%40gmail.com
We should throw an error for indeterminate collation, but bpcharne()
was missing that logic, resulting in a much less user-friendly error
(either an assertion failure or "cache lookup failed for collation 0").
Per report from Manuel Rigger. Back-patch to v12 where the mistake
came in, evidently in commit 5e1963fb7. (Before non-deterministic
collations, this function wasn't collation sensitive.)
Discussion: https://postgr.es/m/CA+u7OA4HOjtymxAbuGNh4-X_2R0Lw5n01tzvP8E5-i-2gQXYWA@mail.gmail.com
Initializing a pointer to "false" isn't per project style,
and reportedly some compilers warn about it (though I've
not seen any such warnings in the buildfarm).
Seems to have come in with commit ff11e7f4b, so back-patch
to v12 where that was added.
Didier Gautheron
Discussion: https://postgr.es/m/CAJRYxu+XQuM0qnSqt1Ujztu6fBPzMMAT3VEn6W32rgKG6A2Fsw@mail.gmail.com
Commit 6b76f1bb5 changed all the RADIUS auth parameters to be lists
rather than single values. But its use of SplitIdentifierString
to parse the list format was not very carefully thought through,
because that function thinks it's parsing SQL identifiers, which
means it will (a) downcase the strings and (b) truncate them to
be shorter than NAMEDATALEN. While downcasing should be harmless
for the server names and ports, it's just wrong for the shared
secrets, and probably for the NAS Identifier strings as well.
The truncation aspect is at least potentially a problem too,
though typical values for these parameters would fit in 63 bytes.
Fortunately, we now have a function SplitGUCList that is exactly
the same except for not doing the two unwanted things, so fixing
this is a trivial matter of calling that function instead.
While here, improve the documentation to show how to double-quote
the parameter values. I failed to resist the temptation to do
some copy-editing as well.
Report and patch from Marcos David (bug #16106); doc changes by me.
Back-patch to v10 where the aforesaid commit came in, since this is
arguably a regression from our previous behavior with RADIUS auth.
Discussion: https://postgr.es/m/16106-7d319e4295d08e70@postgresql.org
The TableFunc node (i.e., XMLTABLE) includes type and collation OIDs
that might not be referenced anywhere else in the expression tree,
so they need to be accounted for when extracting dependencies.
Fortunately, the practical effects of this are limited, since
(a) it's somewhat unlikely that people would be extracting
columns of non-builtin types from an XML document, and (b)
in many scenarios, the query would contain other references
to such types, or functions depending on them. However, it's
not hard to construct examples wherein the existing code lets
one drop a type used in XMLTABLE and thereby break a view.
This is evidently an original oversight in the XMLTABLE patch,
so back-patch to v10 where that came in.
Discussion: https://postgr.es/m/18427.1573508501@sss.pgh.pa.us
pg_upgrade needs to check whether certain non-upgradable data types
appear anywhere on-disk in the source cluster. It knew that it has
to check for these types being contained inside domains and composite
types; but it somehow overlooked that they could be contained in
arrays and ranges, too. Extend the existing recursive-containment
query to handle those cases.
We probably should have noticed this oversight while working on
commit 0ccfc2822 and follow-ups, but we failed to :-(. The whole
thing's possibly a bit overdesigned, since we don't really expect
that any of these types will appear on disk; but if we're going to
the effort of doing a recursive search then it's silly not to cover
all the possibilities.
While at it, refactor so that we have only one copy of the search
logic, not three-and-counting. Also, to keep the branches looking
more alike, back-patch the output wording change of commit 1634d3615.
Back-patch to all supported branches.
Discussion: https://postgr.es/m/31473.1573412838@sss.pgh.pa.us
The point is that DELETE triggers cannot modify any values.
Reported-by: Eugen Konkov, Liudmila Mantrova
Discussion: https://postgr.es/m/919823407.20191029175436@yandex.ru
Backpatch-through: 12 only, where commit as missing
The example of expansion of multiple views claimed that the resulting
subquery nest would not get fully flattened because of an aggregate
function. There's no aggregate in the example, though, only a user
defined function confusingly named MIN(). In a modern server, the
reason for the non-flattening is that MIN() is volatile, but I'm
unsure whether that was true back when this text was written.
Let's reduce the confusion level by using LEAST() instead (which
we didn't have at the time this example was created). And then
we can just say that the planner will flatten the sub-queries, so
the rewrite system doesn't have to.
Noted by Paul Jungwirth. This text is old enough to vote, so
back-patch to all supported branches.
Discussion: https://postgr.es/m/CA+renyXZFnmp9PcvX1EVR2dR=XG5e6E-AELr8AHCNZ8RYrpnPw@mail.gmail.com
Commits 4ea03f3f4 et al arranged to filter out row counts in parallel
plans, because those are dependent on the number of workers actually
obtained. Somehow I missed that the 'Rows Removed by Filter' counts
can also vary, so fix that too. Per buildfarm.
This seems worth a last-minute patch because unreliable regression
tests are a serious pain in the rear for packagers.
Like the previous patch, back-patch to v11 where this test was
introduced.
The previous statement that using a passphrase disables the ability to
change the server's SSL configuration without a server restart was no
longer completely true since the introduction of
ssl_passphrase_command_supports_reload.
This happens when we add a replica identity column on a subscriber
that does not yet exist on the publisher, according to the mapping
maintained by the subscriber. Code that checks whether the target
relation on the subscriber is updatable would check the replica
identity attribute bitmap with a column number -1, which would result
in an error. To fix, skip such columns in the bitmap lookup and
consider the relation not updatable. The result is consistent with
the rule that the replica identity columns on the subscriber must be a
subset of those on the publisher, since if the column doesn't exist on
the publisher, the column set on the subscriber can't be a subset.
Reported-by: Tim Clarke <tim.clarke@minerva.info>
Analyzed-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Discussion: https://www.postgresql.org/message-id/flat/a9139c29-7ddd-973b-aa7f-71fed9c38d75%40minerva.info
As usual, the release notes for other branches will be made by cutting
these down, but put them up for community review first. Note that a
fair percentage of the entries apply only to prior branches because
their issue was already fixed in 12.0. I'll remove those from the 12.1
list later.
Currently, postgres_fdw does not support preparing a remote transaction
for two-phase commit even in the case where the remote transaction is
read-only, but the old error message appeared to imply that that was not
supported only if the remote transaction modified remote tables. Change
the message so as to include the case where the remote transaction is
read-only.
Also fix a comment above the message.
Also add a note about the lack of supporting PREPARE TRANSACTION to the
postgres_fdw documentation.
Reported-by: Gilles Darold
Author: Gilles Darold and Etsuro Fujita
Reviewed-by: Michael Paquier and Kyotaro Horiguchi
Backpatch-through: 9.4
Discussion: https://postgr.es/m/08600ed3-3084-be70-65ba-279ab19618a5%40darold.net
Declaring this in the client-visible header ecpglib.h was a pretty
poor decision. It's not meant to be application-callable (and if
it was, putting it outside the extern "C" { ... } wrapper means
that C++ clients would fail to call it). And the declaration would
not even compile for a client, anyway, since it would not have the
macro pg_attribute_format_arg(). Fortunately, it seems that no
clients have tried to include this header with ENABLE_NLS defined,
or we'd have gotten complaints about that. But we have no business
putting such a restriction on client code.
Move the declaration to ecpglib_extern.h, since in fact nothing
outside src/interfaces/ecpg/ecpglib/ needs to call it.
The practical effect of this is just that clients can now safely
#include ecpglib.h while having ENABLE_NLS defined, but that seems
like enough of a reason to back-patch it.
Discussion: https://postgr.es/m/20590.1573069709@sss.pgh.pa.us
SET CONSTRAINTS ... DEFERRED failed on partitioned tables, because of a
sanity check that ensures that the affected constraints have triggers.
On partitioned tables, the triggers are in the leaf partitions, not in
the partitioned relations themselves, so the sanity check fails.
Removing the sanity check solves the problem, because the code needed to
support the case is already there.
Backpatch to 11.
Note: deferred unique constraints are not affected by this bug, because
they do have triggers in the parent partitioned table. I did not add a
test for this scenario.
Discussion: https://postgr.es/m/20191105212915.GA11324@alvherre.pgsql
This patch adopts the overflow check logic introduced by commit cbdb8b4c0
into two more places. interval_mul() failed to notice if it computed a
new microseconds value that was one more than INT64_MAX, and pgbench's
double-to-int64 logic had the same sorts of edge-case problems that
cbdb8b4c0 fixed in the core code.
To make this easier to get right in future, put the guts of the checks
into new macros in c.h, and add commentary about how to use the macros
correctly.
Back-patch to all supported branches, as we did with the previous fix.
Yuya Watari
Discussion: https://postgr.es/m/CAJ2pMkbkkFw2hb9Qb1Zj8d06EhWAQXFLy73St4qWv6aX=vqnjw@mail.gmail.com
If there is the WAL page that the continuation WAL record just fits within
(i.e., the continuation record ends just at the end of the page) and
the LSN in such page is specified with -s option, previously pg_waldump
caused an assertion failure. The cause of this assertion failure was that
XLogFindNextRecord() that pg_waldump -s calls mistakenly handled
such special WAL page.
This commit changes XLogFindNextRecord() so that it can handle
such WAL page correctly.
Back-patch to all supported versions.
Author: Andrey Lepikhov
Reviewed-by: Fujii Masao, Michael Paquier
Discussion: https://postgr.es/m/99303554-5dd5-06e6-f943-b3005ccd6edd@postgrespro.ru
The docs do say which GUCs can be changed only by superusers, but we
forgot to mention this for the new log_transaction_sample_rate. This
GUC was introduced in PostgreSQL 12, so backpatch accordingly.
Author: Adrien Nayrat
Backpatch-through: 12
Avoid creating transiently-inconsistent slot states where possible,
by not setting TTS_FLAG_SHOULDFREE until after the slot actually has
a free'able tuple pointer, and by making sure that we reset tts_nvalid
and related derived state before we replace the tuple contents. This
would only matter if something were to examine the slot after we'd
suffered some kind of error (e.g. out of memory) while manipulating
the slot. We typically don't do that, so these changes might just be
cosmetic --- but even if so, it seems like good future-proofing.
Also remove some redundant Asserts, and add a couple for consistency.
Back-patch to v12 where all this code was rewritten.
Discussion: https://postgr.es/m/16095-c3ff2e5283b8dba5@postgresql.org
Since commit d26a810eb, we've defined bool as being either _Bool from
<stdbool.h>, or "unsigned char"; but that commit overlooked the fact
that probes.d has "#define bool char". For consistency, make it say
"unsigned char" instead. This should be strictly a cosmetic change,
but it seems best to be in sync.
Formally, in the now-normal case where we're using <stdbool.h>, it'd
be better to write "#define bool _Bool". However, then we'd need
some build infrastructure to inject that configuration choice into
probes.d, and it doesn't seem worth the trouble. We only use
<stdbool.h> if sizeof(_Bool) is 1, so having DTrace think that
bool parameters are "unsigned char" should be close enough.
Back-patch to v12 where d26a810eb came in.
Discussion: https://postgr.es/m/CAA4eK1LmaKO7Du9M9Lo=kxGU8sB6aL8fa3sF6z6d5yYYVe3BuQ@mail.gmail.com
When sending data for logical decoding using the streaming replication
protocol via a WAL sender, the timestamp of the sent write message is
allocated at the beginning of the message when preparing for the write,
and actually computed when the write message is ready to be sent.
The timestamp was getting computed after sending the message. This
impacts anything using logical decoding, causing for example logical
replication to report mostly NULL for last_msg_send_time in
pg_stat_subscription.
This commit makes sure that the timestamp is computed before sending the
message. This is wrong since 5a991ef, so backpatch down to 9.4.
Author: Jeff Janes
Discussion: https://postgr.es/m/CAMkU=1z=WMn8jt7iEdC5sYNaPgAgOASb_OW5JYv-vMdYaJSL-w@mail.gmail.com
Backpatch-through: 9.4
WindowAgg will potentially store large numbers of input rows into
tuplestores to allow access to other rows in the frame. If the input
is coming via an explicit Sort node, then unneeded columns will
already have been discarded (since Sort requests a small tlist); but
there are idioms like COUNT(*) OVER () that result in the input not
being sorted at all, and cases where the input is being sorted by some
means other than a Sort; if we don't request a small tlist, then
WindowAgg's storage requirement is inflated by the unneeded columns.
Backpatch back to 9.6, where the current tlist handling was added.
(Prior to that, WindowAgg would always use a small tlist.)
Discussion: https://postgr.es/m/87a7ator8n.fsf@news-spur.riddles.org.uk
For a long time (since commit aed378e8d) we have had a policy to log
nothing about a connection if the client disconnects when challenged
for a password. This is because libpq-using clients will typically
do that, and then come back for a new connection attempt once they've
collected a password from their user, so that logging the abandoned
connection attempt will just result in log spam. However, this did
not work well for PAM authentication: the bottom-level function
pam_passwd_conv_proc() was on board with it, but we logged messages
at higher levels anyway, for lack of any reporting mechanism.
Add a flag and tweak the logic so that the case is silent, as it is
for other password-using auth mechanisms.
Per complaint from Yoann La Cancellera. It's been like this for awhile,
so back-patch to all supported branches.
Discussion: https://postgr.es/m/CACP=ajbrFFYUrLyJBLV8=q+eNCapa1xDEyvXhMoYrNphs-xqPw@mail.gmail.com