A pending asynchronous request is handled by process_pending_request(),
which previously not only processed an in-progress remote query but
performed ExecForeignScan() to produce a tuple to return to the local
server asynchronously from the result of the remote query. But that led
to a server crash when executing a query or led to an "InstrStartNode
called twice in a row" or "InstrEndLoop called on running node" failure
when doing EXPLAIN ANALYZE of it, in cases where the plan tree for it
contained multiple async-capable nodes accessing the same
initplan/subplan that contained multiple async-capable nodes scanning
the same foreign tables as for the parent async-capable nodes, as
reported by Andrey Lepikhov. The reason is that the second step in
process_pending_request() invoked when executing the initplan/subplan
for one of the parent async-capable nodes caused recursive execution of
the initplan/subplan for another of the parent async-capable nodes.
To fix, split process_pending_request() into the two steps and postpone
the second step until ForeignAsyncConfigureWait() is called for each of
the pending asynchronous requests. Also, in ExecAppendAsyncEventWait()
we assumed that FDWs would register at least one wait event in a
WaitEventSet created there when they were called from
ForeignAsyncConfigureWait() in that function, but allow FDWs to register
zero wait events in the WaitEventSet; modify ExecAppendAsyncEventWait()
to just return in that case.
Oversight in commit 27e1f1456. Back-patch to v14 where that commit went
in.
Andrey Lepikhov and Etsuro Fujita
Discussion: https://postgr.es/m/fe5eaa19-1704-e4a4-76ee-3b9d37ade399@postgrespro.ru
Buildfarm shows that this test has a further failure mode when a
checkpoint starts earlier than expected, so we detect a "checkpoint
completed" line that's not the one we want. Change the config to try
and prevent this.
Per buildfarm
While at it, update one comment that was forgotten in commit
d18e75664a2f.
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20210729.162038.534808353849568395.horikyota.ntt@gmail.com
Commit ffa2e4670 changed libpq so that multiple error reports
occurring during one operation (a connection attempt or query)
are accumulated in conn->errorMessage, where before new ones
usually replaced any prior error. At least in theory, that makes
us more vulnerable to running out of memory for the errorMessage
buffer. If it did happen, the user would be left with just an
empty-string error report, which is pretty unhelpful.
We can improve this by relying on pqexpbuffer.c's existing "broken
buffer" convention to track whether we've hit OOM for the current
operation's error string, and then substituting a constant "out of
memory" string in the small number of places where the errorMessage
is read out.
While at it, apply the same method to similar OOM cases in
pqInternalNotice and pqGetErrorNotice3.
Back-patch to v14 where ffa2e4670 came in. In principle this could
go back further; but in view of the lack of field reports, the
hazard seems negligible in older branches.
Discussion: https://postgr.es/m/530153.1627425648@sss.pgh.pa.us
Certain versions of msys2/Windows have been observed to resolve symlinks
in perl2host rather than just follow them. This defeats using a
symlinked shorter path to a longer path, and makes certain tests fail.
We therefore call perl2host on the parent directory of the symlink and
thereafter just use that result.
Apply to release 14 where the problem has been observed.
Sometimes cygpath has been observed to return a path with a trailing
slash. That can cause problems, Also, make "cygpath" usage
consistent with "pwd -W" with respect to the use of forward slashes.
Backpatch to release 14 where the current code was introduced.
pg_verifybackup needs by default pg_waldump to check after a range of
WAL segments required for a backup, except if --no-parse-wal is
specified. The code checked for the presence of the binary pg_waldump
in an installation and reported an error, but it forgot to properly
exit(). This could lead to confusing errors reported.
Reviewed-by: Robert Haas, Fabien Coelho
Discussion: https://postgr.es/m/YQDMdB+B68yePFeT@paquier.xyz
Backpatch-through: 13
If a file is truncated, we must update minRecoveryPoint. Once a file is
truncated, there's no going back; it would not be safe to stop recovery
at a point earlier than that anymore.
Commit 7bffc9b7bf changed xact_redo_commit() so that it updates
minRecoveryPoint on truncation, but forgot to change xact_redo_abort().
Back-patch to all supported versions.
Reported-by: mengjuan.cmj@alibaba-inc.com
Author: Fujii Masao
Reviewed-by: Heikki Linnakangas
Discussion: https://postgr.es/m/b029fce3-4fac-4265-968e-16f36ff4d075.mengjuan.cmj@alibaba-inc.com
It wasn't all that clear which lock levels, if any, would be held on the
DEFAULT partition during an ATTACH PARTITION operation.
Also, clarify which locks will be taken if the DEFAULT partition or the
table being attached are themselves partitioned tables.
Here I'm only backpatching to v12 as before then we obtained an ACCESS
EXCLUSIVE lock on the partitioned table. It seems much less relevant to
mention which locks are taken on other tables when the partitioned table
itself is locked with an ACCESS EXCLUSIVE lock.
Author: Matthias van de Meent, David Rowley
Discussion: https://postgr.es/m/CAEze2WiTB6iwrV8W_J=fnrnZ7fowW3qu-8iQ8zCHP3FiQ6+o-A@mail.gmail.com
Backpatch-through: 12
This changes the behavior of examining the pg_file_settings view after
changing a config option that requires restart. The user needs to know
that any change of such options does not take effect until a restart,
and this worked correctly if the line is edited without removing it.
However, for the case where the line is removed altogether, the flag
doesn't get set, because a flag was only set in set_config_option, but
that's not called for lines removed. Repair.
(Ref.: commits 62d16c7fc561 and a486e35706ea)
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/202107262302.xsfdfc5sb7sh@alvherre.pgsql
We failed to deal with an UNKNOWN-type input for
anycompatiblemultirange; that should throw an error indicating
that we don't know how to resolve the multirange type.
We also failed to infer the type of an anycompatiblerange output
from an anycompatiblemultirange input or vice versa.
Per bug #17066 from Alexander Lakhin. Back-patch to v14
where multiranges were added.
Discussion: https://postgr.es/m/17066-16a37f6223a8470b@postgresql.org
The error messages using the word "non-negative" are confusing
because it's ambiguous about whether it accepts zero or not.
This commit improves those error messages by replacing it with
less ambiguous word like "greater than zero" or
"greater than or equal to zero".
Also this commit added the note about the word "non-negative" to
the error message style guide, to help writing the new error messages.
When postgres_fdw option fetch_size was set to zero, previously
the error message "fetch_size requires a non-negative integer value"
was reported. This error message was outright buggy. Therefore
back-patch to all supported versions where such buggy error message
could be thrown.
Reported-by: Hou Zhijie
Author: Bharath Rupireddy
Reviewed-by: Kyotaro Horiguchi, Fujii Masao
Discussion: https://postgr.es/m/OS0PR01MB5716415335A06B489F1B3A8194569@OS0PR01MB5716.jpnprd01.prod.outlook.com
Add pg_resetxlog -u option to set the oldest xid in pg_control.
Previously -x set this value be -2 billion less than the -x value.
However, this causes the server to immediately scan all relation's
relfrozenxid so it can advance pg_control's oldest xid to be inside the
autovacuum_freeze_max_age range, which is inefficient and might disrupt
diagnostic recovery. pg_upgrade will use this option to better create
the new cluster to match the old cluster.
Reported-by: Jason Harvey, Floris Van Nee
Discussion: https://postgr.es/m/20190615183759.GB239428@rfd.leadboat.com, 87da83168c644fd9aae38f546cc70295@opammb0562.comp.optiver.com
Author: Bertrand Drouvot
Backpatch-through: 9.6
Commit ad600bba04 added psql command \dX listing extended statistics
objects, but it failed to consider search_path when selecting the
elements so some of the returned elements might be invisible.
The visibility was already considered for tab completion (added by
commit d99d58cdc8), so adding it to the query is fairly simple.
Reported and fix by Justin Pryzby, regression tests by me. Backpatch
to PostgreSQL 14, where \dX was introduced.
Batchpatch-through: 14
Author: Justin Pryzby
Reviewed-by: Tatsuro Yamada
Discussion: https://postgr.es/m/c027a541-5856-75a5-0868-341301e1624b%40nttcom.co.jp_1
The documentation mentioned the use of log_checkpoints, that cannot be
used in this context. This commit replaces log_checkpoints with
force_parallel_mode, a developer option useful to perform checks related
to parallelism.
Oversight in 854434c.
Author: Haiying Tang
Discussion: https://postgr.es/m/OS0PR01MB6113954B883ACEB2DDC973F2FBE59@OS0PR01MB6113.jpnprd01.prod.outlook.com
Backpatch-through: 14
Adjust the header comment in get_agg_clause_costs so that it matches what
the function currently does. No recursive searching has been done ever
since 0a2bc5d61. It also does not determine the aggtranstype like the
comment claimed. That's all done in preprocess_aggref().
preprocess_aggref also now determines the numOrderedAggs, so remove the
mention that get_agg_clause_costs also calculates "counts".
Normally, since this is just an adjustment of a comment it might not be
worth back-patching, but since this code is new to PG14 and that version
is still in beta, then it seems worth having the comments match.
Discussion: https://postgr.es/m/CAApHDvrrGrTJFPELrjx0CnDtz9B7Jy2XYW3Z2BKifAWLSaJYwQ@mail.gmail.com
Backpatch-though: 14
These have been introduced by 7fbe0c8, and could happen for
pg_basebackup and pg_receivewal.
Per report from Coverity for the ones in walmethods.c, I have spotted
the ones in receivelog.c after more review.
Backpatch-through: 10
The point of introducing the hash_mem_multiplier GUC was to let users
reproduce the old behavior of hash aggregation, i.e. that it could use
more than work_mem at need. However, the implementation failed to get
the job done on Win64, where work_mem is clamped to 2GB to protect
various places that calculate memory sizes using "long int". As
written, the same clamp was applied to hash_mem. This resulted in
severe performance regressions for queries requiring a bit more than
2GB for hash aggregation, as they now spill to disk and there's no
way to stop that.
Getting rid of the work_mem restriction seems like a good idea, but
it's a big job and could not conceivably be back-patched. However,
there's only a fairly small number of places that are concerned with
the hash_mem value, and it turns out to be possible to remove the
restriction there without too much code churn or any ABI breaks.
So, let's do that for now to fix the regression, and leave the
larger task for another day.
This patch does introduce a bit more infrastructure that should help
with the larger task, namely pg_bitutils.h support for working with
size_t values.
Per gripe from Laurent Hasson. Back-patch to v13 where the
behavior change came in.
Discussion: https://postgr.es/m/997817.1627074924@sss.pgh.pa.us
Discussion: https://postgr.es/m/MN2PR15MB25601E80A9B6D1BA6F592B1985E39@MN2PR15MB2560.namprd15.prod.outlook.com
5a1e1d83022 was a minimal bug fix for dc7420c2c92. To avoid future bugs of
that kind, deduplicate the choice of a relation's horizon into a new helper,
GlobalVisHorizonKindForRel().
As the code in question was only introduced in dc7420c2c92 it seems worth
backpatching this change as well, otherwise 14 will look different from all
other branches.
A different approach to this was suggested by Matthias van de Meent.
Author: Andres Freund
Discussion: https://postgr.es/m/20210621122919.2qhu3pfugxxp3cji@alap3.anarazel.de
Backpatch: 14, like 5a1e1d83022
We have an implementation restriction that PREPARE TRANSACTION can't
handle cases where both session-lifespan and transaction-lifespan locks
are held on the same lockable object. (That's because we'd otherwise
need to acquire a new PROCLOCK entry during post-prepare cleanup, which
is an operation that might fail. The situation can only arise with odd
usages of advisory locks, so removing the restriction is probably not
worth the amount of effort it would take.) AtPrepare_Locks attempted
to enforce this, but its logic was many bricks shy of a load, because
it only detected cases where the session and transaction locks had the
same lockmode. Locks of different modes on the same object would lead
to the rather unhelpful message "PANIC: we seem to have dropped a bit
somewhere".
To fix, build a transient hashtable with one entry per locktag,
not one per locktag + mode, and use that to detect conflicts.
Per bug #17122 from Alexander Pyhalov. This bug is ancient,
so back-patch to all supported branches.
Discussion: https://postgr.es/m/17122-04f3c32098a62233@postgresql.org
We previously took a hard-line attitude that callers should never print
a null string pointer, and doing so is worthy of an assertion failure
or crash. However, we've long since flushed out any easy-to-find bugs
of that nature. What remains is a lot of code that perhaps could fail
that way in hard-to-reach corner cases. For example, in something as
simple as
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
errmsg("constraint \"%s\" for table \"%s\" does not exist",
conname, get_rel_name(relid))));
one must wonder whether it's completely guaranteed that get_rel_name
cannot return NULL in this context. If such a situation did occur,
the existing policy converts what might be a pretty minor bug into
a server crash condition. This is not good for robustness.
Hence, let's follow the lead of glibc and print "(null)" instead
of failing. We should, of course, still consider it a bug if that
behavior is reachable in ordinary use; but crashing seems less
desirable than not crashing.
This fix works across-the-board in v12 and up, where we always use
src/port/snprintf.c. Before that, on most platforms we're at the mercy
of the local libc, but it appears that Solaris 10 is the only supported
platform where we'd still get a crash. Most other platforms such as
*BSD, macOS, and Solaris 11 have adopted glibc's behavior at some
point. (AIX and HPUX just print "" not "(null)", but that's close
enough.) I've not checked what Windows' native printf would do, but
it doesn't matter because we've long used snprintf.c on that platform.
In v12 and up, also const-ify related code so that we're not casting
away const on the constant string. This is just neatnik-ism, since
next to no compilers will warn about that.
Discussion: https://postgr.es/m/17098-b960f3616c861f83@postgresql.org
This testing was useful when it was written, nigh twenty years ago,
but it seems fairly pointless for any platform built in the last
dozen or more years. (Compare also the comments at 8a2121185.)
Also we now have reports that the test program itself fails under
ThreadSanitizer. Rather than invest effort in fixing it, let's
just drop it, and assume that the few people who still care
already know they need to use --disable-thread-safety.
Back-patch into v14, for consistency with 8a2121185.
Discussion: https://postgr.es/m/CADhDkKzPSiNvA3Hyq+wSR_icuPmazG0cFe=YnC3U-CFcYLc8Xw@mail.gmail.com
Code inlined by LLVM can crash or fail with "Relocation type not
implemented yet!" if it tries to access thread local variables. Don't
inline such code.
Back-patch to 11, where LLVM arrived. Bug #16696.
Author: Dmitry Marakasov <amdmi3@amdmi3.ru>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/16696-29d944a33801fbfe@postgresql.org
In postgresql.conf, memory and file size GUCs can be specified with "B"
(bytes) as of b06d8e58b. Likewise, time GUCs can be specified with "us"
(microseconds) as of caf626b2c. Update postgres.conf.sample to reflect
that fact.
Pavel Luzanov
Backpatch to v12, which is the earliest version that allows both of
these units. A separate commit will document the "B" case for v11.
Discussion: https://www.postgresql.org/message-id/flat/f10d16fc-8fa0-1b3c-7371-cb3a35a13b7a%40postgrespro.ru
If an error was raised during our initial attempt to check whether
a successfully-compiled expression is "simple", subsequent calls of
exec_stmt_execsql would suppose that stmt->mod_stmt was already computed
when it had not been. This could lead to assertion failures in debug
builds; in production builds the effect would typically be to act as
if INTO STRICT had been specified even when it had not been. Of course
that only matters if the subsequent attempt to execute the expression
succeeds, so that the problem can only be reached by fixing a failure
in some referenced, inline-able SQL function and then retrying the
calling plpgsql function in the same session.
(There might be even-more-obscure ways to change the expression's
behavior without changing the plpgsql function, but that one seems
like the only one people would be likely to hit in practice.)
The most foolproof way to fix this would be to arrange for
exec_prepare_plan to not set expr->plan until we've finished the
subsidiary simple-expression check. But it seems hard to do that
without creating reference-count leak issues. So settle for documenting
the hazard in a comment and fixing exec_stmt_execsql to test separately
for whether it's computed stmt->mod_stmt. (That adds a test-and-branch
per execution, but hopefully that's negligible in context.) In v11 and
up, also fix exec_stmt_call which had a variant of the same issue.
Per bug #17113 from Alexander Lakhin. Back-patch to all
supported branches.
Discussion: https://postgr.es/m/17113-077605ce00e0e7ec@postgresql.org
The logic handling the opening of new WAL segments was fuzzy when using
--compress if a partial, non-compressed, segment with the same base name
existed in the repository storing those files. In this case, using
--compress would cause the code to first check for the existence and the
size of a non-compressed segment, followed by the opening of a new
compressed, partial, segment. The code was accidentally working
correctly on most platforms as the buildfarm has proved, except
bowerbird where gzflush() could fail in this code path. It is wrong
anyway to take the code path used pre-padding when creating a new
partial, non-compressed, segment, so let's fix it.
Note that this issue exists when users mix successive runs of
pg_receivewal with or without compression, as discovered with the tests
introduced by ffc9dda.
While on it, this refactors the code so as code paths that need to know
about the ".gz" suffix are down from four to one in walmethods.c, easing
a bit the introduction of new compression methods. This addresses a
second issue where log messages generated for an unexpected failure
would not show the compressed segment name involved, which was
confusing, printing instead the name of the non-compressed equivalent.
Reported-by: Georgios Kokolatos
Discussion: https://postgr.es/m/YPDLz2x3o1aX2wRh@paquier.xyz
Backpatch-through: 10
Commit 3499df0d added a comment that incorrectly suggested that
--force-index-cleanup did not appear in the same major version as the
similar --no-index-cleanup option. In fact, both options are new to
PostgreSQL 14.
Backpatch: 14-, where both options were introduced.
Further fix the test code in ead9e51e8236, this time by waiting until
the checkpoint has completed before moving on; this ensures that the
WAL segment removal has already happened when we create the next slot.
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20210719.111318.2042379313472032754.horikyota.ntt@gmail.com
We don't allow to create replication slot_name as an empty string ('') via
SQL API pg_create_logical_replication_slot() but it is allowed to be set
via Alter Subscription command. This will lead to apply worker repeatedly
keep trying to stream data via slot_name '' and the user is not allowed to
create the slot with that name.
Author: Japin Li
Reviewed-By: Ranier Vilela, Amit Kapila
Backpatch-through: 10, where it was introduced
Discussion: https://postgr.es/m/MEYP282MB1669CBD98E721C77CA696499B61A9@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM
It has been spotted that multiranges lack of ability to decompose them into
individual ranges. Subscription and proper expanded object representation
require substantial work, and it's too late for v14. This commit
provides the implementation of unnest(multirange), which is quite trivial.
unnest(multirange) is defined as a polymorphic procedure.
Catversion is bumped.
Reported-by: Jonathan S. Katz
Discussion: https://postgr.es/m/flat/60258efe-bd7e-4886-82e1-196e0cac5433%40postgresql.org
Author: Alexander Korotkov
Reviewed-by: Justin Pryzby, Jonathan S. Katz, Zhihong Yu, Tom Lane
Reviewed-by: Alvaro Herrera
The new test code added in ead9e51e8236 is racy -- it hinges on
shared-memory state, which changes before the WARNING message is logged.
Put it the other way around.
Backpatch to 13.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/202107161809.zclasccpfcg3@alvherre.pgsql
We had documentation of default_transaction_isolation et al,
but for some reason not of transaction_isolation et al.
AFAICS this is just an ancient oversight, so repair.
Per bug #17077 from Yanliang Lei.
Discussion: https://postgr.es/m/17077-ade8e166a01e1374@postgresql.org
pg_dump failed to preserve the 'enabled' flag (which can be not only
disabled, but also REPLICA or ALWAYS) for partitions which had it
changed from their respective parents. Attempt to handle that by
including a definition for such triggers in the dump, but replace the
standard CREATE TRIGGER line with an ALTER TRIGGER line.
Backpatch to 11, where these triggers can exist. In branches 11 and 12,
pick up a few test lines from commit b9b408c48724 to verify that
pg_upgrade is okay with these arrangements.
Co-authored-by: Justin Pryzby <pryzby@telsasoft.com>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20200930223450.GA14848@telsasoft.com
When triggers are cloned from partitioned tables to their partitions,
the 'tgenabled' flag (origin/replica/always/disable) was not propagated.
Make it so that the flag on the trigger on partition is initially set to
the same value as on the partitioned table.
Add a test case to verify the behavior.
Backpatch to 11, where this appeared in commit 86f575948c77.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20200930223450.GA14848@telsasoft.com
When some slots are invalidated due to the max_slot_wal_keep_size limit,
the old segment horizon should move forward to stay within the limit.
However, in commit c6550776394e we forgot to call KeepLogSeg again to
recompute the horizon after invalidating replication slots. In cases
where other slots remained, the limits would be recomputed eventually
for other reasons, but if all slots were invalidated, the limits would
not move at all afterwards. Repair.
Backpatch to 13 where the feature was introduced.
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reported-by: Marcin Krupowicz <mk@071.ovh>
Discussion: https://postgr.es/m/17103-004130e8f27782c9@postgresql.org
Ensure to properly mark up function parameters in text with <parameter>,
avoid using <acronym> for terms which aren't acronyms and properly place
the ", and" in a value list. The acronym removal is a follow-up to commit
fb72a7b8c3 which removed it for minmax-multi. In passing, also fix an
incorrectly cased word.
Author: Ekaterina Kiryanova <e.kiryanova@postgrespro.ru>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Discussion: https://postgr.es/m/c050ecbc-80b2-b360-3c1d-9fe6a6a11bb5@postgrespro.ru
Backpatch-through: v14
Autoconf's AC_CHECK_DECLS() always defines HAVE_DECL_whatever
as 1 or 0, but some of the entries in msvc/Solution.pm showed
such symbols as "undef" instead of 0. Fix that for consistency.
There's no live bug in current usages AFAICS, but it's not hard
to imagine one creeping in if more-complex #if tests get added.
Back-patch to v13, which is as far back as Solution.pm contains
this data. The inconsistency still exists in the manually-filled
pg_config_ext.h.win32 files of older branches; but as long as the
problem is only latent, it doesn't seem worth the trouble to
clean things up there.
Discussion: https://postgr.es/m/3185430.1626133592@sss.pgh.pa.us
This commit fixes the description of a couple of multirange operators and
oprjoin for another multirange operator. The change of oprjoin is more
cosmetic since both old and new functions return the same constant.
These cosmetic changes don't worth catalog incompatibility between 14beta2
and 14beta3. So, catversion isn't bumped.
Discussion: https://postgr.es/m/CAPpHfdv9OZEuZDqOQoUKpXhq%3Dmc-qa4gKCPmcgG5Vvesu7%3Ds1w%40mail.gmail.com
Backpatch-throgh: 14