Commit Graph

59464 Commits

Author SHA1 Message Date
Bruce Momjian f057781686 Improve Perl script which adds commit links to release notes
Reported-by: Andrew Dunstan

Discussion: https://postgr.es/m/b2465837-56df-4794-a0b5-5e6ed44ed870@dunslane.net

Author: Andrew Dunstan

Backpatch-through: 12
2024-09-19 08:45:33 -04:00
Alexander Korotkov 4c57facbb1 Add UpgradeTaskProcessCB to typedefs.list
While it doesn't directly influence indentation right now, add it for
uniformity.
2024-09-19 14:34:52 +03:00
Alexander Korotkov a094e8b9e4 Fix order of includes in src/bin/pg_upgrade/info.c 2024-09-19 14:34:00 +03:00
Alexander Korotkov 014f9f34d2 Move pg_wal_replay_wait() to xlogfuncs.c
This commit moves the pg_wal_replay_wait() procedure to be a neighbor of
WAL-related functions in xlogfuncs.c.  The implementation of LSN waiting
continues to reside in the same place.

By proposal from Michael Paquier.

Reported-by: Peter Eisentraut
Discussion: https://postgr.es/m/18c0fa64-0475-415e-a1bd-665d922c5201%40eisentraut.org
2024-09-19 14:26:11 +03:00
Michael Paquier 87eeadaea1 psql: Clean up state of \bind[_named], \parse and \close more aggressively
This fixes a couple of issues with the psql meta-commands mentioned
above when called repeatedly:
- The statement name is now reset for each call.  Previously, if a
command errored out, its send_mode would still be set, causing an
incorrect path to be taken when processing a query.  For \bind_named,
this could trigger an assertion failure, as a statement name is always
expected for this meta-command.  This issue was introduced by
d55322b0da.
- The memory allocated for bind parameters could be leaked.  This bug,
made worse by d55322b0da, has existed since 5b66de3433, as it is also
possible to leak memory with \bind in v16 and v17.  The affected
branches require a separate fix; this commit takes care of HEAD.

This patch tightens the cleanup of the state used for the extended
protocol meta-commands (bind parameters, send mode, statement name) by
performing it before running each meta-command, on top of doing it once
a query has been processed.  The cleanup is refactored into a single
routine used in all the code paths where this step is required,
avoiding leaks and inconsistencies when mixing calls.
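
As an illustration, here is a minimal psql session exercising the
repeated calls that this cleanup covers (a sketch against HEAD, where
\parse, \bind_named and \close exist; \bind alone is available in v16
and v17):

    SELECT $1 AS a \bind 1 \g
    SELECT $1 AS a \bind 2 \g
    SELECT $1 AS a, $2 AS b \parse stmt1
    \bind_named stmt1 1 2 \g
    \bind_named stmt1 3 4 \g
    \close stmt1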

Reported-by: Alexander Lakhin
Author: Anthonin Bonnefoy
Discussion: https://postgr.es/m/2e5b89af-a351-ff0a-000c-037ac28314ab@gmail.com
2024-09-19 15:39:01 +09:00
Michael Paquier d69a3f4d70 Introduce ATT_PARTITIONED_TABLE in tablecmds.c
Partitioned tables and normal tables have been relying on ATT_TABLE in
ATSimplePermissions() to produce error messages that depend on the
relation's relkind, because both relkinds currently support the same set
of ALTER TABLE subcommands.

A patch to restrict SET LOGGED/UNLOGGED for partitioned tables is under
discussion, and introducing ATT_PARTITIONED_TABLE makes subcommand
restrictions for partitioned tables easier to deal with, so let's add
one.  There is no functional change.

Author: Michael Paquier
Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/Zt6cDnwSvnuLLnak@paquier.xyz
2024-09-19 12:22:56 +09:00
David Rowley 5d56d07ca3 Optimize tuplestore usage for WITH RECURSIVE CTEs
nodeRecursiveunion.c makes use of two tuplestores and, until now, would
delete and recreate one of these tuplestores after every recursive
iteration.

Here we adjust that behavior and instead reuse one of the existing
tuplestores and just empty it of all tuples using tuplestore_clear().

This saves some free/malloc roundtrips and has shown a 25-30% performance
improvement for queries that perform very little work between recursive
iterations.
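
For example, a query of this shape does very little work between
iterations, so the tuplestore churn was proportionally expensive (an
illustrative case, not taken from the commit):

    WITH RECURSIVE t(n) AS (
        VALUES (1)
        UNION ALL
        SELECT n + 1 FROM t WHERE n < 1000000
    )
    SELECT count(*) FROM t;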

This also paves the way to add some EXPLAIN ANALYZE telemetry output for
recursive common table expressions, similar to what was done in 1eff8279d
and 95d6e9af0.  Previously, calling tuplestore_end() would have caused
the maximum storage space used to be lost.

Reviewed-by: Tatsuo Ishii
Discussion: https://postgr.es/m/CAApHDvr9yW0YRiK8A2J7nvyT8g17YzbSfOviEWrghazKZbHbig@mail.gmail.com
2024-09-19 15:20:35 +12:00
Bruce Momjian 8a6e85b46e doc PG relnotes: add paragraph explaining the section symbol
And suppress the symbol in print mode, where the section symbol does not
appear.

Discussion: https://postgr.es/m/ZuobILbmGGetxEg5@momjian.us

Backpatch-through: 12
2024-09-18 17:13:19 -04:00
Bruce Momjian f986882ffd doc PG relnotes: no relnote footnotes for commit links in PDF
In print output, there are too many commit links for footnotes in the
release notes to be useful.

Reported-by: Tom Lane

Discussion: https://postgr.es/m/1709858.1726618961@sss.pgh.pa.us

Backpatch-through: 12
2024-09-18 16:34:52 -04:00
Nathan Bossart b52c4fc3c0 Add TOAST table to pg_index.
This change allows pg_index rows to use out-of-line storage for the
"indexprs" and "indpred" columns, which enables use-cases such as
very large index expressions.

This system catalog was previously not given a TOAST table due to a
fear of circularity issues (see commit 96cdeae07f).  Testing has
not revealed any such problems, and it seems unlikely that the
entries for system indexes could ever need out-of-line storage.  In
any case, it is still early in the v18 development cycle, so
committing this now will hopefully increase the chances of finding
any unexpected problems prior to release.
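
On a build including this change, the new TOAST table can be observed
with a query like the following (a hedged illustration; older versions
return "-" here, as pg_index had no TOAST table):

    SELECT reltoastrelid::regclass
    FROM pg_class
    WHERE oid = 'pg_index'::regclass;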

Bumps catversion.

Reported-by: Jonathan Katz
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/b611015f-b423-458c-aa2d-be0e655cc1b4%40postgresql.org
2024-09-18 14:42:57 -05:00
Fujii Masao a7c39db5eb docs: Improve the description of num_timed column in pg_stat_checkpointer.
The previous documentation stated that num_timed reflects the number of
scheduled checkpoints performed. However, checkpoints may be skipped
if the server has been idle, and num_timed counts both skipped and completed
checkpoints. This commit clarifies the description to make it clear that
the counter includes both skipped and completed checkpoints.
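
For reference, the counter in question can be inspected like so
(columns as of v17):

    SELECT num_timed, num_requested, stats_reset
    FROM pg_stat_checkpointer;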

Back-patch to v17 where pg_stat_checkpointer was added.

Author: Fujii Masao
Reviewed-by: Alexander Korotkov
Discussion: https://postgr.es/m/9ea77f40-818d-4841-9dee-158ac8f6e690@oss.nttdata.com
2024-09-19 02:14:10 +09:00
Michael Paquier 24f5205948 Add some sanity checks in executor for query ID reporting
This commit adds three sanity checks in code paths of the executor where
it is possible to use hooks, checking that a query ID is reported in
pg_stat_activity if compute_query_id is enabled:
- ExecutorRun()
- ExecutorFinish()
- ExecutorEnd()

This causes the test in pg_stat_statements added in 933848d16d to
complain immediately in ExecutorRun().  The idea behind this commit is
to help extensions to detect if they are missing query ID reports when a
query goes through the executor.  Perhaps this will prove to be a bad
idea, but let's see where this experience goes in v18 and newer
versions.
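
As a rough sketch of what the checks enforce, a session can confirm
that its own backend reports a query ID once compute_query_id is
enabled:

    SET compute_query_id = on;
    SELECT query_id IS NOT NULL AS has_query_id
    FROM pg_stat_activity
    WHERE pid = pg_backend_pid();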

Reviewed-by: Sami Imseih
Discussion: https://postgr.es/m/ZuJb5xCKHH0A9tMN@paquier.xyz
2024-09-18 14:43:37 +09:00
Fujii Masao 4f08ab5545 postgres_fdw: Extend postgres_fdw_get_connections to return user name.
This commit adds a "user_name" output column to
the postgres_fdw_get_connections function, returning the name
of the local user mapped to the foreign server for each connection.
If a public mapping is used, it returns "public".

This helps identify postgres_fdw connections more easily,
such as determining which connections are invalid, closed,
or used within the current transaction.
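
A hedged example of the extended output, assuming the v18 signature
(the used_in_xact and closed columns come from the earlier work in
c297a47c5f):

    SELECT server_name, user_name, valid, used_in_xact, closed
    FROM postgres_fdw_get_connections();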

No extension version bump is needed, as commit c297a47c5f
already handled it for v18~.

Author: Hayato Kuroda
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/b492a935-6c7e-8c08-e485-3c1d64d7d10f@oss.nttdata.com
2024-09-18 12:51:48 +09:00
Michael Paquier b14e9ce7d5 Extend PgStat_HashKey.objid from 4 to 8 bytes
This opens the possibility to define keys for more types of statistics
kinds in PgStat_HashKey, the first case being 8-byte query IDs for
statistics like pg_stat_statements.

This increases the size of PgStat_HashKey from 12 to 16 bytes, while
PgStatShared_HashEntry, the entry stored in the dshash table for
pgstats, keeps the same size due to alignment.

xl_xact_stats_item, which tracks the stats items to drop in commit WAL
records, is increased from 12 to 16 bytes.  Note that individual chunks
in commit WAL records should be multiples of sizeof(int), hence 8-byte
object IDs are stored as two uint32, based on a suggestion from Heikki
Linnakangas.

While at it, the PgStat_HashKey field is renamed from "objoid" to
"objid", as for some stats kinds this field does not refer to OIDs but
just IDs, like for replication slot stats.

This commit bumps the following format variables:
- PGSTAT_FILE_FORMAT_ID, as PgStat_HashKey is written to the stats file
for non-serialized stats kinds in the dshash table.
- XLOG_PAGE_MAGIC for the changes in xl_xact_stats_item.
- Catalog version, for the SQL function pg_stat_have_stats().

Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/ZsvTS9EW79Up8I62@paquier.xyz
2024-09-18 12:44:15 +09:00
Noah Misch ac04aa84a7 Don't enter parallel mode when holding interrupts.
Doing so caused the leader to hang in wait_event=ParallelFinish, which
required an immediate shutdown to resolve.  Back-patch to v12 (all
supported versions).

Francesco Degrassi

Discussion: https://postgr.es/m/CAC-SaSzHUKT=vZJ8MPxYdC_URPfax+yoA1hKTcF4ROz_Q6z0_Q@mail.gmail.com
2024-09-17 19:53:11 -07:00
Michael Paquier 933848d16d Add missing query ID reporting in extended query protocol
This commit adds query ID reports for two code paths when processing
extended query protocol messages:
- When receiving a bind message, setting it to the first Query retrieved
from the cached plan.
- When receiving an execute message, setting it to the first PlannedStmt
stored in a portal.

An advantage of this method is that it covers all the types of portals
handled in the extended query protocol, particularly these two cases
where the report done in ExecutorStart() is not enough (nor, for the
second point, would an addition in ExecutorRun() be):
- Multiple execute messages, with multiple ExecutorRun().
- Portal with execute/fetch messages, like a query with a RETURNING
clause and a fetch size that stores the tuples in a first execute
message going through ExecutorStart() and ExecutorRun(), followed by one
or more execute messages doing only fetches from the tuplestore created
in the first message.  This corresponds to the case where
execute_is_fetch is set, for example.

Note that the query ID reporting done in ExecutorStart() is still
necessary, as an EXECUTE requires it.  Query ID reporting is
optimistic: additional calls to pgstat_report_query_id() don't matter,
as the first report takes priority unless the report is forced.  The
comment in ExecutorStart() is adjusted to better reflect the reality
with the extended query protocol.

The test added in pg_stat_statements is a courtesy of Robert Haas.  This
uses psql's \bind metacommand, hence this part is backpatched down to
v16.
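
For reference, the test exercises the extended query protocol along
these lines (a sketch, not the exact test):

    SELECT pg_stat_statements_reset() IS NOT NULL AS ok;
    SELECT $1 AS a \bind 1 \g
    SELECT calls, query FROM pg_stat_statements ORDER BY query;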

Reported-by: Kaido Vaikla, Erik Wienhold
Author: Sami Imseih
Reviewed-by: Jian He, Andrei Lepikhov, Michael Paquier
Discussion: https://postgr.es/m/CA+427g8DiW3aZ6pOpVgkPbqK97ouBdf18VLiHFesea2jUk3XoQ@mail.gmail.com
Discussion: https://postgr.es/m/CA+TgmoZxtnf_jZ=VqBSyaU8hfUkkwoJCJ6ufy4LGpXaunKrjrg@mail.gmail.com
Discussion: https://postgr.es/m/1391613709.939460.1684777418070@office.mailbox.org
Backpatch-through: 14
2024-09-18 09:59:09 +09:00
Thomas Munro 70d38e3d8a Allow ReadStream to be consumed as raw block numbers.
Commits 041b9680 and 6377e12a changed the interface of
scan_analyze_next_block() to take a ReadStream instead of a BlockNumber
and a BufferAccessStrategy, and to return a value to indicate when the
stream has run out of blocks.

This caused integration problems for at least one known extension that
uses specially encoded BlockNumber values that map to different
underlying storage, because acquire_sample_rows() sets up the stream so
that read_stream_next_buffer() reads blocks from the main fork of the
relation's SMgrRelation.

Provide read_stream_next_block(), as a way for such an extension to
access the stream of raw BlockNumbers directly and forward them to its
own ReadBuffer() calls after decoding, as it could in earlier releases.
The new function returns the BlockNumber and BufferAccessStrategy that
were previously passed directly to scan_analyze_next_block().
Alternatively, an extension could wrap the stream of BlockNumbers in
another ReadStream with a callback that performs any decoding required
to arrive at real storage manager BlockNumber values, so that it could
benefit from the I/O combining and concurrency provided by
read_stream.c.

Another class of table access method that does nothing in
scan_analyze_next_block() because it is not block-oriented could use
this function to control the number of block sampling loops.  It could
match the previous behavior with "return read_stream_next_block(stream,
&bas) != InvalidBlockNumber".

Ongoing work is expected to provide better ANALYZE support for table
access methods that don't behave like heapam with respect to storage
blocks, but that will be for future releases.

Back-patch to 17.

Reported-by: Mats Kindahl <mats@timescale.com>
Reviewed-by: Mats Kindahl <mats@timescale.com>
Discussion: https://postgr.es/m/CA%2B14425%2BCcm07ocG97Fp%2BFrD9xUXqmBKFvecp0p%2BgV2YYR258Q%40mail.gmail.com
2024-09-18 11:34:28 +12:00
Tom Lane 918e21d251 Repair pg_upgrade for identity sequences with non-default persistence.
Since we introduced unlogged sequences in v15, identity sequences
have defaulted to having the same persistence as their owning table.
However, it is possible to change that with ALTER SEQUENCE, and
pg_dump tries to preserve the logged-ness of sequences when it doesn't
match (as indeed it wouldn't for an unlogged table from before v15).

The fly in the ointment is that ALTER SEQUENCE SET [UN]LOGGED fails
in binary-upgrade mode, because it needs to assign a new relfilenode
which we cannot permit in that mode.  Thus, trying to pg_upgrade a
database containing a mismatching identity sequence failed.

To fix, add syntax to ADD/ALTER COLUMN GENERATED AS IDENTITY to allow
the sequence's persistence to be set correctly at creation, and use
that instead of ALTER SEQUENCE SET [UN]LOGGED in pg_dump.  (I tried to
make SET [UN]LOGGED work without any pg_dump modifications, but that
seems too fragile to be a desirable answer.  This way should be
markedly faster anyhow.)
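
A sketch of the kind of dump output this enables, assuming UNLOGGED is
accepted among the identity sequence options (names here are
hypothetical):

    ALTER TABLE public.t ALTER COLUMN a
        ADD GENERATED ALWAYS AS IDENTITY (
            SEQUENCE NAME public.t_a_seq
            UNLOGGED
            START WITH 1
        );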

In passing, document the previously-undocumented SEQUENCE NAME option
that pg_dump also relies on for identity sequences; I see no value
in trying to pretend it doesn't exist.

Per bug #18618 from Anthony Hsu.
Back-patch to v15 where we invented this stuff.

Discussion: https://postgr.es/m/18618-d4eb26d669ed110a@postgresql.org
2024-09-17 15:53:35 -04:00
Alexander Korotkov 2520226c95 Ensure standby promotion point in 043_wal_replay_wait.pl
This commit ensures the standby will be promoted at least at the primary
insert LSN we have just observed.  We use pg_switch_wal() to force the
insert LSN to be written out, then wait for the standby to catch up.
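
The shape of the fix, roughly (an illustrative SQL sketch of the test's
logic, not the Perl code; the LSN is hypothetical):

    -- primary: force the observed insert LSN out to WAL
    SELECT pg_switch_wal();
    SELECT pg_current_wal_insert_lsn();

    -- standby: wait until at least that LSN has been replayed
    CALL pg_wal_replay_wait('0/3000028');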

Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/1d7b08f2-64a2-77fb-c666-c9a74c68eeda%40gmail.com
2024-09-17 22:51:06 +03:00
Alexander Korotkov 85b98b8d5a Minor cleanup related to pg_wal_replay_wait() procedure
* Rename $node_standby1 to $node_standby in 043_wal_replay_wait.pl, as
  there is only one standby.
* Remove useless debug printing in 043_wal_replay_wait.pl.
* Fix a typo in one check description in 043_wal_replay_wait.pl.
* Fix some wording in comments and documentation.

Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/1d7b08f2-64a2-77fb-c666-c9a74c68eeda%40gmail.com
Reviewed-by: Alexander Lakhin
2024-09-17 22:50:43 +03:00
Peter Geoghegan d8adfc18be Avoid parallel nbtree index scan hangs with SAOPs.
Commit 5bf748b8, which enhanced nbtree ScalarArrayOp execution, made
parallel index scans work with the new design for arrays via explicit
scheduling of primitive index scans.  A backend that successfully
scheduled the scan's next primitive index scan saved its backend local
array keys in shared memory.  Any backend could pick up the scheduled
primitive scan within _bt_first.  This scheme decouples scheduling a
primitive scan from starting the scan (by performing another descent of
the index via a _bt_search call from _bt_first) to make things robust.

The scheme had a deadlock hazard, at least when the leader process
participated in the scan.  _bt_parallel_seize had a code path that made
backends that were not in an immediate position to start a scheduled
primitive index scan wait for some other backend to do so instead.
Under the right circumstances, the leader process could wait here
forever: the leader would wait for any other backend to start the
primitive scan, while every worker was busy waiting on the leader to
consume tuples from the scan's tuple queue.

To fix, don't wait for a scheduled primitive index scan to be started by
some other eligible backend from within _bt_parallel_seize (when the
calling backend isn't in a position to do so itself).  Return false
instead, while recording that the scan has a scheduled primitive index
scan in backend local state.  This leaves the backend in the same state
as the existing case where a backend schedules (or tries to schedule)
another primitive index scan from within _bt_advance_array_keys, before
calling _bt_parallel_seize.  _bt_parallel_seize already handles that
case by returning false without waiting, and without unsetting the
backend local state.  Leaving the backend in this state enables it to
start a previously scheduled primitive index scan once it gets back to
_bt_first.

Oversight in commit 5bf748b8, which enhanced nbtree ScalarArrayOp
execution.

Matthias van de Meent, with tweaks by me.

Author: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reported-By: Tomas Vondra <tomas@vondra.me>
Reviewed-By: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WzmMGaPa32u9x_FvEbPTUkP5e95i=QxR8054nvCRydP-sw@mail.gmail.com
Backpatch: 17-, where nbtree SAOP execution was enhanced.
2024-09-17 11:10:35 -04:00
Peter Eisentraut 89f908a6d0 Add temporal FOREIGN KEY constraints
Add PERIOD clause to foreign key constraint definitions.  This is
supported for range and multirange types.  Temporal foreign keys check
for range containment instead of equality.

This feature matches the behavior of the SQL standard temporal foreign
keys, but it works on PostgreSQL's native ranges instead of SQL's
"periods", which don't exist in PostgreSQL (yet).

Reference actions ON {UPDATE,DELETE} {CASCADE,SET NULL,SET DEFAULT}
are not supported yet.
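
A brief sketch of the new syntax (btree_gist supplies GiST equality
support for the scalar key part):

    CREATE EXTENSION btree_gist;

    CREATE TABLE rooms (
        id integer,
        during daterange,
        PRIMARY KEY (id, during WITHOUT OVERLAPS)
    );

    CREATE TABLE bookings (
        room_id integer,
        during daterange,
        FOREIGN KEY (room_id, PERIOD during)
            REFERENCES rooms (id, PERIOD during)
    );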

(previously committed as 34768ee361, reverted by 8aee330af55; this is
essentially unchanged from those)

Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2024-09-17 11:29:30 +02:00
Peter Eisentraut fc0438b4e8 Add temporal PRIMARY KEY and UNIQUE constraints
Add WITHOUT OVERLAPS clause to PRIMARY KEY and UNIQUE constraints.
These are backed by GiST indexes instead of B-tree indexes, since they
are essentially exclusion constraints with = for the scalar parts of
the key and && for the temporal part.

(previously committed as 46a0cd4cef, reverted by 8aee330af55; the new
part is this:)

Because 'empty' && 'empty' is false, the temporal PK/UQ constraint
allowed duplicates, which is confusing to users and breaks internal
expectations.  For instance, when GROUP BY checks functional
dependencies on the PK, it allows selecting other columns from the
table, but in the presence of duplicate keys you could get the value
from any of their rows.  So we need to forbid empties.
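
For instance (a sketch; btree_gist supplies GiST equality support for
the scalar key part):

    CREATE EXTENSION btree_gist;

    CREATE TABLE prices (
        item text,
        during daterange,
        UNIQUE (item, during WITHOUT OVERLAPS)
    );

    -- Expected to be rejected: 'empty' && 'empty' is false, so empties
    -- would otherwise allow duplicate keys.
    INSERT INTO prices VALUES ('widget', 'empty');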

This all means that at the moment we can only support ranges and
multiranges for temporal PK/UQs, unlike the original patch (above).
Documentation and tests for this are added.  But this could
conceivably be extended by introducing some more general support for
the notion of "empty" for other types.

Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2024-09-17 11:29:30 +02:00
Peter Eisentraut 7406ab623f Add stratnum GiST support function
This is support function 12 for the GiST AM and translates
"well-known" RT*StrategyNumber values into whatever strategy number is
used by the opclass (since no particular numbers are actually
required).  We will use this to support temporal PRIMARY
KEY/UNIQUE/FOREIGN KEY/FOR PORTION OF functionality.

This commit adds two implementations, one for internal GiST opclasses
(just an identity function) and another for btree_gist opclasses.  It
updates btree_gist from 1.7 to 1.8, adding the support function for
all its opclasses.
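
To see which GiST operator families carry the new support function, a
catalog query like this works (a hedged illustration):

    SELECT opf.opfname, amp.amproc
    FROM pg_amproc amp
    JOIN pg_opfamily opf ON opf.oid = amp.amprocfamily
    JOIN pg_am am ON am.oid = opf.opfmethod
    WHERE am.amname = 'gist' AND amp.amprocnum = 12;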

(previously committed as 6db4598fcb, reverted by 8aee330af55; this is
essentially unchanged from those)

Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2024-09-17 11:29:29 +02:00
Tatsuo Ishii 95d6e9af07 Add memory/disk usage for Window aggregate nodes in EXPLAIN.
This commit is similar to 1eff8279d and expands the idea to Window
aggregate nodes so that users can see how much memory or disk space the
tuplestore used.

This commit uses the newly introduced tuplestore_get_stats() to obtain
this information and adds some additional output in EXPLAIN ANALYZE to
display the information for the Window aggregate node.
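
A hedged example; the Storage line follows the format established by
1eff8279d:

    EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF, SUMMARY OFF)
    SELECT sum(n) OVER (ORDER BY n)
    FROM generate_series(1, 10000) g(n);

The WindowAgg node is then expected to show something like
"Storage: Memory  Maximum Storage: ...kB".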

Reviewed-by: David Rowley, Ashutosh Bapat, Maxim Orlov, Jian He
Discussion: https://postgr.es/m/20240706.202254.89740021795421286.ishii%40postgresql.org
2024-09-17 14:38:53 +09:00
Nathan Bossart 1bbf1e2f1a Fix redefinition of typedef.
Per buildfarm members sifaka and longfin, clang with
-Wtypedef-redefinition warns of a duplicate typedef unless building
with C11.

Oversight in commit 40e2e5e92b.
2024-09-16 16:33:50 -05:00
Nathan Bossart c880cf2588 pg_upgrade: Parallelize encoding conversion check.
This commit makes use of the new task framework in pg_upgrade to
parallelize the check for incompatible user-defined encoding
conversions, i.e., those defined on servers older than v14.  This
step will now process multiple databases concurrently when
pg_upgrade's --jobs option is provided a value greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13
2024-09-16 16:10:33 -05:00
Nathan Bossart f93f5f7b98 pg_upgrade: Parallelize WITH OIDS check.
This commit makes use of the new task framework in pg_upgrade to
parallelize the check for tables declared WITH OIDS.  This step
will now process multiple databases concurrently when pg_upgrade's
--jobs option is provided a value greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13
2024-09-16 16:10:33 -05:00
Nathan Bossart cf2f82a37c pg_upgrade: Parallelize incompatible polymorphics check.
This commit makes use of the new task framework in pg_upgrade to
parallelize the check for usage of incompatible polymorphic
functions, i.e., those with arguments of type anyarray/anyelement
rather than the newer anycompatible variants.  This step will now
process multiple databases concurrently when pg_upgrade's --jobs
option is provided a value greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13
2024-09-16 16:10:33 -05:00
Nathan Bossart c34eabfbbf pg_upgrade: Parallelize postfix operator check.
This commit makes use of the new task framework in pg_upgrade to
parallelize the check for user-defined postfix operators.  This
step will now process multiple databases concurrently when
pg_upgrade's --jobs option is provided a value greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13
2024-09-16 16:10:33 -05:00
Nathan Bossart 9db3018cf8 pg_upgrade: Parallelize contrib/isn check.
This commit makes use of the new task framework in pg_upgrade to
parallelize the check for contrib/isn functions that rely on the
bigint data type.  This step will now process multiple databases
concurrently when pg_upgrade's --jobs option is provided a value
greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13
2024-09-16 16:10:33 -05:00
Nathan Bossart bbf83cab98 pg_upgrade: Parallelize data type checks.
This commit makes use of the new task framework in pg_upgrade to
parallelize the checks for incompatible data types, i.e., data
types whose on-disk format has changed, data types that have been
removed, etc.  This step will now process multiple databases
concurrently when pg_upgrade's --jobs option is provided a value
greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13
2024-09-16 16:10:33 -05:00
Nathan Bossart 6ab8f27bc7 pg_upgrade: Parallelize retrieving extension updates.
This commit makes use of the new task framework in pg_upgrade to
parallelize retrieving the set of extensions that should be updated
with the ALTER EXTENSION command after upgrade.  This step will now
process multiple databases concurrently when pg_upgrade's --jobs
option is provided a value greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13
2024-09-16 16:10:33 -05:00
Nathan Bossart 46cad8b319 pg_upgrade: Parallelize retrieving loadable libraries.
This commit makes use of the new task framework in pg_upgrade to
parallelize retrieving the names of all libraries referenced by
non-built-in C functions.  This step will now process multiple
databases concurrently when pg_upgrade's --jobs option is provided
a value greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13
2024-09-16 16:10:33 -05:00
Nathan Bossart 7baa36de58 pg_upgrade: Parallelize subscription check.
This commit makes use of the new task framework in pg_upgrade to
parallelize the part of check_old_cluster_subscription_state() that
verifies each of the subscribed tables is in the 'i' (initialize)
or 'r' (ready) state.  This check will now process multiple
databases concurrently when pg_upgrade's --jobs option is provided
a value greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13
2024-09-16 16:10:33 -05:00
Nathan Bossart 6d3d2e8e54 pg_upgrade: Parallelize retrieving relation information.
This commit makes use of the new task framework in pg_upgrade to
parallelize retrieving relation and logical slot information.  This
step will now process multiple databases concurrently when
pg_upgrade's --jobs option is provided a value greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13
2024-09-16 16:10:33 -05:00
Nathan Bossart 40e2e5e92b Introduce framework for parallelizing various pg_upgrade tasks.
A number of pg_upgrade steps require connecting to every database
in the cluster and running the same query in each one.  When there
are many databases, these steps are particularly time-consuming,
especially since they are performed sequentially, i.e., we connect
to a database, run the query, and process the results before moving
on to the next database.

This commit introduces a new framework that makes it easy to
parallelize most of these once-in-each-database tasks by processing
multiple databases concurrently.  This framework manages a set of
slots that follow a simple state machine, and it uses libpq's
asynchronous APIs to establish the connections and run the queries.
The --jobs option is used to determine the number of slots to use.
To use this new task framework, callers simply need to provide the
query and a callback function to process its results, and the
framework takes care of the rest.  A more complete description is
provided at the top of the new task.c file.
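
For example, an invocation like the following lets the converted tasks
process up to eight databases at once (an illustrative command; paths
are placeholders):

    pg_upgrade -b /old/bin -B /new/bin -d /old/data -D /new/data --jobs 8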

None of the eligible once-in-each-database tasks are converted to
use this new framework in this commit.  That will be done via
several follow-up commits.

Reviewed-by: Jeff Davis, Robert Haas, Daniel Gustafsson, Ilya Gladyshev, Corey Huinker
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13
2024-09-16 16:10:33 -05:00
Bruce Momjian d891c49286 doc PG relnotes: fix SGML markup for new commit links
Backpatch-through: 12
2024-09-16 14:23:39 -04:00
Bruce Momjian 2572104552 scripts: add Perl script to add links to release notes
Reported-by: jian he

Discussion: https://postgr.es/m/ZuYsS5XdA7hVcV9l@momjian.us

Backpatch-through: 12
2024-09-16 13:26:37 -04:00
Bruce Momjian 4632e5cf4b Perl scripts: revert 43ce181059
Small improvement not worth the code churn.

Reported-by: Andrew Dunstan

Discussion: https://postgr.es/m/42f2242a-422b-4aa3-8d60-d67b229c4a52@dunslane.net

Backpatch-through: master
2024-09-15 21:25:24 -04:00
Tom Lane d5622acb32 Replace usages of xmlXPathCompile() with xmlXPathCtxtCompile().
In existing releases of libxml2, xmlXPathCompile can be driven
to stack overflow because it fails to protect itself against
too-deeply-nested input.  While there is an upstream fix as of
yesterday, it will take years for that to propagate into all
shipping versions.  In the meantime, we can protect our own
usages basically for free by calling xmlXPathCtxtCompile instead.

(The actual bug is that libxml2 keeps its nesting counter in the
xmlXPathContext, and its parsing code was willing to just skip
counting nesting levels if it didn't have a context.  So if we supply
a context, all is well.  It seems odd actually that it works at all
to not supply a context, because this means that XPath parsing does
not have access to XML namespace info.  Apparently libxml2 never
checks namespaces until runtime?  Anyway, this seems like good
future-proofing even if its only immediate effect is to dodge a bug.)

Sadly, this hack only offers protection with libxml2 2.9.11 and newer.
Before that there are multiple similar problems, so if you are
processing untrusted XML it behooves you to get a newer version.
But we have some pretty old libxml2 in the buildfarm, so it seems
impractical to add a regression test to verify this fix.

Per bug #18617 from Jingzhou Fu.  Back-patch to all supported
versions.

Discussion: https://postgr.es/m/18617-1cee4d2ed1f4e7ae@postgresql.org
Discussion: https://gitlab.gnome.org/GNOME/libxml2/-/issues/799
2024-09-15 13:33:09 -04:00
Bruce Momjian 43ce181059 Perl scripts: eliminate "Useless interpolation" warnings
Eliminate warnings of Perl Critic from src/tools.

Backpatch-through: master
2024-09-15 10:55:37 -04:00
Tom Lane b8ea0f675f Run regression tests with timezone America/Los_Angeles.
Historically we've used timezone "PST8PDT", but the recent release
2024b of tzdb changes the definition of that zone in a way that
breaks many test cases concerned with dates before 1970.  Although
we've not yet adopted 2024b into our own tree, this is already
problematic for people using --with-system-tzdata if their platform
has already adopted 2024b.  To work with both older and newer
versions of tzdb, switch to using "America/Los_Angeles", accepting
the ensuing changes in regression test results.

Back-patch to all supported branches.

Per report and patch from Wolfgang Walther.

Discussion: https://postgr.es/m/0a997455-5aba-4cf2-a354-d26d8bcbfae6@technowledgy.de
2024-09-14 17:55:02 -04:00
Alvaro Herrera f64074c88c Add commit 7229ebe011 to .git-blame-ignore-revs. 2024-09-14 20:17:30 +02:00
Tom Lane 94537982ec Remove obsolete comment in pg_stat_statements.
Commit 76db9cb63 removed the use of multiple nesting counters,
but missed one comment describing that arrangement.

Back-patch to v17 where 76db9cb63 came in, just to avoid confusion.

Julien Rouhaud

Discussion: https://postgr.es/m/gfcwh3zjxc2vygltapgo7g6yacdor5s4ynr234b6v2ohhuvt7m@gr42joxalenw
2024-09-14 11:42:31 -04:00
Andrew Dunstan 76f2a0e547 Improve meson's detection of perl build flags
The current method of detecting perl build flags breaks if the path to
perl contains a space. This change makes two improvements. First,
instead of getting a list of ldflags and ccdlflags and then trying to
filter those out of the reported ldopts, we tell perl to suppress
reporting those in the first instance. Second, it tells perl to parse
those and output them, one per line.  Thus any space in an option, for
example in a file name, is preserved.

Issue reported off-list by Muralikrishna Bandaru

Discussion: https://postgr.es/m/01117f88-f465-bf6c-9362-083bd72ca305@dunslane.net

Backpatch to release 16.
2024-09-14 10:26:25 -04:00
Andrew Dunstan bc46104fc9 Only define NO_THREAD_SAFE_LOCALE for MSVC plperl when required
Latest versions of Strawberry Perl define USE_THREAD_SAFE_LOCALE, and we
therefore get a handshake error when building against such instances.
The solution is to perform a test to see if USE_THREAD_SAFE_LOCALE is
defined and only define NO_THREAD_SAFE_LOCALE if it isn't.

Backpatch the meson.build fix back to release 16 and apply the same
logic to Mkvcbuild.pm in releases 12 through 16.

Original report of the issue from Muralikrishna Bandaru.
2024-09-14 08:47:06 -04:00
Tom Lane fae55f0bb3 Allow _h_indexbuild() to be interrupted.
When we are building a hash index that is large enough to need
pre-sorting (larger than either maintenance_work_mem or NBuffers),
the initial sorting phase was interruptible, but the insertion
phase wasn't.  Add the missing CHECK_FOR_INTERRUPTS().

Per bug #18616 from Alexander Lakhin.  Back-patch to all
supported branches.

Pavel Borisov

Discussion: https://postgr.es/m/18616-acbb9e5caf41e964@postgresql.org
2024-09-13 16:17:04 -04:00
Nathan Bossart 9a23967063 Add commit 2b03cfeea4 to .git-blame-ignore-revs. 2024-09-13 13:06:06 -05:00
Nathan Bossart 70d1c664f4 Fix contrib/pageinspect's test for sequences.
I managed to break this test in two different ways in commit
05036a3155.

First, the output of the new call to tuple_data_split() on the test
sequence is dependent on endianness.  This is fixed by setting a
special start value for the test sequence that produces the same
output regardless of the endianness of the machine.

Second, on versions older than v15, the new test case fails under
"force_parallel_mode = regress" with the following error:

	ERROR:  cannot access temporary tables during a parallel operation

This is because pageinspect's disk-accessing functions are
incorrectly marked PARALLEL SAFE on versions older than v15 (see
commit aeaaf520f4 for details).  This one is fixed by changing the
test sequence to be permanent.  The only reason it was previously
marked temporary was to avoid needing a DROP SEQUENCE command at
the end of the test.  Unlike some other tests in this file, the use
of a permanent sequence here shouldn't result in any test
instability like what was fixed by commit e2933a6e11.

Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/ZuOKOut5hhDlf_bP%40nathan
Backpatch-through: 12
2024-09-13 10:16:40 -05:00