Commit Graph

59455 Commits

Author SHA1 Message Date
Fujii Masao 412229d197 Deduplicate code in LargeObjectExists and myLargeObjectExists.
myLargeObjectExists() and LargeObjectExists() had nearly identical code,
except for handling snapshots. This commit renames myLargeObjectExists()
to LargeObjectExistsWithSnapshot() and refactors LargeObjectExists()
to call it internally, reducing duplication.

Author: Yugo Nagata
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/20240702163444.ab586f6075e502eb84f11b1a@sranhm.sraoss.co.jp
2024-09-12 21:45:42 +09:00
Peter Eisentraut 23d0b48468 Remove hardcoded hash opclass function signature exceptions
hashvalidate(), which validates the signatures of support functions
for the hash AM, contained several hardcoded exceptions.  For example,
hash/date_ops support function 1 was hashint4(), which would
ordinarily fail validation because the function argument is int4, not
date.  But this works internally because int4 and date are of the same
size.  There are several more exceptions like this that happen to work
and were allowed historically but would now fail the function
signature validation.

This patch removes those exceptions by providing new support functions
that have the proper declared signatures.  They internally share most
of the code with the "wrong" functions they replace, so the behavior
is still the same.

With the exceptions gone, hashvalidate() is now simplified and relies
fully on check_amproc_signature().

hashvarlena() and hashvarlenaextended() are kept in pg_proc.dat
because some extensions currently use them to build hash functions for
their own types, and we need to keep exposing these functions as
"LANGUAGE internal" functions for that to continue to work.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/29c3b746-69e7-482a-b37c-dbbf7e5b009b@eisentraut.org
2024-09-12 12:57:43 +02:00
David Rowley 5bb9ba2739 Doc: alphabetize aggregate function table
A few recent JSON aggregates have been added without much consideration
to the existing order.  Put these back in alphabetical order (with the
exception of the JSONB variant of each JSON aggregate).

Author: Wolfgang Walther <walther@technowledgy.de>
Reviewed-by: Marlene Reiterer <marlene.reiterer.03@gmail.com>
Discussion: https://postgr.es/m/6a7b910c-3feb-4006-b817-9b4759cb6bb6%40technowledgy.de
Backpatch-through: 16, where these aggregates were added
2024-09-12 22:36:39 +12:00
Fujii Masao fefa76f70f Remove old RULE privilege completely.
The RULE privilege for tables was removed in v8.2, but for backward
compatibility, GRANT/REVOKE and privilege functions like
has_table_privilege continued to accept the RULE keyword without
any effect.

After discussions on pgsql-hackers, it was agreed that this compatibility
is no longer needed. Since it's been long enough since the deprecation,
we've decided to fully remove support for RULE privilege,
so GRANT/REVOKE and privilege functions will no longer accept it.

Author: Fujii Masao
Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/976a3581-6939-457f-b947-fc3dc836c083@oss.nttdata.com
2024-09-12 19:33:44 +09:00
Peter Eisentraut 811af9786b Don't overwrite scan key in systable_beginscan()
When systable_beginscan() and systable_beginscan_ordered() choose an
index scan, they remap the attribute numbers in the passed-in scan
keys to the attribute numbers of the index, and then write those
remapped attribute numbers back into the scan key passed by the
caller.  This second part is surprising and gratuitous.  It means that
a scan key cannot safely be used more than once (but it might
sometimes work, depending on circumstances).  Also, there is no value
in providing these remapped attribute numbers back to the caller,
since they can't do anything with that.

Fix that by making a copy of the scan keys passed by the caller and
make the modifications there.

Also, some code that had to work around the previous situation is
simplified.

Discussion: https://www.postgresql.org/message-id/flat/f8c739d9-f48d-4187-b214-df3391ba41ab@eisentraut.org
2024-09-12 10:48:39 +02:00
Michael Paquier 00c76cf21c Move logic related to WAL replay of Heap/Heap2 into its own file
This brings more clarity to heapam.c, by cleanly separating all the
logic related to WAL replay and the rest of Heap and Heap2, similarly
to other RMGRs like hash, btree, etc.

The header reorganization is also nice in heapam.c, cutting half of the
headers required.

Author: Li Yong
Reviewed-by: Sutou Kouhei, Michael Paquier
Discussion: https://postgr.es/m/EFE55E65-D7BD-4C6A-B630-91F43FD0771B@ebay.com
2024-09-12 13:32:05 +09:00
David Rowley 9fba1ed294 Adjust tuplestore stats API
1eff8279d added an API to tuplestore.c to allow callers to obtain
storage telemetry data.  That API wasn't quite good enough for callers
that perform tuplestore_clear() as the telemetry functions only
accounted for the current state of the tuplestore, not the maximums
before tuplestore_clear() was called.

There's a pending patch that would like to add tuplestore telemetry
output to EXPLAIN ANALYZE for WindowAgg.  That node type uses
tuplestore_clear() before moving to the next window partition and we
want to show the maximum space used, not the space used for the final
partition.

Reviewed-by: Tatsuo Ishii, Ashutosh Bapat
Discussion: https://postgres/m/CAApHDvoY8cibGcicLV0fNh=9JVx9PANcWvhkdjBnDCc9Quqytg@mail.gmail.com
2024-09-12 16:02:01 +12:00
Amit Langote e6c45d85dc SQL/JSON: Fix JSON_QUERY(... WITH CONDITIONAL WRAPPER)
Currently, when WITH CONDITIONAL WRAPPER is specified, array wrappers
are applied even to a single SQL/JSON item if it is a scalar JSON
value, but this behavior does not comply with the standard.

To fix, apply wrappers only when there are multiple SQL/JSON items
in the result.

Reported-by: Peter Eisentraut <peter@eisentraut.org>
Author: Peter Eisentraut <peter@eisentraut.org>
Author: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/8022e067-818b-45d3-8fab-6e0d94d03626%40eisentraut.org
Backpatch-through: 17
2024-09-12 09:39:56 +09:00
Tom Lane 77761ee5dd Remove incorrect Assert.
check_agglevels_and_constraints() asserted that if we find an
aggregate function in an EXPR_KIND_FROM_SUBSELECT expression, the
expression must be in a LATERAL subquery.  Alexander Lakhin found a
case where that's not so: because of the odd scoping rules for NEW/OLD
within a rule, a reference to NEW/OLD could cause an aggregate to be
considered top-level even though it's in an unmarked sub-select.
The error message that would be thrown seems sufficiently on-point,
so just remove the Assert.  (Hence, this is not a bug for production
builds.)

This Assert was added by me in commit eaccfded9 (9.3 era).  It looks
like I put it in to cross-check that the new logic for detecting
misplaced aggregates (using agglevelsup) caught the same cases that a
previous check on p_lateral_active did.  So there might have been some
related misbehavior before eaccfded9 ... but that's very ancient
history by now, so I didn't dig any deeper.

Per bug #18608 from Alexander Lakhin.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/18608-48de0717508ee429@postgresql.org
2024-09-11 11:41:47 -04:00
Magnus Hagander 280423300b pg_createsubscriber: minor documentation fixes 2024-09-11 16:30:17 +02:00
Peter Eisentraut 8b5c6a54c4 Replace gratuitous memmove() with memcpy()
The index access methods all had similar code that copied the
passed-in scan keys to local storage.  They all used memmove() for
that, which is not wrong, but it seems confusing not to use memcpy()
when that would work.  Presumably, this was all once copied from
ancient code and never adjusted.

Discussion: https://www.postgresql.org/message-id/flat/f8c739d9-f48d-4187-b214-df3391ba41ab@eisentraut.org
2024-09-11 15:21:36 +02:00
Tomas Vondra 842265631d Fix unique key checks in JSON object constructors
When building a JSON object, the code builds a hash table of keys, to
allow checking if the keys are unique. The uniqueness check and adding
the new key happens in json_unique_check_key(), but this assumes the
pointer to the key remains valid.

Unfortunately, two places passed pointers to keys in a buffer, while
also appending more data (additional key/value pairs) to the buffer.
With enough data the buffer is resized by enlargeStringInfo(), which
calls repalloc(), invalidating the earlier key pointers.

Due to this the uniqueness check may fail with both false negatives and
false positives, producing JSON objects with duplicate keys or failing
to produce a perfectly valid JSON object.

This affects multiple functions that enforce uniqueness of keys, all
introduced in PG16 with the new SQL/JSON:

- json_object_agg_unique / jsonb_object_agg_unique
- json_object / jsonb_objectagg

Existing regression tests did not detect the issue, simply because the
initial buffer size is 1024 and the objects were small enough not to
require the repalloc.

With a sufficiently large object, AddressSanitizer reported the access
to invalid memory immediately. So would valgrind, of course.

Fixed by copying the key into the hash table memory context, and adding
regression tests with enough data to repalloc the buffer. Backpatch to
16, where the functions were introduced.

Reported by Alexander Lakhin. Investigation and initial fix by Junwang
Zhao, with various improvements and tests by me.

Reported-by: Alexander Lakhin
Author: Junwang Zhao, Tomas Vondra
Backpatch-through: 16
Discussion: https://postgr.es/m/18598-3279ed972a2347c7@postgresql.org
Discussion: https://postgr.es/m/CAEG8a3JjH0ReJF2_O7-8LuEbO69BxPhYeXs95_x7+H9AMWF1gw@mail.gmail.com
2024-09-11 13:21:10 +02:00
Peter Eisentraut 6b25c57a2d Update .gitignore
for commit 0785d1b8b2
2024-09-11 09:26:20 +02:00
Peter Eisentraut 1fb2308e69 Remove obsolete unconstify()
This is no longer needed as of OpenSSL 1.1.0 (the current minimum
version).  LibreSSL made the same change around the same time as well.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/20463f79-a7b0-4bba-a178-d805f99c02f9%40eisentraut.org
2024-09-11 09:18:12 +02:00
Peter Eisentraut 0785d1b8b2 common/jsonapi: support libpq as a client
Based on a patch by Michael Paquier.

For libpq, use PQExpBuffer instead of StringInfo. This requires us to
track allocation failures so that we can return JSON_OUT_OF_MEMORY as
needed rather than exit()ing.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Co-authored-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/d1b467a78e0e36ed85a09adf979d04cf124a9d4b.camel@vmware.com
2024-09-11 09:01:07 +02:00
Amit Kapila 3beb945da9 Improve assertion in FindReplTupleInLocalRel().
The first part of the assertion verifying that the passed index must be PK
or RI was incorrectly passing index relation instead of heap relation in
GetRelationIdentityOrPK(). The assertion was not failing because the
second part of the assertion which needs to be performed only when remote
relation has REPLICA_IDENTITY_FULL set was also incorrect.

The change is not backpatched because the current coding doesn't lead to
any failure.

Reported-by: Dilip Kumar
Author: Amit Kapila
Reviewed-by: Vignesh C
Discussion: https://postgr.es/m/CAFiTN-tmguaT1DXbCC+ZomZg-oZLmU6BPhr0po7akQSG6vNJrg@mail.gmail.com
2024-09-11 09:18:23 +05:30
Noah Misch 65c310b310 Optimize pg_visibility with read streams.
We've measured 5% performance improvement, and this arranges to benefit
automatically from future optimizations to the read_stream subsystem.
The area lacked test coverage, so close that gap.

Nazir Bilal Yavuz

Discussion: https://postgr.es/m/CAN55FZ1_Ru3XpMgTwsU67FTH2fs_FrRROmb7x6zs+F44QBEiww@mail.gmail.com
Discussion: https://postgr.es/m/CAEudQAozv3wTY5TV2t29JcwPydbmKbiWQkZD42S2OgzdixPMDQ@mail.gmail.com
2024-09-10 15:21:33 -07:00
Tom Lane 52c707483c Use a hash table to de-duplicate column names in ruleutils.c.
Commit 8004953b5 added a hash table to avoid O(N^2) cost in choosing
unique relation aliases while deparsing a view or rule.  It did
nothing about the similar O(N^2) (maybe worse) costs of choosing
unique column aliases within each RTE.  However, that's now
demonstrably a bottleneck when deparsing CHECK constraints for wide
tables, so let's use a similar hash table to handle those.

The extra cost of setting up the hash table will not be repaid unless
the table has many columns.  I've set this up so that we use the brute
force method if there are less than 32 columns.  The exact cutoff is
not too critical, but this value seems good because it results in both
code paths getting exercised by existing regression-test cases.

Patch by me; thanks to David Rowley for review.

Discussion: https://postgr.es/m/2885468.1722291250@sss.pgh.pa.us
2024-09-10 16:49:09 -04:00
Tom Lane bccca780ee Fix some whitespace issues in XMLSERIALIZE(... INDENT).
We must drop whitespace while parsing the input, else libxml2
will include "blank" nodes that interfere with the desired
indentation behavior.  The end result is that we didn't indent
nodes separated by whitespace.

Also, it seems that libxml2 may add a trailing newline when working
in DOCUMENT mode.  This is semantically insignificant, so strip it.

This is in the gray area between being a bug fix and a definition
change.  However, the INDENT option is still pretty new (since v16),
so I think we can get away with changing this in stable branches.
Hence, back-patch to v16.

Jim Jones

Discussion: https://postgr.es/m/872865a8-548b-48e1-bfcd-4e38e672c1e4@uni-muenster.de
2024-09-10 16:20:31 -04:00
Tom Lane ed055d249d Improve documentation and testing of jsonpath string() for datetimes.
Point out that the output format depends on DateStyle, and test that,
along with testing some cases previously not covered.

In passing, adjust the horology test to verify that the prevailing
DateStyle is 'Postgres, MDY', much as it has long verified the
prevailing TimeZone.  We expect pg_regress to have set these up,
and there are multiple regression tests relying on these settings.

Also make the formatting of entries in table 9.50 more consistent.

David Wheeler (marginal additional hacking by me); review by jian he

Discussion: https://postgr.es/m/56955B33-6959-4FDA-A459-F00363ECDFEE@justatheory.com
2024-09-10 14:48:13 -04:00
Tomas Vondra 861086493f Add PG_TEST_PG_COMBINEBACKUP_MODE to CI tasks
The environment variable PG_TEST_PG_COMBINEBACKUP_MODE has been
available since 35a7b288b9, but was not set by any built-in CI tasks.
This commit modifies two of the CI tasks to use the alternative modes,
to exercise the pg_combinebackup code.

The Linux task uses --copy-file-range, macOS uses --clone.

This is not an exhaustive test of combinations. The supported modes
depend on the operating system and filesystem, and it would be nice to
test all supported combinations. Right now we have just one task for
each OS, and it doesn't seem worth adding more just for this.

Reported-by: Peter Eisentraut
Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/48da4a1f-ccd9-4988-9622-24f37b1de2b4%40eisentraut.org
2024-09-10 16:30:38 +02:00
Daniel Gustafsson 390b3cbbb2 Protect against small overread in SASLprep validation
In case of torn UTF8 in the input data we might end up going
past the end of the string since we don't account for length.
While validation won't be performed on a sequence with a NULL
byte it's better to avoid going past the end to beging with.
Fix by taking the length into consideration.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAOYmi+mTnmM172g=_+Yvc47hzzeAsYPy2C4UBY3HK9p-AXNV0g@mail.gmail.com
2024-09-10 11:02:28 +02:00
Peter Eisentraut 56fead44dc Add amgettreeheight index AM API routine
The only current implementation is for btree where it calls
_bt_getrootheight().  Other index types can now also use this to pass
information to their amcostestimate routine.  Previously, btree was
hardcoded and other index types could not hook into the optimizer at
this point.

Author: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2024-09-10 10:03:23 +02:00
Richard Guo f5050f795a Mark expressions nullable by grouping sets
When generating window_pathkeys, distinct_pathkeys, or sort_pathkeys,
we failed to realize that the grouping/ordering expressions might be
nullable by grouping sets.  As a result, we may incorrectly deem that
the PathKeys are redundant by EquivalenceClass processing and thus
remove them from the pathkeys list.  That would lead to wrong results
in some cases.

To fix this issue, we mark the grouping expressions nullable by
grouping sets if that is the case.  If the grouping expression is a
Var or PlaceHolderVar or constructed from those, we can just add the
RT index of the RTE_GROUP RTE to the existing nullingrels field(s);
otherwise we have to add a PlaceHolderVar to carry on the nullingrel
bit.

However, we have to manually remove this nullingrel bit from
expressions in various cases where these expressions are logically
below the grouping step, such as when we generate groupClause pathkeys
for grouping sets, or when we generate PathTarget for initial input to
grouping nodes.

Furthermore, in set_upper_references, the targetlist and quals of an
Agg node should have nullingrels that include the effects of the
grouping step, ie they will have nullingrels equal to the input
Vars/PHVs' nullingrels plus the nullingrel bit that references the
grouping RTE.  In order to perform exact nullingrels matches, we also
need to manually remove this nullingrel bit.

Bump catversion because this changes the querytree produced by the
parser.

Thanks to Tom Lane for the idea to invent a new kind of RTE.

Per reports from Geoff Winkless, Tobias Wendorff, Richard Guo from
various threads.

Author: Richard Guo
Reviewed-by: Ashutosh Bapat, Sutou Kouhei
Discussion: https://postgr.es/m/CAMbWs4_dp7e7oTwaiZeBX8+P1rXw4ThkZxh1QG81rhu9Z47VsQ@mail.gmail.com
2024-09-10 12:36:48 +09:00
Richard Guo 247dea89f7 Introduce an RTE for the grouping step
If there are subqueries in the grouping expressions, each of these
subqueries in the targetlist and HAVING clause is expanded into
distinct SubPlan nodes.  As a result, only one of these SubPlan nodes
would be converted to reference to the grouping key column output by
the Agg node; others would have to get evaluated afresh.  This is not
efficient, and with grouping sets this can cause wrong results issues
in cases where they should go to NULL because they are from the wrong
grouping set.  Furthermore, during re-evaluation, these SubPlan nodes
might use nulled column values from grouping sets, which is not
correct.

This issue is not limited to subqueries.  For other types of
expressions that are part of grouping items, if they are transformed
into another form during preprocessing, they may fail to match lower
target items.  This can also lead to wrong results with grouping sets.

To fix this issue, we introduce a new kind of RTE representing the
output of the grouping step, with columns that are the Vars or
expressions being grouped on.  In the parser, we replace the grouping
expressions in the targetlist and HAVING clause with Vars referencing
this new RTE, so that the output of the parser directly expresses the
semantic requirement that the grouping expressions be gotten from the
grouping output rather than computed some other way.  In the planner,
we first preprocess all the columns of this new RTE and then replace
any Vars in the targetlist and HAVING clause that reference this new
RTE with the underlying grouping expressions, so that we will have
only one instance of a SubPlan node for each subquery contained in the
grouping expressions.

Bump catversion because this changes the querytree produced by the
parser.

Thanks to Tom Lane for the idea to invent a new kind of RTE.

Per reports from Geoff Winkless, Tobias Wendorff, Richard Guo from
various threads.

Author: Richard Guo
Reviewed-by: Ashutosh Bapat, Sutou Kouhei
Discussion: https://postgr.es/m/CAMbWs4_dp7e7oTwaiZeBX8+P1rXw4ThkZxh1QG81rhu9Z47VsQ@mail.gmail.com
2024-09-10 12:35:34 +09:00
Michael Paquier fba49d5293 Remove emode argument from XLogFileRead() and XLogFileReadAnyTLI()
This change makes the code slightly easier to reason about, because
there is actually no need to know if a specific caller of one of these
routines should fail hard on a PANIC, or just let it go through with a
DEBUG2.

The only caller of XLogFileReadAnyTLI() used DEBUG2, and XLogFileRead()
has never used its emode.  This can be simplified since 1bb2558046
that has introduced XLogFileReadAnyTLI(), splitting both.

Author: Yugo Nagata
Discussion: https://postgr.es/m/20240906201043.a640f3b44e755d4db2b6943e@sraoss.co.jp
2024-09-10 08:44:31 +09:00
Masahiko Sawada bb77752342 Add WAL usage reporting to ANALYZE VERBOSE output.
This change adds WAL usage reporting to the output of ANALYZE VERBOSE
and autoanalyze reports. It aligns the analyze output with VACUUM,
providing consistency. Additionally, it aids in troubleshooting cases
where WAL records are generated during analyze operations.

Author: Anthonin Bonnefoy
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/CAO6_Xqr__kTTCLkftqS0qSCm-J7_xbRG3Ge2rWhucxQJMJhcRA%40mail.gmail.com
2024-09-09 14:56:08 -07:00
Tom Lane de239d01e7 Consistently use PageGetExactFreeSpace() in pgstattuple.
Previously this code used PageGetHeapFreeSpace on heap pages,
and usually used PageGetFreeSpace on index pages (though for some
reason GetHashPageStats used PageGetExactFreeSpace instead).
The difference is that those functions subtract off the size of
a line pointer, and PageGetHeapFreeSpace has some additional
rules about returning zero if adding another line pointer would
require exceeding MaxHeapTuplesPerPage.  Those things make sense
when testing to see if a new tuple can be put on the page, but
they seem pretty strange for pure statistics collection.

Additionally, statapprox_heap had a special rule about counting
a "new" page as being fully available space.  This also seems
strange, because it's not actually usable until VACUUM or some
such process initializes the page.  Moreover, it's inconsistent
with what pgstat_heap does, which is to count such a page as
having zero free space.  So make it work like pgstat_heap, which
as of this patch unconditionally calls PageGetExactFreeSpace.

This is more of a definitional change than a bug fix, so no
back-patch.  The module's documentation doesn't define exactly
what "free space" means either, so we left that as-is.

Frédéric Yhuel, reviewed by Rafia Sabih and Andreas Karlsson.

Discussion: https://postgr.es/m/3a18f843-76f6-4a84-8cca-49537fefa15d@dalibo.com
2024-09-09 14:34:10 -04:00
Tom Lane 218527d014 Don't bother checking the result of SPI_connect[_ext] anymore.
SPI_connect/SPI_connect_ext have not returned any value other than
SPI_OK_CONNECT since commit 1833f1a1c in v10; any errors are thrown
via ereport.  (The most likely failure is out-of-memory, which has
always been thrown that way, so callers had better be prepared for
such errors.)  This makes it somewhat pointless to check these
functions' result, and some callers within our code haven't been
bothering; indeed, the only usage example within spi.sgml doesn't
bother.  So it's likely that the omission has propagated into
extensions too.

Hence, let's standardize on not checking, and document the return
value as historical, while not actually changing these functions'
behavior.  (The original proposal was to change their return type
to "void", but that would needlessly break extensions that are
conforming to the old practice.)  This saves a small amount of
boilerplate code in a lot of places.

Stepan Neretin

Discussion: https://postgr.es/m/CAMaYL5Z9Uk8cD9qGz9QaZ2UBJFOu7jFx5Mwbznz-1tBbPDQZow@mail.gmail.com
2024-09-09 12:18:34 -04:00
Robert Haas cdb6b0fdb0 Add PQfullProtocolVersion() to surface the precise protocol version.
The existing function PQprotocolVersion() does not include the minor
version of the protocol.  In preparation for pending work that will
bump that number for the first time, add a new function to provide it
to clients that may care, using the (major * 10000 + minor)
convention already used by PQserverVersion().

Jacob Champion based on earlier work by Jelte Fennema-Nio

Discussion: http://postgr.es/m/CAOYmi+mM8+6Swt1k7XsLcichJv8xdhPnuNv7-02zJWsezuDL+g@mail.gmail.com
2024-09-09 11:54:55 -04:00
Michael Paquier 5bbdfa8a18 Fix waits of REINDEX CONCURRENTLY for indexes with predicates or expressions
As introduced by f9900df5f9, a REINDEX CONCURRENTLY job done for an
index with predicates or expressions would set PROC_IN_SAFE_IC in its
MyProc->statusFlags, causing it to be ignored by other concurrent
operations.

Such concurrent index rebuilds should never be ignored, as a predicate
or an expression could call a user-defined function that accesses a
different table than the table where the index is rebuilt.

A test that uses injection points is added, backpatched down to 17.
Michail has proposed a different test, but I have added something
simpler with more coverage.

Oversight in f9900df5f9.

Author: Michail Nikolaev
Discussion: https://postgr.es/m/CANtu0oj9A3kZVduFTG0vrmGnKB+DCHgEpzOp0qAyOgmks84j0w@mail.gmail.com
Backpatch-through: 14
2024-09-09 13:49:36 +09:00
Amit Langote dd8bea88ab SQL/JSON: Avoid initializing unnecessary ON ERROR / ON EMPTY steps
When the ON ERROR / ON EMPTY behavior is to return NULL, returning
NULL directly from ExecEvalJsonExprPath() suffices. Therefore, there's
no need to create separate steps to check the error/empty flag or
those to evaluate the the constant NULL expression.  This speeds up
common cases because the default ON ERROR / ON EMPTY behavior for
JSON_QUERY() and JSON_VALUE() is to return NULL.  However, these steps
are necessary if the RETURNING type is a domain, as constraints on the
domain may need to be checked.

Reported-by: Jian He <jian.universality@gmail.com>
Author: Jian He <jian.universality@gmail.com>
Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
2024-09-09 13:46:58 +09:00
Richard Guo 87b6c3c0b7 Fix order of parameters in a cost_sort call
In label_sort_with_costsize, the cost_sort function is called with the
parameters 'input_disabled_nodes' and 'input_cost' in the wrong order.
This does not cause any plan diffs in the regression tests, because
label_sort_with_costsize is only used to label the Sort node nicely
for EXPLAIN, and cost numbers are not displayed in regression tests.

Oversight in e22253467.  Fixed by passing arguments in the right
order.

Per report from Alexander Lakhin running UBSan.

Author: Alexander Lakhin
Discussion: https://postgr.es/m/a9b7231d-68bc-f117-a07c-96688f3e6aef@gmail.com
2024-09-09 12:58:31 +09:00
Michael Paquier fc415edf8c Add callbacks to control flush of fixed-numbered stats
This commit adds two callbacks in pgstats to have a better control of
the flush timing of pgstat_report_stat(), whose operation depends on the
three PGSTAT_*_INTERVAL variables:
- have_fixed_pending_cb(), to check if a stats kind has any pending
data waiting for a flush.  This is used as a fast path if there are no
pending statistics to flush, and this check is done for fixed-numbered
statistics only if there are no variable-numbered statistics to flush.
A flush will need to happen if at least one callback reports any pending
data.
- flush_fixed_cb(), to do the actual flush.

These callbacks are currently used by the SLRU, WAL and IO statistics,
generalizing the concept for all stats kinds (builtin and custom).

The SLRU and IO stats relied each on one global variable to determine
whether a flush should happen; these are now local to pgstat_slru.c and
pgstat_io.c, cleaning up a bit how the pending flush states are tracked
in pgstat.c.

pgstat_flush_io() and pgstat_flush_wal() are still required, but we do
not need to check their return result anymore.

Reviewed-by: Bertrand Drouvot, Kyotaro Horiguchi
Discussion: https://postgr.es/m/ZtaVO0N-aTwiAk3w@paquier.xyz
2024-09-09 11:12:29 +09:00
Tom Lane 2e62fa62d6 Avoid core dump after getpwuid_r failure.
Looking up a nonexistent user ID would lead to a null pointer
dereference.  That's unlikely to happen here, but perhaps
not impossible.

Thinko in commit 4d5111b3f, noticed by Coverity.
2024-09-08 19:14:40 -04:00
Michael Paquier d8df7ac5c0 Update extension lookup routines to use the syscache
The following routines are changed to use the syscache entries added for
pg_extension in 490f869d92e5:
- get_extension_oid()
- get_extension_name()
- get_extension_schema()

A catalog scan is costly and could easily lead to a noticeable
performance impact when called once or more per query, so this is going
to be helpful for developers for extension data lookups.

Author: Andrei Lepikhov
Reviewed-by: Jelte Fennema-Nio
Discussion: https://postgr.es/m/529295b2-6ba9-4dae-acd1-20a9c6fb8f9a@gmail.com
2024-09-07 20:20:46 +09:00
Jeff Davis 51edc4ca54 Remove lc_ctype_is_c().
Instead always fetch the locale and look at the ctype_is_c field.

hba.c relies on regexes working for the C locale without needing
catalog access, which worked before due to a special case for
C_COLLATION_OID in lc_ctype_is_c(). Move the special case to
pg_set_regex_collation() now that lc_ctype_is_c() is gone.

Author: Andreas Karlsson
Discussion: https://postgr.es/m/60929555-4709-40a7-b136-bcb44cff5a3c@proxel.se
2024-09-06 13:23:21 -07:00
Tom Lane 129a2f6679 Fix incorrect pg_stat_io output on 32-bit machines.
pg_stat_get_io() applied TimestampTzGetDatum twice to the
stat_reset_timestamp value.  On 64-bit builds that's harmless because
TimestampTzGetDatum is a no-op, but on 32-bit builds it results in
displaying garbage in the stats_reset column of the pg_stat_io view.

Bug dates to commit a9c70b46d which introduced pg_stat_io, so
back-patch to v16 where that came in.

Bertrand Drouvot

Discussion: https://postgr.es/m/Ztrd+XcPTz1zorkg@ip-10-97-1-34.eu-west-3.compute.internal
2024-09-06 11:57:57 -04:00
Peter Eisentraut 9e43ab3dd7 Remove useless unconstify
Digging into the history, this was not necessary even when it was
added, but might have been some time before that.  In any case, there
is no use for this now.
2024-09-06 11:25:48 +02:00
Amit Langote bbd4c058a8 SQL/JSON: Fix default ON ERROR behavior for JSON_TABLE
Use EMPTY ARRAY instead of EMPTY.

This change does not affect the runtime behavior of JSON_TABLE(),
which continues to return an empty relation ON ERROR. It only alters
whether the default ON ERROR behavior is shown in the deparsed output.

Reported-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
2024-09-06 13:25:53 +09:00
Amit Langote ee75a03f37 SQL/JSON: Fix JSON_TABLE() column deparsing
The deparsing code in get_json_expr_options() unnecessarily emitted
the default column-specific ON ERROR / EMPTY behavior when the
top-level ON ERROR behavior in JSON_TABLE was set to ERROR. Fix that
by not overriding the column-specific default, determined based on
the column's JsonExprOp in get_json_table_columns(), with
JSON_BEHAVIOR_ERROR when that is the top-level ON ERROR behavior.

Note that this only removes redundancy; the current deparsing output
is not incorrect, just redundant.

Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
2024-09-06 13:25:47 +09:00
Amit Langote 4d7e24e0f4 Revert recent SQL/JSON related commits
Reverts 68222851d5, 565caaa79a, and 3a97460970, because a few
BF animals didn't like one or all of them.
2024-09-06 12:53:01 +09:00
Amit Langote 3a97460970 SQL/JSON: Avoid initializing unnecessary ON ERROR / ON EMPTY steps
When the ON ERROR / ON EMPTY behavior is to return NULL, returning
NULL directly from ExecEvalJsonExprPath() suffices. Therefore, there's
no need to create separate steps to check the error/empty flag or
those to evaluate the the constant NULL expression.  This speeds up
common cases because the default ON ERROR / ON EMPTY behavior for
JSON_QUERY() and JSON_VALUE() is to return NULL.  However, these steps
are necessary if the RETURNING type is a domain, as constraints on the
domain may need to be checked.

Reported-by: Jian He <jian.universality@gmail.com>
Author: Jian He <jian.universality@gmail.com>
Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
2024-09-06 12:05:40 +09:00
Amit Langote 565caaa79a SQL/JSON: Fix default ON ERROR behavior for JSON_TABLE
Use EMPTY ARRAY instead of EMPTY.

This change does not affect the runtime behavior of JSON_TABLE(),
which continues to return an empty relation ON ERROR. It only alters
whether the default ON ERROR behavior is shown in the deparsed output.

Reported-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
2024-09-06 10:14:01 +09:00
Amit Langote 68222851d5 SQL/JSON: Fix JSON_TABLE() column deparsing
The deparsing code in get_json_expr_options() unnecessarily emitted
the default column-specific ON ERROR / EMPTY behavior when the
top-level ON ERROR behavior in JSON_TABLE was set to ERROR. Fix that
by not overriding the column-specific default, determined based on
the column's JsonExprOp in get_json_table_columns(), with
JSON_BEHAVIOR_ERROR when that is the top-level ON ERROR behavior.

Note that this only removes redundancy; the current deparsing output
is not incorrect, just redundant.

Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
2024-09-06 10:13:57 +09:00
Amit Langote 3422f5f93f Update comment about ExprState.escontext
The updated comment provides more helpful guidance by mentioning that
escontext should be set when soft error handling is needed.

Reported-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
2024-09-06 10:13:53 +09:00
Jeff Davis 7829f85a62 Be more careful with error paths in pg_set_regex_collation().
Set global variables after error paths so that they don't end up in an
inconsistent state.

The inconsistent state doesn't lead to an actual problem, because
after an error, pg_set_regex_collation() will be called again before
the globals are accessed.

Change extracted from patch by Andreas Karlsson, though not discussed
explicitly.

Discussion: https://postgr.es/m/60929555-4709-40a7-b136-bcb44cff5a3c@proxel.se
2024-09-05 12:10:08 -07:00
Tom Lane fadff3fc94 Prevent mis-encoding of "trailing junk after numeric literal" errors.
Since commit 2549f0661, we reject an identifier immediately following
a numeric literal (without separating whitespace), because that risks
ambiguity with hex/octal/binary integers.  However, that patch used
token patterns like "{integer}{ident_start}", which is problematic
because {ident_start} matches only a single byte.  If the first
character after the integer is a multibyte character, this ends up
with flex reporting an error message that includes a partial multibyte
character.  That can cause assorted bad-encoding problems downstream,
both in the report to the client and in the postmaster log file.

To fix, use {identifier} not {ident_start} in the "junk" token
patterns, so that they will match complete multibyte characters.
This seems generally better user experience quite aside from the
encoding problem: for "123abc" the error message will now say that
the error appeared at or near "123abc" instead of "123a".

While at it, add some commentary about why these patterns exist
and how they work.

Report and patch by Karina Litskevich; review by Pavel Borisov.
Back-patch to v15 where the problem came in.

Discussion: https://postgr.es/m/CACiT8iZ_diop=0zJ7zuY3BXegJpkKK1Av-PU7xh0EDYHsa5+=g@mail.gmail.com
2024-09-05 12:42:33 -04:00
Daniel Gustafsson 85837b8037 Fix handling of NULL return value in typarray lookup
Commit 6ebeeae29 accidentally omitted testing the return value from
findTypeByOid which can return NULL.  Fix by adding a check to make
sure that we have a pointer to dereference.

Author: Ranier Vilela <ranier.vf@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAEudQAqfMTH8Ya_J6E-NW_y_JyDFDxtQ4V_g6nY_1=0oDbQqdg@mail.gmail.com
2024-09-05 15:32:22 +02:00
Peter Eisentraut 4af123ad45 Fix misleading error message context
Author: Pavel Stehule <pavel.stehule@gmail.com>
Reviewed-by: Stepan Neretin <sncfmgg@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAFj8pRAw+OkVW=FgMKHKyvY3CgtWy3cWdY7XT+S5TJaTttu=oA@mail.gmail.com
2024-09-05 15:19:00 +02:00