Commit Graph

59205 Commits

Author SHA1 Message Date
Heikki Linnakangas 3ab2668d48 Use psprintf to simplify gtsvectorout()
The buffer allocation was correct, but looked archaic and scary:

- It was weird to calculate the buffer size before determining which
  format string was used. With the same effort, we could've used the
  right-sized buffer for each branch.

- Commit aa0d350456 added one more possible return string ("all true
  bits"), but didn't adjust the code at the top of the function to
  calculate the returned string's max size. It was not a live bug,
  because the new string was smaller than the existing ones, but
  seemed wrong in principle.

- Use of sprintf() is generally eyebrow-raising these days

Switch to psprintf(). psprintf() allocates a larger buffer than what
was allocated before, 128 bytes vs 80 bytes, which is acceptable as
this code is not performance or space critical.

Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/54c29fb0-edf2-48ea-9814-44e918bbd6e8@iki.fi
2024-08-06 23:05:25 +03:00
Heikki Linnakangas d5f139cb68 Constify fields and parameters in spell.c
I started by marking VoidString as const, and fixing the fallout by
marking more fields and function arguments as const. It proliferated
quite a lot, but all within spell.c and spell.h.

A more narrow patch to get rid of the static VoidString buffer would
be to replace it with '#define VoidString ""', as C99 allows assigning
"" to a non-const pointer, even though you're not allowed to modify
it. But it seems like good hygiene to mark all these as const. In the
structs, the pointers can point to the constant VoidString, or a
buffer allocated with palloc(), or with compact_palloc(), so you
should not modify them.

Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/54c29fb0-edf2-48ea-9814-44e918bbd6e8@iki.fi
2024-08-06 23:04:51 +03:00
Heikki Linnakangas fe8dd65bf2 Mark misc static global variables as const
Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/54c29fb0-edf2-48ea-9814-44e918bbd6e8@iki.fi
2024-08-06 23:04:48 +03:00
Heikki Linnakangas 85829c973c Make nullSemAction const, add 'const' decorators to related functions
To make it more clear that these should never be modified.

Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/54c29fb0-edf2-48ea-9814-44e918bbd6e8@iki.fi
2024-08-06 23:04:22 +03:00
Heikki Linnakangas 1e35951e71 Turn a few 'validnsps' static variables into locals
There was no need for these to be static buffers, local variables work
just as well. I think they were marked as 'static' to imply that they
are read-only, but 'const' is more appropriate for that, so change
them to const.

To make it possible to mark the variables as 'const', also add 'const'
decorations to the transformRelOptions() signature.

Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/54c29fb0-edf2-48ea-9814-44e918bbd6e8@iki.fi
2024-08-06 23:03:43 +03:00
Jeff Davis a890ad2149 selfuncs.c: use pg_strxfrm() instead of strxfrm().
pg_strxfrm() takes a pg_locale_t, so it works properly with all
providers. This improves estimates for ICU when performing linear
interpolation within a histogram bin.

Previously, convert_string_datum() always used strxfrm() and relied on
setlocale(). That did not produce good estimates for non-default or
non-libc collations.

Discussion: https://postgr.es/m/89475ee5487d795124f4e25118ea8f1853edb8cb.camel@j-davis.com
2024-08-06 12:25:12 -07:00
Heikki Linnakangas a54d4ed183 Fix datatypes in comments in instr_time.h
The INSTR_TIME_GET_NANOSEC(t) and INSTR_TIME_GET_MICROSEC(t) macros
return a signed int64.

Discussion: https://www.postgresql.org/message-id/ZrHkv3MAQfwNSmTG@ip-10-97-1-34.eu-west-3.compute.internal
2024-08-06 22:15:55 +03:00
Heikki Linnakangas 39a138fbef Revert "Fix comments in instr_time.h and remove an unneeded cast to int64"
This reverts commit 3dcb09de7b. Tom Lane pointed out that it broke the
abstraction provided by the macros. The callers should not need to
know what the internal type is.

This commit is an exact revert, the next commit will fix the comments
on the macros that incorrectly claim that they return uint64.

Discussion: https://www.postgresql.org/message-id/ZrHkv3MAQfwNSmTG@ip-10-97-1-34.eu-west-3.compute.internal
2024-08-06 22:15:46 +03:00
Tom Lane 6e086fa2e7 Allow parallel workers to cope with a newly-created session user ID.
Parallel workers failed after a sequence like
BEGIN;
CREATE USER foo;
SET SESSION AUTHORIZATION foo;
because check_session_authorization could not see the uncommitted
pg_authid row for "foo".  This is because we ran RestoreGUCState()
in a separate transaction using an ordinary just-created snapshot.
The same disease afflicts any other GUC that requires catalog lookups
and isn't forgiving about the lookups failing.

To fix, postpone RestoreGUCState() into the worker's main transaction
after we've set up a snapshot duplicating the leader's.  This affects
check_transaction_isolation and check_transaction_deferrable, which
think they should only run during transaction start.  Make them
act like check_transaction_read_only, which already knows it should
silently accept the value when InitializingParallelWorker.

This un-reverts commit f5f30c22e.  The original plan was to back-patch
that, but the fact that 0ae5b763e proved to be a pre-requisite shows
that the subtle API change for GUC hooks might actually break some of
them.  The problem we're trying to fix seems not worth taking such a
risk for in stable branches.

Per bug #18545 from Andrey Rachitskiy.

Discussion: https://postgr.es/m/18545-feba138862f19aaa@postgresql.org
2024-08-06 12:36:42 -04:00
Tom Lane 0ae5b763ea Clean up handling of client_encoding GUC in parallel workers.
The previous coding here threw an error from assign_client_encoding
if it was invoked in a parallel worker.  That's a very fundamental
violation of the GUC hook API: assign hooks must not throw errors.
The place to complain is in the check hook, so move the test to
there, and use the regular check-hook API (ie return false) to
report it.

The reason this coding is a problem is that it breaks GUC rollback,
which may occur after we leave InitializingParallelWorker state.
That case seems not actually reachable before now, but commit
f5f30c22e made it reachable, so we need to fix this before that
can be un-reverted.

In passing, improve the commentary in ParallelWorkerMain, and
add a check for failure of SetClientEncoding.  That's another
case that can't happen now but might become possible after
foreseeable code rearrangements (notably, if the shortcut of
skipping PrepareClientEncoding stops being OK).

Discussion: https://postgr.es/m/18545-feba138862f19aaa@postgresql.org
2024-08-06 12:21:53 -04:00
Nathan Bossart 8928817769 Remove volatile qualifiers from pg_stat_statements.c.
Prior to commit 0709b7ee72, which changed the spinlock primitives
to function as compiler barriers, access to variables within a
spinlock-protected section required using a volatile pointer, but
that is no longer necessary.

Reviewed-by: Bertrand Drouvot, Michael Paquier
Discussion: https://postgr.es/m/Zqkv9iK7MkNS0KaN%40nathan
2024-08-06 10:56:37 -05:00
Heikki Linnakangas 3dcb09de7b Fix comments in instr_time.h and remove an unneeded cast to int64
03023a2664 represented time as an int64 on all platforms but forgot to
update the comment related to INSTR_TIME_GET_MICROSEC() and provided
an incorrect comment for INSTR_TIME_GET_NANOSEC().

In passing remove an unneeded cast to int64.

Author: Bertrand Drouvot
Discussion: https://www.postgresql.org/message-id/ZrHkv3MAQfwNSmTG@ip-10-97-1-34.eu-west-3.compute.internal
2024-08-06 14:28:02 +03:00
Michael Paquier 8771298605 Remove unnecessary declaration of heapam_methods
This overlaps with the declaration at the end of heapam_handler.c that
lists all the callback routines for the heap table AM.

Author: Japin Li
Discussion: https://postgr.es/m/ME0P300MB04459456D5C4E70D48116896B6B12@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
2024-08-06 16:27:38 +09:00
Jeff Davis e9931bfb75 Remove support for null pg_locale_t most places.
Previously, passing NULL for pg_locale_t meant "use the libc provider
and the server environment". Now that the database collation is
represented as a proper pg_locale_t (not dependent on setlocale()),
remove special cases for NULL.

Leave wchar2char() and char2wchar() unchanged for now, because the
callers don't always have a libc-based pg_locale_t available.

Discussion: https://postgr.es/m/cfd9eb85-c52a-4ec9-a90e-a5e4de56e57d@eisentraut.org
Reviewed-by: Peter Eisentraut, Andreas Karlsson
2024-08-05 18:31:48 -07:00
Robert Haas f80b09bac8 Move astreamer (except astreamer_inject) to fe_utils.
This allows the code to be used by other frontend applications.

Amul Sul, reviewed by Sravan Kumar, Andres Freund (whose input
I specifically solicited regarding the meson.build changes),
and me.

Discussion: http://postgr.es/m/CAAJ_b94StvLWrc_p4q-f7n3OPfr6GhL8_XuAg2aAaYZp1tF-nw@mail.gmail.com
2024-08-05 11:41:57 -04:00
Robert Haas 53b2c921a0 Move recovery injector astreamer to a separate header file.
Unlike the rest of the astreamer (formerly bbstreamer) infrastructure
which is reusable by other tools, astreamer_inject.c seems extremely
specific to pg_basebackup. Hence, move the corresponding declarations
to a separate header file, so that we can move the rest of the code
without moving this.

Amul Sul, reviewed by Sravan Kumar and by me.

Discussion: http://postgr.es/m/CAAJ_b94StvLWrc_p4q-f7n3OPfr6GhL8_XuAg2aAaYZp1tF-nw@mail.gmail.com
2024-08-05 10:55:06 -04:00
Robert Haas 3c90569811 Rename bbstreamer to astreamer.
I (rhaas) intended "bbstreamer" to stand for "base backup streamer,"
but that implies that this infrastructure can only ever be used by
pg_basebackup.  In fact, it is a generally useful way of streaming
data from a tar or compressed tar file, and it could be extended to
work with other archive formats as well if we ever wanted to do that.
Hence, rename it to "astreamer" (archive streamer) in preparation for
reusing the infrastructure from pg_verifybackup (and perhaps
eventually also other utilities, such as pg_combinebackup or
pg_waldump).

This is purely a renaming commit. Comment adjustments and relocation
of the actual code to someplace from which it can be reused are left
to future commits.

Amul Sul, reviewed by Sravan Kumar and by me.

Discussion: http://postgr.es/m/CAAJ_b94StvLWrc_p4q-f7n3OPfr6GhL8_XuAg2aAaYZp1tF-nw@mail.gmail.com
2024-08-05 09:56:25 -04:00
Masahiko Sawada 66e94448ab Restrict accesses to non-system views and foreign tables during pg_dump.
When pg_dump retrieves the list of database objects and performs the
data dump, there was possibility that objects are replaced with others
of the same name, such as views, and access them. This vulnerability
could result in code execution with superuser privileges during the
pg_dump process.

This issue can arise when dumping data of sequences, foreign
tables (only 13 or later), or tables registered with a WHERE clause in
the extension configuration table.

To address this, pg_dump now utilizes the newly introduced
restrict_nonsystem_relation_kind GUC parameter to restrict the
accesses to non-system views and foreign tables during the dump
process. This new GUC parameter is added to back branches too, but
these changes do not require cluster recreation.

Back-patch to all supported branches.

Reviewed-by: Noah Misch
Security: CVE-2024-7348
Backpatch-through: 12
2024-08-05 06:05:33 -07:00
David Rowley ca6fde9225 Optimize JSON escaping using SIMD
Here we adjust escape_json_with_len() to make use of SIMD to allow
processing of up to 16-bytes at a time rather than processing a single
byte at a time.  This has been shown to speed up escaping of JSON
strings significantly.

Escaping is required for both JSON string properties and also the
property names themselves, so this should also help improve the speed of
the conversion from JSON into text for JSON objects that have property
names 16 or more bytes long.

Escaping JSON strings was often a significant bottleneck for longer
strings.  With these changes, some benchmarking has shown a query
performing nearly 4 times faster when escaping a JSON object with a 1MB
text property.  Tests with shorter text properties saw smaller but still
significant performance improvements.  For example, a test outputting 1024
JSON strings with a text property length ranging from 1 char to 1024 chars
became around 2 times faster.

Author: David Rowley
Reviewed-by: Melih Mutlu
Discussion: https://postgr.es/m/CAApHDvpLXwMZvbCKcdGfU9XQjGCDm7tFpRdTXuB9PVgpNUYfEQ@mail.gmail.com
2024-08-05 23:16:44 +12:00
Amit Kapila b5df24e520 Fix typo in bufpage.h.
Author: Senglee Choi
Reviewed-by: Tender Wang
Discussion: https://postgr.es/m/CACUsy79U0=S5zWEf6D57F=vB7rOEa86xFY6oovDZ58jRcROCxQ@mail.gmail.com
2024-08-05 14:38:00 +05:30
Michael Paquier f68cd847fa injection_points: Add some fixed-numbered statistics
Like 75534436a4, this acts mainly as a template to show what can be
achieved with fixed-numbered stats (like WAL, bgwriter, etc.) with the
pluggable cumulative statistics APIs introduced in 7949d95945.

Fixed-numbered stats are defined in their own file, named
injection_stats_fixed.c, separated entirely from the variable-numbered
case in injection_stats.c.  This is mainly for clarity as having both
examples in the same file would be confusing.

Note that this commit uses the helper routines added in 2eff9e678d.
The stats stored track globally the number of times injection points
have been attached, detached or run.  Two more fields should be added
later for the number of times a point has been cached or loaded, but
what's here is enough as a template.

More TAP tests are added, providing coverage for fixed-numbered custom
stats.

Author: Michael Paquier
Reviewed-by: Dmitry Dolgov, Bertrand Drouvot
Discussion: https://postgr.es/m/Zmqm9j5EO0I4W8dx@paquier.xyz
2024-08-05 12:29:22 +09:00
Michael Paquier 75534436a4 injection_points: Add some cumulative stats for injection points
This acts as a template of what can be achieved with the pluggable
cumulative stats APIs introduced in 7949d95945 for the
variable-numbered case where stats entries are stored in the pgstats
dshash, while being potentially useful on its own for injection points,
say to add starting and/or stopping conditions based on the statistics
(want to trigger a callback after N calls, for example?).

Currently, the only data gathered is the number of times an injection
point is run.  More fields can always be added as required.  All the
routines related to the stats are located in their own file, called
injection_stats.c in the test module injection_points, for clarity.

The stats can be used only if the test module is loaded through
shared_preload_libraries.  The key of the dshash uses InvalidOid for the
database, and an int4 hash of the injection point name as object ID.

A TAP test is added to provide coverage for the new custom cumulative
stats APIs, showing the persistency of the data across restarts, for
example.

Author: Michael Paquier
Reviewed-by: Dmitry Dolgov, Bertrand Drouvot
Discussion: https://postgr.es/m/Zmqm9j5EO0I4W8dx@paquier.xyz
2024-08-05 12:06:54 +09:00
Michael Paquier 2eff9e678d Add helper routines to retrieve data for custom fixed-numbered pgstats
This is useful for extensions to get snapshot and shmem data for custom
cumulative statistics when these have a fixed number of objects, so as
these do not need to know about the snapshot internals, aka pgStatLocal.

An upcoming commit introducing an example template for custom cumulative
stats with fixed-numbered objects will make use of these.  I have
noticed that this is useful for extension developers while hacking my
own example, actually.

Author: Michael Paquier
Reviewed-by: Dmitry Dolgov, Bertrand Drouvot
Discussion: https://postgr.es/m/Zmqm9j5EO0I4W8dx@paquier.xyz
2024-08-05 11:43:33 +09:00
Alexander Korotkov 8036d73ae3 pg_wal_replay_wait(): Fix typo in the doc
Reported-by: Kevin Hale Boyes
Discussion: https://postgr.es/m/CADAecHWKpaPuPGXAMOH%3DwmhTpydHWGPOk9KWX97UYhp5GdqCWw%40mail.gmail.com
2024-08-04 20:26:48 +03:00
Michael Paquier 7949d95945 Introduce pluggable APIs for Cumulative Statistics
This commit adds support in the backend for $subject, allowing
out-of-core extensions to plug their own custom kinds of cumulative
statistics.  This feature has come up a few times into the lists, and
the first, original, suggestion came from Andres Freund, about
pg_stat_statements to use the cumulative statistics APIs in shared
memory rather than its own less efficient internals.  The advantage of
this implementation is that this can be extended to any kind of
statistics.

The stats kinds are divided into two parts:
- The in-core "builtin" stats kinds, with designated initializers, able
to use IDs up to 128.
- The "custom" stats kinds, able to use a range of IDs from 128 to 256
(128 slots available as of this patch), with information saved in
TopMemoryContext.  This can be made larger, if necessary.

There are two types of cumulative statistics in the backend:
- For fixed-numbered objects (like WAL, archiver, etc.).  These are
attached to the snapshot and pgstats shmem control structures for
efficiency, and built-in stats kinds still do that to avoid any
redirection penalty.  The data of custom kinds is stored in a first
array in snapshot structure and a second array in the shmem control
structure, both indexed by their ID, acting as an equivalent of the
builtin stats.
- For variable-numbered objects (like tables, functions, etc.).  These
are stored in a dshash using the stats kind ID in the hash lookup key.

Internally, the handling of the builtin stats is unchanged, and both
fixed and variabled-numbered objects are supported.  Structure
definitions for builtin stats kinds are renamed to reflect better the
differences with custom kinds.

Like custom RMGRs, custom cumulative statistics can only be loaded with
shared_preload_libraries at startup, and must allocate a unique ID
shared across all the PostgreSQL extension ecosystem with the following
wiki page to avoid conflicts:
https://wiki.postgresql.org/wiki/CustomCumulativeStats

This makes the detection of the stats kinds and their handling when
reading and writing stats much easier than, say, allocating IDs for
stats kinds from a shared memory counter, that may change the ID used by
a stats kind across restarts.  When under development, extensions can
use PGSTAT_KIND_EXPERIMENTAL.

Two examples that can be used as templates for fixed-numbered and
variable-numbered stats kinds will be added in some follow-up commits,
with tests to provide coverage.

Some documentation is added to explain how to use this plugin facility.

Author: Michael Paquier
Reviewed-by: Dmitry Dolgov, Bertrand Drouvot
Discussion: https://postgr.es/m/Zmqm9j5EO0I4W8dx@paquier.xyz
2024-08-04 19:41:24 +09:00
Peter Eisentraut 365b5a345b Use CXXFLAGS instead of CFLAGS for linking C++ code
Otherwise, this would break if using C and C++ compilers from
different families and they understand different options.  It already
used the right flags for compiling, this is only for linking.  Also,
the meson setup already did this correctly.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/228700.1722717983@sss.pgh.pa.us
2024-08-04 11:17:46 +02:00
Michael Paquier 028b4b21df Fix incorrect format placeholders in pgstat.c
These should have been switched from %d to %u in 3188a4582a in the
debugging elogs added in ca1ba50fcb.  PgStat_Kind should never be
higher than INT32_MAX, but let's be clean.

Issue noticed while hacking more on this area.
2024-08-04 03:07:20 +09:00
Peter Eisentraut 6618891256 Add -Wmissing-variable-declarations to the standard compilation flags
This warning flag detects global variables not declared in header
files.  This is similar to what -Wmissing-prototypes does for
functions.  (More correctly, it is similar to what
-Wmissing-declarations does for functions, but -Wmissing-prototypes is
a superset of that in C.)

This flag is new in GCC 14.  Clang has supported it for a while.

Several recent commits have cleaned up warnings triggered by this, so
it should now be clean.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org
2024-08-03 11:51:02 +02:00
Jeff Davis 7926a9a80f Small refactoring around ExecCreateTableAs().
Since commit 4b74ebf726, the refresh logic is used to populate
materialized views, so we can simplify the error message in
ExecCreateTableAs().

Also, RefreshMatViewByOid() is moved to just after
create_ctas_nodata() call to improve code readability.

Author: Yugo Nagata
Discussion: https://postgr.es/m/20240802161301.d975daca9ba7a706fa05ecd7@sraoss.co.jp
2024-08-02 12:52:56 -07:00
Noah Misch 3cffe7946c Fix name of "Visual Studio" in documentation.
Back-patch to v17, which introduced this.

Aleksander Alekseev

Discussion: https://postgr.es/m/CAJ7c6TM7ct0EjoCQaLSVYoxxnEw4xCUFebWj77GktWsqEdyCtQ@mail.gmail.com
2024-08-02 12:49:56 -07:00
Alexander Korotkov 3c5db1d6b0 Implement pg_wal_replay_wait() stored procedure
pg_wal_replay_wait() is to be used on standby and specifies waiting for
the specific WAL location to be replayed.  This option is useful when
the user makes some data changes on primary and needs a guarantee to see
these changes are on standby.

The queue of waiters is stored in the shared memory as an LSN-ordered pairing
heap, where the waiter with the nearest LSN stays on the top.  During
the replay of WAL, waiters whose LSNs have already been replayed are deleted
from the shared memory pairing heap and woken up by setting their latches.

pg_wal_replay_wait() needs to wait without any snapshot held.  Otherwise,
the snapshot could prevent the replay of WAL records, implying a kind of
self-deadlock.  This is why it is only possible to implement
pg_wal_replay_wait() as a procedure working without an active snapshot,
not a function.

Catversion is bumped.

Discussion: https://postgr.es/m/eb12f9b03851bb2583adab5df9579b4b%40postgrespro.ru
Author: Kartyshov Ivan, Alexander Korotkov
Reviewed-by: Michael Paquier, Peter Eisentraut, Dilip Kumar, Amit Kapila
Reviewed-by: Alexander Lakhin, Bharath Rupireddy, Euler Taveira
Reviewed-by: Heikki Linnakangas, Kyotaro Horiguchi
2024-08-02 21:16:56 +03:00
Alvaro Herrera a83f3088b8
Fix NLS file reference in pg_createsubscriber
pg_createsubscriber is referring to a non-existent message translation
file, causing NLS to not work correctly. This command should use the
same file as pg_basebackup.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20240802.115717.1083441453338151622.horikyota.ntt@gmail.com
2024-08-02 12:05:38 -04:00
Alvaro Herrera 3b2f668b78
pg_createsubscriber: Fix bogus error message
Also some desultory style improvement
2024-08-02 12:01:10 -04:00
Peter Eisentraut 9fb855fe1a Include bison header files into implementation files
Before Bison 3.4, the generated parser implementation files run afoul
of -Wmissing-variable-declarations (in spite of commit ab61c40bfa)
because declarations for yylval and possibly yylloc are missing.  The
generated header files contain an extern declaration, but the
implementation files don't include the header files.  Since Bison 3.4,
the generated implementation files automatically include the generated
header files, so then it works.

To make this work with older Bison versions as well, include the
generated header file from the .y file.

(With older Bison versions, the generated implementation file contains
effectively a copy of the header file pasted in, so including the
header file is redundant.  But we know this works anyway because the
core grammar uses this arrangement already.)

Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org
2024-08-02 10:25:11 +02:00
Heikki Linnakangas 63bef4df97 Minor refactoring of assign_backendlist_entry()
Make assign_backendlist_entry() responsible just for allocating the
Backend struct. Linking it to the RegisteredBgWorker is the caller's
responsibility now. Seems more clear that way.

Discussion: https://www.postgresql.org/message-id/835232c0-a5f7-4f20-b95b-5b56ba57d741@iki.fi
2024-08-01 23:23:55 +03:00
Heikki Linnakangas ef4c35b416 Fix outdated comment; all running bgworkers are in BackendList
Before commit 8a02b3d732, only bgworkers that connected to a database
had an entry in the Backendlist. Commit 8a02b3d732 changed that, but
forgot to update this comment.

Discussion: https://www.postgresql.org/message-id/835232c0-a5f7-4f20-b95b-5b56ba57d741@iki.fi
2024-08-01 23:23:47 +03:00
Michael Paquier 3188a4582a Switch PgStat_Kind from an enum to a uint32 type
A follow-up patch is planned to make cumulative statistics pluggable,
and using a type is useful in the internal routines used by pgstats as
PgStat_Kind may have a value that was not originally in the enum removed
here, once made pluggable.

While on it, this commit switches pgstat_is_kind_valid() to use
PgStat_Kind rather than an int, to be more consistent with its existing
callers.  Some loops based on the stats kind IDs are switched to use
PgStat_Kind rather than int, for consistency with the new time.

Author: Michael Paquier
Reviewed-by: Dmitry Dolgov, Bertrand Drouvot
Discussion: https://postgr.es/m/Zmqm9j5EO0I4W8dx@paquier.xyz
2024-08-02 04:49:34 +09:00
Michael Paquier b860848232 Add redo LSN to pgstats files
This is used in the startup process to check that the pgstats file we
are reading includes the redo LSN referring to the shutdown checkpoint
where it has been written.  The redo LSN in the pgstats file needs to
match with what the control file has.

This is intended to be used for an upcoming change that will extend the
write of the stats file to happen during checkpoints, rather than only
shutdown sequences.

Bump PGSTAT_FILE_FORMAT_ID.

Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/Zp8o6_cl0KSgsnvS@paquier.xyz
2024-08-02 01:57:28 +09:00
Peter Eisentraut c27090bd60 Convert some extern variables to static, Windows code
Similar to 720b0eaae9, discovered by MinGW.
2024-08-01 13:28:32 +02:00
Peter Eisentraut 6bbbd7f65f Convert an extern variable to static
Similar to 720b0eaae9, fixes new code from bd15b7db48.
2024-08-01 12:43:26 +02:00
Peter Eisentraut c671e142bf pg_createsubscriber: Rename option --socket-directory to --socketdir
For consistency with the equivalent option in pg_upgrade.

Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Discussion: https://www.postgresql.org/message-id/flat/1ed82b9b-8e20-497d-a2f8-aebdd793d595%40eisentraut.org
2024-08-01 12:14:01 +02:00
Etsuro Fujita e66b32e43b Update comment in portal.h.
We store tuples into the portal's tuple store for a PORTAL_ONE_MOD_WITH
query as well.

Back-patch to all supported branches.

Reviewed by Andy Fan.

Discussion: https://postgr.es/m/CAPmGK14HVYBZYZtHabjeCd-e31VT%3Dwx6rQNq8QfehywLcpZ2Hw%40mail.gmail.com
2024-08-01 17:45:00 +09:00
Peter Eisentraut a292c98d62 Convert node test compile-time settings into run-time parameters
This converts

    COPY_PARSE_PLAN_TREES
    WRITE_READ_PARSE_PLAN_TREES
    RAW_EXPRESSION_COVERAGE_TEST

into run-time parameters

    debug_copy_parse_plan_trees
    debug_write_read_parse_plan_trees
    debug_raw_expression_coverage_test

They can be activated for tests using PG_TEST_INITDB_EXTRA_OPTS.

The compile-time symbols are kept for build farm compatibility, but
they now just determine the default value of the run-time settings.

Furthermore, support for these settings is not compiled in at all
unless assertions are enabled, or the new symbol
DEBUG_NODE_TESTS_ENABLED is defined at compile time, or any of the
legacy compile-time setting symbols are defined.  So there is no
run-time overhead in production builds.  (This is similar to the
handling of DISCARD_CACHES_ENABLED.)

Discussion: https://www.postgresql.org/message-id/flat/30747bd8-f51e-4e0c-a310-a6e2c37ec8aa%40eisentraut.org
2024-08-01 10:09:18 +02:00
Amit Kapila a67da49e1d Avoid duplicate table scans for cross-partition updates during logical replication.
When performing a cross-partition update in the apply worker, it
needlessly scans the old partition twice, resulting in noticeable
overhead.

This commit optimizes it by removing the redundant table scan.

Author: Hou Zhijie
Reviewed-by: Hayato Kuroda, Amit Kapila
Discussion: https://postgr.es/m/OS0PR01MB571623E39984D94CBB5341D994AB2@OS0PR01MB5716.jpnprd01.prod.outlook.com
2024-08-01 10:11:06 +05:30
Andres Freund a7f107df2b Evaluate arguments of correlated SubPlans in the referencing ExprState
Until now we generated an ExprState for each parameter to a SubPlan and
evaluated them one-by-one ExecScanSubPlan. That's sub-optimal as creating lots
of small ExprStates
a) makes JIT compilation more expensive
b) wastes memory
c) is a bit slower to execute

This commit arranges to evaluate parameters to a SubPlan as part of the
ExprState referencing a SubPlan, using the new EEOP_PARAM_SET expression
step. We emit one EEOP_PARAM_SET for each argument to a subplan, just before
the EEOP_SUBPLAN step.

It likely is worth using EEOP_PARAM_SET in other places as well, e.g. for
SubPlan outputs, nestloop parameters and - more ambitiously - to get rid of
ExprContext->domainValue/caseValue/ecxt_agg*.  But that's for later.

Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Alena Rybakina <lena.ribackina@yandex.ru>
Discussion: https://postgr.es/m/20230225214401.346ancgjqc3zmvek@awork3.anarazel.de
2024-07-31 19:54:46 -07:00
Tom Lane e6a9637488 Revert "Allow parallel workers to cope with a newly-created session user ID."
This reverts commit f5f30c22ed.

Some buildfarm animals are failing with "cannot change
"client_encoding" during a parallel operation".  It looks like
assign_client_encoding is unhappy at being asked to roll back a
client_encoding setting after a parallel worker encounters a
failure.  There must be more to it though: why didn't I see this
during local testing?  In any case, it's clear that moving the
RestoreGUCState() call is not as side-effect-free as I thought.
Given that the bug f5f30c22e intended to fix has gone unreported
for years, it's not something that's urgent to fix; I'm not
willing to risk messing with it further with only days to our
next release wrap.
2024-07-31 20:57:00 -04:00
Jeff Davis ca2eea3ac8 Add is_create parameter to RefreshMatviewByOid().
RefreshMatviewByOid is used for both REFRESH and CREATE MATERIALIZED
VIEW.  This flag is currently just used for handling internal error
messages, but also aimed to improve code-readability.

Author: Yugo Nagata
Discussion: https://postgr.es/m/20240726122630.70e889f63a4d7e26f8549de8@sraoss.co.jp
2024-07-31 16:42:19 -07:00
Jeff Davis f683d3a4ca Remove unused ParamListInfo argument from ExecRefreshMatView.
Author: Yugo Nagata
Discussion: https://postgr.es/m/20240726122630.70e889f63a4d7e26f8549de8@sraoss.co.jp
2024-07-31 16:37:53 -07:00
Tom Lane f5f30c22ed Allow parallel workers to cope with a newly-created session user ID.
Parallel workers failed after a sequence like
	BEGIN;
	CREATE USER foo;
	SET SESSION AUTHORIZATION foo;
because check_session_authorization could not see the uncommitted
pg_authid row for "foo".  This is because we ran RestoreGUCState()
in a separate transaction using an ordinary just-created snapshot.
The same disease afflicts any other GUC that requires catalog lookups
and isn't forgiving about the lookups failing.

To fix, postpone RestoreGUCState() into the worker's main transaction
after we've set up a snapshot duplicating the leader's.  This affects
check_transaction_isolation and check_transaction_deferrable, which
think they should only run during transaction start.  Make them
act like check_transaction_read_only, which already knows it should
silently accept the value when InitializingParallelWorker.

Per bug #18545 from Andrey Rachitskiy.  Back-patch to all
supported branches, because this has been wrong for awhile.

Discussion: https://postgr.es/m/18545-feba138862f19aaa@postgresql.org
2024-07-31 18:54:10 -04:00
Nathan Bossart bd15b7db48 Improve performance of dumpSequenceData().
As one might guess, this function dumps the sequence data.  It is
called once per sequence, and each such call executes a query to
retrieve the relevant data for a single sequence.  This can cause
pg_dump to take significantly longer, especially when there are
many sequences.

This commit improves the performance of this function by gathering
all the sequence data with a single query at the beginning of
pg_dump.  This information is stored in a sorted array that
dumpSequenceData() can bsearch() for what it needs.  This follows a
similar approach as previous commits that introduced sorted arrays
for role information, pg_class information, and sequence metadata.
As with those commits, this patch will cause pg_dump to use more
memory, but that isn't expected to be too egregious.

Note that we use the brand new function pg_sequence_read_tuple() in
the query that gathers all sequence data, so we must continue to
use the preexisting query-per-sequence approach for versions older
than 18.

Reviewed-by: Euler Taveira, Michael Paquier, Tom Lane
Discussion: https://postgr.es/m/20240503025140.GA1227404%40nathanxps13
2024-07-31 10:12:42 -05:00