When COPYing into a partitioned table that does now permit the use of
table_multi_insert(), we could error out with
ERROR: could not read block NN in file "base/...": read only 0 of 8192 bytes
because BulkInsertState->next_free was not reset between partitions. This
problem occurred only when not able to use table_multi_insert(), as a
dedicated BulkInsertState for each partition is used in that case.
The bug was introduced in 00d1e02be24, but it was hard to hit at that point,
as commonly bulk relation extension is not used when not using
table_multi_insert(). It became more likely after 82a4edabd27, which expanded
the use of bulk extension.
To fix the bug, reset the bulk relation extension state in BulkInsertState in
ReleaseBulkInsertStatePin(). That was added (in b1ecb9b3fcf) to tackle a very
similar issue. Obviously the name is not quite correct, but there might be
external callers, and bulk insert state needs to be reset in precisely in the
situations that ReleaseBulkInsertStatePin() already needed to be called.
Medium term the better fix likely is to disallow reusing BulkInsertState
across relations.
Add a test that, without the fix, reproduces #18130 in most
configurations. The test also catches the problem fixed in b1ecb9b3fcf when
run with small shared_buffers.
Reported-by: Ivan Kolombet <enderstd@gmail.com>
Analyzed-by: Tom Lane <tgl@sss.pgh.pa.us>
Analyzed-by: Andres Freund <andres@anarazel.de>
Bug: #18130
Discussion: https://postgr.es/m/18130-7a86a7356a75209d%40postgresql.org
Discussion: https://postgr.es/m/257696.1695670946%40sss.pgh.pa.us
Backpatch: 16-
* sync_method is renamed to wal_sync_method.
* sync_method_options[] is renamed to wal_sync_method_options[].
* assign_xlog_sync_method() is renamed to assign_wal_sync_method().
* The names of the available synchronization methods are now
prefixed with "WAL_SYNC_METHOD_" and have been moved into a
WalSyncMethod enum.
* PLATFORM_DEFAULT_SYNC_METHOD is renamed to
PLATFORM_DEFAULT_WAL_SYNC_METHOD, and DEFAULT_SYNC_METHOD is
renamed to DEFAULT_WAL_SYNC_METHOD.
These more descriptive names help distinguish the code for
wal_sync_method from the code for DataDirSyncMethod (e.g., the
recovery_init_sync_method configuration parameter and the
--sync-method option provided by several frontend utilities). This
change also prevents name collisions between the aforementioned
sets of code. Since this only improves the naming of internal
identifiers, there should be no behavior change.
Author: Maxim Orlov
Discussion: https://postgr.es/m/CACG%3DezbL1gwE7_K7sr9uqaCGkWhmvRTcTEnm3%2BX1xsRNwbXULQ%40mail.gmail.com
Commit 5a3423ad8e updated pg_stat_statements to 1.11 due to added
columns in the view for jit deform counters. Fixing oldextversion
was however missed in that commit. This adds a test for the 1.11
version view definition.
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/CAN55FZ2Y7c2VEg4S1KDXFcaHjyF8=KZa_rMV6WOe+KwE-KE97g@mail.gmail.com
AT TIME ZONE is completed with a list of supported timezones, something
not needed by AT LOCAL.
Author: Dagfinn Ilmari Mannsåker
Reviewed-by: Jim Jones
Discussion: https://postgr.es/m/87jzyzsvgv.fsf@wibble.ilmari.org
When converting a timestamp to/from with/without time zone, the SQL
Standard specifies an AT LOCAL variant of AT TIME ZONE which uses the
session's time zone. This includes three system functions able to do
the work in the same way as the existing flavors for AT TIME ZONE,
except that these need to be marked as stable as they depend on the
session's TimeZone GUC.
Bump catalog version.
Author: Vik Fearing
Reviewed-by: Laurenz Albe, Cary Huang, Michael Paquier
Discussion: https://postgr.es/m/8e25dec4-5667-c1a5-6581-167d710c2182@postgresfriends.org
When MyProc->delayChkptFlags is set to temporarily block phase
transitions in a concurrent checkpoint, the checkpointer enters a
sleep-poll loop to wait for the flag to be cleared. We should show that
as a wait event in the pg_stat_activity view.
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CA%2BhUKGL7Whi8iwKbzkbn_1fixH3Yy8aAPz7mfq6Hpj7FeJrKMg%40mail.gmail.com
timezone(zone, timestamp) is already mentioned as an equivalent of the
two first patterns in the table describing the AT TIME ZONE variants,
but did not mention the third case about "time" and its equivalent as an
SQL function, so let's be consistent here.
Extracted from a larger patch by the same author.
Author: Vik Fearing
Discussion: https://postgr.es/m/8e25dec4-5667-c1a5-6581-167d710c2182@postgresfriends.org
An upcoming patch wants to introduce an additional special case in
this function. To keep that as cheap as possible, minimize the amount
of branching that we do based on whether this is an XLOG_SWITCH
record.
Additionally, and also in the interest of keeping the overhead of
special-case code paths as low as possible, apply likely() to the
non-XLOG_SWITCH case, since only a very tiny fraction of WAL records
will be XLOG_SWITCH records.
Patch by me, reviewed by Dilip Kumar, Amit Kapila, Andres Freund,
and Michael Paquier.
Discussion: http://postgr.es/m/CA+TgmoYy-Vc6G9QKcAKNksCa29cv__czr+N9X_QCxEfQVpp_8w@mail.gmail.com
This could only affect HASH partitioned tables with at least 2 partition
key columns.
If partition pruning was delayed until execution and the query contained
an IS NULL qual on one of the partitioned keys, and some subsequent
partitioned key was being compared to a non-Const, then this could result
in a crash due to the incorrect keyno being used to calculate the
stateidx for the expression evaluation code.
Here we fix this by properly skipping partitioned keys which have a
nullkey set. Effectively, this must be the same as what's going on
inside perform_pruning_base_step().
Sergei Glukhov also provided a patch, but that's not what's being used
here.
Reported-by: Sergei Glukhov
Reviewed-by: tender wang, Sergei Glukhov
Discussion: https://postgr.es/m/d05b26fa-af54-27e1-f693-6c31590802fa@postgrespro.ru
Backpatch-through: 11, where runtime partition pruning was added.
get_steps_using_prefix_recurse() incorrectly assumed that it could stop
recursive processing of the 'prefix' list when cur_keyno was one before
the step_lastkeyno. Since hash partition pruning can prune using IS
NULL quals, and these IS NULL quals are not present in the 'prefix'
list, then that logic could cause more levels of recursion than what is
needed and lead to there being no more items in the 'prefix' list to
process. This would manifest itself as a crash in some code that
expected the 'start' ListCell not to be NULL.
Here we adjust the logic so that instead of stopping recursion at 1 key
before the step_lastkeyno, we just look at the llast(prefix) item and
ensure we only recursively process up until just before whichever the last
key is. This effectively allows keys to be missing in the 'prefix' list.
This change does mean that step_lastkeyno is no longer needed, so we
remove that from the static functions. I also spent quite some time
reading this code and testing it to try to convince myself that there
are no other issues. That resulted in the irresistible temptation of
rewriting some comments, many of which were just not true or inconcise.
Reported-by: Sergei Glukhov
Reviewed-by: Sergei Glukhov, tender wang
Discussion: https://postgr.es/m/2f09ce72-315e-2a33-589a-8519ada8df61@postgrespro.ru
Backpatch-through: 11, where partition pruning was introduced.
This adds a new option called BGWORKER_BYPASS_ROLELOGINCHECK to the
flags available to BackgroundWorkerInitializeConnection() and
BackgroundWorkerInitializeConnectionByOid().
This gives the possibility to bgworkers to bypass the role login check,
making possible the use of a role that has no login rights while not
being a superuser. PostgresInit() gains a new flag called
INIT_PG_OVERRIDE_ROLE_LOGIN, taking advantage of the refactoring done in
4800a5dfb4c4.
Regression tests are added to worker_spi to check the behavior of this
new option with bgworkers.
Author: Bertrand Drouvot
Reviewed-by: Nathan Bossart, Michael Paquier, Bharath Rupireddy
Discussion: https://postgr.es/m/bcc36259-7850-4882-97ef-d6b905d2fc51@gmail.com
In commit 3fc6e2d7f, I (tgl) argued that we only need to check for
a constant-FALSE restriction clause when there's exactly one
restriction clause, on the grounds that const-folding would have
thrown away anything ANDed with a Const FALSE. That's true just after
const-folding has been applied, but subsequent processing such as
equivalence class expansion could result in cases where a Const FALSE
is ANDed with some other stuff. (Compare for instance joinrels.c's
restriction_is_constant_false.) Hence, tweak this logic to check all
the elements of the baserestrictinfo list, not just one; that's cheap
enough to not be worth worrying about.
There is one existing test case where this visibly improves the plan.
There would not be any savings in runtime, but the planner effort and
executor startup effort will be reduced, and anyway it's odd that
we can detect related cases but not this one.
Richard Guo (independently discovered by David Rowley)
Discussion: https://postgr.es/m/CAMbWs4_x3-CnVVrCboS1LkEhB5V+W7sLSCabsRiG+n7+5_kqbg@mail.gmail.com
The check is not about processing the startup packet, so the calling
function seems like a more natural place. I'm also working on a patch
that moves 'canAcceptConnections' out of the Port struct, and this
makes that refactoring more convenient.
Reviewed-by: Tristan Partin
Discussion: https://www.postgresql.org/message-id/7a59b073-5b5b-151e-7ed3-8b01ff7ce9ef@iki.fi
InitPostgres() has been using a set of boolean arguments to control its
behavior, and a patch under discussion was aiming at expanding it with a
third one. In preparation for expanding this area, this commit switches
all the current boolean arguments of this routine to a single bits32
argument instead. Two values are currently supported for the flags:
- INIT_PG_LOAD_SESSION_LIBS to load [session|local]_preload_libraries at
startup.
- INIT_PG_OVERRIDE_ALLOW_CONNS to allow connection to a database even if
it has !datallowconn. This is used by bgworkers.
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/ZSTn66_BXRZCeaqS@paquier.xyz
Since we added the PlannerInfo.all_baserels set, it's not really
necessary to grovel over the rangetable to count baserels in the
current query. So let's drop has_multiple_baserels() in favor
of a bms_membership() test. This might be microscopically
faster, but the main point is to remove some unnecessary code.
Richard Guo
Discussion: https://postgr.es/m/CAMbWs4_8RcSbbfs1ASZLrMuL0c0EQgXWcoLTQD8swBRY_pQQiA@mail.gmail.com
The present pg_resetwal code hardcodes the minimum value for -c as 2,
which is FrozenTransactionId, but it's not clear why that is allowed.
After some research, it was probably a mistake in the original patch.
Change it to FirstNormalTransactionId, which matches other xid-related
options in pg_resetwal.
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/flat/d09f0e91-8757-642b-1a92-da9a52f5589a%40eisentraut.org
While working on a8a968a82, I failed to consider that
cheapest_startup_path can be NULL when there is no non-parameterized
path in the pathlist. This is well documented in set_cheapest(), I just
failed to notice.
Here we adjust the code to just check if the RelOptInfo has a
cheapest_startup_path set before adding it to the startup_subpaths list.
Reported-by: Richard Guo
Author: Richard Guo
Discussion: https://postgr.es/m/CAMbWs49w3t03V69XhdCuw+GDwivny4uQUxrkVp6Gejaspt0wMQ@mail.gmail.com
worker_spi_launch() may report that a worker stopped when it fails to
connect on a database that does not allow connections if the worker
exits before the SQL function checks for the current status of the
worker. The test is switched to use Cluster::psql instead of
safe_psql() so as it does not fail hard when this query errors. While
on it, this removes a query that looks at pg_stat_activity to simplify
the test, as a check on the contents of the server logs achieves the
same when the worker cannot connect to the database without
datallowconn.
Per buildfarm members kestrel, mamba and serinus. Bonus thanks to Tom
Lane for providing the logs of the failure from mamba that the buildfarm
was not able to show up. Note that I have reproduced the failure with a
hardcoded stop point.
Discussion: https://postgr.es/m/3365937.1696801735@sss.pgh.pa.us
While these two built-in functions do exactly the same thing,
CURRENT_USER seems preferable to use in documentation examples.
It's easier to look up if the reader is unsure what it is.
Also, this puts these examples in sync with an adjacent example
that already used CURRENT_USER.
Per question from Kirk Parker.
Discussion: https://postgr.es/m/CANwZ8rmN_Eb0h0hoMRS8Feftaik0z89PxVsKg+cP+PctuOq=Qg@mail.gmail.com
The comment claimed that it is "called from postmaster", but it is
actually called in the child process, pretty early in the process
initialization. I guess you could interpret "called from postmaster"
to mean that, but it seems wrong to me. Rename the function to be
consistent with other functions with similar role.
Reviewed-by: Thomas Munro
Discussion: https://www.postgresql.org/message-id/4f95c1fc-ad3c-7974-3a8c-6faa3931804c@iki.fi
In EXEC_BACKEND or single-user mode, we process
shared_preload_libraries at postmaster startup as usual, but also at
backend startup. When a library calls RegisterBackgroundWorker() when
being loaded into a backend process, we go through the motions to add
the worker to BackgroundWorkerList, even though that is a
postmaster-private data structure. Make it return early when called in
a backend process, without changing BackgroundWorkerList.
You could argue that it was intentional: In non-EXEC_BACKEND mode, the
backend processes inherit BackgroundWorkerList at fork(), so it does
make some sense to initialize it to the same state in EXEC_BACKEND
mode, too. It's clearly a postmaster-private structure, though, and
all the functions that use it are clearly marked as "should only be
called in postmaster".
You could also argue that libraries should not call
RegisterBackgroundWorker() during backend startup. It's too late to
correctly register any static background workers at that stage. But
it's a common pattern in extensions, and it doesn't seem worth the
churn to require all extensions to change it.
Another sloppiness was the exception for "internal" background
workers. We checked that RegisterBackgroundWorker() was called during
shared_preload_libraries processing, or the background worker was an
internal one. That exception was made in commit 665d1fad99 to allow
postmaster to register the logical apply launcher in
ApplyLauncherRegister(). The way the check was written, it would not
complain if you registered an internal background worker in a regular
backend process. But it would complain if postmaster registered a
background worker defined in a shared library, outside
shared_preload_libraries processing. I think the correct rule is that
you can only register static background workers in the postmaster
process, and only before the bgworker shared memory array has been
initialized. Check for that more directly.
Reviewed-by: Thomas Munro
Discussion: https://www.postgresql.org/message-id/4f95c1fc-ad3c-7974-3a8c-6faa3931804c@iki.fi
The serialized representation of an internal aggregate state is a bytea
value. In each deserial function, in order to "receive" the bytea value
we appended it onto a short-lived StringInfoData using
appendBinaryStringInfo. This was a little wasteful as it meant having to
palloc memory, copy a (possibly long) series of bytes then later pfree
that memory. Instead of going to this extra trouble, we can just fake up
a StringInfoData and point the data directly at the bytea's payload. This
should help increase the performance of internal aggregate
deserialization.
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/CAApHDvr=e-YOigriSHHm324a40HPqcUhSp6pWWgjz5WwegR=cQ@mail.gmail.com
1349d2790 added code to adjust the PlannerInfo.group_pathkeys so that
ORDER BY / DISTINCT aggregate functions could obtain pre-sorted inputs
to allow faster execution. That commit forgot to adjust the pathkeys in
create_agg_path(). Some code in there assumed that it was always fine
to make the AggPath's pathkeys the same as its subpath's. That seems to
have been ok up until 1349d2790, but since that commit adds pathkeys for
columns which are within the aggregate function, those columns won't be
available above the aggregate node. This can result in "could not find
pathkey item to sort" during create_plan().
The fix here is to strip off the additional pathkeys added by
adjust_group_pathkeys_for_groupagg(). It seems that the pathkeys here
will only ever be group_pathkeys, so all we need to do is check if the
length of the pathkey list is longer than the num_groupby_pathkeys and
get rid of the additional ones only if we see extras.
Reported-by: Justin Pryzby
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/ZQhYYRhUxpW3PSf9%40telsasoft.com
Backpatch-through: 16, where 1349d2790 was introduced
Going by c4a1933b4, b33ef397a and 05893712c (to name just a few), it seems
that maintaining debug_print_rel() is often forgotten. In the case of
c4a1933b4, it was several years before anyone noticed that a path type
was not handled by debug_print_rel(). (debug_print_rel() is only
compiled when building with OPTIMIZER_DEBUG).
After a quick survey on the pgsql-hackers mailing list, nobody came
forward to admit that they use OPTIMIZER_DEBUG. So to prevent any future
maintenance neglect, let's just remove debug_print_rel() and have
OPTIMIZER_DEBUG make use of pprint() instead (as suggested by Tom Lane).
If anyone wants to come forward to claim they make use of
OPTIMIZER_DEBUG in a way that they need debug_print_rel() then they have
around 10 months remaining in the v17 cycle where we could revert this.
If nobody comes forward in that time, then we can likely safely declare
debug_print_rel() as not worth keeping.
Discussion: https://postgr.es/m/CAApHDvoCdjo8Cu2zEZF4-AxWG-90S+pYXAnoDDa9J3xH-OrczQ@mail.gmail.com
Back in the 8.3 era we discovered that it was problematic if
libpq.so had encoding ID assignments different from the backend,
which is possible because on some platforms libpq.so might be
of a different major version from the calling programs.
psql should use libpq's assignments, but initdb has to use the
backend's, else it will put wrong values into pg_database.
The solution devised in commit 8468146b0 relied on giving initdb
its own copy of encnames.c rather than relying on the functions
exported by libpq. Later, that metamorphosed into ensuring that
libpgcommon got linked before libpq -- which made things OK for
initdb but broke psql. We didn't notice for lack of any changes
in enum pg_enc since then. Commit 06843df4a reversed that, fixing
the latent bug in psql but adding one in initdb. The meson build
infrastructure is also not being sufficiently careful about link
order, and trying to make it so would be equally fragile.
Hence, let's use a new scheme based on giving the libpq-exported
symbols different real names than the same functions exported from
libpgcommon.a or libpgcommon_srv.a. (We could distinguish those
two cases as well, but there seems no need to.) libpq gets the
official names to avoid an ABI break for libpq clients, while the
other cases use #define's to make the real names "xxx_private"
rather than "xxx". By controlling where the #define's are
applied, we can force any particular client program to use one
set or the other of the encnames.c functions.
We cannot back-patch this, since it'd be an ABI break for backend
loadable modules, but there seems little need to. We're just
trying to ensure that the world is safe for hypothetical future
additions to enum pg_enc.
In passing this should fix "duplicate symbol" linker warnings
that we've been seeing on AIX buildfarm members since commit
06843df4a. It's not very clear why that linker is complaining
now, when there were strictly *more* duplicates visible before,
but in any case this should remove the reason for complaint.
Patch by me; thanks to Andres Freund for review.
Discussion: https://postgr.es/m/2385119.1696354473@sss.pgh.pa.us
There was some discussion what the line length should be. Most output
currently clearly targets around 80 columns, but the maximum in use
currently is 95, so we set that as the current maximum. This just
ensures that there is some guidance and there are no wild deviations.
based on patch by Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Discussion: https://www.postgresql.org/message-id/flat/50ca8ff35a8dd8f9ec89963b503571a7@oss.nttdata.com
This seemed inconsistent with the --help output of other tools.
Depending on the values, it can cause ugly formatting. Also, we're
not getting the defaults from libpq, we're just emulating the methods
libpq uses to derive these values, so they might not be 100% correct.
Author: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Discussion: https://www.postgresql.org/message-id/flat/50ca8ff35a8dd8f9ec89963b503571a7@oss.nttdata.com
Currently, B-tree code matches every scan key to every item on the page.
Imagine the ordered B-tree scan for the query like this.
SELECT * FROM tbl WHERE col > 'a' AND col < 'b' ORDER BY col;
The (col > 'a') scan key will be always matched once we find the location to
start the scan. The (col < 'b') scan key will match every item on the page
as long as it matches the last item on the page.
This patch implements prechecking of the scan keys required for directional
scan on beginning of page scan. If precheck is successful we can skip this
scan keys check for the items on the page. That could lead to significant
acceleration especially if the comparison operator is expensive.
Idea from patch by Konstantin Knizhnik.
Discussion: https://postgr.es/m/079c3f8e-3371-abe2-e93c-fc8a0ae3f571%40garret.ru
Reviewed-by: Peter Geoghegan, Pavel Borisov
A bgworker can spawn parallel workers of its own when executing queries,
and if the worker uses BGWORKER_BYPASS_ALLOWCONN while the database it
is connected to does not allow connections, a parallel worker would fail
to startup. In the case of this module, the step checking for the
presence of the schema to create was spawning a worker, failing the last
test introduced by 991bb0f9653c.
This issue could be reproduced with debug_parallel_query = 'regress',
for example.
Per buildfarm member crake.