A comment in hba.h mentioned that AuthTokens are used when building the
IdentLines from pg_ident.conf, but since 8fea868, which added support
for regexps for databases and roles in pg_hba.conf, this is also the
case for HBA files. This refreshes the comment to refer to both HBA
and ident files.
Issue spotted while going through a different patch.
The creation of a logical decoding context in CreateDecodingContext()
updates some data of its slot for two-phase transactions if enabled by
the caller, but the code forgot to acquire a spinlock when updating
these fields, as the other code paths do. This could lead to reads of
inconsistent data.
Oversight in a8fd13c.
Author: Sawada Masahiko
Discussion: https://postgr.es/m/CAD21AoAD8_fp47191LKuecjDd3DYhoQ4TaucFco1_TEr_jQ-Zw@mail.gmail.com
Backpatch-through: 15
This reverts commit 1b4d280ea1eb7ddb2e16654d5fa16960bb959566.
It's broken the buildfarm members that run cross-version-upgrade tests,
because they're not prepared to deal with cosmetic differences between
CREATE VIEW commands emitted by older servers and HEAD. Even if we had
a solution to that, which we don't, it'd take some time to roll it out
to the affected animals. This improvement isn't valuable enough to
justify addressing that problem on an emergency basis, so revert it
for now.
Switch to a design similar to regular backends, instead of the previous
arrangement where signal handlers did non-trivial state management and
called fork(). The main changes are:
* The postmaster now has its own local latch to wait on. (For now, we
don't want other backends setting its latch directly, but that could
probably be made to work with more research on robustness.)
* The existing signal handlers are cut in two: a handle_pm_XXX() part
that just sets pending_pm_XXX flags and the latch, and a
process_pm_XXX() part that runs later when the latch is seen.
* Signal handlers are now installed with the regular pqsignal()
function rather than the special pqsignal_pm() function; historical
portability concerns about the effect of SA_RESTART on select() are no
longer relevant, and we don't need to block signals anymore.
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA%2BhUKG%2BZ-HpOj1JsO9eWUP%2Bar7npSVinsC_npxSy%2BjdOMsx%3DGg%40mail.gmail.com
Rename the heapam.c freeze plan deduplication routines added by commit
9e540599 to names that follow conventions for functions in heapam.c.
Also relocate the functions so that they're next to their caller, which
runs during original execution, when FREEZE_PAGE WAL records are built.
The routines were initially placed next to (and followed the naming
conventions of) conceptually related REDO routine code, but that scheme
turned out to be kind of jarring when considered in a wider context.
Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/20230109214308.icz26oqvt3k2274c@awork3.anarazel.de
The rule system needs "old" and/or "new" pseudo-RTEs in rule actions
that are ON INSERT/UPDATE/DELETE. Historically it's put such entries
into the ON SELECT rules of views as well, but those are really quite
vestigial. The only thing we've used them for is to carry the
view's relid forward to AcquireExecutorLocks (so that we can
re-lock the view to verify it hasn't changed before re-using a plan)
and to carry its relid and permissions data forward to execution-time
permissions checks. What we can do instead of that is to retain
these fields of the RTE_RELATION RTE for the view even after we
convert it to an RTE_SUBQUERY RTE. This requires a tiny amount of
extra complication in the planner and AcquireExecutorLocks, but on
the other hand we can get rid of the logic that moves that data from
one place to another.
The principal immediate benefit of doing this, aside from a small
saving in the pg_rewrite data for views, is that these pseudo-RTEs
no longer trigger ruleutils.c's heuristic about qualifying variable
names when the rangetable's length is more than 1. That results
in quite a number of small simplifications in regression test outputs,
which are all to the good IMO.
Bump catversion because we need to dump a few more fields of
RTE_SUBQUERY RTEs. While those will always be zeroes anyway in
stored rules (because we'd never populate them until query rewrite)
they are useful for debugging, and it seems like we'd better make
sure to transmit such RTEs accurately in plans sent to parallel
workers. I don't think the executor actually examines these fields
after startup, but someday it might.
Amit Langote
Discussion: https://postgr.es/m/CA+HiwqEf7gPN4Hn+LoZ4tP2q_Qt7n3vw7-6fJKOf92tSEnX6Gg@mail.gmail.com
This appends the set of object types supported by these commands, and
the objects defined in the cluster are completed after that. Note that,
to keep the code simple, these may not be in the extension being worked
on when using DROP, but this is much more useful than the previous
behavior of not knowing which objects can be touched.
Author: Vignesh C
Discussion: https://postgr.es/m/CALDaNm3LVM2QcUWqgOonKZH80TveT-tUthbw4ZhuE_6pD3yi-A@mail.gmail.com
Document that TransactionIdDidAbort() won't indicate that transactions
that were in-progress during a crash have aborted. Tie this to existing
discussion of the TransactionIdDidCommit() and TransactionIdDidAbort()
protocol that code in heapam_visibility.c (and a few other places) must
observe.
Follow-up to bugfix commit eb5ad4ff.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAH2-Wzn4bEEqgmaUQL3aJ73yM9gAeK-wE4ngi7kjRjLztb+P0w@mail.gmail.com
In both partitioning and traditional inheritance, require child
columns to be GENERATED if and only if their parent(s) are.
Formerly we allowed the case of an inherited column being
GENERATED when its parent isn't, but that results in inconsistent
behavior: the column can be directly updated through an UPDATE
on the parent table, leading to it containing a user-supplied
value that might not match the generation expression. This also
fixes an oversight: the rule that partition key columns can't be
GENERATED was enforced against parent tables, but not against child
tables that were dynamically attached to them.
Also, remove the restriction that the child's generation expression
be equivalent to the parent's. In the wake of commit 3f7836ff6,
there doesn't seem to be any reason that we need that restriction,
since generation expressions are always computed per-table anyway.
By removing this, we can also allow a child to merge multiple
inheritance parents with inconsistent generation expressions, by
overriding them with its own expression, much as we've long allowed
for DEFAULT expressions.
Since we're rejecting a case that we used to accept, this doesn't
seem like a back-patchable change. Given the lack of field
complaints about the inconsistent behavior, it's likely that no
one is doing this anyway, but we won't change it in minor releases.
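For illustration (a sketch only; table names are invented), the newly
rejected and newly allowed cases look roughly like this:

    CREATE TABLE parent (a int, b int);
    -- Now rejected: b is not GENERATED in the parent, so a child may
    -- not declare it GENERATED (previously this was accepted):
    CREATE TABLE child1 (b int GENERATED ALWAYS AS (a * 2) STORED)
        INHERITS (parent);

    CREATE TABLE gparent (a int, b int GENERATED ALWAYS AS (a + 1) STORED);
    -- Newly allowed: the child keeps b GENERATED but overrides the
    -- parent's expression with its own:
    CREATE TABLE child2 (b int GENERATED ALWAYS AS (a * 2) STORED)
        INHERITS (gparent);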
Amit Langote and Tom Lane
Discussion: https://postgr.es/m/2793383.1672944799@sss.pgh.pa.us
Commits cf5eb37c5 and e5b8a4c09 each created a new role that they
forgot to remove again. This breaks the use-case of running "make
installcheck" more than once, and it's also against project policy
because it'd be quite unfriendly behavior if one were running
"make installcheck" against a non-throwaway installation.
There are a number of places where a shell command is constructed with
percent-placeholders (like %x). It's cumbersome to have to open-code
this several times. This factors out this logic into a separate
function. This also allows us to ensure consistency for and document
some subtle behaviors, such as what to do with unrecognized
placeholders.
The unified handling is now that incorrect and unknown placeholders
are an error, where previously in most cases they were skipped or
ignored. This affects the following settings:
- archive_cleanup_command
- archive_command
- recovery_end_command
- restore_command
- ssl_passphrase_command
The following settings are part of this refactoring but already had
stricter error handling and should be unchanged in their behavior:
- basebackup_to_shell.command
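For example (a sketch; the archive directory is invented), %p and %f
remain supported for archive_command, while a bogus placeholder now
draws an error when the command is expanded rather than being silently
dropped:

    ALTER SYSTEM SET archive_command = 'cp %p /mnt/server/archivedir/%f';
    -- With this refactoring, a typo such as %g in the command string
    -- is reported as an error instead of being skipped or ignored.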
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/5238bbed-0b01-83a6-d4b2-7eb0562a054e%40enterprisedb.com
Prior to this, we only considered a full sort on the cheapest input path
and uniquifying any path which was already sorted in the required sort
order. Here we adjust create_final_distinct_paths() so that it also
adds an Incremental Sort path on any path which has presorted keys.
Additionally, this adjusts the parallel distinct code so that we now
consider sorting the cheapest partial path and incrementally sorting any
partial paths with presorted keys. Previously we didn't consider any
sorting for parallel distinct and only added a unique path atop any path
which had the required pathkeys already.
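For instance (a sketch with invented names), a plan can now be
considered that sorts only the b column incrementally on top of an
index providing rows presorted by a:

    CREATE TABLE ab (a int, b int);
    CREATE INDEX ON ab (a);
    -- An index scan on ab(a) has presorted keys, so an Incremental
    -- Sort path is now added for the DISTINCT rather than only a full
    -- sort of the cheapest input path:
    SELECT DISTINCT a, b FROM ab;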
Author: David Rowley
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/CAApHDvo8Lz2H=42urBbfP65LTcEUOh288MT7DsG2_EWtW1AXHQ@mail.gmail.com
The new createrole_self_grant GUC can be set to the empty string, or
to either or both of "set" and "inherit". If set to a non-empty value,
a non-superuser who creates a role (necessarily by relying on the
CREATEROLE privilege) will grant that role back to themselves with the
specified options.
This isn't a security feature, because the grant that this feature
triggers can also be performed explicitly. Instead, it's a user experience
feature. A superuser would necessarily inherit the privileges of any
created role and be able to access all such roles via SET ROLE;
with this patch, you can configure createrole_self_grant = 'set, inherit'
to provide a similar experience for a user who has CREATEROLE but not
SUPERUSER.
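A minimal sketch (role names are invented) of what this looks like for
a CREATEROLE user:

    SET createrole_self_grant = 'set, inherit';
    CREATE ROLE app_owner;
    -- The CREATE ROLE above now also performs, implicitly, roughly:
    --   GRANT app_owner TO <creator> WITH SET TRUE, INHERIT TRUE;
    -- so the creator can SET ROLE to app_owner and inherits its
    -- privileges, much as a superuser would.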
Discussion: https://postgr.es/m/CA+TgmobN59ct+Emmz6ig1Nua2Q-_o=r6DSD98KfU53kctq_kQw@mail.gmail.com
Previously, CREATEROLE users were permitted to make nearly arbitrary
changes to roles that they didn't create, with certain exceptions,
particularly superuser roles. Instead, allow CREATEROLE users to make such
changes to roles for which they possess ADMIN OPTION, and to
grant membership only in roles for which they possess ADMIN OPTION.
When a CREATEROLE user who is not a superuser creates a role, grant
ADMIN OPTION on the newly-created role to the creator, so that they
can administer roles they create or for which they have been given
privileges.
With these changes, CREATEROLE users still have very significant
powers that unprivileged users do not receive: they can alter, rename,
drop, comment on, change the password for, and change security labels
on roles. However, they can now do these things only for roles for
which they possess appropriate privileges, rather than all
non-superuser roles; moreover, they cannot grant a role such as
pg_execute_server_program unless they themselves possess it.
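A rough sketch of the new behavior (role names are invented):

    -- alice has CREATEROLE but is not a superuser:
    CREATE ROLE alice CREATEROLE LOGIN;

    -- As alice: the role she creates is granted back to her with
    -- ADMIN OPTION, so she can administer it afterwards:
    CREATE ROLE app_user LOGIN;
    ALTER ROLE app_user PASSWORD 'secret';

    -- But she can no longer grant membership in a role she lacks
    -- ADMIN OPTION on; this now fails unless alice possesses that role:
    GRANT pg_execute_server_program TO app_user;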
Patch by me, reviewed by Mark Dilger.
Discussion: https://postgr.es/m/CA+TgmobN59ct+Emmz6ig1Nua2Q-_o=r6DSD98KfU53kctq_kQw@mail.gmail.com
As I suspected, some machines have even more low-order-bit
inaccuracy than the ones I tested. Tweak new test so that
(hopefully) it will pass everywhere. Per buildfarm.
Discussion: https://postgr.es/m/4173840.1673290336@sss.pgh.pa.us
We aren't using this anymore in the wake of commit 09d517773,
so delete it. We can always revert this if some future use
emerges, but I think our standards for test quality are now
high enough that that will never happen.
Discussion: https://postgr.es/m/4173840.1673290336@sss.pgh.pa.us
We had some pretty ad-hoc and inefficient code here. To make
matters worse, it didn't test the properties of the random()
function very thoroughly, and it had a test failure rate of
one in every few tens of thousands of runs. Replace the
script altogether with new test cases that prove much more
about random()'s output, run faster, and can be calculated
to have test failure rates on the order of 1e-9.
Having done that, the failure rate of this script should be
negligible in comparison to other causes of test failures,
so remove the "ignore" marker for it in parallel_schedule.
(If it does fail, we'd like to know about that, so "ignore"
was always pretty counterproductive.)
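As a simplified sketch of the kind of property now checked (not the
exact test; the statistical checks are what carry the ~1e-9 failure
rates):

    -- random() must stay within [0, 1); any row here is a hard failure:
    SELECT count(*) = 0 AS ok
    FROM (SELECT random() AS r FROM generate_series(1, 2000)) ss
    WHERE r < 0 OR r >= 1;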
Tom Lane and Dean Rasheed
Discussion: https://postgr.es/m/4173840.1673290336@sss.pgh.pa.us
This allows left join removals and unique joins to work with partitioned
tables. The planner just lacked sufficient proofs that a given join
would not cause any row duplication. Unique indexes currently serve as
that proof, so have get_relation_info() populate the indexlist for
partitioned tables too.
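A sketch of a query shape (invented tables) that can now be simplified:

    CREATE TABLE p (id int PRIMARY KEY) PARTITION BY RANGE (id);
    CREATE TABLE p1 PARTITION OF p FOR VALUES FROM (0) TO (1000);
    CREATE TABLE t (id int);
    -- No column of p is referenced and p.id is unique, so the left
    -- join cannot duplicate rows of t and can now be removed even
    -- though p is partitioned:
    SELECT t.id FROM t LEFT JOIN p ON p.id = t.id;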
Author: Arne Roland
Reviewed-by: Alvaro Herrera, Zhihong Yu, Amit Langote, David Rowley
Discussion: https://postgr.es/m/c3b2408b7a39433b8230bbcd02e9f302@index.de
Currently, for large transactions, the publisher sends the data in
multiple streams (changes divided into chunks depending upon
logical_decoding_work_mem), and then on the subscriber-side, the apply
worker writes the changes into temporary files and once it receives the
commit, it reads from those files and applies the entire transaction. To
improve the performance of such transactions, we can instead allow them to
be applied via parallel workers.
In this approach, we assign a new parallel apply worker (if available) as
soon as the xact's first stream is received and the leader apply worker
will send changes to this new worker via shared memory. The parallel apply
worker will directly apply the change instead of writing it to temporary
files. However, if the leader apply worker times out while attempting to
send a message to the parallel apply worker, it will switch to
"partial serialize" mode - in this mode, the leader serializes all
remaining changes to a file and notifies the parallel apply workers to
read and apply them at the end of the transaction. We use a non-blocking
way to send the messages from the leader apply worker to the parallel
apply worker to avoid deadlocks. We keep this parallel apply worker
assigned until the transaction commit is received and also wait for the
worker to finish at commit. This preserves commit ordering and avoids
writing to and reading from files in most cases. We still need to spill
if there is no worker available.
This patch also extends the SUBSCRIPTION 'streaming' parameter so that the
user can control whether to apply the streaming transaction in a parallel
apply worker or spill the change to disk. The user can set the streaming
parameter to 'on/off', or 'parallel'. The parameter value 'parallel' means
the streaming will be applied via a parallel apply worker, if available.
The parameter value 'on' means the streaming transaction will be spilled
to disk. The default value is 'off' (same as current behaviour).
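For example (connection string and names are placeholders):

    CREATE SUBSCRIPTION mysub
        CONNECTION 'host=publisher dbname=postgres'
        PUBLICATION mypub
        WITH (streaming = parallel);

    -- or, for an existing subscription:
    ALTER SUBSCRIPTION mysub SET (streaming = parallel);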
In addition, the patch extends the logical replication STREAM_ABORT
message so that abort_lsn and abort_time can also be sent which can be
used to update the replication origin in parallel apply worker when the
streaming transaction is aborted. Because this message extension is needed
to support parallel streaming, parallel streaming is not supported for
publications on servers < PG16.
Author: Hou Zhijie, Wang wei, Amit Kapila with design inputs from Sawada Masahiko
Reviewed-by: Sawada Masahiko, Peter Smith, Dilip Kumar, Shi yu, Kuroda Hayato, Shveta Mallik
Discussion: https://postgr.es/m/CAA4eK1+wyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw@mail.gmail.com
GIN index scans were not taking any descent CPU-based cost into account. That
made them look cheaper than other types of indexes when they shouldn't be.
We use the same heuristic as for btree indexes, but multiply it by the number
of searched entries.
Additionally, the CPU cost for the tree was based largely on a
genericcostestimate. For a GIN index, we should not charge index quals per
tuple, but per entry. On top of this, charge cpu_index_tuple_cost per actual
tuple.
This should fix the cases where a GIN index is preferred over a btree and
the ones where a memoize node is not added on top of the GIN index scan
because it seemed too cheap.
We don't backpatch this, to avoid unexpected plan changes in stable versions.
Discussion: https://postgr.es/m/CABs3KGQnOkyQ42-zKQqiE7M0Ks9oWDSee%3D%2BJx3-TGq%3D68xqWYw%40mail.gmail.com
Discussion: https://postgr.es/m/3188617.44csPzL39Z%40aivenronan
Author: Ronan Dunklau
Reported-By: Hung Nguyen
Reviewed-by: Tom Lane, Alexander Korotkov
Check the remote relkind before trying to use TABLESAMPLE to acquire
sample from the remote relation. Even if the remote server version has
TABLESAMPLE support, the foreign table may point to an incompatible
relkind (e.g. a view or a sequence).
If the relkind does not support TABLESAMPLE, error out if TABLESAMPLE
was requested specifically (as system/bernoulli), or fall back to
random sampling just like we do for old server versions.
We currently end up disabling sampling for such relkind values anyway,
due to reltuples being -1 or 1, but that seems rather accidental, and
might get broken by improving reltuples estimates, etc. So better to
make the check explicit.
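A sketch of the resulting behavior (names are invented; this assumes
postgres_fdw's analyze_sampling option):

    CREATE FOREIGN TABLE remote_view_ft (a int)
        SERVER remote_srv OPTIONS (table_name 'some_view');

    -- 'auto' quietly falls back to random sampling for the view:
    ALTER FOREIGN TABLE remote_view_ft OPTIONS (ADD analyze_sampling 'auto');
    ANALYZE remote_view_ft;

    -- but an explicitly requested TABLESAMPLE method is now an error
    -- for a relkind that doesn't support it:
    ALTER FOREIGN TABLE remote_view_ft OPTIONS (SET analyze_sampling 'bernoulli');
    ANALYZE remote_view_ft;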
Reported-by: Tom Lane
Discussion: https://postgr.es/m/951485.1672461744%40sss.pgh.pa.us
This allows an optional "S" modifier to be added to \dp and \z, to
have them include system objects in the list.
Note that this also changes the behaviour of a bare \dp or \z without
the "S" modifier to include temp objects in the list, and exclude
information_schema objects, making them consistent with other psql
meta-commands.
Nathan Bossart, reviewed by Maxim Orlov.
Discussion: https://postgr.es/m/20221206193606.GB3078082@nathanxps13
After restart, we try to stream the changes for large transactions that
were not sent before server crash and restart. However, we forget to send
the abort message for such transactions. This leads to spurious streaming
files on the subscriber which won't be cleaned up until the apply worker
or the subscriber server restarts.
Reported-by: Dilip Kumar
Author: Hou Zhijie
Reviewed-by: Dilip Kumar and Amit Kapila
Backpatch-through: 14
Discussion: https://postgr.es/m/OS0PR01MB5716A773F46768A1B75BE24394FB9@OS0PR01MB5716.jpnprd01.prod.outlook.com
During the development of 728202b63, which was aimed at reducing the
number of sorts required to evaluate multiple window functions with
different WindowClause definitions, the code written sorted the
WindowClauses in reverse tleSortGroupRef order. There appears to be no
discussion in the thread opened for the development of this patch, nor
any comments, mentioning the fact that having the WindowClauses in
reverse tleSortGroupRef order makes it more likely that the final
WindowClause to be evaluated will provide presorted input to the
query's DISTINCT or ORDER BY clause. The reason for this is that the
tleSortGroupRef indexes are assigned for the DISTINCT and ORDER BY
clauses before they are for the WindowClauses' PARTITION BY and ORDER
BY clauses.
Putting the WindowClause with the lowest tleSortGroupRef last means that
it's more likely that no additional sorting is required for the query's
DISTINCT or ORDER BY clause.
All we're doing here is adding some tests and a comment to help ensure
that remains true and that we don't accidentally forget to consider this
again should we ever rewrite that code.
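A sketch of the query shape the new tests cover (schema invented); the
intent is that the window clause evaluated last leaves its output at
least presorted for the query's ORDER BY:

    SELECT empno,
           sum(salary) OVER (PARTITION BY depname, empno) AS s,
           count(*)    OVER (PARTITION BY depname)        AS c
    FROM empsalary
    ORDER BY depname, empno;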
Author: Ankit Kumar Pandey, David Rowley
Discussion: https://postgr.es/m/CAApHDvq=g2=ny59f1bvwRVvupsgPHK-KjLPBsSL25fVuGZ4idQ@mail.gmail.com
Waken related worker processes immediately at commit of a transaction
that has performed ALTER SUBSCRIPTION (including the RENAME and
OWNER variants). This reduces the response time for such operations.
In the real world that might not be worth much, but it shaves several
seconds off the runtime for the subscription test suite.
In the case of PREPARE, we just throw away this notification state;
it doesn't seem worth the work to preserve it. The workers will
still react after the eventual COMMIT PREPARED, but not as quickly.
Nathan Bossart
Discussion: https://postgr.es/m/20221122004119.GA132961@nathanxps13
Previously this function checked to see if we were ready to switch
to two_phase mode at its start, but that's silly: we should check
at the end, after we've done the work that might make us ready.
This simple change removes one sleep cycle from the time needed to
switch to two_phase mode. In the real world that might not be
worth much, but it shaves a few seconds off the runtime for the
subscription test suite.
Nathan Bossart
Discussion: https://postgr.es/m/20221122004119.GA132961@nathanxps13
VACUUM normally ends by running vac_update_datfrozenxid(), which
requires a scan of pg_class. Therefore, if one attempts to vacuum a
database one table at a time --- as vacuumdb has done since v12 ---
we will spend O(N^2) time in vac_update_datfrozenxid(). That causes
serious performance problems in databases with tens of thousands of
tables, and indeed the effect is measurable with only a few hundred.
To add insult to injury, only one process can run
vac_update_datfrozenxid at the same time per DB, so this behavior
largely defeats vacuumdb's -j option.
Hence, invent options SKIP_DATABASE_STATS and ONLY_DATABASE_STATS
to allow applications to postpone vac_update_datfrozenxid() until the
end of a series of VACUUM requests, and teach vacuumdb to use them.
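For example, a series of per-table VACUUMs (table names are
placeholders) can now defer the pg_class scan to one final call:

    VACUUM (SKIP_DATABASE_STATS) t1;
    VACUUM (SKIP_DATABASE_STATS) t2;
    VACUUM (SKIP_DATABASE_STATS) t3;
    -- a single pass over pg_class at the end:
    VACUUM (ONLY_DATABASE_STATS);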
Per bug #17717 from Gunnar L. Sadly, this answer doesn't seem
like something we'd consider back-patching, so the performance
problem will remain in v12-v15.
Tom Lane and Nathan Bossart
Discussion: https://postgr.es/m/17717-6c50eb1c7d23a886@postgresql.org
A schema rename should cause the new qualified names of its tables to
be reported to logical replication subscribers, but that wasn't happening.
Flush the RelationSyncCache to make it happen.
(If you ask me, the new test case shows that the behavior in this area
is still pretty dubious, but apparently it's operating as designed.)
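For example (schema names invented), after the following on the
publisher, changes to tables in the schema are again sent under the
correct qualified name:

    ALTER SCHEMA sch1 RENAME TO sch2;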
Vignesh C
Discussion: https://postgr.es/m/CALDaNm32vLRv5KdrDFeVC-CU+4Wg1daA55hMqOxDGJBzvd76-w@mail.gmail.com
A comment was left behind referencing the sample rate adjustment
removed from 8ad51b5f44. So clean that up. While at it, also remove
the sample rate clamping, which should not be necessary without the
adjustment, and just check that with an assert.
Reported-by: Tom Lane
Discussion: https://postgr.es/m/951485.1672461744%40sss.pgh.pa.us
The ALTER DATABASE|FUNCTION|PROCEDURE|ROLE|ROUTINE|USER ... SET <name>
case in psql tab completion failed to exclude <name> = "SCHEMA", which
caused ALTER FUNCTION|PROCEDURE|ROUTINE ... SET SCHEMA to complete
with "FROM CURRENT" and "TO", which won't work.
Fix that, so that those cases now complete with the list of schemas,
like other ALTER ... SET SCHEMA commands.
Noticed while testing the recent patch to improve tab completion for
ALTER FUNCTION/PROCEDURE/ROUTINE, but this is not directly related to
that patch. Rather, this is a long-standing bug, so back-patch to all
supported branches.
Discussion: https://postgr.es/m/CALDaNm0s7GQmkLP_mx5Cvk=UzYMnjhPmXBxU8DsHEunFbC5sTg@mail.gmail.com