Commit Graph

42710 Commits

Author SHA1 Message Date
Tom Lane b4d8be9f6d Fix postmaster's behavior during smart shutdown.
Up to now, upon receipt of a SIGTERM ("smart shutdown" command), the
postmaster has immediately killed all "optional" background processes,
and subsequently refused to launch new ones while it's waiting for
foreground client processes to exit.  No doubt this seemed like an OK
policy at some point; but it's a pretty bad one now, because it makes
for a seriously degraded environment for the remaining clients:

* Parallel queries are killed, and new ones fail to launch. (And our
parallel-query infrastructure utterly fails to deal with the case
in a reasonable way --- it just hangs waiting for workers that are
not going to arrive.  There is more work needed in that area IMO.)

* Autovacuum ceases to function.  We can tolerate that for awhile,
but if bulk-update queries continue to run in the surviving client
sessions, there's eventually going to be a mess.  In the worst case
the system could reach a forced shutdown to prevent XID wraparound.

* The bgwriter and walwriter are also stopped immediately, likely
resulting in performance degradation.

Hence, let's rearrange things so that the only immediate change in
behavior is refusing to let in new normal connections.  Once the last
normal connection is gone, shut everything down as though we'd received
a "fast" shutdown.  To implement this, remove the PM_WAIT_BACKUP and
PM_WAIT_READONLY states, instead staying in PM_RUN or PM_HOT_STANDBY
while normal connections remain.  A subsidiary state variable tracks
whether or not we're letting in new connections in those states.

This also allows having just one copy of the logic for killing child
processes in smart and fast shutdown modes.  I moved that logic into
PostmasterStateMachine() by inventing a new state PM_STOP_BACKENDS.

Back-patch to 9.6 where parallel query was added.  In principle
this'd be a good idea in 9.5 as well, but the risk/reward ratio
is not as good there, since lack of autovacuum is not a problem
during typical uses of smart shutdown.

Per report from Bharath Rupireddy.

Patch by me, reviewed by Thomas Munro

Discussion: https://postgr.es/m/CALj2ACXAZ5vKxT9P7P89D87i3MDO9bfS+_bjMHgnWJs8uwUOOw@mail.gmail.com
2020-08-14 13:26:57 -04:00
Alvaro Herrera e5902117dd
Handle new HOT chains in index-build table scans
When a table is scanned by heapam_index_build_range_scan (née
IndexBuildHeapScan) and the table lock being held allows concurrent data
changes, it is possible for new HOT chains to sprout in a page that were
unknown when the scan of a page happened.  This leads to an error such
as
  ERROR:  failed to find parent tuple for heap-only tuple at (X,Y) in table "tbl"
because the root tuple was not present when we first obtained the list
of the page's root tuples.  This can be fixed by re-obtaining the list
of root tuples, if we see that a heap-only tuple appears to point to a
non-existing root.

This was reported by Anastasia as occurring for BRIN summarization
(which exists since 9.5), but I think it could theoretically also happen
with CREATE INDEX CONCURRENTLY (much older) or REINDEX CONCURRENTLY
(very recent).  It seems a happy coincidence that BRIN forces us to
backpatch this all the way to 9.5.

Reported-by: Anastasia Lubennikova <a.lubennikova@postgrespro.ru>
Diagnosed-by: Anastasia Lubennikova <a.lubennikova@postgrespro.ru>
Co-authored-by: Anastasia Lubennikova <a.lubennikova@postgrespro.ru>
Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/602d8487-f0b2-5486-0088-0f372b2549fa@postgrespro.ru
Backpatch: 9.5 - master
2020-08-13 17:33:49 -04:00
Alvaro Herrera 7a3c261fbb
BRIN: Handle concurrent desummarization properly
If a page range is desummarized at just the right time concurrently with
an index walk, BRIN would raise an error indicating index corruption.
This is scary and unhelpful; silently returning that the page range is
not summarized is sufficient reaction.

This bug was introduced by commit 975ad4e602 as additional protection
against a bug whose actual fix was elsewhere.  Backpatch equally.

Reported-By: Anastasia Lubennikova <a.lubennikova@postgrespro.ru>
Diagnosed-By: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/2588667e-d07d-7e10-74e2-7e1e46194491@postgrespro.ru
Backpatch: 9.5 - master
2020-08-12 15:33:36 -04:00
Tom Lane 03f5894b35 Stamp 9.6.19. 2020-08-10 17:21:12 -04:00
Tom Lane a7e51a4076 Last-minute updates for release notes.
Security: CVE-2020-14349, CVE-2020-14350
2020-08-10 15:35:46 -04:00
Tom Lane 2ea8a60fc4 Make contrib modules' installation scripts more secure.
Hostile objects located within the installation-time search_path could
capture references in an extension's installation or upgrade script.
If the extension is being installed with superuser privileges, this
opens the door to privilege escalation.  While such hazards have existed
all along, their urgency increases with the v13 "trusted extensions"
feature, because that lets a non-superuser control the installation path
for a superuser-privileged script.  Therefore, make a number of changes
to make such situations more secure:

* Tweak the construction of the installation-time search_path to ensure
that references to objects in pg_catalog can't be subverted; and
explicitly add pg_temp to the end of the path to prevent attacks using
temporary objects.

* Disable check_function_bodies within installation/upgrade scripts,
so that any security gaps in SQL-language or PL-language function bodies
cannot create a risk of unwanted installation-time code execution.

* Adjust lookup of type input/receive functions and join estimator
functions to complain if there are multiple candidate functions.  This
prevents capture of references to functions whose signature is not the
first one checked; and it's arguably more user-friendly anyway.

* Modify various contrib upgrade scripts to ensure that catalog
modification queries are executed with secure search paths.  (These
are in-place modifications with no extension version changes, since
it is the update process itself that is at issue, not the end result.)

Extensions that depend on other extensions cannot be made fully secure
by these methods alone; therefore, revert the "trusted" marking that
commit eb67623c9 applied to earthdistance and hstore_plperl, pending
some better solution to that set of issues.

Also add documentation around these issues, to help extension authors
write secure installation scripts.

Patch by me, following an observation by Andres Freund; thanks
to Noah Misch for review.

Security: CVE-2020-14350
2020-08-10 10:44:43 -04:00
Peter Eisentraut 145d297048 Translation updates
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: f7a31187f2c0d0988e21cb6f3d788957c5d59fd8
2020-08-10 15:34:18 +02:00
Tom Lane 105dbff875 Check for fseeko() failure in pg_dump's _tarAddFile().
Coverity pointed out, not unreasonably, that we checked fseeko's
result at every other call site but these.  Failure to seek in the
temp file (note this is NOT pg_dump's output file) seems quite
unlikely, and even if it did happen the file length cross-check
further down would probably detect the problem.  Still, that's a
poor excuse for not checking the result of a system call.
2020-08-09 12:39:08 -04:00
Tom Lane 011aa7c0bc Release notes for 12.4, 11.9, 10.14, 9.6.19, 9.5.23. 2020-08-08 20:01:41 -04:00
Alvaro Herrera 55d42c9178
walsnd: Don't set waiting_for_ping_response spuriously
Ashutosh Bapat noticed that when logical walsender needs to wait for
WAL, and it realizes that it must send a keepalive message to
walreceiver to update the sent-LSN, which *does not* request a reply
from walreceiver, it wrongly sets the flag that it's going to wait for
that reply.  That means that any future would-be sender of feedback
messages ends up not sending a feedback message, because they all
believe that a reply is expected.

With built-in logical replication there's not much harm in this, because
WalReceiverMain will send a ping-back every wal_receiver_timeout/2
anyway; but with other logical replication systems (e.g. pglogical) it
can cause significant pain.

This problem was introduced in commit 41d5f8ad73, where the
request-reply flag was changed from true to false to WalSndKeepalive,
without at the same time removing the line that sets
waiting_for_ping_response.

Just removing that line would be a sufficient fix, but it seems better
to shift the responsibility of setting the flag to WalSndKeepalive
itself instead of requiring caller to do it; this is clearly less
error-prone.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Ashutosh Bapat <ashutosh.bapat@2ndquadrant.com>
Backpatch: 9.5 and up
Discussion: https://postgr.es/m/20200806225558.GA22401@alvherre.pgsql
2020-08-08 12:31:55 -04:00
Bruce Momjian d06c96185c doc: clarify "state" table reference in tutorial
Reported-by: Vyacheslav Shablistyy

Discussion: https://postgr.es/m/159586122762.680.1361378513036616007@wrigleys.postgresql.org

Backpatch-through: 9.5
2020-08-05 17:12:09 -04:00
Tom Lane 1fac10d9fd Increase hard-wired timeout values in ecpg regression tests.
A couple of test cases had connect_timeout=14, a value that seems
to have been plucked from a hat.  While it's more than sufficient
for normal cases, slow/overloaded buildfarm machines can get a timeout
failure here, as per recent report from "sungazer".  Increase to 180
seconds, which is in line with our typical timeouts elsewhere in
the regression tests.

Back-patch to 9.6; the code looks different in 9.5, and this doesn't
seem to be quite worth the effort to adapt to that.

Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer&dt=2020-08-04%2007%3A12%3A22
2020-08-04 15:20:31 -04:00
Tom Lane 4399909af8 Doc: fix obsolete info about allowed range of TZ offsets in timetz.
We've allowed UTC offsets up to +/- 15:59 since commit cd0ff9c0f, but
that commit forgot to fix the documentation about timetz.

Per bug #16571 from osdba.

Discussion: https://postgr.es/m/16571-eb7501598de78c8a@postgresql.org
2020-08-03 13:11:16 -04:00
Tom Lane 8ab00b9fe9 Fix recently-introduced performance problem in ts_headline().
The new hlCover() algorithm that I introduced in commit c9b0c678d
turns out to potentially take O(N^2) or worse time on long documents,
if there are many occurrences of individual query words but few or no
substrings that actually satisfy the query.  (One way to hit this
behavior is with a "common_word & rare_word" type of query.)  This
seems unavoidable given the original goal of checking every substring
of the document, so we have to back off that idea.  Fortunately, it
seems unlikely that anyone would really want headlines spanning all of
a long document, so we can avoid the worse-than-linear behavior by
imposing a maximum length of substring that we'll consider.

For now, just hard-wire that maximum length as a multiple of max_words
times max_fragments.  Perhaps at some point somebody will argue for
exposing it as a ts_headline parameter, but I'm hesitant to make such
a feature addition in a back-patched bug fix.

I also noted that the hlFirstIndex() function I'd added in that
commit was unnecessarily stupid: it really only needs to check whether
a HeadlineWordEntry's item pointer is null or not.  This wouldn't make
all that much difference in typical cases with queries having just
a few terms, but a cycle shaved is a cycle earned.

In addition, add a CHECK_FOR_INTERRUPTS call in TS_execute_recurse.
This ensures that hlCover's loop is cancellable if it manages to take
a long time, and it may protect some other TS_execute callers as well.

Back-patch to 9.6 as the previous commit was.  I also chose to add the
CHECK_FOR_INTERRUPTS call to 9.5.  The old hlCover() algorithm seems
to avoid the O(N^2) behavior, at least on the test case I tried, but
nonetheless it's not very quick on a long document.

Per report from Stephen Frost.

Discussion: https://postgr.es/m/20200724160535.GW12375@tamriel.snowman.net
2020-07-31 11:43:13 -04:00
Tatsuo Ishii 8680303f9b Doc: fix high availability solutions comparison.
In "High Availability, Load Balancing, and Replication" chapter,
certain descriptions of Pgpool-II were not correct at this point.  It
does not need conflict resolution. Also "Multiple-Server Parallel
Query Execution" is not supported anymore.

Discussion: https://postgr.es/m/20200726.230128.53842489850344110.t-ishii%40sraoss.co.jp
Author: Tatsuo Ishii
Reviewed-by: Bruce Momjian
Backpatch-through: 9.5
2020-07-31 07:56:38 +09:00
Peter Geoghegan e3ba932da3 Backpatch tuplesort.c assertion.
Backpatch an assertion (that was originally added to Postgres 12 by
commit dd299df818) that seems broadly useful.  The assertion can detect
violations of the HOT invariant (i.e. no two index tuples can point to
the same heap TID) when CREATE INDEX somehow incorrectly allows that to
take place.

For example, a IndexBuildHeapScan/heapam_index_build_range_scan bug
might result in two tuples that both point to the same heap TID.  If
these two tuples also happen to be duplicates, the assertion will fail.

Discussion: https://postgr.es/m/CAH2-WzmBxu4o=pMsniur+bwHqCGCmV_AOLkuK6BuU7ngA6evqw@mail.gmail.com
Backpatch: 9.5-11 only
2020-07-29 16:00:50 -07:00
Michael Paquier 8a60f541f3 Fix corner case with 16kB-long decompression in pgcrypto, take 2
A compressed stream may end with an empty packet.  In this case
decompression finishes before reading the empty packet and the
remaining stream packet causes a failure in reading the following
data.  This commit makes sure to consume such extra data, avoiding a
failure when decompression the data.  This corner case was reproducible
easily with a data length of 16kB, and existed since e94dd6a.  A cheap
regression test is added to cover this case based on a random,
incompressible string.

The first attempt of this patch has allowed to find an older failure
within the compression logic of pgcrypto, fixed by b9b6105.  This
involved SLES 15 with z390 where a custom flavor of libz gets used.
Bonus thanks to Mark Wong for providing access to the specific
environment.

Reported-by: Frank Gagnepain
Author: Kyotaro Horiguchi, Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/16476-692ef7b84e5fb893@postgresql.org
Backpatch-through: 9.5
2020-07-27 15:59:13 +09:00
Tom Lane ccf964a801 Fix ancient violation of zlib's API spec.
contrib/pgcrypto mishandled the case where deflate() does not consume
all of the offered input on the first try.  It reset the next_in pointer
to the start of the input instead of leaving it alone, causing the wrong
data to be fed to the next deflate() call.

This has been broken since pgcrypto was committed.  The reason for the
lack of complaints seems to be that it's fairly hard to get stock zlib
to not consume all the input, so long as the output buffer is big enough
(which it normally would be in pgcrypto's usage; AFAICT the input is
always going to be packetized into packets no larger than ZIP_OUT_BUF).
However, IBM's zlibNX implementation for AIX evidently will do it
in some cases.

I did not add a test case for this, because I couldn't find one that
would fail with stock zlib.  When we put back the test case for
bug #16476, that will cover the zlibNX situation well enough.

While here, write deflate()'s second argument as Z_NO_FLUSH per its
API spec, instead of hard-wiring the value zero.

Per buildfarm results and subsequent investigation.

Discussion: https://postgr.es/m/16476-692ef7b84e5fb893@postgresql.org
2020-07-23 17:20:04 -04:00
Peter Eisentraut edfc08652a doc: Document that ssl_ciphers does not affect TLS 1.3
TLS 1.3 uses a different way of specifying ciphers and a different
OpenSSL API.  PostgreSQL currently does not support setting those
ciphers.  For now, just document this.  In the future, support for
this might be added somehow.

Reviewed-by: Jonathan S. Katz <jkatz@postgresql.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
2020-07-23 20:56:40 +02:00
Thomas Munro 47adb24882 Fix error message.
Remove extra space.  Back-patch to all releases, like commit 7897e3bb.

Author: Lu, Chenyang <lucy.fnst@cn.fujitsu.com>
Discussion: https://postgr.es/m/795d03c6129844d3803e7eea48f5af0d%40G08CNEXMBPEKD04.g08.fujitsu.local
2020-07-23 21:18:25 +12:00
Michael Paquier 11c320dceb Revert "Fix corner case with PGP decompression in pgcrypto"
This reverts commit 9e10898, after finding out that buildfarm members
running SLES 15 on z390 complain on the compression and decompression
logic of the new test: pipistrelles, barbthroat and steamerduck.

Those hosts are visibly using hardware-specific changes to improve zlib
performance, requiring more investigation.

Thanks to Tom Lane for the discussion.

Discussion: https://postgr.es/m/20200722093749.GA2564@paquier.xyz
Backpatch-through: 9.5
2020-07-23 08:29:33 +09:00
Michael Paquier e0cb8282d6 Fix corner case with PGP decompression in pgcrypto
A compressed stream may end with an empty packet, and PGP decompression
finished before reading this empty packet in the remaining stream.  This
caused a failure in pgcrypto, handling this case as corrupted data.
This commit makes sure to consume such extra data, avoiding a failure
when decompression the entire stream.  This corner case was reproducible
with a data length of 16kB, and existed since its introduction in
e94dd6a.  A cheap regression test is added to cover this case.

Thanks to Jeff Janes for the extra investigation.

Reported-by: Frank Gagnepain
Author: Kyotaro Horiguchi, Michael Paquier
Discussion: https://postgr.es/m/16476-692ef7b84e5fb893@postgresql.org
Backpatch-through: 9.5
2020-07-22 14:53:22 +09:00
Michael Paquier 5cd4954bb6 doc: Refresh more URLs in the docs
This updates some URLs that are redirections, mostly to an equivalent
using https.  One URL referring to generalized partial indexes was
outdated.

Author: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20200717.121308.1369606287593685396.horikyota.ntt@gmail.com
Backpatch-through: 9.5
2020-07-18 22:43:59 +09:00
Tom Lane 39064225a6 Ensure that distributed timezone abbreviation files are plain ASCII.
We had two occurrences of "Mitteleuropäische Zeit" in Europe.txt,
though the corresponding entries in Default were spelled
"Mitteleuropaeische Zeit".  Standardize on the latter spelling to
avoid questions of which encoding to use.

While here, correct a couple of other trivial inconsistencies between
the Default file and the supposedly-matching entries in the *.txt
files, as exposed by some checking with comm(1).  Also, add BDST to
the Europe.txt file; it previously was only listed in Default.
None of this has any direct functional effect.

Per complaint from Christoph Berg.  As usual for timezone data patches,
apply to all branches.

Discussion: https://postgr.es/m/20200716100743.GE3534683@msg.df7cb.de
2020-07-17 11:04:52 -04:00
Michael Paquier a452b239e3 Switch pg_test_fsync to use binary mode on Windows
pg_test_fsync has always opened files using the text mode on Windows, as
this is the default mode used if not enforced by _setmode().

This fixes a failure when running pg_test_fsync down to 12 because
O_DSYNC and the text mode are not able to work together nicely.  We
fixed the handling of O_DSYNC in 12~ for the tool by switching to the
concurrent-safe version of fopen() in src/port/ with 0ba06e0.  And
40cfe86, by enforcing the text mode for compatibility reasons if O_TEXT
or O_BINARY are not specified by the caller, broke pg_test_fsync.  For
all versions, this avoids any translation overhead, and pg_test_fsync
should test binary writes, so it is a gain in all cases.

Note that O_DSYNC is still not handled correctly in ~11, leading to
pg_test_fsync to show insanely high numbers for open_datasync() (using
this property it is easy to notice that the binary mode is much
faster).  This would require a backpatch of 0ba06e0 and 40cfe86, which
could potentially break existing applications, so this is left out.

There are no TAP tests for this tool yet, so I have checked all builds
manually using MSVC.  We could invent a new option to run a single
transaction instead of using a duration of 1s to make the tests a
maximum short, but this is left as future work.

Thanks to Bruce Momjian for the discussion.

Reported-by: Jeff Janes
Author: Michael Paquier
Discussion: https://postgr.es/m/16526-279ded30a230d275@postgresql.org
Backpatch-through: 9.5
2020-07-16 15:53:09 +09:00
Tom Lane 9e043d93c8 Replace use of sys_siglist[] with strsignal().
This commit back-patches the v12-era commits a73d08319, cc92cca43,
and 7570df0f3 into supported pre-v12 branches.  The net effect is to
eliminate our former dependency on the never-standard sys_siglist[]
array, instead using POSIX-standard strsignal(3).

What motivates doing this now is that glibc just removed sys_siglist[]
from the set of symbols available to newly-built programs.  While our
code can survive without sys_siglist[], it then fails to print any
description of the signal that killed a child process, which is a
non-negligible loss of friendliness.  We can expect that people will
be wanting to build the back branches on platforms that include this
change, so we need to do something.

Since strsignal(3) has existed for quite a long time, and we've not
had any trouble with these patches so far in v12, it seems safe to
back-patch into older branches.

Discussion: https://postgr.es/m/3179114.1594853308@sss.pgh.pa.us
2020-07-15 22:05:13 -04:00
Michael Paquier 14fe804139 Fix handling of missing files when using pg_rewind with online source
When working with an online source cluster, pg_rewind gets a list of all
the files in the source data directory using a WITH RECURSIVE query,
returning a NULL result for a file's metadata if it gets removed between
the moment it is listed in a directory and the moment its metadata is
obtained with pg_stat_file() (say a recycled WAL segment).  The query
result was processed in such a way that for each tuple we checked only
that the first file's metadata was NULL.  This could have two
consequences, both resulting in a failure of the rewind:
- If the first tuple referred to a removed file, all files from the
source would be ignored.
- Any file actually missing would not be considered as such.

While on it, rework slightly the code so as no values are saved if we
know that a file is going to be skipped.

Issue introduced by b36805f, so backpatch down to 9.5.

Author: Justin Pryzby, Michael Paquier
Reviewed-by: Daniel Gustafsson, Masahiko Sawada
Discussion: https://postgr.es/m/20200713061010.GC23581@telsasoft.com
Backpatch-through: 9.5
2020-07-15 15:17:55 +09:00
David Rowley c732df133a Fix timing issue with ALTER TABLE's validate constraint
An ALTER TABLE to validate a foreign key in which another subcommand
already caused a pending table rewrite could fail due to ALTER TABLE
attempting to validate the foreign key before the actual table rewrite
takes place.  This situation could result in an error such as:

ERROR:  could not read block 0 in file "base/nnnnn/nnnnn": read only 0 of 8192 bytes

The failure here was due to the SPI call which validates the foreign key
trying to access an index which is yet to be rebuilt.

Similarly, we also incorrectly tried to validate CHECK constraints before
the heap had been rewritten.

The fix for both is to delay constraint validation until phase 3, after
the table has been rewritten.  For CHECK constraints this means a slight
behavioral change.  Previously ALTER TABLE VALIDATE CONSTRAINT on
inheritance tables would be validated from the bottom up.  This was
different from the order of evaluation when a new CHECK constraint was
added.  The changes made here aligns the VALIDATE CONSTRAINT evaluation
order for inheritance tables to be the same as ADD CONSTRAINT, which is
generally top-down.

Reported-by: Nazli Ugur Koyluoglu, using SQLancer
Discussion: https://postgr.es/m/CAApHDvp%3DZXv8wiRyk_0rWr00skhGkt8vXDrHJYXRMft3TjkxCA%40mail.gmail.com
Backpatch-through: 9.5 (all supported versions)
2020-07-14 17:00:57 +12:00
Tom Lane dc6bb09949 Cope with lateral references in the quals of a subquery RTE.
The qual pushdown logic assumed that all Vars in a restriction clause
must be Vars referencing subquery outputs; but since we introduced
LATERAL, it's possible for such a Var to be a lateral reference instead.
This led to an assertion failure in debug builds.  In a non-debug
build, there might be no ill effects (if qual_is_pushdown_safe decided
the qual was unsafe anyway), or we could get failures later due to
construction of an invalid plan.  I've not gone to much length to
characterize the possible failures, but at least segfaults in the
executor have been observed.

Given that this has been busted since 9.3 and it took this long for
anybody to notice, I judge that the case isn't worth going to great
lengths to optimize.  Hence, fix by just teaching qual_is_pushdown_safe
that such quals are unsafe to push down, matching the previous behavior
when it accidentally didn't fail.

Per report from Tom Ellis.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/20200713175124.GQ8220@cloudinit-builder
2020-07-13 20:38:21 -04:00
Tom Lane 303322c5a4 Avoid trying to restore table ACLs and per-column ACLs in parallel.
Parallel pg_restore has always supposed that ACL items for different
objects are independent and can be restored in parallel without
conflicts.  However, there is one case where this fails: because
REVOKE on a table is defined to also revoke the privilege(s) at
column level, we can't restore per-column ACLs till after we restore
any table-level privileges on their table.  Failure to honor this
restriction can lead to "tuple concurrently updated" errors during
parallel restore, or even to the per-column ACLs silently disappearing
because the table-level REVOKE is executed afterwards.

To fix, add a dependency from each column-level ACL item to its table's
ACL item, if there is one.  Note that this doesn't fix the hazard
for pre-existing archive files, only for ones made with a corrected
pg_dump.  Given that the bug's been there quite awhile without
field reports, I think this is acceptable.

This requires changing the API of pg_dump's dumpACL() function.
To keep its argument list from getting even longer, I removed the
"CatalogId objCatId" argument, which has been unused for ages.

Per report from Justin Pryzby.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/20200706050129.GW4107@telsasoft.com
2020-07-11 13:36:51 -04:00
Tom Lane a20535da72 Doc: update or remove dead external links.
Re-point comp.ai.genetic FAQ link to a more stable address.

Remove stale links to AIX documentation; we don't really need to
tell AIX users how to use their systems.

Remove stale links to HP documentation about SSL.  We've had to
update those twice before, making it increasingly obvious that
HP does not intend them to be stable landing points.  They're
not particularly authoritative, either.  (This change effectively
reverts bbd3bdba3.)

Daniel Gustafsson and Álvaro Herrera, per a gripe from
Kyotaro Horiguchi.  Back-patch, since these links are
just as dead in the back branches.

Discussion: https://postgr.es/m/20200709.161226.204639179120026914.horikyota.ntt@gmail.com
2020-07-10 13:16:00 -04:00
Tom Lane ce9d053423 Tighten up Windows CRLF conversion in our TAP test scripts.
Back-patch commits 91bdf499b and ffb4cee43, so that all branches
agree on when and how to do Windows CRLF conversion.

This should close the referenced thread.  Thanks to Andrew Dunstan
for discussion/review.

Discussion: https://postgr.es/m/412ae8da-76bb-640f-039a-f3513499e53d@gmx.net
2020-07-09 17:38:52 -04:00
Michael Paquier 2d7738a83c doc: Fix incorrect reference to textout in plpgsql examples
This error has survived for 22 years, and has been introduced by
da63386.

Reported-by: Erwin Brandstetter
Discussion: https://postgr.es/m/CAGHENJ57wogGOvGXo5LgWYcqswxafLck8ELqHDR+zrkTPgs_OQ@mail.gmail.com
Backpatch-through: 9.5
2020-07-05 19:36:34 +09:00
Tom Lane 69a3fe6e63 Clamp total-tuples estimates for foreign tables to ensure planner sanity.
After running GetForeignRelSize for a foreign table, adjust rel->tuples
to be at least as large as rel->rows.  This prevents bizarre behavior
in estimate_num_groups() and perhaps other places, especially in the
scenario where rel->tuples is zero because pg_class.reltuples is
(suggesting that ANALYZE has never been run for the table).  As things
stood, we'd end up estimating one group out of any GROUP BY on such a
table, whereas the default group-count estimate is more likely to result
in a sane plan.

Also, clarify in the documentation that GetForeignRelSize has the option
to override the rel->tuples value if it has a better idea of what to use
than what is in pg_class.reltuples.

Per report from Jeff Janes.  Back-patch to all supported branches.

Patch by me; thanks to Etsuro Fujita for review

Discussion: https://postgr.es/m/CAMkU=1xNo9cnan+Npxgz0eK7394xmjmKg-QEm8wYG9P5-CcaqQ@mail.gmail.com
2020-07-03 19:01:22 -04:00
Tom Lane f0b4d9e118 Fix temporary tablespaces for shared filesets some more.
Commit ecd9e9f0b fixed the problem in the wrong place, causing unwanted
side-effects on the behavior of GetNextTempTableSpace().  Instead,
let's make SharedFileSetInit() responsible for subbing in the value
of MyDatabaseTableSpace when the default tablespace is called for.

The convention about what is in the tempTableSpaces[] array is
evidently insufficiently documented, so try to improve that.

It also looks like SharedFileSetInit() is doing the wrong thing in the
case where temp_tablespaces is empty.  It was hard-wiring use of the
pg_default tablespace, but it seems like using MyDatabaseTableSpace
is more consistent with what happens for other temp files.

Back-patch the reversion of PrepareTempTablespaces()'s behavior to
9.5, as ecd9e9f0b was.  The changes in SharedFileSetInit() go back
to v11 where that was introduced.  (Note there is net zero code change
before v11 from these two patch sets, so nothing to release-note.)

Magnus Hagander and Tom Lane

Discussion: https://postgr.es/m/CABUevExg5YEsOvqMxrjoNvb3ApVyH+9jggWGKwTDFyFCVWczGQ@mail.gmail.com
2020-07-03 17:01:34 -04:00
Magnus Hagander cc6af1f2f1 Fix temporary tablespaces for shared filesets
A likely copy/paste error in 98e8b48053 from  back in 2004 would
cause temp tablespace to be reset to InvalidOid if temp_tablespaces
was set to the same value as the primary tablespace in the database.
This would cause shared filesets (such as for parallel hash joins)
to ignore them, putting the temporary files in the default tablespace
instead of the configured one. The bug is in the old code, but it
appears to have been exposed only once we had shared filesets.

Reviewed-By: Daniel Gustafsson
Discussion: https://postgr.es/m/CABUevExg5YEsOvqMxrjoNvb3ApVyH+9jggWGKwTDFyFCVWczGQ@mail.gmail.com
Backpatch-through: 9.5
2020-07-03 15:11:07 +02:00
Bruce Momjian f1a346ec4d doc: clarify that storage parameter values are optional
In a few cases, the documented syntax specified storage parameter values
as required.

Reported-by: galiev_mr@taximaxim.ru

Discussion: https://postgr.es/m/159283163235.684.4482737698910467437@wrigleys.postgresql.org

Backpatch-through: 9.5
2020-06-30 12:26:51 -04:00
Bruce Momjian 1c76a141f3 doc: change pg_upgrade wal_level to be not minimal
Previously it was specified to be only replica.

Discussion: https://postgr.es/m/20200618180058.GK7349@momjian.us

Backpatch-through: 9.5
2020-06-30 11:55:52 -04:00
Noah Misch f1e2172794 Fix documentation of "must be vacuumed within" warning.
Warnings start 10M transactions before xidStopLimit, which is 11M
transactions before wraparound.  The sample WARNING output showed a
value greater than 11M, and its HINT message predated commit
25ec228ef7.  Hence, the sample was
impossible.  Back-patch to 9.5 (all supported versions).
2020-06-27 22:05:08 -07:00
Bruce Momjian 9ac3b6f9d2 doc: mention trigger helper functions in CREATE TRIGGER docs
Reported-by: petermpallesen@gmail.com

Discussion: https://postgr.es/m/159195294959.673.5752624528747900508@wrigleys.postgresql.org

Backpatch-through: 9.5
2020-06-25 18:33:28 -04:00
Bruce Momjian 1885b3f5e5 docs: clarify that CREATE DATABASE does not copy db permissions
That is, those database permissions set by GRANT.

Diagnosed-by: Joseph Nahmias

Discussion: https://postgr.es/m/20200614072613.GA21852@nahmias.net

Backpatch-through: 9.5
2020-06-25 18:22:44 -04:00
Tom Lane 08a2e40d14 Fix compiler warning induced by commit d8b15eeb8.
I forgot that INT64_FORMAT can't be used with sscanf on Windows.
Use the same trick of sscanf'ing into a temp variable as we do in
some other places in zic.c.

The upstream IANA code avoids the portability problem by relying on
<inttypes.h>'s SCNdFAST64 macro.  Once we're requiring C99 in all
branches, we should do likewise and drop this set of diffs from
upstream.  For now, though, a hack seems fine, since we do not
actually care about leapseconds anyway.

Discussion: https://postgr.es/m/4e5d1a5b-143e-e70e-a99d-a3b01c1ae7c3@2ndquadrant.com
2020-06-24 15:48:12 -04:00
Tom Lane bd53ea2b27 Undo double-quoting of index names in non-text EXPLAIN output formats.
explain_get_index_name() applied quote_identifier() to the index name.
This is fine for text output, but the non-text output formats all have
their own quoting conventions and would much rather start from the
actual index name.  For example in JSON you'd get something like

       "Index Name": "\"My Index\"",

which is surely not desirable, especially when the same does not
happen for table names.  Hence, move the responsibility for applying
quoting out to the callers, where it can go into already-existing
special code paths for text format.

This changes the API spec for users of explain_get_index_name_hook:
before, they were supposed to apply quote_identifier() if necessary,
now they should not.  Research suggests that the only publicly
available user of the hook is hypopg, and it actually forgot to
apply quoting anyway, so it's fine.  (In any case, there's no
behavioral change for the output of a hook as seen in non-text
EXPLAIN formats, so this won't break any case that programs should
be relying on.)

Digging in the commit logs, it appears that quoting was included in
explain_get_index_name's duties when commit 604ffd280 invented it;
and that was fine at the time because we only had text output format.
This should have been rethought when non-text formats were invented,
but it wasn't.

This is a fairly clear bug for users of non-text EXPLAIN formats,
so back-patch to all supported branches.

Per bug #16502 from Maciek Sakrejda.  Patch by me (based on
investigation by Euler Taveira); thanks to Julien Rouhaud for review.

Discussion: https://postgr.es/m/16502-57bd1c9f913ed1d1@postgresql.org
2020-06-22 11:46:41 -04:00
Alvaro Herrera 83762d0a92
Ensure write failure reports no-disk-space
A few places calling fwrite and gzwrite were not setting errno to ENOSPC
when reporting errors, as is customary; this led to some failures being
reported as
"could not write file: Success"
which makes us look silly.  Make a few of these places in pg_dump and
pg_basebackup use our customary pattern.

Backpatch-to: 9.5
Author: Justin Pryzby <pryzby@telsasoft.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20200611153753.GU14879@telsasoft.com
2020-06-19 16:46:07 -04:00
Tom Lane 9496908d46 Future-proof regression tests against possibly-missing posixrules file.
The IANA time zone folk have deprecated use of a "posixrules" file in
the tz database.  While for now it's our choice whether to keep
supplying one in our own builds, installations built with
--with-system-tzdata will soon be needing to cope with that file not
being present, at least on some platforms.

This causes a problem for the horology test, which expected the
nonstandard POSIX zone spec "CST7CDT" to apply pre-2007 US daylight
savings rules.  That does happen if the posixrules file supplies such
information, but otherwise the test produces undesired results.
To fix, add an explicit transition date rule that matches 2005 practice.
(We could alternatively have switched the test to use some real time
zone, but it seems useful to have coverage of this type of zone spec.)

While at it, update a documentation example that also relied on
"CST7CDT"; use a real-world zone name instead.  Also, document why
the zone names EST5EDT, CST6CDT, MST7MDT, PST8PDT aren't subject to
similar failures when "posixrules" is missing.

Back-patch to all supported branches, since the hazard is the same
for all.

Discussion: https://postgr.es/m/1665379.1592581287@sss.pgh.pa.us
2020-06-19 13:55:21 -04:00
Andres Freund 86593365ce Fix C99isms introduced when backpatching atomics / spinlock tests. 2020-06-18 15:14:54 -07:00
Andres Freund 8bc13287ed Fix deadlock danger when atomic ops are done under spinlock.
This was a danger only for --disable-spinlocks in combination with
atomic operations unsupported by the current platform.

While atomics.c was careful to signal that a separate semaphore ought
to be used when spinlock emulation is active, spin.c didn't actually
implement that mechanism. That's my (Andres') fault, it seems to have
gotten lost during the development of the atomic operations support.

Fix that issue and add test for nesting atomic operations inside a
spinlock.

Author: Andres Freund
Discussion: https://postgr.es/m/20200605023302.g6v3ydozy5txifji@alap3.anarazel.de
Backpatch: 9.5-
2020-06-18 14:12:24 -07:00
Andres Freund e8302f107a Add basic spinlock tests to regression tests.
As s_lock_test, the already existing test for spinlocks, isn't run in
an automated fashion (and doesn't test a normal backend environment),
adding tests that are run as part of a normal regression run is a good
idea. Particularly in light of several recent and upcoming spinlock
related fixes.

Currently the new tests are run as part of the pre-existing
test_atomic_ops() test. That perhaps can be quibbled about, but for
now seems ok.

The only operations that s_lock_test tests but the new tests don't are
the detection of a stuck spinlock and S_LOCK_FREE (which is otherwise
unused, not implemented on all platforms, and will be removed).

This currently contains a test for more than INT_MAX spinlocks (only
run with --disable-spinlocks), to ensure the recent commit fixing a
bug with more than INT_MAX spinlock initializations is correct. That
test is somewhat slow, so we might want to disable it after a few
days.

It might be worth retiring s_lock_test after this. The added coverage
of a stuck spinlock probably isn't worth the added complexity?

Author: Andres Freund
Discussion: https://postgr.es/m/20200606023103.avzrctgv7476xj7i@alap3.anarazel.de
2020-06-18 14:06:14 -07:00
Tom Lane 28589a8350 Doc: document POSIX-style time zone specifications in full.
We'd glossed over most of this complexity for years, but it's hard
to avoid writing it all down now, so that we can explain what happens
when there's no "posixrules" file in the IANA time zone database.
That was at best a tiny minority situation till now, but it's likely
to become quite common in the future, so we'd better explain it.

Nonetheless, we don't really encourage people to use POSIX zone specs;
picking a named zone is almost always what you really want, unless
perhaps you're stuck with an out-of-date zone database.  Therefore,
let's shove all this detail into an appendix.

Patch by me; thanks to Robert Haas for help with some awkward wording.

Discussion: https://postgr.es/m/1390.1562258309@sss.pgh.pa.us
2020-06-18 16:27:18 -04:00
Tom Lane c38561196f Sync our copy of the timezone library with IANA release tzcode2020a.
This absorbs a leap-second-related bug fix in localtime.c, and
teaches zic to handle an expiration marker in the leapseconds file.
Neither are of any interest to us (for the foreseeable future
anyway), but we need to stay more or less in sync with upstream.

Also adjust some over-eager changes in the README from commit 957338418.
I have no intention of making changes that require C99 in this code,
until such time as all the live back branches require C99.  Otherwise
back-patching will get too exciting.

For the same reason, absorb assorted whitespace and other cosmetic
changes from HEAD into the back branches; mostly this reflects use of
improved versions of pgindent.

All in all then, quite a boring update.  But I figured I'd get it
done while I was looking at this code.
2020-06-17 18:29:45 -04:00