Formerly, the bootstrap backend looked up the OIDs corresponding to
names in regproc catalog entries using brute-force searches of pg_proc.
It was somewhat remarkable that that worked at all, since it was used
while populating other pretty-fundamental catalogs like pg_operator.
And it was also quite slow, and getting slower as pg_proc gets bigger.
This patch moves the lookup work into genbki.pl, so that the values in
postgres.bki for regproc columns are always numeric OIDs, an option
that regprocin() already supported. Perl isn't the world's speediest
language, so this about doubles the time needed to run genbki.pl (from
0.3 to 0.6 sec on my machine). But we only do that at most once per
build. The time needed to run initdb drops significantly --- on my
machine, initdb --no-sync goes from 1.8 to 1.3 seconds. So this is
a small net win even for just one initdb per build, and it becomes
quite a nice win for test sequences requiring many initdb runs.
Strip out the now-dead code for brute-force catalog searching in
regprocin. We'd also cargo-culted similar logic into regoperin
and some (not all) of the other reg*in functions. That is all
dead code too since we currently have no need to load such values
during bootstrap. I removed it all, reasoning that if we ever
need such functionality it'd be much better to do it in a similar
way to this patch.
There might be some simplifications possible in the backend now that
regprocin doesn't require doing catalog reads so early in bootstrap.
I've not looked into that, though.
Andreas Karlsson, with some small adjustments by me
Discussion: https://postgr.es/m/30896.1492006367@sss.pgh.pa.us
Add a missing comma in the synopsis after the XMLNAMESPACES clause.
Also, add an example illustrating the use of that clause.
Author: Arjen Nienhuis and Pavel Stěhule
Free each SASL message after sending it. It's not a lot of wasted memory,
and it's short-lived, but the authentication code in general tries to
pfree() stuff, so let's follow the example.
Adding the pfree() revealed a little bug in build_server_first_message().
It attempts to keeps a copy of the sent message, but it was missing a
pstrdup(), so the pointer started to dangle, after adding the pfree()
into CheckSCRAMAuth().
Reword comments and debug messages slightly, while we're at it.
Reviewed by Michael Paquier.
Discussion: https://www.postgresql.org/message-id/6490b975-5ee1-6280-ac1d-af975b19fb9a@iki.fi
Previously the description about pg_stat_progress_vacuum was in the table
of "Collected Statistics Views" in the doc. But since it repors dynamic
information, i.e., the current progress of VACUUM, its description should be
in the table of "Dynamic Statistics Views".
Back-patch to 9.6 where pg_stat_progress_vacuum was added.
Author: Amit Langote
Discussion: http://postgr.es/m/7ab51b59-8d4d-6193-c60a-b75f222efb12@lab.ntt.co.jp
Commit 5e6d8d2bb allowed parallel workers to execute parallel-safe
subplans, but it transmitted the query's entire list of subplans to
the worker(s). Since execMain.c blindly does ExecInitNode and later
ExecEndNode on every list element, this resulted in parallel-unsafe plan
nodes nonetheless getting started up and shut down in parallel workers.
That seems mostly harmless as far as core plan node types go (but
maybe not so much for Gather?). But it resulted in postgres_fdw
opening and then closing extra remote connections, and it's likely
that other non-parallel-safe FDWs or custom scan providers would have
worse reactions.
To fix, just make ExecSerializePlan replace parallel-unsafe subplans
with NULLs in the cut-down plan tree that it transmits to workers.
This relies on ExecInitNode and ExecEndNode to do nothing on NULL
input, but they do anyway. If anything else is touching the dropped
subplans in a parallel worker, that would be a bug to be fixed.
(This thus provides a strong guarantee that we won't try to do
something with a parallel-unsafe subplan in a worker.)
This is, I think, the last fix directly occasioned by Andreas Seltenreich's
bug report of a few days ago.
Tom Lane and Amit Kapila
Discussion: https://postgr.es/m/87tw5x4vcu.fsf@credativ.de
Tweak CSS a bit to match latest similar changes to web site style. Also
move some CSS out of the HTML to the stylesheet so that the web site
stylesheet can override it. This should ensure that notes and such are
back to being centered.
We'd managed to avoid doing this so far, but it seems pretty obvious
that it would be forced on us some day, and this is much the cleanest
way of approaching the open problem that parallel-unsafe subplans are
being transmitted to parallel workers. Anyway there's no space cost
due to alignment considerations, and the time cost is pretty minimal
since we're just copying the flag from the corresponding Path node.
(At least in most cases ... some of the klugier spots in createplan.c
have to work a bit harder.)
In principle we could perhaps get rid of SubPlan.parallel_safe,
but I thought it better to keep that in case there are reasons to
consider a SubPlan unsafe even when its child plan is parallel-safe.
This patch doesn't actually do anything with the new flags, but
I thought I'd commit it separately anyway.
Note: although this touches outfuncs/readfuncs, there's no need for
a catversion bump because Plan trees aren't stored on disk.
Discussion: https://postgr.es/m/87tw5x4vcu.fsf@credativ.de
Hash indexes can contain both pages which are all-zeroes (i.e.
PageIsNew()) and pages which have been initialized but currently
aren't used. The latter category can happen either when a page
has been reserved but not yet used or when it is used for a time
and then freed. pgstattuple was only prepared to deal with the
pages that are actually-zeroes, which it called zero_pages.
Rename the column to unused_pages (extension version 1.5 is
as-yet-unreleased) and make it count both kinds of unused pages.
Along the way, slightly tidy up the way we test for pages of
various types.
Robert Haas and Ashutosh Sharma, reviewed by Amit Kapila
Discussion: http://postgr.es/m/CAE9k0PkTtKFB3YndOyQMjwuHx+-FtUP1ynK8E-nHtetoow3NtQ@mail.gmail.com
validateCheckConstraint() shouldn't try to access the storage for
a partitioned table, because it no longer has any. Creating a
_RETURN table on a partitioned table shouldn't be allowed, both
because there's no value in it and because trying to do so would
involve a validation scan against its nonexistent storage.
Amit Langote, reviewed by Tom Lane. Regression test outputs
updated to pass by me.
Discussion: http://postgr.es/m/e5c3cbd3-1551-d6f8-c9e2-51777d632fd2@lab.ntt.co.jp
While at it also update the comments in walmethods.h to make it less
likely for mistakes like this to appear in the future (thanks to Tom for
improvements to the comments).
And finally, in passing change the return type of walmethod.getlasterror
to being const, also per suggestion from Tom.
Commit a4777f355 was a shade too mechanical: we don't want to override
MSVC's own definition of _MSC_VER, as that breaks tests on its numerical
value. Per buildfarm.
Commit 0bf3ae88a encountered a need to pass the finally chosen remote qual
conditions forward from postgresGetForeignPlan to postgresPlanDirectModify.
It solved that by sticking them into the plan node's fdw_private list,
which in hindsight was a pretty bad idea. In the first place, there's no
use for those qual trees either in EXPLAIN or execution; indeed they could
never safely be used for any post-planning purposes, because they would not
get processed by setrefs.c. So they're just dead weight to carry around in
the finished plan tree, plus being an attractive nuisance for somebody who
might get the idea that they could be used that way. Secondly, because
those qual trees (sometimes) contained RestrictInfos, they created a
plan-transmission hazard for parallel query, which is how come we noticed a
problem. We dealt with that symptom in commit 28b047875, but really a more
straightforward and more efficient fix is to pass the data through in a new
field of struct PgFdwRelationInfo. So do it that way. (There's no need
to revert 28b047875, as it has sufficient reason to live anyway.)
Per fuzz testing by Andreas Seltenreich.
Discussion: https://postgr.es/m/87tw5x4vcu.fsf@credativ.de
To prevent future bugs along the lines of the one corrected by commit
8ff518699f19dd0a5076f5090bac8400b8233f7f, or find any that remain
in the current code, add an Assert() that the difference between
parallel_register_count and parallel_terminate_count is in a sane
range.
Kuntal Ghosh, with considerable tidying-up by me, per a suggestion
from Neha Khatri. Reviewed by Tomas Vondra.
Discussion: http://postgr.es/m/CAFO0U+-E8yzchwVnvn5BeRDPgX2z9vZUxQ8dxx9c0XFGBC7N1Q@mail.gmail.com
Commit b460f5d6693103076dc554aa7cbb96e1e53074f9 failed to contemplate
the possibilit that a parallel worker registered before a crash would
be unregistered only after the crash; if that happened, we'd end up
with parallel_terminate_count > parallel_register_count and the
system would refuse to launch any more parallel workers.
The easiest way to fix that seems to be to forget BGW_NEVER_RESTART
workers in ResetBackgroundWorkerCrashTimes() rather than leaving them
around to be cleaned up after the conclusion of the restart, so that
they go away before rather than after shared memory is reset.
To make sure that this fix is water-tight, don't allow parallel
workers to be anything other than BGW_NEVER_RESTART, so that after
recovering from a crash, 0 is guaranteed to be the correct starting
value for parallel_register_count. The core code wouldn't do this
anyway, but somebody might try to do it in extension code.
Report by Thomas Vondra. Patch by me, reviewed by Kuntal Ghosh.
Discussion: http://postgr.es/m/CAGz5QC+AVEVS+3rBKRq83AxkJLMZ1peMt4nnrQwczxOrmo3CNw@mail.gmail.com
Commit 98e6e89040a0534ca26914c66cae9dd49ef62ad9 made inadequate
provision for the case of a single-page shared tidbitmap. It
allocate space for a shared PagetableEntry, but failed to
initialize it.
Report by Thomas Munro. Patch by Dilip Kumar, with some comment
changes by me.
Discussion: http://postgr.es/m/CAEepm=19Cmnfbi-j2Bw-a6yGPeHE1OVhKvvKz9bRBTJGKfGHMA@mail.gmail.com
Clauses in the lists retained by postgres_fdw during planning were
sometimes bare boolean clauses, sometimes RestrictInfos, and sometimes
a mixture of the two in the same list. The comment about that situation
didn't come close to telling the full truth, either. Aside from being
confusing, this had a couple of bad practical consequences:
* waste of planning cycles due to inability to cache per-clause selectivity
and cost estimates;
* sometimes, RestrictInfos would sneak into the fdw_private list of a
finished Plan node, causing failures if, for example, we tried to ship
the Plan tree to a parallel worker.
(It may well be that it's a bug in the parallel-query logic that we
would ever try to ship such a plan to a parallel worker, but in any
case this deserves to be cleaned up.)
To fix, rearrange so that clause lists in PgFdwRelationInfo are always
lists of RestrictInfos, and then strip the RestrictInfos at the last
minute when making a Plan node. In passing do a bit of refactoring and
comment cleanup in postgresGetForeignPlan and foreign_join_ok.
Although the messiness here dates back at least to 9.6, there's no evidence
that it causes anything worse than wasted planning cycles in 9.6, so no
back-patch for now.
Per fuzz testing by Andreas Seltenreich.
Tom Lane and Ashutosh Bapat
Discussion: https://postgr.es/m/87tw5x4vcu.fsf@credativ.de
This commit also does
- add REPLICATION_SUBSCRIBERS into config_group
- mark max_logical_replication_workers and max_sync_workers_per_subscription
as REPLICATION_SUBSCRIBERS parameters
- move those parameters into "Subscribers" section in postgresql.conf.sample
Author: Masahiko Sawada, Petr Jelinek and me
Reported-by: Masahiko Sawada
Discussion: http://postgr.es/m/CAD21AoAonSCoa=v=87ZO3vhfUZA1k_E2XRNHTt=xioWGUa+0ug@mail.gmail.com
Specifically, the behavior of general-purpose and statistical aggregates
as window functions was not clearly documented, and terms were
inconsistently used. Also add docs about the difference between
cume_dist and percent_rank, rather than just the formulas.
Discussion: 20170406214918.GA5757@momjian.us
This used to mean "Visual C++ except in those parts where Borland C++
was supported where it meant one of those". Now that we don't support
Borland C++ anymore, simplify by using _MSC_VER which is the normal way
to detect Visual C++.
This removes the support for building just libpq using Borland C++ or
Visual C++. This has not worked properly for years, and given the number
of complaints it's clearly not worth the maintenance burden.
Building libpq using the standard MSVC build system is of course still
supported, along with mingw.
As a consequence of commit 1d63f7d2d, on platforms with CLOCK_MONOTONIC,
you got some random timescale or other instead of standard Unix timestamps
as expected. I'd attempted to fix pgbench for that change in commits
74baa1e3b and 67a875355, but missed this place. Fix in the same way as
those previous commits, ie, just eat the cost of an extra gettimeofday();
one extra syscall per progress report isn't worth sweating over. Per
report from Jeff Janes.
In passing, use snprintf not sprintf for this purpose. I don't think
there's any chance of actual buffer overrun, but it just looks safer.
Discussion: https://postgr.es/m/CAMkU=1zrQaPwBN+NcBd3pWCb=vWaiL=mmWfJjDJjh-a7eVr-Og@mail.gmail.com
Commit 96a7128b made pg_dump and pg_dumpall sync their output by
default. However, there's no great need for that in testing, and it
could impose a performance penalty, so we add the --no-sync flag to most
of the test cases.
Michael Paquier
The previously used ShareRowExclusiveLock, while technically probably
more correct, led to deadlocks during seemingly unrelated operations and
thus a poor experience. Use RowExclusiveLock, like for most similar
catalog operations. In some care cases, the user might see an error
from DDL commands.
Discussion: https://www.postgresql.org/message-id/flat/13592.1490851519%40sss.pgh.pa.us
Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
The backend local copy of dsa_area_control->freed_segment_counter was
not properly initialized / maintained. This could, if unlucky, lead
to keeping attached to a segment for too long.
Found via valgrind bleat on buildfarm animal skink.
Author: Thomas Munro
Discussion: https://postgr.es/m/20170407164935.obsf2jipjfos5zei@alap3.anarazel.de
This extends the castNode() notation introduced by commit 5bcab1114 to
provide, in one step, extraction of a list cell's pointer and coercion to
a concrete node type. For example, "lfirst_node(Foo, lc)" is the same
as "castNode(Foo, lfirst(lc))". Almost half of the uses of castNode
that have appeared so far include a list extraction call, so this is
pretty widely useful, and it saves a few more keystrokes compared to the
old way.
As with the previous patch, back-patch the addition of these macros to
pg_list.h, so that the notation will be available when back-patching.
Patch by me, after an idea of Andrew Gierth's.
Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us
We decided in f1b4c771ea74f42447dccaed42ffcdcccf3aa694 to pass the
original slot to ExecConstraints(), but that breaks when there are
BEFORE ROW triggers involved. So we need to do reverse-map the tuples
back to the original descriptor instead, as Amit originally proposed.
Amit Langote, reviewed by Ashutosh Bapat. One overlooked comment
fixed by me.
Discussion: http://postgr.es/m/b3a17254-6849-e542-2353-bde4e880b6a4@lab.ntt.co.jp
Commit 4deb41381 modified isolationtester's query to see whether a
session is blocked to also check for waits occurring in GetSafeSnapshot.
However, it did that in a way that enormously increased the query's
runtime under CLOBBER_CACHE_ALWAYS, causing the buildfarm members
that use that to run about four times slower than before, and in some
cases fail entirely. To fix, push the entire logic into a dedicated
backend function. This should actually reduce the CLOBBER_CACHE_ALWAYS
runtime from what it was previously, though I've not checked that.
In passing, expose a SQL function to check for safe-snapshot blockage,
comparable to pg_blocking_pids. This is more or less free given the
infrastructure built to solve the other problem, so we might as well.
Thomas Munro
Discussion: https://postgr.es/m/20170407165749.pstcakbc637opkax@alap3.anarazel.de
In commit 25542d77, regression test coverage was added to sepgsql
for partitioned tables. Unfortunately it was not robust in the face
of collation differences, per the buildfarm. Force "C" collation
in order to fix that.
Discussion: https://postgr.es/m/flat/623bcaae-112e-ced0-8c22-a84f75ae0c53%40joeconway.com
The new partitioned table capability added a new relkind, namely
RELKIND_PARTITIONED_TABLE. Update sepgsql to treat this new relkind
exactly the same way it does RELKIND_RELATION.
In addition, add regression test coverage for partitioned tables.
Issue raised by Stephen Frost and initial patch by Mike Palmiotto.
Review by Tom Lane and Robert Haas, and editorializing by me.
Discussion: https://postgr.es/m/flat/623bcaae-112e-ced0-8c22-a84f75ae0c53%40joeconway.com