This reliably fails with -DRELCACHE_FORCE_RELEASE, as reported by
Andrew Dunstan, and could sometimes fail in normal operation, resulting
in a wrong persistence value being used for the transient table.
It's not immediately clear to me what effects that might have beyond
the risk of a crash while accessing OldHeap->rd_rel->relpersistence,
but it's probably not good.
Bug introduced by commit f41872d0c, and made substantially worse by
commit 85b506bbf, which added a second such access significantly
later than the heap_close. I doubt the first reference could fail
in a production scenario, but the second one definitely could.
Discussion: https://postgr.es/m/7b52f900-0579-cda9-ae2e-de5da17090e6@2ndQuadrant.com
Disallow CREATE SUBSCRIPTION and DROP SUBSCRIPTION in a transaction
block when the replication slot is to be created or dropped, since that
cannot be rolled back.
based on patch by Masahiko Sawada <sawada.mshk@gmail.com>
Add tab completion for publications and subscriptions. Also, to be able
to get a list of subscriptions, make pg_subscription world-readable but
revoke access to subconninfo using column privileges.
From: Michael Paquier <michael.paquier@gmail.com>
This makes the connection attempt from CREATE SUBSCRIPTION and from
WalReceiver interruptable by the user in case the libpq connection is
hanging. The previous coding required immediate shutdown (SIGQUIT) of
PostgreSQL in that situation.
From: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Tested-by: Thom Brown <thom@linux.com>
Allow VACUUM and Autovacuum to report the oldestxmin value they
used while cleaning tables, helping to make better sense out of
the other statistics we report in various cases.
The syslogger will write out the current stderr and csvlog names, if
it's running and there are any, to a new file in the data directory
called "current_logfiles". We take care to remove this file when it
might no longer be valid (but not at shutdown). The function
pg_current_logfile() can be used to read the entries in the file.
Gilles Darold, reviewed and modified by Karl O. Pinc, Michael
Paquier, and me. Further review by Álvaro Herrera and Christoph Berg.
Tom Lane observed buildfarm failures caused by the select_parallel
regression test trying to launch new parallel queries before the
worker slots used by the previous ones were freed. Try to fix this by
having the postmaster free the worker slots before it sends the
SIGUSR1 notifications to the registering process. This doesn't
completely eliminate the possibility that the user backend might
(correctly) observe the worker as dead before the slot is free, but I
believe it should make the window significantly narrower.
Patch by me, per complaint from Tom Lane. Reviewed by Amit Kapila.
Discussion: http://postgr.es/m/30673.1487310734@sss.pgh.pa.us
Currently, the whole row is shown without column names. Instead,
adopt a style similar to _bt_check_unique() in ExecFindPartition()
and show the failing key: (key1, ...) = (val1, ...).
Amit Langote, per a complaint from Simon Riggs. Reviewed by me;
I also adjusted the grammar in one of the comments.
Discussion: http://postgr.es/m/9f9dc7ae-14f0-4a25-5485-964d9bfc19bd@lab.ntt.co.jp
The final patch will be less messy if the prefetching support is
a bit better isolated, so do that.
Dilip Kumar, with some changes by me. The larger patch set of which
this is a part has been reviewed and tested by (at least) Andres
Freund, Amit Khandekar, Tushar Ahuja, Rafia Sabih, Haribabu Kommi, and
Thomas Munro.
Also, recursively perform VACUUM and ANALYZE on partitions when the
command is applied to a partitioned table. In passing, some related
documentation updates.
Amit Langote, reviewed by Michael Paquier, Ashutosh Bapat, and by me.
Discussion: http://postgr.es/m/47288cf1-f72c-dfc2-5ff0-4af962ae5c1b@lab.ntt.co.jp
Likewise in RestoreSnapshot(). Do so by copying between the user buffer
and a stack buffer of known alignment. Back-patch to 9.6, where this
last applies cleanly. In master, the select_parallel test dies with
SIGBUS on "Oracle Solaris 10 1/13 s10s_u11wos_24a SPARC", building
32-bit with gcc 4.9.2. In 9.6 and 9.5, the buffers in question happen
to be sufficiently-aligned, and this change is mere insurance against
future 9.6 changes or extension code compromising that.
Newer Perl or IPC::Run versions default to appending the filename to string
exceptions, e.g. the exception
psql timed out
is thrown as
psql timed out at /usr/share/perl5/vendor_perl/IPC/Run.pm line 2961.
To handle this, match exceptions with !~ rather than ne.
From: Craig Ringer <craig@2ndquadrant.com>
Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
As with commit 30df93f698d016d086e8961aa6c6076b37ea0ef4 and commit
b0f18cb77f50a54e997d857d592f6a511617f52c, the goal here is to move all
of the related page modifications to a single section of code, in
preparation for adding write-ahead logging.
Amit Kapila, with slight changes by me. The larger patch series of
which this is a part has been reviewed and tested by Álvaro Herrera,
Ashutosh Sharma, Mark Kirkwood, Jeff Janes, and Jesper Pedersen.
In the previous commit I'd made MemoryContextContains() use
GetMemoryChunkContext(), but that causes trouble when the passed
pointer isn't allocated in any memory context - that's probably
something we shouldn't do, but the previous commit isn't a place for a
"policy" change.
The README was written as a "historical account", and that style
hasn't aged particularly well. Rephrase it to describe the current
situation, instead of having various version specific comments.
This also updates the description of how allocated chunks are
associated with their corresponding context, the method of which has
changed in the preceding commit.
Author: Andres Freund
Discussion: https://postgr.es/m/20170228074420.aazv4iw6k562mnxg@alap3.anarazel.de
The new slab allocator needs different per-allocation information than
the classical aset.c. The definition in 58b25e981 wasn't sufficiently
careful on 32 platforms with 8 byte alignment, leading to buildfarm
failures. That's not entirely easy to fix by just adjusting the
definition.
As slab.c doesn't actually need the size part(s) of the common header,
all chunks are equally sized after all, it seems better to instead
reduce the header to the part needed by all allocators, namely which
context an allocation belongs to. That has the advantage of reducing
the overhead of slab allocations, and also allows for more flexibility
in future allocators.
To avoid spreading the logic about accessing a chunk's context around,
centralize it in GetMemoryChunkContext(), which allows to delete a
good number of lines.
A followup commit will revise the mmgr/README portion about
StandardChunkHeader, and more.
Author: Andres Freund
Discussion: https://postgr.es/m/20170228074420.aazv4iw6k562mnxg@alap3.anarazel.de
Previously, only IndexTuple format was supported for the output data of
an index-only scan. This is fine for btree, which is just returning a
verbatim index tuple anyway. It's not so fine for SP-GiST, which can
return reconstructed data that's much larger than a page.
To fix, extend the index AM API so that index-only scan data can be
returned in either HeapTuple or IndexTuple format. There's other ways
we could have done it, but this way avoids an API break for index AMs
that aren't concerned with the issue, and it costs little except a couple
more fields in IndexScanDescs.
I changed both GiST and SP-GiST to use the HeapTuple method. I'm not
very clear on whether GiST can reconstruct data that's too large for an
IndexTuple, but that seems possible, and it's not much of a code change to
fix.
Per a complaint from Vik Fearing. Reviewed by Jason Li.
Discussion: https://postgr.es/m/49527f79-530d-0bfe-3dad-d183596afa92@2ndquadrant.fr
As with commit b0f18cb77f50a54e997d857d592f6a511617f52c, the goal
here is to move all of the related page modifications to a single
section of code, in preparation for adding write-ahead logging.
Amit Kapila, with slight changes by me. The larger patch series
of which this is a part has been reviewed and tested by Álvaro
Herrera, Ashutosh Sharma, Mark Kirkwood, Jeff Janes, and Jesper
Pedersen, all of whom should also have been credited in the
previous commit message.
In preparation for adding write-ahead logging to hash indexes,
refactor _hash_freeovflpage and _hash_squeezebucket so that all
related page modifications happen in a single section of code. The
previous coding assumed that it would be fine to move tuples one at a
time, and also that the various operations involved in freeing an
overflow page didn't necessarily all need to be done together, all
of which is true if you don't care about write-ahead logging.
Amit Kapila, with slight changes by me.
PL/Tcl has long had a facility whereby Tcl code could be autoloaded from
a database table named "pltcl_modules". However, nobody is using it, as
evidenced by the recent discovery that it's never been fixed to work with
standard_conforming_strings turned on. Moreover, it's rather shaky from
a security standpoint, and the table design is very old and crufty (partly
because it dates from before we had TOAST). A final problem is that
because the table-population scripts depend on the Tcl client library
Pgtcl, which we removed from the core distribution in 2004, it's
impossible to create a self-contained regression test for the feature.
Rather than try to surmount these problems, let's just remove it.
A follow-on patch will provide a way to execute user-defined
initialization code, similar to features that exist in plperl and plv8.
With that, it will be possible to implement this feature or similar ones
entirely in userspace, which is where it belongs.
Discussion: https://postgr.es/m/22067.1488046447@sss.pgh.pa.us
PQerrorMessage() returns an error message with a trailing newline, but
in backend use (dblink, postgres_fdw, libpqwalreceiver), we want to have
the error message without that for emitting via ereport(). To simplify
that, add a function pchomp() that returns a pstrdup'ed string with the
trailing newline characters removed.
Note that this change alone does not yet fully address the performance
problems triggering this work, a large portion of the slowdown is
triggered by the tuple allocator, which isn't converted to the new
allocator. It would be possible to do so, but using evenly sized
objects, like both the current implementation in reorderbuffer.c and
slab.c, wastes a fair amount of memory. A later patch by Tomas will
introduce a better approach.
Author: Tomas Vondra
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
The default general purpose aset.c style memory context is not a great
choice for allocations that are all going to be evenly sized,
especially when those objects aren't small, and have varying
lifetimes. There tends to be a lot of fragmentation, larger
allocations always directly go to libc rather than have their cost
amortized over several pallocs.
These problems lead to the introduction of ad-hoc slab allocators in
reorderbuffer.c. But it turns out that the simplistic implementation
leads to problems when a lot of objects are allocated and freed, as
aset.c is still the underlying implementation. Especially freeing can
easily run into O(n^2) behavior in aset.c.
While the O(n^2) behavior in aset.c can, and probably will, be
addressed, custom allocators for this behavior are more efficient
both in space and time.
This allocator is for evenly sized allocations, and supports both
cheap allocations and freeing, without fragmenting significantly. It
does so by allocating evenly sized blocks via malloc(), and carves
them into chunks that can be used for allocations. In order to
release blocks to the OS as early as possible, chunks are allocated
from the fullest block that still has free objects, increasing the
likelihood of a block being entirely unused.
A subsequent commit uses this in reorderbuffer.c, but a further
allocator is needed to resolve the performance problems triggering
this work.
There likely are further potentialy uses of this allocator besides
reorderbuffer.c.
There's potential further optimizations of the new slab.c, in
particular the array of freelists could be replaced by a more
intelligent structure - but for now this looks more than good enough.
Author: Tomas Vondra, editorialized by Andres Freund
Reviewed-By: Andres Freund, Petr Jelinek, Robert Haas, Jim Nasby
Discussion: https://postgr.es/m/d15dff83-0b37-28ed-0809-95a5cc7292ad@2ndquadrant.com
An upcoming patch introduces a new type of memory context. To avoid
duplicating debugging infrastructure within aset.c, move useful pieces
to memdebug.[ch].
While touching aset.c, fix printf format code in AllocFree* debug
macros.
Author: Tomas Vondra
Reviewed-By: Andres Freund
Discussion: https://postgr.es/m/b3b2245c-b37a-e1e5-ebc4-857c914bc747@2ndquadrant.com
Output a message about checkpoint starting in verbose mode of
pg_basebackup, and make the documentation state more clearly that this
happens.
Author: Michael Banck
This is expected to be useful mostly when performing such scans in
parallel, because in that case it allows (in combination with commit
acf555bc53acb589b5a2827e65d655fa8c9adee0) nodes below a Gather to get
control just before the DSM segment goes away.
KaiGai Kohei, except that I rewrote the documentation. Reviewed by
Claudio Freire.
Discussion: http://postgr.es/m/CADyhKSXJK0jUJ8rWv4AmKDhsUh124_rEn39eqgfC5D8fu6xVuw@mail.gmail.com
I removed this in commit 9e3755ecb, reasoning that the win32.h
port-specific header file included by c.h would have provided it.
However, that's only true on native win32 builds, not Cygwin builds.
It may be that some of the other <windows.h> inclusions also need
to be put back on the same grounds; but this is the only one that
is clearly meant to be included #ifdef __CYGWIN__, so maybe this is
the extent of the problem. Awaiting further buildfarm results.
We had some AC_CHECK_HEADER tests that were really wastes of cycles,
because the code proceeded to #include those headers unconditionally
anyway, in all or a large majority of cases. The lack of complaints
shows that those headers are available on every platform of interest,
so we might as well let configure run a bit faster by not probing
those headers at all.
I suspect that some of the tests I left alone are equally useless, but
since all the existing #includes of the remaining headers are properly
guarded, I didn't touch them.
c.h #includes a number of core libc header files, such as <stdio.h>.
There's no point in re-including these after having read postgres.h,
postgres_fe.h, or c.h; so remove code that did so.
While at it, also fix some places that were ignoring our standard pattern
of "include postgres[_fe].h, then system header files, then other Postgres
header files". While there's not any great magic in doing it that way
rather than system headers last, it's silly to have just a few files
deviating from the general pattern. (But I didn't attempt to enforce this
globally, only in files I was touching anyway.)
I'd be the first to say that this is mostly compulsive neatnik-ism,
but over time it might save enough compile cycles to be useful.