When a PostgreSQL instance performing archive recovery but not using
standby mode is promoted, and the last WAL segment that it attempted
to read ended in a partial record, the previous code would create
invalid WAL on the new timeline. The WAL from the previously timeline
would be copied to the new timeline up until the end of the last valid
record, but instead of beginning to write WAL at immediately
afterwards, the promoted server would write an overwrite contrecord at
the beginning of the next segment. The end of the previous segment
would be left as all-zeroes, resulting in failures if anything tried
to read WAL from that file.
The root of the issue is that ReadRecord() decides whether to set
abortedRecPtr and missingContrecPtr based on the value of StandbyMode,
but ReadRecord() switches to a new timeline based on the value of
ArchiveRecoveryRequested. We shouldn't try to write an overwrite
contrecord if we're switching to a new timeline, so change the test in
ReadRecod() to check ArchiveRecoveryRequested instead.
Code fix by Dilip Kumar. Comments by me incorporating suggested
language from Álvaro Herrera. Further review from Kyotaro Horiguchi
and Sami Imseih.
Discussion: http://postgr.es/m/CAFiTN-t7umki=PK8dT1tcPV=mOUe2vNhHML6b3T7W7qqvvajjg@mail.gmail.com
Discussion: http://postgr.es/m/FB0DEA0B-E14E-43A0-811F-C1AE93D00FF3%40amazon.com
Buildfarm animals skate, grison and mamba are Assert failing on the
pointer being given to repalloc not being MAXALIGNED. c6e0fe1f2a made
changes in that area.
All of these animals are 32-bit with a MAXIMUM_ALIGNOF of 8 and a
SIZEOF_VOID_P of 4. I suspect that the pointer is not properly aligned due
to the lack of padding in the MemoryChunk struct.
Here we add the same type of padding that was previously used in
AllocChunkData and GenerationChunk that c6e0fe1f2a neglected to add.
Discussion: https://postgr.es/m/CAA4eK1%2B1JyW5TiL%3DyV-3Uq1CrfnTyn0Xrk5uArt31Z%3D8rgPhXQ%40mail.gmail.com
NEON support is required on the Aarch64 architecture for standard
implementations. Hardware designers for specialized markets can choose
not to support it, but that's true of floating point as well, which
we assume is supported. As with x86, some SIMD support is available
on 32-bit platforms, but those are not interesting from a performance
standpoint and would require an inconvenient runtime check.
Nathan Bossart
Reviewed by John Naylor, Andres Freund, Thomas Munro, and Tom Lane
Discussion: https://www.postgresql.org/message-id/flat/CAFBsxsEyR9JkfbPcDXBRYEfdfC__OkwVGdwEAgY4Rv0cvw35EA%40mail.gmail.com#aba7a64b11503494ffd8dd27067626a9
Whenever we palloc a chunk of memory, traditionally, we prefix the
returned pointer with a pointer to the memory context to which the chunk
belongs. This is required so that we're able to easily determine the
owning context when performing operations such as pfree() and repalloc().
For the AllocSet context, prior to this commit we additionally prefixed
the pointer to the owning context with the size of the chunk. This made
the header 16 bytes in size. This 16-byte overhead was required for all
AllocSet allocations regardless of the allocation size.
For the generation context, the problem was worse; in addition to the
pointer to the owning context and chunk size, we also stored a pointer to
the owning block so that we could track the number of freed chunks on a
block.
The slab allocator had a 16-byte chunk header.
The changes being made here reduce the chunk header size down to just 8
bytes for all 3 of our memory context types. For small to medium sized
allocations, this significantly increases the number of chunks that we can
fit on a given block which results in much more efficient use of memory.
Additionally, this commit completely changes the rule that pointers to
palloc'd memory must be directly prefixed by a pointer to the owning
memory context and instead, we now insist that they're directly prefixed
by an 8-byte value where the least significant 3-bits are set to a value
to indicate which type of memory context the pointer belongs to. Using
those 3 bits as an index (known as MemoryContextMethodID) to a new array
which stores the methods for each memory context type, we're now able to
pass the pointer given to functions such as pfree() and repalloc() to the
function specific to that context implementation to allow them to devise
their own methods of finding the memory context which owns the given
allocated chunk of memory.
The reason we're able to reduce the chunk header down to just 8 bytes is
because of the way we make use of the remaining 61 bits of the required
8-byte chunk header. Here we also implement a general-purpose MemoryChunk
struct which makes use of those 61 remaining bits to allow the storage of
a 30-bit value which the MemoryContext is free to use as it pleases, and
also the number of bytes which must be subtracted from the chunk to get a
reference to the block that the chunk is stored on (also 30 bits). The 1
additional remaining bit is to denote if the chunk is an "external" chunk
or not. External here means that the chunk header does not store the
30-bit value or the block offset. The MemoryContext can use these
external chunks at any time, but must use them if any of the two 30-bit
fields are not large enough for the value(s) that need to be stored in
them. When the chunk is marked as external, it is up to the MemoryContext
to devise its own means to determine the block offset.
Using 3-bits for the MemoryContextMethodID does mean we're limiting
ourselves to only having a maximum of 8 different memory context types.
We could reduce the bit space for the 30-bit value a little to make way
for more than 3 bits, but it seems like it might be better to do that only
if we ever need more than 8 context types. This would only be a problem
if some future memory context type which does not use MemoryChunk really
couldn't give up any of the 61 remaining bits in the chunk header.
With this MemoryChunk, each of our 3 memory context types can quickly
obtain a reference to the block any given chunk is located on. AllocSet
is able to find the context to which the chunk is owned, by first
obtaining a reference to the block by subtracting the block offset as is
stored in the 'hdrmask' field and then referencing the block's 'aset'
field. The Generation context uses the same method, but GenerationBlock
did not have a field pointing back to the owning context, so one is added
by this commit.
In aset.c and generation.c, all allocations larger than allocChunkLimit
are stored on dedicated blocks. When there's just a single chunk on a
block like this, it's easy to find the block from the chunk, we just
subtract the size of the block header from the chunk pointer. The size of
these chunks is also known as we store the endptr on the block, so we can
just subtract the pointer to the allocated memory from that. Because we
can easily find the owning block and the size of the chunk for these
dedicated blocks, we just always use external chunks for allocation sizes
larger than allocChunkLimit. For generation.c, this sidesteps the problem
of non-external MemoryChunks being unable to represent chunk sizes >= 1GB.
This is less of a problem for aset.c as we store the free list index in
the MemoryChunk's spare 30-bit field (the value of which will never be
close to using all 30-bits). We can easily reverse engineer the chunk size
from this when needed. Storing this saves AllocSetFree() from having to
make a call to AllocSetFreeIndex() to determine which free list to put the
newly freed chunk on.
For the slab allocator, this commit adds a new restriction that slab
chunks cannot be >= 1GB in size. If there happened to be any users of
slab.c which used chunk sizes this large, they really should be using
AllocSet instead.
Here we also add a restriction that normal non-dedicated blocks cannot be
1GB or larger. It's now not possible to pass a 'maxBlockSize' >= 1GB
during the creation of an AllocSet or Generation context. Allocations can
still be larger than 1GB, it's just these will always be on dedicated
blocks (which do not have the 1GB restriction).
Author: Andres Freund, David Rowley
Discussion: https://postgr.es/m/CAApHDvpjauCRXcgcaL6+e3eqecEHoeRm9D-kcbuvBitgPnW=vw@mail.gmail.com
It has been incorrectly assumed in commit 7f13ac8123 that we can either
purge all or none in the catalog modifying xids list retrieved from a
serialized snapshot. It is quite possible that some of the xids in that
array are old enough to be pruned but not others.
As per buildfarm
Author: Amit Kapila and Masahiko Sawada
Reviwed-by: Masahiko Sawada
Discussion: https://postgr.es/m/CAA4eK1LBtv6ayE+TvCcPmC-xse=DVg=SmbyQD1nv_AaqcpUJEg@mail.gmail.com
This has as effect to add /DYNAMICBASE to the .dll and .exe files
generated by the builds, undoing 7f3e17b. Note that ASLR was already
enabled in MinGW as we have never added --disable-dynamicbase there.
This change will ease a bit the integration of arm64 with MSVC, as ASLR
support is mandatory in this case. So, thanks to this commit, we have
no need to make ASLR conditional depending on the architecture used for
the build.
Andres Freund has done a lot of testing with this option while working
on meson, without seeing /DYNAMICBASE as being a problem in the Windows
builds of the CI. Personally, not supporting anything older than
Windows 10 on HEAD makes me feel safer about this change, as we have
seen ASLR with being a problem in process invocation particularly with
Windows 8 and server 2012 back in 2014, even if Windows 10 was not
really a thing back then. 45e004f is also something that can help in
making the process invocation more stable. We are very early in the
development of Postgres 16, giving a lot of room to detect stability
issues if any.
Discussion: https://postgr.es/m/20220826012907.gjw3jdqdgsts5y65@awork3.anarazel.de
quote_identifier's API is designed on the assumption that it's
not worth worrying about a short-term memory leak when we have
to produce a quoted version of the given identifier. Whoever wrote
quote_object_name took it on themselves to override that judgment,
but the only way to do so is to cast away const someplace. We can
avoid that and substantially shorten the function by going along
with quote_identifier's opinion. AFAICS quote_object_name is not
used in any way where this would be unsustainable.
Per discussion of commit 45987aae2, which exposed that we had
a casting-away-const situation here.
Discussion: https://postgr.es/m/20220827112304.GL2342@telsasoft.com
While the bug I just fixed in the back branches doesn't exist in
HEAD, the requirement that MULTIEXPR SubPlans not share output
parameters still does. Add a comment to memorialize that, because
perhaps it could be an issue again someday.
Discussion: https://postgr.es/m/17596-c5357f61427a81dc@postgresql.org
Commit 121d2d3d70 included simd.h into pg_wchar.h. This caused a problem
on Windows, since Perl has "#define free" (referring to globals), which
breaks the Windows' header. To fix, move the static inline function
definitions from plperl_helpers.h, into plperl.h, where we already
document the necessary inclusion order. Since those functions were the
only reason for the existence of plperl_helpers.h, remove it.
First reported by Justin Pryzby
Diagnosis and review by Andres Freund, patch by myself per suggestion
from Tom Lane
Discussion: https://www.postgresql.org/message-id/20220826115546.GE2342%40telsasoft.com
While waiting for slots to become available in wait_on_slots() in
parallel_slot.c, the cancellation always relied on the first connection
in the set to do the job. This could cause problems when this slot's
socket is gone as PQgetCancel() would return NULL in this case. Rather
than always using the first connection, this changes the logic to use
the first valid connection for the cancellation.
Author: Ranier Vilela
Reviewed-by: Justin Pryzby
Discussion: https://postgr.es/m/CAEudQAokk1h_pUwGXsYS4oVOuf35s1O2o3TXGHpV8=AWikvgHA@mail.gmail.com
Backpatch-through: 14
Per flame graph from Jelte Fennema, COPY FROM ... USING BINARY shows
input validation taking at least 5% of the profile, so it's worth trying
to be more efficient here. With this change, validation of pure ASCII is
nearly 40% faster on contemporary Intel hardware. To make this change
legible and easier to adopt to additional architectures, use helper
functions to abstract the platform details away.
Reviewed by Nathan Bossart
Discussion: https://www.postgresql.org/message-id/CAFBsxsG%3Dk8t%3DC457FXnoBXb%3D8iA4OaZkbFogFMachWif7mNnww%40mail.gmail.com
The comment in basebackup.c updated by 33bd4698c11 was actually
obsolete to begin with, since the symbols it was referring to haven't
existed in that header file for quite some time. The header file is
still needed for other reasons, though, so keep the #include, just
drop the comment.
SUSv3 <netinet/in.h> defines struct sockaddr_in6, and all targeted Unix
systems have it. Windows has it in <ws2ipdef.h>. Remove the configure
probe, the macro and a small amount of dead code.
Also remove a mention of IPv6-less builds from the documentation, since
there aren't any.
This is similar to commits f5580882 and 077bf2f2 for Unix sockets. Even
though AF_INET6 is an "optional" component of SUSv3, there are no known
modern operating system without it, and it seems even less likely to be
omitted from future systems than AF_UNIX.
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA+hUKGKErNfhmvb_H0UprEmp4LPzGN06yR2_0tYikjzB-2ECMw@mail.gmail.com
In a similar effort to f01592f91, here we're targetting fixing the
warnings where we've deemed the shadowing variable to serve a close enough
purpose to the shadowed variable just to reuse the shadowed version and
not declare the shadowing variable at all.
By my count, this takes the warning count from 106 down to 71.
Author: Justin Pryzby
Discussion: https://postgr.es/m/20220825020839.GT2342@telsasoft.com
The GRANT statement can now specify WITH INHERIT TRUE or WITH
INHERIT FALSE to control whether the member inherits the granted
role's permissions. For symmetry, you can now likewise write
WITH ADMIN TRUE or WITH ADMIN FALSE to turn ADMIN OPTION on or off.
If a GRANT does not specify WITH INHERIT, the behavior based on
whether the member role is marked INHERIT or NOINHERIT. This means
that if all roles are marked INHERIT or NOINHERIT before any role
grants are performed, the behavior is identical to what we had before;
otherwise, it's different, because ALTER ROLE [NO]INHERIT now only
changes the default behavior of future grants, and has no effect on
existing ones.
Patch by me. Reviewed and testing by Nathan Bossart and Tushar Ahuja,
with design-level comments from various others.
Discussion: http://postgr.es/m/CA+Tgmoa5Sf4PiWrfxA=sGzDKg0Ojo3dADw=wAHOhR9dggV=RmQ@mail.gmail.com
The dependencies here aren't quite right independent of vpath builds or not,
but this at least makes vpath builds succeed. And it's pretty rare to change
the exports.txt file anyway... The referenced thread has a patch that will
clean that up further.
Discussion: https://postgr.es/m/20220820174213.d574qde4ptwdzoqz@awork3.anarazel.de
This is preparatory work for a project to increase the number of bits
in a RelFileNumber from 32 to 56.
Along the way, introduce static inline accessor functions for a couple
of BufferTag fields.
Dilip Kumar, reviewed by me. The overall patch series has also had
review at various times from Andres Freund, Ashutosh Sharma, Hannu
Krosing, Vignesh C, Álvaro Herrera, and Tom Lane.
Discussion: http://postgr.es/m/CAFiTN-trubju5YbWAq-BSpZ90-Z6xCVBQE8BVqXqANOZAF1Znw@mail.gmail.com
SplitToVariants() in the ispell code, lseg_inside_poly() in geo_ops.c,
and regex_selectivity_sub() in selectivity estimation could recurse
until stack overflow; fix by adding check_stack_depth() calls.
So could next() in the regex compiler, but that case is better fixed by
converting its tail recursion to a loop. (We probably get better code
that way too, since next() can now be inlined into its sole caller.)
There remains a reachable stack overrun in the Turkish stemmer, but
we'll need some advice from the Snowball people about how to fix that.
Per report from Egor Chindyaskin and Alexander Lakhin. These mistakes
are old, so back-patch to all supported branches.
Richard Guo and Tom Lane
Discussion: https://postgr.es/m/1661334672.728714027@f473.i.mail.ru
These should have been included in 421892a19 as these shadowed variable
warnings can also be fixed by adjusting the scope of the shadowed variable
to put the declaration for it in an inner scope.
This is part of the same effort as f01592f91.
By my count, this takes the warning count from 114 down to 106.
Author: David Rowley and Justin Pryzby
Discussion: https://postgr.es/m/CAApHDvrwLGBP%2BYw9vriayyf%3DXR4uPWP5jr6cQhP9au_kaDUhbA%40mail.gmail.com
It is not customary to install a shared library with a minor version
number (libpq.5.16.dylib) on macOS. We just need the file with the
major version number (libpq.5.dylib) and the one without version
number (libpq.dylib). This also matches the installation layout used
by Meson.
Discussion: https://www.postgresql.org/message-id/e0c44fb2-8b66-a4b9-b274-7ed3a1a0ab74@enterprisedb.com
This commit moves authn_id into a new global structure called
ClientConnectionInfo (mapping to a MyClientConnectionInfo for each
backend) which is intended to hold all the client information that
should be shared between the backend and any of its parallel workers,
access for extensions and triggers being the primary use case. There is
no need to push all the data of Port to the workers, and authn_id is
quite a generic concept so using a separate structure provides the best
balance (the name of the structure has been suggested by Robert Haas).
While on it, and per discussion as this would be useful for a potential
SYSTEM_USER that can be accessed through parallel workers, a second
field is added for the authentication method, copied directly from
Port.
ClientConnectionInfo is serialized and restored using a new parallel
key and a structure tracks the length of the authn_id, making the
addition of more fields straight-forward.
Author: Jacob Champion
Reviewed-by: Bertrand Drouvot, Stephen Frost, Robert Haas, Tom Lane,
Michael Paquier, Julien Rouhaud
Discussion: https://postgr.es/m/793d990837ae5c06a558d58d62de9378ab525d83.camel@vmware.com
In a similar effort to f01592f91, here we're targetting fixing the
warnings that -Wshadow=compatible-local produces that we can fix by moving
a variable to an inner scope to stop that variable from being shadowed by
another variable declared somewhere later in the function.
All of the warnings being fixed here are changing the scope of variables
which are being used as an iterator for a "for" loop. In each instance,
the fix happens to be changing the for loop to use the C99 type
initialization. Much of this code likely pre-dates our use of C99.
Reducing the scope of the outer scoped variable seems like the safest way
to fix these. Renaming seems more likely to risk patches using the wrong
variable. Reducing the scope is more likely to result in a compilation
failure after applying some future patch rather than introducing bugs with
it.
By my count, this takes the warning count from 129 down to 114.
Author: Justin Pryzby
Discussion: https://postgr.es/m/CAApHDvrwLGBP%2BYw9vriayyf%3DXR4uPWP5jr6cQhP9au_kaDUhbA%40mail.gmail.com