The chr() function used PG_GETARG_UINT32() even though the argument is
declared as (signed) integer. As a result, you can pass negative
arguments to this function and it internally interprets them as
positive. This ultimately ends up being harmless, but it seems wrong,
so fix it and rearrange the internal error checking a bit to
accommodate the change.
Another case was in the documentation, where example code used
PG_GETARG_UINT32() with an argument declared as signed integer.
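
To make the signed/unsigned mismatch concrete, here is a minimal
standalone sketch. It is not the PostgreSQL source: the function names
chr_arg_to_codepoint and reinterpret_as_unsigned are hypothetical, and
returning false stands in for raising an error.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the argument check in chr(); not the actual
 * PostgreSQL code.  Returns true and stores the code point if the signed
 * argument is usable, false where the real function would raise an error.
 */
static bool
chr_arg_to_codepoint(int32_t arg, uint32_t *cvalue)
{
    /* Reject negative input before any conversion to unsigned. */
    if (arg < 0)
        return false;

    /* Zero is rejected as well: chr(0) is an error. */
    if (arg == 0)
        return false;

    *cvalue = (uint32_t) arg;
    return true;
}

/* What fetching the same signed value as unsigned effectively did. */
static uint32_t
reinterpret_as_unsigned(int32_t arg)
{
    return (uint32_t) arg;      /* -1 becomes 4294967295 */
}
```

Checking the sign while the value is still signed keeps the range test
explicit instead of relying on a large unsigned value failing a later
upper-bound check.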
Reviewed-by: Nathan Bossart <bossartn@amazon.com>
Discussion: https://www.postgresql.org/message-id/flat/7e43869b-d412-8f81-30a3-809783edc9a3%40enterprisedb.com
The repeat() function loops for potentially a long time without
ever checking for interrupts. This prevents, for example, a query
cancel from interrupting until the work is all done. Fix by
inserting a CHECK_FOR_INTERRUPTS() into the loop.
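
The shape of the fix can be sketched outside the backend roughly as
follows. This is an illustration under assumptions, not the committed
code: interrupt_pending and repeat_string are hypothetical names, and
returning NULL stands in for the error exit that
CHECK_FOR_INTERRUPTS() performs in the server.

```c
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the backend's pending-interrupt state. */
static volatile sig_atomic_t interrupt_pending = 0;

/*
 * Build count copies of s in a freshly allocated buffer.  Returns NULL if
 * an interrupt is pending, if the result would be too large, or if
 * allocation fails.  (Illustrative only; the server reports these cases
 * through its error machinery instead.)
 */
static char *
repeat_string(const char *s, int count)
{
    size_t      slen = strlen(s);
    size_t      total;
    char       *result;
    char       *cp;

    if (count < 0)
        count = 0;

    /* Guard the size computation against overflow. */
    if (slen != 0 && (size_t) count > (SIZE_MAX - 1) / slen)
        return NULL;
    total = slen * (size_t) count;

    result = malloc(total + 1);
    if (result == NULL)
        return NULL;

    cp = result;
    for (int i = 0; i < count; i++)
    {
        /* The point of the fix: poll for cancellation on every pass. */
        if (interrupt_pending)
        {
            free(result);
            return NULL;
        }
        memcpy(cp, s, slen);
        cp += slen;
    }
    *cp = '\0';
    return result;
}
```

The check is cheap when nothing is pending, so placing it inside the
loop costs little compared with copying the string.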
Backpatch to all supported versions.
Discussion: https://www.postgresql.org/message-id/flat/8692553c-7fe8-17d9-cbc1-7cddb758f4c6%40joeconway.com
Similar to commits 7e735035f2 and dddf4cdc33, this commit makes the order
of header file inclusion consistent for backend modules.
In passing, remove a couple of duplicate inclusions.
Author: Vignesh C
Reviewed-by: Kuntal Ghosh and Amit Kapila
Discussion: https://postgr.es/m/CALDaNm2Sznv8RR6Ex-iJO6xAdsxgWhCoETkaYX=+9DW3q0QCfA@mail.gmail.com
A previous commit added inline functions that provide fast(er) and
correct overflow checks for signed integer math. Use them in a
significant portion of backend code. There's more to touch in both
backend and frontend code, but these were the easily identifiable
cases.
The old overflow checks have a noticeable cost in integer-heavy workloads.
A secondary benefit is that getting rid of overflow checks that rely
on signed integer overflow wrapping around will allow us to get rid
of -fwrapv in the future, a flag which in turn slows down other code.
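
For illustration, an overflow-checked addition in this style might look
like the sketch below. The name add_s32_overflow is hypothetical and
this is not the code from the earlier commit; it assumes the compiler
builtin __builtin_add_overflow is available on GCC and Clang, and falls
back to 64-bit arithmetic elsewhere.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Add a and b.  Returns true if the sum does not fit in int32_t; in that
 * case callers must not rely on the contents of *result.  Otherwise
 * stores the sum in *result and returns false.
 */
static inline bool
add_s32_overflow(int32_t a, int32_t b, int32_t *result)
{
#if defined(__GNUC__) || defined(__clang__)
    /* The compiler emits an add plus a test of the overflow flag. */
    return __builtin_add_overflow(a, b, result);
#else
    /* Portable fallback: widen, then range-check the exact sum. */
    int64_t     res = (int64_t) a + (int64_t) b;

    if (res > INT32_MAX || res < INT32_MIN)
        return true;
    *result = (int32_t) res;
    return false;
#endif
}
```

Neither branch performs a signed addition that overflows, so the check
does not depend on wraparound semantics.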
Author: Andres Freund
Discussion: https://postgr.es/m/20171024103954.ztmatprlglz3rwke@alap3.anarazel.de
Several years ago we changed chr(int) so that if the database encoding is
UTF8, it would interpret its argument as a Unicode code point and expand it
into the appropriate multibyte sequence. However, we weren't sufficiently
careful about checking validity of the input. According to RFC3629, UTF8
disallows code points above U+10FFFF (note that the predecessor standard
RFC2279 was more liberal). Also, both versions of the UTF8 spec agree
that Unicode surrogate-pair codes should never appear in UTF8. Because
our encoding validity checks follow RFC3629, our failure to enforce these
restrictions in chr() means it could be used to produce text strings that
will be rejected when the database is dumped and reloaded. To ensure
consistency with the input functions, let's actually apply
pg_utf8_islegal() to the proposed output of chr().
Per discussion, this seems like too much of a behavioral change to
back-patch, but it's not too late to squeeze it into 9.4.
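
The two restrictions can be illustrated with a small standalone
encoder. Note the difference from the commit: the fix applies
pg_utf8_islegal() to the encoded bytes, whereas this simplified sketch
tests the code point before encoding. The names
codepoint_is_legal_utf8 and encode_utf8 are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* RFC 3629 limits: nothing above U+10FFFF, and no surrogate codes. */
static bool
codepoint_is_legal_utf8(uint32_t cp)
{
    if (cp > 0x10FFFF)
        return false;
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return false;
    return true;
}

/*
 * Encode cp as UTF-8 into buf, which must have room for 4 bytes.
 * Returns the number of bytes written (1 to 4), or 0 if cp is not a
 * legal code point.
 */
static int
encode_utf8(uint32_t cp, unsigned char *buf)
{
    if (!codepoint_is_legal_utf8(cp))
        return 0;

    if (cp <= 0x7F)
    {
        buf[0] = (unsigned char) cp;
        return 1;
    }
    if (cp <= 0x7FF)
    {
        buf[0] = (unsigned char) (0xC0 | (cp >> 6));
        buf[1] = (unsigned char) (0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp <= 0xFFFF)
    {
        buf[0] = (unsigned char) (0xE0 | (cp >> 12));
        buf[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        buf[2] = (unsigned char) (0x80 | (cp & 0x3F));
        return 3;
    }
    buf[0] = (unsigned char) (0xF0 | (cp >> 18));
    buf[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
    buf[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
    buf[3] = (unsigned char) (0x80 | (cp & 0x3F));
    return 4;
}
```

Validating the encoded output, as the commit does, has the advantage of
reusing exactly the check that the input functions apply.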
This adds collation support for columns and domains, a COLLATE clause
to override it per expression, and B-tree index support.
Peter Eisentraut
reviewed by Pavel Stehule, Itagaki Takahiro, Robert Haas, Noah Misch
warnings. Clean up various unneeded cruft that was left behind after
creating those routines. Introduce some convenience functions str_tolower_z
etc to eliminate tedious and error-prone double arguments in formatting.c.
(Currently there seems no need to export the latter, but maybe reconsider
this later.)
strings. This patch introduces four support functions cstring_to_text,
cstring_to_text_with_len, text_to_cstring, and text_to_cstring_buffer, and
two macros CStringGetTextDatum and TextDatumGetCString. A number of
existing macros that provided variants on these themes were removed.
Most of the places that need to make such conversions now require just one
function or macro call, in place of the multiple notational layers that used
to be needed. There are no longer any direct calls of textout or textin,
and we got most of the places that were using handmade conversions via
memcpy (there may be a few still lurking, though).
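
As a rough picture of what such helpers do, the sketch below converts
between a NUL-terminated C string and a length-prefixed value. It is a
toy model only: toy_text and the toy_* functions are hypothetical, use
malloc rather than the backend allocator, and ignore the real varlena
header format and detoasting.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Toy length-prefixed string; not the server's text representation. */
typedef struct toy_text
{
    uint32_t    len;            /* payload length in bytes */
    char        data[];         /* payload, not NUL-terminated */
} toy_text;

/* Copy len bytes of s into a new toy_text; returns NULL on failure. */
static toy_text *
toy_cstring_to_text_with_len(const char *s, size_t len)
{
    toy_text   *t = malloc(sizeof(toy_text) + len);

    if (t == NULL)
        return NULL;
    t->len = (uint32_t) len;
    memcpy(t->data, s, len);
    return t;
}

/* Convenience wrapper for NUL-terminated input. */
static toy_text *
toy_cstring_to_text(const char *s)
{
    return toy_cstring_to_text_with_len(s, strlen(s));
}

/* Return a newly allocated NUL-terminated copy of the payload. */
static char *
toy_text_to_cstring(const toy_text *t)
{
    char       *s = malloc((size_t) t->len + 1);

    if (s == NULL)
        return NULL;
    memcpy(s, t->data, t->len);
    s[t->len] = '\0';
    return s;
}
```

Keeping the length arithmetic and the copy inside one function is what
lets callers replace a handmade memcpy sequence with a single call.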
This commit doesn't make any serious effort to eliminate transient memory
leaks caused by detoasting toasted text objects before they reach
text_to_cstring. We changed PG_GETARG_TEXT_P to PG_GETARG_TEXT_PP in a few
places where it was easy, but much more could be done.
Brendan Jurd and Tom Lane
to not cause needless copying of text datums that have 1-byte headers.
Greg Stark, in response to performance gripe from Guillaume Smet and
ITAGAKI Takahiro.
database via builtin functions, as recently discussed on -hackers.
chr() now returns a character in the database encoding. For UTF8 encoded databases
the argument is treated as a Unicode code point. For other multi-byte encodings
the argument must designate a strict ASCII character, or an error is raised,
as is also the case if the argument is 0.
ascii() is adjusted so that it remains the inverse of chr().
The two argument form of convert() is gone, and the three argument form now
takes a bytea first argument and returns a bytea. To cover this loss three new
functions are introduced:
. convert_from(bytea, name) returns text - converts the first argument from the
named encoding to the database encoding
. convert_to(text, name) returns bytea - converts the first argument from the
database encoding to the named encoding
. length(bytea, name) returns int - gives the length of the first argument in
characters in the named encoding
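
To show why a length in characters depends on the encoding, here is a
minimal sketch for the UTF8 case only. The function utf8_char_length is
hypothetical and is not how the server implements length(bytea, name);
it assumes the input is already valid UTF-8 and simply counts the bytes
that start a character.

```c
#include <stddef.h>

/*
 * Count characters in a buffer assumed to hold valid UTF-8.  Every
 * character has exactly one byte that is not a continuation byte
 * (continuation bytes have the bit pattern 10xxxxxx).
 */
static size_t
utf8_char_length(const char *buf, size_t nbytes)
{
    size_t      nchars = 0;

    for (size_t i = 0; i < nbytes; i++)
    {
        unsigned char c = (unsigned char) buf[i];

        if ((c & 0xC0) != 0x80)
            nchars++;
    }
    return nchars;
}
```

The same six bytes would count as six characters in a single-byte
encoding, which is why the function takes the encoding name as its
second argument.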
Get rid of VARATT_SIZE and VARATT_DATA, which were simply redundant with
VARSIZE and VARDATA, and as a consequence almost no code was using the
longer names. Rename the length fields of struct varlena and various
derived structures to catch anyplace that was accessing them directly;
and clean up various places so caught. In itself this patch doesn't
change any behavior at all, but it is necessary infrastructure if we hope
to play any games with the representation of varlena headers.
Greg Stark and Tom Lane