Fix lquery parsing to handle repeated flag characters correctly,
and to enforce the max label length correctly in some cases where
it did not before, and to detect empty labels in some cases where
it did not before.
In a more cosmetic vein, use a switch rather than if-then chains to
handle the different states, and avoid unnecessary checks on charlen
when looking for ASCII characters, and factor out multiple copies of
the label length checking code.
Tom Lane and Dmitry Belyavsky
Discussion: https://postgr.es/m/CADqLbzLVkBuPX0812o+z=c3i6honszsZZ6VQOSKR3VPbB56P3w@mail.gmail.com
The existing implementation of the ltree ~ lquery match operator is
sufficiently complex and undocumented that it's hard to tell exactly
what it does. But one thing it clearly gets wrong is the combination
of NOT symbols (!) and '*' symbols. A pattern such as '*.!foo.*'
should, by any ordinary understanding of regular expression behavior,
match any ltree that has at least one label that's not "foo". As best
we can tell by experimentation, what it's actually matching is any
ltree in which *no* label is "foo". That's surprising, and not at all
what the documentation says.
Now, that's arguably a useful behavior, so if we rewrite to fix the
bug we should provide some other way to get it. To do so, add the
ability to attach lquery quantifiers to non-'*' items as well as '*'s.
Then the pattern '!foo{,}' expresses "any ltree in which no label is
foo". For backwards compatibility, the default quantifier for non-'*'
items has to be "{1}", although the default for '*' items is '{,}'.
I wouldn't have done it like that in a green field, but it's not
totally horrible.
Armed with that, rewrite checkCond() from scratch. Treating '*' and
non-'*' items alike makes it simpler, not more complicated, so that
the function actually gets a lot shorter than it was.
Filip Rembiałkowski, Tom Lane, Nikita Glukhov, per a very
ancient bug report from M. Palm
Discussion: https://postgr.es/m/CAP_rww=waX2Oo6q+MbMSiZ9ktdj6eaJj0cQzNu=Ry2cCDij5fw@mail.gmail.com
Ensure that the type name is mentioned in all cases (bare "syntax error"
isn't that helpful). Avoid using the term "level", since that's not
used in the documentation. Phrase error position reports as "at
character N" not "in position N"; the latter seems ambiguous, and it's
certainly not how we say it elsewhere. For the same reason, make the
character position values be 1-based not 0-based. Provide a position
in more cases. (I continued to leave that out of messages complaining
about end-of-input, where it seemed pointless, as well as messages
complaining about overall input complexity, where fingering any one part
of the input would be arbitrary.)
Discussion: https://postgr.es/m/15582.1585529626@sss.pgh.pa.us
PostgreSQL provides set of template index access methods, where opclasses have
much freedom in the semantics of indexing. These index AMs are GiST, GIN,
SP-GiST and BRIN. There opclasses define representation of keys, operations on
them and supported search strategies. So, it's natural that opclasses may be
faced some tradeoffs, which require user-side decision. This commit implements
opclass parameters allowing users to set some values, which tell opclass how to
index the particular dataset.
This commit doesn't introduce new storage in system catalog. Instead it uses
pg_attribute.attoptions, which is used for table column storage options but
unused for index attributes.
In order to evade changing signature of each opclass support function, we
implement unified way to pass options to opclass support functions. Options
are set to fn_expr as the constant bytea expression. It's possible due to the
fact that opclass support functions are executed outside of expressions, so
fn_expr is unused for them.
This commit comes with some examples of opclass options usage. We parametrize
signature length in GiST. That applies to multiple opclasses: tsvector_ops,
gist__intbig_ops, gist_ltree_ops, gist__ltree_ops, gist_trgm_ops and
gist_hstore_ops. Also we parametrize maximum number of integer ranges for
gist__int_ops. However, the main future usage of this feature is expected
to be json, where users would be able to specify which way to index particular
json parts.
Catversion is bumped.
Discussion: https://postgr.es/m/d22c3a18-31c7-1879-fc11-4c1ce2f5e5af%40postgrespro.ru
Author: Nikita Glukhov, revised by me
Reviwed-by: Nikolay Shaplov, Robert Haas, Tom Lane, Tomas Vondra, Alvaro Herrera
Something like "*{2}.*{3}" should presumably mean the same as
"*{5}", but it didn't. Improve that.
Get rid of an undocumented and remarkably ugly (though not, as far as
I can tell, actually unsafe) static variable in favor of passing more
arguments to checkCond().
Reverse-engineer some commentary. This function, like all of ltree,
is still far short of what I would consider the minimum acceptable
level of internal documentation, but at least now it has more than
zero comments.
Although this certainly seems like a bug fix, people might not
thank us for changing query behavior in stable branches, so
no back-patch.
Nikita Glukhov, with cosmetic improvements by me
Discussion: https://postgr.es/m/CAP_rww=waX2Oo6q+MbMSiZ9ktdj6eaJj0cQzNu=Ry2cCDij5fw@mail.gmail.com
These uint16 fields could be overflowed by excessively long input,
producing strange results. Complain for invalid input.
Likewise check for out-of-range values of the repeat counts in lquery.
(We don't try too hard on that one, notably not bothering to detect
if atoi's result has overflowed.)
Also detect length overflow in ltree_concat.
In passing, be more consistent about whether "syntax error" messages
include the type name. Also, clarify the documentation about what
the size limit is.
This has been broken for a long time, so back-patch to all supported
branches.
Nikita Glukhov, reviewed by Benjie Gillam and Tomas Vondra
Discussion: https://postgr.es/m/CAP_rww=waX2Oo6q+MbMSiZ9ktdj6eaJj0cQzNu=Ry2cCDij5fw@mail.gmail.com
lca_inner() wasn't prepared for the possibility of getting no inputs.
Fix that, and make some cosmetic improvements to the code while at it.
Also, I thought the documentation of this function as returning the
"longest common prefix" of the paths was entirely misleading; it really
returns a path one shorter than the longest common prefix, for the typical
definition of "prefix". Don't use that term in the docs, and adjust the
examples to clarify what really happens.
This has been broken since its beginning, so back-patch to all supported
branches.
Per report from Hailong Li. Thanks to Pierre Ducroquet for diagnosing
and for the initial patch, though I whacked it around some and added
test cases.
Discussion: https://postgr.es/m/5b0d8e4f-f2a3-1305-d612-e00e35a7be66@qunar.com
I'd supposed that people would do this manually when creating new operator
classes, but the folly of that was exposed today. The tests seem fast
enough that we can just apply them during the normal regression tests.
contrib/isn fails the checks for lack of complete sets of cross-type
operators. That's a nice-to-have policy rather than a functional
requirement, so leave it as-is, but insert ORDER BY in the query to
ensure consistent cross-platform output.
Discussion: https://postgr.es/m/7076.1480446837@sss.pgh.pa.us
This isn't fully tested as yet, in particular I'm not sure that the
"foo--unpackaged--1.0.sql" scripts are OK. But it's time to get some
buildfarm cycles on it.
sepgsql is not converted to an extension, mainly because it seems to
require a very nonstandard installation process.
Dimitri Fontaine and Tom Lane
about best practice for including the module creation scripts: to wit
that you should suppress NOTICE messages. This avoids creating
regression failures by adding or removing comment lines in the module
scripts.
only remnant of this failed experiment is that the server will take
SET AUTOCOMMIT TO ON. Still TODO: provide some client-side autocommit
logic in libpq.
ltree_73.patch.gz - for 7.3 :
Fix ~ operation bug: eg '1.1.1' ~ '*.1'
ltree_74.patch.gz - for current CVS
Fix ~ operation bug: eg '1.1.1' ~ '*.1'
Add ? operation
Optimize index storage
Last change needs drop/create all ltree indexes, so only for 7.4
Teodor Sigaev
Create objects in public schema.
Make spacing/capitalization consistent.
Remove transaction block use for object creation.
Remove unneeded function GRANTs.