Adjust text search documentation for recent commits.
Fix some now-obsolete statements that were overlooked in commits 6734a1cac, 3dbbd0f02, 028350f61. Document the behavior of <0>. Also do a little bit of rearranging and copy-editing for clarity.
parent 8dee039fa1
commit 4242a715c3
@@ -3885,12 +3885,12 @@ SELECT 'a:1A fat:2B,4C cat:5D'::tsvector;

 <para>
 It is important to understand that the
-<type>tsvector</type> type itself does not perform any normalization;
-it assumes the words it is given are normalized appropriately
-for the application. For example,
+<type>tsvector</type> type itself does not perform any word
+normalization; it assumes the words it is given are normalized
+appropriately for the application. For example,

 <programlisting>
-select 'The Fat Rats'::tsvector;
+SELECT 'The Fat Rats'::tsvector;
 tsvector
 --------------------
 'Fat' 'Rats' 'The'
@@ -3929,12 +3929,20 @@ SELECT to_tsvector('english', 'The Fat Rats');
 <literal><-></> (FOLLOWED BY). There is also a variant
 <literal><<replaceable>N</>></literal> of the FOLLOWED BY
 operator, where <replaceable>N</> is an integer constant that
-specifies a maximum distance between the two lexemes being searched
+specifies the distance between the two lexemes being searched
 for. <literal><-></> is equivalent to <literal><1></>.
 </para>

 <para>
-Parentheses can be used to enforce grouping of the operators:
+Parentheses can be used to enforce grouping of these operators.
+In the absence of parentheses, <literal>!</> (NOT) binds most tightly,
+<literal><-></literal> (FOLLOWED BY) next most tightly, then
+<literal>&</literal> (AND), with <literal>|</literal> (OR) binding
+the least tightly.
+</para>
+
+<para>
+Here are some examples:

 <programlisting>
 SELECT 'fat & rat'::tsquery;
@@ -3951,17 +3959,21 @@ SELECT 'fat & rat & ! cat'::tsquery;
 tsquery
 ------------------------
 'fat' & 'rat' & !'cat'
+
+SELECT '(fat | rat) <-> cat'::tsquery;
+tsquery
+-----------------------------------
+'fat' <-> 'cat' | 'rat' <-> 'cat'
 </programlisting>

-In the absence of parentheses, <literal>!</> (NOT) binds most tightly,
-and <literal>&</literal> (AND) and <literal><-></literal> (FOLLOWED BY)
-both bind more tightly than <literal>|</literal> (OR).
+The last example demonstrates that <type>tsquery</type> sometimes
+rearranges nested operators into a logically equivalent formulation.
 </para>

 <para>
 Optionally, lexemes in a <type>tsquery</type> can be labeled with
 one or more weight letters, which restricts them to match only
-<type>tsvector</> lexemes with matching weights:
+<type>tsvector</> lexemes with one of those weights:

 <programlisting>
 SELECT 'fat:ab & cat'::tsquery;
@@ -3981,25 +3993,7 @@ SELECT 'super:*'::tsquery;
 'super':*
 </programlisting>
 This query will match any word in a <type>tsvector</> that begins
-with <quote>super</>. Note that prefixes are first processed by
-text search configurations, which means this comparison returns
-true:
-<programlisting>
-SELECT to_tsvector( 'postgraduate' ) @@ to_tsquery( 'postgres:*' );
-?column?
-----------
-t
-(1 row)
-</programlisting>
-because <literal>postgres</> gets stemmed to <literal>postgr</>:
-<programlisting>
-SELECT to_tsquery('postgres:*');
-to_tsquery
-------------
-'postgr':*
-(1 row)
-</programlisting>
-which then matches <literal>postgraduate</>.
+with <quote>super</>.
 </para>

 <para>
@@ -4015,6 +4009,24 @@ SELECT to_tsquery('Fat:ab & Cats');
 ------------------
 'fat':AB & 'cat'
 </programlisting>
+
+Note that <function>to_tsquery</> will process prefixes in the same way
+as other words, which means this comparison returns true:
+
+<programlisting>
+SELECT to_tsvector( 'postgraduate' ) @@ to_tsquery( 'postgres:*' );
+?column?
+----------
+t
+</programlisting>
+because <literal>postgres</> gets stemmed to <literal>postgr</>:
+<programlisting>
+SELECT to_tsvector( 'postgraduate' ), to_tsquery( 'postgres:*' );
+to_tsvector | to_tsquery
+---------------+------------
+'postgradu':1 | 'postgr':*
+</programlisting>
+which will match the stemmed form of <literal>postgraduate</>.
 </para>

 </sect2>

@@ -322,8 +322,7 @@ text @@ text
 match. Similarly, the <literal>|</literal> (OR) operator specifies that
 at least one of its arguments must appear, while the <literal>!</> (NOT)
 operator specifies that its argument must <emphasis>not</> appear in
-order to have a match. Parentheses can be used to control nesting of
-these operators.
+order to have a match.
 </para>

 <para>
@@ -346,10 +345,10 @@ SELECT to_tsvector('error is not fatal') @@ to_tsquery('fatal <-> error');

 There is a more general version of the FOLLOWED BY operator having the
 form <literal><<replaceable>N</>></literal>,
-where <replaceable>N</> is an integer standing for the exact distance
-allowed between the matching lexemes. <literal><1></literal> is
+where <replaceable>N</> is an integer standing for the difference between
+the positions of the matching lexemes. <literal><1></literal> is
 the same as <literal><-></>, while <literal><2></literal>
-allows one other lexeme to appear between the matches, and so
+allows exactly one other lexeme to appear between the matches, and so
 on. The <literal>phraseto_tsquery</> function makes use of this
 operator to construct a <literal>tsquery</> that can match a multi-word
 phrase when some of the words are stop words. For example:
@@ -366,9 +365,17 @@ SELECT phraseto_tsquery('the cats ate the rats');
 'cat' <-> 'ate' <2> 'rat'
 </programlisting>
 </para>

 <para>
-The precedence of tsquery operators is as follows: <literal>|</literal>, <literal>&</literal>,
-<literal><-></literal>, <literal>!</literal>.
+A special case that's sometimes useful is that <literal><0></literal>
+can be used to require that two patterns match the same word.
+</para>
+
+<para>
+Parentheses can be used to control nesting of the <type>tsquery</>
+operators. Without parentheses, <literal>|</literal> binds least tightly,
+then <literal>&</literal>, then <literal><-></literal>,
+and <literal>!</literal> most tightly.
 </para>
 </sect2>

@@ -1423,9 +1430,10 @@ FROM (SELECT id, body, q, ts_rank_cd(ti, q) AS rank
 lacks any position or weight information. The result is usually much
 smaller than an unstripped vector, but it is also less useful.
 Relevance ranking does not work as well on stripped vectors as
-unstripped ones. Also, when given stripped input,
+unstripped ones. Also,
 the <literal><-></> (FOLLOWED BY) <type>tsquery</> operator
-effectively degenerates to a simple <literal>&</> (AND) test.
+will never match stripped input, since it cannot determine the
+distance between lexeme occurrences.
 </para>
 </listitem>

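
The distance semantics documented by this commit can be illustrated with a small model. This is a minimal sketch, not PostgreSQL's implementation: the function name phrase_match, the dict-of-positions representation of a tsvector, and the lexeme positions in the usage note are all assumptions made for illustration. It shows three points from the revised text: <N> requires the position difference to be exactly N, <0> requires both patterns to match the same position, and a stripped vector (no positions) can never satisfy a FOLLOWED BY test.

```python
def phrase_match(vec, first, steps):
    """Model of chained FOLLOWED BY matching (illustrative only).

    vec   -- hypothetical tsvector model: {lexeme: [positions]};
             a stripped vector has an empty position list per lexeme
    first -- the leftmost lexeme of the phrase
    steps -- list of (distance, lexeme) pairs, one per <N> operator
    """
    # Positions at which the phrase matched so far can end.
    candidates = set(vec.get(first, []))
    for distance, lexeme in steps:
        positions = set(vec.get(lexeme, []))
        # The next lexeme must sit exactly `distance` positions later;
        # a smaller gap does not count, so N is not a maximum.
        candidates = {p + distance for p in candidates} & positions
    return bool(candidates)
```

For example, assuming 'the cats ate the rats' is indexed as {"cat": [2], "ate": [3], "rat": [5]}, the query 'cat' <-> 'ate' <2> 'rat' corresponds to phrase_match(vec, "cat", [(1, "ate"), (2, "rat")]) and matches, while phrase_match(vec, "cat", [(4, "rat")]) does not, because the actual difference is 3. With every position list emptied, as after strip(), no phrase matches at all.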