Sync examples of psql \dF output with current CVS HEAD behavior.
Random other wordsmithing.
parent 6d871a2538
commit fcc6756341
@ -1,7 +1,15 @@
|
|||||||
|
<!-- $PostgreSQL: pgsql/doc/src/sgml/textsearch.sgml,v 1.16 2007/09/04 03:46:36 tgl Exp $ -->
|
||||||
|
|
||||||
<chapter id="textsearch">
|
<chapter id="textsearch">
|
||||||
|
<title id="textsearch-title">Full Text Search</title>
|
||||||
|
|
||||||
<title>Full Text Search</title>
|
<indexterm zone="textsearch">
|
||||||
|
<primary>full text search</primary>
|
||||||
|
</indexterm>
|
||||||
|
|
||||||
|
<indexterm zone="textsearch">
|
||||||
|
<primary>text search</primary>
|
||||||
|
</indexterm>
|
||||||
|
|
||||||
<sect1 id="textsearch-intro">
|
<sect1 id="textsearch-intro">
|
||||||
<title>Introduction</title>
|
<title>Introduction</title>
|
||||||
@ -67,43 +75,52 @@
|
|||||||
<listitem>
|
<listitem>
|
||||||
<para>
|
<para>
|
||||||
<emphasis>Parsing documents into <firstterm>lexemes</></emphasis>. It is
|
<emphasis>Parsing documents into <firstterm>lexemes</></emphasis>. It is
|
||||||
useful to identify various lexemes, e.g. digits, words, complex words,
|
useful to identify various classes of lexemes, e.g. digits, words,
|
||||||
email addresses, so they can be processed differently. In principle
|
complex words, email addresses, so that they can be processed
|
||||||
lexemes depend on the specific application but for an ordinary search it
|
differently. In principle lexeme classes depend on the specific
|
||||||
is useful to have a predefined list of lexemes. <!-- add list of lexemes.
|
application but for an ordinary search it is useful to have a predefined
|
||||||
-->
|
set of classes.
|
||||||
|
<productname>PostgreSQL</productname> uses a <firstterm>parser</> to
|
||||||
|
perform this step. A standard parser is provided, and custom parsers
|
||||||
|
can be created for specific needs.
|
||||||
</para>
|
</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
|
|
||||||
<listitem>
|
<listitem>
|
||||||
<para>
|
<para>
|
||||||
<emphasis>Dictionaries</emphasis> allow the conversion of lexemes into
|
<emphasis>Converting lexemes into <firstterm>normalized
|
||||||
a <emphasis>normalized form</emphasis> so it is not necessary to enter
|
form</></emphasis>. This allows searches to find variant forms of the
|
||||||
search words in a specific form.
|
same word, without tediously entering all the possible variants.
|
||||||
|
Also, this step typically eliminates <firstterm>stop words</>, which
|
||||||
|
are words that are so common that they are useless for searching.
|
||||||
|
<productname>PostgreSQL</productname> uses <firstterm>dictionaries</> to
|
||||||
|
perform this step. Various standard dictionaries are provided, and
|
||||||
|
custom ones can be created for specific needs.
|
||||||
</para>
|
</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
|
|
||||||
<listitem>
|
<listitem>
|
||||||
<para>
|
<para>
|
||||||
<emphasis>Store</emphasis> preprocessed documents optimized for
|
<emphasis>Storing preprocessed documents optimized for
|
||||||
searching. For example, represent each document as a sorted array
|
searching</emphasis>. For example, each document can be represented
|
||||||
of lexemes. Along with lexemes it is desirable to store positional
|
as a sorted array of normalized lexemes. Along with the lexemes it is
|
||||||
information to use for <varname>proximity ranking</varname>, so that
|
desirable to store positional information to use for <firstterm>proximity
|
||||||
a document which contains a more "dense" region of query words is
|
ranking</firstterm>, so that a document which contains a more
|
||||||
|
<quote>dense</> region of query words is
|
||||||
assigned a higher rank than one with scattered query words.
|
assigned a higher rank than one with scattered query words.
|
||||||
</para>
|
</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
</itemizedlist>
|
</itemizedlist>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
Dictionaries allow fine-grained control over how lexemes are created. With
|
Dictionaries allow fine-grained control over how lexemes are normalized.
|
||||||
dictionaries you can:
|
With dictionaries you can:
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<itemizedlist spacing="compact" mark="bullet">
|
<itemizedlist spacing="compact" mark="bullet">
|
||||||
<listitem>
|
<listitem>
|
||||||
<para>
|
<para>
|
||||||
Define "stop words" that should not be indexed.
|
Define stop words that should not be indexed.
|
||||||
</para>
|
</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
|
|
||||||
@ -135,13 +152,12 @@
|
|||||||
</itemizedlist>
|
</itemizedlist>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
A data type (<xref linkend="datatype-textsearch">), <type>tsvector</type>
|
A data type <type>tsvector</type> is provided for storing preprocessed
|
||||||
is provided, for storing preprocessed documents,
|
documents, along with a type <type>tsquery</type> for representing processed
|
||||||
along with a type <type>tsquery</type> for representing textual
|
queries (<xref linkend="datatype-textsearch">). Also, a full text search
|
||||||
queries. Also, a full text search operator <literal>@@</literal> is defined
|
operator <literal>@@</literal> is defined for these data types (<xref
|
||||||
for these data types (<xref linkend="textsearch-searches">). Full text
|
linkend="textsearch-searches">). Full text searches can be accelerated
|
||||||
searches can be accelerated using indexes (<xref
|
using indexes (<xref linkend="textsearch-indexes">).
|
||||||
linkend="textsearch-indexes">).
|
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
|
|
||||||
@ -154,20 +170,20 @@
|
|||||||
</indexterm>
|
</indexterm>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
A document can be a simple text file stored in the file system. The full
|
A <firstterm>document</> is the unit of searching in a full text search
|
||||||
text indexing engine can parse text files and store associations of lexemes
|
system; for example, a magazine article or email message. The text search
|
||||||
(words) with their parent document. Later, these associations are used to
|
engine must be able to parse documents and store associations of lexemes
|
||||||
search for documents which contain query words. In this case, the database
|
(key words) with their parent document. Later, these associations are
|
||||||
can be used to store the full text index and for executing searches, and
|
used to search for documents which contain query words.
|
||||||
some unique identifier can be used to retrieve the document from the file
|
|
||||||
system.
|
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
A document can also be any textual database attribute or a combination
|
For searches within <productname>PostgreSQL</productname>,
|
||||||
(concatenation), which in turn can be stored in various tables or obtained
|
a document is normally a textual field within a row of a database table,
|
||||||
dynamically. In other words, a document can be constructed from different
|
or possibly a combination (concatenation) of such fields, perhaps stored
|
||||||
parts for indexing and it might not exist as a whole. For example:
|
in several tables or obtained dynamically. In other words, a document can
|
||||||
|
be constructed from different parts for indexing and it might not be
|
||||||
|
stored anywhere as a whole. For example:
|
||||||
|
|
||||||
<programlisting>
|
<programlisting>
|
||||||
SELECT title || ' ' || author || ' ' || abstract || ' ' || body AS document
|
SELECT title || ' ' || author || ' ' || abstract || ' ' || body AS document
|
||||||
@ -184,10 +200,20 @@ WHERE mid = did AND mid = 12;
|
|||||||
<para>
|
<para>
|
||||||
Actually, in the previous example queries, <literal>COALESCE</literal>
|
Actually, in the previous example queries, <literal>COALESCE</literal>
|
||||||
<!-- TODO make this a link? -->
|
<!-- TODO make this a link? -->
|
||||||
should be used to prevent a <literal>NULL</literal> attribute from causing
|
should be used to prevent a single <literal>NULL</literal> attribute from
|
||||||
a <literal>NULL</literal> result.
|
causing a <literal>NULL</literal> result for the whole document.
|
||||||
</para>
|
</para>
|
||||||
</note>
|
</note>
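As a minimal sketch of the note above, each field can be wrapped in <literal>COALESCE</literal> before concatenation, so that one missing attribute does not turn the whole document into <literal>NULL</literal>. The inline <literal>VALUES</literal> row and the alias <literal>messages</literal> are illustrative assumptions, used only so the query runs without any existing table; the column names follow the example query.

```sql
-- Hypothetical one-row data set; the author field is deliberately NULL.
SELECT coalesce(title, '')    || ' ' ||
       coalesce(author, '')   || ' ' ||
       coalesce(abstract, '') || ' ' ||
       coalesce(body, '') AS document
FROM (VALUES ('A title', NULL::text, 'An abstract', 'Body text'))
     AS messages(title, author, abstract, body);
```

Without the <literal>COALESCE</literal> calls, the same query would return <literal>NULL</literal> for this row, because concatenating any <literal>NULL</literal> operand with <literal>||</literal> yields <literal>NULL</literal>.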
|
||||||
|
|
||||||
|
<para>
|
||||||
|
Another possibility is to store the documents as simple text files in the
|
||||||
|
file system. In this case, the database can be used to store the full text
|
||||||
|
index and to execute searches, and some unique identifier can be used to
|
||||||
|
retrieve the document from the file system. However, retrieving files
|
||||||
|
from outside the database requires superuser permissions or special
|
||||||
|
function support, so this is usually less convenient than keeping all
|
||||||
|
the data inside <productname>PostgreSQL</productname>.
|
||||||
|
</para>
|
||||||
</sect2>
|
</sect2>
|
||||||
|
|
||||||
<sect2 id="textsearch-searches">
|
<sect2 id="textsearch-searches">
|
||||||
@ -261,8 +287,9 @@ SELECT 'fat & cow'::tsquery @@ 'a fat cat sat on a mat and ate a fat rat'::t
|
|||||||
<xref linkend="guc-default-text-search-config"> was set accordingly
|
<xref linkend="guc-default-text-search-config"> was set accordingly
|
||||||
in <filename>postgresql.conf</>. If you are using the same text search
|
in <filename>postgresql.conf</>. If you are using the same text search
|
||||||
configuration for the entire cluster you can use the value in
|
configuration for the entire cluster you can use the value in
|
||||||
<filename>postgresql.conf</>. If using different configurations but
|
<filename>postgresql.conf</>. If using different configurations
|
||||||
the same text search configuration for an entire database,
|
throughout the cluster but
|
||||||
|
the same text search configuration for any one database,
|
||||||
use <command>ALTER DATABASE ... SET</>. If not, you must set <varname>
|
use <command>ALTER DATABASE ... SET</>. If not, you must set <varname>
|
||||||
default_text_search_config</varname> in each session. Many functions
|
default_text_search_config</varname> in each session. Many functions
|
||||||
also take an optional configuration name.
|
also take an optional configuration name.
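The three ways of choosing a configuration described above might look as follows. This is a sketch only: <literal>mydb</literal> is a placeholder database name, and <literal>pg_catalog.english</literal> is assumed to be the configuration wanted.

```sql
-- Per-database default; "mydb" is a placeholder database name.
ALTER DATABASE mydb SET default_text_search_config = 'pg_catalog.english';

-- Per-session setting, for when neither the cluster nor the database fixes one.
SET default_text_search_config = 'pg_catalog.english';

-- Passing the configuration name explicitly, independent of the default.
SELECT to_tsvector('english', 'a fat cat sat on a mat')
       @@ to_tsquery('english', 'cat');
```

Naming the configuration explicitly in the function call keeps a query's behavior independent of whatever default happens to be in effect for the session.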
|
||||||
@ -555,7 +582,7 @@ UPDATE tt SET ti=
|
|||||||
|
|
||||||
<term>
|
<term>
|
||||||
<synopsis>
|
<synopsis>
|
||||||
ts_parse(<replaceable class="PARAMETER">parser</replaceable>, <replaceable class="PARAMETER">document</replaceable> TEXT) returns SETOF <type>tokenout</type>
|
ts_parse(<replaceable class="PARAMETER">parser</replaceable>, <replaceable class="PARAMETER">document</replaceable> text, OUT <replaceable class="PARAMETER">tokid</> integer, OUT <replaceable class="PARAMETER">token</> text) returns SETOF RECORD
|
||||||
</synopsis>
|
</synopsis>
|
||||||
</term>
|
</term>
|
||||||
|
|
||||||
@ -588,7 +615,7 @@ SELECT * FROM ts_parse('default','123 - a number');
|
|||||||
|
|
||||||
<term>
|
<term>
|
||||||
<synopsis>
|
<synopsis>
|
||||||
ts_token_type(<replaceable class="PARAMETER">parser</replaceable> ) returns SETOF <type>tokentype</type>
|
ts_token_type(<replaceable class="PARAMETER">parser</>, OUT <replaceable class="PARAMETER">tokid</> integer, OUT <replaceable class="PARAMETER">alias</> text, OUT <replaceable class="PARAMETER">description</> text) returns SETOF RECORD
|
||||||
</synopsis>
|
</synopsis>
|
||||||
</term>
|
</term>
|
||||||
|
|
||||||
@ -1107,20 +1134,20 @@ SELECT ts_lexize('english_stem', 'stars');
|
|||||||
(1 row)
|
(1 row)
|
||||||
</programlisting>
|
</programlisting>
|
||||||
|
|
||||||
Also, the <function>ts_debug</function> function (<xref linkend="textsearch-debugging">)
|
Also, the <function>ts_debug</function> function (<xref
|
||||||
can be used for this.
|
linkend="textsearch-debugging">) is helpful for testing.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<sect2 id="textsearch-stopwords">
|
<sect2 id="textsearch-stopwords">
|
||||||
<title>Stop Words</title>
|
<title>Stop Words</title>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
Stop words are words which are very common, appear in almost
|
Stop words are words which are very common, appear in almost every
|
||||||
every document, and have no discrimination value. Therefore, they can be ignored
|
document, and have no discrimination value. Therefore, they can be ignored
|
||||||
in the context of full text searching. For example, every English text contains
|
in the context of full text searching. For example, every English text
|
||||||
words like <literal>a</literal> although it is useless to store them in an index.
|
contains words like <literal>a</literal> and <literal>the</>, so it is
|
||||||
However, stop words do affect the positions in <type>tsvector</type>,
|
useless to store them in an index. However, stop words do affect the
|
||||||
which in turn, do affect ranking:
|
positions in <type>tsvector</type>, which in turn affect ranking:
|
||||||
|
|
||||||
<programlisting>
|
<programlisting>
|
||||||
SELECT to_tsvector('english','in the list of stop words');
|
SELECT to_tsvector('english','in the list of stop words');
|
||||||
@ -1542,11 +1569,15 @@ SELECT ts_lexize('norwegian_ispell','sjokoladefabrikk');
|
|||||||
<para>
|
<para>
|
||||||
The <application>Snowball</> dictionary template is based on the project
|
The <application>Snowball</> dictionary template is based on the project
|
||||||
of Martin Porter, inventor of the popular Porter's stemming algorithm
|
of Martin Porter, inventor of the popular Porter's stemming algorithm
|
||||||
for the English language and now supported in many languages (see the <ulink
|
for the English language. Snowball now provides stemming algorithms for
|
||||||
url="http://snowball.tartarus.org">Snowball site</ulink> for more
|
many languages (see the <ulink url="http://snowball.tartarus.org">Snowball
|
||||||
information). The Snowball project supplies a large number of stemmers for
|
site</ulink> for more information). Each algorithm understands how to
|
||||||
many languages. A Snowball dictionary requires a language parameter to
|
reduce common variant forms of words to a base, or stem, spelling within
|
||||||
identify which stemmer to use, and optionally can specify a stopword file name.
|
its language. A Snowball dictionary requires a language parameter to
|
||||||
|
identify which stemmer to use, and optionally can specify a stopword file
|
||||||
|
name that gives a list of words to eliminate.
|
||||||
|
(<productname>PostgreSQL</productname>'s standard stopword lists are also
|
||||||
|
provided by the Snowball project.)
|
||||||
For example, there is a built-in definition equivalent to
|
For example, there is a built-in definition equivalent to
|
||||||
|
|
||||||
<programlisting>
|
<programlisting>
|
||||||
@ -1782,7 +1813,7 @@ version of our software: PostgreSQL 8.3.
|
|||||||
|
|
||||||
<programlisting>
|
<programlisting>
|
||||||
=> \dF
|
=> \dF
|
||||||
List of fulltext configurations
|
List of text search configurations
|
||||||
Schema | Name | Description
|
Schema | Name | Description
|
||||||
---------+------+-------------
|
---------+------+-------------
|
||||||
public | pg |
|
public | pg |
|
||||||
@ -2053,24 +2084,24 @@ EXPLAIN SELECT * FROM apod WHERE textsearch @@@ to_tsquery('supernovae:a');
|
|||||||
|
|
||||||
<para>
|
<para>
|
||||||
Information about full text searching objects can be obtained
|
Information about full text searching objects can be obtained
|
||||||
in <literal>psql</literal> using a set of commands:
|
in <application>psql</application> using a set of commands:
|
||||||
<synopsis>
|
<synopsis>
|
||||||
\dF{,d,p}<optional>+</optional> <optional>PATTERN</optional>
|
\dF{d,p,t}<optional>+</optional> <optional>PATTERN</optional>
|
||||||
</synopsis>
|
</synopsis>
|
||||||
An optional <literal>+</literal> produces more details.
|
An optional <literal>+</literal> produces more details.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
The optional parameter <literal>PATTERN</literal> should be the name of
|
The optional parameter <literal>PATTERN</literal> should be the name of
|
||||||
a full text searching object, optionally schema-qualified. If
|
a text searching object, optionally schema-qualified. If
|
||||||
<literal>PATTERN</literal> is not specified then information about all
|
<literal>PATTERN</literal> is not specified then information about all
|
||||||
visible objects will be displayed. <literal>PATTERN</literal> can be a
|
visible objects will be displayed. <literal>PATTERN</literal> can be a
|
||||||
regular expression and can apply <emphasis>separately</emphasis> to schema
|
regular expression and can provide <emphasis>separate</emphasis> patterns
|
||||||
names and object names. The following examples illustrate this:
|
for the schema and object names. The following examples illustrate this:
|
||||||
|
|
||||||
<programlisting>
|
<programlisting>
|
||||||
=> \dF *fulltext*
|
=> \dF *fulltext*
|
||||||
List of fulltext configurations
|
List of text search configurations
|
||||||
Schema | Name | Description
|
Schema | Name | Description
|
||||||
--------+--------------+-------------
|
--------+--------------+-------------
|
||||||
public | fulltext_cfg |
|
public | fulltext_cfg |
|
||||||
@ -2078,7 +2109,7 @@ EXPLAIN SELECT * FROM apod WHERE textsearch @@@ to_tsquery('supernovae:a');
|
|||||||
|
|
||||||
<programlisting>
|
<programlisting>
|
||||||
=> \dF *.fulltext*
|
=> \dF *.fulltext*
|
||||||
List of fulltext configurations
|
List of text search configurations
|
||||||
Schema | Name | Description
|
Schema | Name | Description
|
||||||
----------+----------------------------
|
----------+----------------------------
|
||||||
fulltext | fulltext_cfg |
|
fulltext | fulltext_cfg |
|
||||||
@ -2093,46 +2124,42 @@ EXPLAIN SELECT * FROM apod WHERE textsearch @@@ to_tsquery('supernovae:a');
|
|||||||
|
|
||||||
<listitem>
|
<listitem>
|
||||||
<para>
|
<para>
|
||||||
List full text searching configurations (add "+" for more detail)
|
List text searching configurations (add <literal>+</> for more detail).
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
By default (without <literal>PATTERN</literal>), information about
|
|
||||||
all <emphasis>visible</emphasis> full text configurations will be
|
|
||||||
displayed.
|
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
|
|
||||||
<programlisting>
|
<programlisting>
|
||||||
=> \dF russian
|
=> \dF russian
|
||||||
List of fulltext configurations
|
List of text search configurations
|
||||||
Schema | Name | Description
|
Schema | Name | Description
|
||||||
------------+---------+-----------------------------------
|
------------+---------+------------------------------------
|
||||||
pg_catalog | russian | default configuration for Russian
|
pg_catalog | russian | configuration for russian language
|
||||||
|
|
||||||
=> \dF+ russian
|
=> \dF+ russian
|
||||||
Configuration "pg_catalog.russian"
|
Text search configuration "pg_catalog.russian"
|
||||||
Parser name: "pg_catalog.default"
|
Parser: "pg_catalog.default"
|
||||||
Token | Dictionaries
|
Token | Dictionaries
|
||||||
--------------+-------------------------
|
--------------+--------------
|
||||||
email | pg_catalog.simple
|
email | simple
|
||||||
file | pg_catalog.simple
|
file | simple
|
||||||
float | pg_catalog.simple
|
float | simple
|
||||||
host | pg_catalog.simple
|
host | simple
|
||||||
hword | pg_catalog.russian_stem
|
hword | russian_stem
|
||||||
int | pg_catalog.simple
|
int | simple
|
||||||
lhword | public.tz_simple
|
lhword | english_stem
|
||||||
lpart_hword | public.tz_simple
|
lpart_hword | english_stem
|
||||||
lword | public.tz_simple
|
lword | english_stem
|
||||||
nlhword | pg_catalog.russian_stem
|
nlhword | russian_stem
|
||||||
nlpart_hword | pg_catalog.russian_stem
|
nlpart_hword | russian_stem
|
||||||
nlword | pg_catalog.russian_stem
|
nlword | russian_stem
|
||||||
part_hword | pg_catalog.simple
|
part_hword | russian_stem
|
||||||
sfloat | pg_catalog.simple
|
sfloat | simple
|
||||||
uint | pg_catalog.simple
|
uint | simple
|
||||||
uri | pg_catalog.simple
|
uri | simple
|
||||||
url | pg_catalog.simple
|
url | simple
|
||||||
version | pg_catalog.simple
|
version | simple
|
||||||
word | pg_catalog.russian_stem
|
word | russian_stem
|
||||||
</programlisting>
|
</programlisting>
|
||||||
</para>
|
</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
@ -2142,35 +2169,31 @@ EXPLAIN SELECT * FROM apod WHERE textsearch @@@ to_tsquery('supernovae:a');
|
|||||||
<term>\dFd[+] [PATTERN]</term>
|
<term>\dFd[+] [PATTERN]</term>
|
||||||
<listitem>
|
<listitem>
|
||||||
<para>
|
<para>
|
||||||
List full text dictionaries (add "+" for more detail).
|
List text search dictionaries (add <literal>+</> for more detail).
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
By default (without <literal>PATTERN</literal>), information about
|
|
||||||
all <emphasis>visible</emphasis> dictionaries will be displayed.
|
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
<programlisting>
|
<programlisting>
|
||||||
=> \dFd
|
=> \dFd
|
||||||
List of fulltext dictionaries
|
List of text search dictionaries
|
||||||
Schema | Name | Description
|
Schema | Name | Description
|
||||||
------------+------------+-----------------------------------------------------------
|
------------+-----------------+-----------------------------------------------------------
|
||||||
pg_catalog | danish | Snowball stemmer for danish language
|
pg_catalog | danish_stem | snowball stemmer for danish language
|
||||||
pg_catalog | dutch | Snowball stemmer for dutch language
|
pg_catalog | dutch_stem | snowball stemmer for dutch language
|
||||||
pg_catalog | english | Snowball stemmer for english language
|
pg_catalog | english_stem | snowball stemmer for english language
|
||||||
pg_catalog | finnish | Snowball stemmer for finnish language
|
pg_catalog | finnish_stem | snowball stemmer for finnish language
|
||||||
pg_catalog | french | Snowball stemmer for french language
|
pg_catalog | french_stem | snowball stemmer for french language
|
||||||
pg_catalog | german | Snowball stemmer for german language
|
pg_catalog | german_stem | snowball stemmer for german language
|
||||||
pg_catalog | hungarian | Snowball stemmer for hungarian language
|
pg_catalog | hungarian_stem | snowball stemmer for hungarian language
|
||||||
pg_catalog | italian | Snowball stemmer for italian language
|
pg_catalog | italian_stem | snowball stemmer for italian language
|
||||||
pg_catalog | norwegian | Snowball stemmer for norwegian language
|
pg_catalog | norwegian_stem | snowball stemmer for norwegian language
|
||||||
pg_catalog | portuguese | Snowball stemmer for portuguese language
|
pg_catalog | portuguese_stem | snowball stemmer for portuguese language
|
||||||
pg_catalog | romanian | Snowball stemmer for romanian language
|
pg_catalog | romanian_stem | snowball stemmer for romanian language
|
||||||
pg_catalog | russian | Snowball stemmer for russian language
|
pg_catalog | russian_stem | snowball stemmer for russian language
|
||||||
pg_catalog | simple | simple dictionary: just lower case and check for stopword
|
pg_catalog | simple | simple dictionary: just lower case and check for stopword
|
||||||
pg_catalog | spanish | Snowball stemmer for spanish language
|
pg_catalog | spanish_stem | snowball stemmer for spanish language
|
||||||
pg_catalog | swedish | Snowball stemmer for swedish language
|
pg_catalog | swedish_stem | snowball stemmer for swedish language
|
||||||
pg_catalog | turkish | Snowball stemmer for turkish language
|
pg_catalog | turkish_stem | snowball stemmer for turkish language
|
||||||
</programlisting>
|
</programlisting>
|
||||||
</para>
|
</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
@ -2181,32 +2204,28 @@ EXPLAIN SELECT * FROM apod WHERE textsearch @@@ to_tsquery('supernovae:a');
|
|||||||
<term>\dFp[+] [PATTERN]</term>
|
<term>\dFp[+] [PATTERN]</term>
|
||||||
<listitem>
|
<listitem>
|
||||||
<para>
|
<para>
|
||||||
List full text parsers (add "+" for more detail)
|
List text search parsers (add <literal>+</> for more detail).
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
By default (without <literal>PATTERN</literal>), information about
|
|
||||||
all <emphasis>visible</emphasis> full text parsers will be displayed.
|
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
<programlisting>
|
<programlisting>
|
||||||
=> \dFp
|
=> \dFp
|
||||||
List of fulltext parsers
|
List of text search parsers
|
||||||
Schema | Name | Description
|
Schema | Name | Description
|
||||||
------------+---------+---------------------
|
------------+---------+---------------------
|
||||||
pg_catalog | default | default word parser
|
pg_catalog | default | default word parser
|
||||||
(1 row)
|
|
||||||
=> \dFp+
|
=> \dFp+
|
||||||
Fulltext parser "pg_catalog.default"
|
Text search parser "pg_catalog.default"
|
||||||
Method | Function | Description
|
Method | Function | Description
|
||||||
-------------------+---------------------------+-------------
|
------------------+----------------+-------------
|
||||||
Start parse | pg_catalog.prsd_start |
|
Start parse | prsd_start |
|
||||||
Get next token | pg_catalog.prsd_nexttoken |
|
Get next token | prsd_nexttoken |
|
||||||
End parse | pg_catalog.prsd_end |
|
End parse | prsd_end |
|
||||||
Get headline | pg_catalog.prsd_headline |
|
Get headline | prsd_headline |
|
||||||
Get lexeme's type | pg_catalog.prsd_lextype |
|
Get lexeme types | prsd_lextype |
|
||||||
|
|
||||||
Token's types for parser "pg_catalog.default"
|
Token types for parser "pg_catalog.default"
|
||||||
Token name | Description
|
Token name | Description
|
||||||
--------------+-----------------------------------
|
--------------+-----------------------------------
|
||||||
blank | Space symbols
|
blank | Space symbols
|
||||||
email | Email
|
email | Email
|
||||||
@ -2237,6 +2256,30 @@ EXPLAIN SELECT * FROM apod WHERE textsearch @@@ to_tsquery('supernovae:a');
|
|||||||
</listitem>
|
</listitem>
|
||||||
</varlistentry>
|
</varlistentry>
|
||||||
|
|
||||||
|
<varlistentry>
|
||||||
|
|
||||||
|
<term>\dFt[+] [PATTERN]</term>
|
||||||
|
<listitem>
|
||||||
|
<para>
|
||||||
|
List text search templates (add <literal>+</> for more detail).
|
||||||
|
</para>
|
||||||
|
|
||||||
|
<para>
|
||||||
|
<programlisting>
|
||||||
|
=> \dFt
|
||||||
|
List of text search templates
|
||||||
|
Schema | Name | Description
|
||||||
|
------------+-----------+-----------------------------------------------------------
|
||||||
|
pg_catalog | ispell | ispell dictionary
|
||||||
|
pg_catalog | simple | simple dictionary: just lower case and check for stopword
|
||||||
|
pg_catalog | snowball | snowball stemmer
|
||||||
|
pg_catalog | synonym | synonym dictionary: replace word by its synonym
|
||||||
|
pg_catalog | thesaurus | thesaurus dictionary: phrase by phrase substitution
|
||||||
|
</programlisting>
|
||||||
|
</para>
|
||||||
|
</listitem>
|
||||||
|
</varlistentry>
|
||||||
|
|
||||||
</variablelist>
|
</variablelist>
|
||||||
|
|
||||||
</sect1>
|
</sect1>
|
||||||
@ -2261,7 +2304,7 @@ EXPLAIN SELECT * FROM apod WHERE textsearch @@@ to_tsquery('supernovae:a');
|
|||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
<replaceable class="PARAMETER">ts_debug</replaceable> type defined as:
|
<replaceable class="PARAMETER">ts_debug</replaceable>'s result type is defined as:
|
||||||
|
|
||||||
<programlisting>
|
<programlisting>
|
||||||
CREATE TYPE ts_debug AS (
|
CREATE TYPE ts_debug AS (
|
||||||
@ -2297,7 +2340,7 @@ ALTER TEXT SEARCH CONFIGURATION public.english
|
|||||||
|
|
||||||
<programlisting>
|
<programlisting>
|
||||||
SELECT * FROM ts_debug('public.english','The Brightest supernovaes');
|
SELECT * FROM ts_debug('public.english','The Brightest supernovaes');
|
||||||
Alias | Description | Token | Dicts list | Lexized token
|
Alias | Description | Token | Dictionaries | Lexized token
|
||||||
-------+---------------+-------------+---------------------------------------+---------------------------------
|
-------+---------------+-------------+---------------------------------------+---------------------------------
|
||||||
lword | Latin word | The | {public.english_ispell,pg_catalog.english_stem} | public.english_ispell: {}
|
lword | Latin word | The | {public.english_ispell,pg_catalog.english_stem} | public.english_ispell: {}
|
||||||
blank | Space symbols | | |
|
blank | Space symbols | | |
|
||||||
|