This adds mention of my latest tweak to the tsearch2/pg_trgm
integration.  It is much better to create a word list of unstemmed words
than stemmed ones.

Chris K-L
Tom Lane 2004-11-27 00:01:02 +00:00
parent c2e5631760
commit b82323e05e
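
The reasoning in the commit message can be illustrated with a small sketch. It is not pg_trgm's code, only a minimal Python approximation of its documented behaviour for a single word: lowercase it, pad it with two leading spaces and one trailing space, collect the set of trigrams, and divide shared trigrams by total distinct trigrams. The misspelling "integraton" and the stem "integr" are assumptions chosen for illustration; the exact stem depends on the stemmer in use.

```python
def trigrams(word):
    """Return the set of trigrams for a single word, padded the way
    pg_trgm is documented to pad (two spaces before, one after)."""
    padded = "  " + word.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both words."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


if __name__ == "__main__":
    query = "integraton"      # hypothetical misspelled search term
    unstemmed = "integration" # word as it appears in the text
    stemmed = "integr"        # assumed stem, for illustration only

    print("unstemmed:", similarity(query, unstemmed))
    print("stemmed:  ", similarity(query, stemmed))
```

In this sketch the unstemmed word scores higher against the misspelling than the stem does, and a match from an unstemmed list is a real word that can be shown to the user as a suggestion, whereas a stem usually is not.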


@@ -100,11 +100,15 @@ Tsearch2 Integration
 The first step is to generate an auxiliary table containing all
 the unique words in the Tsearch2 index:
-CREATE TABLE words AS
-SELECT word FROM stat('SELECT vector FROM documents');
+CREATE TABLE words AS SELECT word FROM
+stat('SELECT to_tsvector(''simple'', bodytext) FROM documents');
-Where 'documents' is the table that contains the Tsearch2 index
-column 'vector', of type 'tsvector'.
+Where 'documents' is a table that has a text field 'bodytext'
+that TSearch2 is used to search. The use of the 'simple' dictionary
+with the to_tsvector function, instead of just using the already
+existing vector is to avoid creating a list of already stemmed
+words. This way, only the original, unstemmed words are added
+to the word list.
 Next, create a trigram index on the word column: