Please apply this small patch for README.tsearch.

I've documented space usage and use of the CLUSTER command.

Oleg Bartunov
Bruce Momjian 2002-08-29 19:55:26 +00:00
parent 31fbdad6e5
commit 1761990e38

@@ -6,6 +6,8 @@ All work was done by Teodor Sigaev (teodor@stack.net) and Oleg Bartunov
CHANGES:
August 29, 2002
Space usage and using CLUSTER command documented
August 22, 2002
Fix works with 'bad' queries
August 13, 2002
@@ -286,8 +288,8 @@ is strongly depends on many factors (query, collection, dictionaries
and hardware).
Collection is available for download from
http://www.sai.msu.su/~megera/postgres/gist/tsearch/
as mw_titles.gz (about 3Mb).
http://www.sai.msu.su/~megera/postgres/gist/tsearch/mw_titles.gz
(377905 titles from postgresql mailing lists, about 3Mb).
0. install contrib/tsearch module
1. createdb test
@@ -353,3 +355,61 @@ using gist indices (morph)
There is no visible difference between these 2 cases but your
mileage may vary.
NOTES:
1. The size of the txtidx column should be smaller than the size of the
corresponding column. Below are some real numbers from the test database
(link above).
a) After loading data
-rw------- 1 postgres users 23191552 Aug 29 14:08 53016937
-rw------- 1 postgres users 81059840 Aug 29 14:08 52639027
The table titles (52639027) occupies 80Mb and the index on the txtidx
column (53016937) occupies 22Mb. Use contrib/oid2name to map oids to names.
After doing
test=# select title into titles_tmp from titles;
SELECT
I got the size of table 'titles' without the txtidx field:
-rw------- 1 postgres users 30105600 Aug 29 14:14 53016938
So, the txtidx column itself occupies about 50Mb.
b) After running 'vacuum full analyze' I got:
-rw------- 1 postgres users 30105600 Aug 29 14:26 53016938
-rw------- 1 postgres users 36880384 Aug 29 14:26 53016937
-rw------- 1 postgres users 51494912 Aug 29 14:26 52639027
53016938 = titles_tmp
So, the actual size of the 'txtidx' field is about 20Mb (51494912 - 30105600
bytes)! "quod erat demonstrandum"
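The arithmetic behind the two estimates in note 1 can be checked with a short
sketch. The byte counts are copied from the ls listings; the variable names
and the use of decimal megabytes are assumptions made for illustration and are
not part of tsearch.

```python
# Sizes in bytes, copied from the ls listings in note 1.
MB = 1_000_000  # assumption: decimal megabytes; the README rounds loosely

titles_loaded = 81_059_840    # table 'titles' right after loading the data
titles_vacuumed = 51_494_912  # table 'titles' after 'vacuum full analyze'
titles_tmp = 30_105_600       # 'titles_tmp': title column only, no txtidx


def txtidx_bytes(table_bytes: int, without_txtidx_bytes: int) -> int:
    """Estimate the txtidx column size as the difference between the full
    table and a copy of the table that lacks the txtidx column."""
    return table_bytes - without_txtidx_bytes


txtidx_before = txtidx_bytes(titles_loaded, titles_tmp)
txtidx_after = txtidx_bytes(titles_vacuumed, titles_tmp)

print(f"txtidx before vacuum: {txtidx_before} bytes ({txtidx_before / MB:.1f} MB)")
print(f"txtidx after vacuum:  {txtidx_after} bytes ({txtidx_after / MB:.1f} MB)")
```

The difference is only an estimate: it attributes all of the extra space in
'titles' to the txtidx column.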
2. CLUSTER command is highly recommended if you need fast searching.
For example:
test=# cluster t_idx on titles;
BUT! In 7.2 the CLUSTER command forgets about other indices and permissions,
so you need to be careful: rebuild these indices and restore the permissions
after clustering. Also, clustering isn't dynamic, so you need to
rerun CLUSTER from time to time. In 7.3 the CLUSTER command should work
fine.
After clustering:
-rw------- 1 postgres users 23404544 Aug 29 14:59 53394850
-rw------- 1 postgres users 30105600 Aug 29 14:26 53016938
-rw------- 1 postgres users 50995200 Aug 29 14:45 53394845
pg@zen:/usr/local/pgsql/data/base/52638986$ oid2name -d test
All tables from database "test":
---------------------------------
53394850 = t_idx
53394845 = titles
53016938 = titles_tmp