Please apply this small patch for README.tsearch.
I've documented space usage and use of the CLUSTER command.
Oleg Bartunov
parent 31fbdad6e5
commit 1761990e38
@@ -6,6 +6,8 @@ All work was done by Teodor Sigaev (teodor@stack.net) and Oleg Bartunov

CHANGES:

August 29, 2002
Space usage and using CLUSTER command documented
August 22, 2002
Fix works with 'bad' queries
August 13, 2002
@@ -286,8 +288,8 @@ is strongly depends on many factors (query, collection, dictionaries
and hardware).

Collection is available for download from
http://www.sai.msu.su/~megera/postgres/gist/tsearch/
as mw_titles.gz (about 3 MB).
http://www.sai.msu.su/~megera/postgres/gist/tsearch/mw_titles.gz
(377905 titles from PostgreSQL mailing lists, about 3 MB).

0. install contrib/tsearch module
1. createdb test
@@ -353,3 +355,61 @@ using gist indices (morph)

There is no visible difference between these 2 cases, but your
mileage may vary.


NOTES:

1. The size of the txtidx column should be smaller than the size of the corresponding column.
Below are some real numbers from the test database (link above).

a) After loading data

-rw------- 1 postgres users 23191552 Aug 29 14:08 53016937
-rw------- 1 postgres users 81059840 Aug 29 14:08 52639027

Table titles (52639027) occupies about 80 MB, the index on the txtidx column (53016937)
occupies about 22 MB. Use contrib/oid2name to get mappings from oid to names.
After running

test=# select title into titles_tmp from titles;
SELECT

I got the size of table 'titles' without the txtidx field:

-rw------- 1 postgres users 30105600 Aug 29 14:14 53016938

So the txtidx column itself occupies about 50 MB.

b) After running 'vacuum full analyze' I got:

-rw------- 1 postgres users 30105600 Aug 29 14:26 53016938
-rw------- 1 postgres users 36880384 Aug 29 14:26 53016937
-rw------- 1 postgres users 51494912 Aug 29 14:26 52639027

53016938 = titles_tmp

So the actual size of the 'txtidx' field is about 20 MB. Quod erat demonstrandum.
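
The two estimates above are plain subtractions of the byte counts shown in the listings. The following Python sketch (not part of the original README; the constant and function names are illustrative only) repeats that arithmetic. It treats the difference between 'titles' and 'titles_tmp' as a rough estimate of the txtidx column size, which is the same assumption the note makes.

```python
# Byte sizes copied from the `ls -l` listings in note 1.
TITLES_AFTER_LOAD = 81059840    # table 'titles' (52639027) after loading data
TITLES_AFTER_VACUUM = 51494912  # same table after 'vacuum full analyze'
TITLES_TMP = 30105600           # 'titles_tmp' (53016938): title column only

MIB = 1024 * 1024


def txtidx_bytes(titles_bytes, titles_tmp_bytes=TITLES_TMP):
    """Estimate the txtidx column size as the difference between the
    table with the column and the copy without it."""
    return titles_bytes - titles_tmp_bytes


before = txtidx_bytes(TITLES_AFTER_LOAD)
after = txtidx_bytes(TITLES_AFTER_VACUUM)

print(f"txtidx after load:   {before} bytes ({before / MIB:.1f} MiB)")
print(f"txtidx after vacuum: {after} bytes ({after / MIB:.1f} MiB)")
```

The README rounds these differences to "about 50" and "about 20".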

2. The CLUSTER command is highly recommended if you need fast searching.
For example:

test=# cluster t_idx on titles;

BUT! In 7.2 the CLUSTER command forgets about other indices and permissions,
so you need to be careful and rebuild these indices and restore permissions
after clustering. Also, clustering isn't dynamic, so you'd need to
use CLUSTER from time to time. In 7.3 the CLUSTER command should work
fine.

After clustering:

-rw------- 1 postgres users 23404544 Aug 29 14:59 53394850
-rw------- 1 postgres users 30105600 Aug 29 14:26 53016938
-rw------- 1 postgres users 50995200 Aug 29 14:45 53394845
pg@zen:/usr/local/pgsql/data/base/52638986$ oid2name -d test
All tables from database "test":
---------------------------------
53394850 = t_idx
53394845 = titles
53016938 = titles_tmp
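
Because 7.2 loses the other indices and the permissions during CLUSTER, it may help to keep the statements needed to restore them next to the CLUSTER call. The Python sketch below only assembles such a list of SQL statements in order. It is an illustration, not part of the README: the function name, the extra index `titles_title_idx` and the grantee `webuser` are hypothetical, and the real index definitions and grants must be taken from your own schema.

```python
def recluster_statements(table, cluster_index, other_indexes, grants):
    """Return, in order, the SQL statements for one CLUSTER pass on 7.2.

    other_indexes: full CREATE INDEX statements for every index except
                   cluster_index (to be recreated after CLUSTER).
    grants:        (privilege, grantee) pairs to restore afterwards.
    """
    stmts = [f"CLUSTER {cluster_index} ON {table};"]
    stmts.extend(other_indexes)
    stmts.extend(f"GRANT {priv} ON {table} TO {who};" for priv, who in grants)
    return stmts


# Hypothetical example values; only 't_idx' and 'titles' come from the README.
for stmt in recluster_statements(
    table="titles",
    cluster_index="t_idx",
    other_indexes=["CREATE INDEX titles_title_idx ON titles (title);"],
    grants=[("SELECT", "webuser")],
):
    print(stmt)
```

Since clustering isn't dynamic, the same list can be replayed each time the table is clustered again.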