please apply small patch for README.tsearch.

I've documented space usage and the use of the CLUSTER command.
Oleg Bartunov
2002-08-29 19:55:26 +00:00
commit 1761990e38
parent 31fbdad6e5
1 changed file with 62 additions and 2 deletions
--- a/contrib/tsearch/README.tsearch
+++ b/contrib/tsearch/README.tsearch
@@ -6,6 +6,8 @@ All work was done by Teodor Sigaev (teodor@stack.net) and Oleg Bartunov
 CHANGES:
 August 29, 2002
        Space usage and using CLUSTER command documented
 August 22, 2002
 	Fix works with 'bad' queries
 August 13, 2002
@@ -286,8 +288,8 @@ is strongly depends on many factors (query, collection, dictionaries
 and hardware).
 Collection is available for download from
-http://www.sai.msu.su/~megera/postgres/gist/tsearch/ 
+http://www.sai.msu.su/~megera/postgres/gist/tsearch/mw_titles.gz 
-as mw_titles.gz (about 3Mb).
+(377905 titles from postgresql mailing lists, about 3Mb).
 0. install contrib/tsearch module
 1. createdb test
@@ -353,3 +355,61 @@ using gist indices (morph)
 There are no visible difference between these 2 cases but your
 mileage may vary.
 NOTES:
 1. The size of the txtidx column should be smaller than the size of the
    corresponding column. Below are some real numbers from the test
    database (link above).
   a) After loading data
 -rw-------    1 postgres users    23191552 Aug 29 14:08 53016937
 -rw-------    1 postgres users    81059840 Aug 29 14:08 52639027
 Table titles (52639027) occupies 80Mb; the index on the txtidx column (53016937)
 occupies 22Mb. Use contrib/oid2name to map OIDs to names.
 After doing
 test=# select title  into titles_tmp from titles;
 SELECT
 I got the size of table 'titles' without the txtidx field:
 -rw-------    1 postgres users    30105600 Aug 29 14:14 53016938
 So the txtidx column itself occupies about 50Mb (80Mb - 30Mb).
   b) After running 'vacuum full analyze' I got:
 -rw-------    1 postgres users    30105600 Aug 29 14:26 53016938
 -rw-------    1 postgres users    36880384 Aug 29 14:26 53016937
 -rw-------    1 postgres users    51494912 Aug 29 14:26 52639027
 53016938 = titles_tmp
 So the actual size of the 'txtidx' field is about 20Mb (51494912 - 30105600
 bytes). "quod erat demonstrandum"
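The subtraction behind note 1 can be reproduced with a short sketch. It is only an illustration: the byte counts are copied from the listings above, the function name txtidx_size is hypothetical, and the difference between two table files is only an approximation of the column size (per-row overhead is ignored).

```python
# File sizes in bytes, copied from the ls -l listings in note 1.
MB = 1_000_000  # decimal megabyte; the README's "Mb" figures are rounded

titles_loaded = 81_059_840    # table 'titles' right after loading
titles_vacuumed = 51_494_912  # table 'titles' after 'vacuum full analyze'
titles_tmp = 30_105_600       # 'titles_tmp': the title column only


def txtidx_size(table_bytes: int, without_txtidx_bytes: int) -> int:
    """Estimate the txtidx column size as the difference of two table files."""
    return table_bytes - without_txtidx_bytes


before = txtidx_size(titles_loaded, titles_tmp)
after = txtidx_size(titles_vacuumed, titles_tmp)

print(f"txtidx after loading: {before} bytes (~{before / MB:.1f} MB)")
print(f"txtidx after vacuum:  {after} bytes (~{after / MB:.1f} MB)")
```

Both results are rough; they match the "about 50Mb" and "20 Mb" figures quoted in the note only up to rounding.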
 2. The CLUSTER command is highly recommended if you need fast searching.
    For example:
  test=# cluster t_idx on titles;
  BUT! In 7.2 the CLUSTER command forgets about other indices and permissions,
  so you need to be careful and rebuild these indices and restore permissions
  after clustering. Also, clustering isn't dynamic, so you'd need to
  run CLUSTER from time to time. In 7.3 the CLUSTER command should work
  fine.
  After clustering:
 -rw-------    1 postgres users    23404544 Aug 29 14:59 53394850
 -rw-------    1 postgres users    30105600 Aug 29 14:26 53016938
 -rw-------    1 postgres users    50995200 Aug 29 14:45 53394845
 pg@zen:/usr/local/pgsql/data/base/52638986$ oid2name -d test                 
 All tables from database "test":
 ---------------------------------
 53394850 = t_idx
 53394845 = titles
 53016938 = titles_tmp
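Matching file names (OIDs) to relation names by eye gets tedious, so here is a minimal sketch that joins the two listings above. The sample text is pasted from the README; the helpers parse_ls and parse_oid2name are hypothetical names for this illustration and are not part of tsearch or oid2name. It assumes the 'ls -l' layout and the "oid = name" lines shown here.

```python
# Sample text copied from the listings above.
LS_OUTPUT = """\
-rw-------    1 postgres users    23404544 Aug 29 14:59 53394850
-rw-------    1 postgres users    30105600 Aug 29 14:26 53016938
-rw-------    1 postgres users    50995200 Aug 29 14:45 53394845
"""

OID2NAME_OUTPUT = """\
All tables from database "test":
---------------------------------
53394850 = t_idx
53394845 = titles
53016938 = titles_tmp
"""


def parse_ls(text):
    """Map file name (the relation's OID) to its size in bytes."""
    sizes = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 9:
            continue  # not an 'ls -l' entry
        sizes[fields[-1]] = int(fields[4])
    return sizes


def parse_oid2name(text):
    """Map OID to relation name from 'oid = name' lines."""
    names = {}
    for line in text.splitlines():
        oid, sep, name = line.partition(" = ")
        if sep and oid.strip().isdigit():
            names[oid.strip()] = name.strip()
    return names


sizes = parse_ls(LS_OUTPUT)
names = parse_oid2name(OID2NAME_OUTPUT)
sizes_by_name = {names[oid]: size for oid, size in sizes.items() if oid in names}

for name, size in sorted(sizes_by_name.items()):
    print(f"{name:12s} {size:>12d} bytes")
```

Header and separator lines in the oid2name output are skipped because they do not start with a numeric OID.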