1814 Commits

Author SHA1 Message Date
shess
fb6794360d Write doclists using a segmented technique to amortize costs better.
New items for a term are merged with the term's segment 0 doclist,
until that doclist exceeds CHUNK_MAX.  Then the segments are merged in
exponential fashion, so that segment 1 contains approximately
2*CHUNK_MAX data, segment 2 4*CHUNK_MAX, and so on. (CVS 3398)

FossilOrigin-Name: b6b93a3325d3e728ca36255c0ff6e1f63e03b0ac
2006-09-08 17:00:17 +00:00
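
Note on fb6794360d: the segmented write scheme described in this commit can be illustrated with a small size-only simulation. This is a rough sketch under stated assumptions, not the code from fts1.c: the struct SegmentSizes, the function segmentsAdd(), the CHUNK_MAX value of 256, and the MAX_SEGMENTS cap are all hypothetical names and numbers chosen for illustration. It tracks only how many bytes each segment would hold, and assumes segment i is pushed into segment i+1 once it exceeds CHUNK_MAX<<i.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK_MAX    256   /* hypothetical threshold, in bytes */
#define MAX_SEGMENTS 16    /* hypothetical cap for this sketch */

/* Bytes of doclist data held by each segment of one term (0 = empty). */
typedef struct {
  size_t size[MAX_SEGMENTS];
} SegmentSizes;

/* Account for nBytes of new doclist data.  New data always lands in
** segment 0.  Whenever segment i grows past CHUNK_MAX<<i it is merged
** into segment i+1, so higher segments are rewritten exponentially
** less often than segment 0. */
static void segmentsAdd(SegmentSizes *p, size_t nBytes){
  int i;
  assert( p!=0 );
  p->size[0] += nBytes;
  for(i=0; i<MAX_SEGMENTS-1 && p->size[i]>((size_t)CHUNK_MAX<<i); i++){
    p->size[i+1] += p->size[i];
    p->size[i] = 0;
  }
}

/* Total bytes across all segments; merging never changes this value. */
static size_t segmentsTotal(const SegmentSizes *p){
  size_t n = 0;
  int i;
  for(i=0; i<MAX_SEGMENTS; i++) n += p->size[i];
  return n;
}
```

In this model most insertions touch only the small segment 0, and a large segment is rewritten only when everything below it has filled up, which is where the amortization comes from.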
adamd
338565ad4b A minor change to fts1.c to fix broken build. (CVS 3393)
FossilOrigin-Name: 55a03b96251515a4817a0eefb197219a460640e7
2006-09-05 18:21:31 +00:00
drh
fb52cc95ff Add a TRACE macro to the FTS1 module for troubleshooting. Turned off by
default. (CVS 3388)

FossilOrigin-Name: d4923e98c66ae03d899f633e5e309471f5695abb
2006-09-02 20:58:25 +00:00
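
Note on fb52cc95ff: a tracing macro that is off by default is commonly written so that it compiles to nothing unless a switch is flipped. The following is a minimal sketch of that general pattern, not necessarily the exact macro added by this commit; the FTS_TRACE_SKETCH switch and the addTraced() function are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdio.h>

/* FTS_TRACE_SKETCH is a hypothetical switch invented for this example.
** Leave it undefined and every TRACE() call compiles away. */
#ifdef FTS_TRACE_SKETCH
# define TRACE(A)  do{ printf A; fflush(stdout); }while(0)
#else
# define TRACE(A)  do{ }while(0)
#endif

/* The double parentheses let TRACE pass a full printf argument list
** through a single macro parameter without variadic macros. */
static int addTraced(int a, int b){
  int sum = a + b;
  TRACE(("addTraced(%d, %d) = %d\n", a, b, sum));
  return sum;
}
```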
drh
7c2d87cd71 Convert static variables into constants in the FTS module. (CVS 3385)
FossilOrigin-Name: 098cbafcd6dcf57142b0417e796d27ffddcc0920
2006-09-02 14:16:59 +00:00
adamd
9eb3997b02 Miscellaneous restructuring and cleanup based on suggestions from shess. (CVS 3382)
FossilOrigin-Name: e98b0cf292f6dc9deb6ae9b773c52b16867f7556
2006-09-02 00:23:01 +00:00
shess
b2f4d0173a Make fts1.c not rely on nul-terminated strings. Mostly a matter of
making sure we always pass around ptr/len, but there were a few places
where we actually relied on nul-termination.

An earlier change had additionally changed appropriate
sqlite3_bind_text() calls to sqlite3_bind_blob().  I've found that
this changes what's actually stored in the database, so backed those
changes out.  Also (and this is weird), I found that I could no longer
do straight-forward = queries against %_term.term at a command-line. (CVS 3379)

FossilOrigin-Name: 5844db1aa9c23a005c88104b084f68afb21891c7
2006-09-01 00:33:44 +00:00
shess
c0beb14f23 Make tokenizer not rely on nul-terminated text. Instead of using
strcspn() and a nul-terminated delimiter list, I just flagged
delimiters in an array and wrote things inline.  Submitting this for
review separately because it's pretty standalone. (CVS 3378)

FossilOrigin-Name: 2631ceaeefaca3aa837e3b439399f13c51456914
2006-09-01 00:05:17 +00:00
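
Note on c0beb14f23: the approach described here, flagging delimiters in an array and scanning a pointer/length pair instead of calling strcspn() on nul-terminated text, might look roughly like the sketch below. The names aDelim, delimInit(), and nextToken() are hypothetical and are not taken from the actual tokenizer source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* aDelim[c] is nonzero when byte value c is a delimiter. */
static unsigned char aDelim[256];

/* Flag each of the nDelim bytes in zDelim as a delimiter.  The list is
** passed as ptr/len, so it does not need a nul terminator either. */
static void delimInit(const char *zDelim, size_t nDelim){
  size_t i;
  memset(aDelim, 0, sizeof(aDelim));
  for(i=0; i<nDelim; i++) aDelim[(unsigned char)zDelim[i]] = 1;
}

/* Scan pInput[*piOffset..nInput) for the next token.  On success set
** *ppToken/*pnToken, advance *piOffset past the token, and return 1.
** Return 0 when no token remains.  No byte at or beyond pInput[nInput]
** is ever read. */
static int nextToken(
  const char *pInput, size_t nInput, size_t *piOffset,
  const char **ppToken, size_t *pnToken
){
  size_t i = *piOffset;
  size_t iStart;
  assert( i<=nInput );
  while( i<nInput && aDelim[(unsigned char)pInput[i]] ) i++;   /* skip delimiters */
  iStart = i;
  while( i<nInput && !aDelim[(unsigned char)pInput[i]] ) i++;  /* consume token */
  *piOffset = i;
  if( i==iStart ) return 0;
  *ppToken = pInput + iStart;
  *pnToken = i - iStart;
  return 1;
}
```

Because every loop is bounded by nInput, the input can be a slice of a larger buffer or a blob with embedded nul bytes.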
drh
5db455e7b5 Refactor the FTS1 module so that its name is "fts1" instead of "fulltext",
so that all symbols with external linkage begin with "sqlite3Fts1", and
so that all filenames begin with "fts1". (CVS 3377)

FossilOrigin-Name: e1891f0dc58e5498a8845d8b9b5b092d7f9c7003
2006-08-31 15:07:14 +00:00
shess
2b85d5f46e Just don't run tolower() on hi-bit characters. This shouldn't cause
us to break any UTF-8 code points, unless they were already broken in
the input. (CVS 3376)

FossilOrigin-Name: 6c77c2d5e15e9d3efed3e274bc93cd5a4868f574
2006-08-30 21:40:30 +00:00
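
Note on 2b85d5f46e: skipping tolower() for bytes with the high bit set leaves multi-byte UTF-8 sequences untouched, since every byte of such a sequence is 0x80 or above. A minimal sketch of the idea follows; the function name asciiFold() is hypothetical and the code is not copied from the module.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Lowercase the n bytes at p in place, touching only 7-bit ASCII.
** Bytes >= 0x80 are left alone so that multi-byte UTF-8 sequences
** pass through unchanged. */
static void asciiFold(char *p, size_t n){
  size_t i;
  assert( p!=0 || n==0 );
  for(i=0; i<n; i++){
    unsigned char c = (unsigned char)p[i];
    if( c<0x80 ) p[i] = (char)tolower(c);
  }
}
```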
shess
c9e0a9057e Make static some symbols which shouldn't have been exported. (CVS 3371)
FossilOrigin-Name: 58006e38af760b53cf72bf127d7c7b8a619a1282
2006-08-28 23:46:01 +00:00
shess
4f4897e80d Make hi-bit characters delimiters. This is a stopgap until the tokenizer
and fulltext.c recognize UTF-8 correctly. (CVS 3370)

FossilOrigin-Name: ca850d3d80f67672172d11392fcdf60bfbb94c02
2006-08-28 20:08:56 +00:00
shess
0de250e46f Fix gcc gripe about parens in a ||/&& in mergePosList().
Drop unused pBlob/nBlob in index_insert_term().
Fix NULL deref in an assertion in docListUpdate() delete case.
Minor code tightening in docListUpdate(). (CVS 3367)

FossilOrigin-Name: a6fcf9101a831bf5f129c6045eabf30376d365dc
2006-08-25 19:20:26 +00:00
adamd
1717edd157 A first implementation of a full-text search module for SQLite. (CVS 3363)
FossilOrigin-Name: b0d8e0d314d6f77b7d4b5dd00c694a1323f7a8e4
2006-08-23 23:58:50 +00:00
drh
fa9b4b1499 Add the ext/fts1 subdirectory for holding the first full-text search
extension. (CVS 3360)

FossilOrigin-Name: 7f152f9f3a647d30874f2da46ce93a1e31ea7cf3
2006-08-22 14:45:37 +00:00