61 Commits

Author SHA1 Message Date
mistachkin
049d487e2e Modify several extensions to use the new exported function naming. Fix some shared library compilation issues.
FossilOrigin-Name: f2ab8747825ab5131ffab174aa0ffe5e474f6811
2013-07-04 23:53:56 +00:00
mistachkin
48864df97d Many spelling fixes in comments. No changes to code.
FossilOrigin-Name: 6f6e2d50941e444ebc83604daddcc034137a05b7
2013-03-21 21:20:32 +00:00
mistachkin
d5578433ff Fix all known instances of 'repeated the' style typos in comments. No changes to code.
FossilOrigin-Name: 9b19b847533f944f289d93dcdba29c0d67bf251c
2012-08-25 10:01:29 +00:00
drh
f2fcd0750b Completely remove all trace of ctype.h from FTS2.
FossilOrigin-Name: 876845661a944ec1c841d1e2486d070efb76e5cd
2010-09-17 01:07:53 +00:00
shane
be21779385 Corrected typos and misspellings. Ticket #3702. (CVS 6336)
FossilOrigin-Name: 6404afa0c515a6536fc2e878d4fb451e4dc06942
2009-03-05 04:20:31 +00:00
shess
7fdb522caf Backport http://www.sqlite.org/cvstrac/chngview?cn=5489 from fts3.
Re-used prepared statement from fts2 cursor. (CVS 5499)

FossilOrigin-Name: 02870ed21dae2601a656b2f30c3ca0041e9cb60f
2008-07-29 20:38:17 +00:00
shess
25192cac24 Be a bit more suspicious of invalid results from the tokenizer.
Backports check-in (4514) from fts3. (CVS 5459)

FossilOrigin-Name: 311aeb9c2b75c420a37198a93e353c72e9166747
2008-07-22 23:54:50 +00:00
shess
db94e39b07 Implement optimize() function.
Backports check-in (5417) from fts3. (CVS 5458)

FossilOrigin-Name: c16900dc7603cab30f8729b25361bc88bb37ae43
2008-07-22 23:49:44 +00:00
shess
08904673c8 Delete all fts2 index data when the table becomes empty.
Backports check-in (5413) from fts3. (CVS 5457)

FossilOrigin-Name: 4c98179be258319f441ae4e123cf59af77e96409
2008-07-22 23:41:26 +00:00
shess
3d373110f0 fts2 functions for testing scripts.
Backports (5340) from fts3. (CVS 5456)

FossilOrigin-Name: 4e47394be9dfbf0f9309e55eb6c6a3a517ea2006
2008-07-22 23:32:27 +00:00
shess
deca811cb5 Change prefix search from O(N*M) to O(NlogM).
Backports (4599) from fts3. (CVS 5455)

FossilOrigin-Name: 3f614453d2d7c753a5963b027fe8618b50b4f6b9
2008-07-22 23:08:40 +00:00
shess
b2822a2b5e Changes fts2 to use only sqlite3_malloc() and not system malloc.
Backports (4554) and (4555) from fts3. (CVS 5454)

FossilOrigin-Name: ecf2dec66cb979cb7d8db3b7ce5c64cab57fe2bb
2008-07-22 22:57:54 +00:00
drh
8a29dfdea0 Remove all instances of sprintf() from the FTS modules. Ticket #3049. (CVS 4996)
FossilOrigin-Name: 062bf5d44d53ae0ee2bf96eddcc8de09157aa789
2008-04-12 13:06:09 +00:00
drh
85b623f2f9 Change all instances of "it's" in comments to either "its" or "it is",
as appropriate, in case the comments are ever again read by a pedantic
grammarian.  Ticket #2840. (CVS 4629)

FossilOrigin-Name: 4e91a267febda572e7239f0f1cc66b3102558c36
2007-12-13 21:54:09 +00:00
drh
a6f46e991e Do not require SQLITE_ENABLE_BROKEN_FTS2 if FTS2 is not enabled.
The same for FTS1.  Ticket #2777. (CVS 4556)

FossilOrigin-Name: f94cdcfd1171fd110ed9cd4c47f1fb5fa7e99ca9
2007-11-23 18:06:23 +00:00
shess
cd7274ceb0 Don't do anything when input doclists are both empty. Ticket #2774 (CVS 4546)
FossilOrigin-Name: 75cb46f82a6a95dbe9e279dede299bafa2e91cae
2007-11-16 00:23:07 +00:00
shess
961303c1e7 Drop the forced error from fts3.c and add forced errors to fts2.c and
fts1.c. (CVS 4427)

FossilOrigin-Name: fec6567a0f8a868cda9bba2a473491dfb17b6c88
2007-09-13 18:16:08 +00:00
shess
27a770e044 Fix memory leak of InteriorReader.term. Comes up when doing queries
against large segments. (CVS 4315)

FossilOrigin-Name: 6c617bd89fc57881a2a308a6360e8ebb42835d46
2007-08-28 20:36:53 +00:00
shess
9fa502205d Convert fts2 to use sqlite3_prepare_v2() to prevent certain logic
errors around SQLITE_SCHEMA handling.  This also allows
sql_step_statement() and sql_step_leaf_statement() to be replaced with
sqlite3_step().

Also fix a logic error in flushPendingTerms() which was clearing the
term table in case of error.  This was wrong in the face of
SQLITE_SCHEMA.  Even though the change to sqlite3_prepare_v2() should
cause us not to see SQLITE_SCHEMA any longer, it was still a logic
error... (CVS 4205)

FossilOrigin-Name: 16730cb137eaf576b87cdc17913564c9c5c0ed82
2007-08-10 23:47:03 +00:00
drh
e6e4d6bb1a Fix some compiler warnings. (CVS 4196)
FossilOrigin-Name: 6cc15409ad6baefbe6e2214a4ac1cb3a0433f922
2007-08-05 23:52:05 +00:00
rse
e21733baa5 Fix ticket #2439: the FTS1 and FTS2 extensions use the non-standard,
unportable and highly deprecated <malloc.h> header on all platforms
except Apple Mac OS X. The <malloc.h> header is never actually required on
any OS with an at least partly POSIX-conforming API, as malloc(3) and
friends have officially lived in <stdlib.h> for over 10 years.
On some platforms, such as FreeBSD, including <malloc.h> has for a few
years even triggered an "#error" and thus a build failure. So, just get
rid of the <malloc.h> usage in the FTS1 and FTS2 extensions altogether and
use only <stdlib.h> there. (CVS 4191)

FossilOrigin-Name: 3f9a666143a8aafa0b1a5d56ec68f69f2b3d6a21
2007-07-30 18:55:36 +00:00
danielk1977
ab9749ebb9 Modify handling of SQLITE_SCHEMA in fts2 code. An SQLITE_SCHEMA error may cause SQLite to reload the internal schema, deleting and recreating v-table objects. So the sqlite3_vtab structure can be deleted out from under a v-table implementation. (CVS 4151)
FossilOrigin-Name: dee1a0fd28e8341af6523ab0c5628b671d7d2811
2007-07-02 10:16:49 +00:00
danielk1977
c033b64276 Implement xRename() for fts2 so that it is possible to rename fts2 tables. (CVS 4143)
FossilOrigin-Name: 488474fde753c5a7a14ed8f2fad7f16efd236491
2007-06-27 16:26:07 +00:00
danielk1977
08ada518ff Remove the unused EXTSRC variable from the non-configure makefile. (CVS 4129)
FossilOrigin-Name: bbdcf372c6f2144a62fba742b3f4bd6b2fe58593
2007-06-26 10:56:40 +00:00
drh
397aa141ed Put #ifdefs in fts2_tokenizer so that the build works even when FTS2
is omitted.  Add the SQLite blessing to the header comments on all FTS2
source files. (CVS 4120)

FossilOrigin-Name: c795e6fd8f01bcbc1967062632c13d4952abf4d8
2007-06-25 13:50:03 +00:00
drh
5665b3ea44 Allow the use of MySQL-style quoting in the FTS modules. Ticket #2446. (CVS 4119)
FossilOrigin-Name: 3be2a6d1c342454d93b05c38f3d9a960ab15dae2
2007-06-25 12:49:05 +00:00
danielk1977
832a58a68c Extend fts2 so that user defined tokenizers may be added. Add a tokenizer that uses the ICU library if available. Documentation and tests to come. (CVS 4108)
FossilOrigin-Name: 68677e420c744b39ea9d7399819e0f376748886d
2007-06-22 15:21:15 +00:00
danielk1977
86889fc3c6 Fix snippet generation when the left-most column of an fts2 table is used in the MATCH clause. Fix for ticket #2429. (CVS 4095)
FossilOrigin-Name: fec56ad2ede53e3e202d9ad869a059eeb315796f
2007-06-20 06:23:54 +00:00
shess
401b80656d Minor comment edits from my prefix development client. No code changes. (CVS 4058)
FossilOrigin-Name: 6953cd0935b5526756ab745545420e40adc3c56d
2007-06-12 18:20:04 +00:00
shess
8a7de08a8b Fix overzealous fts2 assertions WRT rowid 0 or lower. Only check that
docids are ascending if there was a prior docid set for the doclist,
ignore the initial docid of 0. (CVS 4026)

FossilOrigin-Name: ed3a131f1d3fe51d1e79bdfe1bfafa55f825afa9
2007-05-21 21:59:18 +00:00
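The relaxed check described in 8a7de08a8b could be sketched roughly as follows. This is only an illustration: the `DocidOrder` tracker and `docidOrderAccept()` are hypothetical names that do not come from fts2.c, `long long` stands in for SQLite's 64-bit integer type, and "ascending" is assumed to mean strictly ascending. Ordering is enforced only once a previous docid has been recorded, so a first docid of 0 or lower is accepted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tracker for illustration; not part of fts2.c. */
typedef struct DocidOrder {
  int hasPrev;       /* True once a docid has been recorded. */
  long long iPrev;   /* Last recorded docid; meaningful only if hasPrev. */
} DocidOrder;

/* Record iDocid and return 1 if it keeps the doclist strictly ascending.
** Otherwise return 0 and leave the tracker unchanged.  The first docid
** is always accepted, whatever its value. */
static int docidOrderAccept(DocidOrder *p, long long iDocid){
  assert( p!=NULL );
  if( p->hasPrev && iDocid<=p->iPrev ) return 0;
  p->hasPrev = 1;
  p->iPrev = iDocid;
  return 1;
}
```

Keeping an explicit `hasPrev` flag avoids treating an initial `iPrev` of 0 as a real docid, which is the situation the commit message calls out.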
shess
290283fe69 Enable prefix-search in query-parsing and snippet generation. If the
character immediately after the end of a term is '*', that term is
marked for prefix matching.  Modify term comparison in
snippetOffsetsOfColumn() to respect isPrefix.  fts2n.test runs prefix
searching through some obvious test cases. (CVS 3893)

FossilOrigin-Name: 7c4c65924035d9f260f6b64eb92c5c6cf6c04b7b
2007-05-01 18:25:52 +00:00
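A minimal sketch of the prefix marking described in 290283fe69, under stated assumptions: the `QueryTerm` struct and both helper functions are hypothetical and are not copied from fts2.c. Only the rule itself comes from the commit message: a '*' immediately after the end of a term marks that term for prefix matching, and term comparison respects the flag.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical query term; the field names are illustrative only. */
typedef struct QueryTerm {
  const char *zTerm;   /* Start of the term inside the query string. */
  size_t nTerm;        /* Length of the term in bytes. */
  int isPrefix;        /* True if the term was followed by '*'. */
} QueryTerm;

/* Build a term from the token zQuery[iStart..iEnd).  The byte at iEnd is
** either the NUL terminator or the character right after the token. */
static QueryTerm queryTermFromToken(const char *zQuery, size_t iStart,
                                    size_t iEnd){
  QueryTerm t;
  assert( zQuery!=NULL && iStart<=iEnd && iEnd<=strlen(zQuery) );
  t.zTerm = zQuery + iStart;
  t.nTerm = iEnd - iStart;
  t.isPrefix = (zQuery[iEnd]=='*');
  return t;
}

/* Return 1 if the document token matches the query term.  A prefix term
** matches any token that starts with it; otherwise lengths must agree. */
static int queryTermMatches(const QueryTerm *p, const char *zTok,
                            size_t nTok){
  assert( p!=NULL && zTok!=NULL );
  if( p->isPrefix ){
    return nTok>=p->nTerm && memcmp(p->zTerm, zTok, p->nTerm)==0;
  }
  return nTok==p->nTerm && memcmp(p->zTerm, zTok, nTok)==0;
}
```

With the query `run* fast`, the first term would match `running` while the second would match only `fast` itself.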
shess
cc3e986643 Modify loadSegmentLeavesInt() to correctly handle prefix searching.
The new function docListUnion() is used to accumulate a union of the
hits for the matching terms, which will be merged across segments
using docListMerge(). (CVS 3891)

FossilOrigin-Name: 72c796307338c2751a91c30f6fb16989afbf3816
2007-05-01 17:14:59 +00:00
shess
0b6212090f Propagate prefix flag through implementation of doclist query code.
Also implement correct prefix-handling for traversal of interior nodes
of segment tree.  A given prefix can span multiple children of an
interior node, and from there the branches need to be followed in
parallel. (CVS 3889)

FossilOrigin-Name: cae844a01a1d87ffb00bba8b4e7b62a92e633aa9
2007-04-30 22:09:36 +00:00
shess
f055154108 Lift docListMerge() call out of loadSegmentLeavesInt() for prefix
search.  Doclists from multiple prefix matches will need a union merge
function, which will have to logically happen across a segment before
doclists are merged between segments. (CVS 3887)

FossilOrigin-Name: 7ddb82668906e33e2d6a796f2da1795032e036d5
2007-04-30 17:52:51 +00:00
shess
8ffcadb57e Break interior-node and leaf-node readers apart in loadSegment().
Previously, the code looped until the block was a leaf node as
indicated by a leading NUL.  Now the code loops until it finds a block
in the range of leaf nodes for this segment, then reads it using
LeavesReader.  This will make it easier to traverse a range of leaves
when doing a prefix search. (CVS 3884)

FossilOrigin-Name: 9466367d65f43d58020e709428268dc2ff98aa35
2007-04-27 22:02:57 +00:00
shess
ac7b2dd518 Lift code to traverse interior nodes out of loadSegment().
Refactoring towards prefix searching. (CVS 3882)

FossilOrigin-Name: 25935db73877c0cb132acb30c2fed2544d0e5e32
2007-04-27 21:24:18 +00:00
shess
1c7ebb0805 Refactor fts2 loadSegmentLeaf() in preparation for prefix-searching.
Prefix-searching will want to accumulate data across multiple leaves
in the segment; using LeavesReader instead of LeafReader is the first
step in that direction. (CVS 3881)

FossilOrigin-Name: 22ffdae4b6f3d0ea584dafa5268af7aa6fdcdc6e
2007-04-27 21:01:59 +00:00
shess
3b2f10cd8f Fix bug in fts2 handling of OR queries. When one doclist ends before
the other, the code potentially tries to read past the end of the
doclist.  http://www.sqlite.org/cvstrac/tktview?tn=2309 (CVS 3862)

FossilOrigin-Name: dfac6082e8ffc52a85c4906107a7fc0e1aa9df82
2007-04-19 18:36:32 +00:00
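The failure mode in 3b2f10cd8f (one doclist ending before the other) is the classic hazard of a two-way sorted merge. The sketch below is not the fts2 code: it works on plain arrays of `long long` docids rather than encoded doclists, and `docidUnion()` is a hypothetical name. It shows the bounds handling an OR merge needs so that neither input is read past its end.

```c
#include <assert.h>
#include <stddef.h>

/* Merge two ascending docid arrays into aOut, dropping duplicates.
** aOut must have room for nA+nB entries.  Returns the number of docids
** written.  Illustrative only; fts2 merges encoded doclists instead. */
static size_t docidUnion(const long long *aA, size_t nA,
                         const long long *aB, size_t nB,
                         long long *aOut){
  size_t i = 0, j = 0, n = 0;
  assert( aOut!=NULL );
  /* Compare only while BOTH inputs still have entries. */
  while( i<nA && j<nB ){
    if( aA[i]<aB[j] ){
      aOut[n++] = aA[i++];
    }else if( aB[j]<aA[i] ){
      aOut[n++] = aB[j++];
    }else{
      aOut[n++] = aA[i++];   /* Same docid in both lists: emit once. */
      j++;
    }
  }
  /* At most one of these loops runs: copy whatever input remains. */
  while( i<nA ) aOut[n++] = aA[i++];
  while( j<nB ) aOut[n++] = aB[j++];
  return n;
}
```

The key point is that the comparison loop stops as soon as either list is exhausted, and the remainder of the other list is copied without further comparisons.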
shess
6b6ab13353 Fix crash in delete when existing row has null fields. Previous code
assumed that the row had values in all columns, sigh.  Fixes bug
http://www.sqlite.org/cvstrac/tktview?tn=2289 . (CVS 3833)

FossilOrigin-Name: 81be7290a4db7b74a533aaf95c7389eb4bde6a88
2007-04-09 20:45:40 +00:00
shess
06c69d2ed6 Buffer updates per-transaction rather than per-update. If lots of
updates happen within a single transaction, there was a lot of wasted
encode/decode overhead due to segment merges.  This code buffers
updates in memory and writes out larger level-0 segments.  It only
works when documents are presented in ascending order by docid.
Comparing a test set running 100 documents per transaction, the total
runtime is cut almost in half. (CVS 3751)

FossilOrigin-Name: 0229cba69698ab4b44f8583ef50a87c49422f8ec
2007-03-29 18:41:03 +00:00
shess
194f8972d5 Don't call ctype functions on hi-bit chars. Some platforms raise
assertions when this occurs, and it's almost certainly not the right
thing to do in the first place. (CVS 3746)

FossilOrigin-Name: f6c3abdc6c5e916e5366ba28fb1cd06ca3554303
2007-03-29 16:30:38 +00:00
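One way to avoid passing hi-bit chars to ctype functions, as 194f8972d5 describes, is to guard every call with an ASCII range check. The helpers below are a sketch under that assumption; `safeIsAlnum()` and `asciiTokenLength()` are hypothetical names, not functions from fts2.c, and the actual change may be structured differently.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Return 1 if c is an ASCII letter or digit, else 0.  Bytes with the
** high bit set are rejected before isalnum() is ever called, and the
** argument is converted to unsigned char so it is never negative. */
static int safeIsAlnum(char c){
  unsigned char u = (unsigned char)c;
  return u<0x80 && isalnum(u);
}

/* Count the leading ASCII alphanumeric bytes of a NUL-terminated string. */
static size_t asciiTokenLength(const char *z){
  size_t n = 0;
  assert( z!=NULL );
  while( z[n]!='\0' && safeIsAlnum(z[n]) ) n++;
  return n;
}
```

Passing a negative `char` value other than `EOF` to a ctype function is undefined behavior in C, which is consistent with the assertions some platforms raise.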
shess
13ee81fe96 Refactor PLWriter to remove owned buffer. DLCollector (Document List
Collector) now handles the case where PLWriter (Position List Writer)
needed a local buffer.  Change to using the associated DLWriter
(Document List Writer) buffer, which reduces the number of memory
copies needed in doclist processing, and brings PLWriter operation in
line with DLWriter operation. (CVS 3707)

FossilOrigin-Name: d04fa3a13a84f49074c673b8ee2fb6541da061b5
2007-03-22 00:14:28 +00:00
shess
4607fc06f6 Refactor PLWriter in preparation for buffered-document change.
Currently, PLWriter (Position List Writer) creates a locally-owned
DataBuffer to write into.  This is necessary to support doclist
collection during tokenization, where there is no obvious buffer to
write output to, but is not necessary for the other users of PLWriter.
This change adds a DLCollector (Doc List Collector) structure to
handle the tokenization case.

Also fix a potential memory leak in writeZeroSegment().  In case of
error from leafWriterStep(), the DataBuffer dl was being leaked. (CVS 3706)

FossilOrigin-Name: 1b9918e20767aebc9c1e7523027139e5fbc12688
2007-03-20 23:52:37 +00:00
shess
3438ea3b9e http://www.sqlite.org/cvstrac/tktview?tn=2219
When creating fts tables in an attached database, the backing tables
are created in database 'main'.  This change propagates the
appropriate database name to the routines which build sql statements.

Note that I propagate the database name and table name separately.  I
briefly considered just making the table name be "db.table", but it
didn't fit so well in the model used to store the table name and other
information, and having the db name passed separately seemed a bit
more transparent. (CVS 3631)

FossilOrigin-Name: 283385d20724f0144f38de89bd179715ee5e738b
2007-02-07 01:01:17 +00:00
shess
3ad202dd17 http://www.sqlite.org/cvstrac/tktview?tn=2166,35
Calling UPDATE against an fts table in a UTF-16 database inserts
corrupted data into the database.  The UTF-8 data is being inserted
directly.  This appears to happen because sqlite3_value_text()
destructively coerces a value to UTF-8, and it's never converted back
when updating the table. This works around the problem by rearranging
things so that the update happens before the coercion. (CVS 3596)

FossilOrigin-Name: 4f2ab4b6320ffc621900049b41f50bc30d76d7f5
2007-01-19 22:59:56 +00:00
shess
f7912aff8a Drop a couple variables which are no longer used anywhere. (CVS 3524)
FossilOrigin-Name: 08c2cc0e0782cfaca89947a01b7ea4474dbe71aa
2006-11-29 23:41:10 +00:00
shess
5c327dbb46 http://www.sqlite.org/cvstrac/tktview?tn=2046
The virtual table interface allows for a cursor to field multiple
xFilter() calls.  For instance, if a join is done with a virtual
table, there could be a call for each row which potentially matches.
Unfortunately, fulltextFilter() assumes that it has a fresh cursor,
and overwrites a prepared statement and a malloc'ed pointer, resulting
in unfinalized statements and a memory leak.

This change hacks the code to manually clean up offending items in
fulltextFilter(), emphasis on "hacks", since it's a fragile fix
insofar as future additions to fulltext_cursor could continue to have
the problem. (CVS 3521)

FossilOrigin-Name: 18142fdb6d1f5bfdbb1155274502b9a602885fcb
2006-11-29 05:17:28 +00:00
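The cursor-reuse problem in 5c327dbb46 can be illustrated without SQLite at all. In the sketch below, `DemoCursor`, `demoFilter()` and `demoClose()` are hypothetical stand-ins: the cursor owns a single malloc'ed string, whereas the commit message says the real cursor also holds a prepared statement that needs the same treatment. The point is only that a filter routine must release state left by an earlier call instead of assuming a fresh cursor.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cursor that owns one heap-allocated query string. */
typedef struct DemoCursor {
  char *zQuery;   /* Owned copy of the current query, or NULL. */
} DemoCursor;

/* Install a new query on the cursor.  May be called repeatedly on the
** same cursor.  Returns 0 on success, 1 if memory allocation fails. */
static int demoFilter(DemoCursor *pCur, const char *zQuery){
  size_t n;
  char *zCopy;
  assert( pCur!=NULL && zQuery!=NULL );
  n = strlen(zQuery) + 1;
  zCopy = malloc(n);
  if( zCopy==NULL ) return 1;
  memcpy(zCopy, zQuery, n);
  free(pCur->zQuery);   /* Release the earlier call's copy; free(NULL) is a no-op. */
  pCur->zQuery = zCopy;
  return 0;
}

/* Release everything the cursor owns. */
static void demoClose(DemoCursor *pCur){
  assert( pCur!=NULL );
  free(pCur->zQuery);
  pCur->zQuery = NULL;
}
```

As the commit message notes, this style of fix is fragile: every field added to the cursor later needs its own cleanup in the filter path.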
shess
7e3d0c2d2f Delta-encode terms in interior nodes. While experiments have shown
that this is of marginal utility when encoding terms resulting from
regular English text, it turns out to be very useful when encoding
inputs with very large terms. (CVS 3520)

FossilOrigin-Name: c8151a998ec2423b417566823dc9957c7d5d782c
2006-11-29 01:02:03 +00:00
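Delta-encoding of terms, as mentioned in 7e3d0c2d2f, generally means storing each term as the length of the prefix it shares with the previous term plus the remaining suffix. The sketch below assumes that general scheme; `TermDelta` and both functions are hypothetical, and the on-disk layout fts2 actually uses is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of a delta-encoded term. */
typedef struct TermDelta {
  size_t nPrefix;        /* Bytes shared with the previous term. */
  size_t nSuffix;        /* Bytes that follow the shared prefix. */
  const char *zSuffix;   /* Points into the term being encoded. */
} TermDelta;

/* Describe zTerm relative to the previous term zPrev. */
static TermDelta termDeltaEncode(const char *zPrev, size_t nPrev,
                                 const char *zTerm, size_t nTerm){
  TermDelta d;
  size_t n = 0;
  assert( zPrev!=NULL && zTerm!=NULL );
  while( n<nPrev && n<nTerm && zPrev[n]==zTerm[n] ) n++;
  d.nPrefix = n;
  d.nSuffix = nTerm - n;
  d.zSuffix = zTerm + n;
  return d;
}

/* zBuf holds the previous term.  Overwrite it in place with the next
** term and store the new length in *pnTerm.  Returns 0 on success or 1
** if the decoded term would not fit in nBufMax bytes. */
static int termDeltaDecode(char *zBuf, size_t nBufMax,
                           const TermDelta *pDelta, size_t *pnTerm){
  size_t nTerm;
  assert( zBuf!=NULL && pDelta!=NULL && pnTerm!=NULL );
  nTerm = pDelta->nPrefix + pDelta->nSuffix;
  if( nTerm>nBufMax ) return 1;
  memcpy(zBuf + pDelta->nPrefix, pDelta->zSuffix, pDelta->nSuffix);
  *pnTerm = nTerm;
  return 0;
}
```

For `international` followed by `internet`, only the shared length 6 and the suffix `et` need to be stored, which suggests why the saving grows with very large terms that share long prefixes.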
shess
f72442be68 Store minimal terms in interior nodes. Whenever there's a break
between leaf nodes, instead of storing the entire leftmost term of the
rightmost child, store only that portion of the leftmost term
necessary to distinguish it from the rightmost term of the leftmost
child. (CVS 3513)

FossilOrigin-Name: f6e0b080dcfaf554b2c05df5e7d4db69d012fba3
2006-11-18 00:12:44 +00:00
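The rule in f72442be68 amounts to keeping only the shortest prefix of the right-hand term that still sorts after the left-hand term. A possible sketch, assuming byte-wise (memcmp-style) term ordering and a hypothetical helper name `minimalTermLength()` that does not appear in fts2.c:

```c
#include <assert.h>
#include <stddef.h>

/* Given the rightmost term of the left child (zLeft) and the leftmost
** term of the right child (zRight), with zLeft sorting strictly before
** zRight, return how many leading bytes of zRight are needed to tell
** the two apart. */
static size_t minimalTermLength(const char *zLeft, size_t nLeft,
                                const char *zRight, size_t nRight){
  size_t n = 0;
  assert( zLeft!=NULL && zRight!=NULL );
  /* Skip the prefix the two terms share. */
  while( n<nLeft && n<nRight && zLeft[n]==zRight[n] ) n++;
  /* Because zLeft < zRight, zRight has at least one byte beyond the
  ** shared prefix; that single extra byte separates the terms. */
  assert( n<nRight );
  return n + 1;
}
```

For a left term `apple` and a right term `apricot`, the interior node would need only `apr` (3 bytes) instead of the full 7-byte term.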
shess
9e6a561554 Refactoring groundwork for coming work on interior nodes. Change
LeafWriter to use empty data buffer (instead of empty term) to detect
an empty block.  Code to validate interior nodes.  Moderate revisions
to leaf-node and doclist validation.  Recast leafWriterStep() in terms
of LeafWriterStepMerge(). (CVS 3512)

FossilOrigin-Name: f30771d5c7ef2b502af95d81a18796b75271ada4
2006-11-17 21:12:15 +00:00