3134 Commits

Author SHA1 Message Date
dan
6d67e33776 Mark fts3ReallocOrFree and fts3InitVtab as static. Ticket [ff44d82f3b].
FossilOrigin-Name: a9038306c33c88120d8bef27209d8f0641c85c9b
2009-12-09 05:30:36 +00:00
dan
84db21ec6a Add tests to improve coverage of fts3. Associated bugfixes.
FossilOrigin-Name: f0eac4175aee6c50ee68acc253f76fbe44574250
2009-12-08 19:05:53 +00:00
dan
7bf44fc018 Remove a redundant line from fts3.
FossilOrigin-Name: cd50acf37fd1e3b388f98fb2df7ed03cff454b24
2009-12-07 16:26:52 +00:00
shaneh
0d935576a4 Move some SQLITE_TEST code down to avoid mixing code and variable declarations.
Fix a test for high-order bit handling in sqlite3Fts3InitTokenizer().

FossilOrigin-Name: fad43d290f9489726aaa2e999a17ea17ed78b27b
2009-12-07 16:18:37 +00:00
dan
ff32e39c8e Add some tests for OR, AND and NOT operations to fts3rnd.test. Add tests to check that errors are returned when bad arguments are passed to fts3 functions snippet, offsets and optimize. Minor fix for the same
FossilOrigin-Name: 5811df3f0412598d189d46b58de4deff24573651
2009-12-07 12:34:51 +00:00
dan
28f372f6d9 Fix another bug in 3-way NEAR queries.
FossilOrigin-Name: 3bb13a06521b54194c9f3eb44e0dc42bacf016a4
2009-12-05 14:29:22 +00:00
dan
6e053f9c23 Fix some problems with FTS3 and 3-way NEAR queries.
FossilOrigin-Name: 23dc6fb5b28712d1ba18dc7ddb3f2ef3b469d611
2009-12-05 11:37:19 +00:00
dan
165b67cb36 Fix a problem involving a 3-way NEAR query.
FossilOrigin-Name: 507890a9139875b1b594225c432c714f67312c0e
2009-12-04 19:07:24 +00:00
dan
acf28fbdd8 Modify [2ad1461f25] to avoid leaving a prepared statement in "active" state following an OOM error in FTS3.
FossilOrigin-Name: 69c21ee46aeeb624fd8638b17ff7259a3e5f9a46
2009-12-04 14:11:33 +00:00
drh
406a15ad8b Fix an FTS3 problem where an OOM error was not being propagated back
out to the top-level interface.

FossilOrigin-Name: 2ad1461f255c2499367b706a5ec65b44c1fc1618
2009-12-04 13:42:59 +00:00
dan
e2e5145441 Fix an incorrect assert() in fts3.c. Add further fts3 tests.
FossilOrigin-Name: 75863c2d55e0801add5b8dcf88d575c5c870af04
2009-12-03 17:36:22 +00:00
shaneh
e585b8f05c Updates to FTS3 to correct compiler warnings under MSVC.
FossilOrigin-Name: 37495b55ffbdc2db4482367ac7d8e32d4d71d58e
2009-12-03 06:26:46 +00:00
drh
c12075b3b2 Change an error message in FTS3 to avoid using an uninitialized variable.
FossilOrigin-Name: 620a8a2b38f5f2ad3db304b2bc88360285c174db
2009-12-02 20:25:57 +00:00
dan
19125aaf68 Fix an uninitialized value read in sqlite3async.c.
FossilOrigin-Name: 1cf2136c39239a6fea6ac2a555f55865dd517d93
2009-12-02 18:16:56 +00:00
dan
bc7c039ce2 Clear the Pager.dbModified flag when unlocking the database. Assert that it is clear when locking the database.
FossilOrigin-Name: d17ec16b7c5051c904c09580a856593b2fb85edc
2009-12-02 14:44:32 +00:00
dan
6bd3b2aa64 Use #include "sqlite3.h" instead of <sqlite3.h> in fts3Int.h.
FossilOrigin-Name: 7737db490ceba02c29c36fe181b4e7895b40aa75
2009-12-01 17:08:09 +00:00
dan
8c4499764b Add typedefs for i16 and u8 to fts3Int.h when not building via the amalgamation method.
FossilOrigin-Name: fa56c1c47296c2f9ba1de9450d421dd06fde5a6a
2009-12-01 17:05:50 +00:00
drh
65e8c82e1a Changes to the TCL interface header to allow it to be compiled independently
from the amalgamation.

FossilOrigin-Name: 58113932d93926b4aa037a7487105a55f883cd0a
2009-12-01 13:57:48 +00:00
dan
3acb07d6c3 Open a savepoint within the FTS3 optimize() function.
FossilOrigin-Name: 4924fbb244bd1b7103e29e045812cb1c4d2d81c8
2009-12-01 13:48:13 +00:00
dan
612b1d5cd0 Fix a segfault that can occur when querying an empty FTS3 table. Also restore the rowid/docid conflict handling to work as it did in version 3.6.20.
FossilOrigin-Name: c022f66b5a65aa54d5ebd55cfe941118a2042280
2009-12-01 12:00:22 +00:00
drh
ff3f307cd0 Test coverage improvements in the FTS3 porter stemmer.
FossilOrigin-Name: 6d112bfd53998b8f6693d3f2edbcd5ab4cdf5fb1
2009-11-30 19:48:16 +00:00
drh
0a62730d3f Updates to snippet() and offsets() functions of FTS3 so that they work
sanely following an OOM fault.

FossilOrigin-Name: b939a37a8ce296785a300e79ab9d3d87ad91343f
2009-11-28 21:33:21 +00:00
drh
9287d93c17 Change FTS3 to detect when the RHS of the MATCH operator encounters an OOM
during string format conversion and report back an SQLITE_NOMEM error.

FossilOrigin-Name: 31eed4f8f95f0799d634eccbd9e09cb58172d250
2009-11-28 17:23:47 +00:00
drh
44c1e5a13c Remove all benign OOM failure opportunities from the FTS3 hash table
implementation.  All OOM faults cause SQLITE_NOMEM to be returned.

FossilOrigin-Name: 80754d383a0e890ea3f315dab941b9f166481ddd
2009-11-28 17:07:42 +00:00
dan
8e9f6aedae Add a test case for creating an FTS3 table with no module arguments or opening/closing brackets in the CREATE VIRTUAL TABLE statement.
FossilOrigin-Name: a9cba7ea0a06efa7a63a3069b219cc30fb127e98
2009-11-28 15:35:16 +00:00
dan
81fa6dc319 Fix a bug introduced by the fts3 refactoring (segfault when creating a table with zero module args). Also a fix to handle an OOM error.
FossilOrigin-Name: eada284bc10cafcab9beb3473bb0c70b3b4de2f9
2009-11-28 12:40:32 +00:00
dan
f6d7b055fa Remove a C++ism accidentally added to fts3.c.
FossilOrigin-Name: 97d332416069d2fbce323740b276d0e7523eeee5
2009-11-27 12:14:47 +00:00
dan
7eee299243 Add some missing comments to fts3 files. No source code changes.
FossilOrigin-Name: b6402b2065b844acb3f1bb94ad964568706bcb86
2009-11-21 03:03:21 +00:00
dan
e6828f5503 Merge leaf accidentally created by [1c4984c62f].
FossilOrigin-Name: cae949ce971ca216e0f8880b2f93866619fa05be
2009-11-20 10:23:12 +00:00
dan
d1414c58e5 Improve comments and other things in fts3_write.c.
FossilOrigin-Name: 1cf0e3cc14bad22867e740736c2886dc1c4a48dc
2009-11-20 05:05:19 +00:00
dan
d313865550 Minor optimizations to fts3 code.
FossilOrigin-Name: b456eacbbb16513d1b27e90015ea58a6dc92cc3b
2009-11-20 02:24:15 +00:00
dan
8f4a4f24dd Fix a performance regression introduced while reworking the fts3 code.
FossilOrigin-Name: 7cd178a72ab99c94fdacffb19aad819ae600e57d
2009-11-19 18:28:45 +00:00
dan
16708c4a0d Fix some fts3 related issues with the autoconf and amalgamation build systems.
FossilOrigin-Name: 3b17924754343c0163464dabf01a9c46ffccef28
2009-11-19 15:25:25 +00:00
dan
5dc842ddf7 Fix problems introduced into fts3 as part of the refactoring.
FossilOrigin-Name: fa0998e19d984ee57f4f506c34eb858026cc49c3
2009-11-19 00:15:27 +00:00
dan
bd61689382 Add some missing comments and fix some other issues in fts3 code.
FossilOrigin-Name: 2fe579e778b75fbf503c02e01e5424c1926f2b49
2009-11-18 15:35:58 +00:00
dan
f13b704ee6 Improvements to the way fts3 reads the full-text index.
FossilOrigin-Name: 45c051e78651d8204c17cecdda2bde705698881f
2009-11-17 12:52:10 +00:00
dan
948a5f88ea Add a few extra coverage test cases for fts3.
FossilOrigin-Name: f29c8fcade4aadeae3824975cf59f306c11c906b
2009-11-16 16:36:23 +00:00
dan
91f0ce39e4 Further OOM testing for fts3 code. Add Tcl code implementing an integrity-check for fts3.
FossilOrigin-Name: c27d46b33e8596b45c562c2742b05030e8899092
2009-11-14 11:41:00 +00:00
dan
09977bb9f0 Start reworking fts3 code to match the rest of SQLite (code conventions, malloc-failure handling etc.).
FossilOrigin-Name: 30a92f1132801c7582007ee625c577ea2ac31cdf
2009-11-13 10:36:20 +00:00
drh
c81c11f62c Remove the obsolete "$Id:$" RCS identifier strings from the source code.
FossilOrigin-Name: f6c045f649036958078cb15cd9d5453680c82b0c
2009-11-10 01:30:52 +00:00
dan
fd3b22265e Use 64-bit arithmetic in the xRead() method of asyncRead. Fix for [94c04eaadb].
FossilOrigin-Name: ca3e41b0574cfd8d971c2be2114e58273a531970
2009-10-19 07:50:25 +00:00
dan
4ec56ff0ce Fix some errors in the guttman versions (disabled by default) of the algorithms in rtree.c.
FossilOrigin-Name: 64bad00b4f6fbbc3e5e75966f9c3959ad3d542ef
2009-10-05 05:40:08 +00:00
dan
17458718b2 Update an r-tree test to account for changes in the query planner.
FossilOrigin-Name: e5ce66d40bd68dc014071f7830112fa3b1d72948
2009-09-10 18:26:05 +00:00
danielk1977
9af00021a1 Mark the rtreeUpdate function as static. (CVS 6961)
FossilOrigin-Name: b6bdfdc69df4fc6cad669fd8b2cbaa9ecb95cb78
2009-08-06 18:36:47 +00:00
danielk1977
ee0484c1b5 Add the experimental API sqlite3_strnicmp(). Modify fts3 so that in terms like 'column_name:token' the column_name is interpreted in a case-insensitive fashion. Ticket #3996. (CVS 6950)
FossilOrigin-Name: 4571aa9e9142db465ae8250b0adf27e0a094331a
2009-07-28 16:44:26 +00:00
danielk1977
5368f29ac4 When the asynchronous IO backend opens a file with the EXCLUSIVE flag set, make sure only a single file-descriptor is opened (not one for reading and one for writing). This change fixes #3978. (CVS 6905)
FossilOrigin-Name: 630e669b97a81f9125d4bdc18517738b74eecdec
2009-07-18 11:52:04 +00:00
danielk1977
33c54a989e Return a meaningful error message if a keyword is used as an rtree table column name. Ticket #3970. (CVS 6902)
FossilOrigin-Name: 046efe46b50fbe928f39a0cda1b1006d486ce9f5
2009-07-17 16:54:48 +00:00
danielk1977
e932ba260e Fix a double-free that can occur when using the fts3 legacy syntax '-' operator. Add tests for the same operator. Ticket #3960. (CVS 6874)
FossilOrigin-Name: c19d419e8cf94a26d9bb6ad478e84841168a882e
2009-07-10 09:24:43 +00:00
danielk1977
1ed93e9085 Add conditional 'extern "C"' block to sqlite3async.h. Ticket #3866. (CVS 6662)
FossilOrigin-Name: e4d1b117c90dca341bfa74291c7dfc2afca38cc6
2009-05-21 04:42:19 +00:00
shane
eb4ac06f4e More cleanup, etc. to support MSVC compiles. (CVS 6582)
FossilOrigin-Name: 2cd9655e7313671f2bbe8d4a6f13246cbbf61205
2009-04-30 17:45:33 +00:00
shane
a3628d14d7 Fixed compile for MSVC; removed compiler warnings; changes for NDEBUG build; minor code tweaks. (CVS 6570)
FossilOrigin-Name: e98b12425ff036b36165dfd2002e0530ca27a677
2009-04-29 18:11:59 +00:00
danielk1977
6f050aa2bf Tests for the new asynchronous IO API. (CVS 6549)
FossilOrigin-Name: 11b2564e7159168cd0815bb9bc93688586fad1e0
2009-04-25 08:39:14 +00:00
danielk1977
4598b8e4a1 Make selecting the asynchronous IO file-locking mode a runtime operation. Still untested. (CVS 6544)
FossilOrigin-Name: 577277e84a05707b8c21aa08bc5fc314c1ac38ac
2009-04-24 10:13:05 +00:00
danielk1977
debcfd2dcb Improve comments and documentation of the asynchronous IO VFS module. (CVS 6543)
FossilOrigin-Name: 92bc6be2a86f8a68ceded2bc08fe7d6ff23b56fb
2009-04-24 09:27:16 +00:00
danielk1977
a3f065980e Move the asynchronous IO code from src/test_async.c to ext/async/. Refactor it to be a standalone module and to support windows. (CVS 6539)
FossilOrigin-Name: e71fb0fb8d83b4453c3c1e84606bf58d04926809
2009-04-23 14:58:39 +00:00
danielk1977
2fe5cb1809 Avoid fts3 crash on (MATCH '""') expressions. Ticket #3717. (CVS 6343)
FossilOrigin-Name: 03679857a320517a7b89e5214e948bce9af896a9
2009-03-12 15:43:47 +00:00
shane
be21779385 Corrected typos and misspellings. Ticket #3702. (CVS 6336)
FossilOrigin-Name: 6404afa0c515a6536fc2e878d4fb451e4dc06942
2009-03-05 04:20:31 +00:00
danielk1977
e1d3ac9cd0 Add a comment to fts3_tokenizer.h to make it clear how the xNext() method is supposed to set its output variables. Make sure the output variables of xNext() are only used if SQLITE_OK is returned. Ticket #3604. (CVS 6198)
FossilOrigin-Name: 5b3c075f96be9671d0bcffe928589b211559e835
2009-01-21 17:45:33 +00:00
drh
d162988b47 Fix typos in comments in FTS3 implementation. (CVS 6178)
FossilOrigin-Name: b0f066630c35c4947d3ecd29d32d91036da19e94
2009-01-14 18:59:41 +00:00
drh
be90df0b3e Do not display matches against
the right-hand side of a NOT operator in the output
of the FTS snippet() or offsets() functions. (CVS 6097)

FossilOrigin-Name: d44c84c0f77bd0fc4a9942177b6cae6d109b89b7
2009-01-02 01:10:42 +00:00
danielk1977
fc8c9f84ab Fix some problems in the fts3 expression parser with mismatched parenthesis. (CVS 6095)
FossilOrigin-Name: ccfe4580ac7ba9add0e69c786a9a3a43d69b7753
2009-01-01 14:06:13 +00:00
drh
b39187ae89 Additional test cases and cleanup of FTS3 parenthesis processing. (CVS 6094)
FossilOrigin-Name: afac4293000f81410d105a99956605bf7102fa62
2009-01-01 12:34:45 +00:00
danielk1977
5973e6a30b Add pseudo-random tests of the fts3 expression parser. Revise the fix in (6091). (CVS 6092)
FossilOrigin-Name: 11c2d4686197fb3f0d601651d5bbb3492af8f0dd
2009-01-01 07:08:54 +00:00
danielk1977
49b4b4d84a Fix a bug parsing "<expr> AND (abc NEAR def)" in fts3_expr.c. (CVS 6091)
FossilOrigin-Name: d1a6a2edd799d65ff88510df951e909919e35b6b
2009-01-01 04:19:51 +00:00
drh
42128b9e33 Fix the name in the documentation of the compile-time macro for
enabling FTS3 parenthesis processing. (CVS 6089)

FossilOrigin-Name: ac8258da6ecd3ea37f394dc3b48834eb57832cf4
2008-12-31 19:27:53 +00:00
drh
757b178100 Fix the FTS3 expression parser so that it works in the amalgamation when
FTS3 is disabled. (CVS 6088)

FossilOrigin-Name: 7e238e8604b9a9f786d84a47d21c6b42f1585755
2008-12-31 16:27:58 +00:00
drh
aeba020bea Fix the FTS3 module with parenthesis syntax so that it will work in
the amalgamation. (CVS 6087)

FossilOrigin-Name: c2b9891fc05ec05b270f108f61ab81b2df874e01
2008-12-31 16:01:04 +00:00
danielk1977
d597e08b23 Fix a bug in README.tokenizers. Ticket #3559. (CVS 6075)
FossilOrigin-Name: b8898d132e84888dc7c51b2f1ab67f78cc21f31b
2008-12-30 06:36:50 +00:00
danielk1977
7974759cb4 Fix a reference counting bug in rtree. Ticket #3549. (CVS 6054)
FossilOrigin-Name: bbdc0e9f2481f8d59e05ea282b615f97e09fb471
2008-12-22 15:04:32 +00:00
danielk1977
d34c03a946 Add the file ext/fts3/README.syntax, containing documentation describing the two query syntaxes now supported by fts3. (CVS 6042)
FossilOrigin-Name: ed81ad5a5d22304a4d96e778e8e9094f74c461c0
2008-12-19 11:37:38 +00:00
danielk1977
78d41832fc Fix a bug in icuOpen() in fts2. (CVS 6038)
FossilOrigin-Name: b9c722bd96b44e0fabd1564ddd982d2aabb7047c
2008-12-18 05:30:26 +00:00
danielk1977
f0f9f75443 Fix some strict-aliasing problems in fts3_expr.c. (CVS 6035)
FossilOrigin-Name: 20a4ca5d361ecbb982129171f10cccac4f5ad093
2008-12-17 15:49:51 +00:00
danielk1977
33e8903540 Modify fts3 to support a more complex expression syntax that allows parenthesis. The new syntax is not entirely backwards compatible, so is disabled by default. Use -DSQLITE_ENABLE_FTS3_PARENTHESIS to enable it. (CVS 6034)
FossilOrigin-Name: 7389b9ecb80294569845c40a23e0c832d07f7a45
2008-12-17 15:18:17 +00:00
danielk1977
777da0848d Fix a couple of memory leaks that may follow malloc failures. (CVS 5906)
FossilOrigin-Name: 4cf8a8e1bf22e1d8f7166e64328a95fe36c75033
2008-11-13 19:12:34 +00:00
drh
7ab49bfd1e Do not redefine the MIN and MAX macros if they are already defined. (CVS 5896)
FossilOrigin-Name: f41dd2053c8a297a05b47d0ef631b4d9a7db2fff
2008-11-12 15:24:27 +00:00
danielk1977
a7435e31ab Remove unused parameter from function rtreeInit() (part of the r-tree extension). (CVS 5842)
FossilOrigin-Name: 3224ea59812d0f3b5685bd92751054b81e3b681e
2008-10-25 17:10:10 +00:00
drh
8578611b95 Fix the NEAR connector in FTS3 so that it can take ranges in excess of 9.
The maximum range is now 32767. (CVS 5695)

FossilOrigin-Name: 8e9b9553115c42dae38cad0612d98d9a0c453a5c
2008-09-12 18:25:30 +00:00
danielk1977
b9134e3e84 Fix a bug in r-tree related to internal nodes with one or more dimensions of size zero. Ticket #3363. (CVS 5682)
FossilOrigin-Name: 8b600ed083d48784df4b1da1320a01bebbf233d7
2008-09-08 11:07:03 +00:00
danielk1977
1c82665040 Add header file sqliteicu.h to the ICU extension. This is analogous to the rtree.h and fts3.h headers used by other extensions to declare their entry points. Fix for ticket #3361. (CVS 5680)
FossilOrigin-Name: 79364b963b348d5433da737b4e21e97952882389
2008-09-08 08:08:09 +00:00
danielk1977
075c23af26 Begin adding support for the SQLITE_OMIT_WSD macro. Some (many) WSD variables still need wrappers added to them. (CVS 5652)
FossilOrigin-Name: 573d92abb9adb1c321ebc2fcadcf14374213b093
2008-09-01 18:34:20 +00:00
danielk1977
865d4d4290 Have the rtree module set the estimatedCost output variable. Ticket #3312. (CVS 5649)
FossilOrigin-Name: 483932c4e08901a11b7ab671073fd0a048b10d66
2008-09-01 12:46:59 +00:00
shess
7fdb522caf Backport http://www.sqlite.org/cvstrac/chngview?cn=5489 from fts3.
Re-used prepared statement from fts2 cursor. (CVS 5499)

FossilOrigin-Name: 02870ed21dae2601a656b2f30c3ca0041e9cb60f
2008-07-29 20:38:17 +00:00
shess
b5f94870c2 Re-used prepared statement from fts3 cursor. Previously, each call to
fulltextFilter() finalized any existing prepared statement and
prepared a new one.  In the case where idxNum has not changed, simply
resetting the statement suffices.  This provides an order of magnitude
speedup in incoming joins against docid. (CVS 5489)

FossilOrigin-Name: a08a5f2b1256b8a93beca5a359ccfc28d403efa3
2008-07-29 01:13:02 +00:00
shess
25192cac24 Be a bit more suspicious of invalid results from the tokenizer.
Backports check-in (4514) from fts3. (CVS 5459)

FossilOrigin-Name: 311aeb9c2b75c420a37198a93e353c72e9166747
2008-07-22 23:54:50 +00:00
shess
db94e39b07 Implement optimize() function.
Backports check-in (5417) from fts3. (CVS 5458)

FossilOrigin-Name: c16900dc7603cab30f8729b25361bc88bb37ae43
2008-07-22 23:49:44 +00:00
shess
08904673c8 Delete all fts2 index data when the table becomes empty.
Backports check-in (5413) from fts3. (CVS 5457)

FossilOrigin-Name: 4c98179be258319f441ae4e123cf59af77e96409
2008-07-22 23:41:26 +00:00
shess
3d373110f0 fts2 functions for testing scripts.
Backports (5340) from fts3. (CVS 5456)

FossilOrigin-Name: 4e47394be9dfbf0f9309e55eb6c6a3a517ea2006
2008-07-22 23:32:27 +00:00
shess
deca811cb5 Change prefix search from O(N*M) to O(NlogM).
Backports (4599) from fts3. (CVS 5455)

FossilOrigin-Name: 3f614453d2d7c753a5963b027fe8618b50b4f6b9
2008-07-22 23:08:40 +00:00
shess
b2822a2b5e Changes fts2 to use only sqlite3_malloc() and not system malloc.
Backports (4554) and (4555) from fts3. (CVS 5454)

FossilOrigin-Name: ecf2dec66cb979cb7d8db3b7ce5c64cab57fe2bb
2008-07-22 22:57:54 +00:00
shess
29647900e2 fts2.c buildTerms() passes -1 for nInput.
Backports (4511) from fts3. (CVS 5453)

FossilOrigin-Name: d562515e1cdd05212674516033c64b5f5668b799
2008-07-22 22:20:50 +00:00
shess
4249b3f539 Cleanup the hash functions in FTS2.
Backports (4440) from fts3. (CVS 5452)

FossilOrigin-Name: e31d2f875c13ee41742c9aaee6291662cdbbf863
2008-07-22 22:15:47 +00:00
drh
7cb53b0fdb Allow the r-tree extension to be compiled as part of the amalgamation. (CVS 5424)
FossilOrigin-Name: 5c26f63e476be3e18b2acdec5dd459da3bfceefa
2008-07-16 14:43:34 +00:00
shess
7d9ef0d0fc Implement optimize() function. This merges all segments in the fts
index into a single segment, including dropping delete cookies. (CVS 5417)

FossilOrigin-Name: b22e187bc2b38bd219dd0feba19b97279bd83089
2008-07-15 21:32:07 +00:00
shess
c2c66a030d Delete all fts3 index data when the table becomes empty. Previously,
deleting all rows from an fts3 table would leave a bunch of index data
describing the terms of the original data, plus deletions of those
terms, perhaps with some amount of it merged together so the deletions
knocked out the originals.  Even when all rows were deleted that
original data would hang out, though eventually it would mostly be
overwritten if new data contained the same set of terms. (CVS 5413)

FossilOrigin-Name: 8b872e426091d9ef108e52dbec0d968ed7452907
2008-07-14 20:43:15 +00:00
danielk1977
3ddb5a5104 Have the rtree extension publish two virtual table types: "rtree" and "rtree_i32". rtree_i32 stores coordinate data as 32-bit signed integers. rtree uses 32-bit real (floating point) values. (CVS 5410)
FossilOrigin-Name: c060a9a6beca455bdceee9ce6ca71a7262f98a5f
2008-07-14 15:37:00 +00:00
shess
6c106e3f3b fts3 functions for testing scripts. These are a first step towards
being able to write test scripts which verify that fts3 is internally
building indices in the expected way.  Both new functions are only
defined if fts3.c is compiled with SQLITE_TEST defined, as when
building testfixture.  These functions are not intended to be part of
the exposed fts3 API.

dump_terms() generates a TEXT result of all the terms in the index (or
a specified segment), sorted and joined with spaces.

dump_doclist() generates a TEXT representation of the doclist
associated with a given term in the index (or a specified segment). (CVS 5340)

FossilOrigin-Name: a48e3d95f7a656285e959cef595cbe6d53428ad9
2008-07-03 19:53:21 +00:00
danielk1977
8cf6c554c0 Fix a bug causing the pager-cache size to be reset to its default value whenever the database schema was reloaded. (CVS 5283)
FossilOrigin-Name: 6dbe67da5cb0141e011b4fdcc3964a20f68be843
2008-06-23 16:53:46 +00:00
danielk1977
b13dee9900 Run (a subset of) the rtree tests from quick.test. (CVS 5282)
FossilOrigin-Name: e872c78c72eb5976e72123485692a76409bd857f
2008-06-23 15:55:52 +00:00
drh
0d287cf775 Fix another typo in the rtree README file. (CVS 5187)
FossilOrigin-Name: 9ab87b7b0d0195787f1527b5be1475fb89330f08
2008-06-04 15:09:16 +00:00
drh
72e87f44d0 Fix a bug in the R-Tree documentation. (CVS 5186)
FossilOrigin-Name: bb445a4b1fe43d7b3e8546a6510f4e3c3ecb500b
2008-06-04 14:20:09 +00:00
drh
0224d26d37 Allow the SQLITE_MAX_EXPR_DEPTH compile-time parameter to be set to 0 in
order to disable expression depth checking.  Ticket #3143. (CVS 5166)

FossilOrigin-Name: 5ceef40e397fc535173996404345b93f695e8cac
2008-05-28 13:49:34 +00:00
drh
4b4f780188 Fix a bug in rtree that occurs when too many constraints are passed
in on a query. (CVS 5162)

FossilOrigin-Name: 54b84a3ddba9d27814c2f613dd197f691ac549a4
2008-05-27 00:06:02 +00:00
drh
9f86ad2354 Use %w instead of %q when constructing shadow table names for rtree. (CVS 5161)
FossilOrigin-Name: 78f4ba974d9b768b62391d8cd2ed407d49584cb8
2008-05-26 20:49:02 +00:00
drh
58f1c8b773 Update the amalgamation builder to incorporate the RTREE extension. (CVS 5160)
FossilOrigin-Name: aa8eba3360c31182f5238e96b83a382374f40fab
2008-05-26 20:19:25 +00:00
danielk1977
ebaecc148f Import 'rtree' extension. (CVS 5159)
FossilOrigin-Name: b104dcd6adadbd3fe15a348fe9d4d290119e139e
2008-05-26 18:41:54 +00:00
drh
8a29dfdea0 Remove all instances of sprintf() from the FTS modules. Ticket #3049. (CVS 4996)
FossilOrigin-Name: 062bf5d44d53ae0ee2bf96eddcc8de09157aa789
2008-04-12 13:06:09 +00:00
drh
dd95535f20 Minor fixes to FTS3 so that it works better when appended to the end
of the amalgamation. (CVS 4769)

FossilOrigin-Name: 62ede6699d8f116921a5a0baddca5e7e63740cd3
2008-02-01 15:34:09 +00:00
drh
820a90694e Version number to 3.5.5. Include FTS3 in the amalgamation by default
(but disabled unless compiled with -DSQLITE_ENABLE_FTS3).  Fix a memory
allocation problem. (CVS 4757)

FossilOrigin-Name: 72411043e60d5358d5a7adf566d662d65d3b3336
2008-01-31 13:35:48 +00:00
drh
85b623f2f9 Change all instances of "it's" in comments to either "its" or "it is",
as appropriate, in case the comments are ever again read by a pedantic
grammarian.  Ticket #2840. (CVS 4629)

FossilOrigin-Name: 4e91a267febda572e7239f0f1cc66b3102558c36
2007-12-13 21:54:09 +00:00
shess
b6a75606ed Change prefix search from O(N*M) to O(NlogM). The previous code
linearly merged the doclists, so as the accumulated list got large,
things got slow (the M term, a function of the number of documents in
the index).  This change does pairwise merges until a single doclist
remains.  A test search of 't*' against a database of RFC text
improves from 1m16s to 4.75s. (CVS 4599)

FossilOrigin-Name: feef1b15d645d638b4a05742f214b0445fa7e176
2007-12-07 23:47:53 +00:00
drh
8255feca02 The FTS3 amalgamation can now be appended to the SQLite amalgamation to
generate a single source file that contains both components. (CVS 4558)

FossilOrigin-Name: 0fc61f99b54bd269fcc011f448b9b971e902cb01
2007-11-24 00:41:52 +00:00
drh
a6f46e991e Do not require SQLITE_ENABLE_BROKEN_FTS2 if FTS2 is not enabled.
The same for FTS1.  Ticket #2777. (CVS 4556)

FossilOrigin-Name: f94cdcfd1171fd110ed9cd4c47f1fb5fa7e99ca9
2007-11-23 18:06:23 +00:00
drh
ac320cc14a Add a #include of sqlite3.h to fts3_hash.c. Tickets #2762 and #2777. (CVS 4555)
FossilOrigin-Name: c8485eb8bc62c810ec9f73e103468c57116fd94c
2007-11-23 18:01:07 +00:00
drh
613a0fe455 Changes fts3 to use only sqlite3_malloc() and not system malloc.
Ticket #2762. (CVS 4554)

FossilOrigin-Name: 460af6bb668094c99a1d4dc1540b44b6d1d036b6
2007-11-23 17:31:17 +00:00
shess
cd7274ceb0 Don't do anything when input doclists are both empty. Ticket #2774 (CVS 4546)
FossilOrigin-Name: 75cb46f82a6a95dbe9e279dede299bafa2e91cae
2007-11-16 00:23:07 +00:00
shess
adafd5747f Be a bit more suspicious of invalid results from the tokenizer. (CVS 4514)
FossilOrigin-Name: deb8f56d3adea0025d28b8effabec7c7b7fe3026
2007-10-24 23:24:22 +00:00
shess
b6d78dc7bb fts3.c buildTerms() passes -1 for nInput. (CVS 4511)
FossilOrigin-Name: e87c883a1235ac47ee340a31051dcd5deb369d4e
2007-10-24 21:52:37 +00:00
danielk1977
1c1764ae66 Add the NEAR operator to fts3. (CVS 4502)
FossilOrigin-Name: aef7720e0bb49d52332ddebe6f698feb926ef7d7
2007-10-22 18:02:20 +00:00
drh
8a07c7a414 Cleanup the hash functions in FTS3. (CVS 4440)
FossilOrigin-Name: ac645c8f30aac0d98fc481260084c9bd3975a845
2007-09-20 12:53:27 +00:00
shess
961303c1e7 Drop the forced error from fts3.c and add forced errors to fts2.c and
fts1.c. (CVS 4427)

FossilOrigin-Name: fec6567a0f8a868cda9bba2a473491dfb17b6c88
2007-09-13 18:16:08 +00:00
shess
d83ae45639 Add an implicit (HIDDEN) docid column. This works as an alias to
rowid, similar to how things work in SQLite tables with INTEGER
PRIMARY KEY.  Add tests to verify operation. (CVS 4426)

FossilOrigin-Name: c8d2345200f9ece1af712543982097d0b6f348c7
2007-09-13 18:14:49 +00:00
shess
0ec85ae216 Mark the table-named column HIDDEN. Add tests to make sure it's
working as expected. (CVS 4425)

FossilOrigin-Name: ca669eaf1b4af441741129bee4af02f32a7c74b8
2007-09-13 18:12:09 +00:00
shess
999cc5d7e8 Fix memory leak reported by an fts1 user. Was losing a doclist on a
query error. (CVS 4347)

FossilOrigin-Name: eee025024972852990e704253d1443c1cefb376c
2007-08-30 19:56:37 +00:00
shess
27a770e044 Fix memory leak of InteriorReader.term. Comes up when doing queries
against large segments. (CVS 4315)

FossilOrigin-Name: 6c617bd89fc57881a2a308a6360e8ebb42835d46
2007-08-28 20:36:53 +00:00
shess
bae37537b0 Make comments and variable naming more consistent WRT rowid versus
docid/blockid.  This should have no code impact. (CVS 4281)

FossilOrigin-Name: 76f1e18ebc25d692f122784e87d202992c4cfed2
2007-08-23 20:28:49 +00:00
shess
6beeb0329a Fix fts3 to not have the VACUUM bug from fts2. %_content.docid is an
alias to fix the rowid for documents, %_segments.blockid is an alias
to fix the rowid for segment blocks.  Unit test for the problem. (CVS 4280)

FossilOrigin-Name: 6eb2d74a8cfce322930f05c97d4ec255f3711efb
2007-08-23 20:23:37 +00:00
shess
acce22f5c7 Copy fts2 to fts3, renaming, and replacing references to fts2 with
fts3, including capitalization variants. (CVS 4249)

FossilOrigin-Name: 216c91d2fc49792d9ff53596746f1162f5b7f8d4
2007-08-20 17:37:02 +00:00
shess
9fa502205d Convert fts2 to use sqlite3_prepare_v2() to prevent certain logic
errors around SQLITE_SCHEMA handling.  This also allows
sql_step_statement() and sql_step_leaf_statement() to be replaced with
sqlite3_step().

Also fix a logic error in flushPendingTerms() which was clearing the
term table in case of error.  This was wrong in the face of
SQLITE_SCHEMA.  Even though the change to sqlite3_prepare_v2() should
cause us not to see SQLITE_SCHEMA any longer, it was still a logic
error... (CVS 4205)

FossilOrigin-Name: 16730cb137eaf576b87cdc17913564c9c5c0ed82
2007-08-10 23:47:03 +00:00
drh
e6e4d6bb1a Fix some compiler warnings. (CVS 4196)
FossilOrigin-Name: 6cc15409ad6baefbe6e2214a4ac1cb3a0433f922
2007-08-05 23:52:05 +00:00
rse
e21733baa5 Fix ticket #2439: the FTS1 and FTS2 extensions use the non-standard,
unportable and highly deprecated <malloc.h> header on all platforms
except Apple Mac OS X. The <malloc.h> actually is never required on
any OS with an at least partly POSIX-conforming API as the malloc(3) &
friends functions officially live in <stdlib.h> since over 10 years.
Under some platform like FreeBSD the inclusion of <malloc.h> since a few
years even causes an "#error" and this way a build failure. So, just get
rid of the bad <malloc.h> usage in FTS1 and FTS2 extensions at all and
stick with <stdlib.h> there only. (CVS 4191)

FossilOrigin-Name: 3f9a666143a8aafa0b1a5d56ec68f69f2b3d6a21
2007-07-30 18:55:36 +00:00
shess
a2d04e9a0f Implement xRename() for fts1 so that it is possible to rename fts1 tables.
See http://www.sqlite.org/cvstrac/chngview?cn=4143 (CVS 4184)

FossilOrigin-Name: febf75f022b9414fc456ddf274d301f95d61e1b8
2007-07-25 00:56:09 +00:00
shess
443ecd036d Replicates http://www.sqlite.org/cvstrac/chngview?cn=4151 which
modified fts2:

Modify handling of SQLITE_SCHEMA in fts2 code. An SQLITE_SCHEMA error
may cause SQLite to reload the internal schema, deleting and
recreating v-table objects. So the sqlite3_vtab structure can be
deleted out from under a v-table implementation. (CVS 4183)

FossilOrigin-Name: f9020cffda02923ef45979bb447ec2e232086ad5
2007-07-25 00:38:05 +00:00
shess
f6e3624cfc Sorry, previous check-in included a last-minute "Did it really work?"
change :-). (CVS 4182)

FossilOrigin-Name: 5db25e369a1a4b5a4d87947abdbf25f96fe64807
2007-07-25 00:27:59 +00:00
shess
9f8a4b43ef Apply change 4095 to fts1. Fix snippet generation when the left-most
column of an fts table is used in the MATCH clause. Fix for ticket
#2429. (CVS 4181)

FossilOrigin-Name: c2ba3cc0f7ac9f5dfe5ffb554f9a1cd96b28335a
2007-07-25 00:25:20 +00:00
danielk1977
ab9749ebb9 Modify handling of SQLITE_SCHEMA in fts2 code. An SQLITE_SCHEMA error may cause SQLite to reload the internal schema, deleting and recreating v-table objects. So the sqlite3_vtab structure can be deleted out from under a v-table implementation. (CVS 4151)
FossilOrigin-Name: dee1a0fd28e8341af6523ab0c5628b671d7d2811
2007-07-02 10:16:49 +00:00
danielk1977
c033b64276 Implement xRename() for fts2 so that it is possible to rename fts2 tables. (CVS 4143)
FossilOrigin-Name: 488474fde753c5a7a14ed8f2fad7f16efd236491
2007-06-27 16:26:07 +00:00
danielk1977
9ff802627a Reorganize comments in fts2_tokenizer.h. No code changes. (CVS 4132)
FossilOrigin-Name: b331e30395e9fc90abe40ab802972a67648cf48e
2007-06-26 12:54:07 +00:00
danielk1977
08ada518ff Remove the unused EXTSRC variable from the non-configure makefile. (CVS 4129)
FossilOrigin-Name: bbdcf372c6f2144a62fba742b3f4bd6b2fe58593
2007-06-26 10:56:40 +00:00
danielk1977
4877ef2aae Fix an uninitialized variable in fts2. (CVS 4128)
FossilOrigin-Name: c349cf942534357955f80fc2aa8c96206af97b78
2007-06-26 10:55:01 +00:00
danielk1977
576d3db541 Modify the non-configure build system to make it easier to build the library with the fts2 or icu extensions linked in. (CVS 4121)
FossilOrigin-Name: 02b23c4394da7efb82e9318146f10818b0f68b1f
2007-06-25 14:28:48 +00:00
drh
397aa141ed Put #ifdefs in fts2_tokenizer so that the build works even when FTS2
is omitted.  Add the SQLite blessing to the header comments on all FTS2
source files. (CVS 4120)

FossilOrigin-Name: c795e6fd8f01bcbc1967062632c13d4952abf4d8
2007-06-25 13:50:03 +00:00
drh
5665b3ea44 Allow the use of MySQL-style quoting in the FTS modules. Ticket #2446. (CVS 4119)
FossilOrigin-Name: 3be2a6d1c342454d93b05c38f3d9a960ab15dae2
2007-06-25 12:49:05 +00:00
danielk1977
46760820a1 Add a test that calls fts2_tokenizer() with an argument set via C code. (CVS 4118)
FossilOrigin-Name: fbcf2d75cd2b88d175c122477aa483f0771870e5
2007-06-25 12:05:40 +00:00
danielk1977
f86643b32f Add some tests for the fts2 icu tokenizer. (CVS 4117)
FossilOrigin-Name: b79ced3e0a26b0db13613073c847c2d2ba7e174e
2007-06-25 11:24:38 +00:00
danielk1977
24e1afa222 Add some documentation for user-defined fts2 tokenizers. (CVS 4116)
FossilOrigin-Name: 5a9eee86587219a68655d548864d129edec969ae
2007-06-25 09:52:31 +00:00
danielk1977
832a58a68c Extend fts2 so that user defined tokenizers may be added. Add a tokenizer that uses the ICU library if available. Documentation and tests to come. (CVS 4108)
FossilOrigin-Name: 68677e420c744b39ea9d7399819e0f376748886d
2007-06-22 15:21:15 +00:00
danielk1977
86889fc3c6 Fix snippet generation when the left-most column of an fts2 table is used in the MATCH clause. Fix for ticket #2429. (CVS 4095)
FossilOrigin-Name: fec56ad2ede53e3e202d9ad869a059eeb315796f
2007-06-20 06:23:54 +00:00
shess
401b80656d Minor comment edits from my prefix development client. No code changes. (CVS 4058)
FossilOrigin-Name: 6953cd0935b5526756ab745545420e40adc3c56d
2007-06-12 18:20:04 +00:00
danielk1977
b39fa65289 Add a README.txt file for the ICU extension. (CVS 4055)
FossilOrigin-Name: 7b6927829f18d39052e67eebca4275e7aa496035
2007-06-11 08:00:00 +00:00
shess
8a7de08a8b Fix overzealous fts2 assertions WRT rowid 0 or lower. Only check that
docids are ascending if there was a prior docid set for the doclist;
ignore the initial docid of 0. (CVS 4026)

FossilOrigin-Name: ed3a131f1d3fe51d1e79bdfe1bfafa55f825afa9
2007-05-21 21:59:18 +00:00
danielk1977
7de68a097e Add a version of the LIKE operator to the icu extension. Requires optimisation. (CVS 3939)
FossilOrigin-Name: 3e96105c1f084a4ab4dad4de6f4759e43fc497f7
2007-05-07 16:58:02 +00:00
danielk1977
2559136971 Add interface to configure SQLite to use ICU collation functions. (CVS 3936)
FossilOrigin-Name: b29a81b4fbb926fa09186340342848b9fe589033
2007-05-07 11:53:13 +00:00
danielk1977
a9808b31a8 Add the experimental create_collation_x() api. (CVS 3934)
FossilOrigin-Name: ff49d48f2f025898a0f4ace1fc227e1d367ea89f
2007-05-07 09:32:45 +00:00
danielk1977
83852acc44 Add the start of the ICU extension. (CVS 3931)
FossilOrigin-Name: f473e8526770b6a332dfde3e1fd1ddf8df493e9a
2007-05-06 16:04:11 +00:00
shess
290283fe69 Enable prefix-search in query-parsing and snippet generation. If the
character immediately after the end of a term is '*', that term is
marked for prefix matching.  Modify term comparison in
snippetOffsetsOfColumn() to respect isPrefix.  fts2n.test runs prefix
searching through some obvious test cases. (CVS 3893)

FossilOrigin-Name: 7c4c65924035d9f260f6b64eb92c5c6cf6c04b7b
2007-05-01 18:25:52 +00:00
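The prefix-marking rule described in 290283fe69 can be sketched as follows. This is a minimal illustration in Python, not the fts2 C code: the names `parse_query_terms` and `term_matches` are hypothetical, and the tokenizer here simply splits on non-alphanumeric characters, which is an assumption.

```python
def parse_query_terms(query):
    """Split a query into (term, is_prefix) pairs.

    A term is marked for prefix matching only when the character
    immediately after its last character is '*'.
    """
    terms = []
    i = 0
    n = len(query)
    while i < n:
        if not query[i].isalnum():
            i += 1  # skip delimiters, including a consumed '*'
            continue
        start = i
        while i < n and query[i].isalnum():
            i += 1
        is_prefix = i < n and query[i] == "*"
        terms.append((query[start:i].lower(), is_prefix))
    return terms


def term_matches(query_term, is_prefix, token):
    """Compare a document token with a query term, respecting is_prefix."""
    if is_prefix:
        return token.startswith(query_term)
    return token == query_term
```

With this sketch, a query such as `lin* kernel` yields a prefix term `lin` and an exact term `kernel`, while `lin *` yields only an exact term because the `*` does not directly follow the term.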
shess
cc3e986643 Modify loadSegmentLeavesInt() to correctly handle prefix searching.
The new function docListUnion() is used to accumulate a union of the
hits for the matching terms, which will be merged across segments
using docListMerge(). (CVS 3891)

FossilOrigin-Name: 72c796307338c2751a91c30f6fb16989afbf3816
2007-05-01 17:14:59 +00:00
shess
0b6212090f Propagate prefix flag through implementation of doclist query code.
Also implement correct prefix-handling for traversal of interior nodes
of segment tree.  A given prefix can span multiple children of an
interior node, and from there the branches need to be followed in
parallel. (CVS 3889)

FossilOrigin-Name: cae844a01a1d87ffb00bba8b4e7b62a92e633aa9
2007-04-30 22:09:36 +00:00
shess
f055154108 Lift docListMerge() call out of loadSegmentLeavesInt() for prefix
search.  Doclists from multiple prefix matches will need a union merge
function, which will have to logically happen across a segment before
doclists are merged between segments. (CVS 3887)

FossilOrigin-Name: 7ddb82668906e33e2d6a796f2da1795032e036d5
2007-04-30 17:52:51 +00:00
shess
8ffcadb57e Break interior-node and leaf-node readers apart in loadSegment().
Previously, the code looped until the block was a leaf node as
indicated by a leading NUL.  Now the code loops until it finds a block
in the range of leaf nodes for this segment, then reads it using
LeavesReader.  This will make it easier to traverse a range of leaves
when doing a prefix search. (CVS 3884)

FossilOrigin-Name: 9466367d65f43d58020e709428268dc2ff98aa35
2007-04-27 22:02:57 +00:00
shess
ac7b2dd518 Lift code to traverse interior nodes out of loadSegment().
Refactoring towards prefix searching. (CVS 3882)

FossilOrigin-Name: 25935db73877c0cb132acb30c2fed2544d0e5e32
2007-04-27 21:24:18 +00:00
shess
1c7ebb0805 Refactor fts2 loadSegmentLeaf() in preparation for prefix-searching.
Prefix-searching will want to accumulate data across multiple leaves
in the segment; using LeavesReader instead of LeafReader is the first
step in that direction. (CVS 3881)

FossilOrigin-Name: 22ffdae4b6f3d0ea584dafa5268af7aa6fdcdc6e
2007-04-27 21:01:59 +00:00
drh
6ed34c59c5 Add the ability to turn the FTS2 module into an amalgamation. (CVS 3864)
FossilOrigin-Name: 94374654ccabb391f5dcccfc88176ca677c5804e
2007-04-21 16:37:48 +00:00
shess
3b2f10cd8f Fix bug in fts2 handling of OR queries. When one doclist ends before
the other, the code potentially tries to read past the end of the
doclist.  http://www.sqlite.org/cvstrac/tktview?tn=2309 (CVS 3862)

FossilOrigin-Name: dfac6082e8ffc52a85c4906107a7fc0e1aa9df82
2007-04-19 18:36:32 +00:00
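The failure mode fixed in 3b2f10cd8f (reading past the end of the shorter doclist during an OR merge) is easiest to see in a small model. The sketch below treats a doclist as a sorted Python list of docids, which is an assumption made for illustration; the real fts2 doclists are encoded byte buffers, and `doclist_union` is a hypothetical name.

```python
def doclist_union(left, right):
    """Merge two ascending docid lists into their union.

    The main loop runs only while BOTH inputs have entries left, so
    neither list is ever indexed past its end. Whatever remains in
    the longer list is appended afterwards.
    """
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            merged.append(left[i])
            i += 1
        elif right[j] < left[i]:
            merged.append(right[j])
            j += 1
        else:
            # Same docid in both lists: emit it once.
            merged.append(left[i])
            i += 1
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```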
shess
6b6ab13353 Fix crash in delete when existing row has null fields. Previous code
assumed that the row had values in all columns, sigh.  Fixes bug
http://www.sqlite.org/cvstrac/tktview?tn=2289 . (CVS 3833)

FossilOrigin-Name: 81be7290a4db7b74a533aaf95c7389eb4bde6a88
2007-04-09 20:45:40 +00:00
shess
06c69d2ed6 Buffer updates per-transaction rather than per-update. If lots of
updates happen within a single transaction, there was a lot of wasted
encode/decode overhead due to segment merges.  This code buffers
updates in memory and writes out larger level-0 segments.  It only
works when documents are presented in ascending order by docid.
Comparing a test set running 100 documents per transaction, the total
runtime is cut almost in half. (CVS 3751)

FossilOrigin-Name: 0229cba69698ab4b44f8583ef50a87c49422f8ec
2007-03-29 18:41:03 +00:00
shess
194f8972d5 Don't call ctype functions on hi-bit chars. Some platforms raise
assertions when this occurs, and it's almost certainly not the right
thing to do in the first place. (CVS 3746)

FossilOrigin-Name: f6c3abdc6c5e916e5366ba28fb1cd06ca3554303
2007-03-29 16:30:38 +00:00
shess
13ee81fe96 Refactor PLWriter to remove owned buffer. DLCollector (Document List
Collector) now handles the case where PLWriter (Position List Writer)
needed a local buffer.  Change to using the associated DLWriter
(Document List Writer) buffer, which reduces the number of memory
copies needed in doclist processing, and brings PLWriter operation in
line with DLWriter operation. (CVS 3707)

FossilOrigin-Name: d04fa3a13a84f49074c673b8ee2fb6541da061b5
2007-03-22 00:14:28 +00:00
shess
4607fc06f6 Refactor PLWriter in preparation for buffered-document change.
Currently, PLWriter (Position List Writer) creates a locally-owned
DataBuffer to write into.  This is necessary to support doclist
collection during tokenization, where there is no obvious buffer to
write output to, but is not necessary for the other users of PLWriter.
This change adds a DLCollector (Doc List Collector) structure to
handle the tokenization case.

Also fix a potential memory leak in writeZeroSegment().  In case of
error from leafWriterStep(), the DataBuffer dl was being leaked. (CVS 3706)

FossilOrigin-Name: 1b9918e20767aebc9c1e7523027139e5fbc12688
2007-03-20 23:52:37 +00:00
shess
0d9f55a177 Out-of-memory cleanup in tokenizers. Handle NULL return from
malloc/calloc/realloc appropriately, and use sizeof(var) instead of
sizeof(type) to make certain that we don't get a mismatch between
them as the code rots. (CVS 3693)

FossilOrigin-Name: fbc53da8c645935c74e49af2ab2cf447dc72ba4e
2007-03-16 18:30:54 +00:00
shess
3438ea3b9e http://www.sqlite.org/cvstrac/tktview?tn=2219
When creating fts tables in an attached database, the backing tables
are created in database 'main'.  This change propagates the
appropriate database name to the routines which build sql statements.

Note that I propagate the database name and table name separately.  I
briefly considered just making the table name be "db.table", but it
didn't fit so well in the model used to store the table name and other
information, and having the db name passed separately seemed a bit
more transparent. (CVS 3631)

FossilOrigin-Name: 283385d20724f0144f38de89bd179715ee5e738b
2007-02-07 01:01:17 +00:00
shess
3ad202dd17 http://www.sqlite.org/cvstrac/tktview?tn=2166,35
Calling UPDATE against an fts table in a UTF-16 database inserts
corrupted data into the database.  The UTF-8 data is being inserted
directly.  This appears to happen because sqlite3_value_text()
destructively coerces a value to UTF-8, and it's never converted back
when updating the table. This works around the problem by rearranging
things so that the update happens before the coercion. (CVS 3596)

FossilOrigin-Name: 4f2ab4b6320ffc621900049b41f50bc30d76d7f5
2007-01-19 22:59:56 +00:00
shess
f7912aff8a Drop a couple variables which are no longer used anywhere. (CVS 3524)
FossilOrigin-Name: 08c2cc0e0782cfaca89947a01b7ea4474dbe71aa
2006-11-29 23:41:10 +00:00
shess
5c327dbb46 http://www.sqlite.org/cvstrac/tktview?tn=2046
The virtual table interface allows for a cursor to field multiple
xFilter() calls.  For instance, if a join is done with a virtual
table, there could be a call for each row which potentially matches.
Unfortunately, fulltextFilter() assumes that it has a fresh cursor,
and overwrites a prepared statement and a malloc'ed pointer, resulting
in unfinalized statements and a memory leak.

This change hacks the code to manually clean up offending items in
fulltextFilter(), emphasis on "hacks", since it's a fragile fix
insofar as future additions to fulltext_cursor could continue to have
the problem. (CVS 3521)

FossilOrigin-Name: 18142fdb6d1f5bfdbb1155274502b9a602885fcb
2006-11-29 05:17:28 +00:00
shess
7e3d0c2d2f Delta-encode terms in interior nodes. While experiments have shown
that this is of marginal utility when encoding terms resulting from
regular English text, it turns out to be very useful when encoding
inputs with very large terms. (CVS 3520)

FossilOrigin-Name: c8151a998ec2423b417566823dc9957c7d5d782c
2006-11-29 01:02:03 +00:00
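Delta-encoding of terms, as mentioned in 7e3d0c2d2f, can be illustrated with plain prefix compression: each term is stored as the number of leading bytes it shares with the previous term plus the remaining suffix. This is a generic sketch, not the fts2 on-disk format; the function names and the `(shared, suffix)` tuple representation are assumptions.

```python
def common_prefix_len(a, b):
    """Return the number of leading bytes shared by a and b."""
    n = 0
    limit = min(len(a), len(b))
    while n < limit and a[n] == b[n]:
        n += 1
    return n


def delta_encode_terms(terms):
    """Encode sorted terms as (shared_prefix_length, suffix) pairs."""
    encoded = []
    prev = b""
    for term in terms:
        shared = common_prefix_len(prev, term)
        encoded.append((shared, term[shared:]))
        prev = term
    return encoded


def delta_decode_terms(encoded):
    """Rebuild the original terms from (shared, suffix) pairs."""
    terms = []
    prev = b""
    for shared, suffix in encoded:
        prev = prev[:shared] + suffix
        terms.append(prev)
    return terms
```

The saving grows with term length: two long terms that differ only near the end cost one small integer plus a short suffix for the second term, which fits the commit's remark about inputs with very large terms.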
shess
f72442be68 Store minimal terms in interior nodes. Whenever there's a break
between leaf nodes, instead of storing the entire leftmost term of the
rightmost child, store only that portion of the leftmost term
necessary to distinguish it from the rightmost term of the leftmost
child. (CVS 3513)

FossilOrigin-Name: f6e0b080dcfaf554b2c05df5e7d4db69d012fba3
2006-11-18 00:12:44 +00:00
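The "minimal term" idea in f72442be68 amounts to choosing the shortest prefix of the right child's first term that still sorts after the left child's last term. A small sketch under that reading follows; `minimal_separator` is a hypothetical name and terms are modelled as Python `bytes`.

```python
def minimal_separator(left_last, right_first):
    """Shortest prefix of right_first that compares greater than left_last.

    left_last is the rightmost term of the left child, right_first is
    the leftmost term of the right child; left_last < right_first is
    required.
    """
    if not left_last < right_first:
        raise ValueError("terms must be strictly ascending")
    n = 0
    # Skip the bytes both terms share. Because left_last < right_first,
    # right_first cannot run out before a difference is found.
    while n < len(left_last) and left_last[n] == right_first[n]:
        n += 1
    # One byte past the shared prefix is enough to tell them apart.
    return right_first[: n + 1]
```

For example, separating `search` from `season` needs only `seas`, and separating `apple` from `banana` needs only `b`.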
shess
9e6a561554 Refactoring groundwork for coming work on interior nodes. Change
LeafWriter to use empty data buffer (instead of empty term) to detect
an empty block.  Code to validate interior nodes.  Moderate revisions
to leaf-node and doclist validation.  Recast leafWriterStep() in terms
of LeafWriterStepMerge(). (CVS 3512)

FossilOrigin-Name: f30771d5c7ef2b502af95d81a18796b75271ada4
2006-11-17 21:12:15 +00:00
shess
de163af26e Delta-encode docids. This is good for around 22% reduction in index
size with DL_POSITIONS.  It improves performance about 5%-6%. (CVS 3511)

FossilOrigin-Name: 9b6d413d751d962b67cb4e3a208efe61581cb822
2006-11-13 21:09:24 +00:00
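The size reduction reported in de163af26e comes from storing each docid as the difference from the previous one, so that a variable-length integer encoding needs fewer bytes. The sketch below uses a generic 7-bits-per-byte varint, which is an assumption and not necessarily the exact fts2 encoding; all function names are hypothetical.

```python
def put_varint(value):
    """Encode a non-negative int, 7 bits per byte, low-order group first."""
    if value < 0:
        raise ValueError("varint values must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def get_varint(buf, pos):
    """Decode one varint starting at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, pos


def encode_docids(docids):
    """Store ascending docids as varint-encoded deltas."""
    out = bytearray()
    prev = 0
    for docid in docids:
        if docid < prev:
            raise ValueError("docids must be ascending")
        out += put_varint(docid - prev)
        prev = docid
    return bytes(out)


def decode_docids(buf):
    """Reverse encode_docids by accumulating the deltas."""
    docids = []
    pos = 0
    prev = 0
    while pos < len(buf):
        delta, pos = get_varint(buf, pos)
        prev += delta
        docids.append(prev)
    return docids
```

In this model, the docids 1000000, 1000003 and 1000010 take three bytes each when stored directly but only five bytes in total as deltas, since the second and third deltas fit in a single byte.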
shess
debbcdfead Require a minimum fanout for interior nodes. This prevents cases
where excessively large terms keep the tree from finding a single
root.  A downside is that this could result in large interior nodes in
the presence of large terms, which may be prone to fragmentation,
though if the nodes were smaller that would translate into more levels
in the tree, which would also have that problem. (CVS 3510)

FossilOrigin-Name: 64b7e3406134ac4891113b9bb432ad97504268bb
2006-11-13 21:00:54 +00:00
shess
545311eeca Allow backing tables to be missing on dropping fts table. Fixes
http://www.sqlite.org/cvstrac/tktview?tn=1992,35 . (CVS 3509)

FossilOrigin-Name: 9628a61a6f33b7bec3455086534b76437d2622b4
2006-11-13 20:15:27 +00:00
shess
aedbce0376 Fix a pair of memory leaks. These were turned up by running valgrind
memcheck with various 10k doc insert, update, delete, and query tests. (CVS 3497)

FossilOrigin-Name: 3cd9b64b96018f69163ad0be0b5c07dd1be6abc6
2006-10-31 18:13:42 +00:00
shess
93d2a81401 Empty queries should get no results. My recent change
( http://www.sqlite.org/cvstrac/chngview?cn=3486 ) broke test fts2a-5.3.
This change should make the expected result more obvious. (CVS 3489)

FossilOrigin-Name: cde383eb467de0d752e94a22cd2f890c2dc599cc
2006-10-26 00:41:51 +00:00
shess
9d5586fc9f Make memset() uses less error-prone.
http://www.sqlite.org/cvstrac/tktview?tn=2036,35 describes some cases
where we were passing memset() a length which was the sizeof a
pointer, rather than the structure pointed to.  Instead, wrap this
idiom up in CLEAR() and SCRAMBLE() macros. (CVS 3488)

FossilOrigin-Name: 5878add0839f9c5bec77caae2361ec20cb60b48b
2006-10-26 00:04:31 +00:00
shess
627a74c48c Remove unreferenced local variable. (CVS 3487)
FossilOrigin-Name: 2d3b22197c7c06488b789cce333b34b6d1ae39aa
2006-10-25 23:22:03 +00:00
shess
87f1d16bdb Replace the DocList and DocListReader structures. The new structures
distinguish reading from a static buffer from writing to a dynamic
buffer.  This allows n-way doclist merging, and in-place merging of
segment leaf nodes, which together cut segment merge times in half. (CVS 3486)

FossilOrigin-Name: af5bfb986e39248abbfc6fff2e13c6f9e634a751
2006-10-25 21:00:09 +00:00
shess
9289cba076 Don't store empty segments. When inserting empty strings, the code
was writing out a segment made up of a single leaf node containing the
\0 header.  LeafReader assumed that leaf nodes always contained at
least one term, so assertions would fail.

While it would be possible to support reading and merging empty
segments, there's no reason to do so.  While this change could have
been done in writeZeroSegment(), I put it in leafWriterFlush() so that
it would work right if segmentMerge() created an empty segment, which
could happen with future changes to how deleted documents are handled. (CVS 3484)

FossilOrigin-Name: fed79beec7da24a26ae94494bdc0c98dd102bc06
2006-10-25 05:21:55 +00:00
drh
d9033a6569 Removing debugging printf from the porter stemmer code. Ticket #2016. (CVS 3475)
FossilOrigin-Name: 7a08c6272f76d53b13313019b4f9da3c8f02b650
2006-10-13 11:55:39 +00:00
shess
8a235d4d3b Convert fts2 to store data in a way which allows for much faster
updates.  Groups of documents form segments which are encoded in a
btree layered over a table of blocks, with various tricks to make
merges fast.  This performs 20x-25x faster than fts1 when loading the
Enron corpus, and is only slightly slower for queries. (CVS 3474)

FossilOrigin-Name: 85272b2f5394e37916afb1d509e7296810d976f5
2006-10-12 23:15:24 +00:00
shess
0d6e29b832 Fix leaky symbols. With this change, fts1 and fts2 can both be
statically linked. (CVS 3472)

FossilOrigin-Name: 5e8bbb85c1493e3ab2d807d24c68294f26838e49
2006-10-10 23:22:40 +00:00
shess
2670a173ed Copy fts1/ to fts2/, changing reference from fts1 to fts2. For future
reference, the source versions copied were:

README.txt r1.1
fts1.c r1.37
fts1.h r1.2
fts1_hash.c r1.1
fts1_hash.h r1.1
fts1_porter.c r1.1
fts1_tokenizer.h r1.4
fts1_tokenizer1.c r1.6 (CVS 3471)

FossilOrigin-Name: d0d1e7cdcc1dd085f1e359ce35c441699d517b02
2006-10-10 17:37:14 +00:00
shess
9f4683cd42 Fix incorrect doclist initialization in term_select_all().
docListRestrictColumn() generates a DL_POSITIONS doclist, which means
that after the first doclist is processed, the second doclist is
initialized as DL_POSITIONS, but with DL_POSITIONS_OFFSETS data.
(Note that DL_DEFAULT is now DL_POSITIONS, which masks this bug.) (CVS 3467)

FossilOrigin-Name: 144e3f11e22c6efd6f2d960599ab2d93542db406
2006-10-05 21:48:56 +00:00
drh
53c36d5444 The snippet generator adds ellipsis between text from different columns. (CVS 3465)
FossilOrigin-Name: 6cf1fb9f801dc1b2865c0d1f9afb1b2076d4246e
2006-10-04 17:35:28 +00:00
drh
b1b6d4a929 Make DL_POSITIONS the default mode in FTS1. Remove the need to compile
with SQLITE_CORE when SQLITE_ENABLE_FTS1 is used. (CVS 3462)

FossilOrigin-Name: df1a4b4834fdc88056371bcc767c5dfde2eaab72
2006-10-03 19:37:37 +00:00
drh
d75e03df2b Add the option to omit offset information from posting lists in FTS1. (CVS 3456)
FossilOrigin-Name: fdcea7b1ffd821f3f2b6d30997d3957f705a6d0c
2006-10-03 11:42:28 +00:00
drh
6da40bcd79 Add a Porter stemmer option to the FTS1 module. (CVS 3452)
FossilOrigin-Name: 936b06aaa8133e83104de87e03dc94e286a31f86
2006-10-01 18:41:19 +00:00
drh
7cf43fa64e Fix a bug in the handling of the OR operator in FTS1. Test cases added to
prevent a repeat. (CVS 3450)

FossilOrigin-Name: 8cdf1d6ae018dfc93f8f0962b2530e31aa0bebff
2006-09-28 19:43:31 +00:00
drh
07aa67c14a More snippet generator improvements and test cases. (CVS 3449)
FossilOrigin-Name: 0934d220b33c52024f42c89fa13326bd52333f39
2006-09-28 18:57:59 +00:00
drh
1e7423e57f Bug fix in the FTS1 snippet generator. Improvements in the way the snippet
generator handles whitespace. (CVS 3448)

FossilOrigin-Name: d3f4ae827582bd0aac54ae3211d272a1429b6523
2006-09-28 18:37:15 +00:00
drh
361e2bdeb5 Avoid segfaults when inserting NULL values into FTS1. (CVS 3447)
FossilOrigin-Name: 165645d30115f3171fc45489823f85639fe2bfcd
2006-09-28 11:41:41 +00:00
adamd
adf52ce14b Implemented UPDATE for full-text tables.
We handle an UPDATE to a row by performing an UPDATE on the content table and by building new position lists for each term which appears in either the old or new versions of the row.  We write these position lists all at once; this is presumably more efficient than a delete followed by an insert (which would first write empty position lists, then new position lists). (CVS 3434)

FossilOrigin-Name: 757fa22400b363212b4d5f648bdc9fcbd9a7f152
2006-09-22 00:06:39 +00:00
adamd
f40a504164 When gathering a doclist for querying, don't discard empty position lists until the end; this allows empty position lists to override non-empty lists encountered later in the gathering process. This fixes #1982, which was caused by the fact that for all-column queries we weren't discarding empty position lists at all. (CVS 3433)
FossilOrigin-Name: 111ca616713dd89b5d1e114de29c83256731c482
2006-09-21 20:56:52 +00:00
drh
8b62817797 Implementation of the snippet() function for FTS1. Includes a few
simple test cases but more testing is needed. (CVS 3431)

FossilOrigin-Name: c7ee60d00976efab25a830e7416538010c734129
2006-09-21 02:03:08 +00:00
adamd
d47522807e Fixed a build problem in sqlite3_extension_init(). (CVS 3430)
FossilOrigin-Name: bb2e1871cb10b470f96c793bb137c043ef30e1da
2006-09-18 21:14:40 +00:00
drh
a70034de7c Convert all names to lower case before sending them to the xFindFunction
method of a virtual table.  In FTS1, use strcmp instead of strcasecmp.
Ticket #1981. (CVS 3428)

FossilOrigin-Name: efa8fb32a596c7232bb1754b3231e4f2421df75b
2006-09-18 20:24:02 +00:00
drh
b08249ced3 Modify FTS1 so that the "magic" column has the same name as the virtual
table.  Offsets are retrieved using a special "offsets" function whose
first argument is the magic column.  Snippets will ultimately be retrieved
in the same way. (CVS 3427)

FossilOrigin-Name: 5e35dc1ffadfe7fa47673d052501ee79903eead9
2006-09-18 02:12:47 +00:00
drh
b7481e70c5 Add the sqlite3_overload_function() API - part of the virtual table
interface. (CVS 3426)

FossilOrigin-Name: aa7728f9f5b80dbb1b3db124f84b9166bf72bdd3
2006-09-16 21:45:14 +00:00
drh
ae2f2048df Fix an initialization problem in FTS1. Ticket #1977. (CVS 3424)
FossilOrigin-Name: 5a18dd88498ca35ca1333d88c4635868d0b61073
2006-09-15 16:08:59 +00:00
drh
f800e3e63a The FTS1 tables have a new automatic column named "offset" that returns
a string containing byte offset information for all matching terms.
Also added a large test case based on SQLite mailing list entries. (CVS 3417)

FossilOrigin-Name: f25cfa1aec0e4c1fe07176039a1b7f4e6a2c66ec
2006-09-14 01:17:30 +00:00
drh
8f116cc15c In FTS1: Retain the Query structure as part of the cursor. It will be used
later as part of snippet generation. (CVS 3414)

FossilOrigin-Name: 607d928ce91f3efa9c7019fc789a9cd3c41cfc92
2006-09-13 19:18:29 +00:00
shess
c48f2a10aa Earlier refactoring changed name in fts1.c but not fts1.h. (CVS 3413)
FossilOrigin-Name: d4edb8035c8abbdb301893557934dd644ef3c950
2006-09-13 18:40:25 +00:00
drh
1de6154d39 Minor code cleanup in FTS1. (CVS 3412)
FossilOrigin-Name: fca592816767de397fbaf22cccdf1028fc5dfc91
2006-09-13 17:17:48 +00:00
drh
a3baa963bc Implementation of "column:" modifiers in FTS1 queries. (CVS 3411)
FossilOrigin-Name: 820634f71e3a3499994f82b56b784d22a7e3cdcf
2006-09-13 16:02:43 +00:00
drh
cbaac514bc Module spec parser enhancements for FTS1. Now able to cope with column
names in the spec that are SQL keywords or have special characters, etc.
Also added support for additional control lines.  Column names can be
followed by a type specifier (which is ignored.) (CVS 3410)

FossilOrigin-Name: adb780e0dc8bc7dcd1102efbfa4bc17eefdf968e
2006-09-13 15:20:13 +00:00
drh
a6be0dc938 Fix the FTS1 test cases and add new tests. Comments added to the FTS1 code. (CVS 3409)
FossilOrigin-Name: 528036c828c93c78ca879bf89a52131b72e24067
2006-09-13 12:36:08 +00:00
adamd
4f1a424e72 Allow virtual tables to contain multiple full-text-indexed columns. Added a magic column "_all" which can be used for querying all columns in a table at once.
For now, each posting list stores position/offset information for multiple columns.  We may implement separate posting lists for separate columns at some future point. (CVS 3408)

FossilOrigin-Name: 366a70b086c817bddecd83053472ec76ef20f309
2006-09-13 02:18:20 +00:00
adamd
341d60838c Answer queries for a particular rowid in a full-text table by looking up
that rowid directly rather than by performing a table scan. (CVS 3407)

FossilOrigin-Name: 877d5558b1a6f65201b1825336935b146583bffa
2006-09-12 23:36:45 +00:00
shess
4240240f12 Re-use deleted rowids for new segments. This has a somewhat
surprising impact on performance, I believe because it keeps the index
smaller (by keeping rowids smaller), and also because it improves
locality in the table (deleting a row means we've already touched the
pages leading to that rowid). (CVS 3405)

FossilOrigin-Name: 2f5f6290c9ef99c7b060aecc4d996c976c50c9d7
2006-09-11 21:39:21 +00:00
drh
e410296021 Add a rudimentary tokenizer and parser to FTS1 for parsing the module
arguments during initialization.   Recognized arguments include a
tokenizer selector and a list of virtual table columns. (CVS 3403)

FossilOrigin-Name: 227dc3feb537e6efd5b0c1d2dad40193db07d5aa
2006-09-11 00:34:22 +00:00
drh
4ca8aac2b4 Add pzErr parameters to the xConnect and xCreate methods of virtual tables
in order to provide better error reporting.  This is an interface change
for virtual tables.  Prior virtual table implementations will need to be
modified and recompiled. (CVS 3402)

FossilOrigin-Name: f44b8bae97b6872524580009c96d07391578c388
2006-09-10 17:31:58 +00:00
drh
a2a9d18869 Add some simple test cases for the OR and NOT logic of the fts1 module.
Fix lots of bugs discovered while developing these test cases. (CVS 3400)

FossilOrigin-Name: 70bcff024b44d1b40afac6eba959fa89fb993147
2006-09-10 03:34:06 +00:00
drh
a7e98f2a54 Add support for OR and NOT terms in fts1. (CVS 3399)
FossilOrigin-Name: ae50265791d1a7500aa3c405a78a9bca8ff0cc08
2006-09-09 23:11:51 +00:00
shess
fb6794360d Write doclists using a segmented technique to amortize costs better.
New items for a term are merged with the term's segment 0 doclist,
until that doclist exceeds CHUNK_MAX.  Then the segments are merged in
exponential fashion, so that segment 1 contains approximately
2*CHUNK_MAX data, segment 2 4*CHUNK_MAX, and so on. (CVS 3398)

FossilOrigin-Name: b6b93a3325d3e728ca36255c0ff6e1f63e03b0ac
2006-09-08 17:00:17 +00:00
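The segmented write strategy in fb6794360d can be modelled roughly as below. This is a loose sketch, not the fts1 implementation: segments are plain sorted lists, sizes are counted in items rather than bytes, the small `CHUNK_MAX` value is chosen only for the demonstration, and the rule that level `k` spills once it exceeds `CHUNK_MAX * 2**k` is an assumption drawn from the "2*CHUNK_MAX, 4*CHUNK_MAX" wording.

```python
CHUNK_MAX = 4  # hypothetical threshold, counted in items for this sketch


def add_items(segments, new_items):
    """Merge new_items into segment 0, cascading upward when full.

    segments[k] is allowed to hold up to CHUNK_MAX * 2**k items; once
    it exceeds that, its contents are merged into segments[k + 1] and
    segments[k] is emptied. The list is modified in place and returned.
    """
    if not segments:
        segments.append([])
    segments[0] = sorted(segments[0] + list(new_items))
    level = 0
    while len(segments[level]) > CHUNK_MAX * (2 ** level):
        if level + 1 == len(segments):
            segments.append([])
        segments[level + 1] = sorted(segments[level + 1] + segments[level])
        segments[level] = []
        level += 1
    return segments
```

Most insertions touch only the small segment 0; the larger segments are rewritten progressively less often, which is how the cost is amortized.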
adamd
338565ad4b A minor change to fts1.c to fix broken build. (CVS 3393)
FossilOrigin-Name: 55a03b96251515a4817a0eefb197219a460640e7
2006-09-05 18:21:31 +00:00
drh
fb52cc95ff Add a TRACE macro to the FTS1 module for troubleshooting. Turned off by
default. (CVS 3388)

FossilOrigin-Name: d4923e98c66ae03d899f633e5e309471f5695abb
2006-09-02 20:58:25 +00:00
drh
7c2d87cd71 Convert static variables into constants in the FTS module. (CVS 3385)
FossilOrigin-Name: 098cbafcd6dcf57142b0417e796d27ffddcc0920
2006-09-02 14:16:59 +00:00
adamd
9eb3997b02 Miscellaneous restructuring and cleanup based on suggestions from shess. (CVS 3382)
FossilOrigin-Name: e98b0cf292f6dc9deb6ae9b773c52b16867f7556
2006-09-02 00:23:01 +00:00
shess
b2f4d0173a Make fts1.c not rely on nul-terminated strings. Mostly a matter of
making sure we always pass around ptr/len, but there were a few places
where we actually relied on nul-termination.

An earlier change had additionally changed appropriate
sqlite3_bind_text() calls to sqlite3_bind_blob().  I've found that
this changes what's actually stored in the database, so backed those
changes out.  Also (and this is weird), I found that I could no longer
do straight-forward = queries against %_term.term at a command-line. (CVS 3379)

FossilOrigin-Name: 5844db1aa9c23a005c88104b084f68afb21891c7
2006-09-01 00:33:44 +00:00
shess
c0beb14f23 Make tokenizer not rely on nul-terminated text. Instead of using
strcspn() and a nul-terminated delimiter list, I just flagged
delimiters in an array and wrote things inline.  Submitting this for
review separately because it's pretty standalone. (CVS 3378)

FossilOrigin-Name: 2631ceaeefaca3aa837e3b439399f13c51456914
2006-09-01 00:05:17 +00:00
drh
5db455e7b5 Refactor the FTS1 module so that its name is "fts1" instead of "fulltext",
so that all symbols with external linkage begin with "sqlite3Fts1", and
so that all filenames begin with "fts1". (CVS 3377)

FossilOrigin-Name: e1891f0dc58e5498a8845d8b9b5b092d7f9c7003
2006-08-31 15:07:14 +00:00
shess
2b85d5f46e Just don't run tolower() on hi-bit characters. This shouldn't cause
us to break any UTF-8 code points, unless they were already broken in
the input. (CVS 3376)

FossilOrigin-Name: 6c77c2d5e15e9d3efed3e274bc93cd5a4868f574
2006-08-30 21:40:30 +00:00
shess
c9e0a9057e Make static some symbols which shouldn't have been exported. (CVS 3371)
FossilOrigin-Name: 58006e38af760b53cf72bf127d7c7b8a619a1282
2006-08-28 23:46:01 +00:00
shess
4f4897e80d Make hi-bit characters delimiters. This is a stopgap until the tokenizer
and fulltext.c recognize UTF-8 correctly. (CVS 3370)

FossilOrigin-Name: ca850d3d80f67672172d11392fcdf60bfbb94c02
2006-08-28 20:08:56 +00:00
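The stopgap in 4f4897e80d, together with the later "don't run tolower() on hi-bit characters" change, can be pictured with a byte-oriented tokenizer that treats every byte with the high bit set as a delimiter. This is an illustrative sketch only; `is_delimiter` and `tokenize` are hypothetical names, and treating all non-alphanumeric ASCII as delimiters is an assumption.

```python
def is_delimiter(byte):
    """Return True for bytes that end a token.

    Any byte >= 0x80 (part of a multi-byte UTF-8 sequence) is a
    delimiter, so character classification is only ever applied to
    plain ASCII.
    """
    if byte >= 0x80:
        return True
    return not chr(byte).isalnum()


def tokenize(text):
    """Split a bytes object into lower-cased ASCII alphanumeric tokens."""
    tokens = []
    current = bytearray()
    for byte in text:
        if is_delimiter(byte):
            if current:
                tokens.append(bytes(current))
                current = bytearray()
        else:
            current.append(byte)
    if current:
        tokens.append(bytes(current))
    # bytes.lower() changes ASCII letters only.
    return [token.lower() for token in tokens]
```

The cost of the stopgap is visible here: a word such as "Café" is indexed as `caf`, because the two UTF-8 bytes of the final letter act as delimiters.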
shess
0de250e46f Fix gcc gripe about parens in a ||/&& in mergePosList().
Drop unused pBlob/nBlob in index_insert_term().
Fix NULL deref in an assertion in docListUpdate() delete case.
Minor code tightening in docListUpdate(). (CVS 3367)

FossilOrigin-Name: a6fcf9101a831bf5f129c6045eabf30376d365dc
2006-08-25 19:20:26 +00:00
adamd
1717edd157 A first implementation of a full-text search module for SQLite. (CVS 3363)
FossilOrigin-Name: b0d8e0d314d6f77b7d4b5dd00c694a1323f7a8e4
2006-08-23 23:58:50 +00:00
drh
fa9b4b1499 Add the ext/fts1 subdirectory for holding the first full-text search
extension. (CVS 3360)

FossilOrigin-Name: 7f152f9f3a647d30874f2da46ce93a1e31ea7cf3
2006-08-22 14:45:37 +00:00