Commit Graph

59 Commits

Author SHA1 Message Date
dan
6788c7b7c0 Begin adding support for deleting rows from contentless fts5 tables.
FossilOrigin-Name: e513bea84dfaf2280f7429c9a528b3a1354a46c36e58ab178ca45478975634e0
2023-07-10 20:44:09 +00:00
dan
3f23eb6813 Add an assert() to fts5_config to ensure that a potential OOM is being handled correctly.
FossilOrigin-Name: fe9c207657400f9d9f4e822eb658157bc147ed538e2701322f6f973933f023ed
2023-05-03 13:57:57 +00:00
dan
015020cd1a Add the 'secure-delete' option to fts5. Used to configure fts5 to aggressively remove old full-text-index entries belonging to deleted or updated rows.
FossilOrigin-Name: 4240fd09b717dbc69dffe3b88ec9149777ca4c3efa12f282af65be3af6fa5bb0
2023-04-12 17:40:44 +00:00
drh
7a3b4451a1 Fixes for harmless static-analyzer warnings. This also makes the code easier
for humans to understand.

FossilOrigin-Name: 36177a62feeb4fa93ab6e3c6f4dbe1ddcf63bb02f93284abab979da0261b218e
2021-10-05 17:41:12 +00:00
drh
f1f12661c3 Avoid taking the address of a NULL pointer following an OOM in FTS5. Doing
so is harmless in actual practice, but it technically UB so we want to
avoid it.

FossilOrigin-Name: 1cfcd9dceb56b5987e6900a36a0ec092f0e1b13a7e754b8c3d8efb943e5bcc66
2021-04-12 18:32:33 +00:00
dan
33a99fad08 Add experimental unicode-aware trigram tokenizer to fts5. And support for LIKE and GLOB optimizations for fts5 tables that use said tokenizer.
FossilOrigin-Name: 0d7810c1aea93c0a3da1ccc4911dbce8a1b6e1dbfe1ab7e800289a0c783b5985
2020-09-30 20:35:37 +00:00
dan
db5ed35609 Avoid a buffer overread in fts5 that could occur when parsing corrupt configuration records.
FossilOrigin-Name: 355afd77df21a2265871ca6d075f26b1fa121c7c2682cf512281944ff0c2186d
2019-12-10 03:40:11 +00:00
dan
ae55737fbf Do not allow users to effectively disable fts5 crisismerge operations by setting the crisismerge threshold to higher than the maximum allowable segment b-trees on a single level. Fix for [d392017c].
FossilOrigin-Name: 86e497209217abb7bcb491a023cd353f3c7c9c103ebd9f58dd8661b12cf3694c
2019-10-09 18:36:32 +00:00
dan
a6bd1871d1 Disallow fts5 page sizes greater than 65536 bytes - as there are 16-bit offsets used in the page header.
FossilOrigin-Name: 75775c5ab44e497cb19be10397229637f1374f05c3244e8f92d6c54fcea94f5f
2019-10-09 15:26:45 +00:00
dan
b186a622ee Disallow page-sizes smaller than 32 bytes in fts5. Also ensure the fts5 integrity-check works even when "PRAGMA reverse_unordered_selects" is true. Fix for [265e935b26].
FossilOrigin-Name: 8ab0aebdb3c2d6fb3160b2c58ce6cc0495a6ddd960878a6395958c837f3d1b71
2019-10-07 20:36:18 +00:00
dan
685b2ee0c3 Allow fts5 to filter on multiple MATCH clauses in a single scan.
FossilOrigin-Name: 9d418a7a491761eeb38a70898677a493e2631e5d62e75ee88431f52d3dfd2344
2019-09-12 19:38:40 +00:00
mistachkin
065f3bf4f2 Fix various harmless compiler warnings seen with MSVC.
FossilOrigin-Name: 1c0fe5b5763fe5cbace9773dcdab742e126d0bd035ab13d61f9d134afa0afc0c
2019-03-20 05:45:03 +00:00
drh
2d77d80a65 Use 64-bit math to compute the sizes of memory allocations in extensions.
FossilOrigin-Name: ca67f2ec0e294384c397db438605df1b47aae5f348a8de94f97286997625d169
2019-01-08 20:02:48 +00:00
dan
e8c20120ce Fix handling of strings that contain zero tokens in fts5. And other problems found by fuzzing.
FossilOrigin-Name: 72b3ff0f0df83e62adda6584b4281cf086d45e45
2016-03-12 16:32:16 +00:00
dan
4dbc65b29a Add an incremental optimize capability to fts5. Make the 'merge' command independent of the 'automerge' settings.
FossilOrigin-Name: 556671444c03e3afca072d0f5e9bea2657de6fd3
2016-03-09 20:54:14 +00:00
drh
df3a907ecc Add JSON1 and FTS5 to the set of extensions subject to close compiler warning
analysis.  Fix some warnings in each.   More (harmless) warnings still exist
in FTS5.

FossilOrigin-Name: cfe2eb88b504f5e9b1351022036641b1ac4c3e78
2016-02-11 15:37:18 +00:00
dan
8631402e6a Add further tests for fts5. Fix some problems with detail=col mode and auxiliary functions.
FossilOrigin-Name: de77d6026e8035c505a704e7b8cfe5af6579d35f
2016-01-16 18:58:51 +00:00
dan
f705e9deab Fix compiler warnings in fts5.
FossilOrigin-Name: 5a343cc0336bba056df4449e6cd2e3fb9e75a105
2016-01-14 14:15:54 +00:00
dan
9f44deed93 Change the name of the offsets=0 option to "detail=column". Have the xInst, xPhraseFirst and other API functions work by parsing the original text for detail=column tables.
FossilOrigin-Name: 228b4d10e38f7d70e1b008c3c9b4a1ae3e32e30d
2015-12-28 19:55:00 +00:00
dan
b12dc84fbb Add the "offsets=0" option to fts5, to create a smaller index without term offset information. A few things are currently broken on this branch.
FossilOrigin-Name: 40b5bbf02a824ca73b33aa4ae1c7d5f65b7cda10
2015-12-17 20:36:13 +00:00
dan
f5d8c58950 Fix the fts5 "prefix=" option to match the documentation (space separated list, multiple prefix= options supported). The undocumented comma-separated format (compatible with fts4) still works.
FossilOrigin-Name: 11eb8e877e2ba859ef6b44318f286597186dfaf2
2015-11-25 11:56:24 +00:00
dan
dbbda39453 Have fts5 load its configuration from the xConnect() method is invoked. This ensures that the very first query run uses the correct value of the 'rank' option.
FossilOrigin-Name: 33e6606f5e497e81119ec491cf2370f60bddafc0
2015-11-06 12:50:57 +00:00
dan
d82211db56 Add the 'hashsize' configuration option to fts5, for configuring the amount of memory allocated to the in-memory hash table while writing.
FossilOrigin-Name: 445480095e6877cce8220b1c095f334bbb04c1c3
2015-11-05 18:09:16 +00:00
mistachkin
cdabd7bd50 Fix harmless compiler warnings.
FossilOrigin-Name: 1c46c194a2da24fe613d77b5a8d727cc2fc9faa4
2015-10-14 20:34:57 +00:00
dan
fe8e2eba0a Remove the 0x00 terminators from the end of doclists stored on disk.
FossilOrigin-Name: 00d990061dec3661b0376bd167082942d5563bfe
2015-09-08 19:55:26 +00:00
dan
ee0c0a8de3 Another change to the fts5 tokenizer API.
FossilOrigin-Name: fc71868496f45f9c7a79ed2bf2d164a7c4718ce1
2015-08-29 15:44:27 +00:00
dan
57e0add3f9 Change the fts5 tokenizer API to allow more than one token to occupy a single position within a document.
FossilOrigin-Name: 90b85b42f2b2dd3e939b129b7df2b822a05e243d
2015-08-28 19:56:47 +00:00
dan
e3229c19cb Use a WITHOUT ROWID table to index fts5 btree leaves. This is faster to query and only slightly larger than storing btree nodes within an intkey table.
FossilOrigin-Name: 862418e3506d4b7cca9c44d58c2eb9dc915d75c9
2015-07-15 19:46:02 +00:00
dan
3f09beda45 Remove "#ifdef SQLITE_ENABLE_FTS5" from individual fts5 source files. Add a single "#if !defined(SQLITE_CORE) || defined(SQLITE_ENABLE_FTS5)" to fts5.c.
FossilOrigin-Name: 7819002ed85497bbd0f9cf4d39df641573324436
2015-07-02 15:52:21 +00:00
dan
6394d99a0e Fix a segfault that could follow an OOM error in fts5.
FossilOrigin-Name: 713239b8cf2900e8f7d97646c7f350248b4e804f
2015-06-26 20:08:25 +00:00
mistachkin
ed52f9ff48 Initial changes to get FTS5 working with MSVC.
FossilOrigin-Name: ef2052f81e33ca98e85a60f8a78cdd19a7c1c35c
2015-06-26 04:34:36 +00:00
dan
51ef0f57c7 Improve test coverage of fts5.
FossilOrigin-Name: df5ccea80e8f0da83af5e595b539687006085120
2015-06-23 18:47:55 +00:00
dan
bcc2f04c68 Add the "columnsize=" option to fts5, similar to fts4's "matchinfo=fts3".
FossilOrigin-Name: aa12f9d9b79c2f523fd6b00e47bcb66dba09ce0c
2015-06-09 20:58:39 +00:00
dan
27aac274b9 Improve test coverage of fts5_config.c.
FossilOrigin-Name: 47dbfadb994814c9349d4c9c113b862c2e97c01a
2015-05-18 17:50:17 +00:00
dan
76724372ae Improve the error message returned by FTS5 if it encounters an unknown file format.
FossilOrigin-Name: f369caec145f311bb136cf7af144e2695badcb9b
2015-05-08 09:21:05 +00:00
dan
4591334dd4 Change to storing all keys in a single merge-tree structure instead of one main structure and a separate one for each prefix index. This is a file-format change. Also introduce a mechanism for managing file-format changes.
FossilOrigin-Name: a684b5e2d9d52cf4700e7e5f9dd547a2ba54e8e9
2015-05-07 19:29:46 +00:00
dan
7c479d51e5 Reorganize some of the fts5 expression parsing code. Improve test coverage of the same.
FossilOrigin-Name: c4456dc5f5f8f45f04e3bbae53b6bcc209fc27d5
2015-05-02 20:35:24 +00:00
dan
7b2ec1ae41 Improve fts5 tests.
FossilOrigin-Name: c1f07a3aa98eac87e2747527d15e5e5562221ceb
2015-04-29 20:54:08 +00:00
dan
a3bdec7ee4 Change the fts5 content= option so that it matches fts5 columns with the underlying table columns by name, not by their position within the CREATE TABLE statement.
FossilOrigin-Name: e38e2bb637844dae8ae5d5f3e23d8369e1b91e45
2015-04-27 16:21:49 +00:00
dan
ef8b74324d Merge latest trunk changes with this branch.
FossilOrigin-Name: 1c78d8920fb59da3cb97dd2eb09b3e08dfd14259
2015-04-24 20:18:21 +00:00
dan
df5bd1fed2 Add the "unindexed" column option to fts5.
FossilOrigin-Name: 86309961344f4076ddcf55d730d3600ec3b6e45c
2015-04-24 19:41:43 +00:00
dan
47c467c80e Fix a couple of build problems.
FossilOrigin-Name: a5d5468c0509d129e198bf9432190ee07cedb7af
2015-03-04 08:29:24 +00:00
dan
626d9e3062 Remove some redundant code from fts5.
FossilOrigin-Name: 939b7a5de25e064bdf08e03864c35ab718da6f6f
2015-01-23 06:50:33 +00:00
dan
aacf3d1a3b Remove the iPos parameter from the tokenizer callback. Fix the "tokenchars" and "separators" options on the simple tokenizer.
FossilOrigin-Name: 65f0262fb82dbfd9f80233ac7c3108e2f2716c0a
2015-01-06 19:08:26 +00:00
dan
2a28e507f7 Further fixes and test cases related to external content tables.
FossilOrigin-Name: ce6a899baff7265a60c880098a9a57ea352b5415
2015-01-06 14:38:34 +00:00
dan
ded4f41d1a Tests and fixes for fts5 external content tables.
FossilOrigin-Name: 047aaf830d1e72f0fdad3832a0b617e769d66468
2015-01-05 20:41:39 +00:00
dan
0fbc269fef Add support for external content tables to fts5.
FossilOrigin-Name: 17ef5b59f789e9fa35c4f053246d819987fd06f8
2015-01-03 20:44:58 +00:00
dan
ade921c3ad Allow the rank column to be remapped on a per-query basis by including a term similar to "rank match 'bm25(10,2)'" in a where clause.
FossilOrigin-Name: 1cd15a1759004d5d321056905dbb6acff20dc7d9
2015-01-02 14:55:22 +00:00
dan
5fa3acabf4 Fixes to built-in tokenizers.
FossilOrigin-Name: b33fe0dd89f3180c209fa1f9e75d0a7acab12b8e
2014-12-29 11:24:46 +00:00
dan
e4bec37900 Fix various problems in fts5 revealed by fault-injection tests.
FossilOrigin-Name: e358c3de5c916f2c851ab9324ceaae4e4e7a0fbd
2014-12-18 18:25:48 +00:00