Commit Graph

162 Commits

Author SHA1 Message Date
dan
c44041e03b Ensure that tokendata=1 queries avoid loading large doclists for queries like "common AND uncommon", just as tokendata=0 queries do.
FossilOrigin-Name: 7bda09ab404a110d57449e149a3281fca8dc4cacf7bd9832ea2a1356ad20fe8e
2023-12-02 17:32:16 +00:00
dan
b5effc0605 Different approach to querying a tokendata=1 table. Saves cpu and memory.
FossilOrigin-Name: c523f40895866e6fc979a26483dbea8206126b4bbdf4b73b77263c09e13c855e
2023-12-01 20:09:59 +00:00
dan
af54826e4a Defer building xInstToken() hash-table until it is to be used.
FossilOrigin-Name: 9b005085ff4a53cda0a1dff0c836630d6d3b95b9c40658ffd2a886f3e1b37faa
2023-11-22 20:02:55 +00:00
dan
5c268bbf67 Fix tokendata=1 and xInstToken() APIs for detail=none and detail=column tables.
FossilOrigin-Name: 37b271c19d772bd06524db816ded03377b426efed7a7783c8a96f6fb156ecd86
2023-11-22 19:02:54 +00:00
dan
50b0e25a55 Add implementation of xInstToken() API.
FossilOrigin-Name: a34b26fe7f60b74e7ae5cf64900920a3d352a20da2496401bcbc27041689cd07
2023-11-15 11:45:19 +00:00
dan
e108029332 Add new fts5 API xQueryToken().
FossilOrigin-Name: 828566392b3ea8db603cb1ae5eccbc8ac035efaa284bc7c15ba89874f634aec9
2023-11-13 14:29:12 +00:00
dan
03204e9106 Add the tokendata=1 option to ignore trailing token-data when querying an fts5 table.
FossilOrigin-Name: 122935182ad5869ce3a4c6d796c38a0509f6f3384dd1b3e60a3f2f0f366cc5f5
2023-10-11 21:08:12 +00:00
dan
a35ae44150 Changes so that fts5 can handle tokens with embedded '\0' bytes.
FossilOrigin-Name: c027c092c4af53bd6ae3cc6e2b4439167d9eeb0f9de549b6a2c2a72a67ee886c
2023-09-30 18:13:35 +00:00
dan
3f874b58fb Change the name of the fts5 'delete-automerge' option to 'deletemerge'. And add tests for it.
FossilOrigin-Name: 1079300db2a7d1fbc86a01c215c234a3af64889c5396e6da63ff4f3c7efae4c5
2023-07-25 15:48:58 +00:00
dan
24730de8d1 Add the fts5 'delete-automerge' integer option. A level is eligible for auto-merging if it has a greater than or equal percentage of its entries deleted by tombstones than the 'delete-automerge' option. Default value is 10.
FossilOrigin-Name: b314be66b9ac0190b5373b3b6baec012382bc588c2d86c2edab796669a4303c3
2023-07-24 19:13:06 +00:00
dan
2159292ce0 Integrate contentless delete with auto-merge.
FossilOrigin-Name: 85c1589ab1fc69d1eef4bbc1bdefa2b10af5f6b9c08e813130b93829b592f416
2023-07-22 19:47:46 +00:00
dan
55e0fd4a9d Do not allow the 'delete' command to be used on contentless_delete=1 fts5 tables.
FossilOrigin-Name: cc694b83408ccb5d42204cb624145c76e95329cbe1d1fe8815c70a7a00af231a
2023-07-17 17:59:58 +00:00
dan
6788c7b7c0 Begin adding support for deleting rows from contentless fts5 tables.
FossilOrigin-Name: e513bea84dfaf2280f7429c9a528b3a1354a46c36e58ab178ca45478975634e0
2023-07-10 20:44:09 +00:00
drh
3442306989 Protect a macro argument with parentheses in FTS5.
FossilOrigin-Name: bc07fe51fe0c6bb50ca8ae1baefcc35c8f5395b2d0de641bf0b0cedc92d754d4
2023-05-03 13:48:33 +00:00
dan
015020cd1a Add the 'secure-delete' option to fts5. Used to configure fts5 to aggressively remove old full-text-index entries belonging to deleted or updated rows.
FossilOrigin-Name: 4240fd09b717dbc69dffe3b88ec9149777ca4c3efa12f282af65be3af6fa5bb0
2023-04-12 17:40:44 +00:00
dan
6aafd74853 Avoid some cases of signed integer overflow in fts5 by casting to unsigned values.
FossilOrigin-Name: 46a78c8c0ed518c4521e6e0bdebeb065bab07076abc444775002e7f4361d2242
2022-08-08 19:29:53 +00:00
drh
134f544ab2 Fix a harmless scan-build warning in FTS5.
FossilOrigin-Name: 0bf42bb5611dc3f672cb898b8be245fd25f7a3862c1e0734effd18d75e812f22
2021-10-16 19:50:03 +00:00
drh
922c54206f Some #defines somehow failed to get set correctly in the previous check-in.
Fixed here.

FossilOrigin-Name: 15bbdf9ac840a220f384411d3025ef22f949d310194b60bca8e6d6a759e6042e
2021-10-04 18:57:42 +00:00
drh
11a9ad5669 Fix harmless static analyzer warnings in sessions, rtree, fts3 and fts5.
Add the -DSQLITE_OMIT_AUXILIARY_SAFETY_CHECKS compile-time option to cause
ALWAYS() and NEVER() macros to be omitted from the build.

FossilOrigin-Name: 1c67f957fc77e37ce8f0d447c41ca975e8e79a35d332739c24a633649b5b0387
2021-10-04 18:21:14 +00:00
dan
cc516af4cc Instead of disallowing writes to fts5 tables if there are fts5vocab cursors open on them (commit [c49a6ed7]), abort any fts5vocab queries if the on-disk structure of the fts5 table changes.
FossilOrigin-Name: 9dbdc9001e3258e71ca995fbcdebf66ab95890ded87fa7125c6cb4bd43010aaf
2021-07-07 11:51:03 +00:00
dan
b9324fea07 Do not allow writes to an fts5 table if there are any open fts5vocab cursors.
FossilOrigin-Name: c49a6ed78a917d4972e048e2a9bbe4d400691f97ce7e022f0e4436ceaed7fb73
2021-07-05 19:01:09 +00:00
dan
f46be6a1b9 Allow fts5 trigram tables created with detail=column or detail=none to optimize LIKE and GLOB queries. Allow case-insensitive tables to optimize GLOB as well as LIKE.
FossilOrigin-Name: 64782463be62b72b5cd0bfaa7c9b69aa487d807c5fe0e65a272080b7739fd21b
2020-10-05 16:41:56 +00:00
dan
33a99fad08 Add experimental unicode-aware trigram tokenizer to fts5. And support for LIKE and GLOB optimizations for fts5 tables that use said tokenizer.
FossilOrigin-Name: 0d7810c1aea93c0a3da1ccc4911dbce8a1b6e1dbfe1ab7e800289a0c783b5985
2020-09-30 20:35:37 +00:00
dan
7548ab20e6 In fts5 integrity checks, do not compare the contents of the index against an external content table unless specifically requested.
FossilOrigin-Name: 782163693f37aeb65209bebbaeb6659a36881b8c4b4bec778b366658488bf966
2020-09-21 14:53:21 +00:00
dan
52612bec3c Fix a resource leak in fts5 that could occur if an auxiliary function is called from within a query that does not use the full-text index.
FossilOrigin-Name: b528bdcd45db1b783ecd9739c3d3c890f04de7003f079668970eafaf8e23b2f3
2019-10-20 08:26:08 +00:00
dan
ae55737fbf Do not allow users to effectively disable fts5 crisismerge operations by setting the crisismerge threshold to higher than the maximum allowable segment b-trees on a single level. Fix for [d392017c].
FossilOrigin-Name: 86e497209217abb7bcb491a023cd353f3c7c9c103ebd9f58dd8661b12cf3694c
2019-10-09 18:36:32 +00:00
dan
685b2ee0c3 Allow fts5 to filter on multiple MATCH clauses in a single scan.
FossilOrigin-Name: 9d418a7a491761eeb38a70898677a493e2631e5d62e75ee88431f52d3dfd2344
2019-09-12 19:38:40 +00:00
dan
3cbbd195ca Prevent an fts5 table from being its own content table, or part of a view that is the content table.
FossilOrigin-Name: b6d52c9364767ff4ab7279ae981afb97799299dcfaf38a0110c40ca82c72a825
2019-08-05 12:55:56 +00:00
dan
b15f19c75e Fix an fts5 problem with interleaving reads and writes in a single transaction.
FossilOrigin-Name: 45c73deb440496e848cb24d4c1326d4105dacfee8bbafb115e567051855e6518
2019-03-18 15:23:20 +00:00
dan
e88609f23e Fix asan warnings in fts5 triggered by corrupt databases - passing NULL to memcmp, out-of-range left-shift values and signed integer overflow.
FossilOrigin-Name: 93f8ec146d63af13f04e337ada4fa75e9254f72b1394df09701ae12e185f27e2
2019-01-25 16:54:06 +00:00
dan
ccfa550922 Fix a buffer overrun that could occur in fts5 if a prefix query is made on a corrupt database.
FossilOrigin-Name: 1abc4415648e69362061e9f9a4f2c1d419ba33801999b377650d8b9a4d2d3a7c
2019-01-22 21:17:40 +00:00
dan
25fb50674f Fix problems with joining two or more fts5_vocab tables that access the same
underlying fts5 table.

FossilOrigin-Name: 49956395e14b61f6bf839e59ae7dd95eb32ebf32f3d16388844de6621b9c2d98
2019-01-17 17:39:15 +00:00
drh
2d77d80a65 Use 64-bit math to compute the sizes of memory allocations in extensions.
FossilOrigin-Name: ca67f2ec0e294384c397db438605df1b47aae5f348a8de94f97286997625d169
2019-01-08 20:02:48 +00:00
dan
b163b57212 Fix problems in fts5 found by ASAN.
FossilOrigin-Name: c564bf870106faef297594a51995619c80311d06bd5f8a0c7644f666f22ba576
2018-12-28 07:37:22 +00:00
dan
2f69b5c5d0 Remove an unused function declaration from fts5.
FossilOrigin-Name: 148d9b61471a874a16a9ec9c9603da03cadb3a40662fb550af51cb36212426b1
2018-07-13 20:28:54 +00:00
dan
b80bb6ce88 Add the "categories" option to the unicode61 tokenizer in fts5.
FossilOrigin-Name: 80d2b9e635e3100f90cffdcffa5b5038da6fbbfccc9f5777c59a4ae760d4cb62
2018-07-13 19:52:43 +00:00
dan
d37ce8396a Add the "^" syntax from fts3/4 to fts5.
FossilOrigin-Name: 24d7058e2799133dd681d2fef341025ca50554861bb4cd39e93ee87ae1d8a605
2017-11-24 19:24:44 +00:00
dan
929b695111 Allow a user column name to be used on the LHS of a MATCH operator in FTS5.
FossilOrigin-Name: 6f54ffd151b0eca6f9ef57ac54802584a839cfc7373f10c100fc18c855edcc0a
2017-04-13 09:45:21 +00:00
dan
03155f659f Update fts5 to support "<colset> : ( <expr> )" for column filtering, as well
as "<colset> : NEAR(...)" and "<colset> : <phrase>".

FossilOrigin-Name: c847543f8bb1376fef52bca72b4191162a32eb7e6c5f0cd1aa0ab116b3183396
2017-04-12 17:50:12 +00:00
dan
be0bc8bb4d Have fts5 close any open blob-handle when a new savepoint is opened. This
ensures that fts5 does not prevent DROP TABLE statements (which always open a
savepoint) from succeeding.

FossilOrigin-Name: a921ada89050ce1d162fd1b0056939573635e2cec7ac0c2a99ae924b3ae593f7
2017-04-08 09:12:20 +00:00
drh
ac279be98e Avoid a duplication #define in FTS5
FossilOrigin-Name: c447441cff1884d6fe5f0a76d64b3e7d908584a1
2017-02-13 13:20:02 +00:00
drh
8d57d7af23 Fix harmless compiler warning.
FossilOrigin-Name: 9a5a4f6e3bc265fecf79a7f63d14abbf239da636
2016-08-09 21:01:52 +00:00
dan
ccf03677a3 Minor update to the way fts5 column filters are parsed.
FossilOrigin-Name: 14864f2b8470fe98dbd17f59963bf1be8d4962f9
2016-08-09 19:48:37 +00:00
dan
882ef0b8c0 Have fts5 interpret column lists that begin with a "-" character as "match any column except" lists.
FossilOrigin-Name: e517545650631d1e8a7ee63c6646a8b183a0a894
2016-08-09 19:26:57 +00:00
dan
0df5024e83 Fix an FTS5 problem (segfault or incorrect query results) with "... MATCH 'x OR y' ORDER BY rank" queries when either token 'x' or 'y' is completely absent from the dataset.
FossilOrigin-Name: 64ca1a835a89fd211078d2cd8f9b649e89be528d
2016-05-30 08:28:21 +00:00
dan
848b190e40 Explicitly limit the size of fts5 tokens to 32768 bytes.
FossilOrigin-Name: 70fc69eed9b09159899d7cbd1416a59d04210a63
2016-03-23 15:04:00 +00:00
dan
f55fb6615b Have fts5 cache the decoded structure of fts5 indexes in memory. Use "PRAGMA data_version" to detect stale caches.
FossilOrigin-Name: 33ef2210ef19e55c8d460bfe9d3dc146034c8acc
2016-03-16 19:48:10 +00:00
dan
e8c20120ce Fix handling of strings that contain zero tokens in fts5. And other problems found by fuzzing.
FossilOrigin-Name: 72b3ff0f0df83e62adda6584b4281cf086d45e45
2016-03-12 16:32:16 +00:00
dan
4dbc65b29a Add an incremental optimize capability to fts5. Make the 'merge' command independent of the 'automerge' settings.
FossilOrigin-Name: 556671444c03e3afca072d0f5e9bea2657de6fd3
2016-03-09 20:54:14 +00:00
dan
f2d328fa25 Fix a fairly obscure buffer overread in fts5.
FossilOrigin-Name: 130580207ab5cee762b2893808acef7c8afad027
2016-02-12 17:56:27 +00:00