Commit Graph

35 Commits

Author SHA1 Message Date
drh
d36f588f31 Fix harmless compiler warnings about unused function parameters.
FossilOrigin-Name: 25d067c270966d9506db8bedf280883e32b69050b14bdbbeda4bb2d9a362619c
2020-11-25 16:28:04 +00:00
dan
95dca8d0cf FTS5 does not handle tokens that contain embedded nul characters. Prevent the trigram tokenizer from returning such tokens. Fix for [2ba5930b2].
FossilOrigin-Name: b1d048748c054575425a4bebf0c5d09962f9329d5ce6a978cf54e508b238584c
2020-10-03 14:36:06 +00:00
dan
ccf578d435 Add tests for the trigram tokenizer. Fix minor issues.
FossilOrigin-Name: 897ced99b44085012aa44d3264940dcbd4c77b295a894a1b58fb2c03a0f7fee8
2020-10-01 16:10:22 +00:00
dan
33a99fad08 Add experimental unicode-aware trigram tokenizer to fts5. And support for LIKE and GLOB optimizations for fts5 tables that use said tokenizer.
FossilOrigin-Name: 0d7810c1aea93c0a3da1ccc4911dbce8a1b6e1dbfe1ab7e800289a0c783b5985
2020-09-30 20:35:37 +00:00
drh
3b574e4ea9 Use the 64-bit memory allocator interfaces in extensions, whenever possible.
FossilOrigin-Name: 07ee06fd390bfebebc014b47583d489747b0423bb96c810bed5c605ce0e3be71
2019-04-13 04:38:32 +00:00
drh
2d77d80a65 Use 64-bit math to compute the sizes of memory allocations in extensions.
FossilOrigin-Name: ca67f2ec0e294384c397db438605df1b47aae5f348a8de94f97286997625d169
2019-01-08 20:02:48 +00:00
drh
f9231c34eb Fix harmless compiler warnings.
FossilOrigin-Name: b57c545a384ab5d62becf3164945b32b1e108b2fb4c8dbd939a1706c2079e18b
2018-12-31 21:43:55 +00:00
dan
eefc72d12f Avoid an undefined left-shift operation in fts5 caused by malformed utf-8
text.

FossilOrigin-Name: c3a3a11194586bef80a9d7ca54caae8af30d4e7b464b8bb3d257ba2d2ec4791f
2018-12-28 14:33:55 +00:00
dan
b163b57212 Fix problems in fts5 found by ASAN.
FossilOrigin-Name: c564bf870106faef297594a51995619c80311d06bd5f8a0c7644f666f22ba576
2018-12-28 07:37:22 +00:00
dan
e89feee5c3 Add the "remove_diacritics=2" option to the unicode61 tokenizer in both FTS5
and FTS3/4.

FossilOrigin-Name: 06177f3f114b5d804b84c27ac843740282e2176fdf0f7a999feda0e1b624adec
2018-12-03 16:14:49 +00:00
dan
b80bb6ce88 Add the "categories" option to the unicode61 tokenizer in fts5.
FossilOrigin-Name: 80d2b9e635e3100f90cffdcffa5b5038da6fbbfccc9f5777c59a4ae760d4cb62
2018-07-13 19:52:43 +00:00
dan
22e8356368 Handle parser stack overflow when parsing fts5 query expressions. Fix some compiler warnings in fts5 code.
FossilOrigin-Name: bc3f7900d5a06829d123814a5ac7b951bcfc1560
2016-02-11 17:01:32 +00:00
dan
e9eb1593f5 Fix an fts5 problem with using both xPhraseFirst() and xPhraseFirstColumn() within a single statement in detail=col mode.
FossilOrigin-Name: 72d53699bf0dcdb9d2a22e229989d7435f061399
2016-01-23 18:51:59 +00:00
dan
3e6a141130 Fix some harmless gcc compiler warnings. Mostly in fts5, but also two in the core code.
FossilOrigin-Name: 5d44d4a6cf5c6b983cbd846d9bc34251df8f4bc5
2015-12-23 16:42:27 +00:00
mistachkin
b9becaa268 Fix even more harmless compiler warnings.
FossilOrigin-Name: 1d0e6aa119da8e15d35508f5d75ffc729979da92
2015-12-16 23:30:30 +00:00
mistachkin
cdabd7bd50 Fix harmless compiler warnings.
FossilOrigin-Name: 1c46c194a2da24fe613d77b5a8d727cc2fc9faa4
2015-10-14 20:34:57 +00:00
dan
9c671b741c Further tests to raise coverage of fts5 synonym code to 100%. Fix a dropped error code in the same.
FossilOrigin-Name: bdedd838bb3028c586bcc9f643852ce1364adb49
2015-09-02 19:48:55 +00:00
dan
ee0c0a8de3 Another change to the fts5 tokenizer API.
FossilOrigin-Name: fc71868496f45f9c7a79ed2bf2d164a7c4718ce1
2015-08-29 15:44:27 +00:00
dan
57e0add3f9 Change the fts5 tokenizer API to allow more than one token to occupy a single position within a document.
FossilOrigin-Name: 90b85b42f2b2dd3e939b129b7df2b822a05e243d
2015-08-28 19:56:47 +00:00
dan
79e2347fdf Fix a bug in the fts5 porter tokenizer preventing it from passing xCreate() arguments through to its parent tokenizer.
FossilOrigin-Name: c3c672af97edf2ae5d793f6fa47364370aa4f4ec
2015-07-31 14:43:02 +00:00
dan
3f09beda45 Remove "#ifdef SQLITE_ENABLE_FTS5" from individual fts5 source files. Add a single "#if !defined(SQLITE_CORE) || defined(SQLITE_ENABLE_FTS5)" to fts5.c.
FossilOrigin-Name: 7819002ed85497bbd0f9cf4d39df641573324436
2015-07-02 15:52:21 +00:00
dan
3f3074e0c1 Remove the "#include sqlite3Int.h" from fts5Int.h.
FossilOrigin-Name: e008c3c8e29c843ec945ddad54b9688bbf2bdb44
2015-05-30 11:49:58 +00:00
dan
21b7d2a9b8 Improve test coverage of fts5_unicode2.c.
FossilOrigin-Name: fea8a4db9d8c7b9a946017a0dc984cbca6ce240e
2015-05-22 06:08:25 +00:00
dan
8c1f46de50 Improve test coverage of fts5_tokenize.c.
FossilOrigin-Name: 0e91a6a520f040b8902da6a1a4d9107dc66c0ea3
2015-05-20 09:27:51 +00:00
dan
b10210ea1b Fix a memory leak that could follow an OOM condition in fts5.
FossilOrigin-Name: de9f8ef6ebf036df5a558cd78fb4927da2d83ce8
2015-05-19 11:32:01 +00:00
dan
7b2ec1ae41 Improve fts5 tests.
FossilOrigin-Name: c1f07a3aa98eac87e2747527d15e5e5562221ceb
2015-04-29 20:54:08 +00:00
dan
f5fab92d82 Add an optimization to the fts5 unicode tokenizer code.
FossilOrigin-Name: f5db489250029678fce845dfb2b1109fde46bea5
2015-03-11 14:51:39 +00:00
dan
47c467c80e Fix a couple of build problems.
FossilOrigin-Name: a5d5468c0509d129e198bf9432190ee07cedb7af
2015-03-04 08:29:24 +00:00
dan
57fec54b53 Fix some problems with building fts5 and fts3 together using the amalgamation.
FossilOrigin-Name: fb10bbb9f9c4481e6043d323a3018a4ec68eb0ff
2015-02-02 11:32:20 +00:00
dan
2656167f6e Improve the performance of the fts5 porter tokenizer implementation.
FossilOrigin-Name: 96ea600440de05ee663e71c3f0d0de2c64108bf9
2015-01-17 17:48:10 +00:00
dan
73f7d6ed75 Optimize the unicode61 tokenizer so that it handles ascii text faster. Make it the default tokenizer. Change the name of the simple tokenizer to "ascii".
FossilOrigin-Name: f22dbccad9499624880ddd48df1b07fb42b1ad66
2015-01-12 17:58:04 +00:00
dan
aacf3d1a3b Remove the iPos parameter from the tokenizer callback. Fix the "tokenchars" and "separators" options on the simple tokenizer.
FossilOrigin-Name: 65f0262fb82dbfd9f80233ac7c3108e2f2716c0a
2015-01-06 19:08:26 +00:00
dan
6024772ba2 Add a version of the unicode61 tokenizer to fts5.
FossilOrigin-Name: d09f7800cf14f73ea86d037107ef80295b2c173a
2015-01-01 16:46:10 +00:00
dan
5fa3acabf4 Fixes to built-in tokenizers.
FossilOrigin-Name: b33fe0dd89f3180c209fa1f9e75d0a7acab12b8e
2014-12-29 11:24:46 +00:00
dan
48d7014067 Fix the customization interfaces so that they match the documentation.
FossilOrigin-Name: fba0b5fc7eead07a4853e78e02d788e7c714f6cd
2014-11-15 20:07:31 +00:00