postgres

Commit Graph

Author	SHA1	Message	Date
Andrew Dunstan	d6eaeb335b	Adjust contrib/tsearch2 regression results to use XML tag and XML entity descriptions, as now used by core text search default parser.	2007-11-20 04:23:10 +00:00
Tom Lane	f00d75b8d7	Add snb_ru_init(internal) to list of stub functions in tsearch2 compatibility module. Needed to support loading of 8.1-era tsearch2 configuration data.	2007-11-16 00:34:54 +00:00
Bruce Momjian	fdf5a5efb7	pgindent run for 8.3.	2007-11-15 21:14:46 +00:00
Tom Lane	4394c1b09c	Resurrect the code for the rewrite(ARRAY[...]) aggregate function, and put it into contrib/tsearch2 compatibility module.	2007-11-13 22:14:50 +00:00
Tom Lane	abd183e4e7	Ooops, missed one file to remove.	2007-11-13 21:25:25 +00:00
Tom Lane	90e3f2aca7	Replace the now-incompatible-with-core contrib/tsearch2 module with a compatibility package. This supports importing dumps from past versions using tsearch2, and provides the old names and API for most functions that were changed. (rewrite(ARRAY[...]) is a glaring omission, though.) Pavel Stehule and Tom Lane	2007-11-13 21:02:29 +00:00
Bruce Momjian	33e2e02493	Add CVS version labels to all install/uninstall scripts.	2007-11-13 04:24:29 +00:00
Bruce Momjian	926bbab448	Make /contrib install/uninstall script consistent: remove transactions use create or replace function make formatting consistent set search patch on first line Add documentation on modifying *.sql to set the search patch, and mention that major upgrades should still run the installation scripts. Some of these issues were spotted by Tom today.	2007-11-11 03:25:35 +00:00
Peter Eisentraut	5f9869d0ee	Use "alternative" instead of "alternate" where it is clearer.	2007-11-07 12:24:24 +00:00
Tom Lane	a190eb3d7d	Avoid possibly-unportable initializer, per buildfarm warning.	2007-07-15 22:57:48 +00:00
Tom Lane	7176e60bc8	Silence Solaris compiler warnings, per buildfarm.	2007-07-15 22:49:36 +00:00
Tom Lane	af18d3d05c	Fix compile warning on Solaris, per buildfarm. (Why have we got three slightly different copies of this file?)	2007-07-15 22:40:28 +00:00
Tom Lane	c11b8dcdbb	Fix unportable use of isspace(), per buildfarm results.	2007-07-15 22:32:53 +00:00
Tom Lane	b09c248bdd	Fix PGXS conventions so that extensions can be built against Postgres installations whose pg_config program does not appear first in the PATH. Per gripe from Eddie Stanley and subsequent discussions with Fabien Coelho and others.	2007-06-26 22:05:04 +00:00
Tom Lane	3e23b68dac	Support varlena fields with single-byte headers and unaligned storage. This commit breaks any code that assumes that the mere act of forming a tuple (without writing it to disk) does not "toast" any fields. While all available regression tests pass, I'm not totally sure that we've fixed every nook and cranny, especially in contrib. Greg Stark with some help from Tom Lane	2007-04-06 04:21:44 +00:00
Teodor Sigaev	9477f12ea8	Fix caching of unsuccessful initialization of parser or configuration. Per report from Listmail <lists@peufeu.com>	2007-04-02 11:42:04 +00:00
Tom Lane	0565579c5b	Fix uninitialized-variable bug.	2007-03-28 01:28:34 +00:00
Teodor Sigaev	66daeb074b	Add checking of end of line in parsing stopword list. Thanks to sharp eyes of Tom lane	2007-03-26 13:57:07 +00:00
Teodor Sigaev	debb3aa8e9	Fix stopword and synonym files parsing bug in MSVC build, per report from Magnus Hagander. Also, now it ignores space symbol after stopwords.	2007-03-26 12:25:35 +00:00
Teodor Sigaev	bb8998a475	Fix parser bug on Windows with UTF8 encoding and C locale, the reason was sizeof(wchar_t) = 2 instead of 4.	2007-03-22 15:58:24 +00:00
Tom Lane	9f652d430f	Fix up several contrib modules that were using varlena datatypes in not-so-obvious ways. I'm not totally sure that I caught everything, but at least now they pass their regression tests with VARSIZE/SET_VARSIZE defined to reverse byte order.	2007-02-28 22:44:38 +00:00
Tom Lane	234a02b2a8	Replace direct assignments to VARATT_SIZEP(x) with SET_VARSIZE(x, len). Get rid of VARATT_SIZE and VARATT_DATA, which were simply redundant with VARSIZE and VARDATA, and as a consequence almost no code was using the longer names. Rename the length fields of struct varlena and various derived structures to catch anyplace that was accessing them directly; and clean up various places so caught. In itself this patch doesn't change any behavior at all, but it is necessary infrastructure if we hope to play any games with the representation of varlena headers. Greg Stark and Tom Lane	2007-02-27 23:48:10 +00:00
Teodor Sigaev	44655290cc	Fix backend crash in parsing incorrect tsquery. Per report from Jon Rosebaugh <jon@inklesspen.com>	2007-02-12 14:14:33 +00:00
Peter Eisentraut	c138b966d4	Replace useless uses of := by = in makefiles.	2007-02-09 15:56:00 +00:00
Peter Eisentraut	086c189456	Normalize fgets() calls to use sizeof() for calculating the buffer size where possible, and fix some sites that apparently thought that fgets() will overwrite the buffer by one byte. Also add some strlcpy() to eliminate some weird memory handling.	2007-02-08 11:10:27 +00:00
Peter Eisentraut	f11aa82d03	Use memcpy() instead of strncpy() for copying into varlena structures.	2007-02-07 00:32:15 +00:00
Bruce Momjian	8b4ff8b6a1	Wording cleanup for error messages. Also change can't -> cannot. Standard English uses "may", "can", and "might" in different ways: may - permission, "You may borrow my rake." can - ability, "I can lift that log." might - possibility, "It might rain today." Unfortunately, in conversational English, their use is often mixed, as in, "You may use this variable to do X", when in fact, "can" is a better choice. Similarly, "It may crash" is better stated, "It might crash".	2007-02-01 19:10:30 +00:00
Teodor Sigaev	d4c6da1527	Allow GIN's extractQuery method to signal that nothing can satisfy the query. In this case extractQuery should returns -1 as nentries. This changes prototype of extractQuery method to use int32* instead of uint32* for nentries argument. Based on that gincostestimate may see two corner cases: nothing will be found or seqscan should be used. Per proposal at http://archives.postgresql.org/pgsql-hackers/2007-01/msg01581.php PS tsearch_core patch should be sightly modified to support changes, but I'm waiting a verdict about reviewing of tsearch_core patch.	2007-01-31 15:09:45 +00:00
Neil Conway	8ff2bccee3	Squelch some VC++ compiler warnings. Mark float literals with the "f" suffix, to distinguish them from doubles. Make some function declarations and definitions use the "const" qualifier for arguments consistently. Ignore warning 4102 ("unreferenced label"), because such warnings are always emitted by bison-generated code. Patch from Magnus Hagander.	2007-01-26 17:45:42 +00:00
Teodor Sigaev	f2a01b0d5a	Fix localization support for multibyte encoding and C locale. Slightly reworked patch from Tatsuo Ishii	2007-01-15 15:16:28 +00:00
Tom Lane	859b8dd51a	Add a defense to prevent core dumps if 8.2 version of rank_cd() is used with the 8.1 SQL function definition for it. Per report from Rajesh Kumar Mallah, such a DBA error doesn't seem at all improbable, and the cost of checking for it is not very high compared to the cost of running this function. (It would have been better to change the C name of the function so it wouldn't be called by the old SQL definition, but it's too late for that now in the 8.2 branch.)	2006-12-28 01:09:01 +00:00
Teodor Sigaev	49b64d346f	Fix memory reallocation condition	2006-12-26 14:54:24 +00:00
Teodor Sigaev	ca5bc1ae51	Fix convertion for 'PFX flag N num'	2006-12-21 17:35:28 +00:00
Teodor Sigaev	6cd9a58480	Fix core dump of ispell for case of non-successfull initialization. Previous versions aren't affected. Fix synonym dictionary init: string should be malloc'ed, not palloc'ed. Bug introduced recently while fixing lowerstr().	2006-12-04 09:26:57 +00:00
Teodor Sigaev	3de2682a1e	Fix lowercasing while parse OO dictionary	2006-11-23 17:35:14 +00:00
Teodor Sigaev	84151d0644	Avoid infinity calculations in rank_cd	2006-11-22 15:55:05 +00:00
Teodor Sigaev	dd92a8c33f	Fix type in return value	2006-11-21 18:31:28 +00:00
Teodor Sigaev	419fe7cd1b	Fix bug http://archives.postgresql.org/pgsql-bugs/2006-10/msg00258.php . Fix string's length calculation for recoding, fix strlower() to avoid wrong assumption about length of recoded string (was: recoded string is no greater that source, it may not true for multibyte encodings) Thanks to Thomas H. <me@alternize.com> and Magnus Hagander <mha@sollentuna.net>	2006-11-20 14:03:30 +00:00
Neil Conway	9d6f26325f	Fix two typos.	2006-11-08 19:06:15 +00:00
Teodor Sigaev	092ed294fc	New README, forgotten when docs was updated	2006-11-08 16:00:29 +00:00
Teodor Sigaev	bf028fa8a6	Add description of new features	2006-10-31 16:23:05 +00:00
Tom Lane	e378f82e00	Make use of qsort_arg in several places that were formerly using klugy static variables. This avoids any risk of potential non-reentrancy, and in particular offers a much cleaner workaround for the Intel compiler bug that was affecting ginutil.c.	2006-10-05 17:57:40 +00:00
Tom Lane	c48f2e3124	Improve error messages from to_tsquery per yesterday's discussion: provide the bad input, and be sure to mention that we are talking about a tsearch query.	2006-10-04 17:52:52 +00:00
Bruce Momjian	f99a569a2e	pgindent run for 8.2.	2006-10-04 00:30:14 +00:00
Bruce Momjian	26ffa627ac	Update tsearch2 README. Robert Treat	2006-10-02 22:32:10 +00:00
Tom Lane	41dcc65c0e	Rename the uninstall scripts for contrib/lo and contrib/tsearch2 to match the convention that foo's uninstall script is uninstall_foo.sql. Also, stop installing lo_test.sql, which really ought to be made into a regression test anyway (though it's unclear how to avoid a dependency on the current OID counter...)	2006-09-11 15:14:46 +00:00
Tom Lane	684ad6a92f	Rename contrib contains/contained-by operators to @> and <@, per discussion.	2006-09-10 17:36:52 +00:00
Bruce Momjian	0c4f2894f9	Use '' rather than \' for literal single quotes in strings in /contrib/tsearch2. Teodor Sigaev	2006-09-02 22:03:30 +00:00
Teodor Sigaev	72a3582522	Add description of tsvector type layout	2006-08-29 13:57:34 +00:00
Teodor Sigaev	3a214ab0f1	Remove pos comparison in silly_cmp_tsvector(): it is not a semantically significant	2006-08-29 13:39:20 +00:00
Teodor Sigaev	9711782628	Fix incorrect length of lexemes in silly_cmp_tsvector()	2006-08-29 13:31:54 +00:00
Teodor Sigaev	74dbba701f	Fix regression tests: after changing comparing function order is changed.	2006-08-25 07:39:08 +00:00
Teodor Sigaev	8f91e2b607	Fix compare bug for tsvector: problem was in aligment. Per Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> and Phil Frost <indigo@bitglue.com>	2006-08-24 17:37:34 +00:00
Tom Lane	a7143b3088	Fix some makefiles that fail to yield good results from 'make -qp'. This doesn't really matter for ordinary building of Postgres, but it's useful for automated checks, such as my just-committed pgcheckdefines.	2006-07-15 03:33:14 +00:00
Tom Lane	ae643747b1	Fix a passel of recently-committed violations of the rule 'thou shalt have no other gods before c.h'. Also remove some demonstrably redundant #include lines, mostly of <errno.h> which was added to c.h years ago.	2006-07-14 05:28:29 +00:00
Bruce Momjian	66c15dfda1	Adjust /contrib for new include file contents.	2006-07-13 16:57:31 +00:00
Bruce Momjian	ac230e7431	Alphabetically order reference to include files, "S"-"Z".	2006-07-11 18:26:11 +00:00
Bruce Momjian	0ff3461bcc	Alphabetically order reference to include files, "N" - "S".	2006-07-11 17:26:59 +00:00
Teodor Sigaev	234163649e	GIN improvements - Replace sorted array of entries in maintenance_work_mem to binary tree, this should improve create performance. - More precisely calculate allocated memory, eliminate leaks with user-defined extractValue() - Improve wordings in tsearch2	2006-07-11 16:55:34 +00:00
Bruce Momjian	fa601357fb	Sort reference of include files, "A" - "F".	2006-07-11 16:35:33 +00:00
Bruce Momjian	c5133e5920	Allow /contrib include files to compile on their own.	2006-07-10 22:06:11 +00:00
Teodor Sigaev	1f7ef548ec	Changes * new split algorithm (as proposed in http://archives.postgresql.org/pgsql-hackers/2006-06/msg00254.php) * possible call pickSplit() for second and below columns * add spl_(l|r)datum_exists to GIST_SPLITVEC - pickSplit should check its values to use already defined spl_(l|r)datum for splitting. pickSplit should set spl_(l|r)datum_exists to 'false' (if they was 'true') to signal to caller about using spl_(l|r)datum. * support for old pickSplit(): not very optimal but correct split * remove 'bytes' field from GISTENTRY: in any case size of value is defined by it's type. * split GIST_SPLITVEC to two structures: one for using in picksplit and second - for internal use. * some code refactoring * support of subsplit to rtree opclasses TODO: add support of subsplit to contrib modules	2006-06-28 12:00:14 +00:00
Teodor Sigaev	04e9704b9e	Now ispell dictionary can eat dictionaries in MySpell format, used by OpenOffice. Dictionaries are placed at http://lingucomponent.openoffice.org/spell_dic.html Dictionary automatically recognizes format of files. Warning. MySpell's format has limitation with compound word support: it's impossible to mark affix as compound-only affix. So for norwegian, german etc languages it's recommended to use original ispell format. For that reason I don't want to remove my2ispell scripts, it's has workaround at least for norwegian language.	2006-06-09 13:25:59 +00:00
Teodor Sigaev	92bcb5abe0	Allow do not lexize words in substitution. Docs will be submitted some later, now it's at http://www.sai.msu.su/~megera/oddmuse/index.cgi/Thesaurus_dictionary	2006-06-06 16:25:55 +00:00
Teodor Sigaev	a513ce2dff	Fix wrong NOTICE/ERROR levels	2006-06-02 18:03:06 +00:00
Teodor Sigaev	efe1d427da	Distinguish between stop-word recognized in thesaurus_lexize()	2006-06-02 17:55:40 +00:00
Teodor Sigaev	c7faf45160	Add more strict check of stop and non-recognized words, allow only recognized words in thezaurus configuration file.	2006-06-02 15:35:42 +00:00
Teodor Sigaev	c269f0f1e2	fix comparison with SPI_processed	2006-05-31 14:53:41 +00:00
Teodor Sigaev	22505f4703	Add thesaurus dictionary which can replace N>0 lexemes by M>0 lexemes. It required some changes in lexize algorithm, but interface with dictionaries stays compatible with old dictionaries. Funded by Georgia Public Library Service and LibLime, Inc.	2006-05-31 14:05:31 +00:00
Tom Lane	a0ffab351e	Magic blocks don't do us any good unless we use 'em ... so install one in every shared library.	2006-05-30 22:12:16 +00:00
Bruce Momjian	19892feb3c	Back out \' change for tsearch2, broke regression tests.	2006-05-19 04:39:47 +00:00
Bruce Momjian	cc84163fa9	Use SQL standard '' rather than \' in /contrib. Backpatch to 8.1.X.	2006-05-19 02:38:47 +00:00
Teodor Sigaev	8a3631f8d8	GIN: Generalized Inverted iNdex. text[], int4[], Tsearch2 support for GIN.	2006-05-02 11:28:56 +00:00
Teodor Sigaev	e30df619cd	Fix stupid mistake in rank_cd_def cleanup	2006-04-10 09:56:52 +00:00
Bruce Momjian	f3d99d160d	Add CVS tag lines to files that were lacking them.	2006-03-11 04:38:42 +00:00
Teodor Sigaev	38c4fe87ac	Significantly improve ranking: 1) rank_cd now use weight of lexemes 2) rank_cd and rank can use any combination of normalization methods: no normalization normalization by log(length of document) -----/------- by length of document -----/------- by number of unique word in document -----/------- by log(number of unique word in document) -----/------- by number of covers (only rank_cd) Improve cover's search. TODO: changes in documentation	2006-03-02 19:07:19 +00:00
Neil Conway	8e5a10d46c	This patch makes the error message strings throughout the backend more compliant with the error message style guide. In particular, errdetail should begin with a capital letter and end with a period, whereas errmsg should not. I also fixed a few related issues in passing, such as fixing the repeated misspelling of "lexeme" in contrib/tsearch2 (per Tom's suggestion).	2006-03-01 06:30:32 +00:00
Peter Eisentraut	7f4f42fa10	Clean up CREATE FUNCTION syntax usage in contrib and elsewhere, in particular get rid of single quotes around language names and old WITH () construct.	2006-02-27 16:09:50 +00:00
Teodor Sigaev	dde9457294	Fixing and improve compound word support. This changes cannot be applied to previous version iwthout recreating tsvector fields... Thanks to Alexander Presber <aljoscha@weisshuhn.de> to discover a problem.	2006-02-20 17:51:05 +00:00
Tom Lane	b35fdaaa1a	Clean up some signedness warnings.	2006-02-10 15:57:58 +00:00
Teodor Sigaev	01f2172ec1	Allow "'" symbol in affixes ("'s" affix in english): it was diallowed during multibyte support work. Add line number to error output during affix file parsing.	2006-02-10 12:56:14 +00:00
Teodor Sigaev	011c520cb6	renew output of regression test accordingly to http://archives.postgresql.org/pgsql-committers/2006-02/msg00089.php	2006-02-10 11:18:40 +00:00
Teodor Sigaev	46a25ce6a9	1 Fix bug with very short word: prefix and suffix might be overlapped, sorry but fix can't be applyed to previous version: it's require refill tsvector... 2 Small optimize of load time for huge dictionaries 3 use palloc instead of malloc during load dict file	2006-02-09 18:04:20 +00:00
Teodor Sigaev	a6fefc866c	Check number of affixes to prevent core dump with zero number of affixes	2006-02-06 15:45:34 +00:00
Teodor Sigaev	5e2707c45f	Snowball multibyte. It's a pity, but snowball sources is very diferent for multibyte and singlebyte encodings, so we should have snowball for every encodings. I hope that finalize multibyte support work in tsearch2, but testing is needed...	2006-01-27 16:32:31 +00:00
Teodor Sigaev	80324fb1e3	Fix typeing as Tom suggest	2006-01-23 14:24:06 +00:00
Tom Lane	33feb55c47	Replace bitwise looping with bytewise looping in hemdistsign and sizebitvec of tsearch2, as well as identical code in several other contrib modules. This provided about a 20X speedup in building a large tsearch2 index ... didn't try to measure its effects for other operations. Thanks to Stephan Vollmer for providing a test case.	2006-01-20 22:46:16 +00:00
Teodor Sigaev	7ac8a4be89	Multibyte encodings support for ISpell dictionary	2005-12-21 13:05:49 +00:00
Teodor Sigaev	cb4ea994c6	Improve support of multibyte encoding: - tsvector_(in|out) - tsquery_(in|out) - to_tsvector - to_tsquery, plainto_tsquery - 'simple' dictionary	2005-12-12 11:10:12 +00:00
Teodor Sigaev	faacdab101	Improve tag recognizing	2005-12-08 09:11:19 +00:00
Teodor Sigaev	9551ab2fe9	Fix small memory leak	2005-12-07 13:30:15 +00:00
Teodor Sigaev	4f94b49a31	Improve word parser. - allow ~ in filenames - -8.2.1 now is '-' and '8.2.1' instead of '-8.2' '.' '3' - '.text' now is not a file	2005-12-07 13:12:54 +00:00
Teodor Sigaev	e8c81e179e	Improve word parser. - improve file and path recognition - fix misspeling - improve tag recognition	2005-12-05 18:13:22 +00:00
Bruce Momjian	436a2956d8	Re-run pgindent, fixing a problem where comment lines after a blank comment line where output as too long, and update typedefs for /lib directory. Also fix case where identifiers were used as variable names in the backend, but as typedefs in ecpg (favor the backend for indenting). Backpatch to 8.1.X.	2005-11-22 18:17:34 +00:00
Teodor Sigaev	3c6cd8a113	Fixes motivated by snake and spoonbill pgbuildfarm members	2005-11-22 09:01:35 +00:00
Teodor Sigaev	62699337bc	remove forgotten // comments	2005-11-21 18:00:52 +00:00
Teodor Sigaev	c52795d18a	Text parser rewritten: - supports multibyte encodings - more strict rules for lexemes - flex isn't used Add: - tsquery plainto_tsquery(text) Function makes tsquery from plain text. - &&, ||, !! operation for tsquery for combining tsquery from it's parts: 'foo & bar' || 'asd' => 'foo & bar | asd'	2005-11-21 12:27:57 +00:00
Tom Lane	1d0d8d3c38	Mop-up for nulls-in-arrays patch: fix some places that access array contents directly.	2005-11-18 02:38:24 +00:00
Tom Lane	cecb607559	Make SQL arrays support null elements. This commit fixes the core array functionality, but I still need to make another pass looking at places that incidentally use arrays (such as ACL manipulation) to make sure they are null-safe. Contrib needs work too. I have not changed the behaviors that are still under discussion about array comparison and what to do with lower bounds.	2005-11-17 22:14:56 +00:00
Teodor Sigaev	bad1a5c217	Use postgres-wide macros BITS_PER_BYTE instead self-definenig macros, also use it for calculating bit length of TPQTGist	2005-11-14 14:44:06 +00:00

257 Commits