postgres

Commit Graph

Author	SHA1	Message	Date
Amit Kapila	6f918159a9	Add more tests for FSM. In commit `b0eaa4c51b`, we left out a test that used a vacuum to remove dead rows as the behavior of test was not predictable. This test has been rewritten to use fillfactor instead to control free space. Since we no longer need to remove dead rows as part of the test, put the fsm regression test in a parallel group. Author: John Naylor Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/CAA4eK1L=qWp_bJ5aTc9+fy4Ewx2LPaLWY-RbR4a60g_rupCKnQ@mail.gmail.com	2019-03-12 08:14:28 +05:30
Michael Paquier	ce6afc6823	Add routine able to update the control file to src/common/ This adds a new routine to src/common/ which is compatible with both the frontend and backend code, able to update the control file's contents. This is now getting used only by pg_rewind, but some upcoming patches which add more control on checksums for offline instances will make use of it. This could also get used more by the backend as xlog.c has its own flavor of the same logic with some wait events and an additional flush phase before closing the opened file descriptor, but this is let as separate work. Author: Michael Banck, Michael Paquier Reviewed-by: Fabien Coelho, Sergei Kornilov Discussion: https://postgr.es/m/20181221201616.GD4974@nighthawk.caipicrew.dd-dns.de	2019-03-12 10:03:33 +09:00
Tom Lane	1a83a80a2f	Allow fractional input values for integer GUCs, and improve rounding logic. Historically guc.c has just refused examples like set work_mem = '30.1GB', but it seems more useful for it to take that and round off the value to some reasonable approximation of what the user said. Just rounding to the parameter's native unit would work, but it would lead to rather silly-looking settings, such as 31562138kB for this example. Instead let's round to the nearest multiple of the next smaller unit (if any), producing 30822MB. Also, do the units conversion math in floating point and round to integer (if needed) only at the end. This produces saner results for inputs that aren't exact multiples of the parameter's native unit, and removes another difference in the behavior for integer vs. float parameters. In passing, document the ability to use hex or octal input where it ought to be documented. Discussion: https://postgr.es/m/1798.1552165479@sss.pgh.pa.us	2019-03-11 19:13:55 -04:00
Andrew Dunstan	fe0b2c12c9	Tweak wording on VARIADIC array doc patch. Per suggestion from Tom Lane.	2019-03-11 18:23:01 -04:00
Andrew Dunstan	5e74a42785	Document incompatibility of comparison expressions with VARIADIC array arguments COALESCE, GREATEST and LEAST all look like functions taking variable numbers of arguments, but in fact they are not functions, and so VARIADIC array arguments don't work with them. Add a note to the docs explaining this fact. The consensus is not to try to make this work, but just to document the limitation. Discussion: https://postgr.es/m/CAFj8pRCaAtuXuRtvXf5GmPbAVriUQrNMo7-=TXUFN025S31R_w@mail.gmail.com	2019-03-11 18:14:05 -04:00
Andres Freund	32b8f0b033	Remove spurious return. Per buildfarm member anole. Author: Andres Freund	2019-03-11 15:04:00 -07:00
Tom Lane	d9c5e9629b	Give up on testing guc.c's behavior for "infinity" inputs. Further buildfarm testing shows that on the machines that are failing ac75959cd's test case, what we're actually getting from strtod("-infinity") is a syntax error (endptr == value) not ERANGE at all. This test case is not worth carrying two sets of expected output for, so just remove it, and revert commit b212245f9's misguided attempt to work around the platform dependency. Discussion: https://postgr.es/m/E1h33xk-0001Og-Gs@gemulon.postgresql.org	2019-03-11 17:53:09 -04:00
Andres Freund	8cacea7a72	Ensure sufficient alignment for ParallelTableScanDescData in BTShared. Previously ParallelTableScanDescData was just a member in BTShared, but after `c2fe139c2` that doesn't guarantee sufficient alignment as specific AMs might (are likely to) need atomic variables in the struct. One might think that MAXALIGNing would be sufficient, but as a comment in shm_toc_allocate() explains, that's not enough. For now, copy the hack described there. For parallel sequential scans no such change is needed, as its allocations go through shm_toc_allocate(). An alternative approach would have been to allocate the parallel scan descriptor in a separate TOC entry, but there seems little benefit in doing so. Per buildfarm member dromedary. Author: Andres Freund Discussion: https://postgr.es/m/20190311203126.ty5gbfz42gjbm6i6@alap3.anarazel.de	2019-03-11 14:26:43 -07:00
Andres Freund	c2fe139c20	tableam: Add and use scan APIs. Too allow table accesses to be not directly dependent on heap, several new abstractions are needed. Specifically: 1) Heap scans need to be generalized into table scans. Do this by introducing TableScanDesc, which will be the "base class" for individual AMs. This contains the AM independent fields from HeapScanDesc. The previous heap_{beginscan,rescan,endscan} et al. have been replaced with a table_ version. There's no direct replacement for heap_getnext(), as that returned a HeapTuple, which is undesirable for a other AMs. Instead there's table_scan_getnextslot(). But note that heap_getnext() lives on, it's still used widely to access catalog tables. This is achieved by new scan_begin, scan_end, scan_rescan, scan_getnextslot callbacks. 2) The portion of parallel scans that's shared between backends need to be able to do so without the user doing per-AM work. To achieve that new parallelscan_{estimate, initialize, reinitialize} callbacks are introduced, which operate on a new ParallelTableScanDesc, which again can be subclassed by AMs. As it is likely that several AMs are going to be block oriented, block oriented callbacks that can be shared between such AMs are provided and used by heap. table_block_parallelscan_{estimate, intiialize, reinitialize} as callbacks, and table_block_parallelscan_{nextpage, init} for use in AMs. These operate on a ParallelBlockTableScanDesc. 3) Index scans need to be able to access tables to return a tuple, and there needs to be state across individual accesses to the heap to store state like buffers. That's now handled by introducing a sort-of-scan IndexFetchTable, which again is intended to be subclassed by individual AMs (for heap IndexFetchHeap). The relevant callbacks for an AM are index_fetch_{end, begin, reset} to create the necessary state, and index_fetch_tuple to retrieve an indexed tuple. Note that index_fetch_tuple implementations need to be smarter than just blindly fetching the tuples for AMs that have optimizations similar to heap's HOT - the currently alive tuple in the update chain needs to be fetched if appropriate. Similar to table_scan_getnextslot(), it's undesirable to continue to return HeapTuples. Thus index_fetch_heap (might want to rename that later) now accepts a slot as an argument. Core code doesn't have a lot of call sites performing index scans without going through the systable_* API (in contrast to loads of heap_getnext calls and working directly with HeapTuples). Index scans now store the result of a search in IndexScanDesc->xs_heaptid, rather than xs_ctup->t_self. As the target is not generally a HeapTuple anymore that seems cleaner. To be able to sensible adapt code to use the above, two further callbacks have been introduced: a) slot_callbacks returns a TupleTableSlotOps* suitable for creating slots capable of holding a tuple of the AMs type. table_slot_callbacks() and table_slot_create() are based upon that, but have additional logic to deal with views, foreign tables, etc. While this change could have been done separately, nearly all the call sites that needed to be adapted for the rest of this commit also would have been needed to be adapted for table_slot_callbacks(), making separation not worthwhile. b) tuple_satisfies_snapshot checks whether the tuple in a slot is currently visible according to a snapshot. That's required as a few places now don't have a buffer + HeapTuple around, but a slot (which in heap's case internally has that information). Additionally a few infrastructure changes were needed: I) SysScanDesc, as used by systable_{beginscan, getnext} et al. now internally uses a slot to keep track of tuples. While systable_getnext() still returns HeapTuples, and will so for the foreseeable future, the index API (see 1) above) now only deals with slots. The remainder, and largest part, of this commit is then adjusting all scans in postgres to use the new APIs. Author: Andres Freund, Haribabu Kommi, Alvaro Herrera Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql	2019-03-11 12:46:41 -07:00
Andrew Dunstan	a478415281	pgbench: increase the maximum number of variables/arguments pgbench's arbitrary limit of 10 arguments for SQL statements or metacommands is far too low. Increase it to 256. This results in a very modest increase in memory usage, not enough to worry about. The maximum includes the SQL statement or metacommand. This is reflected in the comments and revised TAP tests. Simon Riggs and Dagfinn Ilmari Mannsåker with some light editing by me. Reviewed by: David Rowley and Fabien Coelho Discussion: https://postgr.es/m/CANP8+jJiMJOAf-dLoHuR-8GENiK+eHTY=Omw38Qx7j2g0NDTXA@mail.gmail.com	2019-03-11 13:18:37 -04:00
Amit Kapila	a6e48da088	Fix typos in commit `8586bf7ed8`. Author: Amit Kapila Discussion: https://postgr.es/m/CAA4eK1KNv1Mg2krf4E9ssWFnE=8A9mZ1VbVywXBZTFSzb+wP2g@mail.gmail.com	2019-03-11 09:58:46 -07:00
Alvaro Herrera	af38498d4c	Move hash_any prototype from access/hash.h to utils/hashutils.h ... as well as its implementation from backend/access/hash/hashfunc.c to backend/utils/hash/hashfn.c. access/hash is the place for the hash index AM, not really appropriate for generic facilities, which is what hash_any is; having things the old way meant that anything using hash_any had to include the AM's include file, pointlessly polluting its namespace with unrelated, unnecessary cruft. Also move the HTEqual strategy number to access/stratnum.h from access/hash.h. To avoid breaking third-party extension code, add an #include "utils/hashutils.h" to access/hash.h. (An easily removed line by committers who enjoy their asbestos suits to protect them from angry extension authors.) Discussion: https://postgr.es/m/201901251935.ser5e4h6djt2@alvherre.pgsql	2019-03-11 13:17:50 -03:00
Tom Lane	b212245f96	In guc.c, ignore ERANGE errors from strtod(). Instead, just proceed with the infinity or zero result that it should return for overflow/underflow. This avoids a platform dependency, in that various versions of strtod are inconsistent about whether they signal ERANGE for a value that's specified as infinity. It's possible this won't be enough to remove the buildfarm failures we're seeing from `ac75959cd`, in which case I'll take out the infinity test case that commit added. But first let's see if we can fix it. Discussion: https://postgr.es/m/E1h33xk-0001Og-Gs@gemulon.postgresql.org	2019-03-11 11:25:26 -04:00
Michael Meskes	08cecfaf60	Fix potential memory access violation in ecpg if filename of include file is shorter than 2 characters. Patch by: "Wu, Fei" <wufei.fnst@cn.fujitsu.com>	2019-03-11 16:11:16 +01:00
Michael Meskes	98bdaab0d9	Fix ecpglib regression that made it impossible to close a cursor that was opened in a prepared statement. Patch by: "Kuroda, Hayato" <kuroda.hayato@jp.fujitsu.com>	2019-03-11 16:00:13 +01:00
Peter Eisentraut	3c06715447	Remove unused macro Use was removed in `25ca5a9a54`.	2019-03-11 09:29:50 +01:00
Peter Eisentraut	27f3dea648	psql: Add documentation URL to \help output Add a link to the specific command's reference web page to the bottom of its \help output. Discussion: https://www.postgresql.org/message-id/flat/40179bd0-fa7d-4108-1991-a20ae9ad5667%402ndquadrant.com	2019-03-11 09:11:37 +01:00
Michael Paquier	f2d84a4a6b	Adjust error message for partial writes in WAL segments `93473c6` has removed openLogOff, changing on the way the error message which is used to report partial writes to WAL segments. The newly-introduced error message used the offset up to which the write has happened, keeping always the same total length to write. This changes the error message so as the number of bytes left to write are reported. Reported-by: Michael Paquier Author: Robert Haas Discussion: https://postgr.es/m/20190306235251.GA17293@paquier.xyz	2019-03-11 09:31:25 +09:00
Alvaro Herrera	fc84c05acd	Fix documentation on partitioning vs. foreign tables 1. The PARTITION OF clause of CREATE FOREIGN TABLE was not explained in the CREATE FOREIGN TABLE reference page. Add it. (Postgres 10 onwards) 2. The limitation that tuple routing cannot target partitions that are foreign tables was not documented clearly enough. Improve wording. (Postgres 10 onwards) 3. The UPDATE tuple re-routing concurrency behavior was explained in the DDL chapter, which doesn't seem the right place. Move it to the UPDATE reference page instead. (Postgres 11 onwards). Authors: Amit Langote, David Rowley. Reviewed-by: Etsuro Fujita. Reported-by: Derek Hans Discussion: https://postgr.es/m/CAGrP7a3Xc1Qy_B2WJcgAD8uQTS_NDcJn06O5mtS_Ne1nYhBsyw@mail.gmail.com	2019-03-10 19:45:29 -03:00
Tom Lane	cbccac371c	Reduce the default value of autovacuum_vacuum_cost_delay to 2ms. This is a better way to implement the desired change of increasing autovacuum's default resource consumption. Discussion: https://postgr.es/m/28720.1552101086@sss.pgh.pa.us	2019-03-10 15:16:21 -04:00
Tom Lane	52985e4fea	Revert "Increase the default vacuum_cost_limit from 200 to 2000" This reverts commit `bd09503e63`. Per discussion, it seems like what we should do instead is to reduce the default value of autovacuum_vacuum_cost_delay by the same factor. That's functionally equivalent as long as the platform can accurately service the smaller delay request, which should be true on anything released in the last 10 years or more. And smaller, more-closely-spaced delays are better in terms of providing a steady I/O load. Discussion: https://postgr.es/m/28720.1552101086@sss.pgh.pa.us	2019-03-10 15:05:25 -04:00
Tom Lane	caf626b2cd	Convert [autovacuum_]vacuum_cost_delay into floating-point GUCs. This change makes it possible to specify sub-millisecond delays, which work well on most modern platforms, though that was not true when the cost-delay feature was designed. To support this without breaking existing configuration entries, improve guc.c to allow floating-point GUCs to have units. Also, allow "us" (microseconds) as an input/output unit for time-unit GUCs. (It's not allowed as a base unit, at least not yet.) Likewise change the autovacuum_vacuum_cost_delay reloption to be floating-point; this forces a catversion bump because the layout of StdRdOptions changes. This patch doesn't in itself change the default values or allowed ranges for these parameters, and it should not affect the behavior for any already-allowed setting for them. Discussion: https://postgr.es/m/1798.1552165479@sss.pgh.pa.us	2019-03-10 15:01:39 -04:00
Tom Lane	28a65fc360	Include GUC's unit, if it has one, in out-of-range error messages. This should reduce confusion in cases where we've applied a units conversion, so that the number being reported (and the quoted range limits) are in some other units than what the user gave in the setting we're rejecting. Some of the changes here assume that float GUCs can have units, which isn't true just yet, but will be shortly. Discussion: https://postgr.es/m/3811.1552169665@sss.pgh.pa.us	2019-03-10 13:18:17 -04:00
Tom Lane	ac75959cdc	Disallow NaN as a value for floating-point GUCs. None of the code that uses GUC values is really prepared for them to hold NaN, but parse_real() didn't have any defense against accepting such a value. Treat it the same as a syntax error. I haven't attempted to analyze the exact consequences of setting any of the float GUCs to NaN, but since they're quite unlikely to be good, this seems like a back-patchable bug fix. Note: we don't need an explicit test for +-Infinity because those will be rejected by existing range checks. I added a regression test for that in HEAD, but not older branches because the spelling of the value in the error message will be platform-dependent in branches where we don't always use port/snprintf.c. Discussion: https://postgr.es/m/1798.1552165479@sss.pgh.pa.us	2019-03-10 12:59:16 -04:00
Alvaro Herrera	203749a8a6	pg_upgrade: Ignore TOAST for partitioned tables Since partitioned tables in pg12 do not have toast tables, trying to set the toast OID confuses pg_upgrade. Have pg_dump omit those values to avoid the problem. Per Andres Freund and buildfarm members crake and snapper Discussion: https://postgr.es/m/20190306204104.yle5jfbnqkcwykni@alap3.anarazel.de	2019-03-10 13:20:58 -03:00
Alexander Korotkov	f2e403803f	Support for INCLUDE attributes in GiST indexes Similarly to B-tree, GiST index access method gets support of INCLUDE attributes. These attributes aren't used for tree navigation and aren't present in non-leaf pages. But they are present in leaf pages and can be fetched during index-only scan. The point of having INCLUDE attributes in GiST indexes is slightly different from the point of having them in B-tree. The main point of INCLUDE attributes in B-tree is to define UNIQUE constraint over part of attributes enabled for index-only scan. In GiST the main point of INCLUDE attributes is to use index-only scan for attributes, whose data types don't have GiST opclasses. Discussion: https://postgr.es/m/73A1A452-AD5F-40D4-BD61-978622FF75C1%40yandex-team.ru Author: Andrey Borodin, with small changes by me Reviewed-by: Andreas Karlsson	2019-03-10 11:37:17 +03:00
Tom Lane	a0b7626268	Simplify release-note links to back branches. Now that https://www.postgresql.org/docs/release/ is populated, replace the stopgap text we had under "Prior Releases" with a pointer to that archive. Discussion: https://postgr.es/m/e0f09c9a-bd2b-862a-d379-601dfabc8969@postgresql.org	2019-03-09 18:42:39 -05:00
Magnus Hagander	0516c61b75	Add new clientcert hba option verify-full This allows a login to require both that the cn of the certificate matches (like authentication type cert) and that another authentication method (such as password or kerberos) succeeds as well. The old value of clientcert=1 maps to the new clientcert=verify-ca, clientcert=0 maps to the new clientcert=no-verify, and the new option erify-full will add the validation of the CN. Author: Julian Markwort, Marius Timmer Reviewed by: Magnus Hagander, Thomas Munro	2019-03-09 12:19:47 -08:00
Magnus Hagander	6b9e875f72	Track block level checksum failures in pg_stat_database This adds a column that counts how many checksum failures have occurred on files belonging to a specific database. Both checksum failures during normal backend processing and those created when a base backup detects a checksum failure are counted. Author: Magnus Hagander Reviewed by: Julien Rouhaud	2019-03-09 10:47:30 -08:00
Noah Misch	3c5926301a	Avoid some table rewrites for ALTER TABLE .. SET DATA TYPE timestamp. When the timezone is UTC, timestamptz and timestamp are binary coercible in both directions. See `b8a18ad485` and `c22ecc6562` for the previous attempt in this problem space. Skip the table rewrite; for now, continue to needlessly rewrite any index on an affected column. Reviewed by Simon Riggs and Tom Lane. Discussion: https://postgr.es/m/20190226061450.GA1665944@rfd.leadboat.com	2019-03-08 20:16:27 -08:00
Michael Paquier	82a5649fb9	Tighten use of OpenTransientFile and CloseTransientFile This fixes two sets of issues related to the use of transient files in the backend: 1) OpenTransientFile() has been used in some code paths with read-write flags while read-only is sufficient, so switch those calls to be read-only where necessary. These have been reported by Joe Conway. 2) When opening transient files, it is up to the caller to close the file descriptors opened. In error code paths, CloseTransientFile() gets called to clean up things before issuing an error. However in normal exit paths, a lot of callers of CloseTransientFile() never actually reported errors, which could leave a file descriptor open without knowing about it. This is an issue I complained about a couple of times, but never had the courage to write and submit a patch, so here we go. Note that one frontend code path is impacted by this commit so as an error is issued when fetching control file data, making backend and frontend to be treated consistently. Reported-by: Joe Conway, Michael Paquier Author: Michael Paquier Reviewed-by: Álvaro Herrera, Georgios Kokolatos, Joe Conway Discussion: https://postgr.es/m/20190301023338.GD1348@paquier.xyz Discussion: https://postgr.es/m/c49b69ec-e2f7-ff33-4f17-0eaa4f2cef27@joeconway.com	2019-03-09 08:50:55 +09:00
Alvaro Herrera	2e616dee9e	Fix crash with old libxml2 Certain libxml2 versions (such as the 2.7.6 commonly seen in older distributions, but apparently only on x86_64) contain a bug that causes xmlCopyNode, when called on a XML_DOCUMENT_NODE, to return a node that xmlFreeNode crashes on. Arrange to call xmlFreeDoc instead of xmlFreeNode for those nodes. Per buildfarm members lapwing and grison. Author: Pavel Stehule, light editing by Álvaro. Discussion: https://postgr.es/m/20190308024436.GA2374@alvherre.pgsql	2019-03-08 19:13:25 -03:00
Tom Lane	1b76168da7	Reformat catalog .dat files. Test run for my previous commit; cleans up formatting issues in some other recent commits.	2019-03-08 12:01:27 -05:00
Tom Lane	27aaf6eff4	Minor improvements for reformat_dat_file.pl. Use Getopt::Long in preference to hand-rolled option parsing code. Also, remove "-I .../backend/catalog" switch from the Makefile invocations. That's been unnecessary for some time, and leaving it there gives the false impression it's needed in manual invocations. John Naylor (extracted from a larger but more controversial patch) Discussion: https://postgr.es/m/CACPNZCsHdcQN2jQ1=ptbi1Co2Nj3aHgRCUMk62=ThgWNabPY+Q@mail.gmail.com	2019-03-08 11:48:49 -05:00
Michael Paquier	e1e0e8d58c	Fix function signatures of pageinspect in documentation tuple_data_split() lacked the type of the first argument, and heap_page_item_attrs() has reversed the first and second argument, with the bytea argument using an incorrect name. Author: Laurenz Albe Discussion: https://postgr.es/m/8f9ab7b16daf623e87eeef5203a4ffc0dece8dfd.camel@cybertec.at Backpatch-through: 9.6	2019-03-08 15:10:14 +09:00
Michael Paquier	beeb8e2e07	Fix compatibility of pg_basebackup -R with 11 and older versions When `2dedf4d9` has integrated recovery.conf into postgresql.conf, it also changed pg_basebackup -R in the way recovery configuration is generated. However this implementation forgot the fact that pg_basebackup needs to keep compatibility with older server versions as well. Reported-by: Devrim Gündüz Author: Sergei Kornilov, Michael Paquier Discussion: https://postgr.es/m/3458f7cd12d74acd90180a671c8d5a081d60e162.camel@gunduz.org	2019-03-08 10:17:23 +09:00
Alvaro Herrera	251cf2e27b	Fix minor deficiencies in XMLTABLE, xpath(), xmlexists() Correctly process nodes of more types than previously. In some cases, nodes were being ignored (nothing was output); in other cases, trying to return them resulted in errors about unrecognized nodes. In yet other cases, necessary escaping (of XML special characters) was not being done. Fix all those (as far as the authors could find) and add regression tests cases verifying the new behavior. I (Álvaro) was of two minds about backpatching these changes. They do seem bugfixes that would benefit most users of the affected functions; but on the other hand it would change established behavior in minor releases, so it seems prudent not to. Authors: Pavel Stehule, Markus Winand, Chapman Flack Discussion: https://postgr.es/m/CAFj8pRA6J25CtAZ2TuRvxK3gat7-bBUYh0rfE2yM7Hj9GD14Dg@mail.gmail.com https://postgr.es/m/8BDB0627-2105-4564-AA76-7849F028B96E@winand.at The elephant in the room as pointed out by Chapman Flack, not fixed in this commit, is that we still have XMLTABLE operating on XPath 1.0 instead of the standard-mandated XQuery (or even its subset XPath 2.0). Fixing that is a major undertaking, however.	2019-03-07 18:16:34 -03:00
Tom Lane	1d33858406	Fix handling of targetlist SRFs when scan/join relation is known empty. When we introduced separate ProjectSetPath nodes for application of set-returning functions in v10, we inadvertently broke some cases where we're supposed to recognize that the result of a subquery is known to be empty (contain zero rows). That's because IS_DUMMY_REL was just looking for a childless AppendPath without allowing for a ProjectSetPath being possibly stuck on top. In itself, this didn't do anything much worse than produce slightly worse plans for some corner cases. Then in v11, commit `11cf92f6e` rearranged things to allow the scan/join targetlist to be applied directly to partial paths before they get gathered. But it inserted a short-circuit path for dummy relations that was a little too short: it failed to insert a ProjectSetPath node at all for a targetlist containing set-returning functions, resulting in bogus "set-valued function called in context that cannot accept a set" errors, as reported in bug #15669 from Madelaine Thibaut. The best way to fix this mess seems to be to reimplement IS_DUMMY_REL so that it drills down through any ProjectSetPath nodes that might be there (and it seems like we'd better allow for ProjectionPath as well). While we're at it, make it look at rel->pathlist not cheapest_total_path, so that it gives the right answer independently of whether set_cheapest has been done lately. That dependency looks pretty shaky in the context of code like apply_scanjoin_target_to_paths, and even if it's not broken today it'd certainly bite us at some point. (Nastily, unsafe use of the old coding would almost always work; the hazard comes down to possibly looking through a dangling pointer, and only once in a blue moon would you find something there that resulted in the wrong answer.) It now looks like it was a mistake for IS_DUMMY_REL to be a macro: if there are any extensions using it, they'll continue to use the old inadequate logic until they're recompiled, after which they'll fail to load into server versions predating this fix. Hopefully there are few such extensions. Having fixed IS_DUMMY_REL, the special path for dummy rels in apply_scanjoin_target_to_paths is unnecessary as well as being wrong, so we can just drop it. Also change a few places that were testing for partitioned-ness of a planner relation but not using IS_PARTITIONED_REL for the purpose; that seems unsafe as well as inconsistent, plus it required an ugly hack in apply_scanjoin_target_to_paths. In passing, save a few cycles in apply_scanjoin_target_to_paths by skipping processing of pre-existing paths for partitioned rels, and do some cosmetic cleanup and comment adjustment in that function. I renamed IS_DUMMY_PATH to IS_DUMMY_APPEND with the intention of breaking any code that might be using it, since in almost every case that would be wrong; IS_DUMMY_REL is what to be using instead. In HEAD, also make set_dummy_rel_pathlist static (since it's no longer used from outside allpaths.c), and delete is_dummy_plan, since it's no longer used anywhere. Back-patch as appropriate into v11 and v10. Tom Lane and Julien Rouhaud Discussion: https://postgr.es/m/15669-02fb3296cca26203@postgresql.org	2019-03-07 14:22:13 -05:00
Robert Haas	898e5e3290	Allow ATTACH PARTITION with only ShareUpdateExclusiveLock. We still require AccessExclusiveLock on the partition itself, because otherwise an insert that violates the newly-imposed partition constraint could be in progress at the same time that we're changing that constraint; only the lock level on the parent relation is weakened. To make this safe, we have to cope with (at least) three separate problems. First, relevant DDL might commit while we're in the process of building a PartitionDesc. If so, find_inheritance_children() might see a new partition while the RELOID system cache still has the old partition bound cached, and even before invalidation messages have been queued. To fix that, if we see that the pg_class tuple seems to be missing or to have a null relpartbound, refetch the value directly from the table. We can't get the wrong value, because DETACH PARTITION still requires AccessExclusiveLock throughout; if we ever want to change that, this will need more thought. In testing, I found it quite difficult to hit even the null-relpartbound case; the race condition is extremely tight, but the theoretical risk is there. Second, successive calls to RelationGetPartitionDesc might not return the same answer. The query planner will get confused if lookup up the PartitionDesc for a particular relation does not return a consistent answer for the entire duration of query planning. Likewise, query execution will get confused if the same relation seems to have a different PartitionDesc at different times. Invent a new PartitionDirectory concept and use it to ensure consistency. This ensures that a single invocation of either the planner or the executor sees the same view of the PartitionDesc from beginning to end, but it does not guarantee that the planner and the executor see the same view. Since this allows pointers to old PartitionDesc entries to survive even after a relcache rebuild, also postpone removing the old PartitionDesc entry until we're certain no one is using it. For the most part, it seems to be OK for the planner and executor to have different views of the PartitionDesc, because the executor will just ignore any concurrently added partitions which were unknown at plan time; those partitions won't be part of the inheritance expansion, but invalidation messages will trigger replanning at some point. Normally, this happens by the time the very next command is executed, but if the next command acquires no locks and executes a prepared query, it can manage not to notice until a new transaction is started. We might want to tighten that up, but it's material for a separate patch. There would still be a small window where a query that started just after an ATTACH PARTITION command committed might fail to notice its results -- but only if the command starts before the commit has been acknowledged to the user. All in all, the warts here around serializability seem small enough to be worth accepting for the considerable advantage of being able to add partitions without a full table lock. Although in general the consequences of new partitions showing up between planning and execution are limited to the query not noticing the new partitions, run-time partition pruning will get confused in that case, so that's the third problem that this patch fixes. Run-time partition pruning assumes that indexes into the PartitionDesc are stable between planning and execution. So, add code so that if new partitions are added between plan time and execution time, the indexes stored in the subplan_map[] and subpart_map[] arrays within the plan's PartitionedRelPruneInfo get adjusted accordingly. There does not seem to be a simple way to generalize this scheme to cope with partitions that are removed, mostly because they could then get added back again with different bounds, but it works OK for added partitions. This code does not try to ensure that every backend participating in a parallel query sees the same view of the PartitionDesc. That currently doesn't matter, because we never pass PartitionDesc indexes between backends. Each backend will ignore the concurrently added partitions which it notices, and it doesn't matter if different backends are ignoring different sets of concurrently added partitions. If in the future that matters, for example because we allow writes in parallel query and want all participants to do tuple routing to the same set of partitions, the PartitionDirectory concept could be improved to share PartitionDescs across backends. There is a draft patch to serialize and restore PartitionDescs on the thread where this patch was discussed, which may be a useful place to start. Patch by me. Thanks to Alvaro Herrera, David Rowley, Simon Riggs, Amit Langote, and Michael Paquier for discussion, and to Alvaro Herrera for some review. Discussion: http://postgr.es/m/CA+Tgmobt2upbSocvvDej3yzokd7AkiT+PvgFH+a9-5VV1oJNSQ@mail.gmail.com Discussion: http://postgr.es/m/CA+TgmoZE0r9-cyA-aY6f8WFEROaDLLL7Vf81kZ8MtFCkxpeQSw@mail.gmail.com Discussion: http://postgr.es/m/CA+TgmoY13KQZF-=HNTrt9UYWYx3_oYOQpu9ioNT49jGgiDpUEA@mail.gmail.com	2019-03-07 11:13:12 -05:00
Alvaro Herrera	ec51727f6e	Fix broken markup	2019-03-07 11:35:39 -03:00
Alvaro Herrera	eaaa5986ad	Fix the BY {REF,VALUE} clause of XMLEXISTS/XMLTABLE This clause is used to indicate the passing mode of a XML document, but we were doing it wrong: we accepted BY REF and ignored it, and rejected BY VALUE as a syntax error. The reality, however, is that documents are always passed BY VALUE, so rejecting that clause was silly. Change things so that we accept BY VALUE. BY REF continues to be accepted, and continues to be ignored. Author: Chapman Flack Reviewed-by: Pavel Stehule Discussion: https://postgr.es/m/5C297BB7.9070509@anastigmatix.net	2019-03-07 11:20:35 -03:00
Alvaro Herrera	cb706ec4b6	Add missing <limits.h> Per buildfarm	2019-03-07 10:08:29 -03:00
Alvaro Herrera	7e413a0f82	pg_dump: allow multiple rows per insert This is useful to speed up loading data in a different database engine. Authors: Surafel Temesgen and David Rowley. Lightly edited by Álvaro. Reviewed-by: Fabien Coelho Discussion: https://postgr.es/m/CALAY4q9kumSdnRBzvRJvSRf2+BH20YmSvzqOkvwpEmodD-xv6g@mail.gmail.com	2019-03-07 09:34:17 -03:00
Thomas Munro	42210524cc	Remove useless header inclusion.	2019-03-07 20:02:22 +13:00
Thomas Munro	91595f9d49	Drop the vestigial "smgr" type. Before commit `3fa2bb31` this type appeared in the catalogs to select which of several block storage mechanisms each relation used. New features under development propose to revive the concept of different block storage managers for new kinds of data accessed via bufmgr.c, but don't need to put references to them in the catalogs. So, avoid useless maintenance work on this type by dropping it. Update some regression tests that were referencing it where any type would do. Discussion: https://postgr.es/m/CA%2BhUKG%2BDE0mmiBZMtZyvwWtgv1sZCniSVhXYsXkvJ_Wo%2B83vvw%40mail.gmail.com	2019-03-07 15:44:04 +13:00
Andres Freund	277cb78983	Don't reuse slots between root and partition in ON CONFLICT ... UPDATE. Until now the the slot to store the conflicting tuple, and the result of the ON CONFLICT SET, where reused between partitions. That necessitated changing slots descriptor when switching partitions. Besides the overhead of switching descriptors on a slot (which requires memory allocations and prevents JITing), that's importantly also problematic for tableam. There individual partitions might belong to different tableams, needing different kinds of slots. In passing also fix ExecOnConflictUpdate to clear the existing slot at exit. Otherwise that slot could continue to hold a pin till the query ends, which could be far too long if the input data set is large, and there's no further conflicts. While previously also problematic, it's now more important as there will be more such slots when partitioned. Author: Andres Freund Reviewed-By: Robert Haas, David Rowley Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de	2019-03-06 15:43:33 -08:00
Andres Freund	d16a74c20c	Fix equalfuncs for accessMethod addition in `8586bf7ed8`. In a complete brown paper bag moment, I forgot to include equalfuncs in my previous fix of copy/out/readfuncs. Thanks Tom for noticing. Discussion: https://postgr.es/m/1659.1551903210@sss.pgh.pa.us	2019-03-06 13:04:09 -08:00
Andrew Dunstan	342cb650e0	Don't log incomplete startup packet if it's empty This will stop logging cases where, for example, a monitor opens a connection and immediately closes it. If the packet contains any data an incomplete packet will still be logged. Author: Tom Lane Discussion: https://postgr.es/m/a1379a72-2958-1ed0-ef51-09a21219b155@2ndQuadrant.com	2019-03-06 15:36:41 -05:00
Andres Freund	b172342321	Fix copy/out/readfuncs for accessMethod addition in `8586bf7ed8`. This includes a catversion bump, as IntoClause is theoretically speaking part of storable rules. In practice I don't think that can happen, but there's no reason to be stingy here. Per buildfarm member calliphoridae.	2019-03-06 11:55:28 -08:00
Andres Freund	863aa55624	Fix collation dependency in test introduced in `8586bf7ed8`, take 2. Per buildfarm. This time I hopefully actually made sure to get all the cases...	2019-03-06 11:27:14 -08:00

1 2 3 4 5 ...

46603 Commits All Branches Search

46603 Commits

All Branches