by creating a reference-count mechanism, similar to what we did a long time
ago for catcache entries. The back branches have an ugly solution involving
lots of extra copies, but this way is more efficient. Reference counting is
only applied to tupdescs that are actually in caches --- there seems no need
to use it for tupdescs that are generated in the executor, since they'll go
away during plan shutdown by virtue of being in the per-query memory context.
Neil Conway and Tom Lane
remove the infrastructure needed to enforce the limit, ie, the global
LRU list of cache entries. On small-to-middling databases this wins
because maintaining the LRU list is a waste of time. On large databases
this wins because it's better to keep more cache entries (we assume
such users can afford to use some more per-backend memory than was
contemplated in the Berkeley-era catcache design). This provides a
noticeable improvement in the speed of psql \d on a 10000-table
database, though it doesn't make it instantaneous.
While at it, use per-catcache settings for the number of hash buckets
per catcache, rather than the former one-size-fits-all value. It's a
bit silly to be using the same number of hash buckets for, eg, pg_am
and pg_attribute. The specific values I used might need some tuning,
but they seem to be in the right ballpark based on CATCACHE_STATS
results from the standard regression tests.
the lower-level large object functions fails, it will have already set
a suitable error message --- probably something from the backend ---
and it is not useful to overwrite that with a generic 'error while
reading large object' message. So remove redundant messages.
places --- that risks corrupting data structures, losing sync with the
backend, etc. We now longjmp only from calls to readline, fgets, and
fread, which we assume are coded to protect themselves against interrupts
at undesirable times. This requires adding explicit tests for
cancel_pressed in long-running loops, but on the whole it's far cleaner.
Martijn van Oosterhout and Tom Lane.
function call. Previously, there may have been no CHECK_FOR_INTERRUPTS
at all in the fastpath code path, making it impossible to cancel an
operation such as \lo_import externally. This addition doesn't ensure
you can cancel, since your SIGINT may arrive while the backend is idle
waiting for the client, but it gives the largest window we can easily
provide. Noted while experimenting with new control-C code for psql.
already-aborted transaction block. GetSnapshotData throws an Assert if
not in a valid transaction; hence we mustn't attempt to set a snapshot
for the function until after checking for aborted transaction. This is
harmless AFAICT if Asserts aren't enabled (GetSnapshotData will compute
a bogus snapshot, but it doesn't matter since HandleFunctionRequest will
throw an error shortly anywy). Hence, not a major bug.
Along the way, add some ability to log fastpath calls when statement
logging is turned on. This could probably stand to be improved further,
but not logging anything is clearly undesirable.
Backpatched as far as 8.0; bug doesn't exist before that.
was invoking obj_description() for each large object chunk, instead of once
per large object. This code is new as of 8.1, which may explain why the
problem hadn't been noticed already.
LWLocks during a panic exit. This avoids the possible self-deadlock pointed
out by Qingqing Zhou. Also, I noted that an error during LoadFreeSpaceMap()
or BuildFlatFiles() would result in exit(0) which would leave the postmaster
thinking all is well. Added a critical section to ensure such errors don't
allow startup to proceed.
Backpatched to 8.1. The 8.0 code is a bit different and I'm not sure if the
problem exists there; given we've not seen this reported from the field, I'm
going to be conservative about backpatching any further.
o remove many WIN32_CLIENT_ONLY defines
o add WIN32_ONLY_COMPILER define
o add 3rd argument to open() for portability
o add include/port/win32_msvc directory for
system includes
Magnus Hagander
it is just the total time to do INSTR_TIME_SET_CURRENT(), and not any of
the other code involved in InstrStartNode/InstrStopNode. Even though I
fear we may end up reverting this patch altogether, we may as well have
the most correct version in our CVS archive.
choose_bitmap_and(). It was way too fuzzy --- per comment, it was meant to be
1% relative difference, but was actually coded as 0.01 absolute difference,
thus causing selectivities of say 0.001 and 0.000000000001 to be treated as
equal. I believe this thinko explains Maxim Boguk's recent complaint. While
we could change it to a relative test coded like compare_fuzzy_path_costs(),
there's a bigger problem here, which is that any fuzziness at all renders the
comparison function non-transitive, which could confuse qsort() to the point
of delivering completely wrong results. So forget the whole thing and just
do an exact comparison.
that the Mackert-Lohmann formula applies across all the repetitions of the
nestloop, not just each scan independently. We use the M-L formula to
estimate the number of pages fetched from the index as well as from the table;
that isn't what it was designed for, but it seems reasonably applicable
anyway. This makes large numbers of repetitions look much cheaper than
before, which accords with many reports we've received of overestimation
of the cost of a nestloop. Also, change the index access cost model to
charge random_page_cost per index leaf page touched, while explicitly
not counting anything for access to metapage or upper tree pages. This
may all need tweaking after we get some field experience, but in simple
tests it seems to be giving saner results than before. The main thing
is to get the infrastructure in place to let cost_index() and amcostestimate
functions take repeated scans into account at all. Per my recent proposal.
Note: this patch changes pg_proc.h, but I did not force initdb because
the changes are basically cosmetic --- the system does not look into
pg_proc to decide how to call an index amcostestimate function, and
there's no way to call such a function from SQL at all.
cost_nonsequential_access() is really totally inappropriate for its only
remaining use, namely estimating I/O costs in cost_sort(). The routine
was designed on the assumption that disk caching might eliminate the need
for some re-reads on a random basis, but there's nothing very random in
that sense about sort's access pattern --- it'll always be picking up the
oldest outputs. If we had a good fix on the effective cache size we
might consider charging zero for I/O unless the sort temp file size
exceeds it, but that's probably putting much too much faith in the
parameter. Instead just drop the logic in favor of a fixed compromise
between seq_page_cost and random_page_cost per page of sort I/O.
This shouldn't affect simple indexscans much, while for bitmap scans that
are touching a lot of index rows, this seems to bring the estimates more
in line with reality. Per recent discussion.
assumed that a sequential page fetch has cost 1.0. This patch doesn't
in itself change the system's behavior at all, but it opens the door to
people adopting other units of measurement for EXPLAIN costs. Also, if
we ever decide it's worth inventing per-tablespace access cost settings,
this change provides a workable intellectual framework for that.
for LC_MESSAGES; instead, just press forward, leaving the effective setting
at 'C'. There is not any very good reason to complain when we are going
to replace the value soon with whatever postgresql.conf says. This change
should solve the occasionally-reported problem of initdb failing with
'failed to initialize lc_messages'; the current theory is that that is
a reflection of either wrong LANG/LC_MESSAGES or completely broken locale
support.
and there's only one place that's a kluge, ie, appendStringLiteralConn.
Note that pg_dump itself doesn't use appendStringLiteralConn, so its
behavior is not affected; only the other utility programs care.