be delivered directly to the collector process. The extra process context
swaps required to transfer data through the buffer process seem to outweigh
any value the buffering might have. Per recent discussion and tests.
I modified Bruce's draft patch to use poll() rather than select() where
available (this makes a noticeable difference on my system), and fixed
up the EXEC_BACKEND case.
analyzing, so that future analyze threshold calculations don't get confused.
Also, make sure we correctly track the decrease of live tuples cause by
deletes.
Per report from Dylan Hansen, patches by Tom Lane and me.
changing semantics too much. statement_timestamp is now set immediately
upon receipt of a client command message, and the various places that used
to do their own gettimeofday() calls to mark command startup are referenced
to that instead. I have also made stats_command_string use that same
value for pg_stat_activity.query_start for both the command itself and
its eventual replacement by <IDLE> or <idle in transaction>. There was
some debate about that, but no argument that seemed convincing enough to
justify an extra gettimeofday() call.
current commands; instead, store current-status information in shared
memory. This substantially reduces the overhead of stats_command_string
and also ensures that pg_stat_activity is fully up to date at all times.
Per my recent proposal.
---------------------------------------------------------------------------
Delay write of pg_stats file to once every five minutes, during
shutdown, or when requested by a backend:
It changes so the file is only written once every 5 minutes (changeable
of course, I just picked something) instead of once every half second.
It's still written when the stats collector shuts down, just as before.
And it is now also written on backend request. A backend requests a
rewrite by simply sending a special stats message. It operates on the
assumption that the backends aren't actually going to read the
statistics file very often, compared to how frequent it's written today.
Magnus Hagander
issued by autovacuum. Add accessor functions to them, and use those in the
pg_stat_*_tables system views.
Catalog version bumped due to changes in the pgstat views and the pgstat file.
Patch from Larry Rosenman, minor improvements by me.
shutdown, or when requested by a backend:
It changes so the file is only written once every 5 minutes (changeable
of course, I just picked something) instead of once every half second.
It's still written when the stats collector shuts down, just as before.
And it is now also written on backend request. A backend requests a
rewrite by simply sending a special stats message. It operates on the
assumption that the backends aren't actually going to read the
statistics file very often, compared to how frequent it's written today.
Magnus Hagander
Per recent discussion, this seems to be making the stats less accurate
rather than more so, particularly on Windows where PID values may be
reused very quickly. Patch by Peter Brant.
files: avoid creating stats hashtable entries for tables that aren't being
touched except by vacuum/analyze, ensure that entries for dropped tables are
removed promptly, and tweak the data layout to avoid storing useless struct
padding. Also improve the performance of pgstat_vacuum_tabstat(), and make
sure that autovacuum invokes it exactly once per autovac cycle rather than
multiple times or not at all. This should cure recent complaints about 8.1
showing much higher stats I/O volume than was seen in 8.0. It'd still be a
good idea to revisit the design with an eye to not re-writing the entire
stats dataset every half second ... but that would be too much to backpatch,
I fear.
an LWLock instead of a spinlock. This hardly matters on Unix machines
but should improve startup performance on Windows (or any port using
EXEC_BACKEND). Per previous discussion.
file. The original code probed the PGPROC array separately for each PID,
which was not good for large numbers of backends: not only is the runtime
O(N^2) but most of it is spent holding ProcArrayLock. Instead, take the
lock just once and copy the active PIDs into an array, then use qsort
and bsearch so that the lookup time is more like O(N log N).
comment line where output as too long, and update typedefs for /lib
directory. Also fix case where identifiers were used as variable names
in the backend, but as typedefs in ecpg (favor the backend for
indenting).
Backpatch to 8.1.X.
generated by bitmap index scans. Along the way, simplify and speed up
the code for counting sequential and index scans; it was both confusing
and inefficient to be taking care of that in the per-tuple loops, IMHO.
initdb forced because of internal changes in pg_stat view definitions.
(the stats system has always collected this info, but the views were
filtering it out). Modify autovacuum so that over-threshold activity
in a toast table can trigger a VACUUM of the parent table, even if the
parent didn't appear to need vacuuming itself. Per discussion a month
or so back about "short, wide tables".
delay and limit, both as global GUCs and as table-specific entries in
pg_autovacuum. stats_reset_on_server_start is now OFF by default,
but a reset is forced if we did WAL replay. XID-wrap vacuums do not
ANALYZE, but do FREEZE if it's a template database. Alvaro Herrera
against the PGPROC array. Anything in the file that isn't in PGPROC
gets rejected as being a stale entry. This should solve complaints about
stale entries in pg_stat_activity after a BETERM message has been dropped
due to overload.
exit, instead of trying to take shortcuts. Introduce some additional
shutdown callback routines to eliminate kluges like having ProcKill
be responsible for shutting down the buffer manager. Ensure that the
order of operations during shutdown is predictable and what you would
expect given the module layering.
track shared relations in a separate hashtable, so that operations done
from different databases are counted correctly. Add proper support for
anti-XID-wraparound vacuuming, even in databases that are never connected
to and so have no stats entries. Miscellaneous other bug fixes.
Alvaro Herrera, some additional fixes by Tom Lane.
chdir into PGDATA and subsequently use relative paths instead of absolute
paths to access all files under PGDATA. This seems to give a small
performance improvement, and it should make the system more robust
against naive DBAs doing things like moving a database directory that
has a live postmaster in it. Per recent discussion.
current time: provide a GetCurrentTimestamp() function that returns
current time in the form of a TimestampTz, instead of separate time_t
and microseconds fields. This is what all the callers really want
anyway, and it eliminates low-level dependencies on AbsoluteTime,
which is a deprecated datatype that will have to disappear eventually.
and pg_auth_members. There are still many loose ends to finish in this
patch (no documentation, no regression tests, no pg_dump support for
instance). But I'm going to commit it now anyway so that Alvaro can
make some progress on shared dependencies. The catalog changes should
be pretty much done.
spotted by Qingqing Zhou. The HASH_ENTER action now automatically
fails with elog(ERROR) on out-of-memory --- which incidentally lets
us eliminate duplicate error checks in quite a bunch of places. If
you really need the old return-NULL-on-out-of-memory behavior, you
can ask for HASH_ENTER_NULL. But there is now an Assert in that path
checking that you aren't hoping to get that behavior in a palloc-based
hash table.
Along the way, remove the old HASH_FIND_SAVE/HASH_REMOVE_SAVED actions,
which were not being used anywhere anymore, and were surely too ugly
and unsafe to want to see revived again.
collector messages, per recent discussion on pgsql-patches. This
actually required quite a few changes -- for example,
"databaseid != InvalidOid" was used to check whether a slot in the
backend entry table was initialized, but that no longer works since
the slot might be initialized prior to receiving the BESTART message
which contains the database id. We now use procpid > 0 to indicate
that a slot is non-empty.
Other changes:
- various comment improvements and cleanups
- there's no need to zero-out the entire activity buffer in
pgstat_add_backend(), we can just set activity[0] to '\0'.
- remove the counting of the # of connections to a database; this
was not used anywhere
One change in behavior I wasn't sure about: previously, the code
would create a hash table entry for a database as soon as any message
was received whose header referenced that database. Now, we only
create hash table entries as needed (so for example BESTART won't
create a database hash table entry, since it doesn't need to
access anything in the per-db hash table). It would be easy enough
to retain the old behavior, but AFAICS it is not required.
* Add session start time to pg_stat_activity
* Add the client IP address and port to pg_stat_activity
Original patch from Magnus Hagander, code review by Neil Conway. Catalog
version bumped. This patch sends the client IP address and port number in
every statistics message; that's not ideal, but will be fixed up shortly.
whose keys are OIDs. The only one that looks particularly performance
critical is the relcache hashtable, but as long as we've got the function
we may as well use it wherever it's applicable.
indexes. Replace all heap_openr and index_openr calls by heap_open
and index_open. Remove runtime lookups of catalog OID numbers in
various places. Remove relcache's support for looking up system
catalogs by name. Bulky but mostly very boring patch ...
exit. Without this, operations triggered during backend exit (such as
temp table deletions) won't be counted ... which given heavy usage of
temp tables can lead to pg_autovacuum falling way behind on the need
to vacuum pg_class and pg_attribute. Per reports from Steve Crawford
and others.
even uglier than it was already :-(. Also, on Windows only, use temporary
shared memory segments instead of ordinary files to pass over critical
variable values from postmaster to child processes. Magnus Hagander