PostgreSQL TODO List
|
|
====================
|
|
Current maintainer: Bruce Momjian (pgman@candle.pha.pa.us)
|
|
Last updated: Mon Aug 1 10:13:28 EDT 2005
|
|
|
|
The most recent version of this document can be viewed at
|
|
http://www.postgresql.org/docs/faqs.TODO.html.
|
|
|
|
#A hyphen, "-", marks changes that will appear in the upcoming 8.1 release.#
|
|
|
|
Bracketed items, "[]", have more detail.
|
|
|
|
This list contains all known PostgreSQL bugs and feature requests. If
|
|
you would like to work on an item, please read the Developer's FAQ
|
|
first.
|
|
|
|
|
|
Administration
|
|
==============
|
|
|
|
* Remove behavior of postmaster -o after making postmaster/postgres
|
|
flags unique
|
|
* -Allow limits on per-db/role connections
|
|
* Allow pooled connections to list all prepared queries
|
|
|
|
This would allow an application inheriting a pooled connection to know
|
|
the queries prepared in the current session.
|
|
|
|
* Allow major upgrades without dump/reload, perhaps using pg_upgrade
|
|
[pg_upgrade]
|
|
* Check for unreferenced table files created by transactions that were
|
|
in-progress when the server terminated abruptly
|
|
* Allow administrators to safely terminate individual sessions either
|
|
via an SQL function or SIGTERM
|
|
|
|
Currently SIGTERM of a backend can lead to lock table corruption.
|
|
|
|
* -Prevent dropping user that still owns objects, or auto-drop the objects
|
|
* Set proper permissions on non-system schemas during db creation
|
|
|
|
Currently all schemas are owned by the super-user because they are
|
|
copied from the template1 database.
|
|
|
|
* -Add the client IP address and port to pg_stat_activity
|
|
* Support table partitioning that allows a single table to be stored
|
|
in subtables that are partitioned based on the primary key or a WHERE
|
|
clause
|
|
|
|
|
|
* Improve replication solutions
|
|
|
|
o Load balancing
|
|
|
|
Any of the master/slave replication solutions can provide a
|
|
standby server for data warehousing. To allow read/write queries to
|
|
multiple servers, you need multi-master replication like pgcluster.
|
|
|
|
o Allow replication over unreliable or non-persistent links
|
|
|
|
|
|
* Configuration files
|
|
|
|
o Add "include file" functionality in postgresql.conf
|
|
o Allow postgresql.conf values to be set so they can not be changed
|
|
by the user
|
|
o Allow commenting of variables in postgresql.conf to restore them
|
|
to defaults
|
|
o Allow pg_hba.conf settings to be controlled via SQL
|
|
|
|
This would add a function to load the SQL table from
|
|
pg_hba.conf, and one to write its contents to the flat file.
|
|
The table should have a line number that is a float so rows
|
|
can be inserted between existing rows, e.g. row 2.5 goes
|
|
between row 2 and row 3.
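
The fractional line-number idea can be sketched outside the server. This Python fragment is only an illustration: HbaRow, insert_between, and the sample rules are hypothetical and do not correspond to any existing PostgreSQL table or function.

```python
from dataclasses import dataclass


@dataclass
class HbaRow:
    line: float  # fractional line number, used only for ordering
    rule: str    # text of the pg_hba.conf entry


def insert_between(rows, before, after, rule):
    """Insert a rule midway between two existing line numbers."""
    new_row = HbaRow((before + after) / 2, rule)
    return sorted(rows + [new_row], key=lambda r: r.line)


# Hypothetical table contents loaded from the flat file.
rows = [
    HbaRow(1.0, "local all all trust"),
    HbaRow(2.0, "host all all 127.0.0.1/32 md5"),
    HbaRow(3.0, "host all all 0.0.0.0/0 reject"),
]
rows = insert_between(rows, 2.0, 3.0, "host all all 10.0.0.0/8 md5")
print([r.line for r in rows])  # [1.0, 2.0, 2.5, 3.0]
```

Writing the table back to the flat file would then only need the rows in line-number order; the numbers themselves never have to appear in pg_hba.conf.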
|
|
|
|
o Allow postgresql.conf file values to be changed via an SQL
|
|
API, perhaps using SET GLOBAL
|
|
o Allow the server to be stopped/restarted via an SQL API
|
|
|
|
|
|
* Tablespaces
|
|
|
|
o Allow a database in tablespace t1 with tables created in
|
|
tablespace t2 to be used as a template for a new database created
|
|
with default tablespace t2
|
|
|
|
All objects in the default database tablespace must have default
|
|
tablespace specifications. This is because new databases are
|
|
created by copying directories. If you mix default tablespace
|
|
tables and tablespace-specified tables in the same directory,
|
|
creating a new database from such a mixed directory would create a
|
|
new database with tables that had incorrect explicit tablespaces.
|
|
To fix this would require modifying pg_class in the newly copied
|
|
database, which we don't currently do.
|
|
|
|
o Allow reporting of which objects are in which tablespaces
|
|
|
|
This item is difficult because a tablespace can contain objects
|
|
from multiple databases. There is a server-side function that
|
|
returns the databases which use a specific tablespace, so this
|
|
requires a tool that will call that function and connect to each
|
|
database to find the objects in each database for that tablespace.
|
|
|
|
o Add a GUC variable to control the tablespace for temporary objects
|
|
and sort files
|
|
|
|
It could start with a random tablespace from a supplied list and
|
|
cycle through the list.
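
A minimal sketch of "start at a random tablespace, then cycle", written in Python for illustration only. The function name tablespace_cycle and the tablespace names are made up; no such GUC variable exists in the text above beyond the proposal itself.

```python
import itertools
import random


def tablespace_cycle(tablespaces, rng=random):
    """Yield tablespace names forever, starting at a random position."""
    start = rng.randrange(len(tablespaces))
    rotated = tablespaces[start:] + tablespaces[:start]
    return itertools.cycle(rotated)


temp_spaces = ["ts_a", "ts_b", "ts_c"]  # hypothetical supplied list
chooser = tablespace_cycle(temp_spaces, random.Random(0))
picks = [next(chooser) for _ in range(6)]
print(picks)
```

The random starting point keeps concurrent sessions from all hitting the first tablespace in the list, while the cycle spreads later temporary files evenly.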
|
|
|
|
o Allow WAL replay of CREATE TABLESPACE to work when the directory
|
|
structure on the recovery computer is different from the original
|
|
|
|
o Allow per-tablespace quotas
|
|
|
|
|
|
* Point-In-Time Recovery (PITR)
|
|
|
|
o Allow point-in-time recovery to archive partially filled
|
|
write-ahead logs [pitr]
|
|
|
|
Currently only full WAL files are archived. This means that the
|
|
most recent transactions aren't available for recovery in case
|
|
of a disk failure. This could be triggered by a user command or
|
|
a timer.
|
|
|
|
o Automatically force archiving of partially-filled WAL files when
|
|
pg_stop_backup() is called or the server is stopped
|
|
|
|
Doing this will allow administrators to know more easily when
|
|
the archive contains all the files needed for point-in-time
|
|
recovery.
|
|
|
|
o Create dump tool for write-ahead logs for use in determining
|
|
transaction id for point-in-time recovery
|
|
o Allow a warm standby system to also allow read-only queries
|
|
[pitr]
|
|
|
|
This is useful for checking PITR recovery.
|
|
|
|
o Allow the PITR process to be debugged and data examined
|
|
|
|
|
|
Monitoring
|
|
==========
|
|
|
|
* Allow server log information to be output as INSERT statements
|
|
|
|
This would allow server log information to be easily loaded into
|
|
a database for analysis.
|
|
|
|
* Add ability to monitor the use of temporary sort files
|
|
* -Add session start time and last statement time to pg_stat_activity
|
|
* -Add a function that returns the start time of the postmaster
|
|
* Allow server logs to be remotely read and removed using SQL commands
|
|
|
|
|
|
Data Types
|
|
==========
|
|
|
|
* Remove Money type, add money formatting for decimal type
|
|
* Change NUMERIC to enforce the maximum precision, and increase it
|
|
* Add NUMERIC division operator that doesn't round?
|
|
|
|
Currently NUMERIC _rounds_ the result to the specified precision.
|
|
This means division can return a result that multiplied by the
|
|
divisor is greater than the dividend, e.g. this returns a value > 10:
|
|
|
|
SELECT (10::numeric(2,0) / 6::numeric(2,0))::numeric(2,0) * 6;
|
|
|
|
The positive modulus result returned by NUMERICs might be considered
|
|
inaccurate, in one sense.
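
The rounding effect can be reproduced with ordinary decimal arithmetic. The Python sketch below is not the server's NUMERIC code; numeric_2_0 is a hypothetical helper that imitates a cast to numeric(2,0) by rounding half away from zero.

```python
from decimal import Decimal, ROUND_HALF_UP


def numeric_2_0(value):
    """Round to zero decimal places, imitating a cast to numeric(2,0)."""
    return value.quantize(Decimal("1"), rounding=ROUND_HALF_UP)


dividend = Decimal(10)
divisor = Decimal(6)
quotient = numeric_2_0(dividend / divisor)  # 1.666... rounds up to 2
product = quotient * divisor                # 2 * 6 = 12
print(quotient, product)  # 2 12
```

A truncating division would give a quotient of 1 and a product of 6, which never exceeds the dividend.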
|
|
|
|
* Have sequence dependency track use of DEFAULT sequences,
|
|
seqname.nextval?
|
|
* Disallow changing default expression of a SERIAL column?
|
|
* Fix data types where equality comparison isn't intuitive, e.g. box
|
|
* Prevent INET cast to CIDR if the unmasked bits are not zero, or
|
|
zero the bits
|
|
* Prevent INET cast to CIDR from dropping netmask, SELECT '1.1.1.1'::inet::cidr
|
|
* Allow INET + INT4 to increment the host part of the address, or
|
|
throw an error on overflow
|
|
* Add 'tid != tid ' operator for use in corruption recovery
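
The two choices listed above for INET items (reject or zero the unmasked bits, and increment or fail on overflow) can be illustrated with Python's standard ipaddress module. This is an analogy only; the module's behavior is not PostgreSQL's, and the addresses are arbitrary examples.

```python
import ipaddress

# Choice 1: refuse the conversion when host bits are set (strict is the default).
try:
    ipaddress.ip_network("1.1.1.1/24")
    rejected = False
except ValueError:
    rejected = True

# Choice 2: zero the host bits instead of failing.
zeroed = ipaddress.ip_network("1.1.1.1/24", strict=False)

# Adding an integer moves to a later address ...
next_host = ipaddress.ip_address("10.0.0.1") + 5

# ... and stepping past the last possible address raises an error.
try:
    ipaddress.ip_address("255.255.255.255") + 1
    overflowed = False
except ipaddress.AddressValueError:
    overflowed = True

print(rejected, zeroed, next_host, overflowed)  # True 1.1.1.0/24 10.0.0.6 True
```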
|
|
|
|
|
|
* Dates and Times
|
|
|
|
o Allow infinite dates just like infinite timestamps
|
|
o Add a GUC variable to allow output of interval values in ISO8601
|
|
format
|
|
o Merge hardwired timezone names with the TZ database; allow either
|
|
kind everywhere a TZ name is currently taken
|
|
o Allow customization of the known set of TZ names (generalize the
|
|
present australian_timezones hack)
|
|
o Allow TIMESTAMP WITH TIME ZONE to store the original timezone
|
|
information, either zone name or offset from UTC [timezone]
|
|
|
|
If the TIMESTAMP value is stored with a time zone name, interval
|
|
computations should adjust based on the time zone rules.
|
|
|
|
o Add ISO INTERVAL handling
|
|
o Add support for day-time syntax, INTERVAL '1 2:03:04' DAY TO
|
|
SECOND
|
|
o Add support for year-month syntax, INTERVAL '50-6' YEAR TO MONTH
|
|
o For syntax that isn't uniquely ISO or PG syntax, like '1:30' or
|
|
'1', treat as ISO if there is a range specification clause,
|
|
and as PG if no clause is present, e.g. interpret
|
|
'1:30' MINUTE TO SECOND as '1 minute 30 seconds', and
|
|
interpret '1:30' as '1 hour, 30 minutes'
|
|
o Interpret INTERVAL '1 year' MONTH as CAST (INTERVAL '1 year' AS
|
|
INTERVAL MONTH), and this should return '12 months'
|
|
o Round or truncate values to the requested precision, e.g.
|
|
INTERVAL '11 months' AS YEAR should return one or zero
|
|
o Support precision, CREATE TABLE foo (a INTERVAL MONTH(3))
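
The disambiguation rule for '1:30' can be sketched as follows. This Python fragment is illustrative only: parse_interval is a hypothetical name, it handles just the two-field 'a:b' form, and it knows only the one range clause used in the example above.

```python
from datetime import timedelta


def parse_interval(text, range_clause=None):
    """Read 'a:b' as ISO when a range clause is given, otherwise as PG."""
    first, second = (int(part) for part in text.split(":"))
    if range_clause == "MINUTE TO SECOND":
        # ISO reading: the clause names the fields.
        return timedelta(minutes=first, seconds=second)
    if range_clause is None:
        # PG reading: hours and minutes.
        return timedelta(hours=first, minutes=second)
    raise ValueError(f"unsupported range clause: {range_clause}")


print(parse_interval("1:30", "MINUTE TO SECOND"))  # 0:01:30
print(parse_interval("1:30"))                      # 1:30:00
```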
|
|
|
|
|
|
* Arrays
|
|
|
|
o Allow NULLs in arrays
|
|
o Allow MIN()/MAX() on arrays
|
|
o Delay resolution of array expression's data type so assignment
|
|
coercion can be performed on empty array expressions
|
|
o Modify array literal representation to handle array index lower bound
|
|
of other than one
|
|
|
|
|
|
* Binary Data
|
|
|
|
o Improve vacuum of large objects, like /contrib/vacuumlo?
|
|
o Add security checking for large objects
|
|
|
|
Currently large objects entries do not have owners. Permissions can
|
|
only be set at the pg_largeobject table level.
|
|
|
|
o Auto-delete large objects when referencing row is deleted
|
|
o Allow read/write into TOAST values like large objects
|
|
|
|
This requires the TOAST column to be stored EXTERNAL.
|
|
|
|
|
|
Functions
|
|
=========
|
|
|
|
* -Add function to return compressed length of TOAST data values
|
|
* Allow INET subnet tests using non-constants to be indexed
|
|
* Add transaction_timestamp(), statement_timestamp(), clock_timestamp()
|
|
functionality
|
|
|
|
Currently, CURRENT_TIMESTAMP returns the start time of the current
|
|
transaction, and gettimeofday() returns the wallclock time. This will
|
|
make time reporting more consistent and will allow reporting of
|
|
the statement start time.
|
|
|
|
* Add pg_get_acldef(), pg_get_typedefault(), and pg_get_attrdef()
|
|
* Allow to_char() to print localized month names
|
|
* Allow functions to have a schema search path specified at creation time
|
|
* Allow substring/replace() to get/set bit values
|
|
* Allow to_char() on interval values to accumulate the highest unit
|
|
requested
|
|
|
|
Some special format flag would be required to request such
|
|
accumulation. Such functionality could also be added to EXTRACT.
|
|
Prevent accumulation that crosses the month/day boundary because of
|
|
the uneven number of days in a month.
|
|
|
|
o to_char(INTERVAL '1 hour 5 minutes', 'MI') => 65
|
|
o to_char(INTERVAL '43 hours 20 minutes', 'MI' ) => 2600
|
|
o to_char(INTERVAL '43 hours 20 minutes', 'WK:DD:HR:MI') => 0:1:19:20
|
|
o to_char(INTERVAL '3 years 5 months','MM') => 41
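
The four results above follow from simple unit arithmetic, sketched here in Python. The function names are hypothetical, the format string is assumed to list units from largest to smallest, and months are handled separately so that nothing crosses the month/day boundary.

```python
UNIT_MINUTES = {"WK": 7 * 24 * 60, "DD": 24 * 60, "HR": 60, "MI": 1}


def accumulate_time(hours, minutes, fmt):
    """Spread an hours/minutes interval over the units named in fmt."""
    remaining = hours * 60 + minutes
    fields = []
    for unit in fmt.split(":"):
        count, remaining = divmod(remaining, UNIT_MINUTES[unit])
        fields.append(str(count))
    return ":".join(fields)


def accumulate_months(years, months):
    """Accumulate years into months; days are never converted to months."""
    return str(years * 12 + months)


print(accumulate_time(1, 5, "MI"))             # 65
print(accumulate_time(43, 20, "MI"))           # 2600
print(accumulate_time(43, 20, "WK:DD:HR:MI"))  # 0:1:19:20
print(accumulate_months(3, 5))                 # 41
```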
|
|
|
|
* Prevent to_char() on interval from returning meaningless values
|
|
|
|
For example, to_char('1 month', 'mon') is meaningless. Basically,
|
|
most date-related parameters to to_char() are meaningless for
|
|
intervals because interval is not anchored to a date.
|
|
|
|
|
|
Multi-Language Support
|
|
======================
|
|
|
|
* Add NCHAR (as distinguished from ordinary varchar)
|
|
* Allow locale to be set at database creation
|
|
|
|
Currently locale can only be set during initdb. No global tables have
|
|
locale-aware columns. However, the database template used during
|
|
database creation might have locale-aware indexes. The indexes would
|
|
need to be reindexed to match the new locale.
|
|
|
|
* Allow encoding on a per-column basis
|
|
|
|
Right now only one encoding is allowed per database.
|
|
|
|
* Support multiple simultaneous character sets, per SQL92
|
|
* Improve UTF8 combined character handling?
|
|
* Add octet_length_server() and octet_length_client()
|
|
* Make octet_length_client() the same as octet_length()?
|
|
|
|
|
|
Views / Rules
|
|
=============
|
|
|
|
* Automatically create rules on views so they are updateable, per SQL99
|
|
|
|
We can only auto-create rules for simple views. For more complex
|
|
cases users will still have to write rules.
|
|
|
|
* Add the functionality for WITH CHECK OPTION clause of CREATE VIEW
|
|
* Allow NOTIFY in rules involving conditionals
|
|
* Have views on temporary tables exist in the temporary namespace
|
|
* Allow temporary views on non-temporary tables
|
|
* Allow RULE recompilation
|
|
|
|
|
|
SQL Commands
|
|
============
|
|
|
|
* -Add BETWEEN SYMMETRIC/ASYMMETRIC
|
|
* Change LIMIT/OFFSET and FETCH/MOVE to use int8
|
|
* -Add E'' escape string marker so eventually ordinary strings can treat
|
|
backslashes literally, for portability
|
|
|
|
* -Allow additional tables to be specified in DELETE for joins
|
|
|
|
UPDATE already allows this (UPDATE...FROM) but we need similar
|
|
functionality in DELETE. It's been agreed that the keyword should
|
|
be USING, to avoid anything as confusing as DELETE FROM a FROM b.
|
|
|
|
* Add CORRESPONDING BY to UNION/INTERSECT/EXCEPT
|
|
* -Allow REINDEX to rebuild all database indexes
|
|
* Add ROLLUP, CUBE, GROUPING SETS options to GROUP BY
|
|
* Allow SET CONSTRAINTS to be qualified by schema/table name
|
|
* Allow TRUNCATE ... CASCADE/RESTRICT
|
|
* Add a separate TRUNCATE permission
|
|
|
|
Currently only the owner can TRUNCATE a table because triggers are not
|
|
called, and the table is locked in exclusive mode.
|
|
|
|
* Allow PREPARE of cursors
|
|
* Allow PREPARE to automatically determine parameter types based on the SQL
|
|
statement
|
|
* Allow finer control over the caching of prepared query plans
|
|
|
|
Currently, queries prepared via the libpq API are planned on first
|
|
execute using the supplied parameters --- allow SQL PREPARE to do the
|
|
same. Also, allow control over replanning prepared queries either
|
|
manually or automatically when statistics for execute parameters
|
|
differ dramatically from those used during planning.
|
|
|
|
* Allow LISTEN/NOTIFY to store info in memory rather than tables?
|
|
|
|
Currently LISTEN/NOTIFY information is stored in pg_listener. Storing
|
|
such information in memory would improve performance.
|
|
|
|
* Add optional textual message to NOTIFY
|
|
|
|
This would allow an informational message to be added to the notify
|
|
message, perhaps indicating the row modified or other custom
|
|
information.
|
|
|
|
* Add a GUC variable to warn about non-standard SQL usage in queries
|
|
* Add MERGE command that does UPDATE/DELETE, or on failure, INSERT (rules,
|
|
triggers?)
|
|
* Add NOVICE output level for helpful messages like automatic sequence/index
|
|
creation
|
|
* Add COMMENT ON for all cluster global objects (roles, databases
|
|
and tablespaces)
|
|
* -Add an option to automatically use savepoints for each statement in a
|
|
multi-statement transaction.
|
|
|
|
When enabled, this would allow errors in multi-statement transactions
|
|
to be automatically ignored.
|
|
|
|
* Make row-wise comparisons work per SQL spec
|
|
* Add RESET CONNECTION command to reset all session state
|
|
|
|
This would include resetting of all variables (RESET ALL), dropping of
|
|
temporary tables, removing any NOTIFYs, cursors, open transactions,
|
|
prepared queries, currval()s, etc. This could be used for connection
|
|
pooling. We could also change RESET ALL to have this functionality.
|
|
The difficulty of this feature is allowing RESET ALL to not affect
|
|
changes made by the interface driver for its internal use. One idea
|
|
is for this to be a protocol-only feature. Another approach is to
|
|
notify the protocol when a RESET CONNECTION command is used.
|
|
|
|
* Add GUC to issue notice about queries that use unjoined tables
|
|
* Allow EXPLAIN to identify tables that were skipped because of
|
|
enable_constraint_exclusion
|
|
* Allow EXPLAIN output to be more easily processed by scripts
|
|
|
|
|
|
* CREATE
|
|
|
|
o Allow CREATE TABLE AS to determine column lengths for complex
|
|
expressions like SELECT col1 || col2
|
|
|
|
o Use more reliable method for CREATE DATABASE to get a consistent
|
|
copy of db?
|
|
|
|
Currently the system uses the operating system COPY command to
|
|
create a new database.
|
|
|
|
o Add ON COMMIT capability to CREATE TABLE AS SELECT
|
|
|
|
|
|
* UPDATE
|
|
o Allow UPDATE to handle complex aggregates [update]?
|
|
o Allow an alias to be provided for the target table in
|
|
UPDATE/DELETE
|
|
|
|
This is not SQL-spec but many DBMSs allow it.
|
|
|
|
o Allow UPDATE tab SET ROW (col, ...) = (...) for updating multiple
|
|
columns
|
|
o Allow FOR UPDATE queries to do NOWAIT locks
|
|
|
|
|
|
* ALTER
|
|
|
|
o Have ALTER TABLE RENAME rename SERIAL sequence names
|
|
o Add ALTER DOMAIN TYPE
|
|
o Allow ALTER TABLE ... ALTER CONSTRAINT ... RENAME
|
|
o Allow ALTER TABLE to change constraint deferrability and actions
|
|
o Disallow dropping of an inherited constraint
|
|
o -Allow objects to be moved to different schemas
|
|
o Allow ALTER TABLESPACE to move to different directories
|
|
o Allow databases to be moved to different tablespaces
|
|
o Allow moving system tables to other tablespaces, where possible
|
|
|
|
Currently non-global system tables must be in the default database
|
|
tablespace. Global system tables can never be moved.
|
|
|
|
o Prevent child tables from altering constraints like CHECK that were
|
|
inherited from the parent table
|
|
|
|
|
|
* CLUSTER
|
|
|
|
o Automatically maintain clustering on a table
|
|
|
|
This might require some background daemon to maintain clustering
|
|
during periods of low usage. It might also require tables to be only
|
|
partially filled for easier reorganization. Another idea would
|
|
be to create a merged heap/index data file so an index lookup would
|
|
automatically access the heap data too. A third idea would be to
|
|
store heap rows in hashed groups, perhaps using a user-supplied
|
|
hash function.
|
|
|
|
o Add default clustering to system tables
|
|
|
|
To do this, determine the ideal cluster index for each system
|
|
table and set the cluster setting during initdb.
|
|
|
|
|
|
* COPY
|
|
|
|
o Allow COPY to report error lines and continue
|
|
|
|
This requires the use of a savepoint before each COPY line is
|
|
processed, with ROLLBACK on COPY failure.
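
The savepoint-per-line approach can be tried with any engine that supports savepoints. The Python sketch below uses the standard sqlite3 module purely as a stand-in; the table t, the function load_lines, and the comma-separated input are assumptions for illustration and say nothing about how COPY itself would be changed.

```python
import sqlite3


def load_lines(conn, lines):
    """Insert each line under its own savepoint; collect rejected lines."""
    errors = []
    conn.execute("BEGIN")
    for lineno, line in enumerate(lines, start=1):
        conn.execute("SAVEPOINT copy_line")
        try:
            ident, name = line.split(",")
            conn.execute(
                "INSERT INTO t (id, name) VALUES (?, ?)", (int(ident), name)
            )
        except (ValueError, sqlite3.Error) as exc:
            # Undo only this line, report it, and keep going.
            conn.execute("ROLLBACK TO SAVEPOINT copy_line")
            errors.append((lineno, str(exc)))
        conn.execute("RELEASE SAVEPOINT copy_line")
    conn.execute("COMMIT")
    return errors


# isolation_level=None leaves transaction control to the explicit statements.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")
errors = load_lines(conn, ["1,alice", "oops", "1,duplicate", "2,bob"])
loaded = conn.execute("SELECT id, name FROM t ORDER BY id").fetchall()
print(loaded)                       # [(1, 'alice'), (2, 'bob')]
print([n for n, _ in errors])       # [2, 3]
```

The cost is one savepoint per input line, which is the overhead this item would have to weigh against aborting the whole load.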
|
|
|
|
o -Allow COPY to understand \x as a hex byte
|
|
o Have COPY return the number of rows loaded/unloaded?
|
|
o -Allow COPY to optionally include column headings in the first line
|
|
o -Allow COPY FROM ... CSV to interpret newlines and carriage
|
|
returns in data
|
|
|
|
|
|
* GRANT/REVOKE
|
|
|
|
o Allow column-level privileges
|
|
o Allow GRANT/REVOKE permissions to be applied to all schema objects
|
|
with one command
|
|
|
|
The proposed syntax is:
|
|
GRANT SELECT ON ALL TABLES IN public TO phpuser;
|
|
GRANT SELECT ON NEW TABLES IN public TO phpuser;
|
|
|
|
o Allow GRANT/REVOKE permissions to be inherited by objects based on
|
|
schema permissions
|
|
|
|
|
|
* CURSOR
|
|
|
|
o Allow UPDATE/DELETE WHERE CURRENT OF cursor
|
|
|
|
This requires using the row ctid to map cursor rows back to the
|
|
original heap row. This becomes more complicated if WITH HOLD cursors
|
|
are to be supported because WITH HOLD cursors have a copy of the row
|
|
and no FOR UPDATE lock.
|
|
|
|
o Prevent DROP TABLE from dropping a row referenced by its own open
|
|
cursor?
|
|
|
|
o Allow pooled connections to list all open WITH HOLD cursors
|
|
|
|
Because WITH HOLD cursors exist outside transactions, this allows
|
|
them to be listed so they can be closed.
|
|
|
|
|
|
* INSERT
|
|
|
|
o Allow INSERT/UPDATE of the system-generated oid value for a row
|
|
o Allow INSERT INTO tab (col1, ..) VALUES (val1, ..), (val2, ..)
|
|
o Allow INSERT/UPDATE ... RETURNING new.col or old.col
|
|
|
|
This is useful for returning the auto-generated key for an INSERT.
|
|
One complication is how to handle rules that run as part of
|
|
the insert.
|
|
|
|
|
|
* SHOW/SET
|
|
|
|
o -Have SHOW ALL show descriptions for server-side variables
|
|
o Add SET PERFORMANCE_TIPS option to suggest INDEX, VACUUM, VACUUM
|
|
ANALYZE, and CLUSTER
|
|
o Add SET PATH for schemas?
|
|
|
|
This is basically the same as SET search_path.
|
|
|
|
|
|
* Server-Side Languages
|
|
|
|
o -Allow PL/pgSQL's RAISE function to take expressions
|
|
|
|
Currently only constants are supported.
|
|
|
|
o -Change PL/pgSQL to use palloc() instead of malloc()
|
|
o Handle references to temporary tables that are created, destroyed,
|
|
then recreated during a session, and EXECUTE is not used
|
|
|
|
This requires the cached PL/pgSQL byte code to be invalidated when
|
|
an object referenced in the function is changed.
|
|
|
|
o Fix PL/pgSQL RENAME to work on variables other than OLD/NEW
|
|
o Allow function parameters to be passed by name,
|
|
get_employee_salary(emp_id => 12345, tax_year => 2001)
|
|
o Add Oracle-style packages
|
|
o Add table function support to pltcl, plperl, plpython?
|
|
o Allow PL/pgSQL to name columns by ordinal position, e.g. rec.(3)
|
|
o -Allow PL/pgSQL EXECUTE query_var INTO record_var;
|
|
o Add capability to create and call PROCEDURES
|
|
o Allow PL/pgSQL to handle %TYPE arrays, e.g. tab.col%TYPE[]
|
|
o Add MOVE to PL/pgSQL
|
|
o Pass arrays natively instead of as text between plperl and postgres
|
|
o Add support for polymorphic arguments and return types to plperl
|
|
|
|
|
|
Clients
|
|
=======
|
|
|
|
* Add a libpq function to support Parse/DescribeStatement capability
|
|
* Prevent libpq's PQfnumber() from lowercasing the column name?
|
|
* Allow libpq to access SQLSTATE so pg_ctl can test for connection failure
|
|
|
|
This would be used for checking if the server is up.
|
|
|
|
* Add PQescapeIdentifier() to libpq
|
|
* Have initdb set DateStyle based on locale?
|
|
* Have pg_ctl look at PGHOST in case it is a socket directory?
|
|
* Add a schema option to createlang
|
|
* Allow pg_ctl to work properly with configuration files located outside
|
|
the PGDATA directory
|
|
|
|
pg_ctl can not read the pid file because it isn't located in the
|
|
config directory but in the PGDATA directory. The solution is to
|
|
allow pg_ctl to read and understand postgresql.conf to find the
|
|
data_directory value.
|
|
|
|
|
|
* psql
|
|
|
|
o Have psql show current values for a sequence
|
|
o Move psql backslash database information into the backend, use
|
|
mnemonic commands? [psql]
|
|
|
|
This would allow non-psql clients to pull the same information out
|
|
of the database as psql.
|
|
|
|
o Fix psql's display of schema information (Neil)
|
|
o Allow psql \pset boolean variables to set to fixed values, rather
|
|
than toggle
|
|
o Consistently display privilege information for all objects in psql
|
|
o Improve psql's handling of multi-line queries
|
|
|
|
|
|
* pg_dump
|
|
|
|
o Have pg_dump use multi-statement transactions for INSERT dumps
|
|
o Allow pg_dump to use multiple -t and -n switches [pg_dump]
|
|
o Add dumping of comments on composite type columns
|
|
o Add dumping of comments on index columns
|
|
o Replace crude DELETE FROM method of pg_dumpall --clean for
|
|
cleaning of roles with separate DROP commands
|
|
o -Add dumping and restoring of LOB comments
|
|
o Stop dumping CASCADE on DROP TYPE commands in clean mode
|
|
o Add full object name to the tag field, e.g. for operators we need
|
|
'=(integer, integer)', instead of just '='.
|
|
o Add pg_dumpall custom format dumps.
|
|
|
|
This is probably best done by combining pg_dump and pg_dumpall
|
|
into a single binary.
|
|
|
|
o Add CSV output format
|
|
o Update pg_dump and psql to use the new COPY libpq API (Christopher)
|
|
|
|
|
|
* ecpg
|
|
|
|
o Docs
|
|
|
|
Document differences between ecpg and the SQL standard and
|
|
information about the Informix-compatibility module.
|
|
|
|
o Solve cardinality > 1 for input descriptors / variables?
|
|
o Add a semantic check level, e.g. check if a table really exists
|
|
o Fix handling of DB attributes that are arrays
|
|
o Use backend PREPARE/EXECUTE facility for ecpg where possible
|
|
o Implement SQLDA
|
|
o Fix nested C comments
|
|
o sqlwarn[6] should be 'W' if the PRECISION or SCALE value specified
|
|
o Make SET CONNECTION thread-aware, non-standard?
|
|
o Allow multidimensional arrays
|
|
o Add internationalized message strings
|
|
|
|
|
|
Referential Integrity
|
|
=====================
|
|
|
|
* Add MATCH PARTIAL referential integrity
|
|
* Add deferred trigger queue file
|
|
|
|
Right now all deferred trigger information is stored in backend
|
|
memory. This could exhaust memory for very large trigger queues.
|
|
This item involves dumping large queues into files.
|
|
|
|
* -Implement shared row locks and use them in RI triggers
|
|
* Change foreign key constraint for array -> element to mean element
|
|
in array?
|
|
* Allow DEFERRABLE UNIQUE constraints?
|
|
* Allow triggers to be disabled [trigger]
|
|
|
|
Currently the only way to disable triggers is to modify the system
|
|
tables.
|
|
|
|
* With disabled triggers, allow pg_dump to use ALTER TABLE ADD FOREIGN KEY
|
|
|
|
If the dump is known to be valid, allow foreign keys to be added
|
|
without revalidating the data.
|
|
|
|
* Allow statement-level triggers to access modified rows
|
|
* Support triggers on columns (Greg Sabino Mullane)
|
|
* Remove CREATE CONSTRAINT TRIGGER
|
|
|
|
This was used in older releases to dump referential integrity
|
|
constraints.
|
|
|
|
* Enforce referential integrity for system tables
|
|
* Allow AFTER triggers on system tables
|
|
|
|
System tables are modified in many places in the backend without going
|
|
through the executor and therefore not causing triggers to fire. To
|
|
complete this item, the functions that modify system tables will have
|
|
to fire triggers.
|
|
|
|
|
|
Dependency Checking
|
|
===================
|
|
|
|
* Flush cached query plans when the dependent objects change
|
|
* Track dependencies in function bodies and recompile/invalidate
|
|
|
|
|
|
Exotic Features
|
|
===============
|
|
|
|
* Add SQL99 WITH clause to SELECT
|
|
* Add SQL99 WITH RECURSIVE to SELECT
|
|
* Add pre-parsing phase that converts non-ISO syntax to supported
|
|
syntax
|
|
|
|
This could allow SQL written for other databases to run without
|
|
modification.
|
|
|
|
* Allow plug-in modules to emulate features from other databases
|
|
* SQL*Net listener that makes PostgreSQL appear as an Oracle database
|
|
to clients
|
|
* Allow queries across databases or servers with transaction
|
|
semantics
|
|
|
|
This can be done using dblink and two-phase commit.
|
|
|
|
* -Add two-phase commit
|
|
|
|
|
|
* Add the features of packages
|
|
|
|
o Make private objects accessible only to objects in the same schema
|
|
o Allow current_schema.objname to access current schema objects
|
|
o Add session variables
|
|
o Allow nested schemas
|
|
|
|
|
|
Indexes
|
|
=======
|
|
|
|
* Allow inherited tables to inherit index, UNIQUE constraint, and primary
|
|
key, foreign key
|
|
* UNIQUE INDEX on base column not honored on INSERTs/UPDATEs from
|
|
inherited table: INSERT INTO inherit_table (unique_index_col) VALUES
|
|
(dup) should fail
|
|
|
|
The main difficulty with this item is the problem of creating an index
|
|
that can span more than one table.
|
|
|
|
* Allow SELECT ... FOR UPDATE on inherited tables
|
|
* Prevent inherited tables from expanding temporary subtables of other
|
|
sessions
|
|
* Add UNIQUE capability to non-btree indexes
|
|
* -Use indexes for MIN() and MAX()
|
|
|
|
MIN/MAX queries can already be rewritten as SELECT col FROM tab ORDER
|
|
BY col {DESC} LIMIT 1. Completing this item involves doing this
|
|
transformation automatically.
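
The manual rewrite can be checked on a small table. The Python sketch below runs against SQLite through the standard sqlite3 module, not against PostgreSQL, and the table contents are invented. It also adds a col IS NOT NULL filter, an assumption made here because MIN() and MAX() ignore NULLs while a plain ORDER BY may return one first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (col INTEGER)")
conn.executemany("INSERT INTO tab VALUES (?)", [(7,), (3,), (None,), (9,)])
conn.execute("CREATE INDEX tab_col_idx ON tab (col)")


def scalar(sql):
    """Run a query and return the single value it produces."""
    return conn.execute(sql).fetchone()[0]


min_agg = scalar("SELECT MIN(col) FROM tab")
min_rewritten = scalar(
    "SELECT col FROM tab WHERE col IS NOT NULL ORDER BY col LIMIT 1"
)
max_agg = scalar("SELECT MAX(col) FROM tab")
max_rewritten = scalar(
    "SELECT col FROM tab WHERE col IS NOT NULL ORDER BY col DESC LIMIT 1"
)
print(min_agg, min_rewritten, max_agg, max_rewritten)  # 3 3 9 9
```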
|
|
|
|
* -Use index to restrict rows returned by multi-key index when used with
|
|
non-consecutive keys to reduce heap accesses
|
|
|
|
For an index on col1,col2,col3, and a WHERE clause of col1 = 5 and
|
|
col3 = 9, spin through the index checking for col1 and col3 matches,
|
|
rather than just col1; also called skip-scanning.
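
A toy model of the idea, in Python: the index is a sorted list of (col1, col2, col3, heap_row) tuples, and col3 is tested while walking the index so that only matching entries lead to a heap fetch. The data and the name matching_rows are hypothetical, and a real btree is of course not a Python list.

```python
import bisect

# Sorted index entries: (col1, col2, col3, heap_row).
index = sorted([
    (4, 1, 9, 100),
    (5, 1, 9, 101),
    (5, 2, 3, 102),
    (5, 7, 9, 103),
    (6, 0, 9, 104),
])


def matching_rows(index, col1, col3):
    """Return heap rows where col1 and col3 match, plus entries scanned."""
    lo = bisect.bisect_left(index, (col1,))      # first entry with this col1
    hi = bisect.bisect_left(index, (col1 + 1,))  # first entry past this col1
    heap_rows = [row for (_, _, c3, row) in index[lo:hi] if c3 == col3]
    return heap_rows, hi - lo


heap_rows, scanned = matching_rows(index, col1=5, col3=9)
print(heap_rows, scanned)  # [101, 103] 3
```

Three index entries are examined but only two heap rows are fetched; checking col1 alone would have fetched all three.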
|
|
|
|
* Prevent index uniqueness checks when UPDATE does not modify the column
|
|
|
|
Uniqueness (index) checks are done when updating a column even if the
|
|
column is not modified by the UPDATE.
|
|
|
|
* Fetch heap pages matching index entries in sequential order
|
|
|
|
Rather than randomly accessing heap pages based on index entries, mark
|
|
heap pages needing access in a bitmap and do the lookups in sequential
|
|
order. Another method would be to sort heap ctids matching the index
|
|
before accessing the heap rows.
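
The second method, sorting the ctids before visiting the heap, can be sketched in a few lines of Python. TIDs are modeled as (page, offset) pairs and the sample values are invented.

```python
def heap_fetch_order(index_tids):
    """Sort (page, offset) TIDs so heap pages are visited in ascending order."""
    return sorted(index_tids)


def page_visits(tids):
    """Collapse consecutive TIDs on the same page into one page visit."""
    visits = []
    for page, _ in tids:
        if not visits or visits[-1] != page:
            visits.append(page)
    return visits


index_order = [(9, 1), (2, 4), (9, 3), (2, 1), (5, 2)]  # order found in the index
print(page_visits(index_order))                    # [9, 2, 9, 2, 5]
print(page_visits(heap_fetch_order(index_order)))  # [2, 5, 9]
```

In index order pages 9 and 2 are each visited twice; after sorting, every page is read once and in ascending order.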
|
|
|
|
* -Allow non-bitmap indexes to be combined by creating bitmaps in memory
|
|
|
|
This feature allows separate indexes to be ANDed or ORed together. This
|
|
is particularly useful for data warehousing applications that need to
|
|
query the database in many permutations. This feature scans an index
|
|
and creates an in-memory bitmap, and allows that bitmap to be combined
|
|
with other bitmaps created in a similar way. The bitmap can either index
|
|
all TIDs, or be lossy, meaning it records just page numbers and each
|
|
page tuple has to be checked for validity in a separate pass.
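
The difference between exact and lossy bitmaps can be shown with Python sets standing in for the in-memory bitmaps. This is a conceptual sketch under that assumption; the TIDs are invented and nothing here reflects the server's actual data structures.

```python
def exact_bitmap(tids):
    """One entry per (page, offset) TID."""
    return set(tids)


def lossy_bitmap(tids):
    """Page numbers only; every tuple on these pages must be rechecked."""
    return {page for page, _ in tids}


idx_a = [(1, 2), (1, 5), (3, 1), (8, 4)]  # TIDs from a scan of index A
idx_b = [(1, 5), (2, 7), (8, 4), (9, 9)]  # TIDs from a scan of index B

both = exact_bitmap(idx_a) & exact_bitmap(idx_b)         # AND of the indexes
either = exact_bitmap(idx_a) | exact_bitmap(idx_b)       # OR of the indexes
recheck_pages = lossy_bitmap(idx_a) & lossy_bitmap(idx_b)

print(sorted(both))           # [(1, 5), (8, 4)]
print(len(either))            # 6
print(sorted(recheck_pages))  # [1, 8]
```

The lossy form is smaller but only narrows the search to pages 1 and 8; the conditions still have to be re-evaluated for each tuple on those pages.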
|
|
|
|
* Allow the creation of on-disk bitmap indexes which can be quickly
|
|
combined with other bitmap indexes
|
|
|
|
Such indexes could be more compact if there are only a few distinct values.
|
|
Such indexes can also be compressed. Keeping such indexes updated can be
|
|
costly.
|
|
|
|
* Allow use of indexes to search for NULLs
|
|
|
|
One solution is to create a partial index on an IS NULL expression.
|
|
|
|
* Allow accurate statistics to be collected on indexes with more than
|
|
one column or expression indexes, perhaps using per-index statistics
|
|
* Add fillfactor to control reserved free space during index creation
|
|
* Allow the creation of indexes with mixed ascending/descending specifiers
|
|
* -Fix incorrect rtree results due to wrong assumptions about "over"
|
|
operator semantics
|
|
* Allow enable_constraint_exclusion to work for UNIONs like it does for
|
|
inheritance
|
|
* Allow enable_constraint_exclusion to work for UPDATE and DELETE queries
|
|
|
|
|
|
* GIST
|
|
|
|
o Add more GIST index support for geometric data types
|
|
o -Add concurrency to GIST
|
|
o Allow GIST indexes to create certain complex index types, like
|
|
digital trees (see Aoki)
|
|
|
|
* Hash
|
|
|
|
o Pack hash index buckets onto disk pages more efficiently
|
|
|
|
Currently only one hash bucket can be stored on a page. Ideally
|
|
several hash buckets could be stored on a single page and greater
|
|
granularity used for the hash algorithm.
|
|
|
|
o Consider sorting hash buckets so entries can be found using a
|
|
binary search, rather than a linear scan
|
|
|
|
o In hash indexes, consider storing the hash value with or instead
|
|
of the key itself
|
|
|
|
|
|
Fsync
|
|
=====
|
|
|
|
* Improve commit_delay handling to reduce fsync()
|
|
* Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options
|
|
* Allow multiple blocks to be written to WAL with one write()
|
|
* Add an option to sync() before fsync()'ing checkpoint files
|
|
* Add program to test if fsync has a delay compared to non-fsync
|
|
|
|
|
|
Cache Usage
|
|
===========
|
|
|
|
* Allow free-behind capability for large sequential scans, perhaps using
|
|
posix_fadvise()
|
|
|
|
Posix_fadvise() can control both sequential/random file caching and
|
|
free-behind behavior, but it is unclear how the setting affects other
|
|
backends that also have the file open, and the feature is not supported
|
|
on all operating systems.
|
|
|
|
* -Consider use of open/fcntl(O_DIRECT) to minimize OS caching,
|
|
for WAL writes
|
|
|
|
O_DIRECT doesn't have the same media write guarantees as fsync, so it
|
|
is in addition to the fsync method, not in place of it.
|
|
|
|
* -Cache last known per-tuple offsets to speed long tuple access
|
|
* Speed up COUNT(*)
|
|
|
|
We could use a fixed row count and a +/- count to follow MVCC
|
|
visibility rules, or a single cached value could be used and
|
|
invalidated if anyone modifies the table. Another idea is to
|
|
get a count directly from a unique index, but for this to be
|
|
faster than a sequential scan it must avoid access to the heap
|
|
to obtain tuple visibility information.
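
The "single cached value" variant can be sketched as follows. This toy class deliberately ignores MVCC (every caller sees the same rows), so it only shows the invalidate-on-modify idea; the class and attribute names are hypothetical.

```python
class CountCache:
    """Toy table that caches its row count and invalidates it on any change."""

    def __init__(self, rows=()):
        self.rows = list(rows)
        self._cached = None
        self.scans = 0  # number of full scans actually performed

    def count(self):
        if self._cached is None:
            self.scans += 1
            self._cached = sum(1 for _ in self.rows)  # stands in for a sequential scan
        return self._cached

    def insert(self, row):
        self.rows.append(row)
        self._cached = None  # any modification invalidates the cached count

    def delete(self, row):
        self.rows.remove(row)
        self._cached = None
```

Repeated count() calls on an unmodified table perform one scan; the next scan happens only after an insert or delete.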
|
|
|
|
* Allow data to be pulled directly from indexes
|
|
|
|
Currently indexes do not have enough tuple visibility information
|
|
to allow data to be pulled from the index without also accessing
|
|
the heap. One way to allow this is to set a bit on index tuples
|
|
to indicate if a tuple is currently visible to all transactions
|
|
when the first valid heap lookup happens. This bit would have to
|
|
be cleared when a heap tuple is expired.
|
|
|
|
|
|
* Consider automatic caching of queries at various levels:
|
|
|
|
o Parsed query tree
|
|
o Query execution plan
|
|
o Query results
|
|
|
|
* -Allow the size of the buffer cache used by temporary objects to be
|
|
specified as a GUC variable
|
|
|
|
Larger local buffer cache sizes require more efficient handling of
|
|
local cache lookups.
|
|
|
|
* Improve the background writer
|
|
|
|
Allow the background writer to more efficiently write dirty buffers
|
|
from the end of the LRU cache and use a clock sweep algorithm to
|
|
write other dirty buffers to reduce checkpoint I/O
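
A clock sweep over dirty buffers could look roughly like the sketch below. The usage-count cap of 5, the class name, and the max_writes limit are assumptions for illustration, not the background writer's actual data structures.

```python
class ClockSweep:
    """Toy buffer pool: each buffer has a usage count and a dirty flag."""

    def __init__(self, n_buffers):
        self.usage = [0] * n_buffers
        self.dirty = [False] * n_buffers
        self.hand = 0

    def touch(self, buf, make_dirty=False):
        """Record an access to a buffer, optionally dirtying it."""
        self.usage[buf] = min(self.usage[buf] + 1, 5)
        self.dirty[buf] = self.dirty[buf] or make_dirty

    def sweep(self, max_writes):
        """Advance the hand at most once around the pool.

        Recently used buffers have their usage count decremented; dirty
        buffers whose count has already decayed to zero are written out.
        """
        written = []
        for _ in range(len(self.usage)):
            if len(written) >= max_writes:
                break
            buf = self.hand
            self.hand = (self.hand + 1) % len(self.usage)
            if self.usage[buf] > 0:
                self.usage[buf] -= 1
            elif self.dirty[buf]:
                self.dirty[buf] = False  # stands in for writing the page to disk
                written.append(buf)
        return written
```

Buffers that keep being touched stay in memory and are written later, while cold dirty buffers are cleaned ahead of the checkpoint.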
|
|
|
|
* Allow sequential scans to take advantage of other concurrent
|
|
sequential scans, also called "Synchronised Scanning"
|
|
|
|
One possible implementation is to start sequential scans from the lowest
|
|
numbered buffer in the shared cache, and when reaching the end wrap
|
|
around to the beginning, rather than always starting sequential scans
|
|
at the start of the table.
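
The wrap-around order described above is simple to sketch. How start_block is chosen (for example, the lowest-numbered block of the table already in the shared cache) is left as an assumption outside this function.

```python
def scan_order(n_blocks, start_block):
    """Yield every block number exactly once, starting at start_block and wrapping."""
    for i in range(n_blocks):
        yield (start_block + i) % n_blocks
```

For a five-block table with a starting position of 3, list(scan_order(5, 3)) gives [3, 4, 0, 1, 2], so the scan still visits every block.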
|
|
|
|
|
|
Vacuum
|
|
======
|
|
|
|
* Improve speed with indexes
|
|
|
|
For large table adjustments during vacuum, it is faster to reindex
|
|
rather than update the index.
|
|
|
|
* Reduce lock time by moving tuples with read lock, then write
|
|
lock and truncate table
|
|
|
|
Moved tuples are invisible to other backends so they don't require a
|
|
write lock. However, the read lock promotion to write lock could lead
|
|
to deadlock situations.
|
|
|
|
* -Add a warning when the free space map is too small
|
|
* Maintain a map of recently-expired rows
|
|
|
|
This allows vacuum to target specific pages for possible free space
|
|
without requiring a sequential scan.
|
|
|
|
* Auto-fill the free space map by scanning the buffer cache or by
|
|
checking pages written by the background writer
|
|
* Create a bitmap of pages that need vacuuming
|
|
|
|
Instead of sequentially scanning the entire table, have the background
|
|
writer or some other process record pages that have expired rows, then
|
|
VACUUM can look at just those pages rather than the entire table. In
|
|
the event of a system crash, the bitmap would probably be invalidated.
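
A page bitmap of this kind might be sketched as below, with one bit per heap page. The class and method names are hypothetical, and the sketch keeps the bitmap only in memory, which matches the note that it would probably be invalidated after a crash.

```python
class VacuumBitmap:
    """One bit per page; a set bit means the page may contain expired rows."""

    def __init__(self, n_pages):
        self.n_pages = n_pages
        self.bits = bytearray((n_pages + 7) // 8)

    def mark(self, page):
        """Record that a page has expired rows (idempotent)."""
        self.bits[page // 8] |= 1 << (page % 8)

    def pages_to_vacuum(self):
        """Return the marked page numbers in ascending order."""
        return [p for p in range(self.n_pages)
                if self.bits[p // 8] & (1 << (p % 8))]

    def clear(self):
        """Reset the bitmap, e.g. after a vacuum pass or a crash."""
        self.bits = bytearray(len(self.bits))
```

VACUUM would then visit only pages_to_vacuum() instead of scanning the whole table.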
|
|
|
|
* Add system view to show free space map contents
|
|
|
|
|
|
* Auto-vacuum
|
|
|
|
o -Move into the backend code
|
|
o Use free-space map information to guide refilling
|
|
o Do VACUUM FULL if table is nearly empty?
|
|
o Improve xid wraparound detection by recording per-table rather
|
|
than per-database
|
|
|
|
|
|
Locking
|
|
=======
|
|
|
|
* -Make locking of shared data structures more fine-grained
|
|
|
|
This requires that more locks be acquired but this would reduce lock
|
|
contention, improving concurrency.
|
|
|
|
* Add code to detect an SMP machine and handle spinlocks accordingly
|
|
from distributed.net, http://www1.distributed.net/source,
|
|
in client/common/cpucheck.cpp
|
|
|
|
On SMP machines, it is possible that locks might be released shortly,
|
|
while on non-SMP machines, the backend should sleep so the process
|
|
holding the lock can complete and release it.
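
The policy above (spin on SMP, sleep on a uniprocessor) can be illustrated with a small sketch. It uses os.cpu_count() as a stand-in for SMP detection and a threading.Lock as a stand-in for a spinlock; the spin count, sleep interval, and function name are assumptions.

```python
import os
import threading
import time


def acquire(lock, spins_per_try=None, sleep_s=0.001, max_sleeps=1000):
    """Try to take the lock, spinning only when more than one CPU is present."""
    if spins_per_try is None:
        # On a single CPU the holder cannot run while we spin, so do not spin.
        spins_per_try = 100 if (os.cpu_count() or 1) > 1 else 0
    for _ in range(max_sleeps):
        for _ in range(spins_per_try + 1):
            if lock.acquire(blocking=False):
                return True
        time.sleep(sleep_s)  # let the lock holder run and release the lock
    return False
```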
|
|
|
|
* -Improve SMP performance on i386 machines
|
|
|
|
i386-based SMP machines can generate excessive context switching
|
|
caused by lock failure in high concurrency situations. This may be
|
|
caused by CPU cache line invalidation inefficiencies.
|
|
|
|
* Research use of sched_yield() for spinlock acquisition failure
|
|
* Fix priority ordering of read and write light-weight locks (Neil)
|
|
|
|
|
|
Startup Time Improvements
|
|
=========================
|
|
|
|
* Experiment with multi-threaded backend [thread]
|
|
|
|
This would prevent the overhead associated with process creation. Most
|
|
operating systems have trivial process creation time compared to
|
|
database startup overhead, but a few operating systems (Win32,
|
|
Solaris) might benefit from threading. Also explore the idea of
|
|
a single session using multiple threads to execute a query faster.
|
|
|
|
* Add connection pooling
|
|
|
|
It is unclear if this should be done inside the backend code or done
|
|
by something external like pgpool. The passing of file descriptors to
|
|
existing backends is one of the difficulties with a backend approach.
|
|
|
|
|
|
Write-Ahead Log
|
|
===============
|
|
|
|
* Eliminate need to write full pages to WAL before page modification [wal]
|
|
|
|
Currently, to protect against partial disk page writes, we write
|
|
full page images to WAL before they are modified so we can correct any
|
|
partial page writes during recovery. These pages can also be
|
|
eliminated from point-in-time archive files.
|
|
|
|
o -Add ability to turn off full page writes
|
|
o When off, write CRC to WAL and check file system blocks
|
|
on recovery
|
|
|
|
If CRC check fails during recovery, remember the page in case
|
|
a later CRC for that page properly matches.
|
|
|
|
o Write full pages during file system write and not when
|
|
the page is modified in the buffer cache
|
|
|
|
This allows most full page writes to happen in the background
|
|
writer. It might cause problems for applying WAL on recovery
|
|
into a partially-written page, but later the full page will be
|
|
replaced from WAL.
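
The CRC check in the second sub-item can be sketched as follows. It assumes an 8192-byte page and uses zlib.crc32 as a stand-in for whatever CRC the WAL would record; the function names are hypothetical.

```python
import zlib

PAGE_SIZE = 8192  # assumed page size for this sketch


def page_crc(page):
    """CRC of a full page image, as it would be recorded in WAL."""
    return zlib.crc32(page) & 0xFFFFFFFF


def is_torn(page_on_disk, crc_from_wal):
    """True if the on-disk page does not match the CRC recorded in WAL."""
    return page_crc(page_on_disk) != crc_from_wal


# Example: the new page image was only half written before a crash.
new_page = b"A" * PAGE_SIZE
torn_page = new_page[:PAGE_SIZE // 2] + b"B" * (PAGE_SIZE // 2)
wal_crc = page_crc(new_page)
```

Here is_torn(torn_page, wal_crc) is True while is_torn(new_page, wal_crc) is False; recovery would remember the failing page in case a later CRC for it matches.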
|
|
|
|
* Reduce WAL traffic so only modified values are written rather than
|
|
entire rows?
|
|
* Add WAL index reliability improvement to non-btree indexes
|
|
* Allow the pg_xlog directory location to be specified during initdb
|
|
with a symlink back to the /data location
|
|
* Allow WAL information to recover corrupted pg_controldata
|
|
* Find a way to reduce rotational delay when repeatedly writing
|
|
last WAL page
|
|
|
|
Currently fsync of WAL requires the disk platter to perform a full
|
|
rotation to fsync again. One idea is to write the WAL to different
|
|
offsets that might reduce the rotational delay.
|
|
|
|
* Allow buffered WAL writes and fsync
|
|
|
|
Instead of guaranteeing recovery of all committed transactions, this
|
|
would provide improved performance by delaying WAL writes and fsync
|
|
so an abrupt operating system restart might lose a few seconds of
|
|
committed transactions but still be consistent. We could perhaps
|
|
remove the 'fsync' parameter (which results in an inconsistent
|
|
database) in favor of this capability.
|
|
|
|
* -Eliminate WAL logging for CREATE TABLE AS when not doing WAL archiving
|
|
* -Change WAL to use 32-bit CRC, for performance reasons
|
|
|
|
|
|
Optimizer / Executor
|
|
====================
|
|
|
|
* Add missing optimizer selectivities for date, r-tree, etc
|
|
* Allow ORDER BY ... LIMIT # to select high/low value without sort or
|
|
index using a sequential scan for highest/lowest values
|
|
|
|
Right now, if no index exists, ORDER BY ... LIMIT # requires we sort
|
|
all values to return the high/low value. Instead, the idea is to do a
|
|
sequential scan to find the high/low value, thus avoiding the sort.
|
|
MIN/MAX already does this, but not for LIMIT > 1.
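
One way to extend this to LIMIT > 1 is a single pass that keeps only the best N values in a bounded heap. The sketch below uses Python's heapq for that; the bounded heap is an illustrative technique, not something this list prescribes.

```python
import heapq


def lowest_n(rows, n, key=None):
    """Return the n smallest rows in order, scanning the input once."""
    return heapq.nsmallest(n, rows, key=key)


def highest_n(rows, n, key=None):
    """Return the n largest rows in descending order, scanning the input once."""
    return heapq.nlargest(n, rows, key=key)
```

This corresponds to ORDER BY col LIMIT n and ORDER BY col DESC LIMIT n without sorting the full input.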
|
|
|
|
* Precompile SQL functions to avoid overhead
|
|
* Create utility to compute accurate random_page_cost value
|
|
* Improve ability to display optimizer analysis using OPTIMIZER_DEBUG
|
|
* Have EXPLAIN ANALYZE highlight poor optimizer estimates
|
|
* -Use CHECK constraints to influence optimizer decisions
|
|
|
|
CHECK constraints contain information about the distribution of values
|
|
within the table. This is also useful for implementing subtables where
|
|
a table's content is distributed across several subtables.
|
|
|
|
* Consider using hash buckets to do DISTINCT, rather than sorting
|
|
|
|
This would be beneficial when there are few distinct values.
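
The difference between the two strategies can be sketched with in-memory lists. The hashed version keeps one entry per distinct value, so its memory use stays small when there are few distinct values; both function names are illustrative.

```python
def distinct_hashed(values):
    """DISTINCT via a hash set: one pass, output in first-seen order."""
    seen = set()
    out = []
    for v in values:
        if v not in seen:
            seen.add(v)
            out.append(v)
    return out


def distinct_sorted(values):
    """DISTINCT via sorting: sort everything, then drop adjacent duplicates."""
    out = []
    for v in sorted(values):
        if not out or out[-1] != v:
            out.append(v)
    return out
```

Both return the same set of values; only the sorted version produces them in order, which matters if an ORDER BY follows.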
|
|
|
|
* ANALYZE should record a pg_statistic entry for an all-NULL column
|
|
* Log queries where the optimizer row estimates were dramatically
|
|
different from the number of rows actually found?
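
A check for "dramatically different" estimates might be as simple as the ratio below. The threshold of 10 and the function names are assumptions; a real implementation would need to decide which plan nodes to compare.

```python
def misestimate_factor(estimated_rows, actual_rows):
    """Return how many times off the estimate was, in either direction (>= 1.0)."""
    est = max(estimated_rows, 1)  # clamp to avoid division by zero
    act = max(actual_rows, 1)
    return max(est / act, act / est)


def should_log(estimated_rows, actual_rows, threshold=10.0):
    """True if the estimate is off by at least the given factor."""
    return misestimate_factor(estimated_rows, actual_rows) >= threshold
```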
|
|
|
|
|
|
Miscellaneous Performance
|
|
=========================
|
|
|
|
* Do async I/O for faster random read-ahead of data
|
|
|
|
Async I/O allows multiple I/O requests to be sent to the disk with
|
|
results coming back asynchronously.
|
|
|
|
* Use mmap() rather than SYSV shared memory or to write WAL files?
|
|
|
|
This would remove the requirement for SYSV SHM but would introduce
|
|
portability issues. Anonymous mmap (or mmap to /dev/zero) is required
|
|
to prevent I/O overhead.
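
An anonymous mapping of the kind mentioned above can be sketched with Python's mmap module, where a file descriptor of -1 requests memory that is not backed by any file. The region size and function name are illustrative.

```python
import mmap


def make_shared_region(size):
    """Create an anonymous memory mapping, so no file I/O is involved."""
    return mmap.mmap(-1, size)


region = make_shared_region(4096)
region[0:5] = b"hello"  # writes go to memory only, never to a backing file
```

On Unix the default mapping is shared, so a child created with fork() after this point would see the same bytes.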
|
|
|
|
* Consider mmap()'ing files into a backend?
|
|
|
|
Doing I/O to large tables would consume a lot of address space or
|
|
require frequent mapping/unmapping. Extending the file also causes
|
|
mapping problems that might require mapping only individual pages,
|
|
leading to thousands of mappings. Another problem is that there is no
|
|
way to _prevent_ I/O to disk from the dirty shared buffers so changes
|
|
could hit disk before WAL is written.
|
|
|
|
* Add a script to ask system configuration questions and tune postgresql.conf
|
|
* Use a phantom command counter for nested subtransactions to reduce
|
|
per-tuple overhead
|
|
* Research storing disk pages with no alignment/padding
|
|
|
|
Source Code
|
|
===========
|
|
|
|
* Add use of 'const' for variables in source tree
|
|
* Rename some /contrib modules from pg* to pg_*
|
|
* Move some things from /contrib into main tree
|
|
* Move some /contrib modules out to their own project sites
|
|
* Remove warnings created by -Wcast-align
|
|
* Move platform-specific ps status display info from ps_status.c to ports
|
|
* Add optional CRC checksum to heap and index pages
|
|
* Improve documentation to build only interfaces (Marc)
|
|
* Remove or relicense modules that are not under the BSD license, if possible
|
|
* Remove memory/file descriptor freeing before ereport(ERROR)
|
|
* Acquire lock on a relation before building a relcache entry for it
|
|
* Promote debug_query_string into a server-side function current_query()
|
|
* Allow the identifier length to be increased via a configure option
|
|
* Remove Win32 rename/unlink looping if unnecessary
|
|
* -Remove kerberos4 from source tree
|
|
* Allow cross-compiling by generating the zic database on the target system
|
|
* Improve NLS maintenance of libpgport messages linked onto applications
|
|
* Allow ecpg to work with MSVC and BCC
|
|
* -Make src/port/snprintf.c thread-safe
|
|
* Add xpath_array() to /contrib/xml2 to return results as an array
|
|
* Allow building in directories containing spaces
|
|
|
|
This is probably not possible because 'gmake' and other compiler tools
|
|
do not fully support quoting of paths with spaces.
|
|
|
|
* Allow installing to directories containing spaces
|
|
|
|
This is possible if proper quoting is added to the makefiles for the
|
|
install targets. Because PostgreSQL supports relocatable installs, it
|
|
is already possible to install into a directory that doesn't contain
|
|
spaces and then copy the install to a directory with spaces.
|
|
|
|
* Fix cross-compiling of time zone database via 'zic'
|
|
* Fix sgmltools so PDFs can be generated with bookmarks
|
|
* Add C code on Unix to copy directories for use in creating new databases
|
|
|
|
|
|
* Win32
|
|
|
|
o Remove configure.in check for link failure when cause is found
|
|
o Remove readdir() errno patch when runtime/mingwex/dirent.c rev
|
|
1.4 is released
|
|
o Remove psql newline patch when we find out why mingw outputs an
|
|
extra newline
|
|
o Allow psql to use readline once non-US code pages work with
|
|
backslashes
|
|
o Re-enable timezone output on log_line_prefix '%t' when a
|
|
shorter timezone string is available
|
|
o Improve dlerror() reporting string
|
|
o Fix problem with shared memory on the Win32 Terminal Server
|
|
o Add support for Unicode
|
|
|
|
To fix this, the data needs to be converted to/from UTF16/UTF8
|
|
so the Win32 wcscoll() can be used, and perhaps other functions
|
|
like towupper(). However, UTF8 already works with normal
|
|
locales but provides no ordering or character set classes.
|
|
|
|
|
|
* Wire Protocol Changes
|
|
|
|
o Allow dynamic character set handling
|
|
o Add decoded type, length, precision
|
|
o Use compression?
|
|
o Update clients to use data types, typmod, schema.table.column names
|
|
of result sets using new query protocol
|
|
|
|
|
|
---------------------------------------------------------------------------
|
|
|
|
|
|
Developers who have claimed items are:
|
|
--------------------------------------
|
|
* Alvaro is Alvaro Herrera <alvherre@dcc.uchile.cl>
|
|
* Andrew is Andrew Dunstan <andrew@dunslane.net>
|
|
* Bruce is Bruce Momjian <pgman@candle.pha.pa.us> of Software Research Assoc.
|
|
* Christopher is Christopher Kings-Lynne <chriskl@familyhealth.com.au> of
|
|
Family Health Network
|
|
* Claudio is Claudio Natoli <claudio.natoli@memetrics.com>
|
|
* D'Arcy is D'Arcy J.M. Cain <darcy@druid.net> of The Cain Gang Ltd.
|
|
* Fabien is Fabien Coelho <coelho@cri.ensmp.fr>
|
|
* Gavin is Gavin Sherry <swm@linuxworld.com.au> of Alcove Systems Engineering
|
|
* Greg is Greg Sabino Mullane <greg@turnstep.com>
|
|
* Hiroshi is Hiroshi Inoue <Inoue@tpf.co.jp>
|
|
* Jan is Jan Wieck <JanWieck@Yahoo.com> of Afilias, Inc.
|
|
* Joe is Joe Conway <mail@joeconway.com>
|
|
* Karel is Karel Zak <zakkr@zf.jcu.cz>
|
|
* Magnus is Magnus Hagander <mha@sollentuna.net>
|
|
* Marc is Marc Fournier <scrappy@hub.org> of PostgreSQL, Inc.
|
|
* Matthew is Matthew T. O'Connor <matthew@zeut.net>
|
|
* Michael is Michael Meskes <meskes@postgresql.org> of Credativ
|
|
* Neil is Neil Conway <neilc@samurai.com>
|
|
* Oleg is Oleg Bartunov <oleg@sai.msu.su>
|
|
* Peter is Peter Eisentraut <peter_e@gmx.net>
|
|
* Philip is Philip Warner <pjw@rhyme.com.au> of Albatross Consulting Pty. Ltd.
|
|
* Rod is Rod Taylor <pg@rbt.ca>
|
|
* Simon is Simon Riggs <simon@2ndquadrant.com>
|
|
* Stephan is Stephan Szabo <sszabo@megazone23.bigpanda.com>
|
|
* Tatsuo is Tatsuo Ishii <t-ishii@sra.co.jp> of Software Research Assoc.
|
|
* Tom is Tom Lane <tgl@sss.pgh.pa.us> of Red Hat
|