TODO list for PostgreSQL ======================== Last updated: Tue May 28 20:36:36 EDT 2002 Current maintainer: Bruce Momjian (pgman@candle.pha.pa.us) The most recent version of this document can be viewed at the PostgreSQL web site, http://www.PostgreSQL.org. A dash (-) marks changes that will appear in the upcoming 7.3 release. Bracketed items "[]" have more detailed. Urgent ====== * Add replication of distributed databases [replication] o automatic failover o load balancing o master/slave replication o multi-master replication o partition data across servers o sample implementation in contrib/rserv o queries across databases or servers (two-phase commit) o allow replication over unreliable or non-persistent links o http://gborg.postgresql.org/project/pgreplication/projdisplay.php * Point-in-time data recovery using backup and write-ahead log Reporting ========= * Allow elog() to return error codes, module name, file name, line number, not just messages (Peter E) * Add error codes (Peter E) * -Change DEBUG startup tag to LOG (Bruce) * Show location of syntax error in query [yacc] * Add getpid() function to backend Permissions =========== * Improve control over user privileges, including table creation and lock use [privileges] (Karel, others) * -Allow user/group names to be specified directly in pg_hba.conf (Bruce) * Add PGPASSWORDFILE environment variable or ~/.pgpass to store user/host/password combinations * Remove PGPASSWORD because it is insecure on some OS's * Make single-user local access permissions the default by limiting permissions on the socket file (Peter E) * -Allow permissions for functions (Peter E) Administration ============== * Incremental backups * -Make it easier to create a database owned by someone who can't createdb, perhaps CREATE DATABASE dbname WITH OWNER = "user" (Gavin) * -Make equals sign optional in CREATE DATABASE WITH param = 'val' * Remove unreferenced table files and temp tables during database vacuum or postmaster startup (Bruce) * Add table name mapping for numeric file names (Bruce) * Remove behavior of postmaster -o after making postmaster/postgres flags unique * Allow logging of query durations * -Prevent SIGHUP and 'pg_ctl reload' from changing command line specified parameters to postgresql.conf defaults (Peter E) * Allow easy display of usernames in a group Data Types ========== * -Add domain capability (Rod Taylor) * Add IPv6 capability to INET/CIDR types * Remove Money type, add money formatting for decimal type * SELECT cash_out(2) crashes because of opaque * Declare typein/out functions in pg_proc with a special "C string" data type * Functions returning sets do not totally work * Change factorial to return a numeric * Change NUMERIC data type to use base 10,000 internally * Change NUMERIC to enforce the maximum precision, and increase it * Add function to return compressed length of TOAST data values (Tom) * Add GUC parameter for DATESTYLE * CONVERSION o Store binary-compatible type information in the system o Allow better handling of numeric constants, type conversion [typeconv] o SELECT col FROM tab WHERE numeric_col = 10.1 fails, requires quotes * ARRAYS o Allow nulls in arrays o Allow arrays to be ORDER'ed o Ensure we have array-eq operators for every built-in array type o Support construction of array result values in expressions * BINARY DATA o Improve vacuum of large objects, like /contrib/vacuumlo o Add security checking for large objects o Make file in/out interface for TOAST columns, similar to large object interface (force out-of-line storage and no compression) o Auto-delete large objects when referencing row is deleted Multi-Language Support ====================== * Add NCHAR (as distinguished from ordinary varchar), * Allow LOCALE on a per-column basis, default to ASCII * Support multiple simultaneous character sets, per SQL92 * Allow setting database character set without multibyte enabled * Improve Unicode combined character handling * Optimize locale to have minimal performance impact when not used (Peter E) * Add octet_length_server() and octet_length_client() (Thomas, Tatsuo) * Make octet_length_client the same as octet_length() (?) Views / Rules ============= * Automatically create rules on views so they are updateable, per SQL92 [view] * Add the functionality for WITH CHECK OPTION clause of CREATE VIEW * Allow NOTIFY in rules involving conditionals * Allow temporary views * Require view using temporary tables to be temporary views * Move psql backslash information into views * Allow RULE recompilation * -Remove brackets as multi-statement rule grouping, must use parens (Bruce) * Prevent aggregates from being used in rule WHERE clauses Indexes ======= * Allow CREATE INDEX zman_index ON test (date_trunc( 'day', zman ) datetime_ops) fails index can't store constant parameters * Order duplicate index entries by tid for faster heap lookups * Allow inherited tables to inherit index, UNIQUE constraint, and primary key, foreign key [inheritance] * UNIQUE INDEX on base column not honored on inserts from inherited table INSERT INTO inherit_table (unique_index_col) VALUES (dup) should fail [inheritance] * Allow UPDATE/DELETE on inherited table * Have UPDATE/DELETE clean out indexes * Add UNIQUE capability to non-btree indexes * Add btree index support for reltime, tinterval, regproc * Add rtree index support for line, lseg, path, point * Certain indexes will not shrink, e.g. oid indexes with many inserts * Use indexes for min() and max() or convert to SELECT col FROM tab ORDER BY col DESC LIMIT 1 if appropriate index exists and WHERE clause acceptible * Allow LIKE indexing optimization for non-ASCII locales * Use index to restrict rows returned by multi-key index when used with non-consecutive keys or OR clauses, so fewer heap accesses * Be smarter about insertion of already-ordered data into btree index * Add deleted bit to index tuples to reduce heap access * Prevent index uniqueness checks when UPDATE does not modifying column * Add bitmap indexes [performance] * Improve handling of index scans for NULL * Allow SELECT * FROM tab WHERE int2col = 4 to use int2col index, int8, float4, numeric/decimal too [optimizer] * Improve concurrency in GIST * Add FILLFACTOR to index creation * Improve concurrency of hash indexes (Neil Conway) * Test hash index performance and recommend or discourage usage Commands ======== * Add SIMILAR TO to allow character classes, 'pg_[a-c]%' * Add BETWEEN ASYMMETRIC/SYMMETRIC (Christopher) * -Remove LIMIT #,# and force use LIMIT and OFFSET clauses in 7.3 (Bruce) * Allow LIMIT/OFFSET to use expressions * Disallow TRUNCATE on tables that are involved in referential constraints * Add OR REPLACE clauses to non-FUNCTION object creation * CREATE TABLE AS can not determine column lengths from expressions [atttypmod] * Allow UPDATE to handle complex aggregates [update] * -Prevent create/drop scripts from allowing extra args (Bruce) * Allow command blocks to ignore certain types of errors * Add checks for missing parameters to shell script, to prevent over-shifting * Allow backslash handling in quoted strings to be disabled for portability * ALTER o ALTER TABLE ADD COLUMN does not honor DEFAULT and non-CHECK CONSTRAINT o ALTER TABLE ADD COLUMN to inherited table put column in wrong place [inheritance] o Add ALTER TABLE DROP COLUMN feature [drop] o Add ALTER FUNCTION o Add ALTER TABLE DROP non-CHECK CONSTRAINT o -ALTER TABLE ADD PRIMARY KEY (Tom) o -ALTER TABLE ADD UNIQUE (Tom) o -ALTER TABLE ALTER COLUMN SET/DROP NOT NULL (Christopher Kings-Lynne) o ALTER TABLE ADD COLUMN column SERIAL doesn't create sequence o ALTER TABLE ADD COLUMN column SET DEFAULT should fill existing rows with DEFAULT value o -Have ALTER TABLE OWNER change all dependant objects like indexes * CLUSTER o Cluster all tables at once o Prevent loss of indexes, permissions, inheritance o Automatically maintain clustering on a table * COPY o Allow specification of column names o Allow dump/load of CSV format o Change syntax to WITH DELIMITER, (keep old syntax around?) o Allow COPY to report error lines and continue; optionally allow error codes to be specified; requires savepoints or can not be run in a multi-statement transaction o Generate failure on short COPY lines rather than pad NULLs * CURSOR o Allow BINARY option to SELECT, just like DECLARE o MOVE 0 should not move to end of cursor o Allow UPDATE/DELETE WHERE CURRENT OF cursor using per-cursor tid stored in the backend o Prevent DROP of table being referenced by our own open cursor o Allow cursors outside transactions [cursor] * INSERT o Allow INSERT/UPDATE of system-generated oid value for a row o Allow INSERT INTO tab (col1, ..) VALUES (val1, ..), (val2, ..) o -Allow INSERT INTO my_table VALUES (a, b, c, DEFAULT, x, y, z, ...) o -Disallow missing columns in INSERT ... VALUES, per ANSI o Allow INSERT/UPDATE ... RETURNING new.col or old.col; handle RULE cases (Philip) * SHOW/SET o Add SHOW command to display locks o Add SET or BEGIN timeout parameter to cancel query o Add SET REAL_FORMAT and SET DOUBLE_PRECISION_FORMAT using printf args o Remove SET KSQO option now that OR processing is improved (Tom) o Add SET PERFORMANCE_TIPS option to suggest INDEX, VACUUM, VACUUM ANALYZE, and CLUSTER o Add SHOW command to see locale o Allow SHOW to output as a query result, like EXPLAIN o Abort all SET changes made in an aborted transaction * SERVER-SIDE LANGUAGES o Allow PL/PgSQL's RAISE function to take expressions o Fix PL/PgSQL to handle quoted mixed-case identifiers o Change PL/PgSQL to use palloc() instead of malloc() o Add untrusted version of plpython o Add plsh server-side shell language (Peter E) o Allow Java server-side programming, http://pljava.sourceforge.net [java] o Fix problems with complex temporary table creation/destruction without using PL/PgSQL EXECUTE, needs cache prevention/invalidation o Fix PL/pgSQL RENAME to work on variables other than OLD/NEW Clients ======= * Have pg_dump use LEFT OUTER JOIN in multi-table SELECTs or multiple SELECTS to avoid bad system catalog entries * -Have pg_dump -C dump database location and encoding information * Allow psql \d to show foreign keys * Allow psql \d to show temporary table structure * Allow psql to show transaction status if backend protocol changes made * Add XML interface: psql, pg_dump, COPY, separate server (?) * Add config file check for $ODBCINI, $HOME/.odbc.ini, installpath/etc/odbc.ini * -Have pg_dump use ADD PRIMARY KEY after COPY, for performance (Neil Conway) * JDBC o Comprehensive test suite. This may be available already. o Updateable resultSet (must be done in backend code) o JDBC-standard BLOB support o Error Codes (pending backend implementation) o Support both 'make' and 'ant' o Fix LargeObject API to handle OIDs as unsigned ints o -Implement cancel() method on Statement o Use cursors implicitly to avoid large results (see setCursorName()) o Add support for CallableStatements o Add LISTEN/NOTIFY support to the JDBC driver (Barry) * ECPG o Implement set descriptor, using descriptor o Make casts work in variable initializations o Implement SQLDA o Solve cardinality > 1 for input descriptors / variables o Understand structure definitions outside a declare section o sqlwarn[6] should be 'W' if the PRECISION or SCALE value specified o Improve error handling o Allow :var[:index] or :var[] as cvariable for an array var o Add a semantic check level, e.g. check if a table really exists o Fix nested C comments o Add SQLSTATE o fix handling of DB attributes that are arrays * ODBC o ODBC 3.0 support o Unicode(UCS-2) support o Updatable cursors support Referential Integrity ===================== * Add MATCH PARTIAL referential integrity [foreign] * Add deferred trigger queue file (Jan) * -Allow oid to act as a foreign key * Implement dirty reads and use them in RI triggers * Make triggers refer to columns by number, not name * Enforce referential integrity for system tables * -Allow user to control trigger firing order (Tom) * Add ALTER TRIGGER ... RENAME * Change foreign key constraint for array -> element to mean element in array * Fix foreign key constraints to not error on intermediate db states (Stephan) Dependency Checking =================== * Add pg_depend table for dependency recording; use sysrelid, oid, depend_sysrelid, depend_oid, name * Auto-destroy sequence on DROP of table with SERIAL; perhaps a separate SERIAL type * Have SERIAL generate non-colliding sequence names when we have auto-destruction * Prevent column dropping if column is used by foreign key * Propagate column or table renaming to foreign key constraints * Automatically drop constraints/functions when object is dropped * Make constraints clearer in dump file * Make foreign keys easier to identify * Flush cached query plans when their underlying catalog data changes Transactions ============ * Allow autocommit so always in a transaction block * Overhaul bufmgr/lockmgr/transaction manager * Allow savepoints / nested transactions [transactions] Exotic Features =============== * Add sql3 recursive unions * Add the concept of dataspaces/tablespaces [tablespaces] * Allow SQL92 schemas (Tom) [schema] * Allow queries across multiple databases [crossdb] * Add pre-parsing phase that converts non-ANSI features to supported features * Allow plug-in modules to emulate features from other databases * SQL*Net listener that makes PostgreSQL appear as an Oracle database to clients PERFORMANCE =========== Fsync ===== * Delay fsync() when other backends are about to commit too [fsync] o Determine optimal commit_delay value * Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options o Allow multiple blocks to be written to WAL with one write() Cache ===== * Cache most recent query plan(s) (Neil) [prepare] * Shared catalog cache, reduce lseek()'s by caching table size in shared area * Add free-behind capability for large sequential scans (Bruce) * Allow binding query args over FE/BE protocol * Consider use of open/fcntl(O_DIRECT) to minimize OS caching * Make blind writes go through the file descriptor cache Vacuum ====== * Improve speed with indexes (perhaps recreate index instead) [vacuum] * Reduce lock time by moving tuples with read lock, then write lock and truncate table [vacuum] * Provide automatic running of vacuum in the background (Tom) Locking ======= * Make locking of shared data structures more fine-grained * Add code to detect an SMP machine and handle spinlocks accordingly from distributted.net, http://www1.distributed.net/source, in client/common/cpucheck.cpp * Research use of sched_yield() for spinlock acquisition failure Startup Time ============ * Experiment with multi-threaded backend [thread] * Add connection pooling [pool] * Allow persistent backends [persistent] * Create a transaction processor to aid in persistent connections and connection pooling * Do listen() in postmaster and accept() in pre-forked backend * Have pre-forked backend pre-connect to last requested database or pass file descriptor to backend pre-forked for matching database * -Cache system catalog information in per-database files (Tom) Write-Ahead Log =============== * Have after-change WAL write()'s write only modified data to kernel * Reduce number of after-change WAL writes; they exist only to gaurd against partial page writes * Turn off after-change writes if fsync is disabled (?) * Add WAL index reliability improvement to non-btree indexes * -Reorder postgresql.conf WAL items in order of importance (Bruce) * Remove wal_files postgresql.conf option because WAL files are now recycled * Find proper defaults for postgresql.conf WAL entries * Add checkpoint_min_warning postgresql.conf option to warn about checkpoints that are too frequent Optimizer / Executor ==================== * Improve Subplan list handling * Allow Subplans to use efficient joins(hash, merge) with upper variable * Improve dynamic memory allocation by introducing tuple-context memory allocation (Tom) * Add hash for evaluating GROUP BY aggregates * Nested FULL OUTER JOINs don't work (Tom) * Allow merge and hash joins on expressions not just simple variables (Tom) * -Add new pg_proc cachable settings to specify whether function can be evaluated only once or once per query * -Change FIXED_CHAR_SEL to 0.20 from 0.04 to give better selectivity (Bruce) * Make IN/NOT IN have similar performance to EXISTS/NOT EXISTS [exists] * Missing optimizer selectivities for date, r-tree, etc. [optimizer] * Allow ORDER BY ... LIMIT to select top values without sort or index using a sequential scan for highest/lowest values (Oleg) * Inline simple SQL functions to avoid overhead (Tom) * Precompile SQL functions to avoid overhead (Neil Conway) * Add utility to compute accurate random_page_cost value * Improve ability to display optimizer analysis using OPTIMIZER_DEBUG Miscellaneous ============= * Do async I/O for faster random read-ahead of data * Get faster regex() code from Henry Spencer when it is available * Use mmap() rather than SYSV shared memory or to write WAL files (?) [mmap] * Improve caching of attribute offsets when NULLs exist in the row * Add Intimate Shared Memory(ISM) for Solaris * Add documentation to lock shared memory into RAM for each OS, if possible * -Use faster flex flags for performance improvement (Peter E) * Add BSD-licensed qsort() for Solaris Source Code =========== * Add use of 'const' for variables in source tree * -Fix problems with libpq non-blocking/async code * Make sure all block numbers are unsigned to increase maximum table size * Use BlockNumber rather than int where appropriate * Merge LockMethodCtl and LockMethodTable into one shared structure (Bruce) * HOLDER/HOLDERTAB rename to PROCLOCK/PROCLOCKTAG (Bruce) * Remove LockMethodTable.prio field, not used (Bruce) * Add version file format stamp to heap and other table types * Rename some /contrib modules from pg* to pg_* * Move some things from /contrib into main tree * Remove warnings created by -Wcast-align * Move platform-specific ps status display info from ps_status.c to ports * Make one version of simple_prompt() in code (Bruce, Tom) * -Compile in syslog functionaility by default (Tatsuo) * Modify regression tests to prevent failures do to minor numeric rounding * Use our own getopt() for FreeBSD/OpenBSD to allow --xxx flags (Bruce) * Add OpenBSD's getpeereid() call for local socket authentication (Bruce) * Improve access-permissions check on data directory in Cygwin (Tom) * Report failure to find readline or zlib at end of configure run * Add --port flag to regression tests * Increase identifier length (NAMEDATALEN) if small performance hit, perhaps to standard length of 128; change struct pgNotify to use pid first, breaks notify API; [namedatalen] * Increase maximum number of function parameters if little wasted space * Add documentation for perl, including mention of DBI/DBD perl location * Add optional CRC checksum to heap and index pages * Change representation of whole-tuple parameters to functions * Evaluate AIX cs() spinlock macro for performance optimizations (Tatsuo) * Clarify use of 'application' and 'command' tags in SGML docs * Better document ability to build only certain interfaces (Marc) * Remove or relicense modules that are not under the BSD license, if possible * Remove memory/file descriptor freeing befor elog(ERROR) (Bruce) --------------------------------------------------------------------------- Developers who have claimed items are: -------------------------------------- * Barry is Barry Lind * Billy is Billy G. Allie * Bruce is Bruce Momjian * Christopher is Christopher Kings-Lynne * D'Arcy is D'Arcy J.M. Cain * Dave is Dave Cramer * Edmund is Edmund Mergl * Hiroshi is Hiroshi Inoue * Karel is Karel Zak * Jan is Jan Wieck * Liam is Liam Stewart * Marc is Marc Fournier * Mark is Mark Hollomon * Marko is Marko Kreen * Michael is Michael Meskes * Neil is Neil Conway * Oleg is Oleg Bartunov * Peter M is Peter T Mount * Peter E is Peter Eisentraut * Philip is Philip Warner * Rod is Rod Taylor * Ross is Ross J. Reedstrom * Ryan is Ryan Bradetich * Stephan is Stephan Szabo * Tatsuo is Tatsuo Ishii * Thomas is Thomas Lockhart * Tom is Tom Lane * TomH is Tom I Helbekkmo * Vadim is Vadim B. Mikheev