* Experiment with multi-threaded backend for better I/O utilization
This would allow a single query to make use of multiple I/O channels
simultaneously. One idea is to create a background reader that can
pre-fetch sequential and index scan pages needed by other backends.
This could be expanded to allow concurrent reads from multiple devices
in a partitioned table.
* Experiment with multi-threaded backend for better CPU utilization
This would allow several CPUs to be used for a single query, such as
for sorting or query execution.
* Speed WAL recovery by allowing more than one page to be prefetched
This should be done utilizing the same infrastructure used for
prefetching in general to avoid introducing complex error-prone code
in WAL replay.
o Allow COPY in CSV mode to control whether a quoted zero-length
string is treated as NULL
Currently this is always treated as a zero-length string,
which generates an error when loading into an integer column
>
> * Change memory allocation for multi-byte functions so memory is
> allocated inside conversion functions
>
> Currently we preallocate memory based on worst-case usage.
* Consider increasing the default statistics target, and
reduce statistics target overhead
Also consider having a larger statistics target for indexed columns
and expression indexes
<
> http://archives.postgresql.org/pgsql-general/2007-06/msg00542.php
* Consider increasing the default statistics target, and
reduce statistics target overhead
Also consider having a larger statistics target for indexed columns
and expression indexes
> http://archives.postgresql.org/pgsql-general/2007-05/msg01228.php
>
>
> * Consider increasing the default statistics target, and
> reduce statistics target overhead
>
> Also consider having a larger statistics target for indexed columns
> and expression indexes
>
> * Add comments on system tables/columns using the information in
> catalogs.sgml
>
> Ideally the information would be pulled from the SGML file
> automatically.
>
>
> * Allow client certificate names to be checked against the client
> hostname
>
> This is already implemented in
> libpq/fe-secure.c::verify_peer_name_matches_certificate() but the code
> is commented out.
> * Prevent malicious functions from being executed with the permissions
> of unsuspecting users
>
> Index functions are safe, so VACUUM and ANALYZE are safe too.
> Triggers, CHECK and DEFAULT expressions, and rules are still vulnerable.
> http://archives.postgresql.org/pgsql-hackers/2008-01/msg00268.php
>
> o Have CONSTRAINT cname NOT NULL preserve the constraint name
>
> Right now pg_attribute.attnotnull records the NOT NULL status
> of the column, but does not record the constraint name
>
<
< o To better utilize resources, restore data, primary keys, and
< indexes for a single table before restoring the next table
<
< Hopefully this will allow the CPU-I/O load to be more uniform
< for simultaneous restores. The idea is to start data restores
< for several objects, and once the first object is done, to move
< on to its primary keys and indexes. Over time, simultaneous
< data loads and index builds will be running.
< * pg_dump
> * pg_dump / pg_restore
> o Allow pg_dump to utilize multiple CPUs and I/O channels by dumping
> multiple objects simultaneously
>
> The difficulty with this is getting multiple dump processes to
> produce a single dump output file.
> http://archives.postgresql.org/pgsql-hackers/2008-02/msg00205.php
>
> o Allow pg_restore to utilize multiple CPUs and I/O channels by
> restoring multiple objects simultaneously
>
> This might require a pg_restore flag to indicate how many
> simultaneous operations should be performed. Only pg_dump's
> -Fc format has the necessary dependency information.
>
> o To better utilize resources, restore data, primary keys, and
> indexes for a single table before restoring the next table
>
> Hopefully this will allow the CPU-I/O load to be more uniform
> for simultaneous restores. The idea is to start data restores
> for several objects, and once the first object is done, to move
> on to its primary keys and indexes. Over time, simultaneous
> data loads and index builds will be running.
>
> o To better utilize resources, allow pg_restore to check foreign
> keys simultaneously, where possible
> o Allow pg_restore to create all indexes of a table
> concurrently, via a single heap scan
>
> This requires a pg_dump -Fc file because that format contains
> the required dependency information.
> http://archives.postgresql.org/pgsql-general/2007-05/msg01274.php
>
> o Allow pg_restore to load different parts of the COPY data
> simultaneously
< single heap scan, and have a restore of a pg_dump somehow use it
> single heap scan, and have pg_restore use it
< http://archives.postgresql.org/pgsql-general/2007-05/msg01274.php
> * Speed WAL recovery by allowing more than one page to be prefetched
>
> This involves having a separate process that can be told which pages
> the recovery process will need in the near future.
> http://archives.postgresql.org/pgsql-hackers/2008-02/msg01279.php
>
< * Improve deadlock detection when deleting items from shared buffers
> * Improve deadlock detection when a page cleaning lock conflicts
> with a shared buffer that is pinned
>
> * Add the ability to automatically create materialized views
>
> Right now materialized views require the user to create triggers on the
> main table to keep the summary table current. SQL syntax should be able
> to manage the triggers and summary table automatically. A more
> sophisticated implementation would automatically retrieve from the
> summary table when the main table is referenced, if possible.
>
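The trigger-maintained summary table described above can be sketched as follows. This is only an illustration under stated assumptions: it uses SQLite through Python's sqlite3 module as a stand-in for the backend, and the table names (sales, sales_summary), trigger names, and the build_demo helper are all hypothetical.

```python
import sqlite3

def build_demo():
    """Create a main table plus a summary table kept current by triggers.

    Hypothetical schema for illustration: 'sales' is the main table and
    'sales_summary' holds a per-region running total.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sales (region TEXT NOT NULL, amount INTEGER NOT NULL);
        CREATE TABLE sales_summary (region TEXT PRIMARY KEY,
                                    total INTEGER NOT NULL);

        -- On insert: make sure a summary row exists, then add the amount.
        CREATE TRIGGER sales_ins AFTER INSERT ON sales
        BEGIN
            INSERT OR IGNORE INTO sales_summary (region, total)
                VALUES (NEW.region, 0);
            UPDATE sales_summary SET total = total + NEW.amount
                WHERE region = NEW.region;
        END;

        -- On delete: subtract the removed amount from the running total.
        CREATE TRIGGER sales_del AFTER DELETE ON sales
        BEGIN
            UPDATE sales_summary SET total = total - OLD.amount
                WHERE region = OLD.region;
        END;
    """)
    return conn
```

The user has to write and maintain both triggers by hand here, which is the burden the TODO item proposes to remove with dedicated SQL syntax.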
< * Allow major upgrades without dump/reload, perhaps using pg_upgrade
< [pg_upgrade]
< * Check for unreferenced table files created by transactions that were
< in-progress when the server terminated abruptly
<
< http://archives.postgresql.org/pgsql-patches/2006-06/msg00096.php
<
> * Check for unreferenced table files created by transactions that were
> in-progress when the server terminated abruptly
>
> http://archives.postgresql.org/pgsql-patches/2006-06/msg00096.php
>
< * Support table partitioning that allows a single table to be stored
< in subtables that are partitioned based on the primary key or a WHERE
< clause
< creation of rules for INSERT/UPDATE/DELETE, and constraints for
< rapid partition selection. Options could include range and hash
> creation of triggers or rules for INSERT/UPDATE/DELETE, and constraints
> for rapid partition selection. Options could include range and hash
<
< * Improve replication solutions
<
< o Load balancing
<
< You can use any of the master/slave replication servers to use a
< standby server for data warehousing. To allow read/write queries to
< multiple servers, you need multi-master replication like pgcluster.
<
< o Allow replication over unreliable or non-persistent links
<
<
< o Mark change-on-restart-only values in postgresql.conf
< All objects in the default database tablespace must have default
< tablespace specifications. This is because new databases are
< created by copying directories. If you mix default tablespace
< tables and tablespace-specified tables in the same directory,
< creating a new database from such a mixed directory would create a
< new database with tables that had incorrect explicit tablespaces.
< To fix this would require modifying pg_class in the newly copied
< database, which we don't currently do.
> Currently all objects in the default database tablespace must
> have default tablespace specifications. This is because new
> databases are created by copying directories. If you mix default
> tablespace tables and tablespace-specified tables in the same
> directory, creating a new database from such a mixed directory
> would create a new database with tables that had incorrect
> explicit tablespaces. To fix this would require modifying
> pg_class in the newly copied database, which we don't currently
> do.
<
< o Allow recovery.conf to allow the same syntax as
> o Allow recovery.conf to support the same syntax as
< * Allow user-defined types to specify a type modifier at table creation
< time
< * Allow all data types to cast to and from TEXT
<
< http://archives.postgresql.org/pgsql-hackers/2007-04/msg00017.php
<
<
< o Add support for year-month syntax, INTERVAL '50-6' YEAR TO MONTH
< o Interpret INTERVAL '1 year' MONTH as CAST (INTERVAL '1 year' AS
< INTERVAL MONTH), and this should return '12 months'
> o Add support for year-month syntax, INTERVAL '50-6' YEAR
> TO MONTH
> o Interpret INTERVAL '1 year' MONTH as CAST (INTERVAL '1
> year' AS INTERVAL MONTH), and this should return '12 months'
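The arithmetic behind both interval items is a years-to-months conversion. A minimal sketch, assuming the year-month literal has the form 'YEARS-MONTHS'; the helper name year_month_to_months is hypothetical and this is not the backend's parser.

```python
import re

def year_month_to_months(text):
    """Convert a 'YEARS-MONTHS' string such as '50-6' to a month count.

    Hypothetical helper for illustration only.
    """
    match = re.fullmatch(r"(\d+)-(\d+)", text.strip())
    if match is None:
        raise ValueError(f"not a year-month value: {text!r}")
    years, months = int(match.group(1)), int(match.group(2))
    # One year contributes twelve months to the total.
    return years * 12 + months
```

With this sketch, year_month_to_months("50-6") returns 606, and a value of one year ("1-0") returns 12, matching the '12 months' result the second item asks for.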
< * Allow MONEY to be cast to/from other numeric data types
> * Allow MONEY to be easily cast to/from other numeric data types
>
< * Allow functions to have a schema search path specified at creation time
< * Fix cases where invalid byte encodings are accepted by the database,
< but throw an error on SELECT
<
< http://archives.postgresql.org/pgsql-hackers/2007-03/msg00767.php
< * Improve logging of prepared statements recovered during startup
> * Improve logging of prepared transactions recovered during startup
< * Make standard_conforming_strings the default in 8.4?
> * Make standard_conforming_strings the default in 8.5?
< * Allow the count returned by SELECT, etc to be to represent as an int64
> * Allow the count returned by SELECT, etc to be represented as an int64
< o Use more reliable method for CREATE DATABASE to get a consistent
< copy of db?
< o Fix transaction restriction checks for CREATE DATABASE and
< other commands
<
< http://archives.postgresql.org/pgsql-hackers/2007-01/msg00133.php
< currently allowed.
> currently allowed. This currently is done if the table is
> created inside the same transaction block as the COPY because
> no other backends can see the table.
< o Add SET PATH for schemas?
<
< This is basically the same as SET search_path.
< o Enforce referential integrity for system tables
< o Add Oracle-style packages (Pavel)
<
< A package would be a schema with session-local variables,
< public/private functions, and initialization functions. It
< is also possible to implement these capabilities
< in all schemas and not use a separate "packages"
< syntax at all.
<
< http://archives.postgresql.org/pgsql-hackers/2006-08/msg00384.php
<
< o Add single-step debugging of functions
< o Allow RETURN to return row or record functions
<
< http://archives.postgresql.org/pgsql-patches/2005-11/msg00045.php
< http://archives.postgresql.org/pgsql-patches/2006-08/msg00397.php
< http://archives.postgresql.org/pgsql-hackers/2006-09/msg00388.php
<
< o Fix problems with RETURN NEXT on tables with
< dropped/added columns after function creation
<
< http://archives.postgresql.org/pgsql-patches/2006-02/msg00165.php
<
< * Make consistent use of long/short command options --- pg_ctl needs
< long ones, pg_config doesn't have short ones, postgres doesn't have
< enough long ones, etc.
<
<
<
< o Consider parsing the -c string into individual queries so each
< is run in its own transaction
<
< http://archives.postgresql.org/pgsql-hackers/2007-01/msg00291.php
<
<
< o Remove unnecessary function pointer abstractions in pg_dump source
< code
> o Remove unnecessary function pointer abstractions in pg_dump source
> code
<
<
< o Fix SSL retry to avoid useless repeated connection attempts and
< ensuing misleading error messages
>
<
< This is difficult because it requires datatype-specific knowledge.
<
< * Improve commit_delay handling to reduce fsync()
< * %Add an option to sync() before fsync()'ing checkpoint files
>
< * Reduce lock time during VACUUM FULL by moving tuples with read lock,
< then write lock and truncate table
<
< Moved tuples are invisible to other backends so they don't require a
< write lock. However, the read lock promotion to write lock could lead
< to deadlock situations.
<
< * Prevent long-lived temporary tables from causing frozen-xid advancement
< starvation
<
< The problem is that autovacuum cannot vacuum them to set frozen xids;
< only the session that created them can do that.
<
<
<
< o Use free-space map information to guide refilling
< o Consider logging activity either to the logs or a system view
> The problem is that autovacuum cannot vacuum them to set frozen xids;
> only the session that created them can do that.
< * Add connection pooling
<
< It is unclear if this should be done inside the backend code or done
< by something external like pgpool. The passing of file descriptors to
< existing backends is one of the difficulties with a backend approach.
<
< * Consider reducing memory used for shared buffer reference count
<
< http://archives.postgresql.org/pgsql-hackers/2007-01/msg00752.php
<
< * %Remove memory/file descriptor freeing before ereport(ERROR)
< * %Promote debug_query_string into a server-side function current_query()
< * Allow ecpg to work with MSVC and BCC
< * Add xpath_array() to /contrib/xml2 to return results as an array
< * Allow building in directories containing spaces
<
< This is probably not possible because 'gmake' and other compiler tools
< do not fully support quoting of paths with spaces.
<
< * Fix sgmltools so PDFs can be generated with bookmarks
< * Split out libpq pgpass and environment documentation sections to make
< it easier for non-developers to find
< * Use strlcpy() rather than our StrNCpy() macro
<
< http://archives.postgresql.org/pgsql-hackers/2006-09/msg02108.php
<
< o Re-enable timezone output on log_line_prefix '%t' when a
< shorter timezone string is available
< * Allow statements across databases or servers with transaction
< semantics
<
< This can be done using dblink and two-phase commit.
> * Add Oracle-style packages (Pavel)
< * Add the features of packages
> A package would be a schema with session-local variables,
> public/private functions, and initialization functions. It
> is also possible to implement these capabilities
> in any schema and not use a separate "packages"
> syntax at all.
< o Make private objects accessible only to objects in the same schema
< o Allow current_schema.objname to access current schema objects
< o Add session variables
< o Allow nested schemas
> http://archives.postgresql.org/pgsql-hackers/2006-08/msg00384.php
< * Experiment with multi-threaded backend for better resource utilization
<
< This would allow a single query to make use of multiple CPUs or
< multiple I/O channels simultaneously. One idea is to create a
< background reader that can pre-fetch sequential and index scan
< pages needed by other backends. This could be expanded to allow
< concurrent reads from multiple devices in a partitioned table.
<
> * Experiment with multi-threaded backend for better resource utilization
>
> This would allow a single query to make use of multiple CPUs or
> multiple I/O channels simultaneously. One idea is to create a
> background reader that can pre-fetch sequential and index scan
> pages needed by other backends. This could be expanded to allow
> concurrent reads from multiple devices in a partitioned table.
* Consider having the background writer update the transaction status
hint bits before writing out the page
Implementing this requires the background writer to have access to system
catalogs and the transaction status log.
<
< * Allow free-behind capability for large sequential scans to avoid
< kernel cache spoiling
<
< Posix_fadvise() can control both sequential/random file caching and
< free-behind behavior, but it is unclear how the setting affects other
< backends that also have the file open, and the feature is not supported
< on all operating systems.
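A rough sketch of the free-behind idea, assuming a plain file read through Python's os module rather than the backend's buffer manager. The function name scan_with_free_behind and the chunk size are hypothetical. Because the call is not supported on all operating systems, the sketch skips the hints where os.posix_fadvise is missing.

```python
import os

def scan_with_free_behind(path, chunk_size=1 << 20):
    """Read a file sequentially, hinting the kernel to drop pages already read.

    Returns the number of bytes read. The advice calls are skipped on
    platforms where os.posix_fadvise is unavailable.
    """
    have_fadvise = hasattr(os, "posix_fadvise")
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        if have_fadvise:
            # Hint that access will be sequential over the whole file.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
        while True:
            data = os.read(fd, chunk_size)
            if not data:
                break
            if have_fadvise:
                # Free-behind: the range just read is not expected to be reused.
                os.posix_fadvise(fd, total, len(data), os.POSIX_FADV_DONTNEED)
            total += len(data)
    finally:
        os.close(fd)
    return total
```

The hints are advisory only, and the sketch does not address the open question raised above of how they affect other processes that have the same file open.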
< o -Allow commenting of variables in postgresql.conf to restore them
< to defaults
< o -Add a GUC variable to control the tablespace for temporary objects
< and sort files
< Monitoring
< ==========
<
< * -Allow server log information to be output as CSV format
< * -Add ability to monitor the use of temporary sort files
< * -Allow user-defined types to accept 'typmod' parameters
<
< http://archives.postgresql.org/pgsql-hackers/2005-08/msg01142.php
< http://archives.postgresql.org/pgsql-hackers/2005-09/msg00012.php
< http://archives.postgresql.org/pgsql-hackers/2006-08/msg00149.php
<
< * -Add Globally/Universally Unique Identifier (GUID/UUID)
<
< http://archives.postgresql.org/pgsql-patches/2006-09/msg00209.php
< http://archives.postgresql.org/pgsql-general/2007-01/msg00853.php
<
< * -Support a data type with specific enumerated values (ENUM)
< o -Add support for arrays of complex types
< o -Make 64-bit version of the MONEY data type
< * -Add ISO day of week format 'ID' to to_char() where Monday = 1
< * -Add a field 'isoyear' to extract(), based on the ISO week
< * -Add RESET SESSION command to reset all session state
< o -Make CLUSTER preserve recently-dead tuples per MVCC requirements
< o -Add more logical syntax CLUSTER table USING index;
< support current syntax for backward compatibility
< o -Allow UPDATE/DELETE WHERE CURRENT OF cursor
< o -Add support for MOVE cursors
< o -Allow PL/PythonU to return boolean rather than 1/0
< o -Allow psql \pset boolean variables to set to fixed values, rather
< than toggle
< o -Add -f to pg_dumpall
< Dependency Checking
< ===================
<
< * -Flush cached query plans when the dependent objects change or
< when new ANALYZE statistics are available
< * -Track dependencies in function bodies and recompile/invalidate
< * -Invalidate prepared queries, like INSERT, when the table definition
< is altered
<
< * -Allow use of indexes to search for NULLs
< * -Allow the creation of indexes with mixed ascending/descending
< specifiers
< * -Reduce checkpoint performance degradation by forcing data to disk
< more evenly
< * -Allow sequential scans to take advantage of other concurrent
< sequential scans, also called "Synchronised Scanning"
< * -Consider shrinking expired tuples to just their headers
< * -Allow heap reuse of UPDATEd rows if no indexed columns are changed,
< and old and new versions are on the same heap page
< * -Reduce XID consumption of read-only queries
< o -Turn on by default
< o -Allow multiple vacuums so large tables do not starve small
< tables
< * -Allow the pg_xlog directory location to be specified during initdb
< with a symlink back to the /data location
< * -Allow buffered WAL writes and fsync
< * -Allow ORDER BY ... LIMIT # to select high/low value without sort or
< index using a sequential scan for highest/lowest values
< * -Merge xmin/xmax/cmin/cmax back into three header fields
< o -Support a smaller header for short variable-length fields
< * -Move NAMEDATALEN from postgres_ext.h to pg_config_manual.h
< * -Fix problem with excessive logging during SSL disconnection
<
< http://archives.postgresql.org/pgsql-bugs/2006-12/msg00122.php
< http://archives.postgresql.org/pgsql-bugs/2007-05/msg00065.php
<
< o -Add long file support for binary pg_dump output
< * Prevent long-lived temporary tables from causing frozen-Xid advancement
> * Prevent long-lived temporary tables from causing frozen-xid advancement
>
> The problem is that autovacuum cannot vacuum them to set frozen xids;
> only the session that created them can do that.
>
>
>
< o Prevent COMMENT ON dbname from issuing a warning when loading
< into a database with a different name, perhaps using COMMENT ON
< CURRENT DATABASE
> o Change pg_dump so that a comment on the dumped database is
> applied to the loaded database, even if the database has a
> different name. This will require new backend syntax, perhaps
> COMMENT ON CURRENT DATABASE.
< o Allow COMMENT ON dbname to work when loading into a database
< with a different name, perhaps using COMMENT ON CURRENT
< DATABASE
> o Prevent COMMENT ON dbname from issuing a warning when loading
> into a database with a different name, perhaps using COMMENT ON
> CURRENT DATABASE
> * -Consider shrinking expired tuples to just their headers
> * -Allow heap reuse of UPDATEd rows if no indexed columns are changed,
> and old and new versions are on the same heap page
Not needed anymore:
< * Reuse index tuples that point to heap tuples that are not visible to
< anyone?
< o Allow UPDATE/DELETE WHERE CURRENT OF cursor
<
< This requires using the row ctid to map cursor rows back to the
< original heap row. This becomes more complicated if WITH HOLD cursors
< are to be supported because WITH HOLD cursors have a copy of the row
< and no FOR UPDATE lock.
< http://archives.postgresql.org/pgsql-hackers/2007-01/msg01014.php
<
> o -Allow UPDATE/DELETE WHERE CURRENT OF cursor
o -Add a GUC variable to control the tablespace for temporary objects
and sort files
<
< It could start with a random tablespace from a supplied list and
< cycle through the list.
<
< * Allow free-behind capability for large sequential scans, perhaps using
< posix_fadvise()
> * Allow free-behind capability for large sequential scans to avoid
> kernel cache spoiling
scan-resistant:
<
< * Allow free-behind capability for large sequential scans, perhaps using
< posix_fadvise()
<
< Posix_fadvise() can control both sequential/random file caching and
< free-behind behavior, but it is unclear how the setting affects other
< backends that also have the file open, and the feature is not supported
< on all operating systems.
< * Consider allowing 64-bit integers to be passed by value on 64-bit
< platforms
> * Consider allowing 64-bit integers and floats to be passed by value on
> 64-bit platforms
>
> Also change 32-bit floats (float4) to be passed by value at the same
> time.
>
* Improve speed with indexes
For large table adjustments during VACUUM FULL, it is faster to cluster
or reindex rather than update the index. Also, index updates can bloat
the index.
>
> * Implement the SQL standard mechanism whereby REVOKE ROLE revokes only
> the privilege granted by the invoking role, and not those granted
> by other roles
< Last updated: Sat May 5 10:47:39 EDT 2007
> Last updated: Sat May 5 11:39:57 EDT 2007
< * Flush cached query plans when the dependent objects change,
< when the cardinality of parameters changes dramatically, or
> * -Flush cached query plans when the dependent objects change or
<
< A more complex solution would be to save multiple plans for different
< cardinality and use the appropriate plan based on the EXECUTE values.
<
< * Track dependencies in function bodies and recompile/invalidate
<
< This is particularly important for references to temporary tables
< in PL/PgSQL because PL/PgSQL caches query plans. The only workaround
< in PL/PgSQL is to use EXECUTE. One complexity is that a function
< might itself drop and recreate dependent tables, causing it to
< invalidate its own query plan.
<
< * Invalidate prepared queries, like INSERT, when the table definition
> * -Track dependencies in function bodies and recompile/invalidate
> * -Invalidate prepared queries, like INSERT, when the table definition
< * Invalidate prepared queries, like INSERT, when the table definition
< is altered
>
> * Invalidate prepared queries, like INSERT, when the table definition
> is altered
> * -Allow ORDER BY ... LIMIT # to select high/low value without sort or
<
< Right now, if no index exists, ORDER BY ... LIMIT # requires we sort
< all values to return the high/low value. Instead, the idea is to do a
< sequential scan to find the high/low value, thus avoiding the sort.
< MIN/MAX already does this, but not for LIMIT > 1.
<
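The sequential-scan idea described above can be sketched as a bounded heap: one pass keeps only the N lowest values seen so far, so the full input is never sorted. This is an illustration only, assuming numeric values; the function name top_n_lowest is hypothetical and this is not the backend's implementation.

```python
import heapq

def top_n_lowest(rows, n):
    """Return the n lowest values from one pass over rows, in ascending order.

    Hypothetical helper: a bounded max-heap (stored as negated numbers)
    holds the n smallest values seen so far, so memory stays O(n).
    """
    if n <= 0:
        return []
    heap = []  # negated values; -heap[0] is the largest value currently kept
    for value in rows:
        if len(heap) < n:
            heapq.heappush(heap, -value)
        elif -value > heap[0]:
            # value is smaller than the largest kept value: replace it
            heapq.heapreplace(heap, -value)
    # Only the n kept values are sorted, not the whole input.
    return sorted(-v for v in heap)
```

For example, top_n_lowest([5, 1, 9, 3, 7], 2) returns [1, 3]. With n = 1 this reduces to the single-value scan that MIN/MAX already performs.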
< o Add support for MOVE and SCROLL cursors
<
< PL/pgSQL cursors should support the same syntax as
< backend cursors.
<
> o -Add support for MOVE cursors
> o Add support for SCROLL cursors
< Currently all schemas are owned by the super-user because they are
< copied from the template1 database.
> Currently all schemas are owned by the super-user because they are copied
> from the template1 database. However, since all objects are inherited
> from the template database, it is not clear that setting schemas to the db
> owner is correct.
< o Consider reducing on-disk varlena length from four to two
< because a heap row cannot be more than 64k in length
> o Consider reducing on-disk varlena length from four bytes to
> two because a heap row cannot be more than 64k in length