1045 lines
45 KiB
Plaintext
1045 lines
45 KiB
Plaintext
|
|
Frequently Asked Questions (FAQ) for PostgreSQL
|
|
|
|
Last updated: Mon Mar 18 14:34:57 EST 2002
|
|
|
|
Current maintainer: Bruce Momjian (pgman@candle.pha.pa.us)
|
|
|
|
The most recent version of this document can be viewed at
|
|
http://www.PostgreSQL.org/docs/faq-english.html.
|
|
|
|
Platform-specific questions are answered at
|
|
http://www.PostgreSQL.org/users-lounge/docs/faq.html.
|
|
_________________________________________________________________
|
|
|
|
General Questions
|
|
|
|
1.1) What is PostgreSQL? How is it pronounced?
|
|
1.2) What is the copyright on PostgreSQL?
|
|
1.3) What Unix platforms does PostgreSQL run on?
|
|
1.4) What non-Unix ports are available?
|
|
1.5) Where can I get PostgreSQL?
|
|
1.6) Where can I get support?
|
|
1.7) What is the latest release?
|
|
1.8) What documentation is available?
|
|
1.9) How do I find out about known bugs or missing features?
|
|
1.10) How can I learn SQL?
|
|
1.11) Is PostgreSQL Y2K compliant?
|
|
1.12) How do I join the development team?
|
|
1.13) How do I submit a bug report?
|
|
1.14) How does PostgreSQL compare to other DBMSs?
|
|
1.15) How can I financially assist PostgreSQL?
|
|
|
|
User Client Questions
|
|
|
|
2.1) Are there ODBC drivers for PostgreSQL?
|
|
2.2) What tools are available for using PostgreSQL with Web pages?
|
|
2.3) Does PostgreSQL have a graphical user interface? A report
|
|
generator? An embedded query language interface?
|
|
2.4) What languages are available to communicate with PostgreSQL?
|
|
|
|
Administrative Questions
|
|
|
|
3.1) How do I install PostgreSQL somewhere other than
|
|
/usr/local/pgsql?
|
|
3.2) When I start postmaster, I get a Bad System Call or core dumped
|
|
message. Why?
|
|
3.3) When I try to start postmaster, I get IpcMemoryCreate errors.
|
|
Why?
|
|
3.4) When I try to start postmaster, I get IpcSemaphoreCreate errors.
|
|
Why?
|
|
3.5) How do I control connections from other hosts?
|
|
3.6) How do I tune the database engine for better performance?
|
|
3.7) What debugging features are available?
|
|
3.8) Why do I get "Sorry, too many clients" when trying to connect?
|
|
3.9) What are the pg_sorttempNNN.NN files in my database directory?
|
|
|
|
Operational Questions
|
|
|
|
4.1) What is the difference between binary cursors and normal cursors?
|
|
4.2) How do I SELECT only the first few rows of a query?
|
|
4.3) How do I get a list of tables or other things I can see in psql?
|
|
4.4) How do you remove a column from a table?
|
|
4.5) What is the maximum size for a row, a table, and a database?
|
|
4.6) How much database disk space is required to store data from a
|
|
typical text file?
|
|
4.7) How do I find out what tables or indexes are defined in the
|
|
database?
|
|
4.8) My queries are slow or don't make use of the indexes. Why?
|
|
4.9) How do I see how the query optimizer is evaluating my query?
|
|
4.10) What is an R-tree index?
|
|
4.11) What is the Genetic Query Optimizer?
|
|
4.12) How do I perform regular expression searches and
|
|
case-insensitive regular expression searches? How do I use an index
|
|
for case-insensitive searches?
|
|
4.13) In a query, how do I detect if a field is NULL?
|
|
4.14) What is the difference between the various character types?
|
|
4.15.1) How do I create a serial/auto-incrementing field?
|
|
4.15.2) How do I get the value of a SERIAL insert?
|
|
4.15.3) Don't currval() and nextval() lead to a race condition with
|
|
other users?
|
|
4.15.4) Why aren't my sequence numbers reused on transaction abort?
|
|
Why are there gaps in the numbering of my sequence/SERIAL column?
|
|
4.16) What is an OID? What is a TID?
|
|
4.17) What is the meaning of some of the terms used in PostgreSQL?
|
|
4.18) Why do I get the error "ERROR: Memory exhausted in
|
|
AllocSetAlloc()"?
|
|
4.19) How do I tell what PostgreSQL version I am running?
|
|
4.20) Why does my large-object operations get "invalid large obj
|
|
descriptor"?
|
|
4.21) How do I create a column that will default to the current time?
|
|
4.22) Why are my subqueries using IN so slow?
|
|
4.23) How do I perform an outer join?
|
|
4.24) How do I perform queries using multiple databases?
|
|
|
|
Extending PostgreSQL
|
|
|
|
5.1) I wrote a user-defined function. When I run it in psql, why does
|
|
it dump core?
|
|
5.2) How can I contribute some nifty new types and functions to
|
|
PostgreSQL?
|
|
5.3) How do I write a C function to return a tuple?
|
|
5.4) I have changed a source file. Why does the recompile not see the
|
|
change?
|
|
_________________________________________________________________
|
|
|
|
General Questions
|
|
|
|
1.1) What is PostgreSQL?
|
|
|
|
PostgreSQL is pronounced Post-Gres-Q-L.
|
|
|
|
PostgreSQL is an enhancement of the POSTGRES database management
|
|
system, a next-generation DBMS research prototype. While PostgreSQL
|
|
retains the powerful data model and rich data types of POSTGRES, it
|
|
replaces the PostQuel query language with an extended subset of SQL.
|
|
PostgreSQL is free and the complete source is available.
|
|
|
|
PostgreSQL development is performed by a team of Internet developers
|
|
who all subscribe to the PostgreSQL development mailing list. The
|
|
current coordinator is Marc G. Fournier (scrappy@PostgreSQL.org). (See
|
|
below on how to join). This team is now responsible for all
|
|
development of PostgreSQL.
|
|
|
|
The authors of PostgreSQL 1.01 were Andrew Yu and Jolly Chen. Many
|
|
others have contributed to the porting, testing, debugging, and
|
|
enhancement of the code. The original Postgres code, from which
|
|
PostgreSQL is derived, was the effort of many graduate students,
|
|
undergraduate students, and staff programmers working under the
|
|
direction of Professor Michael Stonebraker at the University of
|
|
California, Berkeley.
|
|
|
|
The original name of the software at Berkeley was Postgres. When SQL
|
|
functionality was added in 1995, its name was changed to Postgres95.
|
|
The name was changed at the end of 1996 to PostgreSQL.
|
|
|
|
1.2) What is the copyright on PostgreSQL?
|
|
|
|
PostgreSQL is subject to the following COPYRIGHT:
|
|
|
|
PostgreSQL Data Base Management System
|
|
|
|
Portions copyright (c) 1996-2002, PostgreSQL Global Development Group
|
|
Portions Copyright (c) 1994-6 Regents of the University of California
|
|
|
|
Permission to use, copy, modify, and distribute this software and its
|
|
documentation for any purpose, without fee, and without a written
|
|
agreement is hereby granted, provided that the above copyright notice
|
|
and this paragraph and the following two paragraphs appear in all
|
|
copies.
|
|
|
|
IN NO EVENT SHALL THE UNIVERSITY OF CALIFORNIA BE LIABLE TO ANY PARTY
|
|
FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES,
|
|
INCLUDING LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND
|
|
ITS DOCUMENTATION, EVEN IF THE UNIVERSITY OF CALIFORNIA HAS BEEN
|
|
ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
|
|
|
|
THE UNIVERSITY OF CALIFORNIA SPECIFICALLY DISCLAIMS ANY WARRANTIES,
|
|
INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
|
|
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE SOFTWARE
|
|
PROVIDED HEREUNDER IS ON AN "AS IS" BASIS, AND THE UNIVERSITY OF
|
|
CALIFORNIA HAS NO OBLIGATIONS TO PROVIDE MAINTENANCE, SUPPORT,
|
|
UPDATES, ENHANCEMENTS, OR MODIFICATIONS.
|
|
|
|
The above is the BSD license, the classic open-source license. It has
|
|
no restrictions on how the source code may be used. We like it and
|
|
have no intention of changing it.
|
|
|
|
1.3) What Unix platforms does PostgreSQL run on?
|
|
|
|
In general, a modern Unix-compatible platform should be able to run
|
|
PostgreSQL. The platforms that had received explicit testing at the
|
|
time of release are listed in the installation instructions.
|
|
|
|
1.4) What non-Unix ports are available?
|
|
|
|
Client
|
|
|
|
It is possible to compile the libpq C library, psql, and other
|
|
interfaces and binaries to run on MS Windows platforms. In this case,
|
|
the client is running on MS Windows, and communicates via TCP/IP to a
|
|
server running on one of our supported Unix platforms. A file
|
|
win31.mak is included in the distribution for making a Win32 libpq
|
|
library and psql. PostgreSQL also communicates with ODBC clients.
|
|
|
|
Server
|
|
|
|
The database server can run on Windows NT and Win2k using Cygwin, the
|
|
Cygnus Unix/NT porting library. See pgsql/doc/FAQ_MSWIN in the
|
|
distribution or the MS Windows FAQ on our web site. We have no plan to
|
|
do a native port to any Microsoft platform.
|
|
|
|
1.5) Where can I get PostgreSQL?
|
|
|
|
The primary anonymous ftp site for PostgreSQL is
|
|
ftp://ftp.PostgreSQL.org/pub. For mirror sites, see our main web site.
|
|
|
|
1.6) Where can I get support?
|
|
|
|
The main mailing list is: pgsql-general@PostgreSQL.org. It is
|
|
available for discussion of matters pertaining to PostgreSQL. To
|
|
subscribe, send mail with the following lines in the body (not the
|
|
subject line):
|
|
subscribe
|
|
end
|
|
|
|
to pgsql-general-request@PostgreSQL.org.
|
|
|
|
There is also a digest list available. To subscribe to this list, send
|
|
email to: pgsql-general-digest-request@PostgreSQL.org with a body of:
|
|
subscribe
|
|
end
|
|
|
|
Digests are sent out to members of this list whenever the main list
|
|
has received around 30k of messages.
|
|
|
|
The bugs mailing list is available. To subscribe to this list, send
|
|
email to pgsql-bugs-request@PostgreSQL.org with a body of:
|
|
subscribe
|
|
end
|
|
|
|
There is also a developers discussion mailing list available. To
|
|
subscribe to this list, send email to
|
|
pgsql-hackers-request@PostgreSQL.org with a body of:
|
|
subscribe
|
|
end
|
|
|
|
Additional mailing lists and information about PostgreSQL can be found
|
|
via the PostgreSQL WWW home page at:
|
|
|
|
http://www.PostgreSQL.org
|
|
|
|
There is also an IRC channel on EFNet, channel #PostgreSQL. I use the
|
|
Unix command irc -c '#PostgreSQL' "$USER" irc.phoenix.net.
|
|
|
|
A list of commercial support companies is available at
|
|
http://www.postgresql.org/users-lounge/commercial-support.html.
|
|
|
|
1.7) What is the latest release?
|
|
|
|
The latest release of PostgreSQL is version 7.2.1.
|
|
|
|
We plan to have major releases every four months.
|
|
|
|
1.8) What documentation is available?
|
|
|
|
Several manuals, manual pages, and some small test examples are
|
|
included in the distribution. See the /doc directory. You can also
|
|
browse the manual online at
|
|
http://www.PostgreSQL.org/users-lounge/docs/.
|
|
|
|
There are two PostgreSQL books available online at
|
|
http://www.PostgreSQL.org/docs/awbook.html and
|
|
http://www.commandprompt.com/ppbook/. There is a list of PostgreSQL
|
|
books available for purchase at http://www.postgresql.org/books/.
|
|
There is also a collection of PostgreSQL technical articles at
|
|
http://techdocs.postgresql.org/.
|
|
|
|
psql has some nice \d commands to show information about types,
|
|
operators, functions, aggregates, etc.
|
|
|
|
Our web site contains even more documentation.
|
|
|
|
1.9) How do I find out about known bugs or missing features?
|
|
|
|
PostgreSQL supports an extended subset of SQL-92. See our TODO list
|
|
for known bugs, missing features, and future plans.
|
|
|
|
1.10) How can I learn SQL?
|
|
|
|
The PostgreSQL book at http://www.PostgreSQL.org/docs/awbook.html
|
|
teaches SQL. There is another PostgreSQL book at
|
|
http://www.commandprompt.com/ppbook. There is a nice tutorial at
|
|
http://www.intermedia.net/support/sql/sqltut.shtm, at
|
|
http://ourworld.compuserve.com/homepages/graeme_birchall/HTM_COOK.HTM,
|
|
and at http://sqlcourse.com.
|
|
|
|
Another one is "Teach Yourself SQL in 21 Days, Second Edition" at
|
|
http://members.tripod.com/er4ebus/sql/index.htm
|
|
|
|
Many of our users like The Practical SQL Handbook, Bowman, Judith S.,
|
|
et al., Addison-Wesley. Others like The Complete Reference SQL, Groff
|
|
et al., McGraw-Hill.
|
|
|
|
1.11) Is PostgreSQL Y2K compliant?
|
|
|
|
Yes, we easily handle dates past the year 2000 AD, and before 2000 BC.
|
|
|
|
1.12) How do I join the development team?
|
|
|
|
First, download the latest source and read the PostgreSQL Developers
|
|
documentation on our web site, or in the distribution. Second,
|
|
subscribe to the pgsql-hackers and pgsql-patches mailing lists. Third,
|
|
submit high quality patches to pgsql-patches.
|
|
|
|
There are about a dozen people who have commit privileges to the
|
|
PostgreSQL CVS archive. They each have submitted so many high-quality
|
|
patches that it was impossible for the existing committers to keep up,
|
|
and we had confidence that patches they committed were of high
|
|
quality.
|
|
|
|
1.13) How do I submit a bug report?
|
|
|
|
Please visit the PostgreSQL BugTool page, which gives guidelines and
|
|
directions on how to submit a bug.
|
|
|
|
Also check out our ftp site ftp://ftp.PostgreSQL.org/pub to see if
|
|
there is a more recent PostgreSQL version or patches.
|
|
|
|
1.14) How does PostgreSQL compare to other DBMSs?
|
|
|
|
There are several ways of measuring software: features, performance,
|
|
reliability, support, and price.
|
|
|
|
Features
|
|
PostgreSQL has most features present in large commercial DBMSs,
|
|
like transactions, subselects, triggers, views, foreign key
|
|
referential integrity, and sophisticated locking. We have some
|
|
features they do not have, like user-defined types,
|
|
inheritance, rules, and multi-version concurrency control to
|
|
reduce lock contention.
|
|
|
|
Performance
|
|
PostgreSQL runs in two modes. Normal fsync mode flushes every
|
|
completed transaction to disk, guaranteeing that if the OS
|
|
crashes or loses power in the next few seconds, all your data
|
|
is safely stored on disk. In this mode, we are slower than most
|
|
commercial databases, partly because few of them do such
|
|
conservative flushing to disk in their default modes. In
|
|
no-fsync mode, we are usually faster than commercial databases,
|
|
though in this mode, an OS crash could cause data corruption.
|
|
We are working to provide an intermediate mode that suffers
|
|
less performance overhead than full fsync mode, and will allow
|
|
data integrity within 30 seconds of an OS crash.
|
|
In comparison to MySQL or leaner database systems, we are
|
|
slower on inserts/updates because we have transaction overhead.
|
|
Of course, MySQL does not have any of the features mentioned in
|
|
the Features section above. We are built for flexibility and
|
|
features, though we continue to improve performance through
|
|
profiling and source code analysis. There is an interesting Web
|
|
page comparing PostgreSQL to MySQL at
|
|
http://openacs.org/why-not-mysql.html
|
|
We handle each user connection by creating a Unix process.
|
|
Backend processes share data buffers and locking information.
|
|
With multiple CPUs, multiple backends can easily run on
|
|
different CPUs.
|
|
|
|
Reliability
|
|
We realize that a DBMS must be reliable, or it is worthless. We
|
|
strive to release well-tested, stable code that has a minimum
|
|
of bugs. Each release has at least one month of beta testing,
|
|
and our release history shows that we can provide stable, solid
|
|
releases that are ready for production use. We believe we
|
|
compare favorably to other database software in this area.
|
|
|
|
Support
|
|
Our mailing list provides a large group of developers and users
|
|
to help resolve any problems encountered. While we can not
|
|
guarantee a fix, commercial DBMSs do not always supply a fix
|
|
either. Direct access to developers, the user community,
|
|
manuals, and the source code often make PostgreSQL support
|
|
superior to other DBMSs. There is commercial per-incident
|
|
support available for those who need it. (See support FAQ
|
|
item.)
|
|
|
|
Price
|
|
We are free for all use, both commercial and non-commercial.
|
|
You can add our code to your product with no limitations,
|
|
except those outlined in our BSD-style license stated above.
|
|
|
|
1.15) How can I financially assist PostgreSQL?
|
|
|
|
PostgreSQL has had a first-class infrastructure since we started six
|
|
years ago. This is all thanks to Marc Fournier, who has created and
|
|
managed this infrastructure over the years.
|
|
|
|
Quality infrastructure is very important to an open-source project. It
|
|
prevents disruptions that can greatly delay forward movement of the
|
|
project.
|
|
|
|
Of course, this infrastructure is not cheap. There are a variety of
|
|
monthly and one-time expenses that are required to keep it going. If
|
|
you or your company has money it can donate to help fund this effort,
|
|
please go to http://www.pgsql.com/pg_goodies and make a donation.
|
|
|
|
Although the web page mentions PostgreSQL, Inc, the "contributions"
|
|
item is solely to support the PostgreSQL project and does not fund any
|
|
specific company. If you prefer, you can also send a check to the
|
|
contact address.
|
|
_________________________________________________________________
|
|
|
|
User Client Questions
|
|
|
|
2.1) Are there ODBC drivers for PostgreSQL?
|
|
|
|
There are two ODBC drivers available, PsqlODBC and OpenLink ODBC.
|
|
|
|
PsqlODBC is included in the distribution. More information about it
|
|
can be gotten from ftp://ftp.PostgreSQL.org/pub/odbc/.
|
|
|
|
OpenLink ODBC can be gotten from http://www.openlinksw.com. It works
|
|
with their standard ODBC client software so you'll have PostgreSQL
|
|
ODBC available on every client platform they support (Win, Mac, Unix,
|
|
VMS).
|
|
|
|
They will probably be selling this product to people who need
|
|
commercial-quality support, but a freeware version will always be
|
|
available. Please send questions to postgres95@openlink.co.uk.
|
|
|
|
See also the ODBC chapter of the Programmer's Guide.
|
|
|
|
2.2) What tools are available for using PostgreSQL with Web pages?
|
|
|
|
A nice introduction to Database-backed Web pages can be seen at:
|
|
http://www.webreview.com
|
|
|
|
There is also one at http://www.phone.net/home/mwm/hotlist/.
|
|
|
|
For Web integration, PHP is an excellent interface. It is at
|
|
http://www.php.net.
|
|
|
|
For complex cases, many use the Perl interface and CGI.pm.
|
|
|
|
2.3) Does PostgreSQL have a graphical user interface? A report generator?
|
|
An embedded query language interface?
|
|
|
|
We have a nice graphical user interface called pgaccess, which is
|
|
shipped as part of the distribution. pgaccess also has a report
|
|
generator. The Web page is http://www.flex.ro/pgaccess
|
|
|
|
We also include ecpg, which is an embedded SQL query language
|
|
interface for C.
|
|
|
|
2.4) What languages are available to communicate with PostgreSQL?
|
|
|
|
We have:
|
|
* C (libpq)
|
|
* C++ (libpq++)
|
|
* Embedded C (ecpg)
|
|
* Java (jdbc)
|
|
* Perl (perl5)
|
|
* ODBC (odbc)
|
|
* Python (PyGreSQL)
|
|
* TCL (libpgtcl)
|
|
* C Easy API (libpgeasy)
|
|
* Embedded HTML (PHP from http://www.php.net)
|
|
_________________________________________________________________
|
|
|
|
Administrative Questions
|
|
|
|
3.1) How do I install PostgreSQL somewhere other than /usr/local/pgsql?
|
|
|
|
Specify the --prefix option when running configure.
|
|
|
|
3.2) When I start postmaster, I get a Bad System Call or core dumped
|
|
message. Why?
|
|
|
|
It could be a variety of problems, but first check to see that you
|
|
have System V extensions installed in your kernel. PostgreSQL requires
|
|
kernel support for shared memory and semaphores.
|
|
|
|
3.3) When I try to start postmaster, I get IpcMemoryCreate errors. Why?
|
|
|
|
You either do not have shared memory configured properly in your
|
|
kernel or you need to enlarge the shared memory available in the
|
|
kernel. The exact amount you need depends on your architecture and how
|
|
many buffers and backend processes you configure for postmaster. For
|
|
most systems, with default numbers of buffers and processes, you need
|
|
a minimum of ~1 MB. See the PostgreSQL Administrator's Guide for more
|
|
detailed information about shared memory and semaphores.
|
|
|
|
3.4) When I try to start postmaster, I get IpcSemaphoreCreate errors. Why?
|
|
|
|
If the error message is IpcSemaphoreCreate: semget failed (No space
|
|
left on device) then your kernel is not configured with enough
|
|
semaphores. Postgres needs one semaphore per potential backend
|
|
process. A temporary solution is to start postmaster with a smaller
|
|
limit on the number of backend processes. Use -N with a parameter less
|
|
than the default of 32. A more permanent solution is to increase your
|
|
kernel's SEMMNS and SEMMNI parameters.
|
|
|
|
Inoperative semaphores can also cause crashes during heavy database
|
|
access.
|
|
|
|
If the error message is something else, you might not have semaphore
|
|
support configured in your kernel at all. See the PostgreSQL
|
|
Administrator's Guide for more detailed information about shared
|
|
memory and semaphores.
|
|
|
|
3.5) How do I control connections from other hosts?
|
|
|
|
By default, PostgreSQL only allows connections from the local machine
|
|
using Unix domain sockets. Other machines will not be able to connect
|
|
unless you add the -i flag to postmaster, and enable host-based
|
|
authentication by modifying the file $PGDATA/pg_hba.conf accordingly.
|
|
This will allow TCP/IP connections.
|
|
|
|
3.6) How do I tune the database engine for better performance?
|
|
|
|
Certainly, indexes can speed up queries. The EXPLAIN command allows
|
|
you to see how PostgreSQL is interpreting your query, and which
|
|
indexes are being used.
|
|
|
|
If you are doing many INSERTs, consider doing them in a large batch
|
|
using the COPY command. This is much faster than individual INSERTS.
|
|
Second, statements not in a BEGIN WORK/COMMIT transaction block are
|
|
considered to be in their own transaction. Consider performing several
|
|
statements in a single transaction block. This reduces the transaction
|
|
overhead. Also, consider dropping and recreating indexes when making
|
|
large data changes.
|
|
|
|
There are several tuning options. You can disable fsync() by starting
|
|
postmaster with a -o -F option. This will prevent fsync()s from
|
|
flushing to disk after every transaction.
|
|
|
|
You can also use the postmaster -B option to increase the number of
|
|
shared memory buffers used by the backend processes. If you make this
|
|
parameter too high, the postmaster may not start because you have
|
|
exceeded your kernel's limit on shared memory space. Each buffer is 8K
|
|
and the default is 64 buffers.
|
|
|
|
You can also use the backend -S option to increase the maximum amount
|
|
of memory used by the backend process for temporary sorts. The -S
|
|
value is measured in kilobytes, and the default is 512 (i.e. 512K).
|
|
|
|
You can also use the CLUSTER command to group data in tables to match
|
|
an index. See the CLUSTER manual page for more details.
|
|
|
|
3.7) What debugging features are available?
|
|
|
|
PostgreSQL has several features that report status information that
|
|
can be valuable for debugging purposes.
|
|
|
|
First, by running configure with the --enable-cassert option, many
|
|
assert()s monitor the progress of the backend and halt the program
|
|
when something unexpected occurs.
|
|
|
|
Both postmaster and postgres have several debug options available.
|
|
First, whenever you start postmaster, make sure you send the standard
|
|
output and error to a log file, like:
|
|
cd /usr/local/pgsql
|
|
./bin/postmaster >server.log 2>&1 &
|
|
|
|
This will put a server.log file in the top-level PostgreSQL directory.
|
|
This file contains useful information about problems or errors
|
|
encountered by the server. Postmaster has a -d option that allows even
|
|
more detailed information to be reported. The -d option takes a number
|
|
that specifies the debug level. Be warned that high debug level values
|
|
generate large log files.
|
|
|
|
If postmaster is not running, you can actually run the postgres
|
|
backend from the command line, and type your SQL statement directly.
|
|
This is recommended only for debugging purposes. Note that a newline
|
|
terminates the query, not a semicolon. If you have compiled with
|
|
debugging symbols, you can use a debugger to see what is happening.
|
|
Because the backend was not started from postmaster, it is not running
|
|
in an identical environment and locking/backend interaction problems
|
|
may not be duplicated.
|
|
|
|
If postmaster is running, start psql in one window, then find the PID
|
|
of the postgres process used by psql. Use a debugger to attach to the
|
|
postgres PID. You can set breakpoints in the debugger and issue
|
|
queries from psql. If you are debugging postgres startup, you can set
|
|
PGOPTIONS="-W n", then start psql. This will cause startup to delay
|
|
for n seconds so you can attach to the process with the debugger, set
|
|
any breakpoints, and continue through the startup sequence.
|
|
|
|
The postgres program has -s, -A, and -t options that can be very
|
|
useful for debugging and performance measurements.
|
|
|
|
You can also compile with profiling to see what functions are taking
|
|
execution time. The backend profile files will be deposited in the
|
|
pgsql/data/base/dbname directory. The client profile file will be put
|
|
in the client's current directory.
|
|
|
|
3.8) Why do I get "Sorry, too many clients" when trying to connect?
|
|
|
|
You need to increase postmaster's limit on how many concurrent backend
|
|
processes it can start.
|
|
|
|
The default limit is 32 processes. You can increase it by restarting
|
|
postmaster with a suitable -N value or modifying postgresql.conf.
|
|
|
|
Note that if you make -N larger than 32, you must also increase -B
|
|
beyond its default of 64; -B must be at least twice -N, and probably
|
|
should be more than that for best performance. For large numbers of
|
|
backend processes, you are also likely to find that you need to
|
|
increase various Unix kernel configuration parameters. Things to check
|
|
include the maximum size of shared memory blocks, SHMMAX; the maximum
|
|
number of semaphores, SEMMNS and SEMMNI; the maximum number of
|
|
processes, NPROC; the maximum number of processes per user, MAXUPRC;
|
|
and the maximum number of open files, NFILE and NINODE. The reason
|
|
that PostgreSQL has a limit on the number of allowed backend processes
|
|
is so your system won't run out of resources.
|
|
|
|
In PostgreSQL versions prior to 6.5, the maximum number of backends
|
|
was 64, and changing it required a rebuild after altering the
|
|
MaxBackendId constant in include/storage/sinvaladt.h.
|
|
|
|
3.9) What are the pg_sorttempNNN.NN files in my database directory?
|
|
|
|
They are temporary files generated by the query executor. For example,
|
|
if a sort needs to be done to satisfy an ORDER BY, and the sort
|
|
requires more space than the backend's -S parameter allows, then
|
|
temporary files are created to hold the extra data.
|
|
|
|
The temporary files should be deleted automatically, but might not if
|
|
a backend crashes during a sort. If you have no backends running at
|
|
the time, it is safe to delete the pg_tempNNN.NN files.
|
|
_________________________________________________________________
|
|
|
|
Operational Questions
|
|
|
|
4.1) What is the difference between binary cursors and normal cursors?
|
|
|
|
See the DECLARE manual page for a description.
|
|
|
|
4.2) How do I SELECT only the first few rows of a query?
|
|
|
|
See the FETCH manual page, or use SELECT ... LIMIT....
|
|
|
|
The entire query may have to be evaluated, even if you only want the
|
|
first few rows. Consider a query that has an ORDER BY. If there is an
|
|
index that matches the ORDER BY, PostgreSQL may be able to evaluate
|
|
only the first few records requested, or the entire query may have to
|
|
be evaluated until the desired rows have been generated.
|
|
|
|
4.3) How do I get a list of tables or other things I can see in psql?
|
|
|
|
You can read the source code for psql in file
|
|
pgsql/src/bin/psql/describe.c. It contains SQL commands that generate
|
|
the output for psql's backslash commands. You can also start psql with
|
|
the -E option so it will print out the queries it uses to execute the
|
|
commands you give.
|
|
|
|
4.4) How do you remove a column from a table?
|
|
|
|
We do not support ALTER TABLE DROP COLUMN, but do this:
|
|
SELECT ... -- select all columns but the one you want to remove
|
|
INTO TABLE new_table
|
|
FROM old_table;
|
|
DROP TABLE old_table;
|
|
ALTER TABLE new_table RENAME TO old_table;
|
|
|
|
4.5) What is the maximum size for a row, a table, and a database?
|
|
|
|
These are the limits:
|
|
Maximum size for a database? unlimited (500 GB databases exist)
|
|
Maximum size for a table? 16 TB
|
|
Maximum size for a row? unlimited in 7.1 and later
|
|
Maximum size for a field? 1 GB in 7.1 and later
|
|
Maximum number of rows in a table? unlimited
|
|
Maximum number of columns in a table? 250-1600 depending on column types
|
|
Maximum number of indexes on a table? unlimited
|
|
|
|
Of course, these are not actually unlimited, but limited to available
|
|
disk space and memory/swap space. Performance may suffer when these
|
|
values get unusually large.
|
|
|
|
The maximum table size of 16 TB does not require large file support
|
|
from the operating system. Large tables are stored as multiple 1 GB
|
|
files so file system size limits are not important.
|
|
|
|
The maximum table size and maximum number of columns can be increased
|
|
if the default block size is increased to 32k.
|
|
|
|
4.6) How much database disk space is required to store data from a typical
|
|
text file?
|
|
|
|
A PostgreSQL database may require up to five times the disk space to
|
|
store data from a text file.
|
|
|
|
As an example, consider a file of 100,000 lines with an integer and
|
|
text description on each line. Suppose the text string avergages
|
|
twenty bytes in length. The flat file would be 2.8 MB. The size of the
|
|
PostgreSQL database file containing this data can be estimated as 6.4
|
|
MB:
|
|
36 bytes: each row header (approximate)
|
|
24 bytes: one int field and one text filed
|
|
+ 4 bytes: pointer on page to tuple
|
|
----------------------------------------
|
|
64 bytes per row
|
|
|
|
The data page size in PostgreSQL is 8192 bytes (8 KB), so:
|
|
|
|
8192 bytes per page
|
|
------------------- = 128 rows per database page (rounded down)
|
|
64 bytes per row
|
|
|
|
100000 data rows
|
|
-------------------- = 782 database pages (rounded up)
|
|
128 rows per page
|
|
|
|
782 database pages * 8192 bytes per page = 6,406,144 bytes (6.4 MB)
|
|
|
|
Indexes do not require as much overhead, but do contain the data that
|
|
is being indexed, so they can be large also.
|
|
|
|
4.7) How do I find out what tables or indexes are defined in the database?
|
|
|
|
psql has a variety of backslash commands to show such information. Use
|
|
\? to see them.
|
|
|
|
Also try the file pgsql/src/tutorial/syscat.source. It illustrates
|
|
many of the SELECTs needed to get information from the database system
|
|
tables.
|
|
|
|
4.8) My queries are slow or don't make use of the indexes. Why?
|
|
|
|
Indexes are not automatically used by every query. Indexes are only
|
|
used if the table is larger than a minimum size, and the query selects
|
|
only a small percentage of the rows in the table. This is because the
|
|
random disk access caused by an index scan is sometimes slower than a
|
|
straight read through the table, or sequential scan.
|
|
|
|
To determine if an index should be used, PostgreSQL must have
|
|
statistics about the table. These statistics are collected using
|
|
VACUUM ANALYZE, or simply ANALYZE. Using statistics, the optimizer
|
|
knows how many rows are in the table, and can better determine if
|
|
indexes should be used. Statistics are also valuable in determining
|
|
optimal join order and join methods. Statistics collection should be
|
|
performed periodically as the contents of the table change.
|
|
|
|
Indexes are normally not used for ORDER BY or to perform joins. A
|
|
sequential scan followed by an explicit sort is usually faster than an
|
|
index scan of a large table.
|
|
However, LIMIT combined with ORDER BY often will use an index because
|
|
only a small portion of the table is returned.
|
|
|
|
When using wild-card operators such as LIKE or ~, indexes can only be
|
|
used if the beginning of the search is anchored to the start of the
|
|
string. Therefore, to use indexes, LIKE patterns must not start with
|
|
%, and ~(regular expression) patterns must start with ^.
|
|
|
|
4.9) How do I see how the query optimizer is evaluating my query?
|
|
|
|
See the EXPLAIN manual page.
|
|
|
|
4.10) What is an R-tree index?
|
|
|
|
An R-tree index is used for indexing spatial data. A hash index can't
|
|
handle range searches. A B-tree index only handles range searches in a
|
|
single dimension. R-trees can handle multi-dimensional data. For
|
|
example, if an R-tree index can be built on an attribute of type
|
|
point, the system can more efficiently answer queries such as "select
|
|
all points within a bounding rectangle."
|
|
|
|
The canonical paper that describes the original R-tree design is:
|
|
|
|
Guttman, A. "R-trees: A Dynamic Index Structure for Spatial
|
|
Searching." Proceedings of the 1984 ACM SIGMOD Int'l Conf on Mgmt of
|
|
Data, 45-57.
|
|
|
|
You can also find this paper in Stonebraker's "Readings in Database
|
|
Systems".
|
|
|
|
Built-in R-trees can handle polygons and boxes. In theory, R-trees can
|
|
be extended to handle higher number of dimensions. In practice,
|
|
extending R-trees requires a bit of work and we don't currently have
|
|
any documentation on how to do it.
|
|
|
|
4.11) What is the Genetic Query Optimizer?
|
|
|
|
The GEQO module speeds query optimization when joining many tables by
|
|
means of a Genetic Algorithm (GA). It allows the handling of large
|
|
join queries through nonexhaustive search.
|
|
|
|
4.12) How do I perform regular expression searches and case-insensitive
|
|
regular expression searches? How do I use an index for case-insensitive
|
|
searches?
|
|
|
|
The ~ operator does regular expression matching, and ~* does
|
|
case-insensitive regular expression matching. The case-insensitive
|
|
variant of LIKE is called ILIKE in PostgreSQL 7.1 and later.
|
|
|
|
Case-insensitive equality comparisons are normally expressed as:
|
|
SELECT *
|
|
FROM tab
|
|
WHERE lower(col) = 'abc'
|
|
|
|
This will not use an standard index. However, if you create a
|
|
functional index, it will be used:
|
|
CREATE INDEX tabindex on tab (lower(col));
|
|
|
|
4.13) In a query, how do I detect if a field is NULL?
|
|
|
|
You test the column with IS NULL and IS NOT NULL.
|
|
|
|
4.14) What is the difference between the various character types?
|
|
|
|
Type Internal Name Notes
|
|
--------------------------------------------------
|
|
"char" char 1 character
|
|
CHAR(#) bpchar blank padded to the specified fixed length
|
|
VARCHAR(#) varchar size specifies maximum length, no padding
|
|
TEXT text no specific upper limit on length
|
|
BYTEA bytea variable-length byte array (null-byte safe)
|
|
|
|
You will see the internal name when examining system catalogs and in
|
|
some error messages.
|
|
|
|
The last four types above are "varlena" types (i.e., the first four
|
|
bytes on disk are the length, followed by the data). Thus the actual
|
|
space used is slightly greater than the declared size. However, these
|
|
data types are also subject to compression or being stored out-of-line
|
|
by TOAST, so the space on disk might also be less than expected.
|
|
|
|
CHAR() is best when storing strings that are usually the same length.
|
|
VARCHAR() is best when storing variable-length strings but it limits
|
|
how long a string can be. TEXT is for strings of unlimited length,
|
|
maximum 1 gigabyte. BYTEA is for storing binary data, particularly
|
|
values that include NULL bytes.
|
|
|
|
4.15.1) How do I create a serial/auto-incrementing field?
|
|
|
|
PostgreSQL supports a SERIAL data type. It auto-creates a sequence and
|
|
index on the column. For example, this:
|
|
CREATE TABLE person (
|
|
id SERIAL,
|
|
name TEXT
|
|
);
|
|
|
|
is automatically translated into this:
|
|
CREATE SEQUENCE person_id_seq;
|
|
CREATE TABLE person (
|
|
id INT4 NOT NULL DEFAULT nextval('person_id_seq'),
|
|
name TEXT
|
|
);
|
|
CREATE UNIQUE INDEX person_id_key ON person ( id );
|
|
|
|
See the create_sequence manual page for more information about
|
|
sequences. You can also use each row's OID field as a unique value.
|
|
However, if you need to dump and reload the database, you need to use
|
|
pg_dump's -o option or COPY WITH OIDS option to preserve the OIDs.
|
|
|
|
4.15.2) How do I get the value of a SERIAL insert?
|
|
|
|
One approach is to retrieve the next SERIAL value from the sequence
|
|
object with the nextval() function before inserting and then insert it
|
|
explicitly. Using the example table in 4.15.1, that might look like
|
|
this in Perl:
|
|
new_id = output of "SELECT nextval('person_id_seq')"
|
|
INSERT INTO person (id, name) VALUES (new_id, 'Blaise Pascal');
|
|
|
|
You would then also have the new value stored in new_id for use in
|
|
other queries (e.g., as a foreign key to the person table). Note that
|
|
the name of the automatically created SEQUENCE object will be named
|
|
<table>_<serialcolumn>_seq, where table and serialcolumn are the names
|
|
of your table and your SERIAL column, respectively.
|
|
|
|
Alternatively, you could retrieve the assigned SERIAL value with the
|
|
currval() function after it was inserted by default, e.g.,
|
|
INSERT INTO person (name) VALUES ('Blaise Pascal');
|
|
new_id = output of "SELECT currval('person_id_seq')";
|
|
|
|
Finally, you could use the OID returned from the INSERT statement to
|
|
look up the default value, though this is probably the least portable
|
|
approach. In Perl, using DBI with Edmund Mergl's DBD::Pg module, the
|
|
oid value is made available via $sth->{pg_oid_status} after
|
|
$sth->execute().
|
|
|
|
4.15.3) Don't currval() and nextval() lead to a race condition with other
|
|
users?
|
|
|
|
No. Currval() returns the current value assigned by your backend, not
|
|
by all users.
|
|
|
|
4.15.4) Why aren't my sequence numbers reused on transaction abort? Why are
|
|
there gaps in the numbering of my sequence/SERIAL column?
|
|
|
|
To improve concurrency, sequence values are given out to running
|
|
transactions as needed and are not locked until the transaction
|
|
completes. This causes gaps in numbering from aborted transactions.
|
|
|
|
4.16) What is an OID? What is a TID?
|
|
|
|
OIDs are PostgreSQL's answer to unique row ids. Every row that is
|
|
created in PostgreSQL gets a unique OID. All OIDs generated during
|
|
initdb are less than 16384 (from backend/access/transam.h). All
|
|
user-created OIDs are equal to or greater than this. By default, all
|
|
these OIDs are unique not only within a table or database, but unique
|
|
within the entire PostgreSQL installation.
|
|
|
|
PostgreSQL uses OIDs in its internal system tables to link rows
|
|
between tables. These OIDs can be used to identify specific user rows
|
|
and used in joins. It is recommended you use column type OID to store
|
|
OID values. You can create an index on the OID field for faster
|
|
access.
|
|
|
|
OIDs are assigned to all new rows from a central area that is used by
|
|
all databases. If you want to change the OID to something else, or if
|
|
you want to make a copy of the table, with the original OIDs, there is
|
|
no reason you can't do it:
|
|
CREATE TABLE new_table(old_oid oid, mycol int);
|
|
SELECT old_oid, mycol INTO new FROM old;
|
|
COPY new TO '/tmp/pgtable';
|
|
DELETE FROM new;
|
|
COPY new WITH OIDS FROM '/tmp/pgtable';
|
|
|
|
OIDs are stored as 4-byte integers, and will overflow at 4 billion. No
|
|
one has reported this ever happening, and we plan to have the limit
|
|
removed before anyone does.
|
|
|
|
TIDs are used to identify specific physical rows with block and offset
|
|
values. TIDs change after rows are modified or reloaded. They are used
|
|
by index entries to point to physical rows.
|
|
|
|
4.17) What is the meaning of some of the terms used in PostgreSQL?
|
|
|
|
Some of the source code and older documentation use terms that have
|
|
more common usage. Here are some:
|
|
* table, relation, class
|
|
* row, record, tuple
|
|
* column, field, attribute
|
|
* retrieve, select
|
|
* replace, update
|
|
* append, insert
|
|
* OID, serial value
|
|
* portal, cursor
|
|
* range variable, table name, table alias
|
|
|
|
A list of general database terms can be found at:
|
|
http://www.comptechnews.com/~reaster/dbdesign.html
|
|
|
|
4.18) Why do I get the error "ERROR: Memory exhausted in AllocSetAlloc()"?
|
|
|
|
If you are running a version older than 7.1, an upgrade may fix the
|
|
problem. Also it is possible you have run out of virtual memory on
|
|
your system, or your kernel has a low limit for certain resources. Try
|
|
this before starting postmaster:
|
|
ulimit -d 262144
|
|
limit datasize 256m
|
|
|
|
Depending on your shell, only one of these may succeed, but it will
|
|
set your process data segment limit much higher and perhaps allow the
|
|
query to complete. This command applies to the current process, and
|
|
all subprocesses created after the command is run. If you are having a
|
|
problem with the SQL client because the backend is returning too much
|
|
data, try it before starting the client.
|
|
|
|
4.19) How do I tell what PostgreSQL version I am running?
|
|
|
|
From psql, type select version();
|
|
|
|
4.20) Why does my large-object operations get "invalid large obj
|
|
descriptor"?
|
|
|
|
You need to put BEGIN WORK and COMMIT around any use of a large object
|
|
handle, that is, surrounding lo_open ... lo_close.
|
|
|
|
Currently PostgreSQL enforces the rule by closing large object handles
|
|
at transaction commit. So the first attempt to do anything with the
|
|
handle will draw invalid large obj descriptor. So code that used to
|
|
work (at least most of the time) will now generate that error message
|
|
if you fail to use a transaction.
|
|
|
|
If you are using a client interface like ODBC you may need to set
|
|
auto-commit off.
|
|
|
|
4.21) How do I create a column that will default to the current time?
|
|
|
|
Use CURRENT_TIMESTAMP:
|
|
CREATE TABLE test (x int, modtime timestamp DEFAULT CURRENT_TIMESTAMP );
|
|
|
|
4.22) Why are my subqueries using IN so slow?
|
|
|
|
Currently, we join subqueries to outer queries by sequentially
|
|
scanning the result of the subquery for each row of the outer query. A
|
|
workaround is to replace IN with EXISTS:
|
|
SELECT *
|
|
FROM tab
|
|
WHERE col1 IN (SELECT col2 FROM TAB2)
|
|
|
|
to:
|
|
SELECT *
|
|
FROM tab
|
|
WHERE EXISTS (SELECT col2 FROM TAB2 WHERE col1 = col2)
|
|
|
|
We hope to fix this limitation in a future release.
|
|
|
|
4.23) How do I perform an outer join?
|
|
|
|
PostgreSQL 7.1 and later supports outer joins using the SQL standard
|
|
syntax. Here are two examples:
|
|
SELECT *
|
|
FROM t1 LEFT OUTER JOIN t2 ON (t1.col = t2.col);
|
|
|
|
or
|
|
SELECT *
|
|
FROM t1 LEFT OUTER JOIN t2 USING (col);
|
|
|
|
These identical queries join t1.col to t2.col, and also return any
|
|
unjoined rows in t1 (those with no match in t2). A RIGHT join would
|
|
add unjoined rows of t2. A FULL join would return the matched rows
|
|
plus all unjoined rows from t1 and t2. The word OUTER is optional and
|
|
is assumed in LEFT, RIGHT, and FULL joins. Ordinary joins are called
|
|
INNER joins.
|
|
|
|
In previous releases, outer joins can be simulated using UNION and NOT
|
|
IN. For example, when joining tab1 and tab2, the following query does
|
|
an outer join of the two tables:
|
|
SELECT tab1.col1, tab2.col2
|
|
FROM tab1, tab2
|
|
WHERE tab1.col1 = tab2.col1
|
|
UNION ALL
|
|
SELECT tab1.col1, NULL
|
|
FROM tab1
|
|
WHERE tab1.col1 NOT IN (SELECT tab2.col1 FROM tab2)
|
|
ORDER BY col1
|
|
|
|
4.24) How do I perform queries using multiple databases?
|
|
|
|
There is no way to query any database except the current one. Because
|
|
PostgreSQL loads database-specific system catalogs, it is uncertain
|
|
how a cross-database query should even behave.
|
|
|
|
Of course, a client can make simultaneous connections to different
|
|
databases and merge the information that way.
|
|
_________________________________________________________________
|
|
|
|
Extending PostgreSQL
|
|
|
|
5.1) I wrote a user-defined function. When I run it in psql, why does it
|
|
dump core?
|
|
|
|
The problem could be a number of things. Try testing your user-defined
|
|
function in a stand-alone test program first.
|
|
|
|
5.2) How can I contribute some nifty new types and functions to PostgreSQL?
|
|
|
|
Send your extensions to the pgsql-hackers mailing list, and they will
|
|
eventually end up in the contrib/ subdirectory.
|
|
|
|
5.3) How do I write a C function to return a tuple?
|
|
|
|
This requires wizardry so extreme that the authors have never tried
|
|
it, though in principle it can be done.
|
|
|
|
5.4) I have changed a source file. Why does the recompile not see the
|
|
change?
|
|
|
|
The Makefiles do not have the proper dependencies for include files.
|
|
You have to do a make clean and then another make. If you are using
|
|
GCC you can use the --enable-depend option of configure to have the
|
|
compiler compute the dependencies automatically.
|