postgres

Dean Rasheed b5db1d93d2 Improve ANALYZE's strategy for finding MCVs.

Previously, a value was included in the MCV list if its frequency was
25% larger than the estimated average frequency of all nonnull values
in the table. For uniform distributions, that can lead to values
being included in the MCV list and significantly overestimated on the
basis of relatively few (sometimes just 2) instances being seen in the
sample. For non-uniform distributions, it can lead to too few values
being included in the MCV list, since the overall average frequency
may be dominated by a small number of very common values, while the
remaining values may still have a large spread of frequencies, causing
both substantial overestimation and underestimation of the remaining
values. Furthermore, increasing the statistics target may have little
effect because the overall average frequency will remain relatively
unchanged.

Instead, populate the MCV list with the largest set of common values
that are statistically significantly more common than the average
frequency of the remaining values. This takes into account the
variance of the sample counts, which depends on the counts themselves
and on the proportion of the table that was sampled. As a result, it
constrains the relative standard error of estimates based on the
frequencies of values in the list, reducing the chances of too many
values being included. At the same time, it allows more values to be
included, since the MCVs need only be more common than the remaining
non-MCVs, rather than the overall average. Thus it tends to produce
fewer MCVs than the previous code for uniform distributions, and more
for non-uniform distributions, reducing estimation errors in both
cases. In addition, the algorithm responds better to increasing the
statistics target, allowing more values to be included in the MCV list
when more of the table is sampled.

Jeff Janes, substantially modified by me. Reviewed by John Naylor and
Tomas Vondra.

Discussion: https://postgr.es/m/CAMkU=1yvdGvW9TmiLAhz2erFnvnPFYHbOZuO+a=4DVkzpuQ2tw@mail.gmail.com

2018-03-22 09:37:36 +00:00
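
The selection rule described in the commit message above can be sketched in C roughly as follows. This is a simplified illustration, not the code from the commit: the function name `choose_mcv_count`, its parameters, and the specific significance test (a margin of two standard deviations plus a 0.5 continuity correction, with the variance of sampling without replacement) are assumptions made for the example. It starts from the full candidate list, sorted by descending sample count, and drops the least common value until the last remaining one is clearly more common than the average of the values outside the list.

```c
#include <assert.h>

/*
 * Hypothetical sketch, not PostgreSQL's actual implementation.
 *
 * counts[]   sample counts of candidate values, sorted in descending order
 * ncand      number of candidates
 * ndistinct  estimated number of distinct non-null values in the table
 * nullfrac   fraction of sampled rows that were NULL
 * samplerows number of rows in the sample
 * totalrows  estimated number of rows in the table
 *
 * Returns how many leading candidates to keep in the MCV list.
 */
int
choose_mcv_count(const int *counts, int ncand, double ndistinct,
                 double nullfrac, int samplerows, double totalrows)
{
    double N = totalrows;
    double n = samplerows;
    double sum_kept = 0.0;   /* total count of all but the last candidate */
    int    nkeep = ncand;
    int    i;

    assert(samplerows > 0 && totalrows >= samplerows);

    /* If the whole table was sampled the counts are exact: keep everything. */
    if (n >= N || N <= 1.0)
        return ncand;

    for (i = 0; i < ncand - 1; i++)
        sum_kept += counts[i];

    while (nkeep > 0)
    {
        double c = counts[nkeep - 1];
        /* Frequency this value would be assigned if it were NOT in the list:
         * the leftover fraction spread evenly over the remaining values. */
        double rest_frac = 1.0 - sum_kept / n - nullfrac;
        double rest_distinct = ndistinct - (nkeep - 1);
        double K, variance, margin;

        if (rest_frac < 0.0)
            rest_frac = 0.0;
        if (rest_distinct > 1.0)
            rest_frac /= rest_distinct;

        /* Variance of the sample count when drawing n of N rows without
         * replacement, assuming K rows in the table hold this value. */
        K = N * c / n;
        variance = n * K * (N - K) * (N - n) / (N * N * (N - 1.0));

        /* Keep if c > rest_frac * n + 2 * stddev + 0.5.  The comparison is
         * done on squares so that no sqrt() call is needed. */
        margin = c - rest_frac * n - 0.5;
        if (margin > 0.0 && margin * margin > 4.0 * variance)
            break;

        /* Not significant: drop it and test the next least common value. */
        nkeep--;
        if (nkeep > 0)
            sum_kept -= counts[nkeep - 1];
    }
    return nkeep;
}
```

With a 100-row sample of a 100,000-row table and 1,000 estimated distinct values, this sketch keeps no values when every candidate was seen only twice, and keeps the two dominant values when the counts are 50, 20 and 2. That matches the behaviour the commit message describes for uniform and non-uniform distributions.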

config

Add configure infrastructure (--with-llvm) to enable LLVM support.

2018-03-20 17:26:25 -07:00

contrib

Handle heap rewrites even better in logical decoding

2018-03-21 09:15:04 -04:00

doc

Add general purpose hashing functions to pgbench.

2018-03-21 18:01:23 +03:00

src

Improve ANALYZE's strategy for finding MCVs.

2018-03-22 09:37:36 +00:00

.dir-locals.el

emacs: Set indent-tabs-mode in perl-mode

2015-04-12 23:53:23 -04:00

.gitattributes

Remove contrib/tsearch2.

2017-02-13 11:06:11 -05:00

.gitignore

Add lcov --initial

2017-09-29 08:54:34 -04:00

aclocal.m4

Add configure infrastructure (--with-llvm) to enable LLVM support.

2018-03-20 17:26:25 -07:00

configure

Fix typo in BITCODE_CXXFLAGS assignment.

2018-03-21 18:41:08 -07:00

configure.in

Fix typo in BITCODE_CXXFLAGS assignment.

2018-03-21 18:41:08 -07:00

Update copyright for 2018

2018-01-02 23:30:12 -05:00

GNUmakefile.in

Have "make coverage" recurse into contrib as well

2016-09-05 18:44:36 -03:00

HISTORY

Change documentation references to PG website to use https: not http:

2017-05-20 21:50:47 -04:00

Makefile

Fix non-GNU makefiles for AIX make.

2017-11-30 00:57:22 -08:00

README

Change documentation references to PG website to use https: not http:

2017-05-20 21:50:47 -04:00

README.git

Change documentation references to PG website to use https: not http:

2017-05-20 21:50:47 -04:00

README

PostgreSQL Database Management System
=====================================

This directory contains the source code distribution of the PostgreSQL
database management system.

PostgreSQL is an advanced object-relational database management system
that supports an extended subset of the SQL standard, including
transactions, foreign keys, subqueries, triggers, user-defined types
and functions.  This distribution also contains C language bindings.

PostgreSQL has many language interfaces, many of which are listed here:

	https://www.postgresql.org/download

See the file INSTALL for instructions on how to build and install
PostgreSQL.  That file also lists supported operating systems and
hardware platforms and contains information regarding any other
software packages that are required to build or run the PostgreSQL
system.  Copyright and license information can be found in the
file COPYRIGHT.  A comprehensive documentation set is included in this
distribution; it can be read as described in the installation
instructions.

The latest version of this software may be obtained at
https://www.postgresql.org/download/.  For more information look at our
web site located at https://www.postgresql.org/.

Languages

C 85.7%

PLpgSQL 5.8%

Perl 4.1%

Yacc 1.3%

Makefile 0.7%

Other 2.3%