From 1b5617eb844cd2470a334c1d2eec66cf9b39c41a Mon Sep 17 00:00:00 2001
From: Alvaro Herrera
Date: Fri, 14 May 2021 13:10:52 -0400
Subject: [PATCH] Describe (auto-)analyze behavior for partitioned tables
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This explains the new behavior introduced by 0827e8af70f4 as well as preexisting.

Author: Justin Pryzby
Author: Álvaro Herrera
Discussion: https://postgr.es/m/20210423180152.GA17270@telsasoft.com
---
 doc/src/sgml/maintenance.sgml    |  6 +++++
 doc/src/sgml/perform.sgml        |  3 ++-
 doc/src/sgml/ref/analyze.sgml    | 40 +++++++++++++++++++++++---------
 doc/src/sgml/ref/pg_restore.sgml |  6 +++--
 4 files changed, 41 insertions(+), 14 deletions(-)

diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml
index de7fd75e1c..4b535809b6 100644
--- a/doc/src/sgml/maintenance.sgml
+++ b/doc/src/sgml/maintenance.sgml
@@ -817,6 +817,12 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
 is compared to the total number of tuples inserted, updated, or deleted since the last ANALYZE.
+ For partitioned tables, inserts, updates and deletes on partitions
+ are counted towards this threshold; however, DDL
+ operations such as ATTACH, DETACH
+ and DROP are not, so running a manual
+ ANALYZE is recommended if the partition added or
+ removed contains a statistically significant volume of data.

diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
index 89ff58338e..ddd6c3ff3e 100644
--- a/doc/src/sgml/perform.sgml
+++ b/doc/src/sgml/perform.sgml
@@ -1767,7 +1767,8 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
 Whenever you have significantly altered the distribution of data within a table, running ANALYZE is strongly recommended. This
- includes bulk loading large amounts of data into the table. Running
+ includes bulk loading large amounts of data into the table as well as
+ attaching, detaching or dropping partitions. Running
 ANALYZE (or VACUUM ANALYZE) ensures that the planner has up-to-date statistics about the table. With no statistics or obsolete statistics, the planner might

diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index c8fcebc161..0879004b84 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -250,20 +250,38 @@ ANALYZE [ VERBOSE ] [ table_and_columns
- If the table being analyzed has one or more children,
- ANALYZE will gather statistics twice: once on the
- rows of the parent table only, and a second time on the rows of the
- parent table with all of its children. This second set of statistics
- is needed when planning queries that traverse the entire inheritance
- tree. The autovacuum daemon, however, will only consider inserts or
- updates on the parent table itself when deciding whether to trigger an
- automatic analyze for that table. If that table is rarely inserted into
- or updated, the inheritance statistics will not be up to date unless you
- run ANALYZE manually.
+ If the table being analyzed is partitioned, ANALYZE
+ will gather statistics by sampling blocks randomly from its partitions;
+ in addition, it will recurse into each partition and update its statistics.
+ (However, in multi-level partitioning scenarios, each leaf partition
+ will only be analyzed once.)
+ By contrast, if the table being analyzed has inheritance children,
+ ANALYZE will gather statistics for it twice:
+ once on the rows of the parent table only, and a second time on the
+ rows of the parent table with all of its children. This second set of
+ statistics is needed when planning queries that traverse the entire
+ inheritance tree. The child tables themselves are not individually
+ analyzed in this case.
- If any of the child tables are foreign tables whose foreign data wrappers
+ The autovacuum daemon counts inserts, updates and deletes in the
+ partitions to determine if auto-analyze is needed. However, adding
+ or removing partitions does not affect autovacuum daemon decisions,
+ so triggering a manual ANALYZE is recommended
+ when this occurs.
+
+
+
+ Tuples changed in inheritance children do not count towards analyze
+ on the parent table. If the parent table is empty or rarely modified,
+ it may never be processed by autovacuum. It's necessary to
+ periodically run a manual ANALYZE to keep the
+ statistics of the table hierarchy up to date.
+
+
+
+ If any of the child tables or partitions are foreign tables whose foreign data wrappers
 do not support ANALYZE, those child tables are ignored while gathering inheritance statistics.

diff --git a/doc/src/sgml/ref/pg_restore.sgml b/doc/src/sgml/ref/pg_restore.sgml
index 93ea937ac8..35cd56297c 100644
--- a/doc/src/sgml/ref/pg_restore.sgml
+++ b/doc/src/sgml/ref/pg_restore.sgml
@@ -922,8 +922,10 @@ CREATE DATABASE foo WITH TEMPLATE template0;
 Once restored, it is wise to run ANALYZE on each
- restored table so the optimizer has useful statistics; see
- and
+ restored table so the optimizer has useful statistics.
+ If the table is a partition or an inheritance child, it may also be useful
+ to analyze the parent to update statistics for the table hierarchy.
+ See and
 for more information.