Doc updates for index-only scans.

Document that routine vacuuming is now also important for index-only
scans, and mention in the section describing the visibility map that
the map is used to implement them.

Marti Raudsepp, with some changes by me.
Robert Haas 2012-03-21 14:51:11 -04:00
parent f70f095c90
commit 5b9c1e6d52
2 changed files with 42 additions and 5 deletions

@@ -101,6 +101,11 @@
<productname>PostgreSQL</productname> query planner.</simpara>
</listitem>
<listitem>
<simpara>To update the visibility map, which speeds up index-only
scans.</simpara>
</listitem>
<listitem>
<simpara>To protect against loss of very old data due to
<firstterm>transaction ID wraparound</>.</simpara>
@@ -329,6 +334,33 @@
</tip>
</sect2>
<sect2 id="vacuum-for-visibility-map">
<title>Updating The Visibility Map</title>
<para>
Vacuum maintains a <link linkend="storage-vm">visibility map</> for each
table to keep track of which pages contain only tuples that are known to be
visible to all active transactions (and all future transactions, until the
page is again modified). This has two purposes. First, vacuum
itself can skip such pages on the next run, since there is nothing to
clean up.
</para>
<para>
Second, it allows <productname>PostgreSQL</productname> to answer some
queries using only the index, without reference to the underlying table.
Since <productname>PostgreSQL</productname> indexes don't contain tuple
visibility information, a normal index scan fetches the heap tuple for each
matching index entry, to check whether it should be seen by the current
transaction. An <firstterm>index-only scan</>, on the other hand, checks
the visibility map first. If it's known that all tuples on the page are
visible, the heap fetch can be skipped. This is most noticeable on
large data sets where the visibility map can prevent disk accesses.
The visibility map is vastly smaller than the heap, so it can easily be
cached even when the heap is very large.
</para>
</sect2>
<sect2 id="vacuum-for-wraparound">
<title>Preventing Transaction ID Wraparound Failures</title>

@@ -494,11 +494,16 @@ Note that indexes do not have VMs.
<para>
The visibility map simply stores one bit per heap page. A set bit means
that all tuples on the page are known to be visible to all transactions.
This means that the page does not contain any tuples that need to be vacuumed;
in future it might also be used to avoid visiting the page for visibility
checks. The map is conservative in the sense that we
make sure that whenever a bit is set, we know the condition is true, but if
a bit is not set, it might or might not be true.
This means that the page does not contain any tuples that need to be vacuumed.
This information can also be used by <firstterm>index-only scans</> to answer
queries using only the index tuple.
</para>
<para>
The map is conservative in the sense that we make sure that whenever a bit is
set, we know the condition is true, but if a bit is not set, it might or
might not be true. Visibility map bits are only set by vacuum, but are
cleared by any data-modifying operations on a page.
</para>
</sect1>
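
The behavior described in the documentation text above can be illustrated with a small toy model. This is only a sketch for intuition, not PostgreSQL's implementation: the names `VisibilityMap`, `vacuum_page`, `modify_page`, and `count_heap_fetches` are hypothetical, and the model assumes one boolean per heap page, that only vacuum sets a bit, and that any modification of a page clears it.

```python
class VisibilityMap:
    """Toy model: one bit per heap page (hypothetical, for illustration only).

    A set bit means every tuple on that page is known to be visible to all
    transactions. A clear bit means "unknown", not "not visible".
    """

    def __init__(self, n_pages):
        self.bits = [False] * n_pages

    def vacuum_page(self, page, all_visible):
        # Only vacuum sets a bit, and only after it has verified the condition,
        # which is what keeps the map conservative.
        if all_visible:
            self.bits[page] = True

    def modify_page(self, page):
        # Any data-modifying operation on a page clears its bit.
        self.bits[page] = False

    def is_all_visible(self, page):
        return self.bits[page]


def count_heap_fetches(index_entries, vm):
    """Count the heap fetches an index-only scan would still need.

    index_entries is a list of (key, heap_page) pairs. An entry needs a heap
    fetch only when the visibility map bit for its page is not set.
    """
    return sum(1 for _key, page in index_entries if not vm.is_all_visible(page))


if __name__ == "__main__":
    vm = VisibilityMap(n_pages=3)
    entries = [("a", 0), ("b", 1), ("c", 1), ("d", 2)]

    print(count_heap_fetches(entries, vm))  # before any vacuum

    vm.vacuum_page(0, all_visible=True)
    vm.vacuum_page(1, all_visible=True)
    vm.vacuum_page(2, all_visible=False)    # page 2 still has recent changes
    print(count_heap_fetches(entries, vm))  # after vacuum

    vm.modify_page(1)                       # an update touches page 1
    print(count_heap_fetches(entries, vm))  # after the modification
```

In this model, a table that is vacuumed regularly keeps most bits set, so most matching index entries can be answered without touching the heap. A heavily modified table keeps clearing bits and falls back to heap fetches.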