From 3a06a79cd137d294bc82d931769d778c8c9aec91 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera
Date: Thu, 15 Sep 2022 18:04:00 +0200
Subject: [PATCH] Copy-edit docs for logical replication column lists

There was excessive structure, leading to somewhat disorganized
presentation of the information. Remove a few tags and reorder
paragraphs to make the text flow more easily. Also, reword some of it
to be more concise.

The bit about column list combination is not modified, other than to
remove an uninteresting (and IMO confusing and wrong) paragraph; I
intend to deal with it differently afterwards.

Backpatch to 15.

Discussion: https://postgr.es/m/20220913121138.yn7ekkfysxzhkm2u@alvherre.pgsql
---
 doc/src/sgml/logical-replication.sgml    | 127 ++++++++---------------
 doc/src/sgml/ref/create_publication.sgml |   2 +-
 2 files changed, 47 insertions(+), 82 deletions(-)

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index 0ab191e402..1ae3287f22 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -1093,89 +1093,60 @@ test_sub=# SELECT * FROM child ORDER BY a;
 Column Lists
- By default, all columns of a published table will be replicated to the
- appropriate subscribers. The subscriber table must have at least all the
- columns of the published table. However, if a
- column list is specified then only the columns named
- in the list will be replicated. This means the subscriber-side table only
- needs to have those columns named by the column list. A user might choose to
- use column lists for behavioral, security or performance reasons.
+ Each publication can optionally specify which columns of each table are
+ replicated to subscribers. The table on the subscriber side must have at
+ least all the columns that are published. If no column list is specified,
+ then all columns in the publisher are replicated.
+ See for details on the syntax.
-
- Column List Rules
+
+ The choice of columns can be based on behavioral or performance reasons.
+ However, do not rely on this feature for security: a malicious subscriber
+ is able to obtain data from columns that are not specifically
+ published. If security is a consideration, protections can be applied
+ at the publisher side.
+
-
- A column list is specified per table following the table name, and enclosed
- by parentheses. See for details.
-
+
+ If no column list is specified, any columns added later are automatically
+ replicated. This means that having a column list which names all columns
+ is not the same as having no column list at all.
+
-
- When specifying a column list, the order of columns is not important. If no
- column list is specified, all columns of the table are replicated through
- this publication, including any columns added later. This means a column
- list which names all columns is not quite the same as having no column list
- at all. For example, if additional columns are added to the table then only
- those named columns mentioned in the column list will continue to be
- replicated.
-
+
+ A column list can contain only simple column references. The order
+ of columns in the list is not preserved.
+
-
- Column lists have no effect for TRUNCATE command.
-
+
+ For partitioned tables, the publication parameter
+ publish_via_partition_root determines which column list
+ is used. If publish_via_partition_root is
+ true, the root partitioned table's column list is used.
+ Otherwise, if publish_via_partition_root is
+ false (the default), each partition's column list is used.
+
-
+
+ If a publication publishes UPDATE or
+ DELETE operations, any column list must include the
+ table's replica identity columns (see
+ ).
+ If a publication publishes only INSERT operations, then
+ the column list may omit replica identity columns.
+
-
- Column List Restrictions
+
+ Column lists have no effect for the TRUNCATE command.
+
-
- A column list can contain only simple column references.
-
-
-
- If a publication publishes UPDATE or
- DELETE operations, any column list must include the
- table's replica identity columns (see
- ).
- If a publication publishes only INSERT operations, then
- the column list is arbitrary and may omit some replica identity columns.
-
-
-
-
-
- Partitioned Tables
-
-
- For partitioned tables, the publication parameter
- publish_via_partition_root determines which column list
- is used. If publish_via_partition_root is
- true, the root partitioned table's column list is used.
- Otherwise, if publish_via_partition_root is
- false (default), each partition's column list is used.
-
-
-
-
-
- Initial Data Synchronization
-
-
- If the subscription requires copying pre-existing table data and a
- publication specifies a column list, only data from those columns will be
- copied.
-
-
-
-
- If the subscriber is in a release prior to 15, copy pre-existing data
- doesn't use column lists even if they are defined in the publication.
- This is because old releases can only copy the entire table data.
-
-
-
-
+
+ During initial data synchronization, only the published columns are
+ copied. However, if the subscriber is from a release prior to 15, then
+ all the columns in the table are copied during initial data synchronization,
+ ignoring any column lists.
+
 Combining Multiple Column Lists
@@ -1193,12 +1164,6 @@ test_sub=# SELECT * FROM child ORDER BY a;
 ALTER SUBSCRIPTION ... DROP PUBLICATION and then add it back after adjusting the column list.
-
- Background: The main purpose of the column list feature is to allow
- statically different table shapes on publisher and subscriber, or hide
- sensitive column data. In both cases, it doesn't seem to make sense to
- combine column lists.
-
diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml
index f61641896a..0a68c4bf73 100644
--- a/doc/src/sgml/ref/create_publication.sgml
+++ b/doc/src/sgml/ref/create_publication.sgml
@@ -94,7 +94,7 @@ CREATE PUBLICATION name
 effect on TRUNCATE commands. See for details about column lists.
-
+
 Only persistent base tables and partitioned tables can be part of a