Documentation update for Standard Collations.

Correct out-of-date text that said the "default" collation is always based on LC_COLLATE and LC_CTYPE. Also reformat into a list to make it easier to understand and compare the available collations, and briefly document the stability characteristics of each one.

Discussion: https://postgr.es/m/4a69d067374d2f6bfb66f5bfb2ab9a020493d49f.camel@j-davis.com
2024-03-02 13:37:43 -08:00 · 875e46a0a2
commit 875e46a0a2
parent 1e01374654
1 changed file with 45 additions and 27 deletions
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -788,37 +788,19 @@ SELECT * FROM test1 ORDER BY a || b COLLATE "fr_FR";
    <title>Standard Collations</title>
   <para>
-    On all platforms, the collations named <literal>default</literal>,
+    On all platforms, the following collations are supported:
    <literal>C</literal>, and <literal>POSIX</literal> are available.  Additional
    collations may be available depending on operating system support.
    The <literal>default</literal> collation selects the <symbol>LC_COLLATE</symbol>
    and <symbol>LC_CTYPE</symbol> values specified at database creation time.
    The <literal>C</literal> and <literal>POSIX</literal> collations both specify
    <quote>traditional C</quote> behavior, in which only the ASCII letters
    <quote><literal>A</literal></quote> through <quote><literal>Z</literal></quote>
    are treated as letters, and sorting is done strictly by character
    code byte values.
   </para>
   <note>
    <para>
     The <literal>C</literal> and <literal>POSIX</literal> locales may behave
     differently depending on the database encoding.
    </para>
   </note>
   <para>
    Additionally, two SQL standard collation names are available:
    <variablelist>
     <varlistentry>
      <term><literal>unicode</literal></term>
      <listitem>
       <para>
-        This collation sorts using the Unicode Collation Algorithm with the
+        This SQL standard collation sorts using the Unicode Collation
-        Default Unicode Collation Element Table.  It is available in all
+        Algorithm with the Default Unicode Collation Element Table.  It is
-        encodings.  ICU support is required to use this collation.  (This
+        available in all encodings.  ICU support is required to use this
-        collation has the same behavior as the ICU root locale; see <xref
+        collation, and behavior may change if Postgres is built with a
        different version of ICU.  (This collation has the same behavior as
        the ICU root locale; see <xref
        linkend="collation-managing-predefined-icu-und-x-icu"/>.)
       </para>
      </listitem>
@@ -828,15 +810,51 @@ SELECT * FROM test1 ORDER BY a || b COLLATE "fr_FR";
      <term><literal>ucs_basic</literal></term>
      <listitem>
       <para>
-        This collation sorts by Unicode code point.  It is only available for
+        This SQL standard collation sorts using the Unicode code point values
-        encoding <literal>UTF8</literal>.  (This collation has the same
+        rather than natural language order, and only the ASCII letters
        <quote><literal>A</literal></quote> through
        <quote><literal>Z</literal></quote> are treated as letters.  The
        behavior is efficient and stable across all versions.  Only available
        for encoding <literal>UTF8</literal>.  (This collation has the same
        behavior as the libc locale specification <literal>C</literal> in
        <literal>UTF8</literal> encoding.)
       </para>
      </listitem>
     </varlistentry>
     <varlistentry>
      <term><literal>C</literal> (equivalent to <literal>POSIX</literal>)</term>
      <listitem>
       <para>
        The <literal>C</literal> and <literal>POSIX</literal> collations are
        based on <quote>traditional C</quote> behavior.  They sort by byte
        values rather than natural language order, and only the ASCII letters
        <quote><literal>A</literal></quote> through
        <quote><literal>Z</literal></quote> are treated as letters.  The
        behavior is efficient and stable across all versions for a given
        database encoding, but behavior may vary between different database
        encodings.
       </para>
      </listitem>
     </varlistentry>
     <varlistentry>
      <term><literal>default</literal></term>
      <listitem>
       <para>
        The <literal>default</literal> collation selects the locale specified
        at database creation time.
       </para>
      </listitem>
     </varlistentry>
    </variablelist>
   </para>
   <para>
    Additional collations may be available depending on operating system
    support.  The efficiency and stability of these additional collations
    depend on the collation provider, the provider version, and the locale.
   </para>
  </sect3>
  <sect3 id="collation-managing-predefined">
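
Illustration (not part of the commit): the new text says that C/POSIX sort by byte values, that ucs_basic sorts by code point and matches C in UTF8, and that C/POSIX behavior may vary between database encodings. The sketch below models those three claims outside PostgreSQL. It is only an approximation under stated assumptions: Python codecs stand in for database encodings, the helper names sort_by_bytes and sort_by_code_point are hypothetical, and no PostgreSQL code is involved.

```python
# Rough model of the ordering rules described in the patched documentation.
# Assumption: Python's "utf-8" and "koi8_r" codecs stand in for the
# PostgreSQL database encodings UTF8 and KOI8R.

def sort_by_bytes(words, encoding):
    """Model of C/POSIX ordering: compare the encoded byte strings."""
    return sorted(words, key=lambda w: w.encode(encoding))

def sort_by_code_point(words):
    """Model of ucs_basic ordering: Python compares str values by code point."""
    return sorted(words)

# "B", "a", "z", and U+00E9 (e with acute accent).
latin = ["a", "B", "\u00e9", "z"]

# U+0434 (Cyrillic de) and U+0446 (Cyrillic tse).
cyrillic = ["\u0446", "\u0434"]

# In UTF-8, byte order and code point order agree.
latin_utf8 = sort_by_bytes(latin, "utf-8")
latin_code_point = sort_by_code_point(latin)
cyrillic_utf8 = sort_by_bytes(cyrillic, "utf-8")

# In KOI8-R, tse is byte 0xC3 and de is byte 0xC4, so the byte order of
# these two letters is the reverse of their code point order.
cyrillic_koi8r = sort_by_bytes(cyrillic, "koi8_r")

print(latin_utf8)
print(cyrillic_utf8)
print(cyrillic_koi8r)
```

In this model, uppercase "B" sorts before lowercase "a" because no natural language rules are applied, and the two Cyrillic letters swap places when the encoding changes from UTF-8 to KOI8-R, which is the kind of encoding-dependent variation the C/POSIX entry warns about.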