Document collation handling in SQL and plpgsql functions.

This is pretty minimal but covers the bare facts.
2011-03-25 18:21:25 -04:00 · 2011-03-25 18:21:25 -04:00 · 9b19c12e1d
commit 9b19c12e1d
parent a4425e3200
3 changed files with 132 additions and 0 deletions
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@ -316,6 +316,8 @@ initdb --locale=sv_SE
 <sect1 id="collation">
  <title>Collation Support</title>
  <indexterm zone="collation"><primary>collation</></>
  <para>
   The collation feature allows specifying the sort order and certain
   other locale aspects of data per-column, or even per-operation.
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@ -706,6 +706,81 @@ SELECT merge_fields(t.*) FROM table1 t WHERE ... ;
    structure on-the-fly.
   </para>
  </sect2>
  <sect2 id="plpgsql-declaration-collation">
   <title>Collation of <application>PL/pgSQL</application> Variables</title>
   <indexterm>
    <primary>collation</>
    <secondary>in PL/pgSQL</>
   </indexterm>
   <para>
    When a <application>PL/pgSQL</application> function has one or more
    parameters of collatable data types, a collation is identified for each
    function call depending on the collations assigned to the actual
    arguments, as described in <xref linkend="collation">.  If a collation is
    successfully identified (i.e., there are no conflicts of implicit
    collations among the arguments) then all the collatable parameters are
    treated as having that collation implicitly.  This will affect the
    behavior of collation-sensitive operations within the function.
    For example, consider
 <programlisting>
 CREATE FUNCTION less_than(a text, b text) RETURNS boolean AS $$
 BEGIN
    RETURN a &lt; b;
 END;
 $$ LANGUAGE plpgsql;
 SELECT less_than(text_field_1, text_field_2) FROM table1;
 SELECT less_than(text_field_1, text_field_2 COLLATE "C") FROM table1;
 </programlisting>
    The first use of <function>less_than</> will use the common collation
    of <structfield>text_field_1</> and <structfield>text_field_2</> for
    the comparison, while the second use will use <literal>C</> collation.
   </para>
   <para>
    Furthermore, the identified collation is also assumed as the collation of
    any local variables that are of collatable types.  Thus this function
    would not work any differently if it were written as
 <programlisting>
 CREATE FUNCTION less_than(a text, b text) RETURNS boolean AS $$
 DECLARE
    local_a text := a;
    local_b text := b;
 BEGIN
    RETURN local_a &lt; local_b;
 END;
 $$ LANGUAGE plpgsql;
 </programlisting>
   </para>
   <para>
    If there are no parameters of collatable data types, or no common
    collation can be identified for them, then parameters and local variables
    use the default collation of their data type (which is usually the
    database's default collation, but could be different for variables of
    domain types).
   </para>
   <para>
    Explicit <literal>COLLATE</> clauses can be written inside a function
    if it is desired to force a particular collation to be used regardless
    of what the function is called with.  For example,
 <programlisting>
 CREATE FUNCTION less_than_c(a text, b text) RETURNS boolean AS $$
 BEGIN
    RETURN a &lt; b COLLATE "C";
 END;
 $$ LANGUAGE plpgsql;
 </programlisting>
   </para>
  </sect2>
  </sect1>
  <sect1 id="plpgsql-expressions">
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@ -1100,6 +1100,61 @@ SELECT concat_values('|', 1, 4, 2);
 </screen>
    </para>
   </sect2>
   <sect2>
    <title><acronym>SQL</acronym> Functions with Collations</title>
    <indexterm>
     <primary>collation</>
     <secondary>in SQL functions</>
    </indexterm>
    <para>
     When a SQL function has one or more parameters of collatable data types,
     a collation is identified for each function call depending on the
     collations assigned to the actual arguments, as described in <xref
     linkend="collation">.  If a collation is successfully identified
     (i.e., there are no conflicts of implicit collations among the arguments)
     then all the collatable parameters are treated as having that collation
     implicitly.  This will affect the behavior of collation-sensitive
     operations within the function.  For example, using the
     <function>anyleast</> function described above, the result of
 <programlisting>
 SELECT anyleast('abc'::text, 'ABC');
 </programlisting>
     will depend on the database's default collation.  In <literal>C</> locale
     the result will be <literal>ABC</>, but in many other locales it will
     be <literal>abc</>.  The collation to use can be forced by adding
     a <literal>COLLATE</> clause to any of the arguments, for example
 <programlisting>
 SELECT anyleast('abc'::text, 'ABC' COLLATE "C");
 </programlisting>
     Alternatively, if you wish a function to operate with a particular
     collation regardless of what it is called with, insert
     <literal>COLLATE</> clauses as needed in the function definition.
     This version of <function>anyleast</> would always use <literal>en_US</>
     locale to compare strings:
 <programlisting>
 CREATE FUNCTION anyleast (VARIADIC anyarray) RETURNS anyelement AS $$
    SELECT min($1[i] COLLATE "en_US") FROM generate_subscripts($1, 1) g(i);
 $$ LANGUAGE SQL;
 </programlisting>
     But note that this will throw an error if applied to a non-collatable
     data type.
    </para>
    <para>
     If no common collation can be identified among the actual arguments,
     then a SQL function treats its parameters as having their data types'
     default collation (which is usually the database's default collation,
     but could be different for parameters of domain types).
    </para>
    <para>
     The behavior of collatable parameters can be thought of as a limited
     form of polymorphism, applicable only to textual data types.
    </para>
   </sect2>
  </sect1>
  <sect1 id="xfunc-overload">