libpq: URI parsing fixes

Drop the special handling of a host component that contains slashes as
a Unix-domain socket directory.  Specify the socket directory as a
separate parameter or percent-encode it in the host component instead.
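
As an illustration of the percent-encoded form, the sketch below builds a
URI whose host component is an encoded socket directory.  It is not part
of this commit or of libpq: build_socket_uri and append_encoded are
hypothetical helper names, and encoding every byte outside the RFC 3986
"unreserved" set is an assumption chosen for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not libpq API): append s to out, percent-encoding
 * every byte that is not an RFC 3986 "unreserved" character.
 * Returns 0 on success, -1 if the output buffer is too small.
 */
static int
append_encoded(char *out, size_t outlen, size_t *pos, const char *s)
{
    static const char hex[] = "0123456789ABCDEF";
    const unsigned char *p;

    for (p = (const unsigned char *) s; *p; p++)
    {
        int     unreserved = (*p >= 'A' && *p <= 'Z') ||
                             (*p >= 'a' && *p <= 'z') ||
                             (*p >= '0' && *p <= '9') ||
                             *p == '-' || *p == '.' || *p == '_' || *p == '~';
        size_t  need = unreserved ? 1 : 3;

        if (*pos + need >= outlen)
            return -1;          /* no room for these bytes plus the NUL */
        if (unreserved)
            out[(*pos)++] = (char) *p;
        else
        {
            out[(*pos)++] = '%';
            out[(*pos)++] = hex[*p >> 4];
            out[(*pos)++] = hex[*p & 0x0F];
        }
    }
    out[*pos] = '\0';
    return 0;
}

/*
 * Hypothetical helper: write "postgresql://<encoded socket_dir>/<dbname>"
 * into out.  Returns 0 on success, -1 if out is too small.
 */
static int
build_socket_uri(const char *socket_dir, const char *dbname,
                 char *out, size_t outlen)
{
    static const char scheme[] = "postgresql://";
    size_t      pos = sizeof(scheme) - 1;

    assert(socket_dir != NULL && dbname != NULL && out != NULL);

    if (pos + 1 >= outlen)
        return -1;
    memcpy(out, scheme, pos);
    out[pos] = '\0';

    if (append_encoded(out, outlen, &pos, socket_dir) != 0)
        return -1;
    if (pos + 1 >= outlen)
        return -1;
    out[pos++] = '/';           /* separates host component from dbname */
    out[pos] = '\0';

    return append_encoded(out, outlen, &pos, dbname);
}
```

Called with "/var/lib/postgresql" and "dbname", the sketch produces the
same string as the percent-encoded example in the documentation hunk of
this commit.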

Allow omitting the username, password, and port even if the
corresponding designators are present in the URI.

Handle percent-encoding in query parameter keywords.
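
To make the keyword decoding concrete, here is a minimal stand-alone
sketch of a percent-decoder.  It is not the conninfo_uri_decode
implementation from this commit: percent_decode and hex_value are
hypothetical names, and rejecting %00 is an assumption made so the
result stays a valid C string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return the value of a hexadecimal digit, or -1 if c is not one. */
static int
hex_value(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/*
 * Hypothetical decoder: returns a malloc'd, percent-decoded copy of str,
 * or NULL on a malformed token, on %00, or on out of memory.
 * The caller must free the result.
 */
static char *
percent_decode(const char *str)
{
    char       *buf;
    char       *out;

    assert(str != NULL);

    /* Decoding never lengthens the string, so strlen(str) + 1 suffices. */
    buf = malloc(strlen(str) + 1);
    if (buf == NULL)
        return NULL;
    out = buf;

    while (*str)
    {
        int         hi;
        int         lo;

        if (*str != '%')
        {
            *out++ = *str++;
            continue;
        }

        /* Only look at str[2] once str[1] is known to be a hex digit. */
        hi = hex_value(str[1]);
        lo = (hi < 0) ? -1 : hex_value(str[2]);
        if (lo < 0 || (hi == 0 && lo == 0))
        {
            free(buf);
            return NULL;
        }
        *out++ = (char) (hi * 16 + lo);
        str += 3;
    }
    *out = '\0';
    return buf;
}
```

With such a decoder applied to the keyword before lookup, "u%73er"
resolves to "user" while "u%7aer" decodes to the unrecognized "uzer",
matching the expected regression output in this commit.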

Alex Shulgin

Some documentation improvements by me.
Peter Eisentraut 2012-05-28 22:44:34 +03:00
parent 388d251679
commit 2d612abd4d
4 changed files with 272 additions and 259 deletions


@@ -711,6 +711,124 @@ PGPing PQping(const char *conninfo);
</variablelist> </variablelist>
</para> </para>
<sect2 id="libpq-connstring">
<title>Connection Strings</title>
<indexterm zone="libpq-connstring">
<primary><literal>conninfo</literal></primary>
</indexterm>
<indexterm zone="libpq-connstring">
<primary><literal>URI</literal></primary>
</indexterm>
<para>
Several <application>libpq</> functions parse a user-specified string to obtain
connection parameters. There are two accepted formats for these strings:
plain <literal>keyword = value</literal> strings
and <ulink url="http://www.ietf.org/rfc/rfc3986.txt">RFC
3986</ulink> URIs.
</para>
<sect3>
<title>Keyword/Value Connection Strings</title>
<para>
In the first format, each parameter setting is in the form
<literal>keyword = value</literal>. Spaces around the equal sign are
optional. To write an empty value, or a value containing spaces, surround it
with single quotes, e.g., <literal>keyword = 'a value'</literal>. Single
quotes and backslashes within
the value must be escaped with a backslash, i.e., <literal>\'</literal> and
<literal>\\</literal>.
</para>
<para>
Example:
<programlisting>
host=localhost port=5432 dbname=mydb connect_timeout=10
</programlisting>
</para>
<para>
The recognized parameter key words are listed in <xref
linkend="libpq-paramkeywords">.
</para>
</sect3>
<sect3>
<title>Connection URIs</title>
<para>
The general form for a connection <acronym>URI</acronym> is:
<synopsis>
postgresql://[user[:password]@][netloc][:port][/dbname][?param1=value1&amp;...]
</synopsis>
</para>
<para>
The <acronym>URI</acronym> scheme designator can be either
<literal>postgresql://</literal> or <literal>postgres://</literal>. Each
of the <acronym>URI</acronym> parts is optional. The following examples
illustrate valid <acronym>URI</acronym> syntax uses:
<programlisting>
postgresql://
postgresql://localhost
postgresql://localhost:5433
postgresql://localhost/mydb
postgresql://user@localhost
postgresql://user:secret@localhost
postgresql://other@localhost/otherdb?connect_timeout=10&amp;application_name=myapp
</programlisting>
Components of the hierarchical part of the <acronym>URI</acronym> can also
be given as parameters. For example:
<programlisting>
postgresql:///mydb?host=localhost&amp;port=5433
</programlisting>
</para>
<para>
Percent-encoding may be used to include symbols with special meaning in any
of the <acronym>URI</acronym> parts.
</para>
<para>
Any connection parameters not corresponding to key words listed in <xref
linkend="libpq-paramkeywords"> are ignored and a warning message about them
is sent to <filename>stderr</filename>.
</para>
<para>
For improved compatibility with JDBC connection <acronym>URI</acronym>s,
instances of parameter <literal>ssl=true</literal> are translated into
<literal>sslmode=require</literal>.
</para>
<para>
The host part may be either hostname or an IP address. To specify an
IPv6 host address, enclose it in square brackets:
<synopsis>
postgresql://[2001:db8::1234]/database
</synopsis>
</para>
<para>
The host component is interpreted as described for the parameter <xref
linkend="libpq-connect-host">. In particular, a Unix-domain socket
connection is chosen if the host part is either empty or starts with a
slash, otherwise a TCP/IP connection is initiated. Note, however, that the
slash is a reserved character in the hierarchical part of the URI. So, to
specify a non-standard Unix-domain socket directory, either omit the host
specification in the URI and specify the host as a parameter, or
percent-encode the path in the host component of the URI:
<programlisting>
postgresql:///dbname?host=/var/lib/postgresql
postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
</programlisting>
</para>
</sect3>
</sect2>
<sect2 id="libpq-paramkeywords"> <sect2 id="libpq-paramkeywords">
<title>Parameter Key Words</title> <title>Parameter Key Words</title>
@@ -1220,107 +1338,6 @@ PGPing PQping(const char *conninfo);
</variablelist> </variablelist>
</para> </para>
</sect2> </sect2>
<sect2 id="libpq-connstring">
<title>Connection Strings</title>
<indexterm zone="libpq-connstring">
<primary><literal>conninfo</literal></primary>
</indexterm>
<indexterm zone="libpq-connstring">
<primary><literal>URI</literal></primary>
</indexterm>
<para>
Several <application>libpq</> functions parse a user-specified string to obtain
connection parameters. There are two accepted formats for these strings:
plain <literal>keyword = value</literal> strings, and URIs.
</para>
<para>
In the first format, each parameter setting is in the form
<literal>keyword = value</literal>. Spaces around the equal sign are
optional. To write an empty value, or a value containing spaces, surround it
with single quotes, e.g., <literal>keyword = 'a value'</literal>. Single
quotes and backslashes within
the value must be escaped with a backslash, i.e., <literal>\'</literal> and
<literal>\\</literal>.
</para>
<para>
The currently recognized parameter key words are listed in
<xref linkend="libpq-paramkeywords">.
</para>
<para>
The general form for connection <acronym>URI</acronym> is the
following:
<synopsis>
postgresql://[user[:password]@][unix-socket][:port[/dbname]][?param1=value1&amp;...]
postgresql://[user[:password]@][net-location][:port][/dbname][?param1=value1&amp;...]
</synopsis>
</para>
<para>
The <acronym>URI</acronym> designator can be either
<literal>postgresql://</literal> or <literal>postgres://</literal> and
each of the <acronym>URI</acronym> parts is optional. The following
examples illustrate valid <acronym>URI</acronym> syntax uses:
<synopsis>
postgresql://
postgresql://localhost
postgresql://localhost:5433
postgresql://localhost/mydb
postgresql://user@localhost
postgresql://user:secret@localhost
postgresql://other@localhost/otherdb
</synopsis>
</para>
<para>
Percent-encoding may be used to include a symbol with special meaning in
any of the <acronym>URI</acronym> parts.
</para>
<para>
Additional connection parameters may optionally follow the base <acronym>URI</acronym>.
Any connection parameters not corresponding to key words listed
in <xref linkend="libpq-paramkeywords"> are ignored and a warning message
about them is sent to <filename>stderr</filename>.
</para>
<para>
For improved compatibility with JDBC connection <acronym>URI</acronym>
syntax, instances of parameter <literal>ssl=true</literal> are translated
into <literal>sslmode=require</literal> (see above.)
</para>
<para>
The host part may be either hostname or an IP address. To specify an
IPv6 host address, enclose it in square brackets:
<synopsis>
postgresql://[2001:db8::1234]/database
</synopsis>
As a special case, a host part which starts with <symbol>/</symbol> is
treated as a local Unix socket directory to look for the connection
socket special file:
<synopsis>
postgresql:///path/to/pgsql/socket/dir
</synopsis>
The whole connection string up to the extra parameters designator
(<symbol>?</symbol>) or the port designator (<symbol>:</symbol>) is treated
as the absolute path to the socket directory
(<literal>/path/to/pgsql/socket/dir</literal> in this example.) To specify
a non-default database name in this case you can use either of the following
syntaxes:
<synopsis>
postgresql:///path/to/pgsql/socket/dir?dbname=otherdb
postgresql:///path/to/pgsql/socket/dir:5432/otherdb
</synopsis>
</para>
</sect2>
</sect1> </sect1>
<sect1 id="libpq-status"> <sect1 id="libpq-status">


@@ -4544,18 +4544,15 @@ conninfo_uri_parse(const char *uri, PQExpBuffer errorMessage,
* options from the URI. * options from the URI.
* If not successful, returns false and fills errorMessage accordingly. * If not successful, returns false and fills errorMessage accordingly.
* *
* Parses the connection URI string in 'uri' according to the URI syntax: * Parses the connection URI string in 'uri' according to the URI syntax (RFC
* 3986):
* *
* postgresql://[user[:pwd]@][unix-socket][:port[/dbname]][?param1=value1&...] * postgresql://[user[:password]@][netloc][:port][/dbname][?param1=value1&...]
* postgresql://[user[:pwd]@][net-location][:port][/dbname][?param1=value1&...]
* *
* "net-location" is a hostname, an IPv4 address, or an IPv6 address surrounded * where "netloc" is a hostname, an IPv4 address, or an IPv6 address surrounded
* by literal square brackets. To be recognized as a unix-domain socket, the * by literal square brackets.
* value must start with a slash '/'. Note slight inconsistency in that dbname
* can always be specified after net-location, but after unix-socket it can only
* be specified if there is a port specification.
* *
* Any of those elements might be percent-encoded (%xy). * Any of the URI parts might use percent-encoding (%xy).
*/ */
static bool static bool
conninfo_uri_parse_options(PQconninfoOption *options, const char *uri, conninfo_uri_parse_options(PQconninfoOption *options, const char *uri,
@@ -4566,6 +4563,8 @@ conninfo_uri_parse_options(PQconninfoOption *options, const char *uri,
char *buf = strdup(uri); /* need a modifiable copy of the input URI */ char *buf = strdup(uri); /* need a modifiable copy of the input URI */
char *start = buf; char *start = buf;
char prevchar = '\0'; char prevchar = '\0';
char *user = NULL;
char *host = NULL;
bool retval = false; bool retval = false;
if (buf == NULL) if (buf == NULL)
@@ -4593,8 +4592,6 @@ conninfo_uri_parse_options(PQconninfoOption *options, const char *uri,
++p; ++p;
if (*p == '@') if (*p == '@')
{ {
char *user;
/* /*
* Found username/password designator, so URI should be of the form * Found username/password designator, so URI should be of the form
* "scheme://user[:password]@[netloc]". * "scheme://user[:password]@[netloc]".
@@ -4609,14 +4606,8 @@ conninfo_uri_parse_options(PQconninfoOption *options, const char *uri,
prevchar = *p; prevchar = *p;
*p = '\0'; *p = '\0';
if (!*user) if (*user &&
{ !conninfo_storeval(options, "user", user,
printfPQExpBuffer(errorMessage,
libpq_gettext("invalid empty username specifier in URI: %s\n"),
uri);
goto cleanup;
}
if (!conninfo_storeval(options, "user", user,
errorMessage, false, true)) errorMessage, false, true))
goto cleanup; goto cleanup;
@@ -4628,15 +4619,8 @@ conninfo_uri_parse_options(PQconninfoOption *options, const char *uri,
++p; ++p;
*p = '\0'; *p = '\0';
if (!*password) if (*password &&
{ !conninfo_storeval(options, "password", password,
printfPQExpBuffer(errorMessage,
libpq_gettext("invalid empty password specifier in URI: %s\n"),
uri);
goto cleanup;
}
if (!conninfo_storeval(options, "password", password,
errorMessage, false, true)) errorMessage, false, true))
goto cleanup; goto cleanup;
} }
@@ -4656,30 +4640,7 @@ conninfo_uri_parse_options(PQconninfoOption *options, const char *uri,
* "p" has been incremented past optional URI credential information at * "p" has been incremented past optional URI credential information at
* this point and now points at the "netloc" part of the URI. * this point and now points at the "netloc" part of the URI.
* *
* Check for local unix socket dir. * Look for IPv6 address.
*/
if (*p == '/')
{
const char *socket = p;
/* Look for possible port specifier or query parameters */
while (*p && *p != ':' && *p != '?')
++p;
prevchar = *p;
*p = '\0';
if (!conninfo_storeval(options, "host", socket,
errorMessage, false, true))
goto cleanup;
}
else
{
/* Not a unix socket dir: parse as host name or address */
const char *host;
/*
*
* Look for IPv6 address
*/ */
if (*p == '[') if (*p == '[')
{ {
@@ -4733,10 +4694,11 @@ conninfo_uri_parse_options(PQconninfoOption *options, const char *uri,
prevchar = *p; prevchar = *p;
*p = '\0'; *p = '\0';
if (!conninfo_storeval(options, "host", host, if (*host &&
!conninfo_storeval(options, "host", host,
errorMessage, false, true)) errorMessage, false, true))
goto cleanup; goto cleanup;
}
if (prevchar == ':') if (prevchar == ':')
{ {
@@ -4748,14 +4710,8 @@ conninfo_uri_parse_options(PQconninfoOption *options, const char *uri,
prevchar = *p; prevchar = *p;
*p = '\0'; *p = '\0';
if (!*port) if (*port &&
{ !conninfo_storeval(options, "port", port,
printfPQExpBuffer(errorMessage,
libpq_gettext("missing port specifier in URI: %s\n"),
uri);
goto cleanup;
}
if (!conninfo_storeval(options, "port", port,
errorMessage, false, true)) errorMessage, false, true))
goto cleanup; goto cleanup;
} }
@@ -4813,9 +4769,10 @@ conninfo_uri_parse_params(char *params,
{ {
while (*params) while (*params)
{ {
const char *keyword = params; char *keyword = params;
const char *value = NULL; char *value = NULL;
char *p = params; char *p = params;
bool malloced = false;
/* /*
* Scan the params string for '=' and '&', marking the end of keyword * Scan the params string for '=' and '&', marking the end of keyword
@@ -4866,35 +4823,66 @@ conninfo_uri_parse_params(char *params,
++p; ++p;
} }
keyword = conninfo_uri_decode(keyword, errorMessage);
if (keyword == NULL)
{
/* conninfo_uri_decode already set an error message */
return false;
}
value = conninfo_uri_decode(value, errorMessage);
if (value == NULL)
{
/* conninfo_uri_decode already set an error message */
free(keyword);
return false;
}
malloced = true;
/* /*
* Special keyword handling for improved JDBC compatibility. Note * Special keyword handling for improved JDBC compatibility.
* we fail to detect URI-encoded values here, but we don't care.
*/ */
if (strcmp(keyword, "ssl") == 0 && if (strcmp(keyword, "ssl") == 0 &&
strcmp(value, "true") == 0) strcmp(value, "true") == 0)
{ {
free(keyword);
free(value);
malloced = false;
keyword = "sslmode"; keyword = "sslmode";
value = "require"; value = "require";
} }
/* /*
* Store the value if the corresponding option exists; ignore * Store the value if the corresponding option exists; ignore
* otherwise. * otherwise. At this point both keyword and value are not
* URI-encoded.
*/ */
if (!conninfo_storeval(connOptions, keyword, value, if (!conninfo_storeval(connOptions, keyword, value,
errorMessage, true, true)) errorMessage, true, false))
{ {
/* /*
* Check if there was a hard error when decoding or storing the * Check if there was a hard error when decoding or storing the
* option. * option.
*/ */
if (errorMessage->len != 0) if (errorMessage->len != 0)
{
if (malloced)
{
free(keyword);
free(value);
}
return false; return false;
}
fprintf(stderr, fprintf(stderr,
libpq_gettext("WARNING: ignoring unrecognized URI query parameter: %s\n"), libpq_gettext("WARNING: ignoring unrecognized URI query parameter: %s\n"),
keyword); keyword);
} }
if (malloced)
{
free(keyword);
free(value);
}
/* Proceed to next key=value pair */ /* Proceed to next key=value pair */
params = p; params = p;
@@ -5017,7 +5005,8 @@ conninfo_getval(PQconninfoOption *connOptions,
* Store a (new) value for an option corresponding to the keyword in * Store a (new) value for an option corresponding to the keyword in
* connOptions array. * connOptions array.
* *
* If uri_decode is true, keyword and value are URI-decoded. * If uri_decode is true, the value is URI-decoded. The keyword is always
* assumed to be non URI-encoded.
* *
* If successful, returns a pointer to the corresponding PQconninfoOption, * If successful, returns a pointer to the corresponding PQconninfoOption,
* which value is replaced with a strdup'd copy of the passed value string. * which value is replaced with a strdup'd copy of the passed value string.
@@ -5035,31 +5024,15 @@ conninfo_storeval(PQconninfoOption *connOptions,
{ {
PQconninfoOption *option; PQconninfoOption *option;
char *value_copy; char *value_copy;
char *keyword_copy = NULL;
/* option = conninfo_find(connOptions, keyword);
* Decode the keyword. XXX this is seldom necessary as keywords do not
* normally need URI-escaping. It'd be good to do away with the
* malloc/free overhead and the general ugliness, but I don't see a
* better way to handle it.
*/
if (uri_decode)
{
keyword_copy = conninfo_uri_decode(keyword, errorMessage);
if (keyword_copy == NULL)
/* conninfo_uri_decode already set an error message */
goto failed;
}
option = conninfo_find(connOptions,
keyword_copy != NULL ? keyword_copy : keyword);
if (option == NULL) if (option == NULL)
{ {
if (!ignoreMissing) if (!ignoreMissing)
printfPQExpBuffer(errorMessage, printfPQExpBuffer(errorMessage,
libpq_gettext("invalid connection option \"%s\"\n"), libpq_gettext("invalid connection option \"%s\"\n"),
keyword); keyword);
goto failed; return NULL;
} }
if (uri_decode) if (uri_decode)
@@ -5067,7 +5040,7 @@ conninfo_storeval(PQconninfoOption *connOptions,
value_copy = conninfo_uri_decode(value, errorMessage); value_copy = conninfo_uri_decode(value, errorMessage);
if (value_copy == NULL) if (value_copy == NULL)
/* conninfo_uri_decode already set an error message */ /* conninfo_uri_decode already set an error message */
goto failed; return NULL;
} }
else else
{ {
@@ -5076,7 +5049,7 @@ conninfo_storeval(PQconninfoOption *connOptions,
if (value_copy == NULL) if (value_copy == NULL)
{ {
printfPQExpBuffer(errorMessage, libpq_gettext("out of memory\n")); printfPQExpBuffer(errorMessage, libpq_gettext("out of memory\n"));
goto failed; return NULL;
} }
} }
@@ -5084,14 +5057,7 @@ conninfo_storeval(PQconninfoOption *connOptions,
free(option->val); free(option->val);
option->val = value_copy; option->val = value_copy;
if (keyword_copy != NULL)
free(keyword_copy);
return option; return option;
failed:
if (keyword_copy != NULL)
free(keyword_copy);
return NULL;
} }
/* /*


@@ -20,7 +20,7 @@ trying postgresql://uri-user@host/
user='uri-user' host='host' (inet) user='uri-user' host='host' (inet)
trying postgresql://uri-user@ trying postgresql://uri-user@
user='uri-user' host='' (local) user='uri-user' (local)
trying postgresql://host:12345/ trying postgresql://host:12345/
host='host' port='12345' (inet) host='host' port='12345' (inet)
@@ -38,10 +38,10 @@ trying postgresql://host
host='host' (inet) host='host' (inet)
trying postgresql:// trying postgresql://
host='' (local) (local)
trying postgresql://?hostaddr=127.0.0.1 trying postgresql://?hostaddr=127.0.0.1
host='' hostaddr='127.0.0.1' (inet) hostaddr='127.0.0.1' (inet)
trying postgresql://example.com?hostaddr=63.1.2.4 trying postgresql://example.com?hostaddr=63.1.2.4
host='example.com' hostaddr='63.1.2.4' (inet) host='example.com' hostaddr='63.1.2.4' (inet)
@@ -59,7 +59,7 @@ trying postgresql://host/db?u%73er=someotheruser&port=12345
user='someotheruser' dbname='db' host='host' port='12345' (inet) user='someotheruser' dbname='db' host='host' port='12345' (inet)
trying postgresql://host/db?u%7aer=someotheruser&port=12345 trying postgresql://host/db?u%7aer=someotheruser&port=12345
WARNING: ignoring unrecognized URI query parameter: u%7aer WARNING: ignoring unrecognized URI query parameter: uzer
dbname='db' host='host' port='12345' (inet) dbname='db' host='host' port='12345' (inet)
trying postgresql://host:12345?user=uri-user trying postgresql://host:12345?user=uri-user
@@ -87,10 +87,19 @@ trying postgresql://[::1]
host='::1' (inet) host='::1' (inet)
trying postgres:// trying postgres://
host='' (local) (local)
trying postgres:///tmp trying postgres:///
host='/tmp' (local) (local)
trying postgres:///db
dbname='db' (local)
trying postgres://uri-user@/db
user='uri-user' dbname='db' (local)
trying postgres://?host=/path/to/socket/dir
host='/path/to/socket/dir' (local)
trying postgresql://host?uzer= trying postgresql://host?uzer=
WARNING: ignoring unrecognized URI query parameter: uzer WARNING: ignoring unrecognized URI query parameter: uzer
@@ -145,19 +154,32 @@ uri-regress: invalid percent-encoded token: %
trying postgres://@host trying postgres://@host
uri-regress: invalid empty username specifier in URI: postgres://@host host='host' (inet)
trying postgres://host:/ trying postgres://host:/
uri-regress: missing port specifier in URI: postgres://host:/ host='host' (inet)
trying postgres://:12345/
port='12345' (local)
trying postgres://otheruser@/no/such/directory trying postgres://otheruser@?host=/no/such/directory
user='otheruser' host='/no/such/directory' (local) user='otheruser' host='/no/such/directory' (local)
trying postgres://otheruser@/no/such/socket/path:12345 trying postgres://otheruser@/?host=/no/such/directory
user='otheruser' host='/no/such/directory' (local)
trying postgres://otheruser@:12345?host=/no/such/socket/path
user='otheruser' host='/no/such/socket/path' port='12345' (local) user='otheruser' host='/no/such/socket/path' port='12345' (local)
trying postgres://otheruser@/path/to/socket:12345/db trying postgres://otheruser@:12345/db?host=/path/to/socket
user='otheruser' dbname='db' host='/path/to/socket' port='12345' (local) user='otheruser' dbname='db' host='/path/to/socket' port='12345' (local)
trying postgres://:12345/db?host=/path/to/socket
dbname='db' host='/path/to/socket' port='12345' (local)
trying postgres://:12345?host=/path/to/socket
host='/path/to/socket' port='12345' (local)
trying postgres://%2Fvar%2Flib%2Fpostgresql/dbname
dbname='dbname' host='/var/lib/postgresql' (local)


@@ -28,7 +28,10 @@ postgresql://[2001:db8::1234]/
postgresql://[200z:db8::1234]/ postgresql://[200z:db8::1234]/
postgresql://[::1] postgresql://[::1]
postgres:// postgres://
postgres:///tmp postgres:///
postgres:///db
postgres://uri-user@/db
postgres://?host=/path/to/socket/dir
postgresql://host?uzer= postgresql://host?uzer=
postgre:// postgre://
postgres://[::1 postgres://[::1
@@ -44,6 +47,11 @@ postgresql://%1
postgresql://% postgresql://%
postgres://@host postgres://@host
postgres://host:/ postgres://host:/
postgres://otheruser@/no/such/directory postgres://:12345/
postgres://otheruser@/no/such/socket/path:12345 postgres://otheruser@?host=/no/such/directory
postgres://otheruser@/path/to/socket:12345/db postgres://otheruser@/?host=/no/such/directory
postgres://otheruser@:12345?host=/no/such/socket/path
postgres://otheruser@:12345/db?host=/path/to/socket
postgres://:12345/db?host=/path/to/socket
postgres://:12345?host=/path/to/socket
postgres://%2Fvar%2Flib%2Fpostgresql/dbname