diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml index f1f5b1caa3..0d288257cc 100644 --- a/doc/src/sgml/protocol.sgml +++ b/doc/src/sgml/protocol.sgml @@ -1,16 +1,8 @@ - + Frontend/Backend Protocol - - - This is currently a DRAFT description of FE/BE protocol - version 3.0. Details are still subject to change. - In particular, the representation of binary data is still under debate. - - - PostgreSQL uses a message-based protocol for communication between frontends and backends (clients and servers). @@ -139,8 +131,8 @@ lack specific values for parameters. A portal represents a ready-to-execute or already-partially-executed statement, with any missing parameter values filled in. (For SELECT statements, - a portal is equivalent to an open cursor, but we use a different term - since cursors don't handle non-SELECT statements.) + a portal is equivalent to an open cursor, but we choose to use a different + term since cursors don't handle non-SELECT statements.) @@ -167,6 +159,39 @@ of multiple uses. + + + Formats and Format Codes + + + Data of a particular datatype might be transmitted in any of several + different formats. As of PostgreSQL 7.4 + the only supported formats are text and binary, + but the protocol makes provision for future extensions. The desired + format for any value is specified by a format code. + Clients may specify a format code for each transmitted parameter value + and for each column of a query result. Text has format code zero, + binary has format code one, and all other format codes are reserved + for future definition. + + + + The text representation of values is whatever strings are produced + and accepted by the input/output conversion functions for the + particular datatype. In the transmitted representation, there is + no trailing null character; the frontend must add one to received + values if it wants to process them as C strings. + (The text format does not allow embedded nulls, by the way.) 
+ + + + Binary representations for integers use network byte order (most + significant byte first). For other datatypes consult the documentation + or source code to learn about the binary representation. Keep in mind + that binary representations for complex datatypes may change across + server versions; the text format is usually the more portable choice. + + @@ -174,7 +199,9 @@ This section describes the message flow and the semantics of each - message type. There are several different sub-protocols + message type. (Details of the exact representation of each message + appear in .) + There are several different sub-protocols depending on the state of the connection: start-up, query, function call, COPY, and termination. There are also special provisions for asynchronous operations (including @@ -189,7 +216,9 @@ To begin a session, a frontend opens a connection to the server and sends a startup message. This message includes the names of the user and of the database the user wants to connect to; it also identifies the particular - protocol version to be used. The server then uses this information and + protocol version to be used. (Optionally, the startup message can include + additional settings for run-time parameters.) + The server then uses this information and the contents of its configuration files (such as pg_hba.conf) to determine whether the connection is provisionally acceptable, and what additional @@ -333,6 +362,13 @@ some ParameterStatus messages, BackendKeyData, and finally ReadyForQuery. + + During this phase the backend will attempt to apply any additional + run-time parameter settings that were given in the startup message. + If successful, these values become session defaults. An error causes + ErrorResponse and exit. + + The possible messages from the backend in this phase are: @@ -410,7 +446,9 @@ A simple query cycle is initiated by the frontend sending a Query message - to the backend. 
The backend then sends one or more response + to the backend. The message includes an SQL command (or commands) + expressed as a text string. + The backend then sends one or more response messages depending on the contents of the query command string, and finally a ReadyForQuery response message. ReadyForQuery informs the frontend that it may safely send a new command. @@ -459,14 +497,23 @@ Indicates that rows are about to be returned in response to a SELECT, FETCH, etc query. - The message contents describe the layout of the rows. This - will be followed by a DataRow or BinaryRow message (depending on - whether a binary cursor was specified) for each row being returned + The contents of this message describe the column layout of the rows. + This will be followed by a DataRow message for each row being returned to the frontend. + + DataRow + + + One of the set of rows returned by + a SELECT, FETCH, etc query. + + + + EmptyQueryResponse @@ -517,7 +564,7 @@ The response to a SELECT query (or other queries that return rowsets, such as EXPLAIN or SHOW) normally consists of RowDescription, zero or more - DataRow or BinaryRow messages, and then CommandComplete. + DataRow messages, and then CommandComplete. COPY to or from the frontend invokes special protocol as described in . All other query types normally produce only @@ -545,6 +592,14 @@ individual query. + + In simple Query mode, the format of retrieved values is always text, + except when the given command is a FETCH from a cursor + declared with the BINARY option. In that case, the + retrieved values are in binary format. The format codes given in + the RowDescription message tell which format is being used. 
+ + A frontend must be prepared to accept ErrorResponse and NoticeResponse messages whenever it is expecting any other type of @@ -612,10 +667,22 @@ statement (empty string denotes the unnamed prepared statement), the name of the destination portal (empty string denotes the unnamed portal), and the values to use for any parameter placeholders present in the prepared - statement. The response is either BindComplete or ErrorResponse. The + statement. The supplied parameter set must match those needed by the prepared statement. + Bind also specifies the format to use for any data returned + by the query; the format can be specified overall, or per-column. + The response is either BindComplete or ErrorResponse. + + + The choice between text and binary output is determined by the format + codes given in Bind, regardless of the SQL command involved. The + BINARY attribute in cursor declarations is irrelevant when + using extended query protocol. + + + If successfully created, a named portal object lasts till the end of the current transaction, unless explicitly destroyed. An unnamed portal is @@ -631,17 +698,15 @@ Once a portal exists, it can be executed using an Execute message. The Execute message specifies the portal name (empty string denotes the - unnamed portal), the desired output format (text or binary), and + unnamed portal) and a maximum result-row count (zero meaning fetch all rows). - The output format and result-row count are only meaningful for portals - containing commands that return rowsets; they are ignored for other types - of commands. The possible + The result-row count is only meaningful for portals + containing commands that return rowsets; in other cases the command is + always executed to completion, and the row count is ignored. + The possible responses to Execute are the same as those described above for queries issued via simple query protocol, except that Execute doesn't cause - ReadyForQuery to be issued. 
Also, the choice between text and binary - output (DataRow or BinaryRow messages) is determined by Execute's - format field, regardless of the command; the BINARY attribute - in cursor declarations is irrelevant when using this protocol. + ReadyForQuery to be issued. @@ -698,10 +763,13 @@ The Describe message (statement variant) specifies the name of an existing prepared statement (or an empty string for the unnamed prepared statement). The response is a ParameterDescription message describing the - parameters needed by the statement (if any), followed by a RowDescription - message describing the rows that will be returned when the statement is - eventually executed (or NoData if the statement will not return rows). - ErrorResponse is issued if there is no such prepared statement. + parameters needed by the statement, followed by a RowDescription message + describing the rows that will be returned when the statement is eventually + executed (or a NoData message if the statement will not return rows). + ErrorResponse is issued if there is no such prepared statement. Note that + since Bind has not yet been issued, the formats to be used for returned + columns are not yet known to the backend; the format code fields in the + RowDescription message will be zeroes in this case. @@ -748,6 +816,13 @@ Function Call + + The Function Call sub-protocol allows the client to request a direct + call of any function that exists in the database's + pg_proc system catalog. The client must have + execute permission for the function. + + The Function Call sub-protocol is a legacy feature that is probably best @@ -780,25 +855,17 @@ - FunctionResultResponse + FunctionCallResponse - The function call was executed and returned a non-null result. + The function call was completed and returned the result given + in the message. (Note that the Function Call protocol can only handle a single scalar result, not a rowtype or set of results.) 
- - FunctionVoidResponse - - - The function call was executed and returned a NULL value. - - - - ReadyForQuery @@ -882,6 +949,14 @@ indeed any message type other than CopyData or CopyDone) as terminating the copy-out mode. + + + The CopyInResponse and CopyOutResponse messages include fields that + inform the frontend of the number of columns per row and the format + codes being used for each column. (As of the present implementation, + all columns in a given COPY operation will use the same + format, but the message design does not assume this.) + @@ -1114,8 +1189,8 @@ This section describes the base data types used in messages. - An n bit integer in network byte - order. + An n-bit integer in network byte + order (most significant byte first). If i is specified it is the exact value that will appear, otherwise the value is variable. Eg. Int16, Int32(42). @@ -1123,6 +1198,21 @@ This section describes the base data types used in messages. + + + Intn[k] + + + + An array of k + n-bit integers, each in network + byte order. The array length k + is always determined by an earlier field in the message. + Eg. Int16[M]. + + + + String(s) @@ -1154,8 +1244,10 @@ characters that don't fit into your fixed-size buffer. - Exactly n bytes. If - c is specified it is the exact + Exactly n bytes. If the field + width n is not a constant, it is + always determinable from an earlier field in the message. + If c is specified it is the exact value. Eg. Byte2, Byte1('\n'). @@ -1571,86 +1663,6 @@ BackendKeyData (B) - - -BinaryRow (B) - - - - - - - Byte1('B') - - - - Identifies the message as a binary data row. - (Normally, a prior RowDescription message defines the number - of fields in the row and their data types. Note that the - receiver must know the number of fields to be - able to decode the message contents.) - - - - - - Int32 - - - - Length of message contents in bytes, including self. - - - - - - Byten - - - - A bit map with one bit for each field in the row. 
The 1st - field corresponds to bit 7 (MSB) of the 1st byte, the 2nd - field corresponds to bit 6 of the 1st byte, the 8th field - corresponds to bit 0 (LSB) of the 1st byte, the 9th field - corresponds to bit 7 of the 2nd byte, and so on. Each bit - is set if the value of the corresponding field is not NULL. - If the number of fields is not a multiple of 8, the remainder - of the last byte in the bit map is wasted. - - - - - Then, for each field with a non-NULL value, there is the following: - - - - Int32 - - - - Specifies the size of the value of the field, excluding - this size. - - - - - - Byten - - - - Specifies the value of the field itself in binary - format. n is the above size. - - - - - - - - - - Bind (F) @@ -1703,66 +1715,55 @@ Bind (F) - Int8 + Int16 - 0 if parameter values are specified in textual form. - 1 if parameter values are specified in binary form. + The number of parameter format codes that follow + (denoted C below). + This can be zero to indicate that there are no parameters + or that the parameters all use the default format (text); + or one, in which case the specified format code is applied + to all parameters; or it can equal the actual number of + parameters. - Int32 + Int16[C] - The number of parameter values specified - (may be zero). This must match the number of parameters - needed by the query. + The parameter format codes. Each must presently be + zero (text) or one (binary). - - If parameter values are specified in textual form, the following - appears for each parameter: - - - - Int8 - - - - 1 if the parameter is non-null. 0 if it is null. - - - - - - String - - - - The parameter value in textual form (that is, suitable - input for the parameter's datatype's input converter). - If the preceding byte specified a null parameter, then - the string is omitted. 
- - - - - If parameter values are specified in binary form, the following - appears for each parameter: - Int16 - Zero if the field is null, otherwise the typlen - for the field datatype. + The number of parameter values that follow (possibly zero). + This must match the number of parameters needed by the query. + + + + + Next, the following pair of fields appear for each parameter: + + + + Int32 + + + + The length of the parameter value, in bytes (this count + does not include itself). Can be zero. + As a special case, -1 indicates a NULL parameter value. + No value bytes follow in the NULL case. @@ -1772,12 +1773,40 @@ Bind (F) - The value of the field itself in binary format. - Omitted if the field is null. - n is the typlen - value if typlen is positive. If - typlen is -1 then the field value begins with - its own length as an Int32 (the length includes itself). + The value of the parameter, in the format indicated by the + associated format code. + n is the above length. + + + + + After the last parameter, the following fields appear: + + + + Int16 + + + + The number of result-column format codes that follow + (denoted R below). + This can be zero to indicate that there are no result columns + or that the result columns should all use the default format + (text); + or one, in which case the specified format code is applied + to all result columns (if any); or it can equal the actual + number of result columns of the query. + + + + + + Int16[R] + + + + The result-column format codes. Each must presently be + zero (text) or one (binary). @@ -2203,7 +2232,7 @@ CopyInResponse (B) - Int32(5) + Int32 @@ -2217,7 +2246,36 @@ CopyInResponse (B) - 0 for textual copy, 1 for binary copy. + 0 indicates the overall copy format is textual (rows + separated by newlines, columns separated by separator + characters, etc). + 1 indicates the overall copy format is binary (similar + to DataRow format). + See + for more information. 
+ + + + + + Int16 + + + + The number of columns in the data to be copied + (denoted N below). + + + + + + Int16[N] + + + + The format codes to be used for each column. + Each must presently be zero (text) or one (binary). + All must be zero if the overall copy format is textual. @@ -2249,7 +2307,7 @@ CopyOutResponse (B) - Int32(5) + Int32 @@ -2263,7 +2321,36 @@ CopyOutResponse (B) - 0 for textual copy, 1 for binary copy. + 0 indicates the overall copy format is textual (rows + separated by newlines, columns separated by separator + characters, etc). + 1 indicates the overall copy format is binary (similar + to DataRow format). + See + for more information. + + + + + + Int16 + + + + The number of columns in the data to be copied + (denoted N below). + + + + + + Int16[N] + + + + The format codes to be used for each column. + Each must presently be zero (text) or one (binary). + All must be zero if the overall copy format is textual. @@ -2287,11 +2374,7 @@ DataRow (B) - Identifies the message as a text-format data row. - (Normally, a prior RowDescription message defines the number - of fields in the row and their data types. Note that the - receiver must know the number of fields to be - able to decode the message contents.) + Identifies the message as a data row. @@ -2307,23 +2390,16 @@ DataRow (B) - Byten + Int16 - A bit map with one bit for each field in the row. The 1st - field corresponds to bit 7 (MSB) of the 1st byte, the 2nd - field corresponds to bit 6 of the 1st byte, the 8th field - corresponds to bit 0 (LSB) of the 1st byte, the 9th field - corresponds to bit 7 of the 2nd byte, and so on. Each bit - is set if the value of the corresponding field is not NULL. - If the number of fields is not a multiple of 8, the remainder - of the last byte in the bit map is wasted. + The number of column values that follow (possibly zero). 
- Then, for each field with a non-NULL value, there is the following: + Next, the following pair of fields appear for each column: @@ -2331,8 +2407,10 @@ DataRow (B) - Specifies the size of the value of the field, in - bytes; the count includes itself. + The length of the column value, in bytes (this count + does not include itself). Can be zero. + As a special case, -1 indicates a NULL column value. + No value bytes follow in the NULL case. @@ -2342,12 +2420,9 @@ DataRow (B) - Specifies the value of the field itself in textual - form (that is, the result of the output-conversion - routine for the field's datatype). - n is the above size minus 4. - There is no trailing zero-byte in the field data; the - frontend must add one if it wants one. + The value of the column, in the format indicated by the + associated format code. + n is the above length. @@ -2557,17 +2632,6 @@ Execute (F) - - Int8 - - - - 0 to return results in textual form (DataRow messages). - 1 to return results in binary form (BinaryRow messages). - - - - Int32 @@ -2649,16 +2713,6 @@ FunctionCall (F) - - String("") - - - - Unused. - - - - Int32 @@ -2670,7 +2724,34 @@ FunctionCall (F) - Int32 + Int16 + + + + The number of argument format codes that follow + (denoted C below). + This can be zero to indicate that there are no arguments + or that the arguments all use the default format (text); + or one, in which case the specified format code is applied + to all arguments; or it can equal the actual number of + arguments. + + + + + + Int16[C] + + + + The argument format codes. Each must presently be + zero (text) or one (binary). + + + + + + Int16 @@ -2680,7 +2761,7 @@ FunctionCall (F) - Then, for each argument, there is the following: + Next, the following pair of fields appear for each argument: @@ -2688,8 +2769,10 @@ FunctionCall (F) - Specifies the size of the value of the argument, - excluding this size. + The length of the argument value, in bytes (this count + does not include itself). 
Can be zero. + As a special case, -1 indicates a NULL argument value. + No value bytes follow in the NULL case. @@ -2699,8 +2782,23 @@ FunctionCall (F) - Specifies the value of the field itself in binary - format. n is the above size. + The value of the argument, in the format indicated by the + associated format code. + n is the above length. + + + + + After the last argument, the following field appears: + + + + Int16 + + + + The format code for the function result. Must presently be + zero (text) or one (binary). @@ -2713,7 +2811,7 @@ FunctionCall (F) -FunctionResultResponse (B) +FunctionCallResponse (B) @@ -2740,23 +2838,15 @@ FunctionResultResponse (B) - - Byte1('G') - - - - Specifies that a non-null result was returned. - - - - Int32 - Specifies the size of the value of the result, excluding this - size. + The length of the function result value, in bytes (this count + does not include itself). Can be zero. + As a special case, -1 indicates a NULL function result. + No value bytes follow in the NULL case. @@ -2766,65 +2856,9 @@ FunctionResultResponse (B) - Specifies the value of the result itself in binary format. - n is the above size. - - - - - - Byte1('0') - - - - Unused. (Strictly speaking, FunctionResultResponse and - FunctionVoidResponse are the same thing but with some optional - parts to the message.) - - - - - - - - - - - - -FunctionVoidResponse (B) - - - - - - - - Byte1('V') - - - - Identifies the message as a function call result. - - - - - - Int32(5) - - - - Length of message contents in bytes, including self. - - - - - - Byte1('0') - - - - Specifies that a null result was returned. + The value of the function result, in the format indicated by + the associated format code. + n is the above length. @@ -3033,7 +3067,7 @@ ParameterDescription (B) - Int32 + Int16 @@ -3166,7 +3200,7 @@ Parse (F) - Int32 + Int16 @@ -3293,7 +3327,7 @@ PortalSuspended (B) Identifies the message as a portal-suspended indicator. 
- Note this only appears if an Execute row-count limit + Note this only appears if an Execute message's row-count limit was reached. @@ -3515,6 +3549,19 @@ RowDescription (B) + + + Int16 + + + + The format code being used for the field. Currently will + be zero (text) or one (binary). In a RowDescription + returned from the statement variant of Describe, the + format code is not yet known and will always be zero. + + + @@ -3580,14 +3627,14 @@ StartupMessage (F) - Int32 + Int32(196608) The protocol version number. The most significant 16 bits are - the major version number (3 or more for the format described - here). - The least significant 16 bits are the minor version number. + the major version number (3 for the protocol described here). + The least significant 16 bits are the minor version number + (0 for the protocol described here). @@ -3931,6 +3978,14 @@ The ReadyForQuery ('Z') message includes a transaction status indicator. + +The distinction between BinaryRow and DataRow message types is gone; the +single DataRow message type serves for returning data in all formats. +Note that the layout of DataRow has changed to make it easier to parse. +Also, the representation of binary values has changed: it is no longer +directly tied to the server's internal representation. + + There is a new extended query sub-protocol, which adds the frontend message types Parse, Bind, Execute, Describe, Close, Flush, and Sync, and the @@ -3947,8 +4002,17 @@ is a well-defined way to recover from errors during COPY. The special during COPY OUT. (It is still recognized as a terminator during COPY IN, but its use is deprecated and will eventually be removed.) Binary COPY is supported. -The CopyInResponse and CopyOutResponse messages include a field indicating -whether the COPY operation is text or binary. +The CopyInResponse and CopyOutResponse messages include fields indicating +the number of columns and the format of each column. 
+ + + +The layout of FunctionCall and FunctionCallResponse messages has changed. +FunctionCall can now support passing NULL arguments to functions. It also +can handle passing parameters and retrieving results in either text or +binary format. There is no longer any reason to consider FunctionCall a +potential security hole, since it does not offer direct access to internal +server data representations. @@ -3960,7 +4024,8 @@ changes for any of these parameters. The RowDescription ('T') message carries new table OID and column -number fields for each column of the described row. +number fields for each column of the described row. It also shows the format +code for each column. @@ -3979,12 +4044,6 @@ The EmptyQueryResponse ('I') message used to include an empty string parameter; this has been removed. - - -Additional changes will be documented as they are implemented. - - -
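Reviewer sketch (not part of the patch): the length-prefixed value layout this diff introduces for Bind and DataRow can be illustrated with a few lines of Python. This is a minimal sketch, assuming only the fields shown in the hunks above. The function names `encode_bind_fields` and `decode_data_row` are hypothetical. The message type byte, the Int32 message length, and the Bind name strings are deliberately left out, because those fields are not shown in this diff.

```python
import struct
from typing import List, Optional, Sequence


def encode_bind_fields(
    params: Sequence[Optional[bytes]],
    param_formats: Sequence[int] = (),
    result_formats: Sequence[int] = (),
) -> bytes:
    """Encode the Bind fields described in the patch (hypothetical helper).

    Covers only: parameter format codes, parameter values, and
    result-column format codes. None stands for a NULL parameter.
    """
    # The patch allows zero codes (all text), one code (applies to all
    # parameters), or exactly one code per parameter.
    if len(param_formats) not in (0, 1, len(params)):
        raise ValueError("need 0, 1, or one format code per parameter")

    out = bytearray()
    out += struct.pack("!h", len(param_formats))      # Int16 C
    for code in param_formats:
        out += struct.pack("!h", code)                # Int16[C]
    out += struct.pack("!h", len(params))             # Int16 parameter count
    for value in params:
        if value is None:
            out += struct.pack("!i", -1)              # -1 marks NULL, no bytes follow
        else:
            out += struct.pack("!i", len(value))      # length excludes itself
            out += value
    out += struct.pack("!h", len(result_formats))     # Int16 R
    for code in result_formats:
        out += struct.pack("!h", code)                # Int16[R]
    return bytes(out)


def decode_data_row(body: bytes) -> List[Optional[bytes]]:
    """Decode a DataRow body (the bytes after the type byte and length).

    Returns one entry per column; None stands for a NULL column value.
    """
    (ncols,) = struct.unpack_from("!h", body, 0)      # Int16 column count
    offset = 2
    values: List[Optional[bytes]] = []
    for _ in range(ncols):
        (length,) = struct.unpack_from("!i", body, offset)
        offset += 4
        if length == -1:
            values.append(None)                       # NULL: no value bytes
            continue
        if offset + length > len(body):
            raise ValueError("truncated DataRow body")
        values.append(body[offset:offset + length])
        offset += length
    return values
```

For example, `encode_bind_fields([b"42", None], [0], [1])` sends one text-format parameter and one NULL, and asks for all result columns in binary. Whether the decoded bytes of a column are text or binary is not recorded in DataRow itself; per the patch, that comes from the format codes in RowDescription.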