From eed2e48e6993c9c66987276bb8e674330265874d Mon Sep 17 00:00:00 2001
From: mahlzeit
Date: Mon, 22 Dec 2003 16:01:04 +0000
Subject: [PATCH] Release 1.1 - added symbolic constants - added simple expressions - cleaned up docs

git-svn-id: file:///srv/svn/repos/haiku/trunk/current@5722 a95241bf-73f2-0310-859d-f6bbb57e9c96
---
 src/tools/rc/Doxyfile              |   2 +-
 src/tools/rc/compile.h             |   7 +
 src/tools/rc/docs/beres-deres.html |  67 ++----
 src/tools/rc/docs/grammar.html     | 341 +++++++++--------------------
 src/tools/rc/docs/history.html     |  22 +-
 src/tools/rc/docs/librdef.html     |  52 ++---
 src/tools/rc/docs/notes.html       |  43 ++++
 src/tools/rc/docs/rc.html          | 294 +++++++++----------------
 src/tools/rc/docs/todo.html        |  29 +--
 src/tools/rc/parser.y              | 259 ++++++++++++++++++----
 src/tools/rc/rdef.cpp              |   2 +-
 src/tools/rc/tests/all.rdef        |   1 +
 src/tools/rc/tests/builtin.rdef    |  18 +-
 src/tools/rc/tests/expr.rdef       |  62 ++++++
 14 files changed, 597 insertions(+), 602 deletions(-)
 create mode 100644 src/tools/rc/docs/notes.html
 create mode 100644 src/tools/rc/tests/expr.rdef

diff --git a/src/tools/rc/Doxyfile b/src/tools/rc/Doxyfile
index 6d8fd36ce3..119ec43c88 100644
--- a/src/tools/rc/Doxyfile
+++ b/src/tools/rc/Doxyfile
@@ -3,7 +3,7 @@
 #---------------------------------------------------------------------------
 # General configuration options
 #---------------------------------------------------------------------------
-PROJECT_NAME = "librdef API version 1"
+PROJECT_NAME = "librdef API version 2"
 PROJECT_NUMBER =
 OUTPUT_DIRECTORY =
 OUTPUT_LANGUAGE = English
diff --git a/src/tools/rc/compile.h b/src/tools/rc/compile.h
index 0e544eb988..e58c18bb8b 100644
--- a/src/tools/rc/compile.h
+++ b/src/tools/rc/compile.h
@@ -85,6 +85,13 @@ struct id_t
 	char* name;
 };
 
+// Describes a symbolic constant.
+struct define_t
+{
+	char* name;
+	int32 value;
+};
+
 // The output file we add resources to.
 extern BResources rsrc;
 extern const char* rsrc_file;
diff --git a/src/tools/rc/docs/beres-deres.html b/src/tools/rc/docs/beres-deres.html
index 9e6fb4c73c..d06ddb12f9 100644
--- a/src/tools/rc/docs/beres-deres.html
+++ b/src/tools/rc/docs/beres-deres.html
@@ -6,12 +6,7 @@

Differences with beres/deres

-

Some of your old rdef files may fail to compile with rc. Besides different -command line arguments, rc also has slightly different syntax and semantics. -These changes were made to allow for the future expansion of the rdef grammar -and to make resource scripts easier to write and understand. The compiler -should be backwards compatible with the majority of old scripts, but in some -cases it was necessary to break stuff.

+

Some of your old rdef files may fail to compile with rc. Besides different command line arguments, rc also has slightly different syntax and semantics. These changes were made to allow for the future expansion of the rdef grammar and to make resource scripts easier to write and understand. The compiler should be backwards compatible with the majority of old scripts, but in some cases it was necessary to break stuff.

Differences with beres

@@ -19,62 +14,27 @@ cases it was necessary to break stuff.

  • We allow the \0 escape in string literals.
  • -
  • Our handling of the minus sign is more restrictive; we don't treat it like -a unary minus operator, which means you may only put one minus in front of a -number to negate it (i.e. "- -10" is an error).
  • +
  • Our handling of the minus sign is more restrictive; we don't treat it like a unary minus operator, which means you may only put one minus in front of a number to negate it (e.g. "- -10" is an error).
  • -
  • beres allows #define statements, but apparently ignores them. The compiler -reports no error, but any symbols you define are not recognized later on. We -don't allow #defines (yet).
  • +
  • beres allows #define statements, but apparently ignores them. The compiler reports no error, but any symbols you define are not recognized later on. We don't allow #defines (yet).
  • -
  • beres allows you to put enum symbols in resource data, and replaces the -symbol with its value. For some reason, it lets you assign any kind of data to -enum symbols, such as strings and floats. Since that is not valid C/C++ syntax, -librdef does not allow this. As a result, it does not make much sense for us -to support enum symbols in resource data.
  • +
  • beres allows you to put enum symbols in resource data, and replaces the symbol with its value. For some reason, it lets you assign any kind of data to enum symbols, such as strings and floats. Since that is not valid C/C++ syntax, librdef does not allow this. As a result, it does not make much sense for us to support enum symbols in resource data.
  • -
  • We only allow a few type casts, but enough to be backwards compatible. -beres allows you to cast from bool to all other types, between strings and -numeric types, etc. The usefulness of this is limited, so to keep things simple -librdef doesn't allow that.
  • +
  • We only allow a few type casts, but enough to be backwards compatible. beres allows you to cast from bool to all other types, between strings and numeric types, etc. The usefulness of this is limited, so to keep things simple librdef doesn't allow that.
  • -
  • There is no need to put parentheses around a type code, even for simple -data; beres sometimes requires the parens and sometimes it doesn't. With rc, -they are always optional (preferred notation is without the parens).
  • +
  • There is no need to put parentheses around a type code, even for simple data; beres sometimes requires the parens and sometimes it doesn't. With rc, they are always optional (preferred notation is without the parens).
  • -
  • beres allows multiple type codes and/or type casts per resource -statement/data field, but we allow only one of each. Because we don't associate -type codes with data (but only with the resource itself or with individual -message fields), we don't allow the shortcut notation that lets you put a type -code and cast inside one set of parentheses.
  • +
  • beres allows multiple type codes and/or type casts per resource statement/data field, but we allow only one of each. Because we don't associate type codes with data (but only with the resource itself or with individual message fields), we don't allow the shortcut notation that lets you put a type code and cast inside one set of parentheses.
  • -
  • We do not allow arrays to have named fields, nor do we allow messages (and -archives) to have unnamed fields. beres apparently treats all compound data the -same, but we don't, because not all compound data is the same.
  • +
  • We do not allow arrays to have named fields, nor do we allow messages (and archives) to have unnamed fields. beres apparently treats all compound data the same, but we don't, because not all compound data is the same.
  • -
  • When specifying array data, the "array" keyword is required. beres allows -you to omit the array keyword and just put the data between braces, but that -conflicts with our handling of user-defined types.
  • +
  • When specifying array data, the "array" keyword is required. beres allows you to omit the array keyword and just put the data between braces, but that conflicts with our handling of user-defined types.
  • -
  • Field names in message resources may only be string literals. beres allows -identifiers but it converts them to string literals anyway. Just like normal -data, you may cast the field's data to a different type, but not to a different -type code. Specifying a non-standard type code is fine, but it goes in front of -the type name. Just like beres, rc considers "array", "message", and "archive" -to be valid data type names, but "message" and "archive" cannot be followed by -a "what" code (beres ignores that anyway). When you cast an archive to a -message, we don't strip the "class" field (in fact, we don't perform any -conversion).
  • +
  • Field names in message resources may only be string literals. beres allows identifiers but it converts them to string literals anyway. Just like normal data, you may cast the field's data to a different type, but not to a different type code. Specifying a non-standard type code is fine, but it goes in front of the type name. Just like beres, rc considers "array", "message", and "archive" to be valid data type names, but "message" and "archive" cannot be followed by a "what" code (beres ignores that anyway). When you cast an archive to a message, we don't strip the "class" field (in fact, we don't perform any conversion).
  • -
  • We allow users to define their own types. The built-in types from beres -(point, rect, and rgb_color) are re-implemented as such user-defined types. The -syntax for these three built-ins should still be backwards compatible with old -scripts.
  • +
  • We allow users to define their own types. The built-in types from beres (point, rect, and rgb_color) are re-implemented as such user-defined types. The syntax for these three built-ins should still be backwards compatible with old scripts.
  • -
  • beres automatically adds names for resources that are specified as -"resource(R_Symbol)", unless the "explicit names only" option is set. We do -that the other way around: we don't automatically add names unless the user -sets the "auto names" option.
  • +
  • beres automatically adds names for resources that are specified as "resource(R_Symbol)", unless the "explicit names only" option is set. We do that the other way around: we don't automatically add names unless the user sets the "auto names" option.
  • @@ -86,8 +46,7 @@ sets the "auto names" option.
  • We put opening braces on the next line (heh heh).
  • -
  • Type codes that cannot be represented as #'xxxx' are written as #num where -num is a decimal number, not hexadecimal.
  • +
  • Type codes that cannot be represented as #'xxxx' are written as #num where num is a decimal number, not hexadecimal.
  • Floats and doubles have a bunch of extra zeroes.
  • diff --git a/src/tools/rc/docs/grammar.html b/src/tools/rc/docs/grammar.html index 53794855a0..0d473ea48a 100644 --- a/src/tools/rc/docs/grammar.html +++ b/src/tools/rc/docs/grammar.html @@ -7,87 +7,47 @@

    rdef grammar

    -

    This is the (somewhat boring) specification of the rdef file format as it is -understood by librdef. It also describes to a certain extent how the compiler -works. You don't need to read this unless you want to hack librdef. Knowledge -of compiler theory and lex/yacc is assumed.

    +

    Version 1.1

    + +

    This is the (somewhat boring) specification of the rdef file format as it is understood by librdef. It also describes to a certain extent how the compiler works. You don't need to read this unless you want to hack librdef. Knowledge of compiler theory and lex/yacc is assumed.

    The lexer

    -

    Like any compiler, librdef contains a lexer (aka scanner) and a parser. The -lexer reads the input file and chops it up into tokens. The lexer ignores -single-line // comments and /* ... */ multi-line comments. It also ignores -whitespace and newlines.

    +

    Like any compiler, librdef contains a lexer (aka scanner) and a parser. The lexer reads the input file and chops it up into tokens. The lexer ignores single-line // comments and /* ... */ multi-line comments. It also ignores whitespace and newlines.

    The lexer recognizes the following tokens:

    -
    - +
    - + - + - + - + - + - + -
    BOOL: true or false
    INTEGER: You can specify integers as decimal numbers, -hexadecimal numbers (starting with 0x or 0X, alpha digits are case -insensitive), octal numbers (starting with a leading 0), binary numbers -(starting with 0b or 0B), or as a four character code ('CCCC'). Valid range is -64 bits. At this point, numbers are always unsigned. The minus sign is treated -as a separate token, and is dealt with by the parser.
    INTEGER: You can specify integers as decimal numbers, hexadecimal numbers (starting with 0x or 0X, alpha digits are case insensitive), octal numbers (starting with a leading 0), binary numbers (starting with 0b or 0B), or as a four character code ('CCCC'). Valid range is 64 bits. At this point, numbers are always unsigned. The minus sign is treated as a separate token, and is dealt with by the parser.
    FLOAT: A floating point literal. Must contain a decimal point, -may contain a signed exponent. Stored internally as a double.
    FLOAT: A floating point literal. Must contain a decimal point, may contain a signed exponent. Stored internally as a double.
    STRING: UTF-8 compatible string literal, enclosed by double -quotes. Can contain escape sequences (\b \f \n \r \t \v \" \\ \0), octal -escapes (\000) and hex escapes (\0x00 or \x00). May not span more than one -line, although you are allowed to specify multiple string literals in a row and -the lexer will automatically concatenate them. There is no maximum length.
    STRING: UTF-8 compatible string literal, enclosed by double quotes. Can contain escape sequences (\b \f \n \r \t \v \" \\ \0), octal escapes (\000) and hex escapes (\0x00 or \x00). May not span more than one line, although you are allowed to specify multiple string literals in a row and the lexer will automatically concatenate them. There is no maximum length.
    RAW: Hexadecimal representation of raw data, enclosed by double -quotes, and prefixed by a dollar sign: $"12FFAB". Each byte is represented by -two hex characters, so there must be an even number of characters between the -quotes. The alpha digits are not case sensitive. Like STRING, a RAW token may -not span more than one line, but multiple consecutive RAW tokens are -automatically concatenated. No maximum length.
    RAW: Hexadecimal representation of raw data, enclosed by double quotes, and prefixed by a dollar sign: $"12FFAB". Each byte is represented by two hex characters, so there must be an even number of characters between the quotes. The alpha digits are not case sensitive. Like STRING, a RAW token may not span more than one line, but multiple consecutive RAW tokens are automatically concatenated. No maximum length.
    IDENT: C/C++ identifier. First character is alphabetic or -underscore. Other characters are alphanumeric or underscore.
    IDENT: C/C++ identifier. First character is alphabetic or underscore. Other characters are alphanumeric or underscore.
    TYPECODE: A hash sign followed by a 32-bit unsigned decimal -number, hex number, or four character code. Examples: #200, #0x00C8, -#'MIMS'
    TYPECODE: A hash sign followed by a 32-bit unsigned decimal number, hex number, or four character code. Examples: #200, #0x00C8, #'MIMS'
    -
    +
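As a rough illustration of the INTEGER and RAW token rules in the table above, the following Python sketch converts the text of a literal into a value. It is not librdef's lexer: the names `parse_integer` and `parse_raw` are hypothetical, and packing a four character code with its first character in the most significant byte is an assumption, not something the table states.

```python
import re

# (pattern, base) pairs for the numeric INTEGER forms described above.
_INTEGER_FORMS = [
    (re.compile(r"0[xX]([0-9a-fA-F]+)"), 16),  # hexadecimal, digits case insensitive
    (re.compile(r"0[bB]([01]+)"), 2),          # binary
    (re.compile(r"0([0-7]+)"), 8),             # octal: leading 0
    (re.compile(r"(0|[1-9][0-9]*)"), 10),      # decimal
]


def parse_integer(text):
    """Return the unsigned value of an INTEGER token (illustrative only)."""
    fourcc = re.fullmatch(r"'(.{4})'", text)
    if fourcc:
        # Assumption: first character ends up in the most significant byte.
        return int.from_bytes(fourcc.group(1).encode("ascii"), "big")
    for pattern, base in _INTEGER_FORMS:
        match = pattern.fullmatch(text)
        if match:
            value = int(match.group(1), base)
            if value >= 1 << 64:
                raise ValueError("integer literal exceeds 64 bits")
            return value
    raise ValueError(f"not an integer literal: {text!r}")


def parse_raw(text):
    """Return the bytes of a RAW token such as $"12FFAB" (illustrative only)."""
    match = re.fullmatch(r'\$"([0-9a-fA-F]*)"', text)
    if not match or len(match.group(1)) % 2:
        raise ValueError(f"not a valid RAW literal: {text!r}")
    return bytes.fromhex(match.group(1))
```

Under these assumptions, `parse_integer("200")` and `parse_integer("0x00C8")` yield the same value, which mirrors the #200 and #0x00C8 TYPECODE examples in the table.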

    The following are treated as keywords and special symbols:

    -
    enum resource array message archive type import -{ } ( ) , ; = -
    +
    enum resource array message archive type import { } ( ) , ; = - + * / % ^ | & ~
    -

    The lexer also deals with #include statements, which look like: #include -"filename"\n. When you #include a file, the lexer expects it to contain valid -rdef syntax. So even though the include file is probably a C/C++ header, it -should not contain anything but the enum statement and/or comments. The lexer -only looks for include files in the include search paths that you have -specified, so if you want it to look in the current working directory you have -to explicitly specify that. You may nest #includes.

    +

    The lexer also deals with #include statements, which look like: #include "filename"\n. When you #include a file, the lexer expects it to contain valid rdef syntax. So even though the include file is probably a C/C++ header, it should not contain anything but the enum statement and/or comments. The lexer only looks for include files in the include search paths that you have specified, so if you want it to look in the current working directory you have to explicitly specify that. You may nest #includes.

    -

    A note about UTF-8. Since the double quote (hex 0x22) is never part of the -second or third byte of a UTF-8 character, the lexer can safely deal with UTF-8 -characters inside string literals. That is also the reason that the decompiler -does not escape characters that are not human-readable (except the ones in the -7-bit ASCII range), because they could be part of a UTF-8 encoding. The current -version of librdef does not handle L"..." (wide char) strings, but nobody uses -them anyway.

    +

    A note about UTF-8. Since the double quote (hex 0x22) is never part of a multi-byte UTF-8 sequence (every byte of such a sequence has its high bit set), the lexer can safely deal with UTF-8 characters inside string literals. That is also the reason that the decompiler does not escape characters that are not human-readable (except the ones in the 7-bit ASCII range), because they could be part of a UTF-8 encoding. The current version of librdef does not handle L"..." (wide char) strings, but nobody uses them anyway.

    The parser

    -

    The parser takes the tokens from the lexer and matches them against the -rules of the grammar. What follows is the grammar in a simplified variation of -BNF, so the actual bison source file may look a little different. Legend:

    +

    The parser takes the tokens from the lexer and matches them against the rules of the grammar. What follows is the grammar in a simplified variation of BNF, so the actual bison source file may look a little different. Legend:

    -
    - +
    @@ -95,34 +55,31 @@ BNF, so the actual bison source file may look a little different. Legend:

    -
    [ a ]: match a 0 or 1 times
    { b }: match b 0 or more times
    c | d: match either c or d
    lowercase: nonterminal
    UPPER: token from the lexer
    'c': token from the lexer
    -
    +

    The rdef grammar consists of the following rules:

    -
    - +
    - + - + - + - + - + + + @@ -132,79 +89,57 @@ BNF, so the actual bison source file may look a little different. Legend:

    - + - + - + - + - + - + -
    script: {enum | typedef | resource}
    enum: ENUM '{' [symboldef {',' symboldef} [',']] '}' -';'
    enum: ENUM '{' [symboldef {',' symboldef} [',']] '}' ';'
    symboldef: IDENT ['=' integer]
    typedef: TYPE [id] [TYPECODE] IDENT '{' fielddef {',' fieldef} -'}' ';'
    typedef: TYPE [id] [TYPECODE] IDENT '{' fielddef {',' fielddef} '}' ';'
    fielddef: datatype IDENT ['[' INTEGER ']'] ['=' data]
    fielddef: datatype IDENT ['[' INTEGER ']'] ['=' expr]
    resource: RESOURCE [id] [typecode] data ';'
    resource: RESOURCE [id] [typecode] expr ';'
    id: '(' [(integer | IDENT) [',' STRING] | STRING] ')'
    typecode: ['('] TYPECODE [')']
    data: [typecast] (BOOL | integer | float | STRING | RAW | array -| message | archive | type)
    expr: expr BINARY_OPER expr | UNARY_OPER expr | data
    data: [typecast] (BOOL | integer | float | STRING | RAW | array | message | archive | type | define | '(' expr ')' )
    typecast: ['(' datatype ')']
    float: ['-'] FLOAT
    array: ARRAY ['{' [data {',' data}] '}'] | [ARRAY] IMPORT -STRING
    array: ARRAY ['{' [expr {',' expr}] '}'] | [ARRAY] IMPORT STRING
    message: MESSAGE ['(' integer ')'] ['{' [msgfield {',' -msgfield}] '}']
    message: MESSAGE ['(' integer ')'] ['{' [msgfield {',' msgfield}] '}']
    msgfield: [TYPECODE] [datatype] STRING '=' data
    msgfield: [TYPECODE] [datatype] STRING '=' expr
    archive: ARCHIVE [archiveid] IDENT '{' msgfield {',' msgfield} -'}'
    archive: ARCHIVE [archiveid] IDENT '{' msgfield {',' msgfield} '}'
    archiveid: '(' [STRING] [',' integer] ')'
    type: IDENT [data | '{' [typefield {',' typefield}] -'}']
    type: IDENT [data | '{' [typefield {',' typefield}] '}']
    typefield: [IDENT '='] data
    typefield: [IDENT '='] expr
    -
    +defineIDENT + +

    Semantics

    Resource names

    -

    There are several different ways to specify the ID and name of a new -resource:

    +

    There are several different ways to specify the ID and name of a new resource:

    -
    - +
    - + - + - + - + - + - + - + -
    resource: The resource is assigned the default name -and ID of its data type.
    resource: The resource is assigned the default name and ID of its data type.
    resource(): The resource is assigned the default name and ID of -its data type.
    resource(): The resource is assigned the default name and ID of its data type.
    resource(1): The resource is assigned the numeric ID 1, and the -default name of its data type.
    resource(1): The resource is assigned the numeric ID 1, and the default name of its data type.
    resource("xxx"): The resource is assigned the name "xxx" and the -default ID of its data type.
    resource("xxx"): The resource is assigned the name "xxx" and the default ID of its data type.
    resource(1, "xxx"): The resource is assigned the numeric ID 1, -and the name "xxx".
    resource(1, "xxx"): The resource is assigned the numeric ID 1, and the name "xxx".
    resource(sss): The resource is assigned the numeric ID that -corresponds with the symbol sss, which should have been defined in an enum -earlier. If the "auto names" option is passed to the compiler, the resource is -also given the name "sss", otherwise the default name from its data type is -used.
    resource(sss): The resource is assigned the numeric ID that corresponds with the symbol sss, which should have been defined in an enum earlier. If the "auto names" option is passed to the compiler, the resource is also given the name "sss", otherwise the default name from its data type is used.
    resource(sss, "xxx"): The resource is assigned the numeric ID -that corresponds with the symbol sss, and the name "xxx".
    resource(sss, "xxx"): The resource is assigned the numeric ID that corresponds with the symbol sss, and the name "xxx".
    -
    +
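The rules in the table above can be summarized in a few lines of Python. This is only a model of the documented behaviour, not code from librdef: `resolve_resource` and all of its parameters are hypothetical names, and the per-type defaults are simply passed in by the caller.

```python
def resolve_resource(id_arg=None, name_arg=None, *, symbols=None,
                     auto_names=False, default_id=0, default_name=""):
    """Return the (id, name) pair a resource statement would get.

    id_arg is None, an int, or the name of an enum symbol;
    name_arg is None or a string. default_id and default_name stand in
    for the defaults of the resource's data type (values are made up).
    """
    symbols = symbols or {}
    name = default_name if name_arg is None else name_arg

    if id_arg is None:
        # resource, resource(), resource("xxx")
        return default_id, name
    if isinstance(id_arg, int):
        # resource(1), resource(1, "xxx")
        return id_arg, name

    # resource(sss), resource(sss, "xxx"): the symbol must already exist.
    if id_arg not in symbols:
        raise KeyError(f"undefined symbol: {id_arg}")
    if name_arg is None and auto_names:
        # "auto names" option: reuse the symbol as the resource name.
        name = id_arg
    return symbols[id_arg], name
```

For example, with `symbols={"R_Icon": 7}` the call `resolve_resource("R_Icon", symbols=symbols, auto_names=True)` gives the ID 7 and the name "R_Icon", while an explicit name always wins over the automatic one.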

    Data types and type casts

    -

    Resources (and message fields) have a type code and a data type. The data -type determines the format the data is stored in, while the type code tells the -user how to interpret the data. Typically, there is some kind of relation -between the two, otherwise the resource will be a little hard to read.

    +

    Resources (and message fields) have a type code and a data type. The data type determines the format the data is stored in, while the type code tells the user how to interpret the data. Typically, there is some kind of relation between the two, otherwise the resource will be a little hard to read.

    -

    The following table lists the compiler's built-in data types. (Users can -also define their own types; this is described in a later section.)

    +

    The following table lists the compiler's built-in data types. (Users can also define their own types; this is described in a later section.)

    -
    +
    -
    - - +
    @@ -220,7 +155,7 @@ also define their own types; this is described in a later section.)

    bool: B_BOOL_TYPE
    int8: B_INT8_TYPE
    uint8: B_UINT8_TYPE
    - +
    @@ -233,70 +168,50 @@ also define their own types; this is described in a later section.)

    off_t: B_OFF_T_TYPE
    time_t: B_TIME_TYPE
    float: B_FLOAT_TYPE
    archive: B_MESSAGE_TYPE
    -
    - +
    -

    The type code has no effect on how the data is stored. For example, if you -do this: "resource(x) #'LONG' true", then the data will not automatically be -stored as a 32-bit number! If you don't specify an explicit type code, the -compiler uses the type of the data for that.

    +

    The type code has no effect on how the data is stored. For example, if you do this: "resource(x) #'LONG' true", then the data will not automatically be stored as a 32-bit number! If you don't specify an explicit type code, the compiler uses the type of the data for that.

    -

    You can change the data type with a type cast. The following casts are -allowed:

    +

    You can change the data type with a type cast. The following casts are allowed:

    -
    - +
    - + - + -
    bool: You cannot cast bool data.
    integer: You can cast to all numeric data types. Casts to -smaller datatypes will truncate the number. Casting negative numbers to -unsigned datatypes (and vice versa) will wrap them, i.e. (uint8) -1 becomes -255.
    integer: You can cast to all numeric data types. Casts to smaller datatypes will truncate the number. Casting negative numbers to unsigned datatypes (and vice versa) will wrap them, e.g. (uint8) -1 becomes 255.
    floating point: You can only cast to float or double.
    string: You cannot cast string data.
    raw, buffer, array: You can cast anything to raw, but not the -other way around.
    raw, buffer, array: You can cast anything to raw, but not the other way around.
    message, archive: You cannot cast message data.
    type: You cannot cast user-defined types.
    -
    +
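The truncating and wrapping behaviour of integer casts described in the table can be reproduced with ordinary two's complement arithmetic. The sketch below is a hypothetical helper (`cast_integer` is not a librdef function) that assumes the casts behave like C integer conversions on a two's complement machine.

```python
def cast_integer(value, bits, signed):
    """Model an integer cast to a type that is `bits` wide.

    Keeps only the low `bits` bits (truncation) and then reinterprets
    the result as signed or unsigned (wrapping).
    """
    mask = (1 << bits) - 1
    wrapped = value & mask
    if signed and wrapped >= 1 << (bits - 1):
        # High bit set: the pattern represents a negative number.
        wrapped -= 1 << bits
    return wrapped
```

With this model, `cast_integer(-1, 8, signed=False)` corresponds to the `(uint8) -1` example from the table, and `cast_integer(0x1234, 8, signed=False)` keeps only the low byte 0x34.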
    -

    In addition to the "simple" built-in data types, the compiler also natively -supports several data structures from the BeOS API (point, rect, rgb_color) and -a few convenience types (app_signature, app_flags, etc). These types all follow -the same rules as user-defined types.

    +

    In addition to the "simple" built-in data types, the compiler also natively supports several data structures from the BeOS API (point, rect, rgb_color) and a few convenience types (app_signature, app_flags, etc). These types all follow the same rules as user-defined types.

    Arrays

    -

    The following definitions are semantically -equivalent:

    +

    The following definitions are semantically equivalent:

    resource(x) $"AABB";
     resource(x) array { $"AA" $"BB" };
     resource(x) array { $"AA", $"BB" };
    -

    The comma is optional and simply concatenates the two literals. When you -decompile this code, it always looks like:

    +

    The comma is optional and simply concatenates the two literals. When you decompile this code, it always looks like:

    resource(x) $"AABB";
    -

    Strings behave differently. The following two definitions are equivalent, -and concatenate the two literals into one string:

    +

    Strings behave differently. The following two definitions are equivalent, and concatenate the two literals into one string:

    resource(x) "AA" "BB";
     resource(x) #'CSTR' array { "AA" "BB" };
    -

    However, if you put a comma between the the strings, the compiler will still -glue them together but with a '\0' character in the middle. Now the resource -contains two strings: "AA" and "BB". You can also specify the '\0' -character yourself:

    +

    However, if you put a comma between the strings, the compiler will still glue them together but with a '\0' character in the middle. Now the resource contains two strings: "AA" and "BB". You can also specify the '\0' character yourself:

    resource(x) "AA\0BB";
     resource(x) #'CSTR' array { "AA", "BB" };
    @@ -306,133 +221,85 @@ resource(x) #'CSTR' array { "AA", "BB" };
    resource(x) "AA", "BB";
     resource(x) $"AA", $"BB";
    -

    Note that the data type of an array is always raw data, no matter how you -specify its contents. Because raw literals may be empty ($""), so may -arrays.

    +

    Note that the data type of an array is always raw data, no matter how you specify its contents. Because raw literals may be empty ($""), so may arrays.
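To make the difference between raw and string arrays concrete, here is a small Python model of the concatenation rules from this section. It is a sketch, not the compiler's implementation: the function names are hypothetical, each array is passed as a list of comma-separated items, and each item is a list of adjacent literals. Whether the compiler appends a terminating '\0' after the last string is not modelled.

```python
def build_raw_array(items):
    """Raw literals: adjacent or comma-separated, the bytes are simply joined."""
    return b"".join(bytes.fromhex(literal) for item in items for literal in item)


def build_string_array(items):
    """String literals: adjacent literals are concatenated into one string,
    and a comma puts a '\\0' between the resulting strings."""
    return b"\0".join("".join(item).encode("utf-8") for item in items)
```

Here `[["AA", "BB"]]` stands for `array { "AA" "BB" }` and `[["AA"], ["BB"]]` stands for `array { "AA", "BB" }`. An empty list models an empty array and produces zero bytes.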

    Messages and archives

    -

    A message resource is a flattened BMessage. By default it has the data type -B_MESSAGE_TYPE and corresponding type code #'MSGG'. If you don't specify a -"what" code for the message, it defaults to 0.

    +

    A message resource is a flattened BMessage. By default it has the data type B_MESSAGE_TYPE and corresponding type code #'MSGG'. If you don't specify a "what" code for the message, it defaults to 0.

    -

    Message fields assume the type of their data, unless you specify a different -type in front of the field name. (Normal casting rules apply here.) You can -also give the field a different type code, which tells the BMessage how to -interpret the data, but not how it is stored in the message. This type code -also goes in front of the field name. You can give the same name to multiple -fields, provided that they all have the same type. (The data of these fields -does not have to be the same size.) A message may be empty; it is still a valid -BMessage, but it contains no fields.

    +

    Message fields assume the type of their data, unless you specify a different type in front of the field name. (Normal casting rules apply here.) You can also give the field a different type code, which tells the BMessage how to interpret the data, but not how it is stored in the message. This type code also goes in front of the field name. You can give the same name to multiple fields, provided that they all have the same type. (The data of these fields does not have to be the same size.) A message may be empty; it is still a valid BMessage, but it contains no fields.

    -

    An archive is also a flattened BMessage, but one that was made by -Archive()'ing a BArchivable class, such as BBitmap. The name of the archive, in -this case BBitmap, is automatically added to the message in a field called -"class". The "archive" keyword is optionally followed by a set of parentheses -that enclose a string and/or an integer. The int is the "what" code, the string -is stored in a field called "add_on" (used for dynamic loading of -BArchivables). Other than that, archives and messages are identical. The -compiler does not check whether the contents of the archive actually make -sense, so if you don't structure the data properly you may be unable to -unarchive the object later. Unlike a message, an archive may not be empty, -because that is pointless.

    +

    An archive is also a flattened BMessage, but one that was made by Archive()'ing a BArchivable class, such as BBitmap. The name of the archive, in this case BBitmap, is automatically added to the message in a field called "class". The "archive" keyword is optionally followed by a set of parentheses that enclose a string and/or an integer. The int is the "what" code, the string is stored in a field called "add_on" (used for dynamic loading of BArchivables). Other than that, archives and messages are identical. The compiler does not check whether the contents of the archive actually make sense, so if you don't structure the data properly you may be unable to unarchive the object later. Unlike a message, an archive may not be empty, because that is pointless.
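The field rules for messages and archives can be sketched with plain Python dictionaries standing in for a BMessage. Everything here is hypothetical (`add_field`, `build_archive`, the dictionary layout, the type names as strings); the default "what" code of 0 for archives is an assumption carried over from messages, and the order in which "class" and "add_on" are added is a guess.

```python
def add_field(fields, name, type_name, value):
    """Add a value to a message; same-named fields must share one type."""
    if name in fields:
        existing_type, values = fields[name]
        if existing_type != type_name:
            raise ValueError(f"field {name!r} already has type {existing_type}")
        values.append(value)  # values may differ in size
    else:
        fields[name] = (type_name, [value])


def build_archive(class_name, msgfields, add_on=None, what=0):
    """Model an archive: a message with an automatic "class" field."""
    if not msgfields:
        raise ValueError("an archive may not be empty")
    fields = {}
    add_field(fields, "class", "string", class_name)
    if add_on is not None:
        add_field(fields, "add_on", "string", add_on)
    for name, type_name, value in msgfields:
        add_field(fields, name, type_name, value)
    return {"what": what, "fields": fields}
```

A call such as `build_archive("BBitmap", [("_frame", "rect", (0, 0, 15, 15))])` yields a message whose "class" field holds "BBitmap"; the field name and value in that call are made up for the example.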

    User-defined types

    -

    We allow users to define their own types. A "type" is just a fancy array, -because the data from the various fields is simply concatenated into one big -block of bytes. The difference is that user-defined types are much easier to -fill in.

    +

    We allow users to define their own types. A "type" is just a fancy array, because the data from the various fields is simply concatenated into one big block of bytes. The difference is that user-defined types are much easier to fill in.

    -

    A user-defined type has a symbolic name, a type code, and a number of data -fields. After all the fields have been concatenated, the type code is applied -to the whole block. So, the data type of this resource is always the same as -its type code (unlike arrays, which are always raw data). If no type code is -specified, it defaults to B_RAW_TYPE.

    +

    A user-defined type has a symbolic name, a type code, and a number of data fields. After all the fields have been concatenated, the type code is applied to the whole block. So, the data type of this resource is always the same as its type code (unlike arrays, which are always raw data). If no type code is specified, it defaults to B_RAW_TYPE.

    -

    The data fields always have a default value. For simple fields this is -typically 0 (numeric types) or empty (string, raw, message). The default value -of a user-defined type as a whole is the combination of the default values of -its fields. Of course, the user can specify other defaults. (When a user -creates a new resource that uses such a type, he is basically overriding the -default values with his own.)

    +

    The data fields always have a default value. For simple fields this is typically 0 (numeric types) or empty (string, raw, message). The default value of a user-defined type as a whole is the combination of the default values of its fields. Of course, the user can specify other defaults. (When a user creates a new resource that uses such a type, he is basically overriding the default values with his own.)

    -

    The user may fill in the data fields by name, by order, or using a -combination of both. Every time the compiler sees an unnamed data item, it -stuffs it into the next available field. Named data items are simply assigned -to the field with the same name, and may overwrite a value that was previously -put there "by order". Any fields that are not filled in keep their default -value. For example:

    +

    The user may fill in the data fields by name, by order, or using a combination of both. Every time the compiler sees an unnamed data item, it stuffs it into the next available field. Named data items are simply assigned to the field with the same name, and may overwrite a value that was previously put there "by order". Any fields that are not filled in keep their default value. For example:

    type vector { int32 x, int32 y, int32 z, int32 w = 4 };
     resource(1) vector { 1, 3, x = 2 };
    -

    Here, x is first set to 1, y is set to 3, x is now overwritten by the value -2, z is given the default value 0, and w defaults to 4.

    +

    Here, x is first set to 1, y is set to 3, x is now overwritten by the value 2, z is given the default value 0, and w defaults to 4.
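
    The fill rule above can be sketched in a few lines of C++. This is only an illustration, not the compiler's actual code: the Field and Item structs and the fill_fields() function are hypothetical names, and the sketch assumes that a named item does not move the "next available field" position, which the text does not spell out.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model of one field of a user-defined type: a name and a
// value that starts out as the field's default.
struct Field {
    std::string name;
    int32_t value;
};

// One data item from a resource body; `name` is empty for an unnamed item.
struct Item {
    std::string name;
    int32_t value;
};

// Unnamed items go into the next available field, named items overwrite
// the field with the same name, and untouched fields keep their defaults.
// Returns false for an unknown field name or too many unnamed items.
bool fill_fields(std::vector<Field>& fields, const std::vector<Item>& items)
{
    std::size_t next = 0;  // index of the next field to fill "by order"
    for (const Item& item : items) {
        if (item.name.empty()) {
            if (next >= fields.size())
                return false;
            fields[next++].value = item.value;
            continue;
        }
        bool found = false;
        for (Field& field : fields) {
            if (field.name == item.name) {
                field.value = item.value;  // may overwrite a "by order" value
                found = true;
                break;
            }
        }
        if (!found)
            return false;
    }
    return true;
}
```

    Starting from the defaults { x = 0, y = 0, z = 0, w = 4 } and feeding in the items 1, 3, x = 2 leaves the fields at x = 2, y = 3, z = 0, w = 4, which matches the walkthrough above.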

    -

    Note: if a user-defined type contains string, raw, or message fields, the -size of the type depends on the data that the user puts into it, because these -fields have a variable size. However, the user may specify a fixed size for a -field (number of bytes, enclosed in square brackets following the field name). -In this case, data that is too long will be truncated and data that is too -short will be padded with zeroes. You can do this for all types, but it really -only makes sense for strings and raw data.

    +

    Note: if a user-defined type contains string, raw, or message fields, the size of the type depends on the data that the user puts into it, because these fields have a variable size. However, the user may specify a fixed size for a field (number of bytes, enclosed in square brackets following the field name). In this case, data that is too long will be truncated and data that is too short will be padded with zeroes. You can do this for all types, but it really only makes sense for strings and raw data. More about this in the manual.
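
    A minimal sketch of the truncate-or-pad behaviour for fixed-size fields, assuming the field data is held as a plain byte vector. The function name fit_to_size() is made up for this example and is not taken from the compiler sources.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns a copy of `data` that is exactly `size` bytes long: data that is
// too long is truncated, data that is too short is padded with zeroes.
std::vector<uint8_t> fit_to_size(const std::vector<uint8_t>& data,
                                 std::size_t size)
{
    std::vector<uint8_t> out(data);
    out.resize(size, 0);  // drops the excess bytes or appends zero bytes
    return out;
}
```

    For a field declared with a size of 4, three bytes of data would come out as those three bytes followed by a single zero byte, and six bytes of data would lose the last two.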

    -

    A type definition may also contain a default resource ID and name. The -default ID of built-in types is usually 1 and the name is empty (NULL). For -example:

    +

    A type definition may also contain a default resource ID and name. The default ID of built-in types is usually 1 and the name is empty (NULL). For example:

    type(10, "MyName") mytype { int32 a };
     resource mytype 123;
    -

    The resource is now called "MyName" and has ID 10. Obviously you can only do -this once or you will receive a duplicate resource error. If this type is used -inside an array or other compound type, the default ID and resource name are -ignored. Note: this feature introduces a shift/reduce conflict in the -compiler:

    +

    The resource is now called "MyName" and has ID 10. Obviously you can only do this once or you will receive a duplicate resource error. If this type is used inside an array or other compound type, the default ID and resource name are ignored. Note: this feature introduces a shift/reduce conflict in the compiler:

    resource (int8) 123;
    -

    This probably doesn't do what you expect. The compiler now considers the -"(int8)" to be the resource ID, not a typecast. If you did not declare "int8" -in an enum (probably not), this gives a compiler error. Not a big problem, -because it is unlikely that you will ever do this. Here is a workaround:

    +

    This probably doesn't do what you expect. The compiler now considers the "(int8)" to be the resource ID, not a typecast. If you did not declare "int8" in an enum (probably not), this gives a compiler error. Not a big problem, because it is unlikely that you will ever do this. Here is a workaround:

    resource() (int8) 123;
    +

    The grammar and Bison

    + +

    Above I mentioned one of the shift/reduce conflicts from the grammar. There are several others. These are mostly the result of keeping the original grammar intact as much as possible, without having to introduce weird syntax rules for the new features. These issues aren't fatal but if you try to do something funky in your script, you may get an error message.

    + +

    The main culprit here is the "( expr )" rule from "data", which allows you to nest expressions with parens, e.g. "(10 + 5) * 3". This causes problems for Bison, because we already use parens all over the place. Specifically, this rule conflicts with the empty "MESSAGE" from the "message" rule, "ARRAY" from "array", and "IDENT" from "type". These rules have no other symbols following them, which makes them ambiguous with respect to the "datatype" rules. Still with me? The parser will typically pick the right one, though.

    + +

    The nested expressions rule also caused a reduce/reduce conflict. To get rid of that, I had to explicitly mention the names of the various types in the "typecast" rule, which introduces a little code duplication but it's not too bad. Just so you know, the original rule was simply:

    + +
    typecast
    +    : '(' datatype ')' { $$ = $2; }
    +    ;
    + +

    The new rule is a little more bulky:

    + +
    typecast
    +    : '(' ARRAY ')'   { ... }
    +    | '(' MESSAGE ')' { ... }
    +    ...and so on for all the datatypes...
    +    ;
    + +

    The unary minus operator is not part of the "expr" (or "data") rules, but of "integer" and "float". This also causes a shift/reduce warning.

    +

    Symbol table

    -

    The compiler uses two symbol tables: one for the enum symbols, and one with -the data type definitions. We need those tables to find the numeric ID/type -definition that corresponds with an identifier, and to make sure that there are -no duplicate or missing identifiers. These two symbol tables are independent, -so you may use the same identifier both as an enum symbol and a type name.

    +

    The compiler uses two symbol tables: one for the enum symbols, and one with the data type definitions. We need those tables to find the numeric ID/type definition that corresponds with an identifier, and to make sure that there are no duplicate or missing identifiers. These two symbol tables are independent, so you may use the same identifier both as an enum symbol and a type name.
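
    To illustrate why the same identifier can live in both tables, here is a rough sketch using two independent maps. The SymbolTables struct is hypothetical, and it simplifies a type definition down to just its type code; the real compiler stores more than that.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Two independent lookup tables, as described above. Inserting into one
// never affects the other, so a name may appear in both.
struct SymbolTables {
    std::map<std::string, int32_t> enums;   // enum symbol -> numeric value
    std::map<std::string, uint32_t> types;  // type name -> type code (simplified)

    // Both add functions return false if the identifier is a duplicate
    // within its own table.
    bool add_enum(const std::string& name, int32_t value)
    {
        return enums.emplace(name, value).second;
    }

    bool add_type(const std::string& name, uint32_t code)
    {
        return types.emplace(name, code).second;
    }

    // Returns false if the enum symbol is missing.
    bool find_enum(const std::string& name, int32_t& value) const
    {
        auto it = enums.find(name);
        if (it == enums.end())
            return false;
        value = it->second;
        return true;
    }
};
```

    With this layout, adding "vector" as an enum symbol and then as a type name both succeed, while adding "vector" as an enum symbol a second time is rejected as a duplicate.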

    -

    The compiler does not need to keep a symbol table for the resources. -Although the combination of a resource's ID and its type code must be unique, -we can use the BResources class to check for this when we add a resource. There -is no point in duplicating this functionality in the compiler. (However, if we -are merging the new resources into an existing resource file, we will simply -overwrite duplicates.)

    +

    The compiler does not need to keep a symbol table for the resources. Although the combination of a resource's ID and its type code must be unique, we can use the BResources class to check for this when we add a resource. There is no point in duplicating this functionality in the compiler. (However, if we are merging the new resources into an existing resource file, we will simply overwrite duplicates.)

    Misc remarks

    diff --git a/src/tools/rc/docs/history.html b/src/tools/rc/docs/history.html index d036353c74..257a22b9e6 100644 --- a/src/tools/rc/docs/history.html +++ b/src/tools/rc/docs/history.html @@ -8,16 +8,22 @@

    This is the change history of the rc resource compiler and librdef.

    -

    Version 1.1 (not released yet):

    +

    Version 1.1 (December 22, 2003):

    Version 1.0 (February 16, 2003):

    diff --git a/src/tools/rc/docs/librdef.html b/src/tools/rc/docs/librdef.html index 41e2843c93..bb54d3831f 100644 --- a/src/tools/rc/docs/librdef.html +++ b/src/tools/rc/docs/librdef.html @@ -6,70 +6,50 @@

    The librdef library

    -

    Of course, it would be cool if other applications (such as GUI resource -editors) could also import and export rdef files. That is why the bulk of rc's -functionality is implemented in a separate shared library, librdef.so.

    +

    Of course, it would be cool if other applications (such as GUI resource editors) could also import and export rdef files. That is why the bulk of rc's functionality is implemented in a separate shared library, librdef.so.

    -

    Using the library in your own applications is very simple. Here are some -quick instructions to get you started:

    +

    Using the library in your own applications is very simple. Here are some quick instructions to get you started:

    1. #include "rdef.h" in your sources
    2. link your app to librdef.so
    -

    The API is rather bare-bones, but it gets the job done. The library uses -files to transfer data to and from your application. This may seem odd, but it -is actually a big advantage. After calling the API functions to compile an rdef -file, you can use the standard BResources class to read the resources from the -output file. Chances are high that your application already knows how to do -this.

    +

    The API is rather bare-bones, but it gets the job done. The library uses files to transfer data to and from your application. This may seem odd, but it is actually a big advantage. After calling the API functions to compile an rdef file, you can use the standard BResources class to read the resources from the output file. Chances are high that your application already knows how to do this.

    To compile a resource file, the steps are typically these:

    -
    1. Call rdef_add_include_dir() one or more times to add include -file search paths.
    +
    1. Call rdef_add_include_dir() one or more times to add include file search paths.
    -
    2. Call rdef_add_input_file() one or more times to add the rdef -files that you want to compile.
    +
    2. Call rdef_add_input_file() one or more times to add the rdef files that you want to compile.
    3. Call rdef_set_flags() to toggle compiler options.
    -
    4. Call rdef_compile() with the name of the output file. This -performs the actual compilation.
    +
    4. Call rdef_compile() with the name of the output file. This performs the actual compilation.
    -
    5. Call rdef_free_input_files() to clear the list of input files -that you added earlier.
    +
    5. Call rdef_free_input_files() to clear the list of input files that you added earlier.
    -
    6. Call rdef_free_include_dirs() to clear the list of include -directories that you added earlier.
    +
    6. Call rdef_free_include_dirs() to clear the list of include directories that you added earlier.
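
    The compile steps could be wrapped in a helper along the following lines. Treat this as a sketch under assumptions: the function names come from the list, but the signatures, the status_t and B_OK definitions, and the call_log vector are stand-ins written so the example compiles outside BeOS/Haiku. In a real application the declarations come from "rdef.h" and the system headers, and may differ from what is shown here. The helper name compile_rdef() is made up.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// --- Stand-ins for rdef.h (assumed signatures; they only log the calls) ---
typedef int32_t status_t;
const status_t B_OK = 0;

static std::vector<std::string> call_log;

status_t rdef_add_include_dir(const char* dir)
{
    call_log.push_back(std::string("include ") + dir);
    return B_OK;
}

status_t rdef_add_input_file(const char* file)
{
    call_log.push_back(std::string("input ") + file);
    return B_OK;
}

void rdef_set_flags(uint32_t) { call_log.push_back("flags"); }

status_t rdef_compile(const char* output)
{
    call_log.push_back(std::string("compile ") + output);
    return B_OK;
}

void rdef_free_input_files() { call_log.push_back("free input files"); }
void rdef_free_include_dirs() { call_log.push_back("free include dirs"); }

// --- The steps from the list, in order ---
status_t compile_rdef(const std::vector<std::string>& include_dirs,
                      const std::vector<std::string>& inputs,
                      uint32_t flags, const char* output)
{
    status_t err = B_OK;
    for (const std::string& dir : include_dirs) {
        if (err == B_OK)
            err = rdef_add_include_dir(dir.c_str());
    }
    for (const std::string& file : inputs) {
        if (err == B_OK)
            err = rdef_add_input_file(file.c_str());
    }
    if (err == B_OK) {
        rdef_set_flags(flags);
        err = rdef_compile(output);
    }
    // Clear both lists even when an earlier step failed.
    rdef_free_input_files();
    rdef_free_include_dirs();
    return err;
}
```

    Freeing the lists unconditionally at the end is a design choice of this sketch, so that a failed compile does not leave stale input files behind for the next call.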
    -

    Decompiling is very similar, although include directories are not used -here:

    +

    Decompiling is very similar, although include directories are not used here:

    -
    1. Call rdef_add_input_file() one or more times to add the -resource files that you want to decompile.
    +
    1. Call rdef_add_input_file() one or more times to add the resource files that you want to decompile.
    2. Call rdef_set_flags() to toggle compiler options.
    -
    3. Call rdef_decompile() with the name of the output file. The -name of the header file (if any) will be automatically constructed by appending -".h" to the output file name.
    +
    3. Call rdef_decompile() with the name of the output file. The name of the header file (if any) will be automatically constructed by appending ".h" to the output file name.
    -
    4. Call rdef_free_input_files() to clear the list of input files -that you added earlier.
    +
    4. Call rdef_free_input_files() to clear the list of input files that you added earlier.
    -

    If one of these functions returns something other than B_OK, an error -occurred. You can look at the following variables to find out more about the -error, and construct meaningful error messages:

    +

    If one of these functions returns something other than B_OK, an error occurred. You can look at the following variables to find out more about the error, and construct meaningful error messages:

    -

    For more information about using librdef, see "rdef.h", which explains the -available functions and data structures in more depth. For a real-world -example, take a look at "rc.cpp", which contains the complete implementation of -the rc compiler. As you'll see, there really isn't much to it, because librdef -already does all the work.

    +

    For more information about using librdef, see "rdef.h", which explains the available functions and data structures in more depth. For a real-world example, take a look at "rc.cpp", which contains the complete implementation of the rc compiler. As you'll see, there really isn't much to it, because librdef already does all the work.

    diff --git a/src/tools/rc/docs/notes.html b/src/tools/rc/docs/notes.html new file mode 100644 index 0000000000..4ad189c39b --- /dev/null +++ b/src/tools/rc/docs/notes.html @@ -0,0 +1,43 @@ + + +Work notes + + + +

    Work notes

    + +

    These are my notes on the evolution of rc. I used to keep these in separate files on my development machine, but it makes more sense to include them here. Warning: geeky stuff ahead.

    + + + + + diff --git a/src/tools/rc/docs/rc.html b/src/tools/rc/docs/rc.html index f73dd0fea0..4850a564bc 100644 --- a/src/tools/rc/docs/rc.html +++ b/src/tools/rc/docs/rc.html @@ -6,6 +6,8 @@

    The rc resource compiler

    +

    Version 1.1

    +

    Table of contents