This is ld.info, produced by makeinfo version 4.0 from ./ld.texinfo.

START-INFO-DIR-ENTRY
* Ld: (ld).                       The GNU linker.
END-INFO-DIR-ENTRY

This file documents the GNU linker LD version 2.10.

Copyright (C) 1991, 92, 93, 94, 95, 96, 97, 98, 99, 2000 Free Software Foundation, Inc.

Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies.

Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided also that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.

Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions.


File: ld.info, Node: Overlay Description, Prev: Output Section Attributes, Up: SECTIONS

Overlay description
-------------------

An overlay description provides an easy way to describe sections which are to be loaded as part of a single memory image but are to be run at the same memory address. At run time, some sort of overlay manager will copy the overlaid sections in and out of the runtime memory address as required, perhaps by simply manipulating addressing bits. This approach can be useful, for example, when a certain region of memory is faster than another.

Overlays are described using the `OVERLAY' command. The `OVERLAY' command is used within a `SECTIONS' command, like an output section description. The full syntax of the `OVERLAY' command is as follows:

     OVERLAY [START] : [NOCROSSREFS] [AT ( LDADDR )]
       {
         SECNAME1
           {
             OUTPUT-SECTION-COMMAND
             OUTPUT-SECTION-COMMAND
             ...
           } [:PHDR...] [=FILL]
         SECNAME2
           {
             OUTPUT-SECTION-COMMAND
             OUTPUT-SECTION-COMMAND
             ...
           } [:PHDR...] [=FILL]
         ...
       } [>REGION] [:PHDR...] [=FILL]

Everything is optional except `OVERLAY' (a keyword), and each section must have a name (SECNAME1 and SECNAME2 above).
The section definitions within the `OVERLAY' construct are identical to those within the general `SECTIONS' construct (*note SECTIONS::), except that no addresses and no memory regions may be defined for sections within an `OVERLAY'.

The sections are all defined with the same starting address. The load addresses of the sections are arranged such that they are consecutive in memory starting at the load address used for the `OVERLAY' as a whole (as with normal section definitions, the load address is optional, and defaults to the start address; the start address is also optional, and defaults to the current value of the location counter).

If the `NOCROSSREFS' keyword is used, and there are any references among the sections, the linker will report an error. Since the sections all run at the same address, it normally does not make sense for one section to refer directly to another. *Note NOCROSSREFS: Miscellaneous Commands.

For each section within the `OVERLAY', the linker automatically defines two symbols. The symbol `__load_start_SECNAME' is defined as the starting load address of the section. The symbol `__load_stop_SECNAME' is defined as the final load address of the section. Any characters within SECNAME which are not legal within C identifiers are removed. C (or assembler) code may use these symbols to move the overlaid sections around as necessary.

At the end of the overlay, the value of the location counter is set to the start address of the overlay plus the size of the largest section.

Here is an example. Remember that this would appear inside a `SECTIONS' construct.

     OVERLAY 0x1000 : AT (0x4000)
       {
         .text0 { o1/*.o(.text) }
         .text1 { o2/*.o(.text) }
       }

This will define both `.text0' and `.text1' to start at address 0x1000. `.text0' will be loaded at address 0x4000, and `.text1' will be loaded immediately after `.text0'. The following symbols will be defined: `__load_start_text0', `__load_stop_text0', `__load_start_text1', `__load_stop_text1'.
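The address arithmetic in this example can be checked by hand. The following C sketch is only an illustration of the rules stated above, not code taken from the linker: `struct overlay_section' and `layout_overlay' are hypothetical names, and the section sizes used with it are made up. It gives every section the same run-time address, assigns consecutive load addresses starting at the overlay's load address, and returns the value the location counter would have after the overlay.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one overlaid section.  This is not an ld
   data structure; it exists only for this illustration.  */
struct overlay_section
{
  unsigned long size;        /* input: size of the section in bytes */
  unsigned long vma;         /* output: run-time address */
  unsigned long load_start;  /* output: value of __load_start_SECNAME */
  unsigned long load_stop;   /* output: value of __load_stop_SECNAME */
};

/* Lay out N sections as the text describes: every section runs at START,
   and load addresses are consecutive beginning at LDADDR.  Returns the
   location counter after the overlay, that is, START plus the size of
   the largest section.  */
unsigned long
layout_overlay (unsigned long start, unsigned long ldaddr,
                struct overlay_section *sec, size_t n)
{
  unsigned long largest = 0;
  unsigned long lma = ldaddr;
  size_t i;

  assert (sec != NULL || n == 0);

  for (i = 0; i < n; i++)
    {
      sec[i].vma = start;
      sec[i].load_start = lma;
      sec[i].load_stop = lma + sec[i].size;
      lma = sec[i].load_stop;       /* next section loads right after */
      if (sec[i].size > largest)
        largest = sec[i].size;
    }
  return start + largest;
}
```

Assuming, purely for illustration, that `.text0' is 0x300 bytes and `.text1' is 0x500 bytes, calling `layout_overlay (0x1000, 0x4000, sections, 2)' places the load image of `.text1' at 0x4300 through 0x4800 and returns 0x1500 as the location counter following the overlay.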
C code to copy overlay `.text1' into the overlay area might look like the following.

     extern char __load_start_text1, __load_stop_text1;
     memcpy ((char *) 0x1000, &__load_start_text1,
             &__load_stop_text1 - &__load_start_text1);

Note that the `OVERLAY' command is just syntactic sugar, since everything it does can be done using the more basic commands. The above example could have been written identically as follows.

     .text0 0x1000 : AT (0x4000) { o1/*.o(.text) }
     __load_start_text0 = LOADADDR (.text0);
     __load_stop_text0 = LOADADDR (.text0) + SIZEOF (.text0);
     .text1 0x1000 : AT (0x4000 + SIZEOF (.text0)) { o2/*.o(.text) }
     __load_start_text1 = LOADADDR (.text1);
     __load_stop_text1 = LOADADDR (.text1) + SIZEOF (.text1);
     . = 0x1000 + MAX (SIZEOF (.text0), SIZEOF (.text1));


File: ld.info, Node: MEMORY, Next: PHDRS, Prev: SECTIONS, Up: Scripts

MEMORY command
==============

The linker's default configuration permits allocation of all available memory. You can override this by using the `MEMORY' command.

The `MEMORY' command describes the location and size of blocks of memory in the target. You can use it to describe which memory regions may be used by the linker, and which memory regions it must avoid. You can then assign sections to particular memory regions. The linker will set section addresses based on the memory regions, and will warn about regions that become too full. The linker will not shuffle sections around to fit into the available regions.

A linker script may contain at most one use of the `MEMORY' command. However, you can define as many blocks of memory within it as you wish. The syntax is:

     MEMORY
       {
         NAME [(ATTR)] : ORIGIN = ORIGIN, LENGTH = LEN
         ...
       }

The NAME is a name used in the linker script to refer to the region. The region name has no meaning outside of the linker script. Region names are stored in a separate name space, and will not conflict with symbol names, file names, or section names. Each memory region must have a distinct name.
The ATTR string is an optional list of attributes that specify whether to use a particular memory region for an input section which is not explicitly mapped in the linker script. As described in *Note SECTIONS::, if you do not specify an output section for some input section, the linker will create an output section with the same name as the input section. If you define region attributes, the linker will use them to select the memory region for the output section that it creates.

The ATTR string must consist only of the following characters:

`R'
     Read-only section

`W'
     Read/write section

`X'
     Executable section

`A'
     Allocatable section

`I'
     Initialized section

`L'
     Same as `I'

`!'
     Invert the sense of any of the preceding attributes

If an unmapped section matches any of the listed attributes other than `!', it will be placed in the memory region. The `!' attribute reverses this test, so that an unmapped section will be placed in the memory region only if it does not match any of the listed attributes.

The ORIGIN is an expression for the start address of the memory region. The expression must evaluate to a constant before memory allocation is performed, which means that you may not use any section relative symbols. The keyword `ORIGIN' may be abbreviated to `org' or `o' (but not, for example, `ORG').

The LEN is an expression for the size in bytes of the memory region. As with the ORIGIN expression, the expression must evaluate to a constant before memory allocation is performed. The keyword `LENGTH' may be abbreviated to `len' or `l'.

In the following example, we specify that there are two memory regions available for allocation: one starting at `0' for 256 kilobytes, and the other starting at `0x40000000' for four megabytes. The linker will place into the `rom' memory region every section which is not explicitly mapped into a memory region, and is either read-only or executable.
The linker will place other sections which are not explicitly mapped into a memory region into the `ram' memory region.

     MEMORY
       {
         rom (rx)  : ORIGIN = 0, LENGTH = 256K
         ram (!rx) : org = 0x40000000, l = 4M
       }

Once you define a memory region, you can direct the linker to place specific output sections into that memory region by using the `>REGION' output section attribute. For example, if you have a memory region named `mem', you would use `>mem' in the output section definition. *Note Output Section Region::. If no address was specified for the output section, the linker will set the address to the next available address within the memory region. If the combined output sections directed to a memory region are too large for the region, the linker will issue an error message.


File: ld.info, Node: PHDRS, Next: VERSION, Prev: MEMORY, Up: Scripts

PHDRS Command
=============

The ELF object file format uses "program headers", also known as "segments". The program headers describe how the program should be loaded into memory. You can print them out by using the `objdump' program with the `-p' option.

When you run an ELF program on a native ELF system, the system loader reads the program headers in order to figure out how to load the program. This will only work if the program headers are set correctly. This manual does not describe the details of how the system loader interprets program headers; for more information, see the ELF ABI.

The linker will create reasonable program headers by default. However, in some cases, you may need to specify the program headers more precisely. You may use the `PHDRS' command for this purpose. When the linker sees the `PHDRS' command in the linker script, it will not create any program headers other than the ones specified.

The linker only pays attention to the `PHDRS' command when generating an ELF output file. In other cases, the linker will simply ignore `PHDRS'.

This is the syntax of the `PHDRS' command.
The words `PHDRS', `FILEHDR', `AT', and `FLAGS' are keywords.

     PHDRS
     {
       NAME TYPE [ FILEHDR ] [ PHDRS ] [ AT ( ADDRESS ) ]
             [ FLAGS ( FLAGS ) ] ;
     }

The NAME is used only for reference in the `SECTIONS' command of the linker script. It is not put into the output file. Program header names are stored in a separate name space, and will not conflict with symbol names, file names, or section names. Each program header must have a distinct name.

Certain program header types describe segments of memory which the system loader will load from the file. In the linker script, you specify the contents of these segments by placing allocatable output sections in the segments. You use the `:PHDR' output section attribute to place a section in a particular segment. *Note Output Section Phdr::.

It is normal to put certain sections in more than one segment. This merely implies that one segment of memory contains another. You may repeat `:PHDR', using it once for each segment which should contain the section.

If you place a section in one or more segments using `:PHDR', then the linker will place all subsequent allocatable sections which do not specify `:PHDR' in the same segments. This is for convenience, since generally a whole set of contiguous sections will be placed in a single segment. You can use `:NONE' to override the default segment and tell the linker to not put the section in any segment at all.

You may use the `FILEHDR' and `PHDRS' keywords after the program header type to further describe the contents of the segment. The `FILEHDR' keyword means that the segment should include the ELF file header. The `PHDRS' keyword means that the segment should include the ELF program headers themselves.

The TYPE may be one of the following. The numbers indicate the value of the keyword.

`PT_NULL' (0)
     Indicates an unused program header.

`PT_LOAD' (1)
     Indicates that this program header describes a segment to be loaded from the file.
`PT_DYNAMIC' (2)
     Indicates a segment where dynamic linking information can be found.

`PT_INTERP' (3)
     Indicates a segment where the name of the program interpreter may be found.

`PT_NOTE' (4)
     Indicates a segment holding note information.

`PT_SHLIB' (5)
     A reserved program header type, defined but not specified by the ELF ABI.

`PT_PHDR' (6)
     Indicates a segment where the program headers may be found.

EXPRESSION
     An expression giving the numeric type of the program header. This may be used for types not defined above.

You can specify that a segment should be loaded at a particular address in memory by using an `AT' expression. This is identical to the `AT' command used as an output section attribute (*note Output Section LMA::). The `AT' command for a program header overrides the output section attribute.

The linker will normally set the segment flags based on the sections which comprise the segment. You may use the `FLAGS' keyword to explicitly specify the segment flags. The value of FLAGS must be an integer. It is used to set the `p_flags' field of the program header.

Here is an example of `PHDRS'. This shows a typical set of program headers used on a native ELF system.

     PHDRS
     {
       headers PT_PHDR PHDRS ;
       interp PT_INTERP ;
       text PT_LOAD FILEHDR PHDRS ;
       data PT_LOAD ;
       dynamic PT_DYNAMIC ;
     }

     SECTIONS
     {
       . = SIZEOF_HEADERS;
       .interp : { *(.interp) } :text :interp
       .text : { *(.text) } :text
       .rodata : { *(.rodata) } /* defaults to :text */
       ...
       . = . + 0x1000; /* move to a new page in memory */
       .data : { *(.data) } :data
       .dynamic : { *(.dynamic) } :data :dynamic
       ...
     }


File: ld.info, Node: VERSION, Next: Expressions, Prev: PHDRS, Up: Scripts

VERSION Command
===============

The linker supports symbol versions when using ELF. Symbol versions are only useful when using shared libraries. The dynamic linker can use symbol versions to select a specific version of a function when it runs a program that may have been linked against an earlier version of the shared library.
You can include a version script directly in the main linker script, or you can supply the version script as an implicit linker script. You can also use the `--version-script' linker option.

The syntax of the `VERSION' command is simply

     VERSION { version-script-commands }

The format of the version script commands is identical to that used by Sun's linker in Solaris 2.5. The version script defines a tree of version nodes. You specify the node names and interdependencies in the version script. You can specify which symbols are bound to which version nodes, and you can reduce a specified set of symbols to local scope so that they are not globally visible outside of the shared library.

The easiest way to demonstrate the version script language is with a few examples.

     VERS_1.1 {
          global:
                  foo1;
          local:
                  old*;
                  original*;
                  new*;
     };

     VERS_1.2 {
                  foo2;
     } VERS_1.1;

     VERS_2.0 {
                  bar1; bar2;
     } VERS_1.2;

This example version script defines three version nodes. The first version node defined is `VERS_1.1'; it has no other dependencies. The script binds the symbol `foo1' to `VERS_1.1'. It reduces a number of symbols to local scope so that they are not visible outside of the shared library.

Next, the version script defines node `VERS_1.2'. This node depends upon `VERS_1.1'. The script binds the symbol `foo2' to the version node `VERS_1.2'.

Finally, the version script defines node `VERS_2.0'. This node depends upon `VERS_1.2'. The script binds the symbols `bar1' and `bar2' to the version node `VERS_2.0'.

When the linker finds a symbol defined in a library which is not specifically bound to a version node, it will effectively bind it to an unspecified base version of the library. You can bind all otherwise unspecified symbols to a given version node by using `global: *' somewhere in the version script.

The names of the version nodes have no specific meaning other than what they might suggest to the person reading them.
The `2.0' version could just as well have appeared in between `1.1' and `1.2'. However, this would be a confusing way to write a version script.

When you link an application against a shared library that has versioned symbols, the application itself knows which version of each symbol it requires, and it also knows which version nodes it needs from each shared library it is linked against. Thus at runtime, the dynamic loader can make a quick check to make sure that the libraries you have linked against do in fact supply all of the version nodes that the application will need to resolve all of the dynamic symbols. In this way it is possible for the dynamic linker to know with certainty that all external symbols that it needs will be resolvable without having to search for each symbol reference.

Symbol versioning is in effect a much more sophisticated way of doing the minor version checking that SunOS does. The fundamental problem that is being addressed here is that typically references to external functions are bound on an as-needed basis, and are not all bound when the application starts up. If a shared library is out of date, a required interface may be missing; when the application tries to use that interface, it may suddenly and unexpectedly fail. With symbol versioning, the user will get a warning when they start their program if the libraries being used with the application are too old.

There are several GNU extensions to Sun's versioning approach. The first of these is the ability to bind a symbol to a version node in the source file where the symbol is defined instead of in the versioning script. This was done mainly to reduce the burden on the library maintainer. You can do this by putting something like:

     __asm__(".symver original_foo,foo@VERS_1.1");

in the C source file. This renames the function `original_foo' to be an alias for `foo' bound to the version node `VERS_1.1'.
The `local:' directive can be used to prevent the symbol `original_foo' from being exported.

The second GNU extension is to allow multiple versions of the same function to appear in a given shared library. In this way you can make an incompatible change to an interface without increasing the major version number of the shared library, while still allowing applications linked against the old interface to continue to function.

To do this, you must use multiple `.symver' directives in the source file. Here is an example:

     __asm__(".symver original_foo,foo@");
     __asm__(".symver old_foo,foo@VERS_1.1");
     __asm__(".symver old_foo1,foo@VERS_1.2");
     __asm__(".symver new_foo,foo@@VERS_2.0");

In this example, `foo@' represents the symbol `foo' bound to the unspecified base version of the symbol. The source file that contains this example would define 4 C functions: `original_foo', `old_foo', `old_foo1', and `new_foo'.

When you have multiple definitions of a given symbol, there needs to be some way to specify a default version to which external references to this symbol will be bound. You can do this with the `foo@@VERS_2.0' type of `.symver' directive. You can only declare one version of a symbol as the default in this manner; otherwise you would effectively have multiple definitions of the same symbol.

If you wish to bind a reference to a specific version of the symbol within the shared library, you can use the aliases of convenience (i.e. `old_foo'), or you can use the `.symver' directive to specifically bind to an external version of the function in question.


File: ld.info, Node: Expressions, Next: Implicit Linker Scripts, Prev: VERSION, Up: Scripts

Expressions in Linker Scripts
=============================

The syntax for expressions in the linker script language is identical to that of C expressions. All expressions are evaluated as integers. All expressions are evaluated in the same size, which is 32 bits if both the host and target are 32 bits, and is otherwise 64 bits.
You can use and set symbol values in expressions.

The linker defines several special purpose builtin functions for use in expressions.

* Menu:

* Constants::                   Constants
* Symbols::                     Symbol Names
* Location Counter::            The Location Counter
* Operators::                   Operators
* Evaluation::                  Evaluation
* Expression Section::          The Section of an Expression
* Builtin Functions::           Builtin Functions


File: ld.info, Node: Constants, Next: Symbols, Up: Expressions

Constants
---------

All constants are integers.

As in C, the linker considers an integer beginning with `0' to be octal, and an integer beginning with `0x' or `0X' to be hexadecimal. The linker considers other integers to be decimal.

In addition, you can use the suffixes `K' and `M' to scale a constant by `1024' or `1024*1024' respectively. For example, the following all refer to the same quantity:

     _fourk_1 = 4K;
     _fourk_2 = 4096;
     _fourk_3 = 0x1000;


File: ld.info, Node: Symbols, Next: Location Counter, Prev: Constants, Up: Expressions

Symbol Names
------------

Unless quoted, symbol names start with a letter, underscore, or period and may include letters, digits, underscores, periods, and hyphens. Unquoted symbol names must not conflict with any keywords. You can specify a symbol which contains odd characters or has the same name as a keyword by surrounding the symbol name in double quotes:

     "SECTION" = 9;
     "with a space" = "also with a space" + 10;

Since symbols can contain many non-alphabetic characters, it is safest to delimit symbols with spaces. For example, `A-B' is one symbol, whereas `A - B' is an expression involving subtraction.


File: ld.info, Node: Location Counter, Next: Operators, Prev: Symbols, Up: Expressions

The Location Counter
--------------------

The special linker variable "dot" `.' always contains the current output location counter. Since the `.' always refers to a location in an output section, it may only appear in an expression within a `SECTIONS' command. The `.'
symbol may appear anywhere that an ordinary symbol is allowed in an expression.

Assigning a value to `.' will cause the location counter to be moved. This may be used to create holes in the output section. The location counter may never be moved backwards.

     SECTIONS
     {
       output :
         {
           file1(.text)
           . = . + 1000;
           file2(.text)
           . += 1000;
           file3(.text)
         } = 0x1234;
     }

In the previous example, the `.text' section from `file1' is located at the beginning of the output section `output'. It is followed by a 1000 byte gap. Then the `.text' section from `file2' appears, also with a 1000 byte gap following before the `.text' section from `file3'. The notation `= 0x1234' specifies what data to write in the gaps (*note Output Section Fill::).

Note: `.' actually refers to the byte offset from the start of the current containing object. Normally this is the `SECTIONS' statement, whose start address is 0, hence `.' can be used as an absolute address. If `.' is used inside a section description however, it refers to the byte offset from the start of that section, not an absolute address. Thus in a script like this:

     SECTIONS
     {
         . = 0x100;
         .text : {
           *(.text)
           . = 0x200;
         }
         . = 0x500;
         .data : {
           *(.data)
           . += 0x600;
         }
     }

The `.text' section will be assigned a starting address of 0x100 and a size of exactly 0x200 bytes, even if there is not enough data in the `.text' input sections to fill this area. (If there is too much data, an error will be produced because this would be an attempt to move `.' backwards). The `.data' section will start at 0x500 and it will have an extra 0x600 bytes worth of space after the end of the values from the `.data' input sections and before the end of the `.data' output section itself.


File: ld.info, Node: Operators, Next: Evaluation, Prev: Location Counter, Up: Expressions

Operators
---------

The linker recognizes the standard C set of arithmetic operators, with the standard bindings and precedence levels:

     precedence      associativity   Operators                Notes
     (highest)
     1               left            !  -  ~                  (1)
     2               left            *  /  %
     3               left            +  -
     4               left            >>  <<
     5               left            ==  !=  >  <  <=  >=
     6               left            &
     7               left            |
     8               left            &&
     9               left            ||
     10              right           ? :
     11              right           &=  +=  -=  *=  /=       (2)
     (lowest)

Notes: (1) Prefix operators (2) *Note Assignments::.


File: ld.info, Node: Evaluation, Next: Expression Section, Prev: Operators, Up: Expressions

Evaluation
----------

The linker evaluates expressions lazily. It only computes the value of an expression when absolutely necessary.

The linker needs some information, such as the value of the start address of the first section, and the origins and lengths of memory regions, in order to do any linking at all. These values are computed as soon as possible when the linker reads in the linker script.

However, other values (such as symbol values) are not known or needed until after storage allocation. Such values are evaluated later, when other information (such as the sizes of output sections) is available for use in the symbol assignment expression.

The sizes of sections cannot be known until after allocation, so assignments dependent upon these are not performed until after allocation.

Some expressions, such as those depending upon the location counter `.', must be evaluated during section allocation.

If the result of an expression is required, but the value is not available, then an error results. For example, a script like the following

     SECTIONS
       {
         .text 9+this_isnt_constant :
           { *(.text) }
       }

will cause the error message `non constant expression for initial address'.


File: ld.info, Node: Expression Section, Next: Builtin Functions, Prev: Evaluation, Up: Expressions

The Section of an Expression
----------------------------

When the linker evaluates an expression, the result is either absolute or relative to some section. A relative expression is expressed as a fixed offset from the base of a section.

The position of the expression within the linker script determines whether it is absolute or relative.
An expression which appears within an output section definition is relative to the base of the output section. An expression which appears elsewhere will be absolute.

A symbol set to a relative expression will be relocatable if you request relocatable output using the `-r' option. That means that a further link operation may change the value of the symbol. The symbol's section will be the section of the relative expression.

A symbol set to an absolute expression will retain the same value through any further link operation. The symbol will be absolute, and will not have any particular associated section.

You can use the builtin function `ABSOLUTE' to force an expression to be absolute when it would otherwise be relative. For example, to create an absolute symbol set to the address of the end of the output section `.data':

     SECTIONS
       {
         .data : { *(.data) _edata = ABSOLUTE(.); }
       }

If `ABSOLUTE' were not used, `_edata' would be relative to the `.data' section.


File: ld.info, Node: Builtin Functions, Prev: Expression Section, Up: Expressions

Builtin Functions
-----------------

The linker script language includes a number of builtin functions for use in linker script expressions.

`ABSOLUTE(EXP)'
     Return the absolute (non-relocatable, as opposed to non-negative) value of the expression EXP. Primarily useful to assign an absolute value to a symbol within a section definition, where symbol values are normally section relative. *Note Expression Section::.

`ADDR(SECTION)'
     Return the absolute address (the VMA) of the named SECTION. Your script must previously have defined the location of that section. In the following example, `symbol_1' and `symbol_2' are assigned identical values:

          SECTIONS { ...
            .output1 :
              {
              start_of_output_1 = ABSOLUTE(.);
              ...
              }
            .output :
              {
              symbol_1 = ADDR(.output1);
              symbol_2 = start_of_output_1;
              }
          ... }

`ALIGN(EXP)'
     Return the location counter (`.') aligned to the next EXP boundary. EXP must be an expression whose value is a power of two.
     This is equivalent to

          (. + EXP - 1) & ~(EXP - 1)

     `ALIGN' doesn't change the value of the location counter--it just does arithmetic on it. Here is an example which aligns the output `.data' section to the next `0x2000' byte boundary after the preceding section and sets a variable within the section to the next `0x8000' boundary after the input sections:

          SECTIONS { ...
            .data ALIGN(0x2000): {
              *(.data)
              variable = ALIGN(0x8000);
            }
          ... }

     The first use of `ALIGN' in this example specifies the location of a section because it is used as the optional ADDRESS attribute of a section definition (*note Output Section Address::). The second use of `ALIGN' is used to define the value of a symbol.

     The builtin function `NEXT' is closely related to `ALIGN'.

`BLOCK(EXP)'
     This is a synonym for `ALIGN', for compatibility with older linker scripts. It is most often seen when setting the address of an output section.

`DEFINED(SYMBOL)'
     Return 1 if SYMBOL is in the linker global symbol table and is defined, otherwise return 0. You can use this function to provide default values for symbols. For example, the following script fragment shows how to set a global symbol `begin' to the first location in the `.text' section--but if a symbol called `begin' already existed, its value is preserved:

          SECTIONS { ...
            .text : {
              begin = DEFINED(begin) ? begin : . ;
              ...
            }
            ...
          }

`LOADADDR(SECTION)'
     Return the absolute LMA of the named SECTION. This is normally the same as `ADDR', but it may be different if the `AT' attribute is used in the output section definition (*note Output Section LMA::).

`MAX(EXP1, EXP2)'
     Returns the maximum of EXP1 and EXP2.

`MIN(EXP1, EXP2)'
     Returns the minimum of EXP1 and EXP2.

`NEXT(EXP)'
     Return the next unallocated address that is a multiple of EXP. This function is closely related to `ALIGN(EXP)'; unless you use the `MEMORY' command to define discontinuous memory for the output file, the two functions are equivalent.
`SIZEOF(SECTION)'
     Return the size in bytes of the named SECTION, if that section has been allocated. If the section has not been allocated when this is evaluated, the linker will report an error. In the following example, `symbol_1' and `symbol_2' are assigned identical values:

          SECTIONS{ ...
            .output : {
              .start = . ;
              ...
              .end = . ;
              }
            symbol_1 = .end - .start ;
            symbol_2 = SIZEOF(.output);
          ... }

`SIZEOF_HEADERS'
`sizeof_headers'
     Return the size in bytes of the output file's headers. This is information which appears at the start of the output file. You can use this number when setting the start address of the first section, if you choose, to facilitate paging.

     When producing an ELF output file, if the linker script uses the `SIZEOF_HEADERS' builtin function, the linker must compute the number of program headers before it has determined all the section addresses and sizes. If the linker later discovers that it needs additional program headers, it will report an error `not enough room for program headers'. To avoid this error, you must avoid using the `SIZEOF_HEADERS' function, or you must rework your linker script to avoid forcing the linker to use additional program headers, or you must define the program headers yourself using the `PHDRS' command (*note PHDRS::).


File: ld.info, Node: Implicit Linker Scripts, Prev: Expressions, Up: Scripts

Implicit Linker Scripts
=======================

If you specify a linker input file which the linker can not recognize as an object file or an archive file, it will try to read the file as a linker script. If the file can not be parsed as a linker script, the linker will report an error.

An implicit linker script will not replace the default linker script.

Typically an implicit linker script would contain only symbol assignments, or the `INPUT', `GROUP', or `VERSION' commands.

Any input files read because of an implicit linker script will be read at the position in the command line where the implicit linker script was read.
This can affect archive searching.


File: ld.info, Node: Machine Dependent, Next: BFD, Prev: Scripts, Up: Top

Machine Dependent Features
**************************

`ld' has additional features on some platforms; the following sections describe them. Machines where `ld' has no additional functionality are not listed.

* Menu:

* H8/300::                      `ld' and the H8/300
* i960::                        `ld' and the Intel 960 family
* ARM::                         `ld' and the ARM family


File: ld.info, Node: H8/300, Next: i960, Up: Machine Dependent

`ld' and the H8/300
===================

For the H8/300, `ld' can perform these global optimizations when you specify the `--relax' command-line option.

_relaxing address modes_
     `ld' finds all `jsr' and `jmp' instructions whose targets are within eight bits, and turns them into eight-bit program-counter relative `bsr' and `bra' instructions, respectively.

_synthesizing instructions_
     `ld' finds all `mov.b' instructions which use the sixteen-bit absolute address form, but refer to the top page of memory, and changes them to use the eight-bit address form. (That is: the linker turns `mov.b @AA:16' into `mov.b @AA:8' whenever the address AA is in the top page of memory).


File: ld.info, Node: i960, Next: ARM, Prev: H8/300, Up: Machine Dependent

`ld' and the Intel 960 family
=============================

You can use the `-AARCHITECTURE' command line option to specify one of the two-letter names identifying members of the 960 family; the option specifies the desired output target, and warns of any incompatible instructions in the input files. It also modifies the linker's search strategy for archive libraries, to support the use of libraries specific to each particular architecture, by including in the search loop names suffixed with the string identifying the architecture.
For example, if your `ld' command line included `-ACA' as well as `-ltry', the linker would look (in its built-in search paths, and in any paths you specify with `-L') for a library with the names try libtry.a tryca libtryca.a The first two possibilities would be considered in any event; the last two are due to the use of `-ACA'. You can meaningfully use `-A' more than once on a command line, since the 960 architecture family allows combination of target architectures; each use will add another pair of name variants to search for when `-l' specifies a library. `ld' supports the `--relax' option for the i960 family. If you specify `--relax', `ld' finds all `balx' and `calx' instructions whose targets are within 24 bits, and turns them into 24-bit program-counter relative `bal' and `cal' instructions, respectively. `ld' also turns `cal' instructions into `bal' instructions when it determines that the target subroutine is a leaf routine (that is, the target subroutine does not itself call any subroutines).  File: ld.info, Node: ARM, Prev: i960, Up: Machine Dependent `ld''s support for interworking between ARM and Thumb code ========================================================== For the ARM, `ld' will generate code stubs to allow function calls between ARM and Thumb code. These stubs only work with code that has been compiled and assembled with the `-mthumb-interwork' command line option. If it is necessary to link with old ARM object files or libraries which have not been compiled with the `-mthumb-interwork' option, then the `--support-old-code' command line switch should be given to the linker. This will make it generate larger stub functions which will work with non-interworking aware ARM code. Note, however, that the linker does not support generating stubs for function calls to non-interworking aware Thumb code. The `--thumb-entry' switch is a duplicate of the generic `--entry' switch, in that it sets the program's starting address.
But it also sets the bottom bit of the address, so that it can be branched to using a BX instruction, and the program will start executing in Thumb mode straight away.  File: ld.info, Node: BFD, Next: Reporting Bugs, Prev: Machine Dependent, Up: Top BFD *** The linker accesses object and archive files using the BFD libraries. These libraries allow the linker to use the same routines to operate on object files whatever the object file format. A different object file format can be supported simply by creating a new BFD back end and adding it to the library. To conserve runtime memory, however, the linker and associated tools are usually configured to support only a subset of the object file formats available. You can use `objdump -i' (*note objdump: (binutils.info)objdump.) to list all the formats available for your configuration. As with most implementations, BFD is a compromise between several conflicting requirements. The major factor influencing BFD design was efficiency: any time used converting between formats is time which would not have been spent had BFD not been involved. This is partly offset by abstraction payback; since BFD simplifies applications and back ends, more time and care may be spent optimizing algorithms for a greater speed. One minor artifact of the BFD solution which you should bear in mind is the potential for information loss. There are two places where useful information can be lost using the BFD mechanism: during conversion and during output. *Note BFD information loss::. * Menu: * BFD outline:: How it works: an outline of BFD  File: ld.info, Node: BFD outline, Up: BFD How it works: an outline of BFD =============================== When an object file is opened, BFD subroutines automatically determine the format of the input object file. They then build a descriptor in memory with pointers to routines that will be used to access elements of the object file's data structures. 
As different information from the object files is required, BFD reads from different sections of the file and processes them. For example, a very common operation for the linker is processing symbol tables. Each BFD back end provides a routine for converting between the object file's representation of symbols and an internal canonical format. When the linker asks for the symbol table of an object file, it calls through a memory pointer to the routine from the relevant BFD back end which reads and converts the table into a canonical form. The linker then operates upon the canonical form. When the link is finished and the linker writes the output file's symbol table, another BFD back end routine is called to take the newly created symbol table and convert it into the chosen output format. * Menu: * BFD information loss:: Information Loss * Canonical format:: The BFD canonical object-file format  File: ld.info, Node: BFD information loss, Next: Canonical format, Up: BFD outline Information Loss ---------------- _Information can be lost during output._ The output formats supported by BFD do not provide identical facilities, and information which can be described in one form has nowhere to go in another format. One example of this is alignment information in `b.out'. There is nowhere in an `a.out' format file to store alignment information on the contained data, so when a file is linked from `b.out' and an `a.out' image is produced, alignment information will not propagate to the output file. (The linker will still use the alignment information internally, so the link is performed correctly). Another example is COFF section names. COFF files may contain an unlimited number of sections, each one with a textual section name. If the target of the link is a format which does not have many sections (e.g., `a.out') or has sections without names (e.g., the Oasys format), the link cannot be done simply.
You can circumvent this problem by describing the desired input-to-output section mapping with the linker command language. _Information can be lost during canonicalization._ The BFD internal canonical form of the external formats is not exhaustive; there are structures in input formats for which there is no direct representation internally. This means that the BFD back ends cannot maintain all possible data richness through the transformation from external to internal and back to external formats. This limitation is only a problem when an application reads one format and writes another. Each BFD back end is responsible for maintaining as much data as possible, and the internal BFD canonical form has structures which are opaque to the BFD core, and exported only to the back ends. When a file is read in one format, the canonical form is generated for BFD and the application. At the same time, the back end saves away any information which may otherwise be lost. If the data is then written back in the same format, the back end routine will be able to use the canonical form provided by the BFD core as well as the information it prepared earlier. Since there is a great deal of commonality between back ends, there is no information lost when linking or copying big endian COFF to little endian COFF, or `a.out' to `b.out'. When a mixture of formats is linked, the information is only lost from the files whose format differs from the destination.  File: ld.info, Node: Canonical format, Prev: BFD information loss, Up: BFD outline The BFD canonical object-file format ------------------------------------ The greatest potential for loss of information occurs when there is the least overlap between the information provided by the source format, that stored by the canonical format, and that needed by the destination format. A brief description of the canonical form may help you understand which kinds of data you can count on preserving across conversions.
_files_ Information stored on a per-file basis includes target machine architecture, particular implementation format type, a demand pageable bit, and a write protected bit. Information like Unix magic numbers is not stored here--only the magic numbers' meaning, so a `ZMAGIC' file would have both the demand pageable bit and the write protected text bit set. The byte order of the target is stored on a per-file basis, so that big- and little-endian object files may be used with one another. _sections_ Each section in the input file contains the name of the section, the section's original address in the object file, size and alignment information, various flags, and pointers into other BFD data structures. _symbols_ Each symbol contains a pointer to the information for the object file which originally defined it, its name, its value, and various flag bits. When a BFD back end reads in a symbol table, it relocates all symbols to make them relative to the base of the section where they were defined. Doing this ensures that each symbol points to its containing section. Each symbol also has a varying amount of hidden private data for the BFD back end. Since the symbol points to the original file, the private data format for that symbol is accessible. `ld' can operate on a collection of symbols of wildly different formats without problems. Normal global and simple local symbols are maintained on output, so an output file (no matter its format) will retain symbols pointing to functions and to global, static, and common variables. Some symbol information is not worth retaining; in `a.out', type information is stored in the symbol table as long symbol names. This information would be useless to most COFF debuggers; the linker has command line switches to allow users to throw it away. 
There is one word of type information within the symbol, so if the format supports symbol type information within symbols (for example, COFF, IEEE, Oasys) and the type is simple enough to fit within one word (nearly everything but aggregates), the information will be preserved. _relocation level_ Each canonical BFD relocation record contains a pointer to the symbol to relocate to, the offset of the data to relocate, the section the data is in, and a pointer to a relocation type descriptor. Relocation is performed by passing messages through the relocation type descriptor and the symbol pointer. Therefore, relocations can be performed on output data using a relocation method that is only available in one of the input formats. For instance, Oasys provides a byte relocation format. A relocation record requesting this relocation type would point indirectly to a routine to perform this, so the relocation may be performed on a byte being written to a 68k COFF file, even though 68k COFF has no such relocation type. _line numbers_ Object formats can contain, for debugging purposes, some form of mapping between symbols, source line numbers, and addresses in the output file. These addresses have to be relocated along with the symbol information. Each symbol with an associated list of line number records points to the first record of the list. The head of a line number list consists of a pointer to the symbol, which allows finding out the address of the function whose line number is being described. The rest of the list is made up of pairs: offsets into the section and line numbers. Any format which can simply derive this information can pass it successfully between formats (COFF, IEEE and Oasys).  File: ld.info, Node: Reporting Bugs, Next: MRI, Prev: BFD, Up: Top Reporting Bugs ************** Your bug reports play an essential role in making `ld' reliable. Reporting a bug may help you by bringing a solution to your problem, or it may not. 
But in any case the principal function of a bug report is to help the entire community by making the next version of `ld' work better. Bug reports are your contribution to the maintenance of `ld'. In order for a bug report to serve its purpose, you must include the information that enables us to fix the bug. * Menu: * Bug Criteria:: Have you found a bug? * Bug Reporting:: How to report bugs  File: ld.info, Node: Bug Criteria, Next: Bug Reporting, Up: Reporting Bugs Have you found a bug? ===================== If you are not sure whether you have found a bug, here are some guidelines: * If the linker gets a fatal signal, for any input whatever, that is a `ld' bug. Reliable linkers never crash. * If `ld' produces an error message for valid input, that is a bug. * If `ld' does not produce an error message for invalid input, that may be a bug. In the general case, the linker can not verify that object files are correct. * If you are an experienced user of linkers, your suggestions for improvement of `ld' are welcome in any case.