NetBSD/gnu/dist/gdb/bfd/doc/bfd.info-7
2003-08-11 20:21:35 +00:00

This is bfd.info, produced by makeinfo version 4.1 from bfd.texinfo.
START-INFO-DIR-ENTRY
* Bfd: (bfd). The Binary File Descriptor library.
END-INFO-DIR-ENTRY
This file documents the BFD library.
Copyright (C) 1991, 2000, 2001 Free Software Foundation, Inc.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.1
or any later version published by the Free Software Foundation;
with no Invariant Sections, with no Front-Cover Texts, and with no
Back-Cover Texts. A copy of the license is included in the
section entitled "GNU Free Documentation License".

File: bfd.info, Node: coff, Next: elf, Prev: aout, Up: BFD back ends
coff backends
=============
BFD supports a number of different flavours of coff format. The
major differences between formats are the sizes and alignments of
fields in structures on disk, and the occasional extra field.
Coff in all its varieties is implemented with a few common files and
a number of implementation specific files. For example, the 88k bcs
coff format is implemented in the file `coff-m88k.c'. This file
`#include's `coff/m88k.h' which defines the external structure of the
coff format for the 88k, and `coff/internal.h' which defines the
internal structure. `coff-m88k.c' also defines the relocations used by
the 88k format *Note Relocations::.
The Intel i960 processor version of coff is implemented in
`coff-i960.c'. This file has the same structure as `coff-m88k.c',
except that it includes `coff/i960.h' rather than `coff/m88k.h'.
Porting to a new version of coff
--------------------------------
The recommended method is to select from the existing
implementations the version of coff which is most like the one you want
to use. For example, we'll say that i386 coff is the one you select,
and that your coff flavour is called foo. Copy `coff-i386.c' to
`coff-foo.c', copy `../include/coff/i386.h' to `../include/coff/foo.h',
and add the lines to `targets.c' and `Makefile.in' so that your new
back end is used. Alter the shapes of the structures in
`../include/coff/foo.h' so that they match what you need. You will
probably also have to add `#ifdef's to the code in `coff/internal.h' and
`coffcode.h' if your version of coff is too wild.
You can verify that your new BFD backend works quite simply by
building `objdump' from the `binutils' directory, and making sure that
its version of what's going on and your host system's idea (assuming it
has the pretty standard coff dump utility, usually called `att-dump' or
just `dump') are the same. Then clean up your code, and send what
you've done to Cygnus. Then your stuff will be in the next release, and
you won't have to keep integrating it.
How the coff backend works
--------------------------
File layout
...........
The Coff backend is split into generic routines that are applicable
to any Coff target and routines that are specific to a particular
target. The target-specific routines are further split into ones which
are basically the same for all Coff targets except that they use the
external symbol format or use different values for certain constants.
The generic routines are in `coffgen.c'. These routines work for
any Coff target. They use some hooks into the target specific code;
the hooks are in a `bfd_coff_backend_data' structure, one of which
exists for each target.
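The hook arrangement can be pictured with a much reduced sketch. All
names below (`toy_backend_data', `toy_bfd', `foo_backend' and so on) are
hypothetical and only imitate the shape of `bfd_coff_backend_data'; the
entry size 18 and the magic number 0x1234 are illustrative values, not
taken from any real target.

```c
#include <stddef.h>

/* Hypothetical, much reduced stand-in for a per-target hook table.  */
typedef struct toy_backend_data
{
  /* Size in bytes of one external symbol entry for this target.  */
  unsigned int symesz;
  /* Target hook: return nonzero if the header magic number is ours.  */
  int (*magic_ok) (unsigned int magic);
} toy_backend_data;

/* Hypothetical stand-in for a BFD: it only records its backend data.  */
typedef struct toy_bfd
{
  const toy_backend_data *backend;
} toy_bfd;

/* Accessors written in the style of the bfd_coff_* macros.  */
#define toy_symesz(abfd) ((abfd)->backend->symesz)
#define toy_magic_ok(abfd, magic) \
  ((abfd)->backend->magic_ok (magic))

/* Target-specific piece for an invented target "foo".  */
int
foo_magic_ok (unsigned int magic)
{
  return magic == 0x1234;
}

const toy_backend_data foo_backend = { 18, foo_magic_ok };

/* Generic routine: it works for any target because it only goes
   through the hooks.  Returns the size in bytes of a symbol table of
   COUNT entries, or 0 if the target rejects the magic number.  */
size_t
toy_symtab_size (const toy_bfd *abfd, unsigned int magic, size_t count)
{
  if (!toy_magic_ok (abfd, magic))
    return 0;
  return count * toy_symesz (abfd);
}
```

The generic routine never mentions "foo"; a second target would only
supply another `toy_backend_data'.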
The essentially similar target-specific routines are in
`coffcode.h'. This header file includes executable C code. The
various Coff targets first include the appropriate Coff header file,
make any special defines that are needed, and then include `coffcode.h'.
Some of the Coff targets then also have additional routines in the
target source file itself.
For example, `coff-i960.c' includes `coff/internal.h' and
`coff/i960.h'. It then defines a few constants, such as `I960', and
includes `coffcode.h'. Since the i960 has complex relocation types,
`coff-i960.c' also includes some code to manipulate the i960 relocs.
This code is not in `coffcode.h' because it would not be used by any
other target.
Bit twiddling
.............
Each flavour of coff supported in BFD has its own header file
describing the external layout of the structures. There is also an
internal description of the coff layout, in `coff/internal.h'. A major
function of the coff backend is swapping the bytes and twiddling the
bits to translate the external form of the structures into the normal
internal form. This is all performed in the `bfd_swap'_thing_direction
routines. Some elements are different sizes between different versions
of coff; it is the duty of the coff version specific include file to
override the definitions of various packing routines in `coffcode.h'.
E.g., the size of the line number entry in coff is sometimes 16 bits, and
sometimes 32 bits. `#define'ing `PUT_LNSZ_LNNO' and `GET_LNSZ_LNNO'
will select the correct one. No doubt, some day someone will find a
version of coff which has a varying field size not catered to at the
moment. To port BFD, that person will have to add more `#defines'.
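As a rough illustration of this translation, the sketch below swaps a
line number entry from an external, big-endian layout into host
integers. The structures and the `TOY_GET_LNSZ_LNNO' macro are
hypothetical simplifications modelled on the description above, not the
definitions in `coff/internal.h' or `coffcode.h'; big-endian byte order
is an assumption.

```c
#include <stdint.h>

/* Hypothetical external (on-disk) line number entry: raw bytes.  */
struct toy_external_lineno
{
  unsigned char l_addr[4];   /* address or symbol index, big-endian */
  unsigned char l_lnno[2];   /* line number, 16 bits in this flavour */
};

/* Hypothetical internal form: host integers, wide enough for any
   flavour.  */
struct toy_internal_lineno
{
  uint32_t l_addr;
  uint32_t l_lnno;
};

uint32_t
toy_get_be16 (const unsigned char *p)
{
  return ((uint32_t) p[0] << 8) | p[1];
}

uint32_t
toy_get_be32 (const unsigned char *p)
{
  return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
         | ((uint32_t) p[2] << 8) | p[3];
}

/* A flavour with 32-bit line numbers would define this macro
   differently before the shared code is included.  */
#ifndef TOY_GET_LNSZ_LNNO
#define TOY_GET_LNSZ_LNNO(ext) toy_get_be16 ((ext)->l_lnno)
#endif

/* Translate one entry from external to internal form.  */
void
toy_swap_lineno_in (const struct toy_external_lineno *ext,
                    struct toy_internal_lineno *in)
{
  in->l_addr = toy_get_be32 (ext->l_addr);
  in->l_lnno = TOY_GET_LNSZ_LNNO (ext);
}
```

Because the field width is hidden behind one macro, the swapping
routine itself need not change between flavours.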
Three of the bit twiddling routines are exported to `gdb':
`coff_swap_aux_in', `coff_swap_sym_in' and `coff_swap_lineno_in'. `GDB'
reads the symbol table on its own, but uses BFD to fix things up. More
of the bit twiddlers are exported for `gas': `coff_swap_aux_out',
`coff_swap_sym_out', `coff_swap_lineno_out', `coff_swap_reloc_out',
`coff_swap_filehdr_out', `coff_swap_aouthdr_out',
`coff_swap_scnhdr_out'. `Gas' currently keeps track of all the symbol
table and reloc drudgery itself, thereby saving the internal BFD
overhead, but uses BFD to swap things on the way out, making cross
ports much safer. Doing so also allows BFD (and thus the linker) to
use the same header files as `gas', which makes one avenue to disaster
disappear.
Symbol reading
..............
The simple canonical form for symbols used by BFD is not rich enough
to keep all the information available in a coff symbol table. The back
end gets around this problem by keeping the original symbol table
around, "behind the scenes".
When a symbol table is requested (through a call to
`bfd_canonicalize_symtab'), a request gets through to
`coff_get_normalized_symtab'. This reads the symbol table from the coff
file and swaps all the structures inside into the internal form. It
also fixes up all the pointers in the table (represented in the file by
offsets from the first symbol in the table) into physical pointers to
elements in the new internal table. This involves some work since the
meanings of fields change depending upon context: a field that is a
pointer to another structure in the symbol table at one moment may be
the size in bytes of a structure at the next. Another pass is made
over the table. All symbols which mark file names (`C_FILE' symbols)
are modified so that the internal string points to the value in the
auxent (the real filename) rather than the normal text associated with
the symbol (`".file"').
At this time the symbol names are moved around. Coff stores all
symbol names of fewer than nine characters physically within the symbol
table; longer names are kept at the end of the file in the string
table. This pass moves all strings into memory and replaces them with
pointers to the strings.
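The name lookup can be sketched as follows. The sketch assumes the
usual coff convention that an eight-byte name field whose first four
bytes are zero holds a string table offset in its last four bytes, and
it assumes big-endian byte order; `toy_symbol_name' and `TOY_SYMNMLEN'
are hypothetical names, not BFD functions.

```c
#include <stdint.h>
#include <string.h>

#define TOY_SYMNMLEN 8

/* Return the name held in the raw eight-byte name field FIELD.
   Short names are copied into BUF and NUL-terminated, since a name of
   exactly eight characters has no terminator in the file.  Long names
   are returned as a pointer into STRTAB, the string table, with
   offsets counted from its start.  */
const char *
toy_symbol_name (const unsigned char field[TOY_SYMNMLEN],
                 const char *strtab, char buf[TOY_SYMNMLEN + 1])
{
  if (field[0] == 0 && field[1] == 0 && field[2] == 0 && field[3] == 0)
    {
      uint32_t offset = ((uint32_t) field[4] << 24)
                        | ((uint32_t) field[5] << 16)
                        | ((uint32_t) field[6] << 8)
                        | field[7];
      return strtab + offset;
    }
  memcpy (buf, field, TOY_SYMNMLEN);
  buf[TOY_SYMNMLEN] = '\0';
  return buf;
}
```

A real reader would also check that the offset lies inside the string
table before using it.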
The symbol table is massaged once again, this time to create the
canonical table used by the BFD application. Each symbol is inspected
in turn, and a decision made (using the `sclass' field) about the
various flags to set in the `asymbol'. *Note Symbols::. The generated
canonical table shares strings with the hidden internal symbol table.
Any linenumbers are read from the coff file too, and attached to the
symbols which own the functions the linenumbers belong to.
Symbol writing
..............
Writing a symbol to a coff file which didn't come from a coff file
will lose any debugging information. The `asymbol' structure remembers
the BFD from which the symbol was taken, and on output the back end
makes sure that the same destination target as source target is present.
When the symbols have come from a coff file then all the debugging
information is preserved.
Symbol tables are provided for writing to the back end in a vector
of pointers to pointers. This allows applications like the linker to
accumulate and output large symbol tables without having to do too much
byte copying.
The `coff_renumber_symbols' function runs through the provided symbol table and patches
each symbol marked as a file place holder (`C_FILE') to point to the
next file place holder in the list. It also marks each `offset' field
in the list with the offset from the first symbol of the current symbol.
Another function of this procedure is to turn the canonical value
form of BFD into the form used by coff. Internally, BFD expects symbol
values to be offsets from a section base; so a symbol physically at
0x120, but in a section starting at 0x100, would have the value 0x20.
Coff expects symbols to contain their final value, so symbols have
their values changed at this point to reflect their sum with their
owning section. This transformation uses the `output_section' field of
the `asymbol''s `asection' *Note Sections::.
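The value conversion amounts to one addition, sketched below with
hypothetical, pared-down structures (`toy_section' and `toy_symbol'
stand in for `asection' and `asymbol' and carry only the fields the
example needs).

```c
#include <stdint.h>

/* Hypothetical stand-in for an output section: only its base address.  */
struct toy_section
{
  uint64_t vma;
};

/* Hypothetical stand-in for a canonical symbol: its value is an
   offset from the base of its section.  */
struct toy_symbol
{
  const struct toy_section *section;
  uint64_t value;
};

/* Return the final value that coff expects to find in the file.  */
uint64_t
toy_coff_symbol_value (const struct toy_symbol *sym)
{
  return sym->section->vma + sym->value;
}
```

For the example in the text, a section based at 0x100 and a symbol
value of 0x20 give 0x120.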
* `coff_mangle_symbols'
This routine runs through the provided symbol table and uses the
offsets generated by the previous pass and the pointers generated when
the symbol table was read in to create the structured hierarchy required
by coff. It changes each pointer to a symbol into the index into the
symbol table of the asymbol.
* `coff_write_symbols'
This routine runs through the symbol table and patches up the
symbols from their internal form into the coff way, calls the bit
twiddlers, and writes out the table to the file.
`coff_symbol_type'
..................
*Description*
The hidden information for an `asymbol' is described in a
`combined_entry_type':
typedef struct coff_ptr_struct
{
  /* Remembers the offset from the first symbol in the file for
     this symbol. Generated by coff_renumber_symbols. */
  unsigned int offset;
  /* Should the value of this symbol be renumbered. Used for
     XCOFF C_BSTAT symbols. Set by coff_slurp_symbol_table. */
  unsigned int fix_value : 1;
  /* Should the tag field of this symbol be renumbered.
     Created by coff_pointerize_aux. */
  unsigned int fix_tag : 1;
  /* Should the endidx field of this symbol be renumbered.
     Created by coff_pointerize_aux. */
  unsigned int fix_end : 1;
  /* Should the x_csect.x_scnlen field be renumbered.
     Created by coff_pointerize_aux. */
  unsigned int fix_scnlen : 1;
  /* Fix up an XCOFF C_BINCL/C_EINCL symbol. The value is the
     index into the line number entries. Set by coff_slurp_symbol_table. */
  unsigned int fix_line : 1;
  /* The container for the symbol structure as read and translated
     from the file. */
  union
  {
    union internal_auxent auxent;
    struct internal_syment syment;
  } u;
} combined_entry_type;
/* Each canonical asymbol really looks like this: */
typedef struct coff_symbol_struct
{
  /* The actual symbol which the rest of BFD works with */
  asymbol symbol;
  /* A pointer to the hidden information for this symbol */
  combined_entry_type *native;
  /* A pointer to the linenumber information for this symbol */
  struct lineno_cache_entry *lineno;
  /* Have the line numbers been relocated yet ? */
  boolean done_lineno;
} coff_symbol_type;
`bfd_coff_backend_data'
.......................
/* COFF symbol classifications. */
enum coff_symbol_classification
{
  /* Global symbol. */
  COFF_SYMBOL_GLOBAL,
  /* Common symbol. */
  COFF_SYMBOL_COMMON,
  /* Undefined symbol. */
  COFF_SYMBOL_UNDEFINED,
  /* Local symbol. */
  COFF_SYMBOL_LOCAL,
  /* PE section symbol. */
  COFF_SYMBOL_PE_SECTION
};
Special entry points for gdb to swap in coff symbol table parts:
typedef struct
{
void (*_bfd_coff_swap_aux_in)
PARAMS ((bfd *, PTR, int, int, int, int, PTR));
void (*_bfd_coff_swap_sym_in)
PARAMS ((bfd *, PTR, PTR));
void (*_bfd_coff_swap_lineno_in)
PARAMS ((bfd *, PTR, PTR));
unsigned int (*_bfd_coff_swap_aux_out)
PARAMS ((bfd *, PTR, int, int, int, int, PTR));
unsigned int (*_bfd_coff_swap_sym_out)
PARAMS ((bfd *, PTR, PTR));
unsigned int (*_bfd_coff_swap_lineno_out)
PARAMS ((bfd *, PTR, PTR));
unsigned int (*_bfd_coff_swap_reloc_out)
PARAMS ((bfd *, PTR, PTR));
unsigned int (*_bfd_coff_swap_filehdr_out)
PARAMS ((bfd *, PTR, PTR));
unsigned int (*_bfd_coff_swap_aouthdr_out)
PARAMS ((bfd *, PTR, PTR));
unsigned int (*_bfd_coff_swap_scnhdr_out)
PARAMS ((bfd *, PTR, PTR));
unsigned int _bfd_filhsz;
unsigned int _bfd_aoutsz;
unsigned int _bfd_scnhsz;
unsigned int _bfd_symesz;
unsigned int _bfd_auxesz;
unsigned int _bfd_relsz;
unsigned int _bfd_linesz;
unsigned int _bfd_filnmlen;
boolean _bfd_coff_long_filenames;
boolean _bfd_coff_long_section_names;
unsigned int _bfd_coff_default_section_alignment_power;
boolean _bfd_coff_force_symnames_in_strings;
unsigned int _bfd_coff_debug_string_prefix_length;
void (*_bfd_coff_swap_filehdr_in)
PARAMS ((bfd *, PTR, PTR));
void (*_bfd_coff_swap_aouthdr_in)
PARAMS ((bfd *, PTR, PTR));
void (*_bfd_coff_swap_scnhdr_in)
PARAMS ((bfd *, PTR, PTR));
void (*_bfd_coff_swap_reloc_in)
PARAMS ((bfd *abfd, PTR, PTR));
boolean (*_bfd_coff_bad_format_hook)
PARAMS ((bfd *, PTR));
boolean (*_bfd_coff_set_arch_mach_hook)
PARAMS ((bfd *, PTR));
PTR (*_bfd_coff_mkobject_hook)
PARAMS ((bfd *, PTR, PTR));
boolean (*_bfd_styp_to_sec_flags_hook)
PARAMS ((bfd *, PTR, const char *, asection *, flagword *));
void (*_bfd_set_alignment_hook)
PARAMS ((bfd *, asection *, PTR));
boolean (*_bfd_coff_slurp_symbol_table)
PARAMS ((bfd *));
boolean (*_bfd_coff_symname_in_debug)
PARAMS ((bfd *, struct internal_syment *));
boolean (*_bfd_coff_pointerize_aux_hook)
PARAMS ((bfd *, combined_entry_type *, combined_entry_type *,
unsigned int, combined_entry_type *));
boolean (*_bfd_coff_print_aux)
PARAMS ((bfd *, FILE *, combined_entry_type *, combined_entry_type *,
combined_entry_type *, unsigned int));
void (*_bfd_coff_reloc16_extra_cases)
PARAMS ((bfd *, struct bfd_link_info *, struct bfd_link_order *, arelent *,
bfd_byte *, unsigned int *, unsigned int *));
int (*_bfd_coff_reloc16_estimate)
PARAMS ((bfd *, asection *, arelent *, unsigned int,
struct bfd_link_info *));
enum coff_symbol_classification (*_bfd_coff_classify_symbol)
PARAMS ((bfd *, struct internal_syment *));
boolean (*_bfd_coff_compute_section_file_positions)
PARAMS ((bfd *));
boolean (*_bfd_coff_start_final_link)
PARAMS ((bfd *, struct bfd_link_info *));
boolean (*_bfd_coff_relocate_section)
PARAMS ((bfd *, struct bfd_link_info *, bfd *, asection *, bfd_byte *,
struct internal_reloc *, struct internal_syment *, asection **));
reloc_howto_type *(*_bfd_coff_rtype_to_howto)
PARAMS ((bfd *, asection *, struct internal_reloc *,
struct coff_link_hash_entry *, struct internal_syment *,
bfd_vma *));
boolean (*_bfd_coff_adjust_symndx)
PARAMS ((bfd *, struct bfd_link_info *, bfd *, asection *,
struct internal_reloc *, boolean *));
boolean (*_bfd_coff_link_add_one_symbol)
PARAMS ((struct bfd_link_info *, bfd *, const char *, flagword,
asection *, bfd_vma, const char *, boolean, boolean,
struct bfd_link_hash_entry **));
boolean (*_bfd_coff_link_output_has_begun)
PARAMS ((bfd *, struct coff_final_link_info *));
boolean (*_bfd_coff_final_link_postscript)
PARAMS ((bfd *, struct coff_final_link_info *));
} bfd_coff_backend_data;
#define coff_backend_info(abfd) \
((bfd_coff_backend_data *) (abfd)->xvec->backend_data)
#define bfd_coff_swap_aux_in(a,e,t,c,ind,num,i) \
((coff_backend_info (a)->_bfd_coff_swap_aux_in) (a,e,t,c,ind,num,i))
#define bfd_coff_swap_sym_in(a,e,i) \
((coff_backend_info (a)->_bfd_coff_swap_sym_in) (a,e,i))
#define bfd_coff_swap_lineno_in(a,e,i) \
((coff_backend_info (a)->_bfd_coff_swap_lineno_in) (a,e,i))
#define bfd_coff_swap_reloc_out(abfd, i, o) \
((coff_backend_info (abfd)->_bfd_coff_swap_reloc_out) (abfd, i, o))
#define bfd_coff_swap_lineno_out(abfd, i, o) \
((coff_backend_info (abfd)->_bfd_coff_swap_lineno_out) (abfd, i, o))
#define bfd_coff_swap_aux_out(a,i,t,c,ind,num,o) \
((coff_backend_info (a)->_bfd_coff_swap_aux_out) (a,i,t,c,ind,num,o))
#define bfd_coff_swap_sym_out(abfd, i,o) \
((coff_backend_info (abfd)->_bfd_coff_swap_sym_out) (abfd, i, o))
#define bfd_coff_swap_scnhdr_out(abfd, i,o) \
((coff_backend_info (abfd)->_bfd_coff_swap_scnhdr_out) (abfd, i, o))
#define bfd_coff_swap_filehdr_out(abfd, i,o) \
((coff_backend_info (abfd)->_bfd_coff_swap_filehdr_out) (abfd, i, o))
#define bfd_coff_swap_aouthdr_out(abfd, i,o) \
((coff_backend_info (abfd)->_bfd_coff_swap_aouthdr_out) (abfd, i, o))
#define bfd_coff_filhsz(abfd) (coff_backend_info (abfd)->_bfd_filhsz)
#define bfd_coff_aoutsz(abfd) (coff_backend_info (abfd)->_bfd_aoutsz)
#define bfd_coff_scnhsz(abfd) (coff_backend_info (abfd)->_bfd_scnhsz)
#define bfd_coff_symesz(abfd) (coff_backend_info (abfd)->_bfd_symesz)
#define bfd_coff_auxesz(abfd) (coff_backend_info (abfd)->_bfd_auxesz)
#define bfd_coff_relsz(abfd) (coff_backend_info (abfd)->_bfd_relsz)
#define bfd_coff_linesz(abfd) (coff_backend_info (abfd)->_bfd_linesz)
#define bfd_coff_filnmlen(abfd) (coff_backend_info (abfd)->_bfd_filnmlen)
#define bfd_coff_long_filenames(abfd) \
(coff_backend_info (abfd)->_bfd_coff_long_filenames)
#define bfd_coff_long_section_names(abfd) \
(coff_backend_info (abfd)->_bfd_coff_long_section_names)
#define bfd_coff_default_section_alignment_power(abfd) \
(coff_backend_info (abfd)->_bfd_coff_default_section_alignment_power)
#define bfd_coff_swap_filehdr_in(abfd, i,o) \
((coff_backend_info (abfd)->_bfd_coff_swap_filehdr_in) (abfd, i, o))
#define bfd_coff_swap_aouthdr_in(abfd, i,o) \
((coff_backend_info (abfd)->_bfd_coff_swap_aouthdr_in) (abfd, i, o))
#define bfd_coff_swap_scnhdr_in(abfd, i,o) \
((coff_backend_info (abfd)->_bfd_coff_swap_scnhdr_in) (abfd, i, o))
#define bfd_coff_swap_reloc_in(abfd, i, o) \
((coff_backend_info (abfd)->_bfd_coff_swap_reloc_in) (abfd, i, o))
#define bfd_coff_bad_format_hook(abfd, filehdr) \
((coff_backend_info (abfd)->_bfd_coff_bad_format_hook) (abfd, filehdr))
#define bfd_coff_set_arch_mach_hook(abfd, filehdr)\
((coff_backend_info (abfd)->_bfd_coff_set_arch_mach_hook) (abfd, filehdr))
#define bfd_coff_mkobject_hook(abfd, filehdr, aouthdr)\
((coff_backend_info (abfd)->_bfd_coff_mkobject_hook) (abfd, filehdr, aouthdr))
#define bfd_coff_styp_to_sec_flags_hook(abfd, scnhdr, name, section, flags_ptr)\
((coff_backend_info (abfd)->_bfd_styp_to_sec_flags_hook)\
(abfd, scnhdr, name, section, flags_ptr))
#define bfd_coff_set_alignment_hook(abfd, sec, scnhdr)\
((coff_backend_info (abfd)->_bfd_set_alignment_hook) (abfd, sec, scnhdr))
#define bfd_coff_slurp_symbol_table(abfd)\
((coff_backend_info (abfd)->_bfd_coff_slurp_symbol_table) (abfd))
#define bfd_coff_symname_in_debug(abfd, sym)\
((coff_backend_info (abfd)->_bfd_coff_symname_in_debug) (abfd, sym))
#define bfd_coff_force_symnames_in_strings(abfd)\
(coff_backend_info (abfd)->_bfd_coff_force_symnames_in_strings)
#define bfd_coff_debug_string_prefix_length(abfd)\
(coff_backend_info (abfd)->_bfd_coff_debug_string_prefix_length)
#define bfd_coff_print_aux(abfd, file, base, symbol, aux, indaux)\
((coff_backend_info (abfd)->_bfd_coff_print_aux)\
(abfd, file, base, symbol, aux, indaux))
#define bfd_coff_reloc16_extra_cases(abfd, link_info, link_order, reloc, data, src_ptr, dst_ptr)\
((coff_backend_info (abfd)->_bfd_coff_reloc16_extra_cases)\
(abfd, link_info, link_order, reloc, data, src_ptr, dst_ptr))
#define bfd_coff_reloc16_estimate(abfd, section, reloc, shrink, link_info)\
((coff_backend_info (abfd)->_bfd_coff_reloc16_estimate)\
(abfd, section, reloc, shrink, link_info))
#define bfd_coff_classify_symbol(abfd, sym)\
((coff_backend_info (abfd)->_bfd_coff_classify_symbol)\
(abfd, sym))
#define bfd_coff_compute_section_file_positions(abfd)\
((coff_backend_info (abfd)->_bfd_coff_compute_section_file_positions)\
(abfd))
#define bfd_coff_start_final_link(obfd, info)\
((coff_backend_info (obfd)->_bfd_coff_start_final_link)\
(obfd, info))
#define bfd_coff_relocate_section(obfd,info,ibfd,o,con,rel,isyms,secs)\
((coff_backend_info (ibfd)->_bfd_coff_relocate_section)\
(obfd, info, ibfd, o, con, rel, isyms, secs))
#define bfd_coff_rtype_to_howto(abfd, sec, rel, h, sym, addendp)\
((coff_backend_info (abfd)->_bfd_coff_rtype_to_howto)\
(abfd, sec, rel, h, sym, addendp))
#define bfd_coff_adjust_symndx(obfd, info, ibfd, sec, rel, adjustedp)\
((coff_backend_info (ibfd)->_bfd_coff_adjust_symndx)\
(obfd, info, ibfd, sec, rel, adjustedp))
#define bfd_coff_link_add_one_symbol(info,abfd,name,flags,section,value,string,cp,coll,hashp)\
((coff_backend_info (abfd)->_bfd_coff_link_add_one_symbol)\
(info, abfd, name, flags, section, value, string, cp, coll, hashp))
#define bfd_coff_link_output_has_begun(a,p) \
((coff_backend_info (a)->_bfd_coff_link_output_has_begun) (a,p))
#define bfd_coff_final_link_postscript(a,p) \
((coff_backend_info (a)->_bfd_coff_final_link_postscript) (a,p))
Writing relocations
...................
To write relocations, the back end steps through the canonical
relocation table and creates an `internal_reloc' for each entry. The
symbol index to use is taken from the `offset' field in the symbol table supplied.
The address comes directly from the sum of the section base address and
the relocation offset; the type is dug directly from the howto field.
Then the `internal_reloc' is swapped into the shape of an
`external_reloc' and written out to disk.
Reading linenumbers
...................
Creating the linenumber table is done by reading in the entire coff
linenumber table, and creating another table for internal use.
A coff linenumber table is structured so that each function is
marked as having a line number of 0. Each line within the function is
an offset from the first line in the function. The base of the line
number information for the table is stored in the symbol associated
with the function.
Note: The PE format uses line number 0 for a flag indicating a new
source file.
The information is copied from the external to the internal table,
and each symbol which marks a function is marked by pointing its...
How does this work ?
Reading relocations
...................
Coff relocations are easily transformed into the internal BFD form
(`arelent').
Reading a coff relocation table is done in the following stages:
* Read the entire coff relocation table into memory.
* Process each relocation in turn; first swap it from the external
to the internal form.
* Turn the symbol referenced in the relocation's symbol index into a
pointer into the canonical symbol table. This table is the same
as the one returned by a call to `bfd_canonicalize_symtab'. The
back end will call that routine and save the result if a
canonicalization hasn't been done.
* The reloc index is turned into a pointer to a howto structure, in
a back end specific way. For instance, the 386 and 960 use the
`r_type' to directly produce an index into a howto table vector;
the 88k subtracts a number from the `r_type' field and creates an
addend field.
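The last stage can be sketched as a table lookup. The table, its entry
names and the bias value 0x20 below are hypothetical; they only
illustrate the two styles mentioned (a direct index, and an index
formed by subtracting a constant from `r_type'). The addend created by
the 88k back end is not modelled.

```c
#include <stddef.h>

/* Hypothetical, much reduced howto entry.  */
struct toy_howto
{
  unsigned int type;
  unsigned int bitsize;
  const char *name;
};

const struct toy_howto toy_howto_table[] =
{
  { 0, 0,  "TOY_NONE"  },
  { 1, 16, "TOY_ABS16" },
  { 2, 32, "TOY_ABS32" },
};

#define TOY_HOWTO_COUNT (sizeof toy_howto_table / sizeof toy_howto_table[0])
#define TOY_TYPE_BIAS 0x20

/* Direct style: r_type is itself the index into the table.
   Returns NULL for a type outside the table.  */
const struct toy_howto *
toy_howto_direct (unsigned int r_type)
{
  return r_type < TOY_HOWTO_COUNT ? &toy_howto_table[r_type] : NULL;
}

/* Biased style: a constant is subtracted from r_type first.  */
const struct toy_howto *
toy_howto_biased (unsigned int r_type)
{
  if (r_type < TOY_TYPE_BIAS)
    return NULL;
  return toy_howto_direct (r_type - TOY_TYPE_BIAS);
}
```

Returning a null pointer for an unknown type lets the caller report a
bad relocation instead of reading past the table.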

File: bfd.info, Node: elf, Next: mmo, Prev: coff, Up: BFD back ends
ELF backends
============
BFD support for ELF formats is being worked on. Currently, the best
supported back ends are for sparc and i386 (running svr4 or Solaris 2).
Documentation of the internals of the support code still needs to be
written. The code is changing quickly enough that we haven't bothered
yet.
`bfd_elf_find_section'
......................
*Synopsis*
struct elf_internal_shdr *bfd_elf_find_section (bfd *abfd, char *name);
*Description*
Helper functions for GDB to locate the string tables. Since BFD hides
string tables from callers, GDB needs to use an internal hook to find
them. Sun's .stabstr, in particular, isn't even pointed to by the
.stab section, so ordinary mechanisms wouldn't work to find it, even if
we had some.

File: bfd.info, Node: mmo, Prev: elf, Up: BFD back ends
mmo backend
===========
The mmo object format is used exclusively together with Professor
Donald E. Knuth's educational 64-bit processor MMIX. The simulator
`mmix' which is available at
<http://www-cs-faculty.stanford.edu/~knuth/programs/mmix.tar.gz>
understands this format. That package also includes a combined
assembler and linker called `mmixal'. The mmo format has no advantages
feature-wise compared to e.g. ELF. It is a simple non-relocatable
object format with no support for archives or debugging information,
except for symbol value information and line numbers (which is not yet
implemented in BFD). See
<http://www-cs-faculty.stanford.edu/~knuth/mmix.html> for more
information about MMIX. The ELF format is used for intermediate object
files in the BFD implementation.
* Menu:
* File layout::
* Symbol-table::
* mmo section mapping::

File: bfd.info, Node: File layout, Next: Symbol-table, Prev: mmo, Up: mmo
File layout
-----------
The contents of an mmo file are not partitioned into named sections as with
e.g. ELF. Memory areas are formed by specifying the location of the
data that follows. Only the memory area `0x0000...00' to `0x01ff...ff'
is executable, so it is used for code (and constants) and the area
`0x2000...00' to `0x20ff...ff' is used for writable data. *Note mmo
section mapping::.
Contents are entered as 32-bit words, xor:ed over previous contents,
which are always zero-initialized. A word that starts with the byte `0x98' forms
a command called a `lopcode', where the next byte distinguishes between
the thirteen lopcodes. The two remaining bytes, called the `Y' and `Z'
fields, or the `YZ' field (a 16-bit big-endian number), are used for
various purposes different for each lopcode.
There is provision for specifying "special data" of 65536 different
types. We use type 80 (decimal), arbitrarily chosen the same as the
ELF `e_machine' number for MMIX, filling it with section information
normally found in ELF objects. *Note mmo section mapping::.
As documented in
<http://www-cs-faculty.stanford.edu/~knuth/mmixal-intro.ps.gz>, the
lopcodes are:
`lop_quote'
0x98000001. The next word is contents, regardless of whether it
starts with 0x98 or not.
`lop_loc'
0x9801YYZZ, where `Z' is 1 or 2. This is a location directive,
setting the location for the next data to the next 32-bit word
(for Z = 1) or 64-bit word (for Z = 2), plus Y * 2^56. Normally
`Y' is 0 for the text segment and 0x20 for the data segment.
`lop_skip'
0x9802YYZZ. Increase the current location by `YZ' bytes.
`lop_fixo'
0x9803YYZZ, where `Z' is 1 or 2. Store the current location as 64
bits into the location pointed to by the next 32-bit (Z = 1) or
64-bit (Z = 2) word, plus Y * 2^56.
`lop_fixr'
0x9804YYZZ. `YZ' is stored into the current location plus 2 - 4 *
YZ.
`lop_fixrx'
0x980500ZZ. `Z' is 16 or 24. A value `L' derived from the
following 32-bit word is used in a manner similar to `YZ' in
lop_fixr: it is xor:ed into the current location minus 4 * L. The
first byte of the word is 0 or 1. If it is 1, then L = (LOWEST 24
BITS OF WORD) - 2^Z, if 0, then L = (LOWEST 24 BITS OF WORD).
`lop_file'
0x9806YYZZ. `Y' is the file number, `Z' is count of 32-bit words.
Set the file number to `Y' and the line counter to 0. The next Z
* 4 bytes contain the file name, padded with zeros if the count is
not a multiple of four. The same `Y' may occur multiple times,
but `Z' must be 0 for all but the first occurrence.
`lop_line'
0x9807YYZZ. `YZ' is the line number. Together with lop_file, it
forms the source location for the next 32-bit word. Note that for
each non-lopcode 32-bit word, line numbers are assumed incremented
by one.
`lop_spec'
0x9808YYZZ. `YZ' is the type number. Data until the next lopcode
other than lop_quote forms special data of type `YZ'. *Note mmo
section mapping::.
Types other than 80 (or type 80 with contents that do not
parse) are stored in sections named `.MMIX.spec_data.N' where N is
the `YZ'-type. The flags for such sections say not to allocate
or load the data. The vma is 0. The contents of multiple occurrences
of special data N are concatenated to the data of the previous
lop_spec Ns. The location in data or code at which the lop_spec
occurred is lost.
`lop_pre'
0x980901ZZ. The first lopcode in a file. The `Z' field forms the
length of header information in 32-bit words, where the first word
tells the time in seconds since `00:00:00 GMT Jan 1 1970'.
`lop_post'
0x980a00ZZ. Z > 32. This lopcode follows after all
content-generating lopcodes in a program. The `Z' field denotes
the value of `rG' at the beginning of the program. The following
256 - Z big-endian 64-bit words are loaded into global registers
`$G' ... `$255'.
`lop_stab'
0x980b0000. The next-to-last lopcode in a program. Must follow
immediately after the lop_post lopcode and its data. After this
lopcode follows all symbols in a compressed format (*note
Symbol-table::).
`lop_end'
0x980cYYZZ. The last lopcode in a program. It must follow the
lop_stab lopcode and its data. The `YZ' field contains the number
of 32-bit words of symbol table information after the preceding
lop_stab lopcode.
Note that the lopcode "fixups" `lop_fixr', `lop_fixrx' and
`lop_fixo' are not generated by BFD, but are handled. They are
generated by `mmixal'.
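The location arithmetic of `lop_loc' can be sketched in C. The function
name is hypothetical, and the operand is assumed to have been assembled
already from the one or two 32-bit words that follow the lopcode.

```c
#include <stdint.h>

/* Return the location selected by a lop_loc whose Y field is Y and
   whose operand (the following 32-bit or 64-bit word) is OPERAND.
   The text gives the result as the operand plus Y * 2^56.  */
uint64_t
toy_lop_loc_address (unsigned int y, uint64_t operand)
{
  return operand + ((uint64_t) y << 56);
}
```

Since the writable data area starts at `0x2000...00', a `Y' of 0x20 with
a zero operand selects the start of that area, and a `Y' of 0 leaves the
operand in the code area unchanged.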
This trivial one-label, one-instruction file:
:Main TRAP 1,2,3
can be represented this way in mmo:
0x98090101 - lop_pre, one 32-bit word with timestamp.
<timestamp>
0x98010002 - lop_loc, text segment, using a 64-bit address.
Note that mmixal does not emit this for the file above.
0x00000000 - Address, high 32 bits.
0x00000000 - Address, low 32 bits.
0x98060002 - lop_file, 2 32-bit words for file-name.
0x74657374 - "test"
0x2e730000 - ".s\0\0"
0x98070001 - lop_line, line 1.
0x00010203 - TRAP 1,2,3
0x980a00ff - lop_post, setting $255 to 0.
0x00000000
0x00000000
0x980b0000 - lop_stab for ":Main" = 0, serial 1.
0x203a4040 *Note Symbol-table::.
0x10404020
0x4d206120
0x69016e00
0x81000000
0x980c0005 - lop_end; symbol table contained five 32-bit words.
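Splitting a word of such a listing into its fields can be sketched as
below. The structure and function names are hypothetical. Note that a
caller must still remember when the previous word was a `lop_quote',
since the word after it is contents even if it starts with 0x98.

```c
#include <stdint.h>

#define TOY_LOP_ESCAPE 0x98

/* Hypothetical decoded form of one lopcode word.  */
struct toy_lopcode
{
  unsigned int op;   /* second byte: which lopcode */
  unsigned int y;    /* third byte */
  unsigned int z;    /* fourth byte */
  unsigned int yz;   /* third and fourth bytes as a 16-bit number */
};

/* If WORD starts with the byte 0x98, fill *OUT and return 1.
   Otherwise WORD is plain contents; return 0 and leave *OUT alone.  */
int
toy_decode_lopcode (uint32_t word, struct toy_lopcode *out)
{
  if ((word >> 24) != TOY_LOP_ESCAPE)
    return 0;
  out->op = (word >> 16) & 0xff;
  out->y = (word >> 8) & 0xff;
  out->z = word & 0xff;
  out->yz = word & 0xffff;
  return 1;
}
```

Applied to the listing above, 0x98090101 decodes to lopcode 9
(`lop_pre') with `Z' equal to 1, and 0x00010203 is reported as contents.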

File: bfd.info, Node: Symbol-table, Next: mmo section mapping, Prev: File layout, Up: mmo
Symbol table format
-------------------
From mmixal.w (or really, the generated mmixal.tex) in
<http://www-cs-faculty.stanford.edu/~knuth/programs/mmix.tar.gz>:
"Symbols are stored and retrieved by means of a `ternary search trie',
following ideas of Bentley and Sedgewick. (See ACM-SIAM Symp. on
Discrete Algorithms `8' (1997), 360-369; R.Sedgewick, `Algorithms in C'
(Reading, Mass. Addison-Wesley, 1998), `15.4'.) Each trie node stores
a character, and there are branches to subtries for the cases where a
given character is less than, equal to, or greater than the character
in the trie. There also is a pointer to a symbol table entry if a
symbol ends at the current node."
So it's a tree encoded as a stream of bytes. The stream of bytes
acts on a single virtual global symbol, adding and removing characters
and signalling complete symbol points. Here, we read the stream and
create symbols at the completion points.
First, there's a control byte `m'. If any of the listed bits in `m'
is nonzero, we execute what stands at the right, in the listed order:
     (MMO3_LEFT)
     0x40 - Traverse left trie.
            (Read a new command byte and recurse.)
     (MMO3_SYMBITS)
     0x2f - Read the next byte as a character and store it in the
            current character position; increment character position.
            Test the bits of `m':
            (MMO3_WCHAR)
            0x80 - The character is 16-bit (so read another byte,
                   merge into current character).
            (MMO3_TYPEBITS)
            0xf  - We have a complete symbol; parse the type, value
                   and serial number and do what should be done
                   with a symbol.  The type and length information
                   is in j = (m & 0xf).
                   (MMO3_REGQUAL_BITS)
                   j == 0xf: A register variable.  The following
                             byte tells which register.
                   j <= 8:   An absolute symbol.  Read j bytes as the
                             big-endian number the symbol equals.
                             A j = 2 with two zero bytes denotes an
                             unknown symbol.
                   j > 8:    As with j <= 8, but add (0x20 << 56)
                             to the value in the following j - 8
                             bytes.
                   Then comes the serial number, as a variant of
                   uleb128, but better named ubeb128:
                   Read bytes and shift the previous value left 7
                   (multiply by 128).  Add in the new byte, repeat
                   until a byte has bit 7 set.  The serial number
                   is the computed value minus 128.
            (MMO3_MIDDLE)
            0x20 - Traverse middle trie.  (Read a new command byte
                   and recurse.)  Decrement character position.
     (MMO3_RIGHT)
     0x10 - Traverse right trie.  (Read a new command byte and
            recurse.)
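The value and serial number rules can be sketched in Python as follows.
This is only an illustration, not the code used in BFD: the names
`read_value' and `read_serial' and the kind labels are invented for
this sketch, and `data' is assumed to be a `bytes' object with `pos'
pointing just past the last character of the symbol.

```python
def read_value(data, pos, j):
    """Decode a completed symbol's value, given j = (m & 0xf).

    Returns (kind, value, new_pos).  The kind strings are labels made
    up for this sketch; they are not names used by BFD.
    """
    if j == 0xF:
        # Register variable: the following byte tells which register.
        return "register", data[pos], pos + 1
    nbytes = j if j <= 8 else j - 8
    value = int.from_bytes(data[pos:pos + nbytes], "big")
    if j > 8:
        value += 0x20 << 56
    # A j = 2 with two zero bytes denotes an unknown symbol.
    kind = "unknown" if (j == 2 and value == 0) else "absolute"
    return kind, value, pos + nbytes


def read_serial(data, pos):
    """Decode a "ubeb128" serial number; returns (serial, new_pos)."""
    serial = 0
    while True:
        byte = data[pos]
        pos += 1
        serial = (serial << 7) + byte   # shift left 7, add the new byte
        if byte & 0x80:                 # bit 7 set marks the last byte
            break
    return serial - 128, pos
```

Note that the terminating byte is added in with bit 7 still set, which
is why 128 is subtracted at the end: the single byte 0x81 gives
129 - 128 = 1.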
Let's look again at the `lop_stab' for the trivial file (*note File
layout::).
0x980b0000 - lop_stab for ":Main" = 0, serial 1.
0x203a4040
0x10404020
0x4d206120
0x69016e00
0x81000000
This forms the trivial trie (note that the path between ":" and "M"
is redundant):
     203a     ":"
     40       /
     40      /
     10      \
     40      /
     40     /
     204d  "M"
     2061  "a"
     2069  "i"
     016e  "n" is the last character in a full symbol, and
           with a value represented in one byte.
     00    The value is 0.
     81    The serial number is 1.
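As an illustration, the following Python sketch walks this byte stream
using the command-byte rules listed earlier.  It is not BFD's
implementation: `parse_symbols' and the layout of the tuples it returns
are hypothetical, and the three zero bytes after the serial number are
assumed to be padding up to a 32-bit boundary.

```python
MMO3_LEFT = 0x40
MMO3_SYMBITS = 0x2F
MMO3_WCHAR = 0x80
MMO3_TYPEBITS = 0x0F
MMO3_MIDDLE = 0x20
MMO3_RIGHT = 0x10


def parse_symbols(data):
    """Return a list of (name, value, serial, is_register) tuples."""
    symbols = []
    name = []   # characters of the single virtual symbol being edited
    pos = 0

    def next_byte():
        nonlocal pos
        byte = data[pos]
        pos += 1
        return byte

    def complete_symbol(j):
        # Value: a register number, or a big-endian absolute value.
        if j == 0xF:
            value, is_register = next_byte(), True
        else:
            value, is_register = 0, False
            for _ in range(j if j <= 8 else j - 8):
                value = (value << 8) | next_byte()
            if j > 8:
                value += 0x20 << 56
        # Serial number ("ubeb128"): stop at a byte with bit 7 set.
        serial = 0
        while True:
            byte = next_byte()
            serial = (serial << 7) + byte
            if byte & 0x80:
                break
        return "".join(name), value, serial - 128, is_register

    def walk():
        m = next_byte()
        if m & MMO3_LEFT:
            walk()
        if m & MMO3_SYMBITS:
            c = next_byte()
            if m & MMO3_WCHAR:
                c = (c << 8) | next_byte()   # 16-bit character
            name.append(chr(c))
            j = m & MMO3_TYPEBITS
            if j:
                symbols.append(complete_symbol(j))
            if m & MMO3_MIDDLE:
                walk()
            name.pop()                       # decrement character position
        if m & MMO3_RIGHT:
            walk()

    walk()
    return symbols


# The five 32-bit words that follow the lop_stab in the trivial file.
stream = bytes.fromhex("203a4040 10404020 4d206120 69016e00 81000000")
print(parse_symbols(stream))   # [(':Main', 0, 1, False)]
```

The sketch consumes 17 of the 20 bytes; the redundant `40 40 10 40 40'
path only causes recursion and creates no symbols.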

File: bfd.info, Node: mmo section mapping, Prev: Symbol-table, Up: mmo
mmo section mapping
-------------------
The implementation in BFD uses special data type 80 (decimal) to
encapsulate and describe named sections, containing e.g. debug
information. If needed, any datum in the encapsulation will be quoted
using lop_quote. First comes a 32-bit word holding the number of
32-bit words containing the zero-terminated zero-padded section name.
After the name there's a 32-bit word holding flags describing the
section type. Then comes a 64-bit big-endian word with the section
length (in bytes), then another with the section start address.
Depending on the type of section, the contents might follow,
zero-padded to a 32-bit boundary. For a loadable section (such as data
or code), the contents might follow at some later point, not
necessarily immediately, as a lop_loc with the same start address as in
the section description, followed by the contents. This in effect
forms a descriptor that must be emitted before the actual contents.
Sections described this way must not overlap.
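A minimal Python sketch of reading such a descriptor is given here,
assuming the 32-bit words after the `lop_spec 80' word have already
been collected (and unquoted, if lop_quote was used) into a list of
integers.  The function name `parse_section_descriptor' and its return
layout are invented for this illustration; the flags word is returned
as a raw integer.  The sample words are those of the loadable-section
example shown further down.

```python
def parse_section_descriptor(words):
    """Decode the 32-bit words that follow `lop_spec 80'.

    Returns (name, flags, length, address, words_consumed).
    """
    n = words[0]                              # words holding the name
    raw = b"".join(w.to_bytes(4, "big") for w in words[1:1 + n])
    name = raw.rstrip(b"\0").decode("ascii")  # drop the zero padding
    i = 1 + n
    flags = words[i]
    length = (words[i + 1] << 32) | words[i + 2]    # 64-bit big-endian
    address = (words[i + 3] << 32) | words[i + 4]   # 64-bit big-endian
    return name, flags, length, address, i + 5


descriptor = [
    0x00000002, 0x7365636E, 0x616D6500,   # name: "secname"
    0x00000033,                           # flags
    0x00000000, 0x0000001C,               # length: 28 bytes
    0x00000000, 0x00000004,               # start address: 4
]
print(parse_section_descriptor(descriptor))
# ('secname', 51, 28, 4, 8)
```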
For areas that don't have such descriptors, synthetic sections are
formed by BFD. Consecutive contents in the two memory areas
`0x0000...00' to `0x01ff...ff' and `0x2000...00' to `0x20ff...ff' are
entered in sections named `.text' and `.data' respectively. If an area
is not otherwise described, but would together with a neighboring lower
area be less than `0x40000000' bytes long, it is joined with the lower
area and the gap is zero-filled. For other cases, a new section is
formed, named `.MMIX.sec.N'. Here, N is a number, a running count
through the mmo file, starting at 0.
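The naming rule can be sketched as below.  This is an interpretation of
the paragraph above, not BFD code: the function names are hypothetical,
the `...' in the area bounds is taken to mean full 64-bit addresses,
and the combined length of two areas is assumed to be measured from the
start of the lower area to the (exclusive) end of the upper one.

```python
TEXT_LO, TEXT_HI = 0x0000000000000000, 0x01FFFFFFFFFFFFFF
DATA_LO, DATA_HI = 0x2000000000000000, 0x20FFFFFFFFFFFFFF
MAX_JOINED_LENGTH = 0x40000000


def synthetic_section_name(address, count):
    """Name for contents at `address' that have no section descriptor.

    `count' is the running number of `.MMIX.sec.N' sections so far.
    """
    if TEXT_LO <= address <= TEXT_HI:
        return ".text"
    if DATA_LO <= address <= DATA_HI:
        return ".data"
    return ".MMIX.sec.%d" % count


def joins_lower_area(lower_start, upper_end):
    """True if the upper area would be merged into the lower one.

    `upper_end' is exclusive; the gap between the areas is zero-filled
    when they are joined.
    """
    return upper_end - lower_start < MAX_JOINED_LENGTH
```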
A loadable section specified as:
.section secname,"ax"
TETRA 1,2,3,4,-1,-2009
BYTE 80
and linked to address `0x4', is represented by the sequence:
0x98080050 - lop_spec 80
0x00000002 - two 32-bit words for the section name
0x7365636e - "secn"
0x616d6500 - "ame\0"
0x00000033 - flags CODE, READONLY, LOAD, ALLOC
0x00000000 - high 32 bits of section length
0x0000001c - section length is 28 bytes; 6 * 4 + 1 + alignment to 32 bits
0x00000000 - high 32 bits of section address
0x00000004 - section address is 4
0x98010002 - 64 bits with address of following data
0x00000000 - high 32 bits of address
0x00000004 - low 32 bits: data starts at address 4
0x00000001 - 1
0x00000002 - 2
0x00000003 - 3
0x00000004 - 4
0xffffffff - -1
0xfffff827 - -2009
0x50000000 - 80 as a byte, padded with zeros.
Note that the lop_spec wrapping does not include the section
contents. Compare this to a non-loaded section specified as:
.section thirdsec
TETRA 200001,100002
BYTE 38,40
This, when linked to address `0x200000000000001c', is represented by:
0x98080050 - lop_spec 80
0x00000002 - two 32-bit words for the section name
0x74686972 - "thir"
0x64736563 - "dsec"
0x00000010 - flag READONLY
0x00000000 - high 32 bits of section length
0x0000000c - section length is 12 bytes; 2 * 4 + 2 + alignment to 32 bits
0x20000000 - high 32 bits of address
0x0000001c - low 32 bits of address 0x200000000000001c
0x00030d41 - 200001
0x000186a2 - 100002
0x26280000 - 38, 40 as bytes, padded with zeros
For the latter example, the section contents must not be loaded in
memory, and are therefore specified as part of the special data. The
address is usually unimportant but might provide information for e.g.
the DWARF 2 debugging format.