397 lines
12 KiB
Groff
397 lines
12 KiB
Groff
.\" Copyright (c) 1991 The Regents of the University of California.
|
|
.\" All rights reserved.
|
|
.\"
|
|
.\" This man page is derived from documentation contributed to Berkeley by
|
|
.\" Donn Seeley at UUNET Technologies, Inc.
|
|
.\"
|
|
.\" Redistribution and use in source and binary forms, with or without
|
|
.\" modification, are permitted provided that the following conditions
|
|
.\" are met:
|
|
.\" 1. Redistributions of source code must retain the above copyright
|
|
.\" notice, this list of conditions and the following disclaimer.
|
|
.\" 2. Redistributions in binary form must reproduce the above copyright
|
|
.\" notice, this list of conditions and the following disclaimer in the
|
|
.\" documentation and/or other materials provided with the distribution.
|
|
.\" 3. All advertising materials mentioning features or use of this software
|
|
.\" must display the following acknowledgement:
|
|
.\" This product includes software developed by the University of
|
|
.\" California, Berkeley and its contributors.
|
|
.\" 4. Neither the name of the University nor the names of its contributors
|
|
.\" may be used to endorse or promote products derived from this software
|
|
.\" without specific prior written permission.
|
|
.\"
|
|
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
|
|
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
|
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
|
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
|
|
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
|
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
|
|
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|
|
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
|
|
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
|
|
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
|
|
.\" SUCH DAMAGE.
|
|
.\"
|
|
.\" @(#)a.out.5 6.3 (Berkeley) 4/29/91
|
|
.\"
|
|
.Dd April 29, 1991
|
|
.Dt A.OUT 5
|
|
.Os
|
|
.Sh NAME
|
|
.Nm a.out
|
|
.Nd format of executable binary files
|
|
.Sh SYNOPSIS
|
|
.Fd #include <a.out.h>
|
|
.Sh DESCRIPTION
|
|
The include file
|
|
.Aq Pa a.out.h
|
|
declares three structures and several macros.
|
|
The structures describe the format of
|
|
executable machine code files
|
|
.Pq Sq binaries
|
|
on the system.
|
|
.Pp
|
|
A binary file consists of up to 7 sections.
|
|
In order, these sections are:
|
|
.Bl -tag -width "text relocations"
|
|
.It exec header
|
|
Contains parameters used by the kernel
|
|
to load a binary file into memory and execute it,
|
|
and by the link editor
|
|
.Xr ld 1
|
|
to combine a binary file with other binary files.
|
|
This section is the only mandatory one.
|
|
.It text segment
|
|
Contains machine code and related data
|
|
that are loaded into memory when a program executes.
|
|
May be loaded read-only.
|
|
.It data segment
|
|
Contains initialized data; always loaded into writable memory.
|
|
.It text relocations
|
|
Contains records used by the link editor
|
|
to update pointers in the text segment when combining binary files.
|
|
.It data relocations
|
|
Like the text relocation section, but for data segment pointers.
|
|
.It symbol table
|
|
Contains records used by the link editor
|
|
to cross reference the addresses of named variables and functions
|
|
.Pq Sq symbols
|
|
between binary files.
|
|
.It string table
|
|
Contains the character strings corresponding to the symbol names.
|
|
.El
|
|
.Pp
|
|
Every binary file begins with an
|
|
.Fa exec
|
|
structure:
|
|
.Bd -literal -offset indent
|
|
struct exec {
|
|
unsigned short a_mid;
|
|
unsigned short a_magic;
|
|
unsigned long a_text;
|
|
unsigned long a_data;
|
|
unsigned long a_bss;
|
|
unsigned long a_syms;
|
|
unsigned long a_entry;
|
|
unsigned long a_trsize;
|
|
unsigned long a_drsize;
|
|
};
|
|
.Ed
|
|
.Pp
|
|
The fields have the following functions:
|
|
.Bl -tag -width a_trsize
|
|
.It Fa a_mid
|
|
Contains a bit pattern that
|
|
identifies binaries that were built for
|
|
certain sub-classes of an architecture
|
|
.Pq Sq machine IDs
|
|
or variants of the operating system on a given architecture.
|
|
The kernel may not support all machine IDs
|
|
on a given architecture.
|
|
The
|
|
.Fa a_mid
|
|
field is not present on some architectures;
|
|
in this case, the
|
|
.Fa a_magic
|
|
field has type
|
|
.Em unsigned long .
|
|
.It Fa a_magic
|
|
Contains a bit pattern
|
|
.Pq Sq magic number
|
|
that uniquely identifies binary files
|
|
and distinguishes different loading conventions.
|
|
The field must contain one of the following values:
|
|
.Bl -tag -width ZMAGIC
|
|
.It Dv OMAGIC
|
|
The text and data segments immediately follow the header
|
|
and are contiguous.
|
|
The kernel loads both text and data segments into writable memory.
|
|
.It Dv NMAGIC
|
|
As with
|
|
.Dv OMAGIC ,
|
|
text and data segments immediately follow the header and are contiguous.
|
|
However, the kernel loads the text into read-only memory
|
|
and loads the data into writable memory at the next
|
|
page boundary after the text.
|
|
.It Dv ZMAGIC
|
|
The kernel loads individual pages on demand from the binary.
|
|
The header, text segment and data segment are all
|
|
padded by the link editor to a multiple of the page size.
|
|
Pages that the kernel loads from the text segment are read-only,
|
|
while pages from the data segment are writable.
|
|
.El
|
|
.It Fa a_text
|
|
Contains the size of the text segment in bytes.
|
|
.It Fa a_data
|
|
Contains the size of the data segment in bytes.
|
|
.It Fa a_bss
|
|
Contains the number of bytes in the
|
|
.Sq bss segment
|
|
and is used by the kernel to set the initial break
|
|
.Pq Xr brk 2
|
|
after the data segment.
|
|
The kernel loads the program so that this amount of writable memory
|
|
appears to follow the data segment and initially reads as zeroes.
|
|
.It Fa a_syms
|
|
Contains the size in bytes of the symbol table section.
|
|
.It Fa a_entry
|
|
Contains the address in memory of the entry point
|
|
of the program after the kernel has loaded it;
|
|
the kernel starts the execution of the program
|
|
from the machine instruction at this address.
|
|
.It Fa a_trsize
|
|
Contains the size in bytes of the text relocation table.
|
|
.It Fa a_drsize
|
|
Contains the size in bytes of the data relocation table.
|
|
.El
|
|
.Pp
|
|
The
|
|
.Pa a.out.h
|
|
include file defines several macros which use an
|
|
.Fa exec
|
|
structure to test consistency or to locate section offsets in the binary file.
|
|
.Bl -tag -width N_BADMAG(exec)
|
|
.It Fn N_BADMAG exec
|
|
Nonzero if the
|
|
.Fa a_magic
|
|
field does not contain a recognized value.
|
|
.It Fn N_TXTOFF exec
|
|
The byte offset in the binary file of the beginning of the text segment.
|
|
.It Fn N_SYMOFF exec
|
|
The byte offset of the beginning of the symbol table.
|
|
.It Fn N_STROFF exec
|
|
The byte offset of the beginning of the string table.
|
|
.El
|
|
.Pp
|
|
Relocation records have a standard format which
|
|
is described by the
|
|
.Fa relocation_info
|
|
structure:
|
|
.Bd -literal -offset indent
|
|
struct relocation_info {
|
|
int r_address;
|
|
unsigned int r_symbolnum : 24,
|
|
r_pcrel : 1,
|
|
r_length : 2,
|
|
r_extern : 1,
|
|
: 4;
|
|
};
|
|
.Ed
|
|
.Pp
|
|
The
|
|
.Fa relocation_info
|
|
fields are used as follows:
|
|
.Bl -tag -width r_symbolnum
|
|
.It Fa r_address
|
|
Contains the byte offset of a pointer that needs to be link-edited.
|
|
Text relocation offsets are reckoned from the start of the text segment,
|
|
and data relocation offsets from the start of the data segment.
|
|
The link editor adds the value that is already stored at this offset
|
|
into the new value that it computes using this relocation record.
|
|
.It Fa r_symbolnum
|
|
Contains the ordinal number of a symbol structure
|
|
in the symbol table (it is
|
|
.Em not
|
|
a byte offset).
|
|
After the link editor resolves the absolute address for this symbol,
|
|
it adds that address to the pointer that is undergoing relocation.
|
|
(If the
|
|
.Fa r_extern
|
|
bit is clear, the situation is different; see below.)
|
|
.It Fa r_pcrel
|
|
If this is set,
|
|
the link editor assumes that it is updating a pointer
|
|
that is part of a machine code instruction using pc-relative addressing.
|
|
The address of the relocated pointer is implicitly added
|
|
to its value when the running program uses it.
|
|
.It Fa r_length
|
|
Contains the log base 2 of the length of the pointer in bytes;
|
|
0 for 1-byte displacements, 1 for 2-byte displacements,
|
|
2 for 4-byte displacements.
|
|
.It Fa r_extern
|
|
Set if this relocation requires an external reference;
|
|
the link editor must use a symbol address to update the pointer.
|
|
When the
|
|
.Fa r_extern
|
|
bit is clear, the relocation is
|
|
.Sq local ;
|
|
the link editor updates the pointer to reflect
|
|
changes in the load addresses of the various segments,
|
|
rather than changes in the value of a symbol.
|
|
In this case, the content of the
|
|
.Fa r_symbolnum
|
|
field is an
|
|
.Fa n_type
|
|
value (see below);
|
|
this type field tells the link editor
|
|
what segment the relocated pointer points into.
|
|
.El
|
|
.Pp
|
|
Symbols map names to addresses (or more generally, strings to values).
|
|
Since the link-editor adjusts addresses,
|
|
a symbol's name must be used to stand for its address
|
|
until an absolute value has been assigned.
|
|
Symbols consist of a fixed-length record in the symbol table
|
|
and a variable-length name in the string table.
|
|
The symbol table is an array of
|
|
.Fa nlist
|
|
structures:
|
|
.Bd -literal -offset indent
|
|
struct nlist {
|
|
union {
|
|
char *n_name;
|
|
long n_strx;
|
|
} n_un;
|
|
unsigned char n_type;
|
|
char n_other;
|
|
short n_desc;
|
|
unsigned long n_value;
|
|
};
|
|
.Ed
|
|
.Pp
|
|
The fields are used as follows:
|
|
.Bl -tag -width n_un.n_strx
|
|
.It Fa n_un.n_strx
|
|
Contains a byte offset into the string table
|
|
for the name of this symbol.
|
|
When a program accesses a symbol table with the
|
|
.Xr nlist 3
|
|
function,
|
|
this field is replaced with the
|
|
.Fa n_un.n_name
|
|
field, which is a pointer to the string in memory.
|
|
.It Fa n_type
|
|
Used by the link editor to determine
|
|
how to update the symbol's value.
|
|
The
|
|
.Fa n_type
|
|
field is broken down into three sub-fields using bitmasks.
|
|
The link editor treats symbols with the
|
|
.Dv N_EXT
|
|
type bit set as
|
|
.Sq external
|
|
symbols and permits references to them from other binary files.
|
|
The
|
|
.Dv N_TYPE
|
|
mask selects bits of interest to the link editor:
|
|
.Bl -tag -width N_TEXT
|
|
.It Dv N_UNDF
|
|
An undefined symbol.
|
|
The link editor must locate an external symbol with the same name
|
|
in another binary file to determine the absolute value of this symbol.
|
|
As a special case, if the
|
|
.Fa n_value
|
|
field is nonzero and no binary file in the link-edit defines this symbol,
|
|
the link-editor will resolve this symbol to an address
|
|
in the bss segment,
|
|
reserving an amount of bytes equal to
|
|
.Fa n_value .
|
|
If this symbol is undefined in more than one binary file
|
|
and the binary files do not agree on the size,
|
|
the link editor chooses the greatest size found across all binaries.
|
|
.It Dv N_ABS
|
|
An absolute symbol.
|
|
The link editor does not update an absolute symbol.
|
|
.It Dv N_TEXT
|
|
A text symbol.
|
|
This symbol's value is a text address and
|
|
the link editor will update it when it merges binary files.
|
|
.It Dv N_DATA
|
|
A data symbol; similar to
|
|
.Dv N_TEXT
|
|
but for data addresses.
|
|
The values for text and data symbols are not file offsets but
|
|
addresses; to recover the file offsets, it is necessary
|
|
to identify the loaded address of the beginning of the corresponding
|
|
section and subtract it, then add the offset of the section.
|
|
.It Dv N_BSS
|
|
A bss symbol; like text or data symbols but
|
|
has no corresponding offset in the binary file.
|
|
.It Dv N_FN
|
|
A filename symbol.
|
|
The link editor inserts this symbol before
|
|
the other symbols from a binary file when
|
|
merging binary files.
|
|
The name of the symbol is the filename given to the link editor,
|
|
and its value is the first text address from that binary file.
|
|
Filename symbols are not needed for link-editing or loading,
|
|
but are useful for debuggers.
|
|
.El
|
|
.Pp
|
|
The
|
|
.Dv N_STAB
|
|
mask selects bits of interest to symbolic debuggers
|
|
such as
|
|
.Xr gdb 1 ;
|
|
the values are described in
|
|
.Xr stab 5 .
|
|
.It Fa n_other
|
|
This field is currently unused.
|
|
.It Fa n_desc
|
|
Reserved for use by debuggers; passed untouched by the link editor.
|
|
Different debuggers use this field for different purposes.
|
|
.It Fa n_value
|
|
Contains the value of the symbol.
|
|
For text, data and bss symbols, this is an address;
|
|
for other symbols (such as debugger symbols),
|
|
the value may be arbitrary.
|
|
.El
|
|
.Pp
|
|
The string table consists of an
|
|
.Em unsigned long
|
|
length followed by null-terminated symbol strings.
|
|
The length represents the size of the entire table in bytes,
|
|
so its minimum value (or the offset of the first string)
|
|
is always 4 on 32-bit machines.
|
|
.Sh SEE ALSO
|
|
.Xr ld 1 ,
|
|
.Xr execve 2 ,
|
|
.Xr nlist 3 ,
|
|
.Xr core 5 ,
|
|
.Xr dbx 5 ,
|
|
.Xr stab 5
|
|
.Sh HISTORY
|
|
The
|
|
.Pa a.out.h
|
|
include file appeared in
|
|
.At v7 .
|
|
.Sh BUGS
|
|
Since not all of the supported architectures use the
|
|
.Fa a_mid
|
|
field,
|
|
it can be difficult to determine what
|
|
architecture a binary will execute on
|
|
without examining its actual machine code.
|
|
Even with a machine identifier,
|
|
the byte order of the
|
|
.Fa exec
|
|
header is machine-dependent.
|
|
.Pp
|
|
Nobody seems to agree on what
|
|
.Em bss
|
|
stands for.
|
|
.Pp
|
|
New binary file formats may be supported in the future,
|
|
and they probably will not be compatible at any level
|
|
with this ancient format.
|