a.out(5), RISC iX 1.2

A.OUT(5)  —  UNIX Programmer’s Manual

NAME

a.out − assembler and link editor output

SYNOPSIS

#include <a.out.h>

DESCRIPTION

A.out is the output file of the assembler as(1), the compiler cc(1) and the link editor ld(1). The latter program makes a.out executable if there were no errors and no unresolved external references.  Layout information as obtained from the include file for the ARM is:

/*
 * Header prepended to each a.out file.
 */
struct exec {
        long            a_magic;        /* magic number */
        unsigned long   a_text;         /* size of text segment */
        unsigned long   a_data;         /* size of initialized data */
        unsigned long   a_bss;          /* size of uninitialized data */
        unsigned long   a_syms;         /* size of symbol table */
        unsigned long   a_entry;        /* entry point */
        unsigned long   a_trsize;       /* size of text relocation */
        unsigned long   a_drsize;       /* size of data relocation */
};

/*
 * Header prepended to each a.out except for those of type
 * OMAGIC - note that this starts with struct exec.
 */
#define SHLIBLEN        60

struct exec_header {
        struct exec     a_exec;         /* The (old) small header */
        struct version  a_version;      /* Version number time and text */
        unsigned long   a_sq4items;     /* number of squeezed type 4 items */
        unsigned long   a_sq3items;     /* number of squeezed type 3 items */
        unsigned long   a_sq4size;      /* size of squeezed type 4 items */
        unsigned long   a_sq3size;      /* size of squeezed type 3 items */
        unsigned long   a_sq4last;      /* last entry in type 4 table (check only) */
        unsigned long   a_sq3last;      /* last entry in type 3 table (check only) */
        time_t          a_timestamp;    /* link time of this object */
        time_t          a_shlibtime;    /* timestamp of shared library */
        char            a_shlibname[SHLIBLEN];  /* Path name of shared library */
};

/*
 * Basic magic numbers
 */
#define OMAGIC  0407            /* old impure format */
#define NMAGIC  0410            /* read-only text */
#define ZMAGIC  0413            /* demand load format */

/*
 * Flags which may be or'ed with the magic number.
 * All combinations are valid.
 */
#define MF_IMPURE       00200   /* impure text */
#define MF_SQUEEZED     01000   /* text and data squeezed */
#define MF_USES_SL      02000   /* this object uses a shared library */
#define MF_IS_SL        04000   /* this object is a shared library */

/*
 * Names for common combinations
 */
#define IMAGIC          (MF_IMPURE|ZMAGIC)      /* demand load format (impure text) */
#define SPOMAGIC        (MF_USES_SL|OMAGIC)     /* OMAGIC with a large header - may */
                                                /* contain a reference to a shared */
                                                /* library required by the object. */
#define SLOMAGIC        (MF_IS_SL|OMAGIC)       /* a reference to a shared library. */
                                                /* The text portion of the object */
                                                /* contains "overflow text" from */
                                                /* the shared library to be linked */
                                                /* in with an object. */
#define QMAGIC          (MF_SQUEEZED|ZMAGIC)    /* squeezed demand paged */
#define SPZMAGIC        (MF_USES_SL|ZMAGIC)     /* program which uses shared lib */
#define SPQMAGIC        (MF_USES_SL|QMAGIC)     /* squeezed ditto */
#define SLZMAGIC        (MF_IS_SL|ZMAGIC)       /* shared library part of program */
#define SLPZMAGIC       (MF_USES_SL|SLZMAGIC)   /* shared lib which uses another */

Squeezed shared libraries are not supported. 
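As a worked example of how the flags combine with the basic magic numbers, the following sketch splits a magic number into its base value and its flag bits.  The numeric values are copied from the definitions above; the names MF_ALL, magic_base and describe_magic are illustrative only and are not part of <a.out.h>.

```c
#include <stdio.h>

/* Values copied from the definitions above. */
#define OMAGIC      0407
#define NMAGIC      0410
#define ZMAGIC      0413
#define MF_IMPURE   00200
#define MF_SQUEEZED 01000
#define MF_USES_SL  02000
#define MF_IS_SL    04000

/* Hypothetical helper (not in <a.out.h>): all four flag bits together. */
#define MF_ALL (MF_IMPURE | MF_SQUEEZED | MF_USES_SL | MF_IS_SL)

/* Strip the flag bits, leaving OMAGIC, NMAGIC or ZMAGIC. */
long magic_base(long magic)
{
    return magic & ~(long)MF_ALL;
}

/* Print the base magic number in octal, then the flags that are set. */
void describe_magic(long magic)
{
    printf("%04lo:", (unsigned long)magic_base(magic));
    if (magic & MF_IMPURE)   printf(" MF_IMPURE");
    if (magic & MF_SQUEEZED) printf(" MF_SQUEEZED");
    if (magic & MF_USES_SL)  printf(" MF_USES_SL");
    if (magic & MF_IS_SL)    printf(" MF_IS_SL");
    printf("\n");
}
```

For instance, SPQMAGIC is MF_USES_SL|MF_SQUEEZED|ZMAGIC = 02000|01000|0413 = 03413, and magic_base(03413) yields 0413 (ZMAGIC).  MF_ALL works out to 07200, the same mask that N_BADMAG applies before comparing against ZMAGIC.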

/*
 * Macros which take exec structures as arguments and tell whether
 * the file has a reasonable magic number or give offsets to
 * text|symbols|strings.
 */
#define N_BADMAG(x) \
    ( ( ((x).a_magic & ~007200) != ZMAGIC ) && \
      ( ((x).a_magic & ~006000) != OMAGIC ) && \
      (  (x).a_magic != NMAGIC ) \
    )

#define IS_SQUEEZED(magic)      (((magic) & MF_SQUEEZED) != 0)
#define IS_SHARED_LIB(magic)    (((magic) & MF_IS_SL)    != 0)

#define N_TXTOFF(x) \
    (    (x).a_magic == OMAGIC ? sizeof (struct exec) : \
         (((x).a_magic & ~007200) == ZMAGIC ? PAGESIZE : sizeof (struct exec_header)) \
    )
#define N_SYMOFF(x) \
        (N_TXTOFF(x) + (x).a_text+(x).a_data + (x).a_trsize+(x).a_drsize)
#define N_STROFF(x) \
        (N_SYMOFF(x) + (x).a_syms)

The file has five sections: a header, the program text and data, relocation information, a symbol table and a string table (in that order).  The last three may be omitted if the program was loaded with the “−s” option of ld or if the symbols and relocation have been removed by strip(1).

In the header the sizes of each section are given in bytes.  The size of the header is not included in any of the other sizes. 

When an a.out file is executed, three logical segments are set up: the text segment, the data segment (with uninitialized data, which starts off as all 0, following initialized), and a stack.  Only “demand load” (ZMAGIC) formats may be loaded − the other formats are historical relics (NMAGIC) or are input files for ld (OMAGIC). 
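The rule that only ZMAGIC based formats may be loaded can be expressed with the same mask test that N_BADMAG and N_TXTOFF use.  The following is a minimal sketch; the function name is_demand_load is illustrative, and this is not claimed to be the check the kernel itself performs.

```c
/* Values copied from the definitions above. */
#define OMAGIC 0407
#define NMAGIC 0410
#define ZMAGIC 0413

/*
 * Nonzero if magic is ZMAGIC with any combination of the MF_ flags
 * (007200 is the union of the four flag bits).
 */
int is_demand_load(long magic)
{
    return (magic & ~007200L) == ZMAGIC;
}
```

With this test, ZMAGIC (0413) and SPQMAGIC (03413) qualify, while OMAGIC, NMAGIC and SPOMAGIC (02407) do not.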

The text segment begins at offset 32768 (the start of the first page) in the virtual memory of the process.  For convenience during loading, the text segment also begins at offset 32768 in a ZMAGIC file (see the definition of N_TXTOFF above).  For a program which does not use a shared library (the MF_USES_SL flag is not set in the magic number) the text is loaded at virtual address 32768.  Otherwise it is loaded at the first page after the end of any shared library text (see the description of shared libraries below).  Unless the “impure” (MF_IMPURE) flag is set, the text segment will not be writable and will be shared between different instances of the same program.  If the file is impure the text segment size will be 0 − ld merges the text and data segments. 

The data segment begins immediately after the text segment.  The linker rounds the size of the text segment up to a 32768 byte boundary, thus the data segment starts on a (memory) page boundary in the file (this also happens to be a file system page boundary).  The data segment is loaded immediately after the text segment and is, of course, writable.  The uninitialised (bss) data occurs immediately after this and is initialised to zero.  See the description of ld for details of the symbols which may be used in a program to determine the virtual address of these segments of the address space. 

The heap occurs immediately after the end of the uninitialised data − this may not be on a memory page boundary.  The stack segment is at the top of the virtual address space, and also contains the shared library data (see below).  This segment ends at USRSTACK (from <machine/vmparam.h>). The stack is automatically extended as required.  The data segment is only extended as requested by brk(2).
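The address arithmetic described above can be sketched as follows for a program which does not use a shared library.  The structure seg_layout and both function names are illustrative, and uint32_t is assumed to be wide enough for an address on this machine.  Since ld already rounds the text size, the rounding here is a no-op for a well-formed file.

```c
#include <stdint.h>

#define PAGESIZE 32768u         /* memory page size given in the text */

/* Hypothetical result type: start address of each region. */
struct seg_layout {
    uint32_t text_start;
    uint32_t data_start;
    uint32_t bss_start;
    uint32_t heap_start;
};

/* Round n up to the next multiple of PAGESIZE. */
uint32_t round_up_page(uint32_t n)
{
    return (n + PAGESIZE - 1) & ~(PAGESIZE - 1);
}

/* Lay out text, data, bss and heap from the header sizes. */
struct seg_layout compute_layout(uint32_t a_text, uint32_t a_data,
                                 uint32_t a_bss)
{
    struct seg_layout l;

    l.text_start = PAGESIZE;                        /* no shared library */
    l.data_start = l.text_start + round_up_page(a_text);
    l.bss_start  = l.data_start + a_data;           /* follows initialised data */
    l.heap_start = l.bss_start + a_bss;             /* need not be page aligned */
    return l;
}
```

With a_text = 40000, a_data = 1000 and a_bss = 500, the text size rounds up to 65536, so the data starts at 98304, the bss at 99304 and the heap at 99804.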

If the image is squeezed (the MF_SQUEEZED flag is set in the magic number) the text and data segments in the a.out file will be squeezed.  When the kernel loads the relevant memory pages the contents of the relevant section of the file will be unsqueezed.  In this case the first 32768 bytes of the file (corresponding to the first memory page) contain the tables for unsqueezing the remainder of the file.  The extended exec header contains the information the kernel requires to locate these tables. 

If the magic number in the header is identically OMAGIC (0407) the old format exec header is used − this is for compatibility with programs producing input for the linker.  For all other formats the extended header is used.  This contains information about the version of the program or binary, the squeeze tables (as above), the link time and any shared library requirement. 

The version information is left blank by ld. It is filled in separately for all distributed binaries and may be read using version(1).

The squeeze information is also left blank − it is filled in by the program which squeezes the linked image. 

The a_timestamp field is filled in by ld, which sets it to the link time of the output.  The shared library information is filled in at the same time − for a program which does not use a shared library it is blank (all bytes 0); for programs which do, it contains the timestamp from the shared library and the absolute path name of the shared library.

Shared libraries are indicated by the MF_IS_SL flag in the magic number and are treated very differently by the kernel.  A shared library is produced using the −Z flag to ld(1). It may not be executed directly but may be used in a further link step to produce a program which is capable of sharing the library code with other programs linked with the shared library − see the description of ld(1) and mkshare(1).

When the kernel encounters a program which uses a shared library it looks for the library − the program cannot be executed unless the library is in the correct place.  Use file(1) or phead(1) to find out details of the shared library used by a program.

The kernel loads the shared library text first (recursively loading a shared library used by the library if necessary).  The data for the shared library is placed at the virtual address given by the a_entry field in the shared library exec header.  ld sets this so that the data sits immediately below USRSTACK, or below the data of a shared library which the library shares.  Shared libraries never have any uninitialised data.  The program text and data are loaded after the shared library text.  The kernel then commences execution of the process at the entry point given by the a_entry field in the exec header of the program. 

After the header in the file follow the text, data, text relocation, data relocation, symbol table and string table, in that order.  The N_TXTOFF macro returns the absolute file position of the text segment when given the name of an exec structure as argument.  The data segment is contiguous with the text and immediately followed by the text relocation and then the data relocation information.  The symbol table follows all this; its position is computed by the N_SYMOFF macro.  Finally, the string table immediately follows the symbol table at a position which can be obtained using N_STROFF.  The first 4 bytes of the string table are not used for string storage, but rather contain the size of the string table; this size INCLUDES the 4 bytes, so the minimum string table size is 4. 
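The file offsets described above can be worked through with a small sketch.  It assumes that long is 32 bits on the ARM, so uint32_t stands in for it and the small header occupies 32 bytes.  The structure exec32 and the function names are illustrative; because the definition of struct version is not shown here, the size of the extended header is passed in as the parameter ext_hdr_size instead of being computed.

```c
#include <stdint.h>

#define OMAGIC   0407
#define ZMAGIC   0413
#define PAGESIZE 32768u

/* Stand-in for struct exec, assuming a 32-bit long. */
struct exec32 {
    uint32_t a_magic, a_text, a_data, a_bss;
    uint32_t a_syms, a_entry, a_trsize, a_drsize;
};

/* Mirrors N_TXTOFF; ext_hdr_size replaces sizeof (struct exec_header). */
uint32_t txt_off(const struct exec32 *x, uint32_t ext_hdr_size)
{
    if (x->a_magic == OMAGIC)
        return (uint32_t)sizeof(struct exec32);
    if ((x->a_magic & ~007200u) == ZMAGIC)
        return PAGESIZE;
    return ext_hdr_size;
}

/* Mirrors N_SYMOFF: skip text, data and both relocation areas. */
uint32_t sym_off(const struct exec32 *x, uint32_t ext_hdr_size)
{
    return txt_off(x, ext_hdr_size) + x->a_text + x->a_data
         + x->a_trsize + x->a_drsize;
}

/* Mirrors N_STROFF: the string table follows the symbol table. */
uint32_t str_off(const struct exec32 *x, uint32_t ext_hdr_size)
{
    return sym_off(x, ext_hdr_size) + x->a_syms;
}
```

For an OMAGIC file with a_text = 100, a_data = 40, a_trsize = 16, a_drsize = 8 and a_syms = 24, the text starts at offset 32, the symbol table at 32 + 100 + 40 + 16 + 8 = 196, and the string table at 220.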

The layout of a symbol table entry and the principal flag values that distinguish symbol types are given in the include file as follows:

/*
 * Format of a symbol table entry.
 */
struct nlist {
        union {
                char    *n_name;        /* for use when in-core */
                long    n_strx;         /* index into file string table */
        } n_un;
        unsigned char   n_type;         /* type flag, i.e. N_TEXT etc; see below */
        char            n_other;
        short           n_desc;         /* see <stab.h> */
        unsigned long   n_value;        /* value of this symbol (or offset) */
};
#define n_hash  n_desc          /* used internally by ld */

/*
 * Simple values for n_type.
 */
#define N_UNDF  0x0             /* undefined */
#define N_ABS   0x2             /* absolute */
#define N_TEXT  0x4             /* text */
#define N_DATA  0x6             /* data */
#define N_BSS   0x8             /* bss */
#define N_COMM  0x12            /* common (internal to ld) */
#define N_FN    0x1f            /* file name symbol */

#define N_EXT   01              /* external bit, or'ed in */
#define N_TYPE  0x1e            /* mask for all the type bits */

/*
 * Other permanent symbol table entries have some of the N_STAB bits set.
 * These are given in <stab.h>
 */
#define N_STAB  0xe0            /* if any of these bits set, don't discard */

/*
 * Format for namelist values.
 */
#define N_FORMAT        "%08x"

In the a.out file a symbol’s n_un.n_strx field gives an index into the string table.  An n_strx value of 0 indicates that no name is associated with a particular symbol table entry.  The field n_un.n_name can be used to refer to the symbol name only if the program sets this up using n_strx and appropriate data from the string table. 
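Setting up a name pointer from n_strx might look like the sketch below.  It assumes that the index is measured from the start of the string table, including the 4 byte size word, so that the first usable index is 4.  The function name symbol_name and its parameters are illustrative.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * strtab points at the string table as read from the file, including
 * its leading 4 byte size word; strsize is the value of that word.
 * Returns the name, or NULL if the entry has no name or the index is
 * not usable.
 */
const char *symbol_name(const char *strtab, uint32_t strsize,
                        uint32_t n_strx)
{
    if (n_strx == 0)
        return NULL;                    /* no name for this entry */
    if (n_strx < 4 || n_strx >= strsize)
        return NULL;                    /* inside size word or past the end */
    if (memchr(strtab + n_strx, '\0', strsize - n_strx) == NULL)
        return NULL;                    /* name is not terminated */
    return strtab + n_strx;
}
```

A program that wants to use n_un.n_name in core would store the returned pointer there after reading the symbol and string tables.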

If a symbol’s type is undefined external, and the value field is non-zero, the symbol is interpreted by the loader ld as the name of a common region whose size is indicated by the value of the symbol. 
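The test for a common region can be written directly from the masks in the include file.  This is a sketch only; the function name is_common is illustrative and is not taken from ld.

```c
#include <stdint.h>

/* Values copied from the include file excerpt above. */
#define N_UNDF 0x0
#define N_EXT  01
#define N_TYPE 0x1e
#define N_STAB 0xe0

/*
 * Nonzero if the symbol is undefined, external and has a non-zero
 * value, in which case n_value is the size of the common region.
 */
int is_common(unsigned char n_type, uint32_t n_value)
{
    if (n_type & N_STAB)
        return 0;                       /* debugger entry, see <stab.h> */
    return (n_type & N_TYPE) == N_UNDF
        && (n_type & N_EXT) != 0
        && n_value != 0;
}
```

An entry with n_type equal to N_UNDF|N_EXT and n_value equal to 16 would therefore name a 16 byte common region, while the same type with a value of 0 is an ordinary undefined external.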

The value of a byte in the text or data which is not a portion of a reference to an undefined external symbol is exactly that value which will appear in memory when the file is executed.  If a byte in the text or data involves a reference to an undefined external symbol, as indicated by the relocation information, then the value stored in the file is an offset from the associated external symbol.  When the file is processed by the link editor and the external symbol becomes defined, the value of the symbol will be added to the bytes in the file. 
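The addition performed when an external symbol becomes defined can be sketched as below.  The sketch assumes little-endian byte order and a field of 1, 2 or 4 bytes; the function name add_symbol_value and its parameters are illustrative.  It covers only the simple addition described above and is not the implementation used by ld.

```c
#include <stdint.h>

/*
 * Add sym_value to the little-endian field of nbytes bytes (1, 2 or 4)
 * stored at p, wrapping on overflow.
 */
void add_symbol_value(unsigned char *p, unsigned nbytes, uint32_t sym_value)
{
    uint32_t v = 0;
    unsigned i;

    for (i = 0; i < nbytes; i++)        /* fetch the stored offset */
        v |= (uint32_t)p[i] << (8 * i);
    v += sym_value;                     /* offset + value of the symbol */
    for (i = 0; i < nbytes; i++)        /* store the result back */
        p[i] = (unsigned char)(v >> (8 * i));
}
```

If the file holds the offset 0x10 in a 4 byte field and the symbol is finally defined at 0x8000, the field becomes 0x8010.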

If relocation information is present, it amounts to eight bytes per relocatable datum as in the following structure:

/*
 * Format of a relocation datum.
 */
struct relocation_info {
        int             r_address;      /* address which is relocated */
        unsigned int    r_symbolnum:24, /* local symbol ordinal */
                        r_pcrel:1,      /* was relocated pc relative already */
                        r_length:2,     /* 0=byte, 1=word, 2=long */
                        r_extern:1,     /* does not include value of sym referenced */
                        r_neg:1,        /* negative relocation */
                        :3;             /* nothing, yet */
};

There is no relocation information if a_trsize+a_drsize==0.  If r_extern is 0, then r_symbolnum is actually an n_type for the relocation (e.g. N_TEXT, meaning relative to the origin of the text segment). 
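The two readings of r_symbolnum can be illustrated with a short sketch.  The enumeration reloc_base and the functions reloc_base_of and reloc_count are illustrative names.  The fields are accessed by name only, since the placement of bit-fields within a word depends on the compiler.

```c
/* Values copied from the include file excerpts above. */
#define N_ABS  0x2
#define N_TEXT 0x4
#define N_DATA 0x6
#define N_BSS  0x8
#define N_TYPE 0x1e

struct relocation_info {
    int          r_address;
    unsigned int r_symbolnum:24,
                 r_pcrel:1,
                 r_length:2,
                 r_extern:1,
                 r_neg:1,
                 :3;
};

/* Hypothetical classification of what a relocation is relative to. */
enum reloc_base { BASE_SYMBOL, BASE_ABS, BASE_TEXT, BASE_DATA, BASE_BSS,
                  BASE_OTHER };

enum reloc_base reloc_base_of(const struct relocation_info *r)
{
    if (r->r_extern)
        return BASE_SYMBOL;     /* r_symbolnum is a symbol table ordinal */
    switch (r->r_symbolnum & N_TYPE) {  /* otherwise it is an n_type */
    case N_ABS:  return BASE_ABS;
    case N_TEXT: return BASE_TEXT;
    case N_DATA: return BASE_DATA;
    case N_BSS:  return BASE_BSS;
    default:     return BASE_OTHER;
    }
}

/* Number of entries in a relocation area of the given size in bytes. */
unsigned long reloc_count(unsigned long size)
{
    return size / 8;            /* eight bytes per relocatable datum */
}
```

A text relocation area with a_trsize equal to 24 thus holds three entries.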

SEE ALSO

adb(1), as(1), ld(1), mkshare(1), nm(1), dbx(1), stab(5), strip(1), squeeze(1), unsqueeze(1)

BUGS

The a_entry symbol is overloaded. 

Although all magic number/flag combinations are valid, most of them are not understood by the kernel − in particular, magic numbers based on OMAGIC or NMAGIC will not execute correctly.  All valid combinations are listed in the include file. 

7th Edition  —  Revision 1.4 of 03/12/88
