⇒ ld(1) — RISC iX 1.2

LD(1)  —  UNIX Programmer’s Manual

NAME

ld − link editor

SYNOPSIS

ld [ option ] ... file ... 

DESCRIPTION

ld combines several object programs into one, resolves external references, and searches libraries.  In the simplest case several object files are given, and ld combines them, producing an object module which can be either executed or become the input for a further ld run.  (In the latter case, the −r option must be given to preserve the relocation bits.)  The output of ld is left on a.out.  This file is made executable only if no errors occurred during the load. 

The argument routines are concatenated in the order specified. 

If any argument is a library, it is searched exactly once at the point it is encountered in the argument list.  Only those archive members defining an unresolved external reference are loaded.  If a member from a library references another member in the library, and the library has not been processed by ranlib(1), the referenced routine must appear after the referencing routine in the library. Thus the order of programs within libraries may be important.  See tsort(1), lorder(1). The first member of a library should be a file named ‘__.SYMDEF’, which is understood to be a dictionary for the library as produced by ranlib(1); the dictionary is searched iteratively to satisfy as many references as possible.
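The difference between searching a library with and without a dictionary can be modelled in a few lines of Python. This is only a sketch of the behaviour described above, not the algorithm ld actually uses: the function names, the tuple layout and the sample members are invented for illustration, and a real ‘__.SYMDEF’ dictionary is an index rather than a repeated scan.

```python
def _load_pass(members, undefined, defined, loaded):
    """Scan the archive once, in order; return True if any member was loaded."""
    progress = False
    for name, defines, references in members:
        # A member is loaded only if it defines a currently unresolved symbol.
        if name in loaded or not (undefined & defines):
            continue
        loaded.append(name)
        defined |= defines
        undefined |= references
        undefined -= defined
        progress = True
    return progress


def search_without_dictionary(members, wanted):
    """Model of a library not processed by ranlib: one forward pass only."""
    undefined, defined, loaded = set(wanted), set(), []
    _load_pass(members, undefined, defined, loaded)
    return loaded, undefined


def search_with_dictionary(members, wanted):
    """Model of a library with a dictionary: repeat until nothing new loads."""
    undefined, defined, loaded = set(wanted), set(), []
    while _load_pass(members, undefined, defined, loaded):
        pass
    return loaded, undefined


# Hypothetical archive in which a.o references _b, but b.o comes first.
badly_ordered = [
    ("b.o", {"_b"}, set()),
    ("a.o", {"_a"}, {"_b"}),
]

print(search_without_dictionary(badly_ordered, {"_a"}))  # (['a.o'], {'_b'})
print(search_with_dictionary(badly_ordered, {"_a"}))     # (['a.o', 'b.o'], set())
```

In the single-pass case ‘_b’ is left undefined because b.o was passed over before anything referred to it; reordering the members, or running ranlib(1) on the library, removes the problem.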

The symbols ‘__etext’, ‘__edata’, ‘__end’, ‘__estext’ and ‘__esdata’ (‘_etext’, ‘_edata’, ‘_end’, ‘_estext’ and ‘_esdata’ in C) are reserved, and if referred to, are set to the first location above the program, the first location above initialized data, the first location above all program data, the first location after shared library text and the first location in shared library data respectively.  It is erroneous to define these symbols. 

ld understands several options.  The option −A must come before all file name or −l arguments; with this exception, and that of −D, option order is unimportant.  The order of file and library arguments, however, is crucial, because ld makes only a single pass over the arguments to determine symbol definitions.

−A This option specifies incremental loading, i.e. linking is to be done in a manner so that the resulting object may be read into an already executing program.  The next argument is the name of a file whose symbol table will be taken as a basis on which to define additional symbols.  Only newly linked material will be entered into the text and data portions of a.out, but the new symbol table will reflect every symbol defined before and after the incremental load.  This argument must appear before any other object file in the argument list.  Incremental loading is incompatible with the use of shared libraries − although the argument to −A may refer to a shared library, nothing included in the linked output may.  The −T option may be used as well, and will be taken to mean that the newly linked segment will commence at the corresponding address.  The default address is the old value of ‘__end’.

−d Force definition of common storage even if the −r flag is present.  This option also causes ld to output a list of undefined symbols when −r is present; however, −r still prevents the definition of ‘__end’, ‘__edata’ and ‘__etext’.

−e The following argument is taken to be the name of the entry point of the loaded program; the symbol ‘start’ is the default.  Note that this is only of use for assembler programs, since high level languages require a run time environment which must be entered before any user routines are called. 

−i Produce impure demand load output − a demand loadable executable with a writable text segment.  The output has magic number 0613. 

−Ldir Add dir to the list of directories in which libraries are searched for.  Directories specified with −L are searched before the standard directories.  All −L arguments are read before ld starts to read the input arguments; thus, although order is important within the −L arguments, their order relative to other arguments is not.  The default list of directories is ‘/usr/lib’ then ‘/usr/local/lib’.

−lx This option is an abbreviation for the library name ‘libx.a’, where x is a string.  ld searches for libraries first in any directories specified with −L options, then in the standard directories ‘/usr/lib’ and ‘/usr/local/lib’ (see the description of −L.)  A library is searched when its name is encountered, so the placement of a −l is significant. 

−M Produce a primitive load map, listing the names of the files which will be loaded.

−o The name argument after −o is used as the name of the ld output file, instead of a.out. 

−N Do not make the text portion read only or sharable.  (Use ‘magic number’ 0407.)  The resultant executable cannot be loaded by the kernel (only demand loadable executables are supported.)  −N should only be used when producing output for use in a further link step. 

−Q Squeeze the output executable, which must be in demand load format.  The program squeeze(1) must exist on the path of ld for this to work. 

−r Generate relocation bits in the output file so that it can be the subject of another ld run.  This flag also prevents final definitions from being given to common symbols, and suppresses the ‘undefined symbol’ diagnostics.  The output will have magic number 0407 (OMAGIC.) 

−s ‘Strip’ the output, that is, remove the symbol table and relocation bits to save space (but impair the usefulness of the debuggers).  This information can also be removed by strip(1).

−t (‘trace’)  Print the name of each file as it is processed. 

−u Take the following argument as a symbol and enter it as undefined in the symbol table.  This is useful for loading wholly from a library, since initially the symbol table is empty and an unresolved reference is needed to force the loading of the first routine. 

−X Save local symbols except for those whose names begin with ‘L’.  This option is used by cc(1) to discard internally-generated labels while retaining symbols local to routines.

−x Do not preserve local (non-.globl) symbols in the output symbol table; only enter external symbols.  This option saves some space in the output file. 

−ysym Indicate each file in which sym appears, its type and whether the file defines or references it.  Many such options may be given to trace many symbols.  (It is usually necessary to begin sym with an ‘_’, as external C, FORTRAN and Pascal variables begin with underscores.)

−z Arrange for the process to be loaded on demand from the resulting executable file (413 format) rather than preloaded.  This is the default.  The output file has a 32768 byte header followed by a text segment and a data segment, each of which has a size that is a multiple of pagesize bytes (default 32768 − see −P.)  The file is in effect padded with nulls.  With this format the first few BSS segment symbols may be placed at the end of the data segment of the output, to avoid wasting the space that results from rounding up the data segment size.

−Z Produce a shared library.  The output is in a suitable format for the kernel to load as a shared library.  The magic number has the ‘I am a shared library’ flag set.  The data used by the library is placed at the top of the address space (above the stack.)  The entry point stored in the output a.out file is set to the address of the first byte of the data. 
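Returning to the −L and −lx options described earlier in this list, their rules amount to a short search path. A minimal sketch in Python follows; it assumes that the first existing file wins, which the text implies but does not state outright, and library_candidates and find_library are hypothetical names, not part of ld.

```python
import posixpath

# Standard directories quoted under -L, in the order given there.
STANDARD_DIRS = ["/usr/lib", "/usr/local/lib"]


def library_candidates(x, l_dirs=()):
    """Paths that would be tried for -lx, in search order.

    l_dirs holds the -L directories in the order they were given.
    """
    name = "lib" + x + ".a"
    return [posixpath.join(d, name) for d in list(l_dirs) + STANDARD_DIRS]


def find_library(x, l_dirs=(), exists=posixpath.exists):
    """Return the first candidate that exists, or None (assumed behaviour)."""
    for path in library_candidates(x, l_dirs):
        if exists(path):
            return path
    return None


print(library_candidates("m", ["/home/me/lib"]))
# ['/home/me/lib/libm.a', '/usr/lib/libm.a', '/usr/local/lib/libm.a']
```

Because every −L argument is collected before any input is read, l_dirs is complete by the time the first −l is resolved; only the position of the −l itself matters.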

The following options are not used for normal programs, since they can easily produce images that the kernel will not correctly run. They can be used in cross-development environments or as modifiers to some of the other options (e.g.  −A and −Z).  They should be used with caution. 

−B Set the base of the text portion of the output.  Unlike −T, this causes a gap to be left in the output corresponding to the difference between the origin of the text segment and the base of the output text; the resulting executable will thus execute correctly when loaded by the kernel.  The argument is a hexadecimal number.

−D Take the next argument as a hexadecimal number and pad the data segment with zero bytes to the indicated length.  The argument may appear between input files. 

−E The next argument is interpreted as a hexadecimal number addressing the byte beyond the end of the data segment of the program.  Thus −E specifies the value ld will give to ‘__end’.  The option is not compatible with −H.  −H is normally used with −Z to specify the position of shared library data. 

−H The next argument is interpreted as a decimal number specifying the size of a hole between the text and data segments.  The option is not compatible with −E. 

−P The next argument is interpreted as a hexadecimal number defining the page size of the target machine.  This is used to determine the rounding used for the text segment size of demand loaded output.  The default is 32768 (hex 8000.) 

−T The next argument is a hexadecimal number which sets the text segment origin.  The default origin is 32768 (hex 8000.)  If the text segment origin is changed the kernel will not be able to (correctly) load the executable.  See also −E. 
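The size arithmetic implied by −z and −P can be worked through in Python. This is a sketch under stated assumptions, not a description of the real file layout: it assumes the header stays at 32768 bytes whatever the page size, that both text and data are rounded as the −z entry says, and it ignores any symbol table or relocation data. All names are hypothetical.

```python
HEADER_SIZE = 32768          # header size quoted under -z (assumed fixed)
DEFAULT_PAGE_SIZE = 0x8000   # 32768, the default quoted under -P


def parse_hex_argument(text):
    """Parse a hexadecimal argument such as those taken by -P, -T, -B, -D or -E."""
    return int(text, 16)


def round_up(size, page_size=DEFAULT_PAGE_SIZE):
    """Round size up to the next multiple of page_size."""
    return (size + page_size - 1) // page_size * page_size


def demand_load_file_size(text_size, data_size, page_size=DEFAULT_PAGE_SIZE):
    """Header plus padded text plus padded data, under the assumptions above."""
    return (HEADER_SIZE
            + round_up(text_size, page_size)
            + round_up(data_size, page_size))


# 40000 bytes of text round up to 65536 and 100 bytes of data to 32768,
# so the file occupies 32768 + 65536 + 32768 bytes.
print(demand_load_file_size(40000, 100))  # 131072
```

The example shows why small programs produce large files in this format, and why placing the first few BSS symbols in the data segment padding costs nothing.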

DIAGNOSTICS

ld detects two significant user errors − undefined symbols and multiply defined symbols.  Undefined symbols arise when a reference to an external symbol is not resolved by a definition of the same (external) symbol in another input file or archive member.  This is an error unless the −r option is given. 

Multiply defined symbols arise when two or more modules contain a definition of the same external symbol and the definitions are different − ld will allow multiple definitions if they have the same value (after relocation.) 

With correctly formed input modules these errors apply only to external symbols.  It is impossible to form undefined non-external symbols with the normal tools, and, because non-external symbols are not used to resolve external references, multiple definitions of them cannot arise.
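The two checks described above can be modelled in Python. This is a simplified sketch, not ld's implementation: each module is reduced to a mapping from external symbol to its relocated value, with None standing for a reference, and the function and module names are hypothetical.

```python
def check_symbols(modules):
    """Return (undefined, multiply_defined) as sorted lists of symbol names.

    modules: list of (module name, {symbol: value}) pairs, where value is
    an int for a definition (after relocation) or None for a reference.
    """
    defined = {}
    referenced = set()
    multiply_defined = set()
    for _name, symbols in modules:
        for sym, value in symbols.items():
            if value is None:
                referenced.add(sym)
            elif sym in defined and defined[sym] != value:
                # Only definitions with different values are an error.
                multiply_defined.add(sym)
            else:
                defined[sym] = value
    undefined = referenced - set(defined)
    return sorted(undefined), sorted(multiply_defined)


modules = [
    ("a.o", {"_main": 0x8000, "_x": None, "_v": 4}),
    ("b.o", {"_v": 4, "_w": 8}),    # _v repeated with the same value: allowed
    ("c.o", {"_w": 12, "_y": None}),  # _w repeated with a different value
]

print(check_symbols(modules))  # (['_x', '_y'], ['_w'])
```

With −r the first list would not be reported as an error; the second is an error in either case.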

When using shared libraries multiple definitions may occur with programs which work with the equivalent non-shared libraries.  This is because a shared library defines a large number of symbols, many of which would not have been included if the program had been linked with a non-shared library.  Thus programs which accidentally redefine symbols from a library will give rise to multiple symbol declarations when linked with a shared version of the library.  In many cases this will be a genuine bug in the program (wherever parts of the library which the program uses depend on the original definition of the symbol in question.) 

The only sure way of avoiding multiple declarations of symbols in a system which uses separate compilation is to adopt a naming convention for external function and data symbols which guarantees that they are unique.  One typical scheme uses names of the form “module_name”, where “name” is a descriptive name appropriate to the function or data, and “module” is a name for a small piece of code containing closely related functions − normally confined to a single source file. 

Notice that ld is not capable of doing even the most basic type checking of a symbol − no information is stored with an undefined symbol to say whether it is expected to be a reference to text or data.  Thus ld will quite happily satisfy a request for a definition of a symbol referring to a byte of (writable) data with a symbol referring to a function.

FILES

/usr/lib/lib∗.a          libraries
/usr/local/lib/lib∗.a    more libraries
a.out                    output file

SEE ALSO

as(1), ar(1), cc(1), ranlib(1), mkshare(1), phead(1), strip(1), tsort(1), lorder(1), squeeze(1), unsqueeze(1)

BUGS

Specifying the option −T creates an executable which the kernel will load into the wrong place − normally resulting in the program immediately crashing.  Other combinations of options can have similar effects.  This is because the a.out header does not contain sufficient information to allow text and data to be loaded into arbitrary positions in the address space. 

FEATURES

The option −n is not supported − it should produce read-only text executables which are not demand loadable (NMAGIC, magic number 0410.)  There is never any advantage in using this format, so the option is converted (with a warning) to −z.
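The magic numbers quoted in this page are octal. A small Python lookup is offered here purely as a reading aid: the table only restates what the option descriptions say, it assumes that "413 format" means octal 0413, and describe_magic is a hypothetical helper. The position of the magic number and of the shared library flag within the a.out header is not given in this page, so no attempt is made to read a real file.

```python
# Octal magic numbers as quoted in the option descriptions.
FORMATS = {
    0o407: "OMAGIC: text not read-only or sharable (-N, -r)",
    0o410: "NMAGIC: read-only text, not demand loadable (-n, not supported)",
    0o413: "demand load format (-z, the default); assumed octal",
    0o613: "impure demand load: writable text segment (-i)",
}


def describe_magic(magic):
    """Describe a magic number, ignoring any flag bits such as the -Z flag."""
    return FORMATS.get(magic, "not mentioned in this page")


print(oct(0o413), 0o413)      # 0o413 267
print(describe_magic(0o613))  # impure demand load: writable text segment (-i)
```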

The shared library scheme is a simple one.  Programs may share the text in a shared library by being linked with it.  Each program may depend on only one shared library (although that shared library may itself depend on another).  There is no support for dynamic loading or other niceties. 

4th Berkeley Distribution  —  Revision 1.10 of 22/11/90
