join(1) join(1)
NAME
join - join two files on identical-valued fields
SYNOPSIS
join [option ...] file1 file2
DESCRIPTION
join compares two files on the basis of relations ("join fields") and
joins all pairs of lines with identical join fields. The result is
displayed on the standard output.
When join is invoked, a join field on which the files are to be com-
pared must be specified for each of the two files. Each field is
bounded by a pair of field separators. join compares each line in the
first file with lines in the second and displays one output line on
the standard output for each pair of lines with identical join fields.
The output line comprises specific fields from both lines.
Before the call
Each input file must be sorted so that the join fields are arranged in
the currently valid collating sequence [see sort(1)]. If the default
field separator is used (join without the -t option), leading separa-
tors must be ignored (see sort, option -b) when the files are sorted.
However, if you invoke join with option -t, leading field separators
must be taken into account when sorting the files (see sort without
option -b).
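The sorting requirement for the default separators can be sketched as follows. This is an illustration only, not part of the original page: the temporary directory, the sample data and the use of LC_ALL=C to pin the collating sequence are assumptions.

```shell
# Illustrative sketch: file names and contents are made up.
tmp=$(mktemp -d)
printf '%s\n' 'Irene Philadelphia' 'Albert Buffalo' 'Hugh Washington' > "$tmp/place"
printf '%s\n' 'Hugh 23.15' 'Albert 287.56' 'Irene 1.98' > "$tmp/amount"

# Default separators: sort on field 1 and ignore leading blanks (-b),
# using the same collating sequence that join will use.
LC_ALL=C sort -b -k 1,1 "$tmp/place"  > "$tmp/place.sorted"
LC_ALL=C sort -b -k 1,1 "$tmp/amount" > "$tmp/amount.sorted"

sorted_join=$(LC_ALL=C join "$tmp/place.sorted" "$tmp/amount.sorted")
printf '%s\n' "$sorted_join"
# Albert Buffalo 287.56
# Hugh Washington 23.15
# Irene Philadelphia 1.98

rm -rf "$tmp"
```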
OPTIONS
No option specified:
The first field in a line is the default join field for both
files; the default separators are blanks, tabs, and newline char-
acters. Multiple field separators count as one field separator,
and leading separators are ignored.
join displays one output line on the standard output for each
pair of lines with identical join fields. Each output line con-
sists of the following entries in the given order:
- the common field
- the rest of the line from the first file
- the rest of the line from the second file
The default output field separator is a blank.
Page 1 Reliant UNIX 5.44 Printed 11/98
-1 m or -2 m
The mth field is used as the join field for file 1 (-1 m) or
file 2 (-2 m). m must be an integer greater than or equal to 1.
-1 or -2 not specified: The join field for the corresponding file
is the 1st field.
The options -1 and -2 correspond to the old option -j, which will
continue to be supported, but must not be combined with the new
synopsis. The following correspondence exists between the old and
the new synopsis:
-j m is equivalent to -1 m -2 m
-j1 m is equivalent to -1 m
-j2 m is equivalent to -2 m
-a n In addition to the normal output, prints each line of file n for
which no matching join field can be found in the other file.
n can be 1 or 2. To generate output for both files you can enter
-a 1 -a 2.
-a cannot be combined with -v.
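The effect of -a might look like this in practice. The sketch below is an illustration under assumptions: the sample files are made up, both are already sorted on field 1, and LC_ALL=C is used only to make the collating sequence predictable.

```shell
# Illustrative sketch: file names and contents are made up.
tmp=$(mktemp -d)
printf '%s\n' 'Albert Buffalo' 'Frank Buffalo' 'Hugh Washington' > "$tmp/place"
printf '%s\n' 'Albert 287.56' 'Hugh 23.15' 'Irene 1.98' > "$tmp/amount"

# Normal output plus the unpaired lines of file 1 (Frank has no amount).
unpaired_1=$(LC_ALL=C join -a 1 "$tmp/place" "$tmp/amount")
printf '%s\n' "$unpaired_1"
# Albert Buffalo 287.56
# Frank Buffalo
# Hugh Washington 23.15

# Normal output plus the unpaired lines of both files.
unpaired_both=$(LC_ALL=C join -a 1 -a 2 "$tmp/place" "$tmp/amount")
printf '%s\n' "$unpaired_both"
# Albert Buffalo 287.56
# Frank Buffalo
# Hugh Washington 23.15
# Irene 1.98

rm -rf "$tmp"
```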
-e string
Replaces empty output fields with the specified string.
-o list
join changes the output line format, so that each output line
comprises the individual fields specified in list. The common
field is not printed unless you explicitly specify it in list.
The list you specify must consist of elements in the form n.m,
where n is either 1 or 2, and m is greater than or equal to 1.
Each element in the form n.m stands for the mth field in the nth
file. Enter 0 (zero) for the comparison field. The elements are
separated by commas.
For compatibility, the elements may still be separated by blanks
or tabs, but this form must not be combined with the new syntax.
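As an illustration, -o can be combined with -e and -a so that every output line has the same number of fields. The sketch is not taken from the original page: the sample files and the filler string NONE are assumptions, and LC_ALL=C only pins the collating sequence.

```shell
# Illustrative sketch: file names, contents and the filler NONE are made up.
tmp=$(mktemp -d)
printf '%s\n' 'Albert Buffalo' 'Frank Buffalo' 'Hugh Washington' > "$tmp/place"
printf '%s\n' 'Albert 287.56' 'Hugh 23.15' > "$tmp/amount"

# 0   = the join field
# 1.2 = field 2 of file 1 (the place)
# 2.2 = field 2 of file 2 (the amount)
# -a 1 keeps Frank, and -e fills his missing amount with NONE.
formatted=$(LC_ALL=C join -a 1 -e NONE -o 0,1.2,2.2 "$tmp/place" "$tmp/amount")
printf '%s\n' "$formatted"
# Albert Buffalo 287.56
# Frank Buffalo NONE
# Hugh Washington 23.15

rm -rf "$tmp"
```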
-t c Defines character c as a field separator for both input and out-
put lines. Each occurrence of c is interpreted as a field separa-
tor, i.e.
- two consecutive c separators designate an empty field, and
- a leading c is significant and designates an empty first
field.
In addition, the newline character acts as a field separator for
the input lines.
The default field separators (blanks and tabs) are interpreted as
field separators only if you specify them as a value for c.
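A short sketch of -t with a colon as separator follows. The file names and contents are made up for illustration. Note that the files are sorted with the same separator and without -b, as required when -t is used.

```shell
# Illustrative sketch: file names and contents are made up.
tmp=$(mktemp -d)
printf '%s\n' 'hugh:x:Washington' 'albert::Buffalo' > "$tmp/users"
printf '%s\n' 'hugh:23.15' 'albert:287.56' > "$tmp/amount"

# Sort on field 1 with ':' as separator; no -b, leading separators count.
LC_ALL=C sort -t : -k 1,1 "$tmp/users"  > "$tmp/users.sorted"
LC_ALL=C sort -t : -k 1,1 "$tmp/amount" > "$tmp/amount.sorted"

# ':' is used for both input and output; the empty second field of the
# albert line survives as two consecutive separators.
colon_join=$(LC_ALL=C join -t : "$tmp/users.sorted" "$tmp/amount.sorted")
printf '%s\n' "$colon_join"
# albert::Buffalo:287.56
# hugh:x:Washington:23.15

rm -rf "$tmp"
```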
-v n (v - vice versa) join outputs only the lines of the nth input
file whose comparison field does not match the comparison field
of the other file.
You can specify 1 or 2 for n. To produce output for both files,
specify -v 1 -v 2.
-v cannot be combined with -a.
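The following sketch shows how -v suppresses the joined lines and prints only the unpaired ones. It is an illustration under assumptions: the sample files are made up and already sorted on field 1, and LC_ALL=C only pins the collating sequence.

```shell
# Illustrative sketch: file names and contents are made up.
tmp=$(mktemp -d)
printf '%s\n' 'Albert Buffalo' 'Frank Buffalo' 'Hugh Washington' > "$tmp/place"
printf '%s\n' 'Albert 287.56' 'Hugh 23.15' 'Irene 1.98' > "$tmp/amount"

# Only the lines of file 1 that have no partner in file 2.
only_1=$(LC_ALL=C join -v 1 "$tmp/place" "$tmp/amount")
printf '%s\n' "$only_1"
# Frank Buffalo

# Unpaired lines of both files.
only_both=$(LC_ALL=C join -v 1 -v 2 "$tmp/place" "$tmp/amount")
printf '%s\n' "$only_both"
# Frank Buffalo
# Irene 1.98

rm -rf "$tmp"
```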
-- If file1 begins with a dash (-), the end of the command-line
options must be marked with --.
file1 file2
Names of the two files to be joined on the basis of common fields
by join.
If you use a dash (-) as the name for file1, join reads from
standard input.
Caution:
If the files are not sorted on their join fields, join will not
process all lines!
Problems may arise if a numeric file name (e.g. 1.2) is specified
for file1 and the -o option is used immediately before this file
name is listed. To avoid such conflicts, a numeric file name
should be prefixed with ./ (e.g. ./1.2).
LOCALE
The LC_MESSAGES environment variable governs the language in which
message texts are displayed.
LC_COLLATE governs the collating sequence.
LC_CTYPE governs character classes and character conversion (shifting).
If LC_MESSAGES, LC_COLLATE or LC_CTYPE is undefined or is defined as
the null string, it defaults to the value of LANG. If LANG is likewise
undefined or null, the system acts as if it were not internationalized.
If any of the locale variables has an invalid value, the system acts
as if none of the variables were set.
The LC_ALL environment variable governs the entire locale. LC_ALL
takes precedence over all the other environment variables which affect
internationalization.
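Because the collating sequence depends on the locale, sort and join should run under the same locale settings. The sketch below is an illustration, not part of the original page: the sample data is made up, and the C locale is chosen only because its ordering (uppercase before lowercase) is easy to predict.

```shell
# Illustrative sketch: file names and contents are made up.
tmp=$(mktemp -d)
printf '%s\n' 'albert Buffalo' 'Zed Boston' > "$tmp/place"
printf '%s\n' 'albert 287.56' 'Zed 5.00' > "$tmp/amount"

# Sort and join under one and the same locale. In the C locale,
# "Zed" collates before "albert".
LC_ALL=C sort -b -k 1,1 "$tmp/place"  > "$tmp/place.sorted"
LC_ALL=C sort -b -k 1,1 "$tmp/amount" > "$tmp/amount.sorted"

locale_join=$(LC_ALL=C join "$tmp/place.sorted" "$tmp/amount.sorted")
printf '%s\n' "$locale_join"
# Zed Boston 5.00
# albert Buffalo 287.56

rm -rf "$tmp"
```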
EXAMPLES
Example 1
In the file place, a place is assigned to a name. In the file amount,
an amount and a date are assigned to the same names. Both files are
sorted by name. join is to join the two files on the names:
Contents of place:
Albert Buffalo
Hugh Washington
Irene Philadelphia
Contents of amount:
Albert 287.56 20.03.88
Hugh 23.15 25.06.87
Hugh 167.87 16.12.87
Irene 1212.12 12.12.88
Irene 1.98 01.01.88
Join the two files on the first join field:
$ join place amount
Albert Buffalo 287.56 20.03.88
Hugh Washington 23.15 25.06.87
Hugh Washington 167.87 16.12.87
Irene Philadelphia 1212.12 12.12.88
Irene Philadelphia 1.98 01.01.88
Join the two files and format in columns with awk:
$ join place amount | awk '{printf("%-10s %-15s %-10s %-10s\n", $1,$2,$3,$4)}'
Albert     Buffalo         287.56     20.03.88
Hugh       Washington      23.15      25.06.87
Hugh       Washington      167.87     16.12.87
Irene      Philadelphia    1212.12    12.12.88
Irene      Philadelphia    1.98       01.01.88
Example 2
In the file city, a name is assigned to a city. In the file amount
(see Example 1), an amount and a date are assigned to a name. city is
sorted by cities, amount by names. join is to join the two files on
the names.
Contents of city:
Buffalo Albert
Buffalo Frank
Washington Hugh
New York Eric
Philadelphia Irene
In this example, the join field for city is field 2, while that of
amount is field 1.
Before the files are joined, city must be sorted on field 2. The out-
put is subsequently formatted into columns with awk:
$ sort -b -k 2 city | join -1 2 - amount | \
> awk '{printf("%-10s %-15s %-10s %-10s\n",$1,$2,$3,$4)}'
Albert     Buffalo         287.56     20.03.88
Hugh       Washington      23.15      25.06.87
Hugh       Washington      167.87     16.12.87
Irene      Philadelphia    1212.12    12.12.88
Irene      Philadelphia    1.98       01.01.88
SEE ALSO
awk(1), comm(1), sort(1), uniq(1).