join
PURPOSE
Joins data fields of two files.
SYNOPSIS
join [ options ] file1 file2
DESCRIPTION
The join command reads file1 and file2, joins lines in
the files according to the flags, and writes the results
to standard output. Both files must be sorted according
to the collating sequence specified by the NLCTAB envi-
ronment variable, if set, for the fields on which they
are to be joined (normally the first field in each line).
One line appears in the output for each identical join
field appearing in both file1 and file2. The join field
is the field in the input files that join looks at to
determine what will be included in the output. The
output line consists of the join field, the rest of the
line from file1, then the rest of the line from file2.
You can specify standard input in place of file1 by sub-
stituting a - (minus) for the name.
Both input files must be sorted in increasing ASCII col-
lating sequence on the fields on which they are to be
joined (the join field, normally the first field in each
line).
Fields are normally separated by a blank, a tab char-
acter, or a new-line character. In this case, join
treats consecutive separators as one, and discards
leading separators.
FLAGS
-anum When num is 1, join produces an output
line for each line found in file1 but
not in file2. When num is 2, join
produces an output line for each line
found in the file2 but not in file1.
-e string Replaces empty output fields with
string.
-j[n] num Joins the two files on the numth field
of file n. n is 0 or 1. If you do
not specify n, join uses the numth
field in each file.
-o n.num[,n.num . . . ]
Makes each output line consist of the
fields specified in list, in which
each element has the form n.num, where
n is a file number and num is a field
number.
-tchar Uses char as the field separator char-
acter in the input and the output.
Every appearance of char in a line is
significant. The default separator is
a blank. With default field sepa-
ration, the collating sequence is that
of sort -b. If you specify -t, the
sequence is that of a plain sort. To
specify a tab character, enclose it in
single quotation marks ('').
EXAMPLES
Note: The vertical alignment shown in these examples may
not be consistent with your output.
1. To perform a simple join operation on two files whose
first fields are the same:
join phonedir names
+----------------------+-------------------+--------------------------------+
| If "phonedir" con- | and "names" is | then join displays: |
| tains the following | this listing of | |
| telephone directory: | names and depart- | |
| | ment numbers: | |
+----------------------+-------------------+--------------------------------+
| Brown J. 555-6|35 Elder Dept. 38| Elder G. 555-1234 Dept. 389 |
| Dickerson B. 555-1|42 Frost Dept. 21| Green P. 555-2240 Dept. 311 |
| Elder G. 555-1|34 Green Dept. 31| McGuff M. 555-5341 Dept. 454 |
| Green P. 555-2|40 McGuff Dept. 45| Wilde C. 555-1234 Dept. 520 |
| Harper M. 555-0|56 Wilde Dept. 52| |
| Johnson M. 555-7|58 | |
| Lewis B. 555-3|37 | |
| McGuff M. 555-5|41 | |
| Wilde C. 555-1|34 | |
+----------------------+-------------------+--------------------------------+
Each line consists of the join field (the last name),
followed by the rest of the line found in "phonedir"
and the rest of the line in "names".
2. To display unmatched lines with the command:
join -a2 phonedir names
+----------------------+-----------------+----------------------------------+
| If "phonedir" con- | and "names" | then join displays: |
| tains: | contains: | |
+----------------------+-----------------+----------------------------------+
| Brown J. 555-6|35 Elder Dept. |89 Elder G. 555-1234 Dept. 389 |
| Dickerson B. 555-1|42 Frost Dept. |17 Frost Dept. 217 |
| Elder G. 555-1|34 Green Dept. |11 Green P. 555-2240 Dept. 311 |
| Green P. 555-2|40 McGuff Dept. |54 McGuff M. 555-5341 Dept. 454 |
| Harper M. 555-0|56 Wilde Dept. |20 Wilde C. 555-1234 Dept. 520 |
| Johnson M. 555-7|58 | |
| Lewis B. 555-3|37 | |
| McGuff M. 555-5|41 | |
| Wilde C. 555-1|34 | |
+----------------------+-----------------+----------------------------------+
This performs the same join operation as in Example
1, and also lists the lines of "names" that have no
match in "phonedir". It includes "Frost"'s name and
department number in the listing, although there is
no entry for "Frost" in "phonedir":
3. To display selected fields:
join -o 2.3 2.1 1.2 1.3 phonedir names
This displays the following fields in the order
given:
Field 3 of "names"
(Department Number)
Field 1 of "names"
(Last Name)
Field 2 of "phonedir" (First
Initial)
Field 3 of "phonedir" (Telephone
Number)
+------------------------+------------------------+------------------------+
| If "phonedir" con- | and "names" contains: | then join displays: |
| tains: | | |
+------------------------+------------------------+------------------------+
| Brown J. 555-623| Elder Dept. 389 | 389 Elder G. 555-123|
| Dickerson B. 555-184| Frost Dept. 217 | 311 Green P. 555-224|
| Elder G. 555-123| Green Dept. 311 | 454 McGuff M. 555-534|
| Green P. 555-224| McGuff Dept. 454 | 520 Wilde C. 555-123|
| Harper M. 555-025| Wilde Dept. 520 | |
| Johnson M. 555-735| | |
| Lewis B. 555-323| | |
| McGuff M. 555-534| | |
| Wilde C. 555-123| | |
+------------------------+------------------------+------------------------+
4. To perform the join operation on a field other than
the first:
sort +2 -3 phonedir | join -j1 3 - numbers
This combines the lines in "phonedir" and "names",
comparing the third field of "phonedir" to the first
field of "numbers".
First, this sorts "phonedir" by the third field,
because both files must be sorted by their join
fields. The output of sort is then piped to join.
The - (minus sign) by itself causes the join command
to use this output as its first file. The -j"1 3"
defines the third field of the sorted "phonedir" as
the join field. This is compared to the first field
of "numbers" because its join field is not specified
with a -j flag.
+--------------+----------------------------------------+
| If "numbers" | then this command displays the names |
| contains: | listed in "phonedir" for each tele- |
| | phone number: |
+--------------+----------------------------------------+
| 555-0256 | 555-0256 Harper M. |
| 555-1234 | 555-1234 Elder G. |
| 555-5555 | 555-1234 Wilde C. |
| 555-7358 | 555-7358 Johnson M. |
+--------------+----------------------------------------+
Note that join lists all the matches for a given
field. In this case, join lists both "Elder G." and
"Wilde C." as having the telephone number "555-1234".
The number "555-5555" is not listed because it does
not appear in "phonedir".
RELATED INFORMATION
The following commands: "awk," "comm," "sort," "cut,"
and "paste."
The environment miscellaneous facility in AIX Operating
System Technical Reference.
The "Overview of International Character Support" in Man-
aging the AIX Operating System.