⇒ join(1) — Reliant UNIX 5.44c4

join(1)                                                             join(1)

NAME
     join - join two files on identical-valued field

SYNOPSIS
     join [option ...] file1 file2

DESCRIPTION
     join compares two files on the basis of relations ("join fields") and
     joins all pairs of lines with identical join fields. The result is
     displayed on the standard output.

     When join is invoked, a join field on which the files are to be
     compared is defined for each of the two files (by default, the first
     field; see options -1 and -2). Fields are delimited by field
     separators. join compares each line in the first file with lines in
     the second and displays one output line on the standard output for
     each pair of lines with identical join fields. The output line
     comprises specific fields from both lines.

   Before the call

     Each input file must be sorted so that the join fields are arranged in
     the currently valid collating sequence [see sort(1)]. If the default
     field separator is used (join without the -t option), leading separa-
     tors must be ignored (see sort, option -b) when the files are sorted.
     However, if you invoke join with option -t, leading field separators
     must be taken into account when sorting the files (see sort without
     option -b).
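
     The preparation step described above can be sketched as follows. This
     is a minimal example, not part of the original manual; the file names
     people and phones and their contents are invented for illustration.

```shell
# Hypothetical sample data (assumption): two unsorted files keyed on field 1.
dir=$(mktemp -d)
printf '%s\n' 'carol Denver' 'alice Boston' 'bob Chicago' > "$dir/people"
printf '%s\n' 'bob 555-0102' 'carol 555-0103' 'alice 555-0101' > "$dir/phones"

# Sort both inputs on the join field. -b ignores leading blanks, which
# matches how join treats the default separators.
sort -b -k 1,1 "$dir/people" > "$dir/people.sorted"
sort -b -k 1,1 "$dir/phones" > "$dir/phones.sorted"

# Join the sorted files on their first field.
sorted_out=$(join "$dir/people.sorted" "$dir/phones.sorted")
printf '%s\n' "$sorted_out"
rm -rf "$dir"
```

     With this sample data the three names pair up one to one, so the
     output has one line per name.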

OPTIONS
     No option specified:
          The first field in a line is the default join field for both
          files; the default separators are blanks, tabs, and newline char-
          acters. Multiple field separators count as one field separator,
          and leading separators are ignored.

          join displays one output line on the standard output for each
          pair of lines with identical join fields. Each output line con-
          sists of the following entries in the given order:

          -  the common field

          -  the rest of the line from the first file

          -  the rest of the line from the second file

          The default output field separator is a blank.
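
          As an illustration of the default behavior described above, the
          following sketch joins two small files whose lines contain
          leading blanks and runs of blanks. The file names left and right
          and their contents are assumptions made for this example.

```shell
# Hypothetical sample files (assumption); the irregular spacing is deliberate.
dir=$(mktemp -d)
printf '%s\n' '  a   x1 x2' 'b x3' > "$dir/left"
printf '%s\n' 'a y1' 'b    y2 y3' > "$dir/right"

# No options: field 1 is the join field, runs of blanks count as one
# separator, and leading blanks are ignored.
default_out=$(join "$dir/left" "$dir/right")
printf '%s\n' "$default_out"
rm -rf "$dir"
```

          Each output line starts with the common field, followed by the
          rest of the line from left and then the rest of the line from
          right, with single blanks between the fields.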









Page 1                       Reliant UNIX 5.44                Printed 11/98

     -1 m or -2 m
          The mth field is defined as the join field for file1 (-1) or
          file2 (-2). Enter an integer greater than or equal to 1 for m.

          -1 or -2 not specified: The join field for the corresponding file
          is the first field.

          The options -1 and -2 correspond to the old option -j, which will
          continue to be supported, but must not be combined with the new
          synopsis. The following correspondence exists between the old and
          the new synopsis:

          -j m is equivalent to -1 m -2 m

          -j1 m is equivalent to -1 m

          -j2 m is equivalent to -2 m
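
          A minimal sketch of the new synopsis, assuming two invented files
          cities (key in field 2) and phones (key in field 1); only the
          options -1 and -2 are used:

```shell
# Hypothetical sample files (assumption): cities is sorted on field 2,
# phones is sorted on field 1.
dir=$(mktemp -d)
printf '%s\n' 'Boston alice' 'Chicago bob' > "$dir/cities"
printf '%s\n' 'alice 555-0101' 'bob 555-0102' > "$dir/phones"

# Use field 2 of the first file and field 1 of the second as join fields.
fields_out=$(join -1 2 -2 1 "$dir/cities" "$dir/phones")
printf '%s\n' "$fields_out"
rm -rf "$dir"
```

          The join field is printed first even though it is the second
          field in cities.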

     -a n In addition to the normal output, prints each line of file n for
          which no matching join field can be found in the other file.

          n can be 1 or 2. To generate output for both files you can enter
          -a 1 -a 2.

          -a cannot be combined with -v.
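
          The effect of -a can be sketched with two invented files, left
          and right, that share only some keys (names and contents are
          assumptions for illustration):

```shell
# Hypothetical sample files (assumption), already sorted on field 1.
dir=$(mktemp -d)
printf '%s\n' 'a 1' 'b 2' 'c 3' > "$dir/left"
printf '%s\n' 'a x' 'c z' 'd w' > "$dir/right"

# Paired lines plus the unpaired lines of file 1.
a1_out=$(join -a 1 "$dir/left" "$dir/right")
# Paired lines plus the unpaired lines of both files.
a12_out=$(join -a 1 -a 2 "$dir/left" "$dir/right")

printf '%s\n' "$a1_out" '--' "$a12_out"
rm -rf "$dir"
```

          Unpaired lines are printed as they are, without any fields from
          the other file.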

     -e string
          Replaces empty output fields with the specified string.
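
          A short sketch of -e, combined here with -a and an explicit -o
          list so that an unpaired line has an empty output field to fill.
          The file names and the string NONE are assumptions chosen for
          this example.

```shell
# Hypothetical sample files (assumption), already sorted on field 1.
dir=$(mktemp -d)
printf '%s\n' 'a 1' 'b 2' 'c 3' > "$dir/left"
printf '%s\n' 'a x' 'c z' 'd w' > "$dir/right"

# Print the join field, field 2 of file 1 and field 2 of file 2.
# Line "b" has no partner in right, so its 2.2 field is empty and is
# replaced by NONE.
e_out=$(join -a 1 -e NONE -o 0,1.2,2.2 "$dir/left" "$dir/right")
printf '%s\n' "$e_out"
rm -rf "$dir"
```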

     -o list
          join changes the output line format, so that each output line
          comprises the individual fields specified in list. The common
          field is not printed unless you explicitly specify it in list.

          The list you specify must consist of elements in the form n.m,
          where n is either 1 or 2, and m is greater than or equal to 1.
          Each element in the form n.m stands for the mth field in the nth
          file. Enter 0 (zero) for the join field. The elements are
          separated by commas.

          Blanks or tabs may still be used as separators, but must not be
          combined with the new syntax.
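
          The comma-separated form of the list can be sketched as follows;
          the files left and right are invented for illustration.

```shell
# Hypothetical sample files (assumption), already sorted on field 1.
dir=$(mktemp -d)
printf '%s\n' 'a 1' 'b 2' 'c 3' > "$dir/left"
printf '%s\n' 'a x' 'c z' 'd w' > "$dir/right"

# Only the listed fields are printed; the join field is left out here.
o_plain=$(join -o 2.2,1.2 "$dir/left" "$dir/right")
# Element 0 adds the join field explicitly.
o_key=$(join -o 0,2.2,1.2 "$dir/left" "$dir/right")

printf '%s\n' "$o_plain" '--' "$o_key"
rm -rf "$dir"
```

          The fields appear in the order given in the list, regardless of
          which file they come from.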

     -t c Defines character c as a field separator for both input and out-
          put lines. Each occurrence of c is interpreted as a field separa-
          tor, i.e.

          -  two consecutive c separators designate an empty field, and

          -  a leading c is significant and designates an empty first
             field.


          In addition, the newline character acts as a field separator for
          the input lines.

          The default field separators (blanks and tabs) are interpreted as
          field separators only if you specify them as a value for c.
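
          A minimal sketch of -t with a colon as separator. The files users
          and shells are invented; the first line of users contains two
          consecutive colons, that is, an empty second field.

```shell
# Hypothetical colon-separated sample files (assumption), sorted on field 1.
dir=$(mktemp -d)
printf '%s\n' 'alice::Boston' 'bob:x:Chicago' > "$dir/users"
printf '%s\n' 'alice:/bin/sh' 'bob:/bin/ksh' > "$dir/shells"

# The colon separates fields on input and on output; the empty field
# between the two colons is preserved.
t_out=$(join -t : "$dir/users" "$dir/shells")
printf '%s\n' "$t_out"
rm -rf "$dir"
```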

     -v n (v - vice versa) join outputs only the lines of the nth input
          file whose join field does not match the join field of any line
          in the other file.

          You can specify 1 or 2 for n. To produce output for both files,
          specify -v 1 -v 2.

          -v cannot be combined with -a.
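
          The following sketch shows the unpaired lines selected by the
          option; the files left and right are invented for illustration.

```shell
# Hypothetical sample files (assumption), already sorted on field 1.
dir=$(mktemp -d)
printf '%s\n' 'a 1' 'b 2' 'c 3' > "$dir/left"
printf '%s\n' 'a x' 'c z' 'd w' > "$dir/right"

# Lines of file 1 without a partner in file 2.
v1_out=$(join -v 1 "$dir/left" "$dir/right")
# Unpaired lines of both files; paired lines are suppressed.
v12_out=$(join -v 1 -v 2 "$dir/left" "$dir/right")

printf '%s\n' "$v1_out" '--' "$v12_out"
rm -rf "$dir"
```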

     --   If file1 begins with a dash (-), the end of the command-line
          options must be marked with --.
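
          As a sketch, assuming an invented file whose name begins with a
          dash (-left) in the current directory:

```shell
# Hypothetical sample files (assumption); the first name starts with a dash.
dir=$(mktemp -d)
printf '%s\n' 'a 1' 'b 2' > "$dir/-left"
printf '%s\n' 'a x' 'b y' > "$dir/right"

# -- ends option processing, so -left is taken as a file name.
dash_out=$(cd "$dir" && join -- -left right)
printf '%s\n' "$dash_out"
rm -rf "$dir"
```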

     file1 file2
          Names of the two files to be joined on the basis of common fields
          by join.

          If you use a dash (-) as the name for file1, join reads from
          standard input.

          Caution:

          If the files are not sorted on their join fields, join will not
          process all lines!

          Problems may arise if a numeric file name (e.g. 1.2) is specified
          for file1 and the -o option immediately precedes this file name,
          because the name can be mistaken for an element of the -o list.
          To avoid such conflicts, prefix a numeric file name with ./
          (e.g. ./1.2).
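
          The workaround can be sketched as follows, assuming an invented
          file that is literally named 1.2 and a second file named data:

```shell
# Hypothetical sample files (assumption); the first one has a numeric name.
dir=$(mktemp -d)
printf '%s\n' 'a x' 'b y' > "$dir/1.2"
printf '%s\n' 'a 1' 'b 2' > "$dir/data"

# Writing ./1.2 makes it unambiguous that the operand after the -o list
# is a file name and not another n.m list element.
numeric_out=$(cd "$dir" && join -o 0,1.2,2.2 ./1.2 data)
printf '%s\n' "$numeric_out"
rm -rf "$dir"
```

          In the list itself, 1.2 still means field 2 of the first file.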

LOCALE
     The LC_MESSAGES environment variable governs the language in which
     message texts are displayed.

     LC_COLLATE governs the collating sequence.

     LC_CTYPE governs character classes and character conversion (case
     shifting).

     If LC_MESSAGES, LC_COLLATE or LC_CTYPE is undefined or is defined as
     the null string, it defaults to the value of LANG. If LANG is likewise
     undefined or null, the system acts as if it were not
     internationalized.

     If any of the locale variables has an invalid value, the system acts
     as if none of the variables were set.

     The LC_ALL environment variable governs the entire locale. LC_ALL
     takes precedence over all the other environment variables which affect
     internationalization.
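
     Because sort and join both depend on the collating sequence, one way
     to keep them consistent is to run both under the same locale. The
     sketch below pins LC_ALL to C for each command; the file names and
     contents are invented for illustration.

```shell
# Hypothetical sample data (assumption) with mixed-case keys.
dir=$(mktemp -d)
printf '%s\n' 'alice Boston' 'Bob Chicago' > "$dir/people"
printf '%s\n' 'alice 555-0101' 'Bob 555-0102' > "$dir/phones"

# Sort and join under the same collating sequence. In the C locale,
# uppercase letters sort before lowercase ones, so Bob precedes alice.
LC_ALL=C sort -k 1,1 "$dir/people" > "$dir/people.sorted"
LC_ALL=C sort -k 1,1 "$dir/phones" > "$dir/phones.sorted"
locale_out=$(LC_ALL=C join "$dir/people.sorted" "$dir/phones.sorted")

printf '%s\n' "$locale_out"
rm -rf "$dir"
```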

EXAMPLES
   Example 1

     In the file place, a place is assigned to a name. In the file amount,
     an amount and a date are assigned to the same names. Both files are
     sorted by name. join is to join the two files on the names:

     Contents of place:

     Albert Buffalo
     Hugh Washington
     Irene Philadelphia

     Contents of amount:

     Albert   287.56  20.03.88
     Hugh      23.15  25.06.87
     Hugh     167.87  16.12.87
     Irene   1212.12  12.12.88
     Irene      1.98  01.01.88

     Join the two files on the first join field:

     $ join place amount
     Albert Buffalo 287.56 20.03.88
     Hugh Washington 23.15 25.06.87
     Hugh Washington 167.87 16.12.87
     Irene Philadelphia 1212.12 12.12.88
     Irene Philadelphia 1.98 01.01.88

     Join the two files and format in columns with awk:

     $ join place amount | \
     > awk '{printf("%-10s %-15s %-10s %-10s\n",$1,$2,$3,$4)}'
     Albert     Buffalo         287.56     20.03.88
     Hugh       Washington      23.15      25.06.87
     Hugh       Washington      167.87     16.12.87
     Irene      Philadelphia    1212.12    12.12.88
     Irene      Philadelphia    1.98       01.01.88

   Example 2

     In the file city, a name is assigned to a city. In the file amount
     (see Example 1), an amount and a date are assigned to a name. city is
     sorted by cities, amount by names. join is to join the two files on
     the names.

     Contents of city:

     Buffalo      Albert
     Buffalo      Frank
     New York     Eric
     Philadelphia Irene
     Washington   Hugh

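     In this example, the join field for city is field 2, while that of
     amount is field 1. Note that, with the default separators, the line
     for New York consists of three fields, so its join field is York; it
     has no match in amount and does not appear in the output.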

     Before the files are joined, city must be sorted on field 2. The out-
     put is subsequently formatted into columns with awk:

     $ sort -b -k 2 city | join -1 2 - amount | \
     > awk '{printf("%-10s %-15s %-10s %-10s\n",$1,$2,$3,$4)}'
     Albert     Buffalo         287.56     20.03.88
     Hugh       Washington      23.15      25.06.87
     Hugh       Washington      167.87     16.12.87
     Irene      Philadelphia    1212.12    12.12.88
     Irene      Philadelphia    1.98       01.01.88

SEE ALSO
     awk(1), comm(1), sort(1), uniq(1).




