⇒ uniq(1) — Reliant UNIX 5.44c4

uniq(1)                                                             uniq(1)

NAME
     uniq - report repeated lines

SYNOPSIS
     uniq [option ...] [inputfile [outputfile]]

DESCRIPTION
     The uniq command searches a file for sequences of identical adjacent
     lines and writes the file to standard output (or to outputfile, if
     given), keeping only one line of each such sequence. Repeated lines
     are only detected when they are adjacent, so the input file should
     normally be sorted first.
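
     The adjacency requirement can be illustrated with a minimal sketch.
     The sample data is hypothetical: the line "b" occurs twice, but not
     on adjacent lines, so uniq alone does not merge it.

```shell
# Hypothetical sample input: "b" appears twice, separated by "a".
unsorted=$(printf 'b\na\nb\n' | uniq)
sorted=$(printf 'b\na\nb\n' | sort | uniq)

# Without sorting, both "b" lines survive; after sort(1) they are
# adjacent and collapse into one.
printf 'without sort:\n%s\n' "$unsorted"
printf 'with sort:\n%s\n' "$sorted"
```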

OPTIONS
     The options -c, -d, and -u must not be combined.

     No option specified:
          The named inputfile is output without repeated lines.

     -c   Outputs all lines without repetitions, starting each line with a
          decimal number indicating how many times the line occurred in
          succession in inputfile. Counts are printed right-justified up to
          column 4; actual lines begin on column 6. uniq ignores the -u and
          -d options if set with the -c option.
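
          A small sketch of -c on unsorted input, using hypothetical
          sample data. Because the width of the count column may differ
          on other uniq implementations, awk is used here (as an
          assumption, not part of uniq) to normalise each line to
          "count line".

```shell
# Hypothetical input: three "x", one "y", then another "x".
# uniq -c counts each run of adjacent lines separately.
counts=$(printf 'x\nx\nx\ny\nx\n' | uniq -c | awk '{print $1, $2}')
printf '%s\n' "$counts"
```

          The trailing "x" is reported as its own run with a count of 1,
          because it is not adjacent to the first three.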

     -d   Outputs one copy each of only those lines that are repeated in
          inputfile.
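
          For illustration, a sketch of -d on hypothetical sample data in
          which "a" and "c" are repeated and "b" is not:

```shell
# Only lines that occur more than once in succession are output,
# one copy each.
dups=$(printf 'a\na\nb\nc\nc\nc\n' | uniq -d)
printf '%s\n' "$dups"
```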

     -s n Causes the first n characters from the beginning of the line to
          be ignored when comparing for duplicates.

          If the -s option is combined with the -f option, the first n
          characters after the mth field are ignored. Blanks following the
          mth field are not ignored: they must be allowed for in the value
          of n.

          This corresponds to the old option +n, which is still supported,
          but must not be combined with the new synopsis (-f or -s).

          -s not specified:

          Lines are compared from the beginning of the line or beginning
          with field m+1 (option -f).
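
          A sketch of -s, assuming hypothetical input lines that start
          with a three-character identifier ("01:", "02:", ...) which
          should not take part in the comparison:

```shell
# Skip the first 3 characters of each line when comparing.
# "01:foo" and "02:foo" then compare equal; the first of them is kept.
skipped=$(printf '01:foo\n02:foo\n03:bar\n' | uniq -s 3)
printf '%s\n' "$skipped"
```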

     -f m Ignores the first m fields from the beginning of the line, plus
          any tabs or blanks located in front of a field, when comparing
          for duplicates. A field is a string of non-blank characters
          separated from its neighbors by tabs or blanks.

          This corresponds to the old option -m, which is still supported,
          but must not be combined with the new synopsis (-f or -s).

          -f not specified:

          Lines are compared from the beginning of the line or beginning
          with character n+1 (option -s).
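
          The following sketch shows -f on its own and combined with -s.
          The log-style sample lines are hypothetical. In the combined
          case, the blank that follows the skipped field counts toward
          n, as described under -s.

```shell
# -f 1: ignore the leading timestamp field when comparing.
by_field=$(printf '10:01 alice login\n10:02 alice login\n10:03 bob login\n' \
    | uniq -f 1)
printf '%s\n' "$by_field"

# -f 1 -s 2: after field 1, skip two more characters: the separating
# blank plus one character of the second field. Only the last character
# of each line ("A" or "B") is then compared.
by_both=$(printf '1 xA\n2 yA\n3 zB\n' | uniq -f 1 -s 2)
printf '%s\n' "$by_both"
```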

     -u   Outputs only the lines that are not repeated in inputfile.
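
          As a sketch, -u applied to hypothetical sample data in which
          only "b" is not repeated:

```shell
# Lines that have an adjacent duplicate are dropped entirely.
singles=$(printf 'a\na\nb\nc\nc\nc\n' | uniq -u)
printf '%s\n' "$singles"
```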

     --   If inputfile begins with a dash (-), the end of the command-line
          options must be marked with --.
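
          A minimal sketch of the -- marker. The file name "-data" is
          hypothetical, and the scratch directory is created with
          mktemp -d, which is assumed to be available on the system.

```shell
# Create a scratch directory holding a file whose name begins with a dash.
dir=$(mktemp -d)
printf 'a\na\nb\n' > "$dir/-data"

# "--" ends option parsing, so "-data" is taken as inputfile
# rather than as a group of options.
dashed=$(cd "$dir" && uniq -- -data)
printf '%s\n' "$dashed"

rm -rf "$dir"
```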

     inputfile
          Name of the file that is to be examined. If you specify a dash -
          for inputfile, uniq reads from the standard input.

          inputfile not specified: uniq reads from standard input.

     outputfile
          Name of the file to which the output is to be written. If you
          specify a dash - for outputfile, uniq writes to the standard
          output.

          outputfile not specified: uniq writes to standard output.
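
          A sketch of reading from standard input while writing to a
          named outputfile. The dash stands for inputfile; the output
          path "out.txt" and the use of mktemp -d are assumptions made
          for the example.

```shell
dir=$(mktemp -d)

# "-" as inputfile: read standard input; second operand: outputfile.
printf 'a\na\nb\n' | uniq - "$dir/out.txt"

result=$(cat "$dir/out.txt")
printf '%s\n' "$result"

rm -rf "$dir"
```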

LOCALE
     The LC_MESSAGES environment variable governs the language in which
     message texts are displayed.

     LC_CTYPE governs character classes and character conversion
     (shifting).

     If LC_MESSAGES or LC_CTYPE is undefined or is defined as the null
     string, it defaults to the value of LANG. If LANG is likewise
     undefined or null, the system acts as if it were not
     internationalized.

     The LC_ALL environment variable governs the entire locale. LC_ALL
     takes precedence over all the other environment variables which affect
     internationalization.

     If any of the locale variables has an invalid value, the system acts
     as if none of the variables were set.

EXAMPLES
     Example 1

     You want to search a file for identical lines, regardless of where
     they are located in the file. A count showing how often each of these
     lines occurs is also to be output.

     $ sort file | uniq -c
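
     A worked sketch of this pipeline on hypothetical file contents. The
     temporary file is created with mktemp (assumed to be available),
     and awk is used only to normalise the width of the count column.

```shell
# Hypothetical file with repeated lines scattered throughout.
file=$(mktemp)
printf 'pear\napple\npear\nfig\napple\npear\n' > "$file"

# sort brings identical lines together; uniq -c then counts them.
tally=$(sort "$file" | uniq -c | awk '{print $1, $2}')
printf '%s\n' "$tally"

rm -f "$file"
```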

     Example 2

     You want to output the 10 most frequently occurring words in the file
     text.

     $ cat text | sed 's/[      ][      ]*/\
     > /g' | sed '/^$/d' | sort | uniq -c | sort -rn | head

     Explanation:

     -  The first sed call generates a list of all words from text, by
        replacing consecutive tabs or blanks by a newline character. The
        square brackets each contain a tab and a blank.

     -  The second sed call deletes all blank lines.

     -  sort sorts the generated list in ASCII collating sequence.

     -  uniq -c removes duplicate lines from the sorted list and precedes
        each remaining line with a count indicating its frequency of
        occurrence.

     -  sort -rn sorts this frequency list in reverse numeric order, i.e.
        the most frequent line appears first and the line with the fewest
        repetitions appears last.

     -  head prints the first 10 lines of the list.
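
     A variant of this pipeline, sketched on hypothetical sample text. It
     swaps tr -s in for the first sed call so that no literal tab has to
     be typed; this is an alternative for illustration, not the method
     shown in the manual. awk is used only to normalise the count column.

```shell
# Hypothetical text containing blanks, a double blank and a tab.
# tr maps blanks and tabs to newlines and squeezes repeated newlines.
top=$(printf 'the cat  the dog\tthe end\n' \
    | tr -s ' \t' '\n\n' \
    | sed '/^$/d' \
    | sort | uniq -c | sort -rn \
    | head -n 1 \
    | awk '{print $1, $2}')

# Most frequent word with its count.
printf '%s\n' "$top"
```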

SEE ALSO
     comm(1), sort(1).

Page 3                       Reliant UNIX 5.44                Printed 11/98
