sort(1), Reliant UNIX 5.44c4

sort(1)                                                             sort(1)

NAME
     sort - sort and/or merge files

SYNOPSIS
     sort [option ...] [file ...]

DESCRIPTION
     sort sorts lines in an input file and writes the result on the stan-
     dard output.

     If you specify more than one file, sort sorts and merges the files in
     the same operation, i.e. the contents of all input files are sorted
     and printed together.

     Sorting can be performed either by whole lines or by specific parts of
     lines, known as sort keys. If you wish to sort by whole lines, you do
     not specify any sort keys; one or more keys can be used to sort by
     particular portions of lines. A sort key is defined by specifying the
     positions of fields in a line in the form +pos1 -pos2 (see Defining
     specific sort keys).

     sort divides the lines of a file into fields. A field is a string of
     characters that is delimited by a field separator or a newline. Blanks
     and tabs are the default field separators. In a sequence of one or
     more default separators, all separators are part of the next field.
     Leading blanks at the beginning of a line thus by default form part of
     the first field.

OPTIONS
     No option specified
          sort sorts the input lines lexicographically, treating each byte
          as a single character. Characters are ordered according to the
          collating sequence defined by LC_COLLATE.
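
          As a rough illustration of the default ordering, the sketch below
          forces the C locale. This is an assumption made here so that the
          result is reproducible; other LC_COLLATE settings may order the
          same lines differently. The sample data is invented.

```shell
# Assumption: LC_ALL=C selects plain byte-value ordering, in which
# digits precede uppercase letters and uppercase precedes lowercase.
sorted=$(printf 'b\nB\n1\na\n' | LC_ALL=C sort)
printf '%s\n' "$sorted"
```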

   Options that alter the behavior of sort

     -c   sort checks whether the input file is already sorted according to
          the current ordering rules. If it is, nothing is output; other-
          wise, the first line that does not match the ordering rules is
          displayed.

          Only one file may be specified with option -c! The options -m and
          -o must not be combined with -c.

          Together with -u: sort also checks that no two lines have
          identical sort keys.
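
          The checks described above can be tried out with a small sketch
          like the following. The helper function check and the sample
          lines are hypothetical; the C locale is forced only to keep the
          ordering predictable.

```shell
# check: hypothetical helper that prints the exit status of "sort -c"
# (plus any extra options) for the lines arriving on standard input.
check() {
    if LC_ALL=C sort -c "$@" >/dev/null 2>&1; then
        echo 0
    else
        echo $?
    fi
}

ok_status=$(printf 'a\nb\nc\n' | check)       # already sorted
bad_status=$(printf 'a\nc\nb\n' | check)      # "b" is out of order
dup_status=$(printf 'a\na\nb\n' | check -u)   # sorted, but a key repeats

echo "$ok_status $bad_status $dup_status"
```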

     -m   sort merges input files which are already sorted.

          -m must not be combined with -c.
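
          A minimal merge sketch, assuming that mktemp -d is available for
          creating a scratch directory. The file names first and second are
          invented for the illustration, and both files are written in
          sorted order beforehand.

```shell
# Two presorted input files in a scratch directory (hypothetical names).
tmpdir=$(mktemp -d)
printf 'apple\ncherry\n' > "$tmpdir/first"
printf 'banana\ndate\n'  > "$tmpdir/second"

# -m only merges; it relies on the inputs being sorted already.
merged=$(LC_ALL=C sort -m "$tmpdir/first" "$tmpdir/second")
printf '%s\n' "$merged"
rm -r "$tmpdir"
```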

Page 1                       Reliant UNIX 5.44                Printed 11/98

     -o outputfile
          outputfile is the name of a file to which the sorted contents of
          the input file are to be written. The file named as outputfile
          can also be one of the input files, but in this case the original
          unsorted contents of the named file are overwritten.

          The -o option may be specified only once. -o must not be combined
          with -c.

          -o outputfile not specified: sort writes on the standard output.
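
          The following sketch sorts a file in place by naming it both as
          input and as outputfile. The file name fruit and its contents are
          invented, and mktemp -d is assumed to be available.

```shell
# Sort a scratch file in place: the same name is input and -o target.
tmpdir=$(mktemp -d)
printf 'pear\nfig\nkiwi\n' > "$tmpdir/fruit"
LC_ALL=C sort -o "$tmpdir/fruit" "$tmpdir/fruit"

in_place=$(cat "$tmpdir/fruit")
printf '%s\n' "$in_place"
rm -r "$tmpdir"
```

          A plain redirection such as "sort fruit > fruit" would not be
          equivalent, because the shell truncates the file before sort
          reads it.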

     -T directory
          Specifies a directory for temporary files.

          -T not specified: Temporary files are created in /var/tmp.

     -u   (unique) Causes identical lines to be output once only. Lines
          with identical sort keys are considered identical lines.
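
          A small sketch of -u combined with a sort key, using invented
          data. Two lines share the key 1, so only one of them is output;
          which of the two survives is not asserted here.

```shell
# Lines "1 a" and "1 b" have the same first field, i.e. the same key.
unique_keys=$(printf '1 a\n1 b\n2 c\n' | LC_ALL=C sort -u -k 1,1)
printf '%s\n' "$unique_keys"

# Count the surviving lines.
line_count=$(printf '%s\n' "$unique_keys" | awk 'END { print NR }')
echo "$line_count"
```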

     -y [kmem]
          Option -y defines the amount of memory that sort starts with.
          This initial size has a large impact on sorting speed: sorting a
          small file in a large amount of memory wastes memory, and sorting
          a large file in a small amount of memory wastes CPU time.

          kmem   Amount of memory (in Kbytes) initially assigned to sort.
                 If you assign a value above the maximum of 1 Mbyte or
                 below the minimum of 16 Kbyte, the corresponding extremum
                 will be used. Thus if you define a value of 0 (-y0), for
                 example, sort will start with minimum memory.

                 kmem not specified: sort starts with maximum memory.

          -y [kmem] not specified:

          sort starts with a system default memory size (32 Kbytes), and
          continues to use more space if required.

     -z recsz
          With this option you allocate correctly sized buffers for the
          merge phase. You only need to do this if you are using option -c
          or -m, i.e. if you are not actually sorting the files.

          If you are sorting the files, sort records the size of the long-
          est line read in the sort phase so that buffers of the correct
          size can be allocated during the merge phase.

          If you are not sorting the files, sort normally uses a default
          value for the buffer size. Lines longer than this will cause sort
          to terminate abnormally. Supplying the actual number of bytes in
          the longest line to be merged (or some larger value) will prevent
          abnormal termination.

   Options that alter ordering rules

     The following options can be specified in either of two ways:

     -  either as options before the first positioning specification:

        They are then valid globally for all subsequent sort keys. When
        using -k, the options must be placed before the first specification
        of -k; in the case of +pos or -pos, the options can also be placed
        between the additional positioning specifications, and are then
        only valid for the subsequent sort keys.

     -  or as modifiers for individual sort keys:

        The modifiers then override the global settings for that sort key,
        i.e. only the ordering rules given as modifiers apply to it.

        Option letters appended to the field specification without a dash
        or a blank act as modifiers (see Defining specific sort keys).

     -b   Ignores leading field separators when determining the start and
          end of a sort key. Note that the b option is only effective when
          sorting is based on sort keys (i.e. not on the whole line).
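
          The effect of -b can be sketched with two invented lines whose
          second fields are preceded by a different number of blanks.
          Without -b the blanks belong to the key; with -b they are
          skipped. The C locale is assumed so that a blank sorts before
          any letter.

```shell
# Without -b the keys are "  b" and " a": the extra blank sorts first.
plain=$(printf 'x  b\nx a\n' | LC_ALL=C sort -k 2,2)

# With -b the keys are "b" and "a".
skipped=$(printf 'x  b\nx a\n' | LC_ALL=C sort -b -k 2,2)

printf '%s\n' "$plain"
printf '%s\n' "$skipped"
```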

     -d   Performs a lexicographical sort, taking into account only the
          characters for which the C functions isalnum(3C) and isspace(3C)
          return a value of "true". These are the characters defined in the
          current locale as alphanumeric letters, digits, or characters
          producing white space, such as blanks or tabs.
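
          A brief sketch of -d with invented data: the hyphen is neither
          alphanumeric nor white space, so it is ignored in the comparison.
          The C locale is assumed.

```shell
# Byte order: "-" sorts before "b", so "a-c" comes first.
byte_order=$(printf 'a-c\nab\n' | LC_ALL=C sort)

# With -d the hyphen is ignored and "ac" is compared with "ab".
dict_order=$(printf 'a-c\nab\n' | LC_ALL=C sort -d)

printf '%s\n' "$byte_order"
printf '%s\n' "$dict_order"
```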

     -f   Folds lowercase into uppercase before sorting, thus making no
          distinction between them.
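
          The difference made by -f can be sketched as follows, assuming
          the C locale, in which uppercase letters otherwise sort before
          lowercase ones. The words are invented sample data.

```shell
# Case-sensitive byte order puts "Banana" before "apple".
case_sensitive=$(printf 'apple\nBanana\ncherry\n' | LC_ALL=C sort)

# With -f the comparison is made on APPLE, BANANA, CHERRY.
folded=$(printf 'apple\nBanana\ncherry\n' | LC_ALL=C sort -f)

printf '%s\n' "$case_sensitive"
printf '%s\n' "$folded"
```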

     -i   In non-numeric comparisons, ignores all characters for which the
          C function isprint(3C) returns a value of "false", i.e. all char-
          acters defined as non-printing in the current locale. If the col-
          lating sequence is based on the ASCII table, for example, charac-
          ters 001 through 037 (octal) and character 0177 (octal) are
          ignored [see ascii(5)].




     -M   The first three characters of the sort key are converted to
          uppercase, treated as names of months, and collated in calendar
          order. The -M option implies the -b option.
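
          A minimal sketch of month ordering, assuming the C locale and its
          English month abbreviations. The mixed case in the invented input
          shows that the comparison does not depend on case.

```shell
# Month abbreviations in mixed case, out of calendar order.
by_month=$(printf 'mar 3\nJAN 1\nFeb 2\n' | LC_ALL=C sort -M)
printf '%s\n' "$by_month"
```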

     -n   Sorts numerically. A numeric value must come first in the sort
          key and may consist of: blanks, minus signs, digits 0-9, and a
          decimal point. The -n option implies the -b option, i.e. leading
          blanks are ignored.
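
          The contrast between lexicographic and numeric ordering can be
          sketched with a few invented values. The C locale is assumed, so
          the radix character is a period.

```shell
# Lexicographic order compares character by character: "10" < "2.5".
as_text=$(printf '10\n9\n-1\n2.5\n' | LC_ALL=C sort)

# Numeric order compares the values themselves.
as_numbers=$(printf '10\n9\n-1\n2.5\n' | LC_ALL=C sort -n)

printf '%s\n' "$as_text"
printf '%s\n' "$as_numbers"
```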

     -r   Reverses the collating sequence (sorting order).

   Option that alters field separators

     -t x Uses the character you specify for x as the field separator.
          Unlike default field separators, x is itself not part of a field.
          It may, however, be part of a sort key, for example if the sort
          key extends from the first to the third x-separated field. Every
          field separator x is significant, i.e. xx delimits an empty
          field.

          -t not specified:

          The default field separators apply (blanks and tabs). A sequence
          of one or more default field separators forms part of the follow-
          ing field.
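
          A short sketch of -t with a colon as separator, using invented
          lines. In the first line the two adjacent colons delimit an empty
          second field, so b is still the third field.

```shell
# Third fields are "b" and "a"; the empty second field in "x::b" counts.
by_third=$(printf 'x::b\ny:z:a\n' | LC_ALL=C sort -t : -k 3,3)
printf '%s\n' "$by_third"
```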

   Defining specific sort keys

     When defining sort keys please note that sequences of letters defined
     as one collating element in the current locale count as a single
     letter. In a Spanish locale, for example, ch is a single collating
     element.

     Specifying sort keys with the new synopsis -k fieldseparator has the
     same effect as using the old synopsis +pos1 or -pos2, but the two must
     not be combined. Conversion to the new synopsis is recommended.

     You can specify several sort keys. Where there are several sort keys,
     sort first sorts by the first sort key, moves on to the next if the
     first sort key is equal, and so on.
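
     Sorting by several keys might look like the sketch below, which uses
     invented data: the first field is compared as text, and lines with
     equal first fields are then ordered by the numeric value of the second
     field.

```shell
# Primary key: field 1 as text. Secondary key: field 2 as a number.
two_keys=$(printf 'b 2\na 10\na 9\n' | LC_ALL=C sort -k 1,1 -k 2n,2)
printf '%s\n' "$two_keys"
```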

     -k fieldseparator
          With -k you define start and end of a sort key. In
          fieldseparator you define the first and last character of the
          sort key.

          fieldseparator has the following format:

          startfield[type][,endfield[type]]

          whereby the number of the field and a character in the field can
          be specified for startfield and endfield:


          m[.n]

          m and n are integers with the following significance:

          m    m specifies the number of the first or last field.

          .n   n specifies the number of the first character used in the
               first field or the number of the last character used in the
               last field.

               n not specified:

               The sort key starts with the first character of the first
               field or ends with the last character of the last field.

          type modifies the sort key (see Options that alter ordering
               rules).

     +pos1 [-pos2]
          +pos1 and -pos2 specify the start and end of a sort key on the
          basis of the fields in the input lines.

          +pos1   is the position of the first character in the sort key;

          -pos2   refers to the first character after it. +pos1 must come
                  before -pos2.

          -pos2 not specified:

          The sort key extends from +pos1 to the end of the line.

          The pos1 and pos2 arguments have the form:

          m[.n][type]

          where m and n are integers with the following significance:

          m    Skips m fields of the line, addressing field m+1.

          .n   Skips n characters plus the field separator as of the last
               character of field m, thus addressing character n+1 within
               field m+1. If the -b option is in effect, field separators
               at the start of a field are not counted; thus, +m.nb refers
               to the n+1th non-whitespace character after field m.

               .n not specified:

               Is equivalent to .0 and refers to the first character after
               field m. If the -b option is in effect, field separators at
               the start of a field are not counted; thus, +m.0b refers to
               the first non-whitespace character in the m+1th field.



          type Modifies the sort key (see Options that alter ordering
               rules).

     Example:
          To specify a sort key that begins with the fourth character in
          the second field and ends with this field, you enter:

          sort -k 2.4,2 (new synopsis) or

          sort +1.3 -2 (old synopsis)

          Explanation:

                         End         End     End
                         Field1      Field2  Field3
                              |           |       |
                     030-456537 A.Mackenzie  Dublin
                                  |       |
                                  Sort key:

          2.4  Start at the 4th character of the 2nd field

          +1.3 Skip field 1 and 3 characters:

               the 4th character after field 1 is the 1st character in the
               sort key: M

          -2   Skip field 2 and 0 characters:

               the 1st character after field 2 is the 1st character after
               the sort field: blank. Thus the character before is the last
               character in the sort key: n

               Note that default field separators, unlike those defined
               with option -t, are part of the following field. Hence the
               first character of field 2 is the blank, the second charac-
               ter is the A, and so on.
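
          The example can be tried out with a sketch like the following,
          which uses only the -k form. The first line is taken from the
          explanation above; the other two lines are invented. awk is used
          merely to print the second field of each sorted line.

```shell
# Three sample records; only the first comes from the manual page.
records='030-456537 A.Mackenzie  Dublin
040-111111 Z.Adams Cork
050-222222 B.Nolan Galway'

# Whole second field as key: " A.Mackenzie", " Z.Adams", " B.Nolan".
by_field=$(printf '%s\n' "$records" | LC_ALL=C sort -k 2,2 |
    awk '{ print $2 }')

# Key from the 4th character of field 2: "Mackenzie", "Adams", "Nolan".
by_surname=$(printf '%s\n' "$records" | LC_ALL=C sort -k 2.4,2 |
    awk '{ print $2 }')

printf '%s\n' "$by_field"
printf '%s\n' "$by_surname"
```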

     --   End of the list of options. Must be specified if file begins with
          -.

     file Name of the file you wish to sort.

          You may name more than one file. The input lines from all named
          files are sorted together and written to standard output. In the
          input files, any letter sequence defined as a collating element
          in the current locale counts as a single letter. Thus in a
          Spanish locale ch is a single collating element. When the last
          line in an input file is missing a newline character, sort
          appends one, issues a warning, and continues.



          Only one file may be specified together with the -c option.

          If you use a dash (-) as the name for file, sort reads from stan-
          dard input.

          file not specified: sort reads from standard input.
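
          A dash can be mixed with ordinary file names, as in this sketch.
          The file name stored and all sample lines are invented, and
          mktemp -d is assumed to be available.

```shell
# Sort standard input together with the contents of a scratch file.
tmpdir=$(mktemp -d)
printf 'b\nd\n' > "$tmpdir/stored"

combined=$(printf 'c\na\n' | LC_ALL=C sort - "$tmpdir/stored")
printf '%s\n' "$combined"
rm -r "$tmpdir"
```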

EXIT STATUS
     0   All input files were processed correctly. The input file was
         sorted correctly when -c was specified.

     1   -c specified: The input file was not sorted correctly. -c -u
         specified: Lines of input with identical sort keys were found.

     >1  Error

LOCALE
     The LC_MESSAGES environment variable governs the language in which
     message texts are displayed.

     LC_COLLATE governs the preset collating sequence used by the sort
     command.

     LC_CTYPE governs how character classes are handled by the -b, -d, -f
     and -i options.

     LC_NUMERIC governs the form of the radix character (decimal point) in
     conjunction with the -n option.

     LC_TIME governs the currently valid month names, their abbreviations
     and their collating sequence in conjunction with option -M.

     Answers to yes/no queries must be given in the language appropriate to
     the current locale.

     If LC_MESSAGES, LC_COLLATE, LC_CTYPE, LC_NUMERIC or LC_TIME is
     undefined or is defined as the null string, it defaults to the value
     of LANG. If LANG is likewise undefined or null, the system acts as if
     it were not internationalized.

     The LC_ALL environment variable governs the entire locale. LC_ALL
     takes precedence over all the other environment variables which affect
     internationalization.

     If any of the locale variables has an invalid value, the system acts
     as if none of the variables were set.
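
     The precedence rule can be sketched as follows. The locale name
     en_US.UTF-8 is only an example value and need not be installed,
     because the setting that takes precedence selects the C locale, in
     which uppercase letters sort before lowercase ones.

```shell
# The third assignment overrides the two before it, so the byte-wise
# C ordering applies regardless of the example locale name.
forced_c=$(printf 'a\nB\n' |
    env LANG=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_ALL=C sort)
printf '%s\n' "$forced_c"
```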

EXAMPLES
     Example 1

     Sorting the contents of inputfile with the second field as the sort
     key.

     $ sort -k 2,2 inputfile

     Example 2

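     Sorting the contents of inputfile1 and inputfile2 in reverse order,
     placing the output in outputfile, and using the first two characters
     of the second field as the sort key. With the default field
     separators, the first of these characters is a separator that
     precedes the field.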

     $ sort -r -o outputfile -k 2.1,2.2 inputfile1 inputfile2

     Example 3

     Sorting the contents of inputfile1 and inputfile2 in reverse order,
     placing the output in outputfile, and using the first non-blank
     character of the second field as the sort key.

     $ sort -r -o outputfile -k 2.1b,2.1b inputfile1 inputfile2

     Example 4

     Displaying the /etc/passwd file, sorted by the numeric user ID (field
     3).

     $ sort -t : -k 3n,3 /etc/passwd
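
     To try the same command without touching the real password file, a
     sketch along the following lines can be used. The file passwd.sample
     and its three entries are invented, mktemp -d is assumed to be
     available, and cut only trims the output to the login name and user
     ID.

```shell
# A scratch file in passwd format with user IDs 1002, 5 and 100.
tmpdir=$(mktemp -d)
cat > "$tmpdir/passwd.sample" <<'EOF'
carol:x:1002:100:Carol:/home/carol:/bin/sh
daemon:x:5:5:Daemon:/:/bin/sh
alice:x:100:100:Alice:/home/alice:/bin/sh
EOF

# Numeric sort on field 3; a textual sort would put 100 before 5.
by_uid=$(LC_ALL=C sort -t : -k 3n,3 "$tmpdir/passwd.sample" |
    cut -d : -f 1,3)
printf '%s\n' "$by_uid"
rm -r "$tmpdir"
```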

     Example 5

     Displaying the presorted file inputfile, suppressing all but the
     first occurrence of lines having the same third field.

     $ sort -u -k 3,3 inputfile

FILES
     /var/tmp/stm???
          Temporary files

SEE ALSO
     comm(1), join(1), uniq(1), ctype(3C).