⇒ sort(C) — OpenDesktop 3.0.0


 sort(C)                       06 January 1993                        sort(C)


 Name

    sort - sort and merge files

 Syntax

    sort [ -cmu ] [ -ooutput ] [ -T tmpdir ] [ -ykmem ] [ -zrecsz ]
    [ -dfiMnr ] [ -b ] [ -tx ] [ +pos1 ] [ -pos2 ] [ files ]

 Description

    sort sorts lines of all the named files together and writes the result on
    the standard output.  The standard input is read if ``-'' is used as a
    file name or if no input files are named.

    Comparisons are based on one or more sort keys extracted from each line
    of input.  By default, there is one sort key, the entire input line, and
    ordering is determined by the collating sequence defined by the locale
    (see locale(M)).

    The following options alter the default behavior:

    -c   Check that the input file is sorted according to the ordering rules;
         give no output unless the file is out of sort.

    -m   Merge only; the input files are assumed to be already sorted.

    -ooutput
         The argument given is the name of an output file to use instead of
         the standard output.  This file may be the same as one of the
         inputs.  There may be optional blanks between -o and output.

    -T tmpdir
         tmpdir is the pathname of a directory to be used for temporary
         files. The default is to try /usr/tmp and /tmp. If -T is specified
         then tmpdir and /tmp are tried.  There must be a space between -T
         and tmpdir.

    -u   Unique: suppress all but one in each set of lines having equal keys.
         This option can result in unwanted characters placed at the end of
         the sorted file.

    -ykmem
         The amount of main memory used by the sort has a large impact on its
         performance.  Sorting a small file in a large amount of memory is a
         waste.  If this option is omitted, sort begins using a system
         default memory size, and continues to use more space as needed.  If
         this option is presented with a value, kmem, sort will start using
         that number of kilobytes of memory, unless the administrative mini-
         mum or maximum is violated, in which case the corresponding extremum
         will be used.  Thus, -y0 is guaranteed to start with minimum memory.
         By convention, -y (with no argument) starts with maximum memory.

    -zrecsz
         Causes sort to use a buffer size of recsz bytes for the merge phase.
         Input lines longer than the buffer size will cause sort to terminate
         abnormally.  Normally, the size of the longest line read during the
         sort phase is recorded and this maximum is used as the record size
         during the merge phase, eliminating the need for the -z option.
         However, when the sort phase is omitted (-c or -m options) a system
         default buffer size is used, and if this is not large enough, the -z
         option should be used to prevent abnormal termination.
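
    The -c and -u behaviors described above can be modeled in a few lines.
    This is only a sketch of the documented semantics, not the code sort
    itself uses: the function names are hypothetical, the default key is
    the whole line, comparisons use Python's code-point string ordering
    rather than the locale's collating sequence, and keeping the first
    line of each set is an assumption (the text only promises "all but
    one").

```python
from itertools import groupby


def is_sorted(lines, key=lambda line: line):
    """Model of -c: True when no line sorts before its predecessor."""
    return all(key(a) <= key(b) for a, b in zip(lines, lines[1:]))


def unique_by_key(sorted_lines, key=lambda line: line):
    """Model of -u: keep one line from each run of lines with equal keys.

    Assumption: the first line of each run is the one kept.
    """
    return [next(group) for _, group in groupby(sorted_lines, key=key)]


if __name__ == "__main__":
    data = ["a 1", "a 2", "b 1"]
    print(is_sorted(data))
    print(unique_by_key(data, key=lambda line: line.split()[0]))
```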

    The following options override the default ordering rules:

    -d   ``Dictionary'' order: only letters, digits and blanks (spaces and
         tabs) are significant in comparisons.  Dictionary order is defined
         by the locale setting (see locale(M)).

    -f   Fold lowercase letters into uppercase.  Conversion between lowercase
         and uppercase letters is governed by the locale setting (see
         locale(M)).

    -i   Ignore non-printable characters in non-numeric comparisons.  Non-
         printable characters are defined by the locale setting (see
         locale(M)).

    -M   Compare as months.  The first three non-blank characters of the
         field are folded to uppercase and compared so that ``JAN'' < ``FEB''
         < ... < ``DEC''.  Invalid fields compare low to ``JAN''.  The -M
         option implies the -b option (see below).

    -n   An initial numeric string, consisting of optional blanks, an
         optional minus sign, and zero or more digits with optional decimal
         point, is sorted by arithmetic value.  The -n option implies the -b
         option (see below).  Note that the -b option is only effective when
         restricted sort key specifications are in effect.

    -r   Reverse the sense of comparisons.
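
    As an illustration of the -M and -n rules, the following sketch
    computes a comparison value for a field under each option.  The
    function names are hypothetical, and treating a field with no digits
    as zero under -n is an assumption; the text above does not say how
    such a field is valued.

```python
import re
from decimal import Decimal

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]

# Optional blanks, optional minus sign, digits with an optional decimal point.
_NUMERIC = re.compile(r"[ \t]*(-?)([0-9]*)(?:\.([0-9]*))?")


def month_key(field):
    """Model of -M: 1..12 for JAN..DEC, 0 for anything else (sorts low)."""
    abbrev = field.lstrip(" \t")[:3].upper()
    return MONTHS.index(abbrev) + 1 if abbrev in MONTHS else 0


def numeric_key(field):
    """Model of -n: arithmetic value of the initial numeric string."""
    sign, whole, frac = _NUMERIC.match(field).groups()
    frac = frac or ""
    if not whole and not frac:
        return Decimal(0)  # assumption: no digits means zero
    value = Decimal((whole or "0") + "." + (frac or "0"))
    return -value if sign else value


if __name__ == "__main__":
    print(sorted(["mar 3", "Jan 1", "xyz", "  feb 2"], key=month_key))
    print(sorted(["10", "9", "-2", " 3.5"], key=numeric_key))
```

    Passing reverse=True to sorted() gives roughly the effect of adding -r.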

    When ordering options appear before restricted sort key specifications,
    the requested ordering rules are applied globally to all sort keys.  When
    attached to a specific sort key (described below), the specified ordering
    options override all global ordering options for that key.

    The notation +pos1 -pos2 restricts a sort key to one beginning at pos1
    and ending at pos2.  The characters at positions pos1 and pos2 are
    included in the sort key (provided that pos2 does not precede pos1).  A
    missing -pos2 means the end of the line.

    Specifying pos1 and pos2 involves the notion of a field (a minimal
    sequence of characters followed by a field separator or a newline).  By
    default, the first blank (space or tab) of a sequence of blanks acts as
    the field separator.  All blanks in a sequence of blanks are considered
    to be part of the next field; for example, all blanks at the beginning of
    a line are considered to be part of the first field.  The treatment of
    field separators can be altered using the options:

    -tx  Use x as the field separator character; x is not considered to be
         part of a field (although it may be included in a sort key).  Each
         occurrence of x is significant (for example, xx delimits an empty
         field).

    -b   Ignore leading blanks when determining the starting and ending posi-
         tions of a restricted sort key.  If the -b option is specified
         before the first +pos1 argument, it will be applied to all +pos1
         arguments.  Otherwise, the b flag may be attached independently to
         each +pos1 or -pos2 argument (see below).

    pos1 and pos2 each have the form m.n optionally followed by one or more
    of the flags b, d, f, i, n, or r.  A starting position specified by +m.n
    is interpreted to mean the n+1st character in the m+1st field.  A missing
    .n means .0, indicating the first character of the m+1st field. If the b
    flag is in effect, n is counted from the first non-blank in the m+1st
    field; +m.0b refers to the first non-blank character in the m+1st field.

    A last position specified by -m.n is interpreted to mean the nth
    character (including separators) after the last character of the mth
    field.  A missing .n means .0, indicating the last character of the mth
    field.  If the b flag is in effect, n is counted from the last leading
    blank in the m+1st field; -m.1b refers to the first non-blank in the
    m+1st field.
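
    The position rules are easier to follow with a small model.  The
    sketch below extracts the text selected by +pos1 and -pos2 from a
    single line.  It is a rough reading of the rules above and not the
    sort implementation: the names field_spans, skip_blanks and
    extract_key are hypothetical, each position is passed as an
    (m, n, b) triple standing for m.n with or without the b flag, and
    the handling of positions that fall beyond the end of the line is
    an assumption.

```python
import re

BLANKS = " \t"


def field_spans(line, sep=None):
    """Return (start, end) offsets of each field; end is exclusive."""
    if sep is None:
        # Default rule: a run of blanks belongs to the field that follows it.
        return [m.span() for m in re.finditer(r"[ \t]*[^ \t]+|[ \t]+", line)]
    spans, start = [], 0
    for piece in line.split(sep):
        # With -tx every separator counts, so "xx" yields an empty field.
        spans.append((start, start + len(piece)))
        start += len(piece) + len(sep)
    return spans


def skip_blanks(line, i):
    """Advance i past any blanks starting at offset i."""
    while i < len(line) and line[i] in BLANKS:
        i += 1
    return i


def extract_key(line, pos1, pos2=None, sep=None):
    """Return the key text selected by +pos1 [-pos2]."""
    spans = field_spans(line, sep)
    m, n, b = pos1
    if m >= len(spans):
        return ""  # assumption: a start past the last field gives an empty key
    begin = spans[m][0]
    if b:
        begin = skip_blanks(line, begin)
    begin += n
    if pos2 is None:
        return line[begin:]  # a missing -pos2 means the end of the line
    m, n, b = pos2
    if b:
        # n is counted from the last leading blank of field m+1.
        base = spans[m][0] if m < len(spans) else len(line)
        stop = skip_blanks(line, base) + n
    else:
        # n characters after the last character of field m.
        base = 0 if m == 0 else spans[min(m, len(spans)) - 1][1]
        stop = base + n
    return line[begin:stop]


if __name__ == "__main__":
    line = "alpha  beta gamma"
    print(repr(extract_key(line, (1, 0, False), (2, 0, False))))  # +1 -2
    print(repr(extract_key(line, (1, 0, True), (1, 1, True))))    # +1.0b -1.1b
    print(repr(extract_key("root:x:0:0", (2, 0, False), (3, 0, False), sep=":")))
```

    Under the default separator the key for +1 -2 keeps the blanks that
    precede the second field, which is why -b (or the b flag) usually
    matters for restricted keys.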

    When there are multiple sort keys, later keys are compared only after all
    earlier keys compare equal.  Lines that otherwise compare equal are
    ordered with all bytes significant.
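
    One way to picture the multi-key rule is as a tuple comparison: each
    key contributes one element, and the raw bytes of the whole line form
    a final element that breaks remaining ties.  The sketch below assumes
    each key is supplied as a Python function from a line to a comparable
    value; sort_lines and the sample key functions are hypothetical names,
    not part of sort.

```python
def sort_lines(lines, key_funcs):
    """Order lines by each key in turn, then by the bytes of the line."""
    def composite(line):
        # Later tuple elements are consulted only when earlier ones are equal.
        return tuple(f(line) for f in key_funcs) + (line.encode(),)
    return sorted(lines, key=composite)


if __name__ == "__main__":
    def by_name(line):
        return line.split()[0].upper()   # first field, case folded

    def by_count(line):
        return int(line.split()[1])      # second field, numeric

    for line in sort_lines(["b 2", "a 2", "a 1", "B 2"], [by_name, by_count]):
        print(line)
```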

 Exit values

    sort prints a diagnostic and exits with non-zero status for various
    trouble conditions (for example, when input lines are too long), and
    for disorders discovered under the -c option.

    When the last line of an input file is missing a newline character, sort
    appends one, prints a warning message, and continues.

 Examples

    Sort the contents of infile with the second field as the sort key:

       sort +1 -2 infile

    Sort, in reverse order, the contents of infile1 and infile2, placing the
    output in outfile and using the first character of the second field as
    the sort key:

       sort -r -o outfile +1.0 -1.2 infile1 infile2

    Sort, in reverse order, the contents of infile1 and infile2 using the
    first non-blank character of the second field as the sort key:

       sort -r +1.0b -1.1b infile1 infile2

    Print the password file (passwd(F)) sorted by the numeric user ID (the
    third colon-separated field):

       sort -t: +2n -3 /etc/passwd

    Print the lines of the already sorted file infile, suppressing all but
    the first occurrence of lines having the same third field (the options
    -um with just one input file make the choice of a unique representative
    from a set of equal lines predictable):

       sort -um +2 -3 infile


 Files

    /usr/tmp/stm???

 See also

    coltbl(M), comm(C), join(C), locale(M), uniq(C)

 Standards conformance

    sort is conformant with:

    AT&T SVID Issue 2;
    X/Open Portability Guide, Issue 3, 1989.

