sort(1) - UnixWare 2.01

       sort(1)                                                      sort(1)


       NAME
             sort - sort and/or merge files

       SYNOPSIS
             sort [-m] [-o output] [-bdfiMnru] [-t x] [-ykmem]
                   [-zrecsz] [-k keydef] . . . [file . . . ]
             sort -c [-bdfiMnru] [-t x] [-k keydef] [-ykmem]
                   [-zrecsz] . . . [file]

       DESCRIPTION
             The sort command sorts lines of all the named files together
             and writes the result on the standard output.  The standard
             input is read if - is used as a filename or no input files are
             named.

             Comparisons are based on one or more sort keys extracted from
             each line of input.  By default, there is one sort key, the
             entire input line, and ordering is lexicographic by bytes in
             machine collating sequence.
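
             As a minimal sketch of the default ordering (not taken from
             the original manual; the sample words are invented, and
             LC_ALL=C is assumed so that collation is by byte value),
             uppercase letters sort ahead of lowercase ones:

```shell
# Hypothetical sample data. LC_ALL=C is an assumption made here so that the
# whole line is the sort key and is compared byte by byte.
default_order=$(printf 'banana\nApple\ncherry\napple\n' | LC_ALL=C sort)
printf '%s\n' "$default_order"
```

             Under these assumptions the lines come out as Apple, apple,
             banana, cherry, because "A" has a lower byte value than "a".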

             sort processes characters according to the locale specified in
             the LC_CTYPE, LC_COLLATE, and LC_NUMERIC environment variables
             [see LANG on environ(5)].  Multibyte characters are not
             processed by some of the options.

             The following options alter the default behavior:

             -c   Check that the input file is sorted according to the
                  ordering rules.  If posix2 is set, give no output and
                  only vary the exit status.

             -m   Merge only; the input files are assumed to be already
                  sorted.

             -u   Unique: suppress all but one in each set of lines having
                  equal keys.

             -o output
                  The argument given is the name of an output file to use
                  instead of the standard output.  This file may be the
                  same as one of the inputs.

             -ykmem
                  The amount of main memory used by sort has a large impact
                  on its performance.  Sorting a small file in a large
                  amount of memory is a waste.  If this option is omitted,
                  sort begins using a system default memory size, and
                 continues to use more space as needed.  If this option is
                 presented with a value (kmem), sort will start using that
                 number of kilobytes of memory, unless the administrative
                 minimum or maximum is violated, in which case the
                 corresponding extremum will be used.  Thus, -y0 is
                 guaranteed to start with minimum memory.  By convention,
                 -y (with no argument) starts with maximum memory.

            -zrecsz
                 The size of the longest line read is recorded in the sort
                 phase so buffers can be allocated during the merge phase.
                 If the sort phase is omitted via the -c or -m options, a
                 system default size is used instead.  Lines longer
                 than the buffer size will cause sort to terminate
                 abnormally.  Supplying the actual number of bytes in the
                 longest line to be merged (or some larger value) will
                 prevent abnormal termination.

            If the sort phase is not omitted, then the maximum line size
            is calculated and used as the recsz, overriding the value of
            -z.  Thus, the -z option is significant only when used with -c
            or -m.
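
            The -c, -m, -u and -o options described above can be tried
            with the following sketch.  The scratch files and their
            contents are hypothetical, mktemp(1) is assumed to be
            available, and LC_ALL=C is assumed for byte-wise ordering.
            The -y and -z options are left out because their effect
            depends on the system.

```shell
# Hypothetical scratch files; LC_ALL=C is assumed for byte-wise ordering.
tmp=$(mktemp -d)
printf 'apple\ncherry\n'  > "$tmp/a"    # already sorted
printf 'banana\ncherry\n' > "$tmp/b"    # already sorted
printf 'pear\nfig\n'      > "$tmp/c"    # not sorted

# -m merges the pre-sorted inputs; -u keeps one line per set of equal keys.
merged=$(LC_ALL=C sort -m -u "$tmp/a" "$tmp/b")

# -c only checks; the exit status reports whether the input is in order.
if LC_ALL=C sort -c "$tmp/c" 2>/dev/null; then
    checked=sorted
else
    checked=disordered
fi

# -o may name one of the inputs, so a file can be sorted onto itself.
LC_ALL=C sort -o "$tmp/c" "$tmp/c"
in_place=$(cat "$tmp/c")

rm -rf "$tmp"
printf '%s\n' "$merged" "$checked" "$in_place"
```

            Testing the exit status of -c, rather than its output, keeps
            the check independent of whether posix2 is set.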

            The following options override the default ordering rules.

            -d   "Dictionary" order: only alphanumeric and space
                 characters (as specified by the locale in LC_CTYPE) are
                 significant in comparisons.  No comparison is performed
                 for multibyte characters.

            -f   Fold lowercase letters into uppercase (as specified by
                 the locale in LC_CTYPE).  Does not apply to multibyte
                 characters.

            -i   Ignore non-printable characters (as specified by the
                 locale in LC_CTYPE).  Multibyte characters are also
                 ignored.

            -M   Compare as months.  The full month abbreviation for the
                 given locale is used, whatever its length.  Month names
                 are processed according to the locale specified in the
                 LC_TIME environment variable [see LANG on environ(5)].
                 For example, in an English locale the sorting order is
                 "JAN" < "FEB" < . . . < "DEC".  Invalid fields compare
                 low to "JAN".  The -M option implies the -b option (see
                 below).


             -n   An initial numeric string, consisting of optional blanks,
                  an optional minus sign, and zero or more digits with an
                  optional decimal point, is sorted by arithmetic value.
                  The -n option implies the -b option (see below).  Note
                  that the -b option is only effective when restricted sort
                  key specifications are in effect.

             -r   Reverse the sense of comparisons.
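
             The ordering options above can be compared side by side with
             a short sketch.  All sample data is invented, and LC_ALL=C is
             assumed so that results do not depend on the user's locale;
             the month names assume the English abbreviations mentioned
             under -M.

```shell
# All sample data is hypothetical; LC_ALL=C is assumed throughout.

# -n: compare by arithmetic value rather than byte by byte.
numeric=$(printf '10\n9\n-3\n2.5\n' | LC_ALL=C sort -n)

# -f: fold lowercase into uppercase before comparing.
folded=$(printf 'cherry\nBanana\napple\n' | LC_ALL=C sort -f)

# -d: only alphanumerics and blanks are significant, so "-" is skipped.
dict=$(printf 'a-c\nab\n' | LC_ALL=C sort -d)

# -r: reverse the sense of the (here numeric) comparison.
reversed=$(printf '1\n3\n2\n' | LC_ALL=C sort -n -r)

# -M: compare as month abbreviations.
months=$(printf 'MAR\nJAN\nFEB\n' | LC_ALL=C sort -M)

printf '%s\n' "$numeric" "$folded" "$dict" "$reversed" "$months"
```

             Without -f the second case would list Banana first, and
             without -d the third would list a-c first, because "B" and
             "-" have lower byte values than the lowercase letters.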

             When ordering options appear before restricted sort key
             specifications, the requested ordering rules are applied
             globally to all sort keys.  When attached to a specific sort
             key (described below), the specified ordering options override
             all global ordering options for that key.

             The notation -k pos1,pos2 restricts a sort key to one
             beginning at pos1 and ending at pos2.  The characters at
             position pos1 and pos2 are included in the sort key (provided
             that pos2 does not precede pos1).  A missing ,pos2 means the
             end of the line.

             The obsolescent notation +pos1 and -pos2 restricts a sort key
             to one beginning at pos1 and ending just before pos2.  The
             characters at position pos1 and just before pos2 are included
             in the sort key, provided that pos2 does not precede pos1.
             So:
                   +m.n -o.p

             is equivalent to:
                   if p == 0
                         -k m+1.n+1,o.0
                   if p > 0
                         -k m+1.n+1,o+1.p

             All uses of -k pos1,pos2 below apply equally well to +pos1
             -pos2 using the above mapping, including the flags usable in m
             and n.  See the EXAMPLES section for further clarification.
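
             The mapping above is simple arithmetic, and can be sketched
             as a small shell function.  The name plus_to_k and its
             four-argument calling convention are hypothetical; the
             function is not part of sort and handles only the numeric
             parts of the positions, not any attached flags.

```shell
# plus_to_k is a hypothetical helper, not part of sort.  Given the four
# numbers of an obsolescent "+m.n -o.p" pair, it prints the equivalent
# -k argument using the mapping described above.
plus_to_k() {
    m=$1 n=$2 o=$3 p=$4
    if [ "$p" -eq 0 ]; then
        printf '%s\n' "-k $((m + 1)).$((n + 1)),$o.0"
    else
        printf '%s\n' "-k $((m + 1)).$((n + 1)),$((o + 1)).$p"
    fi
}

plus_to_k 1 0 2 0    # +1 -2      maps to  -k 2.1,2.0
plus_to_k 1 2 3 4    # +1.2 -3.4  maps to  -k 2.3,4.4
```

             The first call shows why +1 -2 selects the second field: it
             maps to -k 2.1,2.0, which is the same key as -k 2,2.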

             Specifying pos1 and pos2 involves the notion of a field, a
             minimal sequence of characters followed by a field separator
             or a newline.  By default, the first blank (space or tab) of a
             sequence of blanks acts as the field separator.  All blanks in
             a sequence of blanks are considered to be part of the next
             field; for example, all blanks at the beginning of a line are
             considered to be part of the first field.  The treatment of
             field separators can be altered using the options:

            -b   Ignore leading blanks when determining the starting and
                 ending positions of a restricted sort key (single-byte
                 blanks only).  If the -b option is specified before the
                 first -k argument, it is applied to all -k arguments.
                 Otherwise, the b flag may be attached independently to
                 pos1 or pos2 in a -k pos1,pos2 argument (see below).

            -t x Use x as the field separator character; x is not
                 considered to be part of a field (although it may be
                 included in a sort key).  Each occurrence of x is
                 significant (for example, xx delimits an empty field).  x
                 may be a supplementary code set character.
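
            The effect of -b and -t on field boundaries can be seen in a
            small sketch.  The input lines are invented and LC_ALL=C is
            assumed.  In the first two commands the line "x  b" has two
            blanks before its second field and "x a" has one.

```shell
# Hypothetical data; LC_ALL=C is assumed.

# Without -b the blanks belong to field 2, so the keys are "  b" and " a";
# a blank has a lower byte value than "a", so "x  b" sorts first.
with_blanks=$(printf 'x a\nx  b\n' | LC_ALL=C sort -k 2,2)

# With -b the leading blanks are skipped and the keys are "b" and "a".
without_blanks=$(printf 'x a\nx  b\n' | LC_ALL=C sort -b -k 2,2)

# With -t every separator counts, so "::" encloses an empty second field,
# which sorts ahead of "b".
empty_field=$(printf 'a:b:1\na::3\n' | LC_ALL=C sort -t : -k 2,2)

printf '%s\n' "$with_blanks" "$without_blanks" "$empty_field"
```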

            pos1 and pos2 each have the form m.n optionally followed by
            one or more of the flags bdfiMnr.  A starting position
            specified by -k m.n is interpreted to mean the nth character
            in the mth field.  A missing .n means .1, indicating the first
            character of the mth field.  If the b flag is in effect, n is
            counted from the first non-blank in the mth field; -k m.1b
            refers to the first non-blank character in the mth field.

            A last position specified by -k . . . ,m.n is interpreted to
            mean the nth character (including separators) of the mth
            field.  A missing .n means .0, indicating the last character
            of the mth field.  If the b flag is in effect, n is counted
            from the character after the last leading blank in the mth
            field; -k . . . ,m.1b refers to the first non-blank in the mth
            field.

            The b flag affects only the position (pos1 or pos2) to which
            it is attached.  The other flags (dfiMnr) can be attached to
            either pos1 or pos2 or both, and always affect both
            specifiers.
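
            A brief sketch of character positions within a field, using
            invented input and assuming LC_ALL=C:

```shell
# Hypothetical data; LC_ALL=C is assumed.

# Key = characters 2 through 3 of field 1, that is "bc", "ab" and "aa".
by_chars=$(printf 'xbc 1\nyab 2\nzaa 3\n' | LC_ALL=C sort -k 1.2,1.3)

# With the b flag on both positions, character 1 of field 2 is its first
# non-blank, so the keys are "z" and "a".
first_nonblank=$(printf 'p   z\nq a\n' | LC_ALL=C sort -k 2.1b,2.1b)

printf '%s\n' "$by_chars" "$first_nonblank"
```

            In the second command, dropping the b flags would make both
            keys a single blank, leaving the order to be decided by the
            comparison of whole lines.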

            When there are multiple sort keys, later keys are compared
            only after all earlier keys compare equal.  Lines that
            otherwise compare equal are ordered with all bytes
            significant.
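
            The interaction between several keys and between global and
            per-key ordering flags can be sketched as follows.  The data
            is invented and LC_ALL=C is assumed.

```shell
# Hypothetical data; LC_ALL=C is assumed.
# Key 1 has no flags of its own, so the global -r applies to it.
# Key 2 carries its own n flag, which overrides the global options for
# that key: it is compared numerically and in ascending order.
mixed=$(printf 'a 10\nb 1\na 2\n' | LC_ALL=C sort -r -k 1,1 -k 2,2n)
printf '%s\n' "$mixed"
```

            The "b" line comes first because key 1 is reversed; the two
            "a" lines tie on key 1 and are then ordered 2 before 10 by
            key 2.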

      EXAMPLES
            Sort the contents of infile with the second field as the sort
            key:
                  sort -k 2,2 infile

             Sort, in reverse order, the contents of infile1 and infile2,
             placing the output in outfile and using the first character of
             the second field as the sort key:
                   sort -r -o outfile -k 2.1,2.1 infile1 infile2

             Sort, in reverse order, the contents of infile1 and infile2
             using the first non-blank character of the second field as the
             sort key:
                   sort -r -k 2.1b,2.1b infile1 infile2

             Print the password file [passwd(4)] sorted by the numeric user
             ID (the third colon-separated field):
                   sort -t : -k 3,3n /etc/passwd

             Sort the contents of the password file using the group ID
             (fourth field) as the primary sort key and the user ID (third
             field) as the secondary sort key:
                   sort -t : -k 4,4 -k 3,3 /etc/passwd
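
             The two password-file commands above differ in whether the
             n flag is attached to the keys.  The sketch below shows the
             difference on invented passwd-style records, so that no real
             /etc/passwd is read; LC_ALL=C is assumed, and cut(1) is used
             only to shorten the output to login names.

```shell
# Hypothetical passwd-style records; LC_ALL=C is assumed.
sample='carol:x:100:20::/home/carol:/bin/sh
alice:x:9:100::/home/alice:/bin/sh
bob:x:10:100::/home/bob:/bin/sh'

# Keys without the n flag compare as strings, so group "100" sorts before
# "20" and user ID "10" sorts before "9".
lexical=$(printf '%s\n' "$sample" |
    LC_ALL=C sort -t : -k 4,4 -k 3,3 | cut -d : -f 1)

# With n attached to each key the IDs compare by arithmetic value.
numeric_ids=$(printf '%s\n' "$sample" |
    LC_ALL=C sort -t : -k 4,4n -k 3,3n | cut -d : -f 1)

printf '%s\n' "$lexical" "$numeric_ids"
```

             If numeric order of the IDs is what is wanted, attaching n
             to both keys is the safer choice.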

             Print the lines of the already sorted file infile, suppressing
             all but the first occurrence of lines having the same third
             field (the options -um with just one input file make the
             choice of a unique representative from a set of equal lines
             predictable):
                   sort -um -k 3,3 infile

       FILES
             /var/tmp/stm???
             /usr/lib/locale/locale/LC_MESSAGES/uxcore.abi
                   language-specific message file [see LANG on
                   environ(5)]

       REFERENCES
             comm(1), join(1), uniq(1)

       NOTICES
             sort writes a diagnostic and exits with non-zero status for
             various trouble conditions (for example, when input lines are
             too long), and for disorder discovered under the -c option.

             When the last line of an input file is missing a newline
             character, sort appends one, prints a warning message, and
             continues.  sort does not guarantee preservation of relative
             line ordering on equal keys.

            The +pos1 and -pos2 notation is obsolescent under POSIX.
            Application writers should avoid it and use -k instead.

            Set the posix2 environment variable to obtain POSIX.2 behavior
            where it is inconsistent with existing System V behavior.

                          Copyright 1994 Novell, Inc.