⇒ sort(1) — AIX PS/2 1.2.1



SORT(1,C)                   AIX Commands Reference                    SORT(1,C)



-------------------------------------------------------------------------------
sort



PURPOSE

Sorts files.

SYNTAX


        +----------------------+
sort ---| +------------------+ |--->
        +-| -b -f -o outfile |-+
         ^| -c -i -r -M      ||
         || -d -m -t char    ||
         || -A -n -T -u      ||
         |+------------------+|
         +--------------------+

  +---------------------------------------------------------+ +------------+
>-| +---- +fskip ----+ +--------------------+ +-----------+ |-|            |-|
  +-|                |-| +---- -fskip ----+ |-| +-------+ |-+ +--- file ---+
   ^+- +fskip.cskip -+ +-|                |-+ +-| -b -i |-+|    ^        |
   |                     +- -fskip.cskip -+    ^| -d -n || |    +--------+
   |                                           || -f -r || |
   |                                           |+-------+| |
   |                                           +---------+ |
   +-------------------------------------------------------+


DESCRIPTION

The sort command sorts lines in its input files and writes the result to
standard output.  It treats all of its input files as one file when it performs
the sort.  A - (minus) in place of a file name specifies standard input.  If
you do not specify any file names, it sorts standard input.

The default sort key (the part of the line used for sorting) is an entire line.
Default ordering is lexicographic by characters in the collating sequence.  The
collation order is locale-dependent; the behavior of sort is modified by
setting the environment variables LANG or LC_COLLATE.

A sort key is specified by a starting position and, optionally, an ending
position.  Each position has two parts, fskip and cskip, as follows:

+fskip.cskip
-fskip.cskip

The fskip specifies the number of fields to skip from the beginning of the
input line, and cskip specifies the number of additional characters to skip to



Processed November 8, 1990         SORT(1,C)                                  1





the right beyond that point.  For both the starting point (+fskip.cskip) and
the ending point (-fskip.cskip) of a sort key, fskip is measured from the
beginning of the input line, and cskip is measured from the last field skipped.
If you omit .cskip, .0 is assumed.  If you omit fskip, 0 is assumed.  If you
omit the ending field specifier (-fskip.cskip), the end of the line is the end
of the sort key.
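
As a rough illustration of how fskip and cskip select a key, the sketch below
uses the -k notation of later POSIX sort implementations.  That notation is an
assumption here: this manual documents only the +fskip.cskip form, which many
current implementations no longer accept.  Under POSIX, a start position of
+1.2 (skip one field, then two characters) corresponds to -k 2.3.  The file
name "parts" and its contents are hypothetical.

```shell
# Sketch: "skip 1 field, then 2 characters" written with POSIX -k.
# Assumption: the obsolescent "+1.2" start position maps to "-k 2.3".
workdir=$(mktemp -d)
printf '%s\n' 'a:xxB' 'b:xxA' 'c:xxC' > "$workdir/parts"   # hypothetical data

# LC_ALL=C pins the collating sequence to byte order so the run is repeatable.
# The key starts at the third character of the second colon-separated field.
key_sorted=$(LC_ALL=C sort -t: -k 2.3 "$workdir/parts")
printf '%s\n' "$key_sorted"
rm -rf "$workdir"
```

Only the characters after "xx" take part in the comparison, so the lines are
ordered by their final letter rather than by their first field.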

You can supply more than one sort key by repeating +fskip.cskip and
-fskip.cskip.  In cases where you specify more than one sort key, keys
specified further to the right on the command line are compared only when all
earlier keys compare equal.  For example, if the first key is to be sorted in
numerical order and the second in dictionary order, all strings that start with
the number one are sorted alphabetically before the strings that start with the
number two.  Lines that are identical in all keys are sorted with all
characters significant.  You can also specify different flags for different
sort keys in multiple sort keys.  See the examples for illustration.

A field is one or more characters bounded by the beginning of a line and the
current field separator, or one or more characters bounded by the field
separator on either side.  The space character is the default field separator.

Notes:

  1. Lines longer than 1024 characters are truncated.
  2. The maximum number of fields on a line is 10.

FLAGS

-A           Sorts on a byte-by-byte basis using ASCII character values.

-b           Ignores leading blanks (spaces and tabs) in sort key comparisons.

-c           Checks that the input is sorted according to the ordering rules
             specified in the flags.  Displays nothing unless the file is not
             sorted.

-d           Sorts in dictionary order.  Only letters, digits and blanks are
             considered in comparisons.

-f           Merges uppercase and lowercase letters.  Case is not considered in
             the sorting, so that initial-capital words and all-capital words
             are not grouped together at the beginning of the output.

-i           Sorts only by characters in the ASCII range octal 040-0176 (all
             printable characters and the space character) in non-numeric
             comparisons.

-M           Compares as months.  The first three non-blank characters of the
             field are folded to uppercase and compared so that "JAN" < "FEB" <
             ... < "DEC".  Invalid fields compare low to "JAN".  The language
             used for month names is affected by the locale.




-m           Merges only; the input is already sorted.

-n           Sorts any initial numeric strings (consisting of optional blanks,
             optional minus signs, and zero or more digits with optional
             decimal point) by arithmetic value.  The -n flag automatically
             gives you the -b flag.

-o  outfile  Directs output to outfile instead of standard output.  outfile can
             be the same as one of the input files.

-r           Reverses the order of the specified sort.

-tchar       Sets field separator character to char.  To specify the tab
             character as the field separator, you must enclose it in single
             quotation marks ("' '").

-T           Uses current directory instead of default directory for temporary
             files.

-u           Suppresses all but one in each set of equal lines.  Ignored
             characters (such as leading tabs and spaces) and characters
             outside of sort keys are not considered in this type of
             comparison.
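
The -c and -m flags can be tried out with a short script such as the following
sketch.  The file names and contents are hypothetical, and the use of the exit
status to report the result of -c is an assumption taken from POSIX sort (0 if
the input is sorted, 1 if it is not); this manual states only that nothing is
displayed for sorted input.

```shell
# Sketch: check-only (-c) and merge-only (-m) on small hypothetical files.
workdir=$(mktemp -d)
printf '%s\n' apple cherry > "$workdir/a"   # already sorted
printf '%s\n' banana date  > "$workdir/b"   # already sorted
printf '%s\n' pear fig     > "$workdir/c"   # not sorted

# -c only checks the order.  Assumption: POSIX exit status 0 = sorted, 1 = not.
# Diagnostics are discarded so that only the status is used.
if LC_ALL=C sort -c "$workdir/a" 2>/dev/null; then a_sorted=yes; else a_sorted=no; fi
if LC_ALL=C sort -c "$workdir/c" 2>/dev/null; then c_sorted=yes; else c_sorted=no; fi

# -m merges inputs that are each already in sorted order.
merged=$(LC_ALL=C sort -m "$workdir/a" "$workdir/b")

echo "a sorted: $a_sorted, c sorted: $c_sorted"
printf '%s\n' "$merged"
rm -rf "$workdir"
```

Checking with -c before merging is a cheap way to confirm that the inputs meet
the precondition of -m.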

EXAMPLES

  1. To perform a simple sort:

      sort  fruits

    This displays the contents of "fruits" sorted in ascending lexicographic
    order.  This means that the characters in each column are compared one by
    one, including spaces, digits, and special characters.  For instance, if
    "fruits" contains the text:

      banana
      orange
      Persimmon
      apple
      %%banana
      apple
      ORANGE

    then sort displays:

      %%banana
      ORANGE
      Persimmon
      apple
      apple
      banana
      orange



    This order follows from the fact that in the ASCII collating sequence, "%"
    (percent sign) precedes the uppercase letters, which precede the lowercase
    letters.  If the system uses a character set other than ASCII, your results
    may be different.
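
    Because the result depends on the collating sequence, a reproducible run
    of this example may need the locale to be fixed.  The sketch below assumes
    a POSIX sort in which LC_ALL=C selects byte (ASCII) order; the temporary
    directory is only there to keep the script self-contained.

```shell
# Sketch: reproduce Example 1 with the collating sequence pinned to ASCII.
workdir=$(mktemp -d)
printf '%s\n' banana orange Persimmon apple '%%banana' apple ORANGE \
    > "$workdir/fruits"

# Assumption: LC_ALL=C overrides LANG and LC_COLLATE and gives byte order.
sorted=$(LC_ALL=C sort "$workdir/fruits")
printf '%s\n' "$sorted"
rm -rf "$workdir"
```

    With another locale in effect, uppercase and lowercase letters may be
    interleaved instead of grouped.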

  2. To sort in dictionary order:

      sort  -d  fruits

    This sorts and displays the contents of "fruits", comparing only letters,
    digits, and blanks.  If "fruits" is the same as in Example 1, sort
    displays:

      ORANGE
      Persimmon
      apple
      apple
      %%banana
      banana
      orange

    The "-d" flag tells sort to ignore the "%" character because it is not a
    letter, digit, or blank.  This puts "%%banana" next to "banana".

  3. To group lines that contain uppercase and special characters with similar
    lowercase lines:

      sort  -d  -f  fruits

    This ignores special characters ("-d") and differences in case ("-f").
    Given the "fruits" of Example 1, this displays:

      apple
      apple
      %%banana
      banana
      ORANGE
      orange
      Persimmon

  4. To sort as in Example 3 and remove duplicate lines:

      sort  -d  -f  -u  fruits

    The "-u" flag tells sort to remove duplicate lines, making each line of the
    file unique.  This displays:

      apple
      %%banana
      orange
      Persimmon



    Not only was the duplicate "apple" removed, but "banana" and "ORANGE" as
    well.  These were removed because the "-d" told sort to treat "%%banana" as
    if it were "banana", and the "-f" told it to treat "ORANGE" as "orange".
    Thus, sort considered "%%banana" to be a duplicate of "banana" and "ORANGE"
    a duplicate of "orange".

    Note:  There is no way to predict which duplicate lines "sort -u" will keep
           and which it will remove.
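
    Since the surviving duplicate is not predictable, a script that depends on
    "sort -u" is safer if it checks only properties that hold for every
    possible choice.  The following sketch, which assumes a POSIX sort and the
    "fruits" file of Example 1, counts the remaining lines and compares them
    after stripping "%" and folding case.

```shell
# Sketch: Example 4, checked without assuming which duplicate is kept.
workdir=$(mktemp -d)
printf '%s\n' banana orange Persimmon apple '%%banana' apple ORANGE \
    > "$workdir/fruits"

unique=$(LC_ALL=C sort -d -f -u "$workdir/fruits")

# Number of lines that survive; tr removes padding that some wc versions add.
unique_count=$(printf '%s\n' "$unique" | wc -l | tr -d ' ')

# Normalize the survivors the same way -d and -f treat them in comparisons.
normalized=$(printf '%s\n' "$unique" | tr -d '%' | tr '[:upper:]' '[:lower:]')

echo "lines kept: $unique_count"
printf '%s\n' "$normalized"
rm -rf "$workdir"
```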

  5. To sort as in Example 3 and remove duplicates, unless capitalized or
    punctuated differently:

      sort  -u  +0  -d  -f  +0  fruits

    The "+0 -d -f" does the same type of sort done with "-d -f" in Example 3.
    Then the "+0" performs another comparison to distinguish lines that are not
    actually identical.  This prevents "-u" from removing them.

    Given the "fruits" file shown in Example 1, the added "+0" distinguishes
    "%%banana" from "banana" and "ORANGE" from "orange".  However, the two
    instances of "apple" are identical, so one of them is deleted.

      apple
      %%banana
      banana
      ORANGE
      orange
      Persimmon
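
    On systems whose sort no longer accepts the "+0" notation, the same
    two-key comparison can be approximated with POSIX -k.  This is a sketch
    under that assumption: "-k 1df" stands in for "+0 -d -f" and "-k 1" for the
    trailing "+0".  Neither form of -k is documented in this manual.

```shell
# Sketch: Example 5 rewritten with POSIX -k keys (an assumed equivalent).
workdir=$(mktemp -d)
printf '%s\n' banana orange Persimmon apple '%%banana' apple ORANGE \
    > "$workdir/fruits"

# Key 1: whole line, dictionary order, case folded.
# Key 2: whole line, compared exactly, so -u removes only identical lines.
distinct=$(LC_ALL=C sort -u -k 1df -k 1 "$workdir/fruits")
printf '%s\n' "$distinct"
rm -rf "$workdir"
```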

  6. To specify the character that separates fields:

      sort  -t:  +1  vegetables

    This sorts "vegetables", comparing the text that follows the first colon on
    each line.  The "+1" tells sort to ignore the first field and to compare
    from the start of the second field to the end of the line.  The "-t:" tells
    sort that colons separate fields.

    If "vegetables" contains:

      yams:104
      turnips:8
      potatoes:15
      carrots:104
      green beans:32
      radishes:5
      lettuce:15

    then sort displays:

      carrots:104
      yams:104
      lettuce:15
      potatoes:15
      green beans:32
      radishes:5
      turnips:8

    The numbers are not in numeric order.  This happened because a
    lexicographic sort compares each character from left to right.  In other
    words, "1" comes before "3" and "3" comes before "5", so "104" and "15"
    come before "32", which in turn comes before "5".

  7. To sort numbers:

      sort  -t: +1  -n  vegetables

    This sorts "vegetables" numerically on the second field.  If "vegetables"
    is the same as in Example 6, sort displays:

      radishes:5
      turnips:8
      lettuce:15
      potatoes:15
      green beans:32
      carrots:104
      yams:104
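
    The difference between the lexicographic sort of Example 6 and the numeric
    sort above can be seen side by side with the sketch below.  It assumes a
    POSIX sort, where "-k 2" plays the role of "+1"; the -k notation is not
    part of this manual.

```shell
# Sketch: second field compared as text, then as a number.
workdir=$(mktemp -d)
printf '%s\n' 'yams:104' 'turnips:8' 'potatoes:15' 'carrots:104' \
    'green beans:32' 'radishes:5' 'lettuce:15' > "$workdir/vegetables"

# Assumption: "-k 2" corresponds to the obsolescent "+1" start position.
as_text=$(LC_ALL=C sort -t: -k 2 "$workdir/vegetables")
as_number=$(LC_ALL=C sort -t: -k 2 -n "$workdir/vegetables")

echo "lexicographic:"; printf '%s\n' "$as_text"
echo "numeric:";       printf '%s\n' "$as_number"
rm -rf "$workdir"
```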

  8. To sort on more than one field:

      sort  -t: +1  -2  -n  +0  -1  -r  vegetables

    This performs a numeric sort on the second field ("+1 -2 -n").  Within that
    ordering, it sorts the first field in reverse alphabetic order
    ("+0 -1 -r").  The output looks like this:

      radishes:5
      turnips:8
      potatoes:15
      lettuce:15
      green beans:32
      yams:104
      carrots:104

    Now the lines are sorted in numeric order.  When two lines have the same
    number, they appear in reverse alphabetic order.
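
    For a sort that accepts only the POSIX key notation, the two keys might be
    written as in this sketch.  The mapping is an assumption based on POSIX:
    "+1 -2 -n" becomes "-k 2,2n" and "+0 -1 -r" becomes "-k 1,1r".

```shell
# Sketch: Example 8 with POSIX -k keys (an assumed equivalent).
workdir=$(mktemp -d)
printf '%s\n' 'yams:104' 'turnips:8' 'potatoes:15' 'carrots:104' \
    'green beans:32' 'radishes:5' 'lettuce:15' > "$workdir/vegetables"

# Primary key: field 2 only, numeric.
# Secondary key: field 1 only, reversed; used when the numbers are equal.
two_keys=$(LC_ALL=C sort -t: -k 2,2n -k 1,1r "$workdir/vegetables")
printf '%s\n' "$two_keys"
rm -rf "$workdir"
```

    Ending each key explicitly (",2" and ",1") keeps the comparison from
    running on into the following field.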

  9. To replace the original file with the sorted text:

      sort  -o  vegetables  vegetables

    This stores the sorted output into the file "vegetables" ("-o vegetables").
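
    The -o flag matters here because shell redirection is not a substitute:
    with "sort vegetables > vegetables" the shell empties the file before sort
    reads it.  The sketch below, assuming a POSIX sort and a scratch copy of
    the "vegetables" file, shows the in-place form.

```shell
# Sketch: sort a file in place with -o, using a scratch copy.
workdir=$(mktemp -d)
printf '%s\n' 'yams:104' 'turnips:8' 'potatoes:15' 'carrots:104' \
    'green beans:32' 'radishes:5' 'lettuce:15' > "$workdir/vegetables"

# The output file is also the input file; sort reads all input before writing.
LC_ALL=C sort -o "$workdir/vegetables" "$workdir/vegetables"

in_place=$(cat "$workdir/vegetables")
printf '%s\n' "$in_place"
rm -rf "$workdir"
```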



FILES

sort.c    Contains sort definitions.

RELATED INFORMATION

See the following commands:  "comm", "join", and "uniq".

See "Introduction to International Character Support" in Managing the AIX
Operating System.