SORT(1,C) AIX Commands Reference SORT(1,C)
-------------------------------------------------------------------------------
sort
PURPOSE
Sorts files.
SYNTAX
sort [-A] [-b] [-c] [-d] [-f] [-i] [-m] [-M] [-n] [-r] [-T] [-u]
     [-o outfile] [-t char]
     [+fskip[.cskip] [-fskip[.cskip]] [-b] [-d] [-f] [-i] [-n] [-r]] ...
     [file ...]
DESCRIPTION
The sort command sorts lines in its input files and writes the result to
standard output. It treats all of its input files as one file when it performs
the sort. A - (minus) in place of a file name specifies standard input. If
you do not specify any file names, it sorts standard input.
The default sort key (the part of the line used for sorting) is an entire line.
Default ordering is lexicographic by characters in the collating sequence. The
collation order is locale-dependent; the behavior of sort is modified by
setting the environment variables LANG or LC_COLLATE.
The two numbers, fskip and cskip, specify the sort key. Both the starting
point and the ending point of a key are written in two parts, as follows:
+fskip.cskip
-fskip.cskip
The fskip specifies the number of fields to skip from the beginning of the
input line, and cskip specifies the number of additional characters to skip to
Processed November 8, 1990 SORT(1,C) 1
the right beyond that point. For both the starting point (+fskip.cskip) and
the ending point (-fskip.cskip) of a sort key, fskip is measured from the
beginning of the input line, and cskip is measured from the last field skipped.
If you omit .cskip, .0 is assumed. If you omit fskip, 0 is assumed. If you
omit the ending field specifier (-fskip.cskip), the end of the line is the end
of the sort key.
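The skip arithmetic described above can be modeled in a few lines. The following Python sketch is not part of sort. The function name key_text and its arguments are hypothetical, it assumes an explicit single-character separator (as set with -t), and it assumes that an ending point with a cskip of 0 stops just before the separator that closes the last skipped field.

```python
def key_text(line, sep, start=(0, 0), end=None):
    """Return the part of line selected by +fskip.cskip and -fskip.cskip.

    start and end are (fskip, cskip) pairs; end=None means "to end of line".
    """
    def offset(fskip, cskip, is_end):
        pos = 0
        for skipped in range(fskip):
            nxt = line.find(sep, pos)
            if nxt == -1:
                return len(line)      # fewer fields than requested
            if is_end and cskip == 0 and skipped == fskip - 1:
                return nxt            # assumption: stop before the separator
            pos = nxt + 1             # step over the field and its separator
        return min(pos + cskip, len(line))

    first = offset(start[0], start[1], False)
    last = len(line) if end is None else offset(end[0], end[1], True)
    return line[first:max(first, last)]


print(key_text("yams:104", ":", start=(1, 0)))              # +1    prints 104
print(key_text("yams:104", ":", start=(0, 0), end=(1, 0)))  # +0 -1 prints yams
```

With no ending point the key runs to the end of the line, which is why +1 alone selects everything after the first separator.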
You can supply more than one sort key by repeating +fskip.cskip and
-fskip.cskip. In cases where you specify more than one sort key, keys
specified further to the right on the command line are compared only when all
earlier keys compare equal. For example, if the first key is to be sorted in
numerical order and the second in dictionary order, all strings that start with
the number one are sorted alphabetically before the strings that start with the
number two. Lines that are identical in all keys are sorted with all
characters significant. You can also specify different flags for different
sort keys in multiple sort keys. See the examples for illustration.
A field is one or more characters bounded by the beginning of a line and the
current field separator, or one or more characters bounded by the field
separator on either side. The space character is the default field separator.
Notes:
1. Lines longer than 1024 are truncated.
2. The maximum number of fields on a line is 10.
FLAGS
-A Sorts on a byte-by-byte basis using ASCII character values.
-b Ignores leading blanks (spaces and tabs) in sort key comparisons.
-c Checks that the input is sorted according to the ordering rules
specified in the flags. Displays nothing unless the file is not
sorted.
-d Sorts in dictionary order. Only letters, digits and blanks are
considered in comparisons.
-f Merges uppercase and lowercase letters. Case is not considered in
the sorting, so that initial-capital words and all-capital words
are not grouped together at the beginning of the output.
-i Sorts only by characters in the ASCII range octal 040-0176 (all
printable characters and the space character) in non-numeric
comparisons.
-M Compares as months. The first three non-blank characters of the
field are folded to uppercase and compared so that "JAN" < "FEB" <
... < "DEC". Invalid fields compare low to "JAN". The language
used for month names is affected by the locale.
-m Merges only; the input is already sorted.
-n Sorts any initial numeric strings (consisting of optional blanks,
optional minus signs, and zero or more digits with optional
decimal point) by arithmetic value. The -n flag implies the -b
flag.
-o outfile Directs output to outfile instead of standard output. outfile can
be the same as one of the input files.
-r Reverses the order of the specified sort.
-tchar Sets field separator character to char. To specify the tab
character as the field separator, you must enclose it in single
quotation marks ("' '").
-T Uses current directory instead of default directory for temporary
files.
-u Suppresses all but one in each set of equal lines. Ignored
characters (such as leading tabs and spaces) and characters
outside of sort keys are not considered in this type of
comparison.
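The "initial numeric string" used by the -n flag can be illustrated with a short Python sketch. This is a model of the description above, not the code sort uses. The name numeric_value is hypothetical, and treating a field with no numeric prefix as zero is an assumption that the flag description does not state.

```python
import re

# Optional blanks, an optional minus sign, then digits with an optional
# decimal point, anchored at the start of the field.
_NUMERIC_PREFIX = re.compile(r"[ \t]*(-?[0-9]*\.?[0-9]*)")


def numeric_value(field):
    """Return the arithmetic value of the leading numeric string in field."""
    text = _NUMERIC_PREFIX.match(field).group(1)
    try:
        return float(text)
    except ValueError:
        return 0.0  # assumption: no usable digits compares as zero


print(sorted(["32", "5", "104"], key=numeric_value))  # ['5', '32', '104']
```

Because leading blanks are consumed by the pattern, the sketch also reflects the statement that -n brings the effect of -b with it.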
EXAMPLES
1. To perform a simple sort:
sort fruits
This displays the contents of "fruits" sorted in ascending lexicographic
order. This means that the characters in each column are compared one by
one, including spaces, digits, and special characters. For instance, if
"fruits" contains the text:
banana
orange
Persimmon
apple
%%banana
apple
ORANGE
then sort displays:
%%banana
ORANGE
Persimmon
apple
apple
banana
orange
This order follows from the fact that in the ASCII collating sequence, "%"
(percent sign) precedes the uppercase letters, which precede the lowercase
letters. If the system uses a character set other than ASCII, your results
may be different.
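The ordering in Example 1 can be reproduced outside sort by comparing character codes directly. The Python sketch below is only an illustration; it assumes ASCII data, where plain string comparison in Python gives the same result as the ASCII collating sequence.

```python
fruits = ["banana", "orange", "Persimmon", "apple", "%%banana", "apple", "ORANGE"]

# Python compares strings by character code, left to right.
ordered = sorted(fruits)

# "%" (37) sorts before uppercase letters (65-90), which sort before
# lowercase letters (97-122).
print(ord("%"), ord("O"), ord("a"))  # 37 79 97
for line in ordered:
    print(line)
```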
2. To sort in dictionary order:
sort -d fruits
This sorts and displays the contents of "fruits", comparing only letters,
digits, and blanks. If "fruits" is the same as in Example 1, sort
displays:
ORANGE
Persimmon
apple
apple
%%banana
banana
orange
The "-d" flag tells sort to ignore the "%" character because it is not a
letter, digit, or blank. This puts "%%banana" next to "banana".
3. To group lines that contain uppercase and special characters with similar
lowercase lines:
sort -d -f fruits
This ignores special characters ("-d") and differences in case ("-f").
Given the "fruits" of Example 1, this displays:
apple
apple
%%banana
banana
ORANGE
orange
Persimmon
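The effect of the -d and -f flags in Examples 2 and 3 can be sketched as a key function. The Python below is a hedged model, not the implementation of sort. The names dictionary_key and sort_lines are hypothetical, the data is assumed to be ASCII, and ties on the key are broken by the whole line, following the statement that lines identical in all keys are sorted with all characters significant.

```python
def dictionary_key(line, fold_case=False):
    """Keep only letters, digits, and blanks (-d); optionally fold case (-f)."""
    kept = "".join(ch for ch in line if ch.isalnum() or ch in " \t")
    return kept.upper() if fold_case else kept


def sort_lines(lines, fold_case=False):
    # Lines whose keys are equal fall back to a comparison of the full line.
    return sorted(lines, key=lambda line: (dictionary_key(line, fold_case), line))


fruits = ["banana", "orange", "Persimmon", "apple", "%%banana", "apple", "ORANGE"]
print(sort_lines(fruits))                  # like sort -d
print(sort_lines(fruits, fold_case=True))  # like sort -d -f
```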
4. To sort as in Example 3 and remove duplicate lines:
sort -d -f -u fruits
The "-u" flag tells sort to remove duplicate lines, making each line of the
file unique. This displays:
apple
%%banana
orange
Persimmon
Not only was the duplicate "apple" removed, but "banana" and "ORANGE" as
well. These were removed because the "-d" told sort to treat "%%banana" as
if it were "banana", and the "-f" told it to treat "ORANGE" as "orange".
Thus, sort considered "%%banana" to be a duplicate of "banana" and "ORANGE"
a duplicate of "orange".
Note: There is no way to predict which duplicate lines "sort -u" will keep
and which it will remove.
5. To sort as in Example 3 and remove duplicates, unless capitalized or
punctuated differently:
sort -u +0 -d -f +0 fruits
The "+0 -d -f" does the same type of sort done with "-d -f" in Example 3.
Then the "+0" performs another comparison to distinguish lines that are not
actually identical. This prevents "-u" from removing them.
Given the "fruits" file shown in Example 1, the added "+0" distinguishes
"%%banana" from "banana" and "ORANGE" from "orange". However, the two
instances of "apple" are identical, so one of them is deleted.
apple
%%banana
banana
ORANGE
orange
Persimmon
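Examples 4 and 5 differ only in what counts as a duplicate. The Python sketch below models both under stated assumptions: the function names are hypothetical, the data is ASCII, and the line kept from each group of equal keys in the first function is simply the first one encountered. As the note in Example 4 says, the line that sort -u keeps cannot be predicted, so the survivor may differ from the listing there.

```python
from itertools import groupby


def fold_key(line):
    """Key used by -d -f: letters, digits, and blanks only, case folded."""
    return "".join(ch for ch in line if ch.isalnum() or ch in " \t").upper()


def unique_on_key(lines):
    """Model of sort -d -f -u: one line per distinct folded key."""
    ordered = sorted(lines, key=fold_key)
    return [next(group) for _, group in groupby(ordered, key=fold_key)]


def unique_on_key_then_line(lines):
    """Model of sort -u +0 -d -f +0: only exact duplicates are dropped."""
    return sorted(set(lines), key=lambda line: (fold_key(line), line))


fruits = ["banana", "orange", "Persimmon", "apple", "%%banana", "apple", "ORANGE"]
print(unique_on_key(fruits))
print(unique_on_key_then_line(fruits))
```

The second function adds the whole line as a second key, which is what the trailing +0 does on the command line.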
6. To specify the character that separates fields:
sort -t: +1 vegetables
This sorts "vegetables", comparing the text that follows the first colon on
each line. The "+1" tells sort to ignore the first field and to compare
from the start of the second field to the end of the line. The "-t:" tells
sort that colons separate fields.
If "vegetables" contains:
yams:104
turnips:8
potatoes:15
carrots:104
green beans:32
radishes:5
lettuce:15
then sort displays:
carrots:104
yams:104
lettuce:15
potatoes:15
green beans:32
radishes:5
turnips:8
The numbers are not in numeric order. This happened because a
lexicographic sort compares each character from left to right. In other
words, "3" comes before "5", so "32" comes before "5".
7. To sort numbers:
sort -t: +1 -n vegetables
This sorts "vegetables" numerically on the second field. If "vegetables"
is the same as in Example 6, sort displays:
radishes:5
turnips:8
lettuce:15
potatoes:15
green beans:32
carrots:104
yams:104
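The difference between Examples 6 and 7 comes down to whether the second field is compared as text or as a number. The following Python sketch illustrates this with the same data. The helper name second_field is hypothetical, the field is assumed to hold a plain integer, and ties are broken by the whole line.

```python
vegetables = [
    "yams:104", "turnips:8", "potatoes:15", "carrots:104",
    "green beans:32", "radishes:5", "lettuce:15",
]


def second_field(line):
    """Text after the first colon, as selected by -t: +1."""
    return line.split(":", 1)[1]


# Lexicographic comparison of the field, as in Example 6.
as_text = sorted(vegetables, key=lambda line: (second_field(line), line))

# Arithmetic comparison of the field, as in Example 7.
as_numbers = sorted(vegetables, key=lambda line: (int(second_field(line)), line))

print(as_text)
print(as_numbers)
```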
8. To sort on more than one field:
sort -t: +1 -2 -n +0 -1 -r vegetables
This performs a numeric sort on the second field ("+1 -2 -n"). Within that
ordering, it sorts the first field in reverse alphabetic order
("+0 -1 -r"). The output looks like this:
radishes:5
turnips:8
potatoes:15
lettuce:15
green beans:32
yams:104
carrots:104
Now the lines are sorted in numeric order. When two lines have the same
number, they appear in reverse alphabetic order.
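The two-key comparison in Example 8 can be written as a single comparison function that consults the second key only when the first key is equal. This Python sketch is a model under assumptions, not the code of sort: the name compare is hypothetical and the second field is assumed to hold a plain integer.

```python
from functools import cmp_to_key

vegetables = [
    "yams:104", "turnips:8", "potatoes:15", "carrots:104",
    "green beans:32", "radishes:5", "lettuce:15",
]


def compare(a, b):
    name_a, count_a = a.split(":", 1)
    name_b, count_b = b.split(":", 1)
    # First key (+1 -2 -n): numeric comparison of the second field.
    if int(count_a) != int(count_b):
        return -1 if int(count_a) < int(count_b) else 1
    # Second key (+0 -1 -r): first field, with the order reversed.
    if name_a != name_b:
        return 1 if name_a < name_b else -1
    return 0


ordered = sorted(vegetables, key=cmp_to_key(compare))
for line in ordered:
    print(line)
```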
9. To replace the original file with the sorted text:
sort -o vegetables vegetables
This stores the sorted output into the file "vegetables" ("-o vegetables").
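Writing the result back over an input file is safe only if all of the input has been read before the output replaces it. The Python sketch below shows one way to arrange that. It is an illustration, not a description of how sort handles -o; the function name sort_file_in_place and the use of a temporary file in the same directory are assumptions.

```python
import os
import tempfile


def sort_file_in_place(path):
    """Sort the lines of the file at path and store the result under the same name."""
    with open(path) as source:
        lines = source.read().splitlines()   # read everything first
    lines.sort()
    directory = os.path.dirname(os.path.abspath(path))
    handle, temporary = tempfile.mkstemp(dir=directory)
    with os.fdopen(handle, "w") as target:
        target.writelines(line + "\n" for line in lines)
    os.replace(temporary, path)              # swap in the sorted copy
```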
FILES
sort.c Contains sort definitions.
RELATED INFORMATION
See the following commands: "comm," "join," and "uniq."
See "Introduction to International Character Support" in Managing the AIX
Operating System.