tr(1) — Commands

NAME

tr - Translates characters

SYNOPSIS

tr [-Acs] string1 string2
tr -ds [-Ac] string1 string2
tr -d [-Ac] string1
tr -s [-Ac] string1

The tr command copies characters from the standard input to the standard output with substitution or deletion of selected characters.

DESCRIPTION

Input characters from string1 are replaced with the corresponding characters in string2. The tr command cannot handle an ASCII NUL (\000) in string1 or string2; it always deletes NUL from the input.

The trbsd command is a BSD compatible version of tr.

The following abbreviations can be used to introduce ranges of ASCII characters or repeated characters:

a-z
Stands for a string of characters whose ASCII codes run from character a to character z, inclusive.

[a*number]
Stands for number repetitions of a. The number is considered to be in decimal unless the first digit of number is 0; then it is considered to be in octal.

[=equiv=]
Represents all characters or collating elements belonging to the equivalence class specified by equiv, as defined by the LC_COLLATE locale category. An equivalence class expression can be used for string1 or string2 only when used in combination with the -d and -s flags. (For more information, see the reference page for the locale file.)

[:class:]
Represents all characters belonging to the defined character class, as defined by the current setting of the LC_CTYPE locale category. The following character class names are accepted when specified in string1:

alnum   cntrl   lower   space
alpha   digit   print   upper
blank   graph   punct   xdigit

When the -d and -s flags are specified together, any of the character class names are accepted in string2. Otherwise, only the class names lower and upper are accepted in string2, and then only if the opposite class (upper or lower, respectively) is specified in the same relative position in string1. Such a specification is interpreted as a request for case conversion.

When [:lower:] appears in string1 and [:upper:] appears in string2, the arrays contain the characters from the toupper mapping in the LC_CTYPE category of the current locale. When [:upper:] appears in string1 and [:lower:] appears in string2, the arrays contain the characters from the tolower mapping in the LC_CTYPE category of the current locale.

The first character from each mapping pair is in the array for string1 and the second character from each mapping pair is in the array for string2 in the same relative position.
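The abbreviations above can be tried on fixed sample strings. The following is a minimal sketch, not part of the original reference page: the sample text and variable names are assumptions, and LC_ALL=C is set only so that the results do not depend on the caller's locale.

```shell
# Sample strings and variable names are illustrative assumptions.
LC_ALL=C; export LC_ALL

# Range: map each lowercase ASCII letter to its uppercase counterpart.
range_out=$(printf 'osf/1' | tr 'a-z' 'A-Z')

# Repetition: [x*3] expands to xxx, so a, b, and c all map to x.
repeat_out=$(printf 'abcd' | tr 'abc' '[x*3]')

# Character classes in the same relative position request case conversion.
class_out=$(printf 'Tru64 UNIX' | tr '[:upper:]' '[:lower:]')

printf '%s\n%s\n%s\n' "$range_out" "$repeat_out" "$class_out"
```

The character d in the second sample is not listed in string1, so it passes through unchanged.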

Use the escape character \ (backslash) to remove the special meaning from any character in a string. Use the \ (backslash) followed by 1, 2, or 3 octal digits for the code of a character.
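As a short sketch of both forms of escape, the following commands use an octal code and an escaped backslash. The sample strings and variable names are assumptions chosen for illustration.

```shell
# Sample strings and variable names are illustrative assumptions.

# Octal escape: \011 is the ASCII horizontal tab; replace it with a colon.
tab_out=$(printf 'name\tvalue' | tr '\011' ':')

# A doubled backslash in string1 stands for one literal backslash.
slash_out=$(printf 'C:\\tmp\\log' | tr '\\' '/')

printf '%s\n%s\n' "$tab_out" "$slash_out"
```

The single quotes keep the shell from interpreting the backslashes, so tr receives them intact.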

When string2 is shorter than string1, a difference results between historical System V and BSD systems. A BSD system pads string2 with the last character found in string2. Thus, it is possible to do the following:

tr 0123456789 d

The preceding command translates all digits to the letter d. A portable application cannot rely on the BSD behavior; it would have to code the example in the following way:

tr 0123456789 '[d*]'
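The portable form can be checked on a fixed string. This is a sketch only; the sample text and the variable name masked are assumptions.

```shell
# Sample text and the variable name "masked" are illustrative assumptions.
# [d*] repeats d as often as needed to match the length of string1.
masked=$(printf 'call 555-0100' | tr 0123456789 '[d*]')

printf '%s\n' "$masked"
```

Every digit becomes d, and the letters, space, and hyphen are left alone.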

Note that, despite similarities in appearance, the string arguments used by tr are not regular expressions.

The tr command correctly processes NUL characters in its input stream. NUL characters can be stripped using the following command:

tr -d '\000'
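A minimal way to observe the deletion is to generate a NUL byte with printf and pipe it through the command. The sample bytes and the variable name stripped are assumptions.

```shell
# Sample bytes and the variable name "stripped" are illustrative assumptions.
# printf emits the three bytes a, NUL, b; tr -d removes the NUL.
stripped=$(printf 'a\000b' | tr -d '\000')

printf '%s\n' "$stripped"
```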

If a given character appears more than once in string1, the character in string2 corresponding to its last appearance in string1 will be used in the translation.
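The rule above can be sketched with a string1 that lists the same character twice. The sample text and the variable name dup_out are assumptions, and other tr implementations are not guaranteed to resolve duplicates the same way.

```shell
# Sample text and the variable name "dup_out" are illustrative assumptions.
# string1 lists a twice; under the rule above, the second pairing (a to y)
# is the one that takes effect.
dup_out=$(printf 'banana' | tr 'aa' 'xy')

printf '%s\n' "$dup_out"
```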

System V Compatibility

The root of the directory tree that contains the commands modified for SVID2 compliance is specified in the file /etc/svid2_path. You can use /etc/svid2_profile as the basis for, or to include in, your .profile. The file /etc/svid2_profile reads /etc/svid2_path and sets the first entries in the PATH environment variable so that the modified SVID2 commands are found first.

In the SVID2 compliant version of the tr command, only characters in the octal range of 1 to 377 are complemented when you specify the -c option. This behavior occurs because the -A option is implicitly forced on when you specify the -c option.

FLAGS

-A
Translates on a byte-by-byte basis. When you specify this flag, tr does not support extended characters.

-c
Complements (inverts) the set of characters in string1. The complement is the set of all characters in the current character set, as defined by the current setting of LC_CTYPE, except for those actually specified in the string1 argument. These characters are placed in the array in ascending collation sequence, as defined by the current setting of LC_COLLATE.

-d
Deletes all occurrences of input characters or collating elements found in the array specified in string1.

If -c and -d are both specified, all characters except those specified by string1 are deleted. The contents of string2 are ignored, unless -s is also specified. Note, however, that the same string cannot be used for both the -d and the -s flags; when both flags are specified, both string1 (used for deletion) and string2 (used for squeezing) are required.

If -d is not specified, each input character or collating element found in the array specified by string1 is replaced by the character or collating element in the same relative position in the array specified by string2.

-s
Replaces any character specified in string1 that occurs as a string of two or more repeating characters with a single instance of the character in string2.

If string2 contains a character class, the argument's array contains all of the characters in that character class. For example:

tr -s '[:space:]'

In a case conversion, however, the string2 array contains only those characters defined as the second characters in each of the toupper or tolower character pairs, as appropriate. For example:

tr -s '[:upper:]' '[:lower:]'
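The flag combinations described above can be compared side by side on small inputs. The following is a sketch under stated assumptions: the sample strings and variable names are invented for illustration, and LC_ALL=C is set only to keep the results independent of the caller's locale.

```shell
# Sample strings and variable names are illustrative assumptions.
LC_ALL=C; export LC_ALL

# -d: delete every digit.
d_out=$(printf 'a1b22c' | tr -d '0-9')

# -c with -d: delete everything except the digits.
cd_out=$(printf 'a1b22c' | tr -cd '0-9')

# -s with one string: squeeze each run of spaces to a single space.
s_out=$(printf 'a    b  c' | tr -s ' ')

# -d with -s: delete commas (string1), then squeeze hyphens (string2).
ds_out=$(printf 'x--y,,--z' | tr -ds ',' '-')

printf '%s\n%s\n%s\n%s\n' "$d_out" "$cd_out" "$s_out" "$ds_out"
```

In the last command the two strings play different roles, which is why both are required when -d and -s are combined.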

NOTES

Specifying the -A flag improves performance when processing ASCII text.

EXAMPLES

1. To translate braces into parentheses, enter:

tr '{}' '()' <textfile >newfile

This translates each { (left brace) to a ( (left parenthesis) and each } (right brace) to ) (right parenthesis). All other characters remain unchanged.

2. To translate lowercase ASCII characters to uppercase, enter:

tr 'a-z' 'A-Z' <textfile >newfile

3. The two strings can be of different lengths:

tr '0-9' '#' <textfile >newfile

This translates each 0 into a # (number sign) but does not translate the digits 1 to 9; if the two character strings are not the same length, the extra characters in the longer one are ignored.

4. To translate each digit to a # (number sign), enter:

tr '0-9' '[#*]' <textfile >newfile

The * (asterisk) tells tr to repeat the # (number sign) enough times to make the second string as long as the first one.

5. To translate each string of digits to a single # (number sign), enter:

tr -s '0-9' '[#*]' <textfile >newfile

6. To translate all ASCII characters that are not specified, enter:

tr -c '[ -~]' '[A-_]' <textfile >newfile

This translates each nonprinting ASCII character to the next following corresponding control key letter (\001 translates to B, \002 to C, and so on). ASCII DEL (\177), the character that follows ~ (tilde), translates to a ] (right bracket).

7. To create a list of all words in file1 one per line in file2, where a word is taken to be a maximal string of letters, enter:

tr -cs '[:alpha:]' '[\n*]' < file1 > file2

8. To use an equivalence class to identify accented variants of the base character e in file1, which are stripped of diacritical marks and written to file2, enter:

tr '[=e=]' '[e*]' < file1 > file2
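The word-list command from example 7 can also be tried without creating files by feeding it a fixed line of text. This is a sketch only; the sample sentence and the variable name words are assumptions, and LC_ALL=C is set so that [:alpha:] covers only ASCII letters.

```shell
# Sample sentence and the variable name "words" are illustrative assumptions.
LC_ALL=C; export LC_ALL

# -c selects every non-letter, each is translated to a newline, and -s
# squeezes each resulting run of newlines to one.
words=$(printf 'The tr command, in brief.\n' | tr -cs '[:alpha:]' '[\n*]')

printf '%s\n' "$words"
```

Each maximal run of letters ends up on its own line; the comma, spaces, and period act only as separators.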

RELATED INFORMATION

Commands: ed(1)/red(1), sh(1), trbsd(1).

Files: ascii(5).
