expressions(5)                                               expressions(5)

NAME
     expressions - regular expressions

DESCRIPTION
     Regular expressions are used for scanning a text for strings which
     match a defined pattern. A regular expression stands for a set of
     characters or character strings. Each character string in this set is
     said to be matched by the regular expression. A pattern is constructed
     from one or more regular expressions.

     A regular expression comprises a string of characters, which can be
     further classified into:

     -  ordinary characters

        All characters in the character set, except for the newline charac-
        ter and metacharacters, are ordinary characters. Within a pattern,
        ordinary characters match themselves, i.e. the pattern abc will
        match only those strings that contain the character sequence abc
        anywhere in them.

     -  metacharacters

        Metacharacters do not match themselves, but have a special meaning,
        which is explained below. Metacharacters preceded by a backslash \
        lose their special meaning.

     There are two forms of regular expression:

     -  simple regular expressions

     -  extended regular expressions

     The syntax of these forms of regular expression is described in the
     following sections.

Page 1                       Reliant UNIX 5.44                Printed 11/98

SIMPLE REGULAR EXPRESSIONS

     The pattern to be searched for in a text can be made up of any
     combination of single expressions. The following single expressions
     can be used if a command supports simple regular expressions.

   Single characters, collating elements

     ______________________________________________________________________
    | Expression    |  Meaning                                            |
    |_______________|_____________________________________________________|
    | c             |  The character c, where c must not be a special     |
    |               |  character (metacharacter).                         |
    |               |                                                     |
    |               |  Example: a matches a                               |
    |_______________|_____________________________________________________|
    | \c            |  The character c, where c can be any character other|
    |               |  than ( ) { } 1 2 3 4 5 6 7 8 9.                    |
    |               |                                                     |
    |               |  A regular expression in the form \c is meaningful  |
    |               |  if c is a metacharacter. \c then stands for the    |
    |               |  character c itself.                                |
    |               |                                                     |
    |               |  Example: \a matches a, \* matches *                |
    |_______________|_____________________________________________________|
    | [.cc.]        |  (Collating symbol; only within [ ]) Multi-character|
    |               |  collating elements must be represented in this form|
    |               |  to distinguish them from ordinary characters. An   |
    |               |  expression of this type is collated as a single    |
    |               |  character. cc has to be defined as a valid collat- |
    |               |  ing element in the internationalized environment.  |
    |               |                                                     |
    |               |  Example:                                           |
    |               |                                                     |
    |               |  In the Spanish locale LANG=LC_COLLATE=Es_SP.88591  |
    |               |  ch is a valid collating element: in Spanish, ch is |
    |               |  treated as a single letter collating between c and |
    |               |  d. This letter must be represented in the form     |
    |               |  [.ch.] to distinguish it from the two-letter string|
    |               |  ch.                                                |
    |_______________|_____________________________________________________|
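
     The quoting rule for \c can be tried out with a short sketch. It
     uses Python's re module, which is an assumption made here for
     illustration only: Python's syntax is not identical to the simple
     regular expressions described on this page, but \* has the same
     meaning in both.

```python
import re

# \* stands for a literal asterisk; an unquoted * would be a repeat operator.
literal_star = re.compile(r"\*")
print(literal_star.search("2*3") is not None)   # the text contains a *
print(literal_star.search("23") is not None)    # the text contains no *

# re.escape() puts a backslash in front of every metacharacter in a string.
quoted = re.escape("a.c")
print(re.fullmatch(quoted, "a.c") is not None)  # the quoted . matches only .
print(re.fullmatch(quoted, "abc") is not None)  # so "abc" is rejected
```

     Collating symbols such as [.ch.] are not supported by Python's re
     module and are therefore not shown.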

   Groups of characters, classes

     ______________________________________________________________________
    | Expression    |  Meaning                                            |
    |_______________|_____________________________________________________|
    | .             |  Any character.                                     |
    |               |                                                     |
    |               |  Example: . matches a, x, *, ...                    |
    |_______________|_____________________________________________________|
    | [s]           |  Any character from the character string s. s can   |
    |               |  also be a character class.                         |
    |               |                                                     |
    |               |  Example: [mz] matches m, z                         |
    |               |                                                     |
    |               |  Warning:                                           |
    |               |                                                     |
    |               |  Metacharacters that have a special meaning in      |
    |               |  bracketed expressions (], -, ^) are treated as nor-|
    |               |  mal characters if they are placed at a particular  |
    |               |  position in the bracketed expression, i.e.         |
    |               |                                                     |
    |               |  ]   at the first position                          |
    |               |                                                     |
    |               |  -   at the first or last position                  |
    |               |                                                     |
    |               |  ^   at any position except the first.              |
    |               |                                                     |
    | [c1-c2]       |  Any character in the range between c1 and c2 in    |
    |               |  accordance with the currently valid collating      |
    |               |  sequence (c1 and c2 inclusive). c1 and c2 can also |
    |               |  be expressions for equivalence classes [=c=] or    |
    |               |  collating symbols [.cc.].                          |
    |               |                                                     |
    |               |  Example:                                           |
    |               |                                                     |
    |               |  In the German locale LANG=LC_COLLATE=De_DE.88591   |
    |               |  [a-d] matches the characters a, ä, b, c, d, while  |
    |               |  in the Spanish locale LANG=LC_COLLATE=Es_SP.88591, |
    |               |  it matches the characters a, b, c, ch, d.          |
    |               |                                                     |
    | [s1c1-c2s2]   |  The two forms can be combined.                     |
    |_______________|_____________________________________________________|
    | [^s]          |  Any character not contained in the character string|
    |               |  s.                                                 |
    |               |                                                     |
    |               |  Example:                                           |
    |               |                                                     |
    |               |  [^xyz] matches every character excluding x, y, z.  |
    |               |                                                     |
    | [^c1-c2]      |  Any character not in the range between c1 and c2.  |
    |               |                                                     |
    |               |  Example:                                           |
    |               |                                                     |
    |               |  [^0-9] matches every character excluding 0, 9 and  |
    |               |  the characters in the collating sequence between 0 |
    |               |  and 9.                                             |
    |               |                                                     |
    | [^s1c1-c2s2]  |  The two forms can be combined.                     |
    |_______________|_____________________________________________________|
    | [:class:]     |  (Character class expression; only within [ ]) Any  |
    |               |  character from the character class class. class can|
    |               |  be:                                                |
    |               |                                                     |
    |               |  alpha     any letter                               |
    |               |                                                     |
    |               |  upper     any uppercase letter                     |
    |               |                                                     |
    |               |  lower     any lowercase letter                     |
    |               |                                                     |
    |               |  digit     any decimal digit (0 through 9)          |
    |               |                                                     |
    |               |  xdigit    any hexadecimal digit (0 through 9,      |
    |               |            a through f and A through F)             |
    |               |                                                     |
    |               |  alnum     any alphanumeric character               |
    |               |            (letters and digits)                     |
    |               |                                                     |
    |               |  space     any character producing white space      |
    |               |            in displayed text                        |
    |               |            (e.g. blanks or tabs)                    |
    |               |                                                     |
    |               |  blank     blanks or tabs                           |
    |               |                                                     |
    |               |  punct     any punctuation character                |
    |               |                                                     |
    |               |  print     any printable character (including       |
    |               |            the characters in space)                 |
    |               |                                                     |
    |               |  graph     any printable character with a visible   |
    |               |            representation (excluding the characters |
    |               |            in space)                                |
    |               |                                                     |
    |               |  cntrl     any control character                    |
    |               |                                                     |
    |               |  Example:                                           |
    |               |                                                     |
    |               |  In the German locale LANG=LC_CTYPE=De_DE.88591 the |
    |               |  characters ä, ö, ü, ß, a, ..., z match the regular |
    |               |  expression [[:lower:]]. The character è does not   |
    |               |  belong to lower or alpha, but is a printable char- |
    |               |  acter.                                             |
    |_______________|_____________________________________________________|
    | [=c=]         |  (Equivalence class expression; only within [ ]) Any|
    |               |  character or collating element defined as having   |
    |               |  the same relative order as c. c must not be an     |
    |               |  equals sign = or a right square bracket ].         |
    |               |                                                     |
    |               |  Example:                                           |
    |               |                                                     |
    |               |  In the German locale LANG=LC_COLLATE=De_DE.88591   |
    |               |  the characters u and ü form an equivalence class.  |
    |               |  Consequently the characters u and ü match the regu-|
    |               |  lar expression [[=u=]]. In this locale, the regular|
    |               |  expressions [[=u=]v], [[=ü=]v], and [uüv] are      |
    |               |  synonyms.                                          |
    |_______________|_____________________________________________________|
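
     The bracketed expressions in the table above can be explored with
     the following sketch. It assumes Python's re module, which is not
     the implementation this page documents: Python compares ranges by
     code point rather than by the locale's collating sequence, and it
     does not support [:class:], [=c=] or [.cc.]. The helper
     matching_chars is a hypothetical name introduced for the example.

```python
import re

def matching_chars(pattern, candidates):
    """Return the characters of candidates that the pattern matches."""
    return [c for c in candidates if re.fullmatch(pattern, c)]

# Plain and negated lists and ranges.
print(matching_chars(r"[mz]", "amxz"))      # ['m', 'z']
print(matching_chars(r"[^xyz]", "axyz*"))   # ['a', '*']
print(matching_chars(r"[a-d]", "abcde"))    # ['a', 'b', 'c', 'd']
print(matching_chars(r"[^0-9]", "a09-"))    # ['a', '-']

# Positions at which ], - and ^ are treated as normal characters.
print(matching_chars(r"[]a]", "]ab"))       # ] at the first position
print(matching_chars(r"[a-]", "-ab"))       # - at the last position
print(matching_chars(r"[a^]", "^ab"))       # ^ not at the first position
```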

   Concatenation
     Single expressions can be concatenated. All concatenated expressions
     together describe the pattern to be searched for in a text.

     ______________________________________________________________________
    | Expression    |  Meaning                                            |
    |_______________|_____________________________________________________|
    | rx            |  An occurrence of a character string matching the   |
    |               |  regular expression r, followed by a character      |
    |               |  string matching the regular expression x.          |
    |               |                                                     |
    |               |  Example: [ab]. matches ax, a3, a*, bz, ...         |
    |_______________|_____________________________________________________|

   Repeats
     Expressions that describe single characters or groups of characters
     can be repeated, as can references back to subexpressions.

     ______________________________________________________________________
    | Expression    |  Meaning                                            |
    |_______________|_____________________________________________________|
    | r*            |  Zero, one, or more occurrences of the regular      |
    |               |  expression r.                                      |
    |               |                                                     |
    |               |  Example: a* matches nothing, a, aa, aaa, ...       |
    |_______________|_____________________________________________________|
    | r\{m,n\}      |  At least m and at most n occurrences of the regular|
    |               |  expression r.                                      |
    |               |                                                     |
    |               |  Example: a\{1,2\} matches a or aa                  |
    |               |                                                     |
    | r\{m\}        |  Exactly m occurrences of the regular expression r. |
    |               |                                                     |
    |               |  Example: a\{3\} matches aaa                        |
    |               |                                                     |
    | r\{m,\}       |  At least m occurrences of the regular expression r.|
    |               |                                                     |
    |               |  Example: a\{3,\} matches aaa, aaaa, aaaaa, ...     |
    |_______________|_____________________________________________________|

   Anchoring
     Patterns can be "anchored" at the start or the end of a line.

     ______________________________________________________________________
    | Expression    |  Meaning                                            |
    |_______________|_____________________________________________________|
    | ^r            |  A character string appearing at the start of a     |
    |               |  line, that matches the regular expression r, i.e.  |
    |               |  straight after a newline character or at the start |
    |               |  of a file.                                         |
    |               |                                                     |
    |               |  Example:                                           |
    |               |                                                     |
    |               |  ^[aA]pple matches apple or Apple at the start of a |
    |               |  line.                                              |
    |_______________|_____________________________________________________|
    | r$            |  A character string appearing at the end of a line, |
    |               |  that matches the regular expression r, i.e.        |
    |               |  directly before a newline character.               |
    |               |                                                     |
    |               |  Example:                                           |
    |               |                                                     |
    |               |  [bB]irne$ matches birne or Birne at the end of a   |
    |               |  line.                                              |
    |_______________|_____________________________________________________|
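
     A minimal sketch of line anchoring follows. It assumes Python's re
     module, where the re.MULTILINE flag is needed to make ^ and $ match
     at every line boundary, as the commands listed on this page do by
     working line by line. The sample text is invented for the example.

```python
import re

text = "Apple pie\npineapple\napple\ngreen apple\n"

# With re.MULTILINE, ^ matches straight after each newline and $ matches
# directly before each newline, not only at the ends of the whole string.
starts = re.findall(r"^[aA]pple", text, flags=re.MULTILINE)
ends = re.findall(r"[aA]pple$", text, flags=re.MULTILINE)

print(starts)   # ['Apple', 'apple']
print(ends)     # ['apple', 'apple', 'apple']
```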

   Subexpressions and references
     Parts of a pattern can be combined as a subexpression. This subexpres-
     sion can then be repeated at a later position in the pattern by means
     of a reference. The reference always stands for the same character
     string as the subexpression.

     ______________________________________________________________________
    | Expression    |  Meaning                                            |
    |_______________|_____________________________________________________|
    | \(x\)         |  The regular expression x is identified as a subex- |
    |               |  pression. It matches all character strings that    |
    |               |  match the regular expression x.                    |
    |               |                                                     |
    |               |  Example: \(aa*\) matches a, aa, aaa, ...           |
    |_______________|_____________________________________________________|
    | \n            |  n is an integer between 1 and 9. \n is a reference |
    |               |  to the nth subexpression x in a pattern. x must be |
    |               |  placed before the reference in the pattern. \n     |
    |               |  matches the same character string as x.            |
    |               |                                                     |
    |               |  Example:                                           |
    |               |                                                     |
    |               |  \(aa*\)\1 matches aa, aaaa, aaaaaa, ...            |
    |               |                                                     |
    |               |  \(a\(b\)\)\2 matches abb                           |
    |_______________|_____________________________________________________|
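
     The two reference examples can be checked with the sketch below.
     It assumes Python's re module, which writes a subexpression as (x)
     instead of \(x\); the reference \n is written the same way. The
     point illustrated is that a reference repeats the string actually
     matched, not the subexpression itself.

```python
import re

# (aa*)\1: some run of a's followed by exactly the same run again,
# so only strings with an even number of a's match in full.
doubled = re.compile(r"(aa*)\1")
doubled_results = {s: doubled.fullmatch(s) is not None
                   for s in ["a", "aa", "aaa", "aaaa"]}
print(doubled_results)

# (a(b))\2: subexpression 1 is "ab", subexpression 2 is "b".
nested = re.compile(r"(a(b))\2")
print(nested.fullmatch("abb") is not None)    # True
print(nested.fullmatch("abab") is not None)   # False
```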

   Grouping, alternatives
     Grouping and alternatives exist only for extended regular expressions.

EXTENDED REGULAR EXPRESSIONS

     The pattern to be searched for in a text can be made up of any
     combination of single expressions. The following single expressions
     can be used if a command supports extended regular expressions.

   Single characters, collating elements
     As for simple regular expressions.

   Groups of characters, classes
     As for simple regular expressions.

   Concatenation
     As for simple regular expressions.

   Repeats
     Expressions that describe single characters or groups of characters
     can be repeated, as can groupings and alternatives.

     ______________________________________________________________________
    | Expression    |  Meaning                                            |
    |_______________|_____________________________________________________|
    | r*            |  Zero, one, or more occurrences of the regular      |
    |               |  expression r.                                      |
    |               |                                                     |
    |               |  Example: a* matches nothing, a, aa, aaa, ...       |
    |               |                                                     |
    | r+            |  One or more occurrences of the regular expression  |
    |               |  r.                                                 |
    |               |                                                     |
    |               |  Example: u+ matches u, uu, uuu, ...                |
    |               |                                                     |
    | r?            |  Zero or one occurrence of the regular expression   |
    |               |  r.                                                 |
    |               |                                                     |
    |               |  Example: u? matches nothing or u                   |
    |_______________|_____________________________________________________|
    | r{m,n}        |  At least m and at most n occurrences of the regular|
    |               |  expression r.                                      |
    |               |                                                     |
    |               |  Example: a{1,2} matches a or aa                    |
    |               |                                                     |
    | r{m}          |  Exactly m occurrences of the regular expression r. |
    |               |                                                     |
    |               |  Example: a{3} matches aaa                          |
    |               |                                                     |
    | r{m,}         |  At least m occurrences of the regular expression r.|
    |               |                                                     |
    |               |  Example: a{3,} matches aaa, aaaa, aaaaa, ...       |
    |_______________|_____________________________________________________|
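
     The repeat operators above can be compared side by side with the
     following sketch. It assumes Python's re module, whose +, ? and
     {m,n} operators follow the extended form shown in the table; the
     test strings are invented for the example.

```python
import re

# Each pattern is paired with a few candidate strings.
cases = {
    r"u+": ["", "u", "uuu"],
    r"u?": ["", "u", "uu"],
    r"a{1,2}": ["a", "aa", "aaa"],
    r"a{3}": ["aa", "aaa"],
    r"a{3,}": ["aa", "aaa", "aaaaa"],
}

# True where the whole candidate string is matched by the pattern.
results = {
    pattern: [re.fullmatch(pattern, s) is not None for s in strings]
    for pattern, strings in cases.items()
}

for pattern, flags in results.items():
    print(pattern, flags)
```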

   Anchoring
     As for simple regular expressions.

   Subexpressions and references
     Subexpressions and references exist only for simple regular
     expressions.

   Grouping, alternatives

     ______________________________________________________________________
    | Expression    |  Meaning                                            |
    |_______________|_____________________________________________________|
    | (rx)          |  The regular expressions r and x are combined in a  |
    |               |  group that matches all character strings matching  |
    |               |  the regular expression rx.                         |
    |               |                                                     |
    |               |  Example:                                           |
    |               |                                                     |
    |               |  (ok(abc)) matches okabc                            |
    |               |                                                     |
    |               |  (au)* matches nothing or au, auau, ...             |
    |_______________|_____________________________________________________|
    | (r1|r2)       |  Character strings that match the regular expression|
    |               |  r1 or r2.                                          |
    |               |                                                     |
    |               |  Example: (ok|ko) matches ok or ko                  |
    |_______________|_____________________________________________________|
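
     A short sketch of grouping and alternatives follows, again assuming
     Python's re module, which uses the same ( ) and | notation as the
     extended form. The helper name full is hypothetical.

```python
import re

def full(pattern, text):
    """True if the pattern matches the whole of text."""
    return re.fullmatch(pattern, text) is not None

# Alternatives: either branch may match.
print(full(r"(ok|ko)", "ok"), full(r"(ok|ko)", "ko"), full(r"(ok|ko)", "kk"))

# A group can be repeated as a unit.
print(full(r"(au)*", ""), full(r"(au)*", "auau"), full(r"(au)*", "aua"))

# Groups can be nested.
print(full(r"(ok(abc))", "okabc"))
```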

PRECEDENCE
     The following tables show the precedence of operators in regular
     expressions. The operators are listed in descending order, from
     highest precedence to lowest precedence.

   Precedence for simple regular expressions

     ______________________________________________________________________
    | Symbols from the internationalized environment |  [= =] [: :] [. .] |
    |________________________________________________|____________________|
    | Quoting characters                             |  \character        |
    |________________________________________________|____________________|
    | Bracketed expressions                          |  [ ]               |
    |________________________________________________|____________________|
    | Subexpressions, references                     |  \( \) \n          |
    |________________________________________________|____________________|
    | Repeat                                         |  * \{m,n\}         |
    |________________________________________________|____________________|
    | Concatenation                                  |  rx                |
    |________________________________________________|____________________|
    | Anchoring                                      |  ^ $               |
    |________________________________________________|____________________|

   Precedence for extended regular expressions

     ______________________________________________________________________
    | Symbols from the internationalized environment |  [= =] [: :] [. .] |
    |________________________________________________|____________________|
    | Quoting characters                             |  \character        |
    |________________________________________________|____________________|
    | Bracketed expressions                          |  [ ]               |
    |________________________________________________|____________________|
    | Grouping                                       |  ( )               |
    |________________________________________________|____________________|
    | Repeat                                         |  * ? + {m,n}       |
    |________________________________________________|____________________|
    | Concatenation                                  |  rx                |
    |________________________________________________|____________________|
    | Anchoring                                      |  ^ $               |
    |________________________________________________|____________________|
    | Alternatives                                   |  |                 |
    |________________________________________________|____________________|
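
     The effect of the precedence order can be seen in a few contrasting
     patterns. The sketch assumes Python's re module, which agrees with
     the table above for the operators used here; the helper name full
     is hypothetical.

```python
import re

def full(pattern, text):
    """True if the pattern matches the whole of text."""
    return re.fullmatch(pattern, text) is not None

# Repeat binds more tightly than concatenation: ab* means a(b*).
print(full(r"ab*", "abbb"), full(r"ab*", "abab"))     # True False
print(full(r"(ab)*", "abab"))                         # True

# Concatenation binds more tightly than alternatives: ab|cd means (ab)|(cd).
print(full(r"ab|cd", "cd"), full(r"ab|cd", "abd"))    # True False
print(full(r"a(b|c)d", "abd"))                        # True

# Anchoring binds more tightly than alternatives: ^a|b$ means (^a)|(b$).
print(re.search(r"^a|b$", "xb") is not None)          # True
print(re.search(r"^(a|b)$", "xb") is not None)        # False
```

     Grouping with ( ) is the way to override the default order in each
     case.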

   Commands with regular expressions
     The following table is an overview of the commands that process regu-
     lar expressions.

     _________________________________________
    | Command |  Type of regular expressions |
    |_________|______________________________|
    | apropos |  extended                    |
    |_________|______________________________|
    | awk     |  extended internationalized  |
    |_________|______________________________|
    | bfs     |  simple                      |
    |_________|______________________________|
    | csplit  |  simple                      |
    |_________|______________________________|
    | e       |  simple                      |
    |_________|______________________________|
    | ed      |  simple                      |
    |_________|______________________________|
    | egrep   |  extended                    |
    |_________|______________________________|
    | ex      |  simple                      |
    |_________|______________________________|
    | expr    |  simple                      |
    |_________|______________________________|
    | extract |  simple                      |
    |_________|______________________________|
    | findman |  extended                    |
    |_________|______________________________|
    | grep    |  simple                      |
    |_________|______________________________|
    | lex     |  extended                    |
    |_________|______________________________|
    | man     |  simple                      |
    |_________|______________________________|
    | nl      |  simple                      |
    |_________|______________________________|
    | pg      |  simple                      |
    |_________|______________________________|
    | sed     |  simple                      |
    |_________|______________________________|
    | vi      |  simple                      |
    |_________|______________________________|
    | whatis  |  extended                    |
    |_________|______________________________|

LOCALE
     In bracketed regular expressions the LC_COLLATE environment variable
     determines the meaning of metacharacters, equivalence classes, and
     collating elements, while the LC_CTYPE environment variable determines
     the meaning of character classes.

     If LC_COLLATE or LC_CTYPE is undefined or defined as a null string,
     the value of LANG is taken as the default value for the unset or empty
     variable. If LANG is also undefined or defined as a null string, the
     system behaves as if it has not been internationalized.

     If one of the variables has a value that is invalid for the interna-
     tionalized environment, the system behaves as if no variables have
     been set.

     The LC_ALL environment variable determines the entire international-
     ized environment. LC_ALL takes precedence over all other environment
     variables in the area of internationalization.

SEE ALSO
     awk(1), bfs(1), csplit(1), e(1), ed(1), egrep(1), ex(1), expr(1),
     extract(1), grep(1), lex(1), man(1), nl(1), pg(1), regcmp(1), sed(1),
     vi(1), regex(3), regcomp(3C), regcmp(3G), regexpr(3G), regex(5),
     regexp(5).