⇒ egrep(1) — UnixWare 2.01

       grep(1)                                                      grep(1)


       NAME
             grep, egrep - search a file for a pattern

       SYNOPSIS
             grep [-E|-F] [-c|-l|-q] [-bhinsvx] expression [file ...]
             grep [-E|-F] [-c|-l|-q] [-bhinsvx] -e expression ...
                     [-f exprfile] ... [file ...]
             grep [-E|-F] [-c|-l|-q] [-bhinsvx] [-e expression] ...
                     -f exprfile ... [file ...]
             egrep [-c|-l|-q] [-bhinsvx] expression [file ...]
             egrep [-c|-l|-q] [-bhinsvx] -e expression ...
                     [-f exprfile] ... [file ...]
             egrep [-c|-l|-q] [-bhinsvx] [-e expression] ...
                     -f exprfile ... [file ...]

       DESCRIPTION
             grep and egrep search files for patterns and print all lines
             that contain a match to at least one of the patterns (in
             expression and exprfile).  By default grep uses basic regular
             expressions (see below for details on regular expressions).
             If the -E or -F option is specified, grep behaves like egrep
             or fgrep, respectively; see "Options" below.

             Be careful using the characters $, *, [, ^, |, (, ), and \ in
             the expression because they are also meaningful to the shell.
             It is safest to enclose the entire expression in single quotes
             '...' or put the expression in an exprfile.  A null pattern
             matches all lines.

             If no files are specified, grep and egrep assume standard
             input.  If a ``-'' is specified as a file, standard input is
             used.  Normally, each line matched is copied to standard
             output.  The filename is printed before each line matched if
             there is more than one input file, unless the -h option is
             specified.
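
             The quoting and input rules above can be sketched as follows.
             This is only an illustration, assuming a POSIX-style sh, a
             grep on the PATH, and the mktemp utility; the file names
             notes1.txt and notes2.txt are hypothetical and are created in
             a scratch directory.

```shell
# Work in a scratch directory so the hypothetical sample files clobber nothing.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf 'price is $5\nprice is 5\n' > notes1.txt
printf 'total $5 due\n' > notes2.txt

# Single quotes stop the shell from expanding $5; the backslash makes $ literal for grep.
one_file=$(grep '\$5' notes1.txt)

# With more than one input file, each matched line is preceded by its file name.
two_files=$(grep '\$5' notes1.txt notes2.txt)

# -h suppresses the file name prefix.
no_names=$(grep -h '\$5' notes1.txt notes2.txt)

# A file operand of - means standard input.
from_stdin=$(grep '\$5' - < notes2.txt)

printf '%s\n' "$one_file" "$two_files" "$no_names" "$from_stdin"
```

             Putting the pattern in single quotes is the simplest way to
             keep the shell from rewriting it before grep sees it.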

          Options
             -E    Behave like egrep.  All specified patterns (in
                   expression and exprfile) are then full regular
                   expressions.  When this option is specified, all other
                   grep options (except -F) have the same effect as usual,
                   and the same effect as they have for egrep.

             -F    Behave like fgrep.  All specified patterns (in
                   expression and exprfile) are then fixed strings.  When
                   this option is specified, all other grep options (except


                           Copyright 1994 Novell, Inc.               Page 1

                  -E) have the same effect as usual, and the same effect
                  as they have for fgrep(1).

            -b    Precede each line by the block number on which it was
                  found.  This can be useful in locating block numbers by
                  context (first block is 0).

            -c    Print only a count of the lines that match the patterns.

            -e expression
                  Specify one or more patterns (regular expressions or
                  strings) to be used during the search for input.  The
                  patterns in expression are separated by newline
                  characters.  Two adjacent newlines indicate a null
                  pattern.  The last pattern does not require a
                  terminating newline.  When multiple -e or -f options are
                  specified, all the patterns specified will be used.
                  (Obviously, if expression is to contain newlines, it
                  should be quoted.)

                  This option is useful for specifying patterns that begin
                  with a ``-''.

            -f exprfile
                  Read one or more patterns (regular expressions or
                  strings) from exprfile.  The patterns in exprfile are
                  terminated by a newline character.  An empty line in
                  exprfile indicates a null pattern.  When multiple -e or
                  -f options are specified, all the patterns specified
                  will be used.

            -h    Suppress printing of filenames when searching multiple
                  files.

            -i    Ignore uppercase/lowercase distinction during
                  comparisons, as defined by the character classification
                  locale [see LANG etc., on environ(5)].

            -l    Print the names of files with matching lines, one per
                  line.  Does not repeat a file name even if multiple
                  matches are present.  If the input file is stdin, then a
                  name such as ``(standard input)'' will be written,
                  depending upon the message locale.

             -n    Precede each line by its line number in the file (first
                   line is 1).

             -q    Quiet, do not write anything to the standard output,
                   regardless of any matches.  Exits with zero if any input
                   line is matched.

             -s    Suppress error messages about nonexistent or unreadable
                   files.

             -v    Print all lines except those that contain a pattern.

             -x    Match only lines for which the pattern matches the
                   entire line.  For character strings, the pattern must
                   match all characters in the line.  For regular
                   expressions, this option is equivalent to placing a
                   ``^'' at the start of the pattern, and a ``$'' at the
                   end of the pattern.
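
             A few of the options above, sketched on a small hypothetical
             input held in a shell variable.  This assumes a POSIX-style sh
             and grep; the variable names are illustrative only.

```shell
# Hypothetical sample input, fed to grep on standard input.
sample='apple
Apple pie
banana
-dash
pineapple'

count=$(printf '%s\n' "$sample" | grep -c 'apple')      # lines containing apple: 2
numbered=$(printf '%s\n' "$sample" | grep -n 'banana')  # line number prefix: 3:banana
nocase=$(printf '%s\n' "$sample" | grep -ci 'apple')    # -i folds case: 3
whole=$(printf '%s\n' "$sample" | grep -cx 'apple')     # -x needs the entire line to match: 1
inverted=$(printf '%s\n' "$sample" | grep -cv 'apple')  # -v counts lines without a match: 3
dash=$(printf '%s\n' "$sample" | grep -e '-dash')       # -e protects a pattern that begins with -
either=$(printf '%s\n' "$sample" | grep -c -e 'banana' -e 'pie')  # all -e patterns are used: 2

# -q writes nothing; only the exit status says whether any line matched.
if printf '%s\n' "$sample" | grep -q 'banana'; then found=yes; else found=no; fi

printf '%s\n' "$count" "$numbered" "$nocase" "$whole" "$inverted" "$dash" "$either" "$found"
```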

          Regular Expressions
             Regular expressions (REs) enable you to select specific
             strings from a set of character strings.

             REs are a context-independent syntax representing a variety of
             character sets and character set orderings.  These character
             sets are interpreted according to the current locale.  While
             many REs can be interpreted differently depending on the
             current locale, many features (such as character class
             expressions) provide for contextual invariance across locales.

             Basic Regular Expressions (BREs) are supported by default by
             grep.  A slightly different notation, called Extended Regular
             Expressions (EREs), is supported by grep -E (or egrep).  The
             following applies to both BREs and EREs.

             Matching is based on the bit pattern used for encoding the
             character, not on the graphic representation of the character.

             Searches for a matching sequence start at the beginning of a
             string and stop when the first sequence matching the
             expression is found.  If the pattern allows a variable number
             of matching characters (and there is more than one such
             sequence starting at that point) then the longest sequence is
             matched.

            Consistent with the whole match being the longest of the
            leftmost matches, each subpattern, from left to right, matches
            the longest possible string. For this purpose, a null string
            is considered to be longer than no match at all.  For example,
            matching the BRE \(.*\).* against abcdef, the subexpression
            (\1) is abcdef, and matching the BRE \(a*\)* against bc, the
            subexpression (\1) is the null string.
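
            grep itself does not print what a subexpression captured, so
            the sketch below borrows sed(1), which is not described on this
            page, purely to display \1.  It assumes a sed that uses the
            same BRE rules; treat it as an illustration, not as part of
            grep.

```shell
# \(.*\) takes the longest possible string first, so \1 is the whole input
# and the trailing .* is left with the null string.
first=$(printf 'abcdef\n' | sed 's/\(.*\).*/[\1]/')

# Against bc, \(a*\)* can only match the null string at the start of the line,
# so per the rule above \1 is expected to be empty between the brackets.
second=$(printf 'bc\n' | sed 's/\(a*\)*/[\1]/')

printf '%s\n' "$first" "$second"
```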

         Basic Regular Expressions
            For BREs, an ordinary character, a special character preceded
            by a backslash, or a period matches a single character.  A
            bracket expression matches a single character or collating
            element.

            An ordinary character is a BRE that matches itself (that is,
            any character in the supported character set, except for the
            BRE special characters listed below).

            The interpretation of an ordinary character preceded by a
            backslash (\) is undefined, except for the characters ), (, {,
            and }, the numbers 1 through 9, and a character inside a
            bracket expression.

            In certain contexts, a BRE special character has special
            properties.  The BRE special characters and the contexts in
            which they have their special meaning are:

            1.    The period (.), left bracket ([), and backslash (\) are
                  special except when used in a bracket expression.  If an
                  expression contains a left bracket not preceded by a
                  backslash (and that is not part of a bracket
                  expression), it will yield undefined results.

            2.    The asterisk (*) is special except when used in a
                  bracket expression, as the first character of an entire
                  BRE (after an anchor circumflex, if any), or as the
                  first character of a subexpression (after an anchor
                  circumflex, if any).

            3.    The circumflex (^) is special when used as an anchor or
                  as the first character in a bracket expression.

            4.    The dollar sign ($) is special when used as an anchor.

             If a period (.) is used outside a bracket expression, then it
             is a BRE matching any character in the supported character
             set, except NUL.

             A bracket expression (that is, an expression enclosed in
             square brackets, []) is an RE that matches a single collating
             element contained in the nonempty set of collating elements
             the bracket expression represents.  The following rules and
             definitions apply:

             1.    A bracket expression is a matching or nonmatching list
                   expression.  It consists of one or more expressions.
                   These include collating elements, collating symbols,
                   equivalence classes, character classes, or range
                   expressions.  The right bracket (]) loses its special
                   meaning and represents itself in a bracket expression if
                   it occurs first in the list (after an initial
                   circumflex, if any).  Otherwise, it terminates the
                   bracket expression unless it appears as part of a
                   collating symbol, equivalence class, or character class
                   construct (such as [.].] and [=a=]).  The special
                   characters period (.), asterisk (*), left bracket ([),
                   and backslash (\) lose their special meaning within a
                   bracket expression.

                   The [., [=, and [: character sequences are special
                   inside a bracket expression and are used to delimit
                   collating symbols, equivalence class expressions, and
                   character class constructs. These character sequences
                   are followed by a character sequence and the matching
                   terminating sequence .], =], or :].

             2.    A matching list expression specifies a list that matches
                   any one of the expressions represented in the list.  The
                   first character in the list cannot be the circumflex.
                   For example, [abc] is an RE that matches any of a, b, or
                   c.

             3.    A nonmatching list expression begins with a circumflex
                   and specifies a list that matches any character or
                   collating element except for the expressions represented
                   in the list after the leading circumflex.  For example,
                   [^abc] is an RE that matches any character or collating
                   element except a, b, or c.  The circumflex has this
                   special meaning only when it occurs first in the list,
                   immediately following the left bracket.

            4.    A collating symbol is a collating element enclosed
                  within bracket-period ([. .]) delimiters. Multiple-
                  character collating elements are represented as
                  collating symbols when it is necessary to distinguish
                  them from a list of the individual characters that make
                  up the multiple-character collating element.  For
                  example, if the string ch is a two-character collating
                  element in the current collation sequence with the
                  associated collating symbol <ch>, the expression
                  [[.ch.]] is treated as an RE matching the character
                  sequence ch, while [ch] is treated as an RE matching the
                  character c or h.  Collating symbols are recognized only
                  inside bracket expressions.  This implies that the RE
                  [[.ch.]]*c matches the first through fifth characters in
                  the string chchch.  If the string is not a collating
                  element in the current collating sequence definition, or
                  if the collating element has no characters associated
                  with it, the symbol is treated as an invalid expression.

            5.    An equivalence class expression represents the set of
                  collating elements belonging to an equivalence class, as
                  defined by the collation portion of the current locale.
                  Only primary equivalence classes are recognized. The
                  class is expressed by enclosing any one of the collating
                  elements in the equivalence class within bracket-equal
                  ([= =]) delimiters.  For example, if a, à, and â form an
                  equivalence class, then [[=a=]b], [[=à=]b], and [[=â=]b]
                  are each equivalent to [aàâb].  If the collating element
                  does not belong to an equivalence class, the equivalence
                  class expression is treated as a collating symbol.

            6.    A character class represents the set of characters
                  belonging to a character class, as defined in the
                  character classification portion of the current locale.
                  All character classes specified in the current locale
                  are recognized.  A character class expression can be
                  expressed as a character class name enclosed within
                  bracket-colon ([: :]) delimiters.

                  These are supported on all conforming implementations:

                              [:alnum:]   [:cntrl:]   [:lower:]   [:space:]
                              [:alpha:]   [:digit:]   [:print:]   [:upper:]
                              [:blank:]   [:graph:]   [:punct:]   [:xdigit:]

                   Other, locale-dependent character classes may also be
                   recognized.

             7.    A range expression represents the set of collating
                   elements that fall between two elements in the current
                   collation sequence.  It is expressed as the starting
                   point and the ending point separated by a hyphen.

                   Range expressions cannot be used portably because their
                   behavior depends on the collating sequence order defined
                   by the current locale.

                   In the following, all examples assume the collation
                   sequence specified for the default locale, unless
                   another collation sequence is specifically defined.

                   The starting range point and the ending range point must
                   each be a collating element or collating symbol.  An
                   equivalence class expression used as a starting or
                   ending point of a range expression produces unspecified
                   results.  The ending range point must collate equal to
                   or higher than the starting range point; otherwise, the
                   expression is treated as invalid.  The order used is the
                   order in which the collating elements are specified in
                   the current locale's collation definition.  One-to-many
                   mappings are not performed.  For example, assuming that
                   the character eszet (ß) is placed in the collation
                   sequence after r and s but before t (and that it maps to
                   the sequence ss for collation purposes), then the
                   expression [r-s] matches only r and s, but the
                   expression [s-t] matches s, ß, or t.

                   The interpretation of range expressions where the ending
                   range point is also the starting range point of a
                   subsequent range expression is undefined.

                   The hyphen character is treated as itself if it occurs
                   first (after an initial circumflex, if any) or last in
                   the list, or as an ending range point in a range
                   expression.  As examples, the expressions [-ac] and
                   [ac-] are equivalent and match any of the characters a,
                   c, or -; the expressions [^-ac] and [^ac-] are
                   equivalent and match any characters except a, c, or -;
                   [%--] matches any of the characters between % and -
                   inclusive; the expression [--@] matches any of the
                   characters between - and @ inclusive, and the expression
                  [a--@] is invalid because the letter a follows the
                  symbol - in the default locale.  To use the hyphen as
                  the starting range point, either place it first in the
                  bracket expression or specify it as a collating symbol.
                  For example, [][.-.]-0] matches either a right bracket
                  or any character or collating element that collates
                  between hyphen and 0, inclusive.
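
            Some of the bracket expression rules above, sketched with grep
            -c on a hypothetical six-line input.  The locale is forced to C
            here as an assumption, so that character classes behave the
            same from system to system; collating symbols and equivalence
            classes are left out because their results depend on the
            locale definitions installed.

```shell
LC_ALL=C; export LC_ALL   # assumption: pin the locale for predictable classes

# Hypothetical input: one character per line.
sample='a
d
7
]
-
Z'

in_list=$(printf '%s\n' "$sample" | grep -c '[abc]')            # matching list: only a
not_in_list=$(printf '%s\n' "$sample" | grep -c '[^abc]')       # nonmatching list: the other five lines
digits=$(printf '%s\n' "$sample" | grep -c '[[:digit:]]')       # character class: 7
uppers=$(printf '%s\n' "$sample" | grep -c '[[:upper:]]')       # character class: Z
bracket_first=$(printf '%s\n' "$sample" | grep -c '[]a]')       # ] first in the list is literal: ] and a
hyphen_last=$(printf '%s\n' "$sample" | grep -c '[a-]')         # - last in the list is literal: a and -
hyphen_first=$(printf '%s\n' "$sample" | grep -c '[-a]')        # - first in the list is literal too

printf '%s\n' "$in_list" "$not_in_list" "$digits" "$uppers" \
    "$bracket_first" "$hyphen_last" "$hyphen_first"
```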

            The following rules can be used to construct BREs matching
            multiple characters from BREs matching a single character.

            1.    The concatenation of BREs matches the concatenation of
                  the strings matched by each component of the BRE.

            2.    A subexpression can be defined within a BRE by enclosing
                  it between the character pairs \( and \).  Such a
                  subexpression matches whatever it would have matched
                  without the \( and \), except that anchoring within
                  subexpressions is optional behavior.  Subexpressions can
                  be arbitrarily nested.

            3.    The backreference expression \n matches the same
                  (possibly empty) string of characters as was matched by
                  a subexpression enclosed between \( and \) preceding the
                  \n.  The character n is a single digit from 1 through 9,
                  specifying the n-th subexpression [the one that begins
                  with the n-th \( and ends with the corresponding paired
                  \)].  The expression is invalid if fewer than n
                  subexpressions precede the \n.  For example, the
                  expression ^\(.*\)\1$ matches a line entirely consisting
                  of two adjacent appearances of the same string, and the
                  expression \(a\)*\1 fails to match a.

            4.    When a BRE matching a single character, a subexpression,
                  or a backreference is followed by the special character
                  asterisk, it matches (together with that asterisk) what
                  zero or more consecutive occurrences of the BRE would
                  match.  For example, [ab]* and [ab][ab] are equivalent
                  when matching the string ab.

            5.    When a BRE matching a single character, a subexpression,
                  or a backreference is followed by an interval expression
                  of the format \{m\}, \{m,\}, or \{m,n\}, it matches
                  (together with that interval expression) what repeated
                  consecutive occurrences of the BRE would match.  The
                  values of m and n are decimal integers in the range
                   0 ≤ m ≤ n ≤ {RE_DUP_MAX}, where m specifies the exact or
                   minimum number of occurrences and n specifies the
                   maximum number of occurrences.  The expression \{m\}
                   matches exactly m occurrences of the preceding BRE,
                   \{m,\} matches at least m occurrences, and \{m,n\}
                   matches any number of occurrences between m and n,
                   inclusive.

                   For example, in the string abababccccccd, the BRE c\{3\}
                   is matched by characters seven through nine, the BRE
                   \(ab\)\{4,\} is not matched at all, and the BRE
                   c\{1,3\}d is matched by characters ten through thirteen.

                   An occurrence of multiple adjacent duplication symbols
                   (* and intervals) produces undefined results.
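
             The subexpression, backreference, and interval rules above can
             be tried with the document's own sample string.  This is a
             sketch assuming a grep that implements BRE intervals and
             backreferences as described; the variable names are
             illustrative.

```shell
s='abababccccccd'

# Interval: the string contains three consecutive c characters, so the line matches.
three_c=$(printf '%s\n' "$s" | grep -c 'c\{3\}')

# At least four consecutive ab pairs: the string has only three, so no match.
# grep exits nonzero when nothing matches; || true keeps a strict shell going.
four_ab=$(printf '%s\n' "$s" | grep -c '\(ab\)\{4,\}' || true)

# One to three c characters followed by d.
tail_hit=$(printf '%s\n' "$s" | grep -c 'c\{1,3\}d')

# Backreference: only lines made of the same string twice in a row match.
doubled=$(printf 'abab\nabcd\n' | grep '^\(.*\)\1$')

printf '%s\n' "$three_c" "$four_ab" "$tail_hit" "$doubled"
```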

             The BRE order of precedence, from high to low, is shown in the
             following table:
                 Collation-related bracket symbols   [= =] [: :] [. .]
                 Escaped characters                  \special character
                 Bracket expression                  [ ]
                 Subexpressions/backreference        \(\)  \n
                 BRE duplication                     * \{m,n\}
                 Concatenation
                 Anchoring                           ^ $
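
             One consequence of the table: duplication binds more tightly
             than concatenation, so an asterisk applies only to the single
             BRE before it unless a subexpression groups more.  A minimal
             sketch, using -x so the whole line must match:

```shell
# ab* is a followed by zero or more b characters.
star_binds=$(printf 'abbb\nabab\n' | grep -x 'ab*')

# \(ab\)* repeats the whole subexpression ab.
group_binds=$(printf 'abbb\nabab\n' | grep -x '\(ab\)*')

printf '%s\n' "$star_binds" "$group_binds"
```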

             A BRE can be limited to matching strings that begin or end a
             line; this is called anchoring.  The circumflex and dollar
             sign special characters are considered BRE anchors in the
             following contexts:

             1.    A circumflex is an anchor when used as the first
                   character of an entire BRE.  The implementation may
                   treat the circumflex as an anchor when used as the first
                   character of a subexpression.  The circumflex anchors
                   the expression (or optionally, the subexpression) to the
                   beginning of a string; only sequences starting at the
                   first character of a string are matched by the BRE.  For
                   example, the BRE ^ab matches ab in the string abcdef,
                   but fails to match in the string cdefab.  The BRE
                   \(^ab\) may match the former string.  A portable BRE
                   escapes a leading circumflex in a subexpression to match
                   a literal circumflex.

            2.    A dollar sign is an anchor when used as the last
                  character of an entire BRE.  The implementation may
                  treat a dollar sign as an anchor when used as the last
                  character of a subexpression.  The dollar sign anchors
                  the expression (or optionally, the subexpression) to the
                  end of the string being matched; the dollar sign can be
                  said to match the "end-of-string" following the last
                  character.

            3.    A BRE anchored by both ^ and $ matches only an entire
                  string.  For example, the BRE ^abcdef$ matches strings
                  consisting only of abcdef.
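
            The anchoring rules can be checked with the strings used in
            the examples above.  A minimal sketch, assuming a POSIX-style
            grep; the extra input lines abcdefg and xabcdef are
            hypothetical.

```shell
# ^ab matches only where the line begins with ab.
start_hit=$(printf 'abcdef\ncdefab\n' | grep '^ab')

# ab$ matches only where the line ends with ab.
end_hit=$(printf 'abcdef\ncdefab\n' | grep 'ab$')

# Anchored at both ends, the BRE must account for the entire line.
exact=$(printf 'abcdef\nabcdefg\nxabcdef\n' | grep -c '^abcdef$')

# The -x option gives the same result without writing the anchors.
exact_x=$(printf 'abcdef\nabcdefg\nxabcdef\n' | grep -cx 'abcdef')

printf '%s\n' "$start_hit" "$end_hit" "$exact" "$exact_x"
```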

         Extended Regular Expressions
            An ERE ordinary character, a special character preceded by a
            backslash, or a period matches a single character.  A bracket
            expression matches a single character or a single collating
            element.  An ERE matching a single character enclosed in
            parentheses matches the same way an ERE without parentheses
            would have matched.

            An ordinary character is an ERE that matches itself.  An
            ordinary character is any character in the supported character
            set, except for the ERE special characters listed below. The
            interpretation of an ordinary character preceded by a
            backslash is undefined.

            An ERE special character has special properties in certain
            contexts.  Outside those contexts, or when preceded by a
            backslash, such a character is an ERE that matches the special
            character itself.  The ERE special characters and the contexts
            in which they have their special meanings are defined as
            follows:

            1.    The period (.), left bracket ([), backslash (\) and left
                  parenthesis [(] are special except when used in a
                  bracket expression.  Outside a bracket expression, a
                  left parenthesis immediately followed by a right
                  parenthesis produces undefined results.

            2.    The right parenthesis [)] is special when matched with a
                  preceding left parenthesis, both outside a bracket
                  expression.

             3.    The asterisk (*), plus sign (+), question mark (?), and
                   left brace ({) are special except when used in a bracket
                   expression.  Any of the following uses produces undefined
                   results:

                         These characters appear first in an ERE or
                         immediately following a vertical line, circumflex,
                         or left parenthesis.

                         A left brace is not part of a valid interval
                         expression.

             4.    The vertical line (|) is special except when used in a
                   bracket expression.  A vertical line appearing first or
                   last in an ERE, immediately following a vertical line or
                   left parenthesis, or immediately preceding a right
                   parenthesis produces undefined results.

             5.    The circumflex (^) is special when used as an anchor or
                   as the first character of a bracket expression.

             6.    The dollar sign ($) is special when used as an anchor.

             A period (.), when used outside a bracket expression, is an
             ERE that matches any character in the supported character set
             except NUL.

             The rules for ERE bracket expressions are the same as for BRE
             bracket expressions.

             The following rules are used to construct EREs matching
             multiple characters from EREs matching a single character:

             1.    A concatenation of EREs matches the concatenation of the
                   character sequences matched by each component of the
                   ERE.  A concatenation of EREs enclosed in parentheses
                   matches whatever the concatenation without the
                   parentheses matches.  For example, both the ERE cd and
                   the ERE (cd) are matched by the third and fourth
                   characters of the string abcdefabcdef.

             2.    When an ERE matching a single character or an ERE
                   enclosed in parentheses is followed by the special
                   character plus sign (+), it matches (together with the
                   plus sign) what one or more consecutive occurrences of
                   the ERE would match.  For example, the ERE b+(bc)
                  matches the fourth through seventh characters in the
                  string acabbbcde.  Furthermore, [ab]+ and [ab][ab]* are
                  equivalent.

            3.    When an ERE matching a single character or an ERE
                  enclosed in parentheses is followed by the special
                  character asterisk (*),  it matches (together with that
                  asterisk) what zero or more consecutive occurrences of
                  the ERE would match.  For example, the ERE b*c matches
                  the first character in the string cabbbcde and the ERE
                  b*cd matches the third through seventh characters in the
                  string cabbbcdebbbbbbcdbc.  Furthermore, [ab]* and
                  [ab][ab] are equivalent when matching the string ab.

            4.    When an ERE matching a single character or an ERE
                  enclosed in parentheses is followed by the special
                  character question mark (?), it matches (together with
                  that question mark) what zero or one consecutive
                  occurrences of the ERE would match.  For example, the
                  ERE b?c matches the second character in the string
                  acabbbcde.

            5.    When an ERE matching a single character or an ERE
                  enclosed in parentheses is followed by an interval
                  expression of the format {m}, {m,}, or {m,n}, it matches
                  (together with that interval expression) what repeated
                  consecutive occurrences of the ERE would match.  The
                  values of m and n are decimal integers in the range
                  0 ≤ m ≤ n ≤ {RE_DUP_MAX}, where m specifies the exact or
                  minimum number of occurrences and n specifies the
                  maximum number of occurrences.  The expression {m}
                  matches exactly m occurrences of the preceding ERE, {m,}
                  matches at least m occurrences, and {m,n} matches any
                  number of occurrences between m and n, inclusive.

                  For example, in the string abababccccccd the ERE c{3} is
                  matched by characters seven through nine, and the ERE
                  (ab){2,} is matched by characters one through six.

            An occurrence of multiple adjacent duplication symbols (+, *,
            ?, and intervals) produces undefined results.
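
            The duplication examples above can be tried from the shell.  The
            following is a minimal sketch, assuming a grep that accepts -E
            and interval expressions; the helper name matches is
            hypothetical.  Only the exit status is examined, so the sketch
            shows whether a string contains a match, not which characters
            were matched.

```shell
# matches ERE STRING: succeed if STRING contains a match for ERE.
# (Hypothetical helper; output is discarded, only the exit status is used.)
matches() {
    printf '%s\n' "$2" | grep -E -e "$1" >/dev/null
}

matches 'b+(bc)'   'acabbbcde'     && echo 'b+(bc): match'
matches 'b*c'      'cabbbcde'      && echo 'b*c: match'
matches 'b?c'      'acabbbcde'     && echo 'b?c: match'
matches 'c{3}'     'abababccccccd' && echo 'c{3}: match'
matches '(ab){2,}' 'abababccccccd' && echo '(ab){2,}: match'
matches '(ab){4}'  'abababccccccd' || echo '(ab){4}: no match'

# With -x the whole line must match, which shows that {m,n} is inclusive:
# bb, bbb and bbbb are counted, bbbbb is not.
printf 'bb\nbbb\nbbbb\nbbbbb\n' | grep -E -x -c 'b{2,4}'    # prints 3
```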

            Two EREs separated by the special character vertical line (|)
            match a string that is matched by either.  For example, the
            ERE a((bc)|d) matches the string abc and the string ad. Single
            characters, or expressions matching single characters,
            separated by the vertical line and enclosed in parentheses,
            are treated as an ERE matching a single character.
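
            As an illustration of alternation, the sketch below feeds a few
            candidate strings to grep, one per line.  It assumes a grep
            that accepts -E; the function name alt_demo and the sample
            strings other than abc and ad are made up for the example.

```shell
# Each input line is one candidate string; -x requires that the whole
# line match the expression.
alt_demo() {
    printf 'abc\nad\nab\nabcd\n' | grep -E -x 'a((bc)|d)'
}
alt_demo    # prints abc and ad

# Single characters separated by | inside parentheses select the same
# lines as the bracket expression [abc]; each command prints 3.
printf 'a\nb\nc\nd\n' | grep -E -x -c '(a|b|c)'
printf 'a\nb\nc\nd\n' | grep -E -x -c '[abc]'
```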

             The ERE order of precedence, from high to low, is shown in the
             following table.
                 Collation-related bracket symbols   [= =] [:  :] [. .]
                 Escaped characters                  \special character
                 Bracket expression                  []
                 Grouping                            ()
                 Single-character ERE duplication    * + ? {m,n}
                 Concatenation
                 Anchoring                           ^ $
                 Alternation                         |
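
             The effect of the precedence order can be seen with a few
             short expressions.  This is an illustrative sketch, assuming
             a grep that accepts -E; the input strings are invented for
             the example, and -x is used so that the whole line must
             match.

```shell
# Concatenation binds more tightly than alternation:
# ab|cd means (ab)|(cd), not a(b|c)d.
printf 'ab\ncd\nabd\nacd\n' | grep -E -x 'ab|cd'    # prints ab and cd

# Duplication binds more tightly than concatenation:
# ab* means a(b*), not (ab)*.
printf 'a\nabb\nabab\n' | grep -E -x 'ab*'          # prints a and abb
printf 'a\nabb\nabab\n' | grep -E -x '(ab)*'        # prints abab

# Anchoring binds more tightly than alternation:
# ^a|b$ means (^a)|(b$).
printf 'ax\nxb\nxax\n' | grep -E '^a|b$'            # prints ax and xb
```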

             An ERE can be limited to matching strings that begin or end a
             line; this is called anchoring.  The circumflex and dollar-sign
             special characters are considered ERE anchors when used
             anywhere outside a bracket expression.  This has the following
             effects:

             1.    A circumflex outside a bracket expression anchors the
                   (sub)expression it begins to the beginning of a string.
                   Such a (sub)expression can match only a sequence
                   starting at the first character of a string.  For
                   example, the EREs ^ab and (^ab) match ab in the string
                   abcdef but fail to match in the string cdefab, and the ERE
                   a^b is valid, but can never match because the a prevents
                   the expression ^b from matching, starting at the first
                   character.

             2.    A dollar sign outside a bracket expression anchors the
                   (sub)expression it ends to the end of a string; such a
                   (sub)expression can match only a sequence ending at the
                   last character of a string.  For example, the EREs ef$
                   and (ef$) match ef in the string abcdef, but fail to
                   match in the string cdefab, and the ERE e$f is valid,
                   but can never match because the f prevents the
                   expression e$ from matching, ending at the last
                   character.
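
             The anchoring examples above can be reproduced as follows.
             This is a sketch under the assumption that grep accepts -E;
             the function name sample is hypothetical and simply supplies
             the two strings used in the text, one per line.

```shell
# The two strings from the examples, one per line.
sample() { printf 'abcdef\ncdefab\n'; }

sample | grep -E '^ab'      # prints abcdef
sample | grep -E '(^ab)'    # prints abcdef
sample | grep -E 'ef$'      # prints abcdef
sample | grep -E '(ef$)'    # prints abcdef

# a^b and e$f match nothing here, so with -v (select lines that do not
# match) every line is counted; each command prints 2.
sample | grep -E -v -c 'a^b'
sample | grep -E -v -c 'e$f'
```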

          Errors
             grep exits with status 0 if any matches are found, 1 if none
             are found, and 2 for syntax errors or inaccessible files (even
             if matches were found).
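
             A shell script can branch on these values.  The following is
             a minimal sketch; the helper name grep_status and the scratch
             file name are hypothetical, and the file contents are made up
             for the example.

```shell
# grep_status PATTERN FILE: print the exit status of grep.
# (Hypothetical helper; output and diagnostics are discarded.)
grep_status() {
    status=0
    grep -e "$1" "$2" >/dev/null 2>&1 || status=$?
    echo "$status"
}

demo=${TMPDIR:-/tmp}/grep_demo.$$      # scratch file, name is arbitrary
printf 'alpha\nbeta\n' > "$demo"

grep_status 'alpha' "$demo"            # prints 0: a match was found
grep_status 'gamma' "$demo"            # prints 1: no match
grep_status 'alpha' "$demo.missing"    # prints 2: file is inaccessible

rm -f "$demo"
```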




         Files
            /usr/lib/locale/locale/LC_MESSAGES/uxcore.abi
                  language-specific message file [see LANG on environ(5)].

      REFERENCES
            ed(1), fgrep(1), sed(1), sh(1), vi(1)

      NOTICES
            If there is a line with embedded nulls, grep will only match
            up to the first null.

Typewritten Software • bear@typewritten.org • Edmonds, WA 98026