          REGCMP(3X)           INTERACTIVE UNIX System           REGCMP(3X)



          NAME
               regcmp, regex - compile and execute regular expression

          SYNOPSIS
               char *regcmp (string1 [, string2, ...], (char *)0)
               char *string1, *string2, ...;

               char *regex (re, subject[, ret0, ...])
               char *re, *subject, *ret0, ...;

               extern char *__loc1;

          DESCRIPTION
               The regcmp function compiles a regular expression (consist-
               ing of the concatenated arguments) and returns a pointer to
               the compiled form.  The malloc(3C) function is used to
               create space for the compiled form.  It is the user's
               responsibility to free unneeded space so allocated.  A NULL
               return from regcmp indicates an incorrect argument.
               regcmp(1) has been written to generally preclude the need
               for this routine at execution time.
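
               The calling convention described above (a list of strings
               ended by (char *)0, with the result held in malloc'd space
               that the caller must free) can be sketched in portable C.
               The helper name join_args is hypothetical and is not part
               of the library; it only joins its arguments and does not
               compile anything.

```c
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (an assumption for illustration, not regcmp itself):
 * joins a list of strings terminated by (char *)0 into one malloc'd
 * buffer, mirroring how regcmp treats its arguments as one pattern.
 * The caller owns the result and must free() it. */
char *join_args(const char *first, ...)
{
    va_list ap;
    const char *s;
    size_t len = 0;
    char *out;

    /* First pass: add up the lengths of all arguments. */
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, const char *))
        len += strlen(s);
    va_end(ap);

    out = malloc(len + 1);
    if (out == NULL)
        return NULL;

    /* Second pass: copy each piece in order. */
    out[0] = '\0';
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, const char *))
        strcat(out, s);
    va_end(ap);
    return out;
}
```

               A call such as join_args("^[0-9]", "+$", (char *)0) yields
               the single string ^[0-9]+$.  As with regcmp, forgetting
               the terminating (char *)0 leaves the argument list
               unbounded, and forgetting free leaks the buffer.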

               The regex function executes a compiled pattern against the
               subject string.  Additional arguments are passed to receive
               values back.  It returns NULL on failure or a pointer to the
               next unmatched character on success.  A global character
               pointer __loc1 points to where the match began.  regcmp and
               regex were mostly borrowed from the editor, ed(1); however,
               the syntax and semantics have been changed slightly.  The
               following are the valid symbols and their associated
               meanings.

               []*.^     These symbols retain their meaning in ed(1).

               $         Matches the end of the string; \n matches a new-
                         line.

               -         Within brackets the minus means through.  For
                         example, [a-z] is equivalent to [abcd...xyz].  The
                         - can appear as itself only if used as the first
                         or last character.  For example, the character
                         class expression []-] matches the characters
                         ] and -.

               +         A regular expression followed by + matches one or
                         more occurrences of that expression.  For example,
                         [0-9]+ is equivalent to [0-9][0-9]*.

               {m} {m,} {m,u}
                         Integer values enclosed in {} indicate the number
                         of times the preceding regular expression is to be
                         applied.  The value m is the minimum number and u
                         is a number, less than 256, which is the maximum.
                         If only m is present (e.g., {m}), it indicates the
                         exact number of times the regular expression is to
                         be applied.  The value {m,} is analogous to
                         {m,infinity}.  The plus (+) and star (*) opera-
                         tions are equivalent to {1,} and {0,} respec-
                         tively.

               ( ... )$n The value of the enclosed regular expression is to
                         be returned.  The value will be stored in the
                         (n+1)th argument following the subject argument.
                         At most ten enclosed regular expressions are
                         allowed.  The regex function makes its assignments
                         unconditionally.

               ( ... )   Parentheses are used for grouping.  An operator,
                         e.g., *, +, {}, can work on a single character or
                         a regular expression enclosed in parentheses.  For
                         example, (a*(cb+)*)$0.

               By necessity, all the above defined symbols are special.
               They must, therefore, be escaped with a \ (backslash) to be
               used as themselves.
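
               The repetition equivalences listed above (+ with {1,}, *
               with {0,}, and [0-9]+ with [0-9][0-9]*) can be checked on
               systems that lack regcmp by using POSIX extended regular
               expressions from <regex.h> as a stand-in.  This is only a
               sketch: POSIX syntax is not identical to the syntax
               described here (it has no $n suffix, for instance), and the
               helper name ere_matches is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (assumption: POSIX <regex.h> is available).
 * Returns 1 if the extended pattern matches somewhere in s,
 * 0 if it does not, and -1 if the pattern fails to compile. */
int ere_matches(const char *pattern, const char *s)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, s, 0, NULL, 0);
    regfree(&re);               /* release the compiled form */
    return rc == 0 ? 1 : 0;
}
```

               With the patterns anchored by ^ and $, the forms ^[0-9]+$,
               ^[0-9]{1,}$ and ^[0-9][0-9]*$ should accept and reject the
               same subjects, as should ^[0-9]*$ and ^[0-9]{0,}$.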

          EXAMPLES
               Example 1:
                    char *cursor, *newcursor, *ptr;
                         ...
                    newcursor = NULL;
                    if ((ptr = regcmp("^\n", (char *)0)) != NULL) {
                         newcursor = regex(ptr, cursor);
                         free(ptr);
                    }

               This example will match a leading new-line in the subject
               string pointed at by cursor.

               Example 2:
                    char ret0[9];
                    char *newcursor, *name;
                         ...
                    name = regcmp("([A-Za-z][A-Za-z0-9]{0,7})$0", (char *)0);
                    newcursor = regex(name, "012Testing345", ret0);

               This example will match through the string ``Testing3'' and
               will return the address of the character after the last
               matched character (the ``4'').  The string ``Testing3'' will
               be copied to the character array ret0.
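
               The two pointers involved in Example 2 (the start of the
               match, reported through __loc1, and the character after the
               match, returned by regex) have rough counterparts in the
               rm_so and rm_eo offsets of POSIX <regex.h>.  The following
               sketch assumes a POSIX system and uses a hypothetical
               helper, match_end; it is an analogy, not the regex routine
               documented here.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (assumption: POSIX <regex.h> stands in for regex).
 * Finds the first match of an extended pattern in subject.  On success
 * it stores the start of the match in *start (the role __loc1 plays)
 * and returns a pointer to the first character after the match (the
 * role of the regex return value).  Returns NULL if nothing matches
 * or the pattern does not compile; *start is then left untouched. */
const char *match_end(const char *pattern, const char *subject,
                      const char **start)
{
    regex_t re;
    regmatch_t m[1];
    const char *end = NULL;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return NULL;
    if (regexec(&re, subject, 1, m, 0) == 0) {
        *start = subject + m[0].rm_so;
        end = subject + m[0].rm_eo;
    }
    regfree(&re);
    return end;
}
```

               Applied to the subject ``012Testing345'' with the pattern
               [A-Za-z][A-Za-z0-9]{0,7}, the start pointer lands on the
               ``T'' and the returned pointer on the ``4'', matching the
               description of Example 2.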

               Example 3:
                    #include "file.i"
                    char *string, *newcursor;
                         ...
                    newcursor = regex(name, string);

               This example applies a precompiled regular expression in
               file.i [see regcmp(1)] against string.


          SEE ALSO
               regcmp(1), malloc(3C), regexp(5).
               ed(1) in the INTERACTIVE UNIX System User's/System
               Administrator's Reference Manual.

          BUGS
               The user program may run out of memory if regcmp is called
               iteratively without freeing the vectors no longer required.

          Rev. C Software Development Set                            Page 3


