⇒ regex(3x) — CX/UX 6.20


regcmp(3X)                                             regcmp(3X)



NAME
     regcmp, regex - compile and execute regular expression

SYNOPSIS
     #include <libgen.h>

     cc [flag ...] file ...  -lgen [library ...]

     char *regcmp (const char *string1 [, char *string2, ...], (char *)0);

     char *regex (const char *re, const char *subject [, char *ret0, ...]);

     extern char *loc1;

DESCRIPTION
     regcmp compiles a regular expression (consisting of the con-
     catenated arguments) and returns a pointer to the compiled
     form.  malloc(3C) is used to allocate space for the compiled
     form.  It is the caller's responsibility to free this space
     when it is no longer needed.  A NULL return from regcmp
     indicates an incorrect argument.  regcmp(1) can compile
     patterns ahead of time, which generally removes the need to
     call regcmp at execution time.
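
     The calling convention described above (several string
     pieces, ended by (char *)0, joined into one malloc'd result)
     can be sketched in portable C.  The helper join_args below is
     hypothetical and is not part of libgen; it only illustrates
     how a null-terminated variable argument list is walked and
     why the caller ends up owning the returned storage.  How
     regcmp itself is implemented is not stated in this manual.

```c
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/*
 * join_args is a hypothetical helper, not part of libgen.  It walks a
 * (char *)0-terminated argument list twice, once to size the result
 * and once to copy, and returns malloc'd storage the caller must free.
 */
char *join_args(const char *first, ...)
{
    va_list ap;
    const char *s;
    size_t len = 0;
    char *out;

    /* First pass: add up the length of every piece. */
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, char *))
        len += strlen(s);
    va_end(ap);

    out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    out[0] = '\0';

    /* Second pass: append each piece in argument order. */
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, char *))
        strcat(out, s);
    va_end(ap);

    return out;
}
```

     A call such as join_args("[0-9]", "+", (char *)0) yields the
     string [0-9]+.  The cast on the terminator matters because a
     bare 0 passed through a variable argument list is an int and
     need not have the size or representation of a null pointer.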

     regex executes a compiled pattern against the subject
     string.  Additional arguments are passed to receive values
     back.  regex returns NULL on failure or a pointer to the
     next unmatched character on success.  A global character
     pointer loc1 points to where the match began.  regcmp and
     regex were mostly borrowed from the editor, ed(1); however,
     the syntax and semantics have been changed slightly.  The
     following are the valid symbols and their associated mean-
     ings.

     []*.^     These symbols retain their meaning in ed(1).

     $         Matches the end of the string; \n matches a new-
               line.

     -         Within brackets the minus means through.  For
               example, [a-z] is equivalent to [abcd...xyz].  The
               - can appear as itself only if used as the first
               or last character.  For example, the character
               class expression []-] matches the characters ] and
               -.

     +         A regular expression followed by + means one or
               more times.  For example, [0-9]+ is equivalent to
               [0-9][0-9]*.

     {m} {m,} {m,u}
               Integer values enclosed in {} indicate the number
               of times the preceding regular expression is to be
               applied.  The value m is the minimum number and u
               is a number, less than 256, which is the maximum.
               If only m is present (i.e., {m}), it indicates the
               exact number of times the regular expression is to
               be applied.  The value {m,} is analogous to
               {m,infinity}.  The plus (+) and star (*) opera-
               tions are equivalent to {1,} and {0,} respec-
               tively.

     ( ... )$n The value of the enclosed regular expression is to
               be returned.  The value will be stored in the
               (n+1)th argument following the subject argument.
               At most, ten enclosed regular expressions are
               allowed.  regex makes its assignments uncondition-
               ally.

     ( ... )   Parentheses are used for grouping.  An operator,
               e.g., *, +, {}, can work on a single character or
               a regular expression enclosed in parentheses.  For
               example, (a*(cb+)*)$0.

     By necessity, all the above defined symbols are special.
     They must, therefore, be escaped with a \ (backslash) to be
     used as themselves.
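
     Several of the operators listed above (bracket ranges, +, *,
     and the {m}, {m,}, {m,u} intervals) also exist in POSIX
     extended regular expressions, so the stated equivalences can
     be checked on systems where libgen is not available.  The
     sketch below swaps in POSIX regcomp(3) and regexec(3) for
     regcmp and regex; match_len is a hypothetical helper, and
     POSIX semantics may differ from regcmp in other details (the
     $n return syntax, for instance, is not part of POSIX).

```c
#include <regex.h>
#include <stddef.h>

/*
 * match_len is a hypothetical helper built on POSIX regcomp(3) and
 * regexec(3), used here as a stand-in for regcmp/regex.  It returns
 * the length of the leftmost match of pattern in subject, or -1 when
 * the pattern does not compile or does not match.
 */
long match_len(const char *pattern, const char *subject)
{
    regex_t re;
    regmatch_t m[1];
    long len = -1;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    if (regexec(&re, subject, 1, m, 0) == 0)
        len = (long)(m[0].rm_eo - m[0].rm_so);
    regfree(&re);
    return len;
}
```

     With the subject ab1234cd, the patterns [0-9]+, [0-9][0-9]*
     and [0-9]{1,} all match the same four digits, while [0-9]{2}
     matches exactly two and [0-9]{2,3} matches three.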

EXAMPLES
     The following example matches a leading newline in the sub-
     ject string pointed at by cursor.

          char *cursor, *newcursor, *ptr;
               ...
          newcursor = regex((ptr = regcmp("^\n", (char *)0)), cursor);
          free(ptr);

     The following example matches through the string Testing3
     and returns the address of the character after the last
     matched character (the ``4'').  The string Testing3 is
     copied to the character array ret0.

          char ret0[9];
          char *newcursor, *name;
               ...
          name = regcmp("([A-Za-z][A-Za-z0-9]{0,7})$0", (char *)0);
          newcursor = regex(name, "012Testing345", ret0);
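
     For comparison, the same match can be sketched with POSIX
     regcomp(3) and regexec(3), which are assumed here as a
     stand-in on systems without libgen.  The function first_name
     and its parameters are hypothetical.  It uses the range
     [A-Za-z0-9] and mimics the three results described in this
     manual: the text returned through ret0, the start of the
     match (the role played by loc1), and the pointer to the next
     unmatched character.

```c
#include <regex.h>
#include <string.h>

/*
 * first_name is a hypothetical stand-in for the regcmp/regex pair,
 * written with POSIX regcomp(3)/regexec(3).  It copies the text
 * matched by the parenthesized group into ret0 (like "$0"), stores
 * the start of the match in *start (like loc1), and returns a pointer
 * to the first character after the match, or NULL on failure.
 */
const char *first_name(const char *subject, char *ret0, size_t size,
                       const char **start)
{
    regex_t re;
    regmatch_t m[2];
    const char *next = NULL;

    if (regcomp(&re, "([A-Za-z][A-Za-z0-9]{0,7})", REG_EXTENDED) != 0)
        return NULL;
    if (regexec(&re, subject, 2, m, 0) == 0) {
        size_t len = (size_t)(m[1].rm_eo - m[1].rm_so);
        if (len < size) {               /* leave room for the NUL */
            memcpy(ret0, subject + m[1].rm_so, len);
            ret0[len] = '\0';
            *start = subject + m[0].rm_so;
            next = subject + m[0].rm_eo;
        }
    }
    regfree(&re);
    return next;
}
```

     Called with the subject 012Testing345 and a nine-byte ret0,
     this sketch copies Testing3 into ret0 and returns a pointer to
     the 4.  Unlike regex, it refuses to write when the buffer is
     too small rather than assigning unconditionally.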

     The following example applies a precompiled regular expres-
     sion in file.i [see regcmp(1)] against string.

          #include "file.i"
          char *string, *newcursor;
               ...
          newcursor = regex(name, string);

SEE ALSO
     regcmp(1), malloc(3C).
     ed(1) in the CX/UX User's Reference Manual.

NOTES
     The user program may run out of memory if regcmp is called
     iteratively without freeing the vectors no longer required.

                              CX/UX Programmer's Reference Manual


