⇒ regcmp(3g) — NEWS-os 5.0.1

regcmp(3G)        MISC. REFERENCE MANUAL PAGES         regcmp(3G)



NAME
     regcmp, regex - compile and execute regular expression

SYNOPSIS
     #include <libgen.h>

     cc [flag ...] file ...  -lgen [library ...]

     char *regcmp (const char *string1 [,  char  *string2,  ...],
     (char *)0);

     char *regex (const char *re, const  char  *subject  [,  char
     *ret0, ...]);

     extern char *loc1;

DESCRIPTION
     regcmp compiles a regular expression (consisting of the
     concatenated arguments) and returns a pointer to the compiled
     form.  Space for the compiled form is obtained with
     malloc(3C); it is the user's responsibility to free that
     space when it is no longer needed.  A NULL return from regcmp
     indicates an incorrect argument.  The regcmp(1) utility
     precompiles expressions, which generally removes the need to
     call this routine at execution time.

     regex executes a compiled pattern against the subject string.
     Additional arguments are passed to receive the values matched
     by ( ... )$n subexpressions.  regex returns NULL on failure or
     a pointer to the next unmatched character on success.  A
     global character pointer, loc1, points to where the match
     began.  regcmp and regex were mostly borrowed from the editor,
     ed(1); however, the syntax and semantics have been changed
     slightly.  The following are the valid symbols and their
     associated meanings.

     []*.^     These symbols retain their meaning in ed(1).

     $         Matches the end of the string; \n matches  a  new-
               line.

     -         Within brackets  the  minus  means  through.   For
               example, [a-z] is equivalent to [abcd...xyz].  The
               - can appear as itself only if used as  the  first
               or  last  character.   For  example, the character
               class expression []-] matches the characters ] and
               -.

     +         A regular expression followed by +  means  one  or
               more  times.  For example, [0-9]+ is equivalent to
               [0-9][0-9]*.

     {m} {m,} {m,u}
               Integer values enclosed in {} indicate the  number
               of times the preceding regular expression is to be
               applied.  The value m is the minimum number and  u
               is  a number, less than 256, which is the maximum.
               If only m is present (i.e., {m}), it indicates the
               exact number of times the regular expression is to
               be  applied.   The  value  {m,}  is  analogous  to
               {m,infinity}.   The  plus  (+) and star (*) opera-
               tions are equivalent  to  {1,}  and  {0,}  respec-
               tively.

     ( ... )$n The value of the enclosed regular expression is to
               be  returned.   The  value  will  be stored in the
               (n+1)th argument following the  subject  argument.
               At  most,  ten  enclosed  regular  expressions are
               allowed.  regex makes its assignments uncondition-
               ally.

     ( ... )   Parentheses are used for grouping.   An  operator,
               e.g.,  *, +, {}, can work on a single character or
               a regular expression enclosed in parentheses.  For
               example, (a*(cb+)*)$0.

     By necessity, all the above  defined  symbols  are  special.
     They  must, therefore, be escaped with a \ (backslash) to be
     used as themselves.
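
     The equivalences stated above for +, {m}, {m,} and {m,u}, and
     the bracket rule for ] and -, can be checked on systems that
     lack libgen.  The sketch below is an assumption-laden
     stand-in: it uses the POSIX regcomp(3)/regexec(3) interface
     with extended syntax, not regcmp or regex, and relies on these
     particular operators behaving the same way in both.  The
     helper name first_match_len is hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of libgen): compile `pattern` as a POSIX
 * extended regular expression, search `subject`, and return the length of
 * the first match, or -1 if the pattern is invalid or nothing matches.
 */
int first_match_len(const char *pattern, const char *subject)
{
    regex_t re;
    regmatch_t m[1];
    int len = -1;

    assert(pattern != NULL && subject != NULL);

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;                      /* pattern did not compile */
    if (regexec(&re, subject, 1, m, 0) == 0)
        len = (int)(m[0].rm_eo - m[0].rm_so);
    regfree(&re);                       /* release the compiled form */
    return len;
}
```

     With this helper, the patterns [0-9]+, [0-9][0-9]* and
     [0-9]{1,} should all report the same match length for a given
     subject, and []-]+ should match a run made up of ] and -
     characters.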

EXAMPLES
     The following example matches a leading new-line in the sub-
     ject string pointed at by cursor.

          char *cursor, *newcursor, *ptr;
               ...
          newcursor = regex((ptr = regcmp("^\n", (char *)0)), cursor);
          free(ptr);

     The following example matches through  the  string  Testing3
     and  returns  the  address  of  the character after the last
     matched character  (the  "4").   The  string  Testing3  is
     copied to the character array ret0.

          char ret0[9];
          char *newcursor, *name;
               ...
          name = regcmp("([A-Za-z][A-Za-z0-9]{0,7})$0", (char *)0);
          newcursor = regex(name, "012Testing345", ret0);
          free(name);
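
     For readers without libgen, the behavior described for this
     example (copy the matched text into ret0, return a pointer
     just past the match) can be approximated with POSIX
     regcomp(3)/regexec(3).  This is only a sketch under stated
     assumptions: match_and_capture is a hypothetical name, the
     $0 suffix is not POSIX syntax so the first parenthesized group
     plays its role, and an explicit buffer size is added that
     regex itself does not take.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for regex() with a single "( ... )$0" group,
 * built on POSIX regcomp()/regexec() rather than libgen.  Copies the text
 * matched by the first parenthesized group into ret0 (NUL-terminated) and
 * returns a pointer to the first character after the whole match, or NULL
 * on failure.  ret0 must hold at least ret0_size bytes.
 */
const char *match_and_capture(const char *pattern, const char *subject,
                              char *ret0, size_t ret0_size)
{
    regex_t re;
    regmatch_t m[2];            /* m[0]: whole match, m[1]: first group */
    const char *next = NULL;

    assert(pattern != NULL && subject != NULL && ret0 != NULL);

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return NULL;

    if (regexec(&re, subject, 2, m, 0) == 0 && m[1].rm_so != -1) {
        size_t len = (size_t)(m[1].rm_eo - m[1].rm_so);
        if (len < ret0_size) {  /* leave room for the terminating NUL */
            memcpy(ret0, subject + m[1].rm_so, len);
            ret0[len] = '\0';
            next = subject + m[0].rm_eo;
        }
    }
    regfree(&re);
    return next;
}
```

     Called with the pattern ([A-Za-z][A-Za-z0-9]{0,7}), the
     subject 012Testing345 and a nine-byte ret0, this sketch
     leaves Testing3 in ret0 and returns a pointer to the "4".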

     The following example applies a precompiled regular  expres-
     sion in file.i [see regcmp(1)] against string.

          #include "file.i"
          char *string, *newcursor;
               ...
          newcursor = regex(name, string);

SEE ALSO
     regcmp(1), malloc(3C).
     ed(1) in the User's Reference Manual.

NOTES
     The user program may run out of memory if regcmp  is  called
     iteratively without freeing the vectors no longer required.