

                                                               regcmp(3x)



        _________________________________________________________________
        regcmp, regex                                          Subroutine
        compile and execute regular expression
        _________________________________________________________________


        SYNTAX

        char *regcmp (string1 [, string2, ...], (char *)0)
        char *string1, *string2, ...;

        char *regex (re, subject[, ret0, ...])
        char *re, *subject, *ret0, ...;

        extern char *loc1;


        DESCRIPTION

        Regcmp compiles a regular expression and returns a pointer to the
        compiled form.  Space for the compiled form is obtained with
        malloc(3C); you must free it when it is no longer needed.  A NULL
        return from regcmp indicates an incorrect argument.  The
        regcmp(1) command compiles expressions ahead of time, which
        generally removes the need to call this routine at execution
        time.

        Regex executes a compiled pattern against the subject string.
        Additional arguments are passed to receive values back.  Regex
        returns NULL on failure or a pointer to the next unmatched
        character on success.  A global character pointer, loc1, points
        to where the match began.  Regcmp and regex were mostly borrowed
        from the editor, ed(1); however, the syntax and semantics have
        been changed slightly.  The following are the valid symbols and
        their associated meanings.

        []*.^     These symbols retain the meaning they have in ed(1).

        $         Matches the end of the string; \n matches a new-line.

        -         Within brackets the minus means through.  For example,
                  [a-z] is equivalent to [abcd...xyz].  The - can appear
                  as itself only if used as the first or last character.
                  For example, the character class expression []-]
                  matches the characters ] and -.

        +         A regular expression followed by + means one or more
                  times.  For example, [0-9]+ is equivalent to
                  [0-9][0-9]*.

        {m} {m,} {m,u}
                  Integer values enclosed in {} indicate the number of



        DG/UX 4.00                                                 Page 1
               Licensed material--property of copyright holder(s)





                  times the preceding regular expression is to be
                  applied.  The value m is the minimum number and u is a
                  number, less than 256, which is the maximum.  If only m
                  is present (e.g., {m}), it indicates the exact number
                  of times the regular expression is to be applied.  The
                  value {m,} is analogous to {m,infinity}.  The plus (+)
                  and star (*) operations are equivalent to {1,} and {0,}
                  respectively.

        ( ... )$n The value of the enclosed regular expression is
                  returned.  The value will be stored in the (n+1)th
                  argument following the subject argument.  At most ten
                  enclosed regular expressions are allowed.  Regex makes
                  its assignments unconditionally.

        ( ... )   Parentheses are used for grouping.  An operator, e.g.,
                  *, +, {}, can work on a single character or a regular
                  expression enclosed in parentheses.  For example,
                  (a*(cb+)*)$0.

        All of these symbols are special.  They must, therefore, be
        escaped with a backslash (\) to be used as themselves.
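
        The repetition equivalences and the bracket rules listed above
        can be checked with a short program.  Regcmp and regex are not
        available on many current systems, so this sketch substitutes
        the POSIX routines regcomp(3) and regexec(3) with REG_EXTENDED,
        on the assumption that +, *, {m,u} and bracket expressions
        behave the same way for these simple cases.  It illustrates the
        rules; it does not exercise libPW itself.  The helper names
        match_span and same_span are hypothetical.

```c
#include <regex.h>

/* Hypothetical helper: compile `pattern` as a POSIX extended regular
 * expression (a stand-in for regcmp), run it against `subject`, and
 * store the start and end offsets of the first match.
 * Returns 1 on a match, 0 on no match or on a pattern error. */
static int match_span(const char *pattern, const char *subject,
                      int *start, int *end)
{
    regex_t re;
    regmatch_t m;
    int found;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return 0;
    found = (regexec(&re, subject, 1, &m, 0) == 0);
    if (found) {
        *start = (int)m.rm_so;
        *end = (int)m.rm_eo;
    }
    regfree(&re);               /* release the compiled pattern */
    return found;
}

/* Hypothetical helper: returns 1 if both patterns match the same span
 * of `subject`, or if both fail to match; returns 0 otherwise. */
static int same_span(const char *p1, const char *p2, const char *subject)
{
    int s1 = -1, e1 = -1, s2 = -1, e2 = -1;
    int f1 = match_span(p1, subject, &s1, &e1);
    int f2 = match_span(p2, subject, &s2, &e2);

    return f1 == f2 && s1 == s2 && e1 == e2;
}
```

        For instance, same_span("[0-9]+", "[0-9]{1,}", "abc12345xyz")
        returns 1, since both patterns match the digits at offsets 3
        through 7 of the subject.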


        EXAMPLES

        Example 1:
             char *cursor, *newcursor, *ptr;
                  ...
             newcursor = regex((ptr = regcmp("^\n", (char *)0)), cursor);
             free(ptr);

        This example will match a leading new-line in the subject string
        that the cursor points to.

        Example 2:
             char ret0[9];
             char *newcursor, *name;
                  ...
             name = regcmp("([A-Za-z][A-Za-z0-9_]{0,7})$0", (char *)0);
             newcursor = regex(name, "123Testing321", ret0);

        This example matches through the string Testing3 and returns the
        address of the character after the last matched character (the
        start of the subject string plus 11, which is the character 2).
        The string Testing3 is copied to the character array ret0.
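
        The offsets in Example 2 can be reproduced on a system without
        libPW.  The sketch below is an approximation that uses the POSIX
        routines regcomp(3) and regexec(3) in place of regcmp and regex:
        the parenthesized group plays the part of ( ... )$0, the start
        parameter stands in for loc1, and the return value mimics the
        pointer that regex returns.  The function name match_name and
        its parameters are hypothetical.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the regcmp/regex pair of Example 2.
 * On success, copies the matched name into ret0 (at most size - 1
 * characters plus a terminating NUL), sets *start to where the match
 * began, and returns a pointer to the next unmatched character.
 * Returns NULL if nothing matches or the name does not fit in ret0. */
static const char *match_name(const char *subject, char *ret0, size_t size,
                              const char **start)
{
    regex_t re;
    regmatch_t m[2];    /* m[0]: whole match, m[1]: parenthesized group */
    const char *next = NULL;
    size_t len;

    if (regcomp(&re, "([A-Za-z][A-Za-z0-9_]{0,7})", REG_EXTENDED) != 0)
        return NULL;
    if (regexec(&re, subject, 2, m, 0) == 0) {
        len = (size_t)(m[1].rm_eo - m[1].rm_so);
        if (len < size) {
            memcpy(ret0, subject + m[1].rm_so, len);
            ret0[len] = '\0';
            *start = subject + m[0].rm_so;
            next = subject + m[0].rm_eo;
        }
    }
    regfree(&re);       /* release the compiled pattern */
    return next;
}
```

        With the subject "123Testing321" and a nine-byte ret0, the match
        starts at offset 3, ret0 receives Testing3, and the returned
        pointer is the subject plus 11.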

        Example 3:
             #include "file.i"
             char *string, *newcursor;
                  ...
             newcursor = regex(name, string);

        This example applies a precompiled regular expression in file.i
        (see regcmp(1)) against string.

        These routines are kept in /lib/libPW.a.


        SEE ALSO

        malloc(3C).
        ed(1), regcmp(1) in the User's Reference for the DG/UX System


        WARNING

        The user program may run out of memory if regcmp is called
        repeatedly without freeing the vectors no longer required.
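
        The pattern to follow is to release each compiled expression
        before the next one is compiled.  The sketch below shows that
        pairing inside a loop.  It assumes the POSIX routines regcomp(3)
        and regfree(3) as stand-ins; with regcmp the corresponding step
        is free(ptr) on the pointer that regcmp returned, as in
        Example 1.  The function name count_matches is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: counts how many of the n patterns match
 * `subject`.  Every successful compile is paired with a release in
 * the same iteration, so memory use does not grow with n. */
static int count_matches(const char *const patterns[], size_t n,
                         const char *subject)
{
    size_t i;
    int count = 0;

    for (i = 0; i < n; i++) {
        regex_t re;

        /* A failed compile leaves nothing to release; skip it. */
        if (regcomp(&re, patterns[i], REG_EXTENDED | REG_NOSUB) != 0)
            continue;
        if (regexec(&re, subject, 0, NULL, 0) == 0)
            count++;
        regfree(&re);   /* with regcmp: free(ptr) */
    }
    return count;
}
```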



































