

6.8.1 Regular-expression procedures

The procedures that match and search with regular expressions accept a standard set of arguments. Regexp is the regular expression; it is either a string representation of a regular expression or a compiled regular expression object. String is the string being matched or searched. Procedures that operate on substrings also accept start and end index arguments, which delimit the substring in the usual way (start inclusive, end exclusive). The optional argument case-fold? controls case sensitivity: if case-fold? is #f, the match or search is case-sensitive; otherwise it is case-insensitive. The optional argument syntax-table is a character syntax table that defines the character syntax, such as which characters are legal word constituents. This feature exists primarily for Edwin, so character syntax tables are not documented here. Supplying #f for syntax-table, or omitting it, selects the default character syntax, which is equivalent to Edwin's fundamental mode.

— procedure: re-string-match regexp string [case-fold? [syntax-table]]
— procedure: re-substring-match regexp string start end [case-fold? [syntax-table]]

These procedures match regexp against the respective string or substring, returning #f if there is no match, or a set of match registers (see below) if the match succeeds. The match is attempted at the beginning of the string or substring and need not extend to its end. Here is an example showing how to extract the matched substring:

          (let ((r (re-substring-match regexp string start end)))
            (and r
                 (substring string start (re-match-end-index 0 r))))
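
The effect of the optional case-fold? argument can be sketched with a small predicate. The name matches-prefix? is hypothetical and not part of MIT/GNU Scheme; the sketch assumes a release in which the procedures documented in this section are available.

```scheme
;; Hypothetical helper (not part of the library): returns #t if REGEXP
;; matches at the beginning of STRING, and #f otherwise.
(define (matches-prefix? regexp string case-fold?)
  (if (re-string-match regexp string case-fold?) #t #f))

(matches-prefix? "hello" "Hello, world" #f)   ; case-sensitive
(matches-prefix? "hello" "Hello, world" #t)   ; case-insensitive
```

The first call is expected to return #f, because `h' does not match `H' when the match is case-sensitive; the second is expected to return #t.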
     
— procedure: re-string-search-forward regexp string [case-fold? [syntax-table]]
— procedure: re-substring-search-forward regexp string start end [case-fold? [syntax-table]]

Searches string for the leftmost substring matching regexp. Returns a set of match registers (see below) if the search is successful, or #f if it is unsuccessful.

re-substring-search-forward limits its search to the specified substring of string; re-string-search-forward searches all of string.
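
As a minimal sketch of a forward search, the following finds the leftmost run of digits in a string. The helper first-number is hypothetical; it uses the substring variant with explicit bounds and re-match-extract, which is described later in this section.

```scheme
;; Hypothetical helper: the leftmost run of digits in STRING, or #f
;; if STRING contains no digits.
(define (first-number string)
  (let ((r (re-substring-search-forward "[0-9]+" string
                                        0 (string-length string))))
    (and r (re-match-extract string r 0))))

(first-number "key=42; other=7")
(first-number "no digits here")
```

Under these assumptions the first call should yield "42" and the second #f.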

— procedure: re-string-search-backward regexp string [case-fold? [syntax-table]]
— procedure: re-substring-search-backward regexp string start end [case-fold? [syntax-table]]

Searches string for the rightmost substring matching regexp. Returns a set of match registers (see below) if the search is successful, or #f if it is unsuccessful.

re-substring-search-backward limits its search to the specified substring of string; re-string-search-backward searches all of string.
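
A backward search can be sketched in the same way. The helper last-equals-index below is hypothetical; it searches the whole string for the rightmost occurrence of a literal `=' and returns its start index using re-match-start-index, which is described later in this section.

```scheme
;; Hypothetical helper: index of the rightmost "=" in STRING, or #f
;; if STRING contains none.
(define (last-equals-index string)
  (let ((r (re-substring-search-backward "=" string
                                         0 (string-length string))))
    (and r (re-match-start-index 0 r))))

(last-equals-index "key=42; other=7")
```

For this input the rightmost `=' is at index 13, whereas a forward search for the same pattern would find the one at index 3.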

When a successful match or search occurs, the above procedures return a set of match registers. The match registers are a set of index registers that record indexes into the matched string. Each index register corresponds to an instance of the regular-expression grouping operator `\(', and records the start index (inclusive) and end index (exclusive) of the matched group. These registers are numbered from 1 to 9, corresponding left-to-right to the grouping operators in the expression. Additionally, register 0 corresponds to the entire substring matching the regular expression.

— procedure: re-match-start-index n registers
— procedure: re-match-end-index n registers

N must be an exact integer between 0 and 9 inclusive. Registers must be a match-registers object as returned by one of the regular-expression match or search procedures above. re-match-start-index returns the start index of the corresponding regular-expression register, and re-match-end-index returns the corresponding end index.
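
The numbering of the registers can be illustrated with a pattern that contains two grouping operators. The variable name kv-registers is chosen only for this sketch.

```scheme
;; Match a lowercase name, an equals sign, and a run of digits.
;; The pattern contains two groups, so registers 0, 1, and 2 are set.
(define kv-registers
  (re-string-match "\\([a-z]+\\)=\\([0-9]+\\)" "key=42"))

;; Each expression builds a list (start end) for one register.
(list (re-match-start-index 0 kv-registers)
      (re-match-end-index 0 kv-registers))   ; whole match
(list (re-match-start-index 1 kv-registers)
      (re-match-end-index 1 kv-registers))   ; first group
(list (re-match-start-index 2 kv-registers)
      (re-match-end-index 2 kv-registers))   ; second group
```

The three lists should be (0 6), (0 3), and (4 6): register 0 spans the whole match, and each end index is exclusive.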

— procedure: re-match-extract string registers n

Registers must be a match-registers object as returned by one of the regular-expression match or search procedures above. String must be the string that was passed as an argument to the procedure that returned registers. N must be an exact integer between 0 and 9 inclusive. If the matched regular expression contained m grouping operators, then the value of this procedure is undefined for n strictly greater than m.

This procedure extracts the substring corresponding to the match register specified by registers and n. This is equivalent to the following expression:

          (substring string
                     (re-match-start-index n registers)
                     (re-match-end-index n registers))
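
As a usage sketch, re-match-extract makes it straightforward to pull apart a string of the form name=digits. The helper parse-setting and its input format are assumptions made for illustration only.

```scheme
;; Hypothetical helper: parse "name=digits" into a pair (name . number),
;; or return #f if STRING does not start with that form.
(define (parse-setting string)
  (let ((r (re-string-match "\\([a-z]+\\)=\\([0-9]+\\)" string)))
    (and r
         (cons (re-match-extract string r 1)
               (string->number (re-match-extract string r 2))))))

(parse-setting "width=80")
(parse-setting "=80")
```

The first call should produce the pair ("width" . 80); the second should produce #f, since the name part requires at least one letter.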
     
— procedure: regexp-group alternative ...

Each alternative must be a string representation of a regular expression. The returned value is a new string representation of a regular expression that consists of the alternatives combined by a grouping operator. For example:

          (regexp-group "foo" "bar" "baz")
            => "\\(foo\\|bar\\|baz\\)"
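
Because the result is an ordinary string, it can be combined with other pattern text before use. The following sketch, in which keyword-number and keyword-of are hypothetical names, matches one of three keywords followed by a run of digits and returns the keyword.

```scheme
;; Illustrative pattern: one of three keywords followed by digits.
(define keyword-number
  (string-append (regexp-group "foo" "bar" "baz") "[0-9]+"))

;; Hypothetical helper: the keyword at the start of STRING, or #f.
(define (keyword-of string)
  (let ((r (re-string-match keyword-number string)))
    (and r (re-match-extract string r 1))))

(keyword-of "bar17")
(keyword-of "qux17")
```

The first call should return "bar" and the second #f. The grouping matters here: without it, the digit suffix would attach only to the last alternative rather than to all three.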
     
— procedure: re-compile-pattern regexp-string

Regexp-string must be the string representation of a regular expression. Returns a compiled regular expression object of the represented regular expression.

Procedures that apply regular expressions, such as re-string-search-forward, are sometimes faster when given compiled regular expression objects rather than string representations, so an application that reuses regular expressions may speed up matching and searching by caching the compiled objects. However, the regular-expression procedures maintain internal caches of their own, so explicit caching is likely to improve performance only for applications that use a large number of different regular expressions before cycling through the same ones again.