regular expressions -- syntax for regular expressions inside Macaulay2

A regular expression is a string that specifies a pattern describing a set of matching subject strings. Regular expressions are constructed inductively as follows. Ordinary (non-special) characters match themselves. A concatenation of regular expressions matches the concatenation of corresponding matching subject strings. Regular expressions separated by the character | match strings matched by any of them. Parentheses can be used for grouping, and information about which substrings of the subject string matched which parenthesized subexpressions of the regular expression can be returned.
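
As a minimal sketch of alternation and grouping, the following uses regex, which returns a list of (offset, length) pairs, or null when nothing matches. The subject string and the variable names u and m are made up for illustration.

```macaulay2
-- hypothetical subject string, chosen only for illustration
u = "hotdog";
-- alternation: matches either "cat" or "dog"
m = regex("cat|dog", u)
-- if something matched, extract the matched substring
if m =!= null then substring(m#0, u)
-- parentheses group the alternation, so "hot" must precede "cat" or "dog";
-- the second pair in the result describes the parenthesized subexpression
regex("hot(cat|dog)", u)
```

Here the first search should find "dog" at offset 3, and the second should report both the whole match and the part matched inside the parentheses.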

The special characters are those appearing in the following constructions. The special character \ may be confusing, as inside a string delimited by quotation marks ("..."), you type two of them to get one, whereas inside a string delimited by triple slashes (///...///), you type one to get one. Thus regular expressions delimited by triple slashes are more readable.
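
To make the point about backslashes concrete, here is a small sketch that writes the same regular expression, a backslash followed by a period, with both kinds of string delimiters. The names p1 and p2 and the subject string are assumptions made for illustration.

```macaulay2
-- the same regular expression written two ways
p1 = "\\.";     -- quotation marks: two backslashes typed, one stored
p2 = ///\.///;  -- triple slashes: one backslash typed, one stored
-- both strings consist of the same two characters
p1 == p2
-- either form matches the literal period in this made-up subject string
regex(p2, "a.b")
```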

There are the following character classes.

In order to match one of the special characters itself, precede it with a backslash or use regexQuote.
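
The following sketch compares an unescaped pattern with the two ways of escaping just mentioned. The subject string and the name v are made up for illustration.

```macaulay2
-- hypothetical subject string containing the special character +
v = "1+1=2";
-- unescaped, + means "one or more of the preceding 1", so the pattern
-- looks for at least two consecutive 1's and is not expected to match here
regex("1+1", v)
-- escaping by hand: inside quotation marks the backslash is doubled
regex("1\\+1", v)
-- letting regexQuote insert the backslash
regex(regexQuote "1+1", v)
```

The last two calls should agree, each matching the three characters 1+1 at the start of the subject string.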

We illustrate the use of regular expressions with regex(String,String).

i1 : s = "1abcddddeF2";
i2 : regex("d", s)

o2 = {(4, 1)}

o2 : List
i3 : substring(oo#0,s)

o3 = d
i4 : regex("d*", s)

o4 = {(0, 0)}

o4 : List
i5 : substring(oo#0,s)

o5 = 
i6 : regex("d+", s)

o6 = {(4, 4)}

o6 : List
i7 : substring(oo#0,s)

o7 = dddd
i8 : regex("d+", "1abceF2")
i9 : regex("cdd+e", s)

o9 = {(3, 6)}

o9 : List
i10 : substring(oo#0,s)

o10 = cdddde
i11 : regex("cd(d+)e", s)

o11 = {(3, 6), (5, 3)}

o11 : List
i12 : substring(oo#0,s),substring(oo#1,s)

o12 = (cdddde, ddd)

o12 : Sequence
i13 : regex("[a-z]+", s)

o13 = {(1, 8)}

o13 : List
i14 : substring(oo#0,s)

o14 = abcdddde
i15 : t = "Dog cat cat.";
i16 : regex("[[:alpha:]]+", t)

o16 = {(0, 3)}

o16 : List
i17 : regex("([[:alpha:]]+) *\\1",t)

o17 = {(4, 7), (4, 3)}

o17 : List
i18 : substring(oo#0,t),substring(oo#1,t)

o18 = (cat cat, cat)

o18 : Sequence
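
The pattern used repeatedly above, calling regex and then substring on each returned pair, can be wrapped in a small function. This is only a sketch: matchedSubstrings is a hypothetical helper, not a built-in function.

```macaulay2
-- matchedSubstrings is a hypothetical helper, not part of Macaulay2
matchedSubstrings = (re, subject) -> (
     m := regex(re, subject);
     -- regex returns null when there is no match
     if m === null then {}
     -- each pair (offset, length) is passed to substring, as in the session above
     else apply(m, p -> substring(p, subject)))
-- whole match followed by the parenthesized subexpression
matchedSubstrings("cd(d+)e", "1abcddddeF2")
-- no match, so the empty list is returned
matchedSubstrings("d+", "1abceF2")
```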
For complete documentation on regular expressions, see the entry for regex in section 7 of the Unix man pages, or read the GNU regex manual.

In addition to the functions mentioned below, regular expressions appear in about, backupFileRegexp, findFiles, and pretty.


Regular expression matching is done by calls to a C library, where the convention is that the end of a string is signalled by a byte containing zero, whereas in Macaulay2, strings may contain bytes containing zero. Hence regular expression parsing and matching ignore any bytes containing zero, as well as any subsequent bytes, potentially yielding surprising results.
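
A way to try this caveat is sketched below. The subject string and the name z are made up for illustration, and no particular output is claimed for the final search, since it depends on the behavior described above.

```macaulay2
-- build a made-up subject string with a zero byte in the middle
z = "ab" | ascii {0} | "cd";
-- Macaulay2 itself still sees all five bytes
#z
-- per the caveat above, matching may stop at the zero byte, so this
-- may return null even though "cd" occurs later in the string
regex("cd", z)
```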


functions that accept regular expressions