
separate -- split a string into substrings using a regular expression



For an introduction to regular expressions, see regular expressions.

Example 1: The command separate(s) breaks the string at every occurrence of \r\n or \n.

i1 : s = "A string with both Unix-style\nand Windows-style\r\nnew line characters."

o1 = A string with both Unix-style
     and Windows-style
     new line characters.
i2 : separate s

o2 = {A string with both Unix-style, and Windows-style, new line characters.}

o2 : List

This is equivalent to using the lines function.

i3 : lines s

o3 = {A string with both Unix-style, and Windows-style, new line characters.}

o3 : List

Example 2: use commas, periods, and semicolons as separators.

i4 : separate("[,.;]", "Example: a string. That, is punctuated, weirdly; for demonstration purposes.")

o4 = {Example: a string,  That,  is punctuated,  weirdly,  for demonstration
     purposes, }

o4 : List
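The output above keeps the space that follows each separator and ends with an empty string, because nothing follows the final period. One way to tidy this up, as a minimal sketch on the same input: absorb the trailing spaces into the pattern, then drop empty pieces with select. The names str and pieces are illustrative only.

```macaulay2
str = "Example: a string. That, is punctuated, weirdly; for demonstration purposes.";
-- "[,.;] *" matches one separator plus any spaces that follow it
pieces = separate("[,.;] *", str);
-- the final period still leaves an empty last piece; keep only nonempty strings
select(pieces, w -> #w > 0)
```

Inside a bracket expression the period is an ordinary character, so it needs no escaping here.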

Example 3: match one or more consecutive spaces.

i5 : t = separate("[ ]+", "this    string has   different   lengths of    spacing  between     words")

o5 = {this, string, has, different, lengths, of, spacing, between, words}

o5 : List

We can now correct the original string using the demark and replace functions.

i6 : replace("has", "does not have", demark(" ", t))

o6 = this string does not have different lengths of spacing between words
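When only the spacing needs fixing, the round trip through separate and demark can be skipped. The sketch below assumes the same input string (named str here for illustration) and uses replace with the same pattern to collapse each run of spaces directly.

```macaulay2
str = "this    string has   different   lengths of    spacing  between     words";
-- replace every run of one or more spaces with a single space
replace("[ ]+", " ", str)
```

The list form from separate is still the better choice when the individual words are needed afterwards.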

Example 4: delete every word starting with "x" from a string, by using concatenate together with separate.

i7 : s = "algng xjfr kfjxse xhgfj xooi xwj kvexr anvi endj xkfi";
i8 : concatenate separate(" x[A-Za-z]*", s)

o8 = algng kfjxse kvexr anvi endj

Example 5: The optional argument n selects which parenthesized subexpression of the pattern acts as the separator, so the separator can differ from the full match. In the previous example, each word beginning with "x" was both the match and the separator. Here we still match words beginning with "x", but split the string only at the leading "x", which is subexpression 1. Combined with concatenate, this deletes just the leading "x" from each such word, which is not the same as removing every "x".

i9 : concatenate separate(" (x)[A-Za-z]*", 1, s)

o9 = algng jfr kfjxse hgfj ooi wj kvexr anvi endj kfi

separateRegexp is a deprecated synonym for separate.


For backwards compatibility, a pattern that consists of a single unescaped special character, such as +, *, or ., is treated as a literal character. In new code, such a pattern must be escaped.
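As a minimal sketch of the difference, the two calls below split the same illustrative string at each period. The first relies on the backwards-compatible behavior; the second escapes the period explicitly and is the form to prefer.

```macaulay2
-- legacy form: a lone "." is taken literally (backwards compatibility only)
separate(".", "a.b.c")
-- escaped form: "\\." in a Macaulay2 string denotes the regular expression \.
separate("\\.", "a.b.c")
```

The doubled backslash is needed because the string literal consumes one level of escaping before the pattern reaches the regular expression engine.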

For the programmer

The object separate is a method function with options.