# separate -- split a string into substrings using a regular expression

## Synopsis

• Usage:
  • separate str
  • separate(re, str)
  • separate(re, n, str)
• Inputs:
  • re, a string, a regular expression describing a pattern
  • str, a string, the string to split
  • n, an integer, the index of the parenthesized expression to split on
• Optional inputs:
  • POSIX => a Boolean value, default value false; if true, interpret re using the POSIX Extended flavor, otherwise the Perl flavor
• Outputs:
  • a list, a list of strings obtained by breaking str at every match to the pattern re, or, if a natural number n is specified, using the $n$-th parenthesized expression in re as the separator. If no re is specified, str is split at every new line.
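The optional input POSIX listed above is passed like any other Macaulay2 option. A minimal sketch follows; the input string is made up for illustration, and for a simple bracket expression such as this one the two flavors are expected to agree.

```macaulay2
-- default: the pattern is interpreted using the Perl flavor
separate("[,.;]", "a,b;c.d")
-- same call, with the pattern interpreted using the POSIX Extended flavor
separate("[,.;]", "a,b;c.d", POSIX => true)
```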

## Description

For an introduction to regular expressions, see the documentation node on regular expressions.

Example 1: The command separate(s) breaks the string at every occurrence of \r\n or \n.

    i1 : s = "A string with both Unix-style\nand Windows-style\r\nnew line characters."

    o1 = A string with both Unix-style
         and Windows-style
         new line characters.

    i2 : separate s

    o2 = {A string with both Unix-style, and Windows-style, new line characters.}

    o2 : List

This is equivalent to using the lines function.

    i3 : lines s

    o3 = {A string with both Unix-style, and Windows-style, new line characters.}

    o3 : List

Example 2: use commas, periods, and semicolons as separators.

    i4 : separate("[,.;]", "Example: a string. That, is punctuated, weirdly; for demonstration purposes.")

    o4 = {Example: a string, That, is punctuated, weirdly, for demonstration
         ------------------------------------------------------------------------
         purposes, }

    o4 : List
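The list in o4 ends with an empty string because the input ends with a separator. When such empty pieces are unwanted, one option is to filter them out afterwards. The following is a minimal sketch; the names parts and nonempty and the input string are illustrative only.

```macaulay2
-- split at commas, periods, and semicolons, as in Example 2
parts = separate("[,.;]", "one,two;three.")
-- keep only the pieces that contain at least one character
nonempty = select(parts, p -> #p > 0)
```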

Example 3: match any number of consecutive spaces.

    i5 : t = separate("[ ]+", "this string has different lengths of spacing between words")

    o5 = {this, string, has, different, lengths, of, spacing, between, words}

    o5 : List

We can now correct the original string using the demark and replace functions.

    i6 : replace("has", "does not have", demark(" ", t))

    o6 = this string does not have different lengths of spacing between words

Example 4: delete every word starting with "x" from a string, by using concatenate together with separate.

    i7 : s = "algng xjfr kfjxse xhgfj xooi xwj kvexr anvi endj xkfi";

    i8 : concatenate separate(" x[A-Za-z]*", s)

    o8 = algng kfjxse kvexr anvi endj

Example 5: The optional argument n lets us use a separator that is only part of the match. In the previous example, each word beginning with "x" was both the match and the separator. In this example, we still match words beginning with "x", but split the string only at the leading "x", which is the first parenthesized expression. Combined with concatenate, this deletes just the "x" from each word starting with "x" (which is not the same as removing every "x").

    i9 : concatenate separate(" (x)[A-Za-z]*", 1, s)

    o9 = algng jfr kfjxse hgfj ooi wj kvexr anvi endj kfi
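Because only the parenthesized part is removed, the same split can also be used to substitute something for it. As a sketch, using demark in place of concatenate should put a "y" where each leading "x" was; the choice of "y" is arbitrary.

```macaulay2
s = "algng xjfr kfjxse xhgfj xooi xwj kvexr anvi endj xkfi";
-- split only at the leading "x" of each word that starts with "x"
pieces = separate(" (x)[A-Za-z]*", 1, s)
-- rejoin the pieces, inserting "y" where each separator was
demark("y", pieces)
```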

separateRegexp is a deprecated synonym for separate.

## Caveat

For backwards compatibility, if the pattern is a single character and that character is an unescaped special character, such as +, *, or ., then it is treated as a literal character. In new code, such a character must be escaped.
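As an illustration of escaping, here are two ways to split at a literal period without relying on the compatibility behavior. This is a sketch with a made-up input string; both calls are expected to return {a, b, c}.

```macaulay2
-- escaped period; the backslash is doubled inside an ordinary string literal
separate("\\.", "a.b.c")
-- bracket expression, in which the period needs no backslash
separate("[.]", "a.b.c")
```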