separate -- split a string into substrings

Synopsis

Description

We illustrate several different ways we can separate the following string into substrings.

i1 : s = "This is an example of a string.\nIt contains some letters, spaces, and punctuation.\r\nIt also contains some new line characters.\r\nIn fact, for some reason, both Unix-style\nand Windows-style\r\nnew line characters are present."

o1 = This is an example of a string.
     It contains some letters, spaces, and punctuation.
     It also contains some new line characters.
     In fact, for some reason, both Unix-style
     and Windows-style
     new line characters are present.

The command separate(s) breaks s at every occurrence of "\r\n" or "\n".

i2 : separate(s)

o2 = {This is an example of a string., It contains some letters, spaces, and
     ------------------------------------------------------------------------
     punctuation., It also contains some new line characters., In fact, for
     ------------------------------------------------------------------------
     some reason, both Unix-style, and Windows-style, new line characters are
     ------------------------------------------------------------------------
     present.}

o2 : List

This is equivalent to using the lines function.

i3 : lines s

o3 = {This is an example of a string., It contains some letters, spaces, and
     ------------------------------------------------------------------------
     punctuation., It also contains some new line characters., In fact, for
     ------------------------------------------------------------------------
     some reason, both Unix-style, and Windows-style, new line characters are
     ------------------------------------------------------------------------
     present.}

o3 : List
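
As a quick check of this equivalence, here is a minimal sketch on a shorter string. The name u and its contents are made up for illustration and are not part of the session above.

```macaulay2
-- a short sample string mixing both kinds of line break (hypothetical data)
u = "first\nsecond\r\nthird"
-- with no separator given, separate breaks at "\r\n" and at "\n"
pieces = separate u
-- lines is expected to return the same list of substrings
assert(pieces === lines u)
assert(#pieces == 3)
```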

Instead of breaking at new line characters, we can specify which character to break at. For instance, we can separate at every comma:

i4 : separate(",", s)

o4 = {This is an example of a string.,  spaces,  and punctuation.        
      It contains some letters                  It also contains some new
                                                In fact                  
     ------------------------------------------------------------------------
                     ,  for some reason,  both Unix-style                }
     line characters.                    and Windows-style
                                         new line characters are present.

o4 : List

or at every space:

i5 : separate(" ", s)

o5 = {This, is, an, example, of, a, string., contains, some, letters,,
                                    It                                
     ------------------------------------------------------------------------
     spaces,, and, punctuation., also, contains, some, new, line,
                   It                                            
     ------------------------------------------------------------------------
     characters., fact,, for, some, reason,, both, Unix-style, Windows-style,
     In                                            and         new
     ------------------------------------------------------------------------
     line, characters, are, present.}

o5 : List

In the last two examples we can see line breaks appear in the output substrings, since we are no longer separating at them. (They are printed in the console as actual new lines, not using escape characters.)
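
To make such hidden line breaks visible, one option is to pass each substring through format, which renders a string with quotes and escape sequences. The following is a sketch on a small made-up string u rather than on s.

```macaulay2
-- a short sample string with one line break between the commas (hypothetical data)
u = "a,b\nc,d"
pieces = separate(",", u)
-- the line break is kept inside the middle substring
assert(#pieces == 3)
assert(pieces#1 == "b\nc")
-- format displays each substring with its escape sequences spelled out
apply(pieces, format)
```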

Now let’s try breaking at the string "om". This occurs three times in our string (in three uses of the word "some"), so s is separated into four substrings. The separating characters "om" do not appear in any of the substrings.

i6 : t = separate("om", s)

o6 = {This is an example of a string., e letters, spaces, and punctuation.,
      It contains s                    It also contains s                  
                                                                           
     ------------------------------------------------------------------------
     e new line characters., e reason, both Unix-style       }
     In fact, for s          and Windows-style
                             new line characters are present.

o6 : List

We can recover the original string using the demark function.

i7 : demark("om", t)

o7 = This is an example of a string.
     It contains some letters, spaces, and punctuation.
     It also contains some new line characters.
     In fact, for some reason, both Unix-style
     and Windows-style
     new line characters are present.

In general, s == demark(x, separate(x, s)). The exception to this rule is that demark("\n", separate(s)) isn’t necessarily equal to s, because this code replaces any "\r\n" line breaks in s with "\n" characters.
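
Both the rule and its exception can be checked on a small example. This is a sketch under the behavior described above; the names u and v and the sample text are chosen only for illustration.

```macaulay2
-- a short sample string with one "\r\n" and one "\n" (hypothetical data)
u = "one,two\r\nthree,four\nfive"
-- separating and rejoining with the same separator restores the string
assert(demark(",", separate(",", u)) == u)
-- separating at line breaks and rejoining with "\n" normalizes "\r\n" to "\n"
v = demark("\n", separate u)
assert(v != u)
assert(v == "one,two\nthree,four\nfive")
```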

To separate at a string longer than 2 characters, or for much greater flexibility and control in specifying separation rules, see separateRegexp.

See also

Ways to use separate :