The "span" family

The midstring, substring, and subchars families give you a way of taking strings apart when you know the lengths of the substrings you want, or when you know a particular substring.

The "span" family gives you another way of taking strings apart. The family contains three sub-families: span_left/[3,4,5], which scan from the left, span_right/[3,4,5], which scan from the right, and span_trim/[2,3,5], which scan from both ends towards the middle.


span_left(+Text, +Set, ?LenA, ?LenB, ?LenC)

span_left(+Text, +Set, ?LenA, ?LenB)

span_left(+Text, +Set, ?LenA)
are true when

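The Set is a set of characters. In the examples in this section it is written either as a double-quoted list of characters, such as " ", or as not(...) around such a list, such as not(" "), which stands for every character not listed.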

The first two arguments must be instantiated. Given them, the remaining three arguments are uniquely determined. The last three arguments give you a picture of how the text is divided:

                 |   LenA    |    LenB     |  LenC   |
         Text=    a a a a a a B B B B B B B c c c c c
                              \____Set____/
     

where Set embraces the characters in the B substring. By design, the Set argument occupies the same position in the argument list of this predicate that B does in the argument list of substring/[4,5] or midstring/[3,4,5,6]. The fact that the last three arguments of span_left/5 follow this convention means that you can use midstring/[3,4,5,6], substring/[4,5], or subchars/[4,5] to extract whichever substring interests you.
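
The picture can be restated as a small executable model. What follows is only a sketch in plain Prolog, not the library's implementation: the names model_span_left/5, in_set/2, complement/2 and run/4 are hypothetical, Text is assumed to be an atom, and Set is assumed to be a list of character codes or not(List). The failure when no character of Set occurs is taken from the behaviour described in this section.

```prolog
:- use_module(library(lists)).          % member/2

% in_set(+Code, +Set): Set is a list of character codes, or not(List)
% for its complement (the representation assumed by this sketch).
in_set(Code, not(Codes)) :- !, \+ member(Code, Codes).
in_set(Code, Codes) :- member(Code, Codes), !.

% complement(+Set, -Complement)
complement(not(Codes), Codes) :- !.
complement(Codes, not(Codes)).

% run(+Codes, +Set, -Run, -Rest): Run is the longest prefix of Codes
% whose elements all belong to Set; Rest is what follows it.
run([C|Cs], Set, [C|Run], Rest) :-
    in_set(C, Set), !,
    run(Cs, Set, Run, Rest).
run(Rest, _, [], Rest).

% model_span_left(+Atom, +Set, -LenA, -LenB, -LenC)
model_span_left(Atom, Set, LenA, LenB, LenC) :-
    atom_codes(Atom, Codes),
    complement(Set, NotSet),
    run(Codes, NotSet, A, Tail),        % A: leading characters outside Set
    run(Tail, Set, B, C),               % B: the run of Set characters
    B = [_|_],                          % fail if no Set character occurs
    length(A, LenA),
    length(B, LenB),
    length(C, LenC).
```

With this model, the query model_span_left('  abc def', not([32]), A, B, C) binds A = 2, B = 3 and C = 4, where 32 is the character code of a space.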

For example, to skip leading spaces in String, yielding Trimmed, you would write

     | ?- span_left(String, not(" "), Before),
     |    substring(String, Trimmed, Before, _, 0).
     

Note that this fails if there are no non-blank characters in String. To extract the first blank-delimited Token from String, yielding a Token and the Rest of the string, you would write

     | ?- span_left(String, not(" "), Before, Length, After),
     |    substring(String, Token, Before, Length, After),
     |    substring(String, Rest, _, After, 0).
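
The same three goals can be repeated to split a whole string into its blank-delimited tokens. The following is a minimal, untested sketch that assumes library(strings) is loaded and behaves as described in this section. The predicate name tokens/2 is hypothetical and is not part of the library.

```prolog
:- use_module(library(strings)).        % span_left/5, substring/5

% tokens(+String, -Tokens): hypothetical helper, not a library predicate.
% Tokens is the list of blank-delimited tokens of String, left to right.
tokens(String, Tokens) :-
    (   span_left(String, not(" "), Before, Length, After)
    ->  substring(String, Token, Before, Length, After),
        Tokens = [Token|Rest],
        (   After =:= 0
        ->  Rest = []                   % the token ended the string
        ;   substring(String, Tail, _, After, 0),
            tokens(Tail, Rest)
        )
    ;   Tokens = []                     % no non-blank characters left
    ).
```

The test on After avoids asking substring/5 for an empty remainder, so the sketch does not depend on how the library treats empty text objects.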
     

span_right(+Text, +Set, ?LenA, ?LenB, ?LenC)

span_right(+Text, +Set, ?LenB, ?LenC)

span_right(+Text, +Set, ?LenC)
are true when

These three predicates are exactly like span_left/[3,4,5] except that they work from right to left instead of from left to right. In particular, the picture

                 |   LenA    |    LenB     |  LenC   |
         Text=    a a a a a a B B B B B B B c c c c c
                              \____Set____/
     

applies.
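
As an illustration of scanning from the right, the last blank-delimited token of a string can be picked out in the same way that span_left/5 picks out the first. This is a sketch under the assumption that library(strings) is loaded and that span_right/5 behaves as described here. The name last_token/2 is hypothetical.

```prolog
:- use_module(library(strings)).        % span_right/5, substring/5

% last_token(+String, -Token): hypothetical helper, not a library predicate.
% Scanning from the right, LenC counts the trailing blanks and LenB the
% non-blank run before them, so Token is the last blank-delimited token.
last_token(String, Token) :-
    span_right(String, not(" "), Before, Length, After),
    substring(String, Token, Before, Length, After).
```

Like the span_left/3 example, this fails if String contains no non-blank characters.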

Finally, there are predicates that scan from both ends:


span_trim(+Text, +Set, ?LenA, ?LenB, ?LenC)
is true when

The Set argument of span_trim/5 has the same form as the Set argument of span_left/[3,4,5] or span_right/[3,4,5], but there is an important difference in how it is used: in span_trim/5 the Set specifies the characters that are to be trimmed away. The picture is

                 |   LenA    |    LenB     |  LenC   |
         Text=    a a a a a a B B B B B B B c c c c c
                  \___Set___/               \__Set__/
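
To make the difference concrete, here is a sketch of the trimming picture in plain Prolog. It is a model for illustration only, not the library's code: the names model_span_trim/5, in_set/2 and strip/4 are hypothetical, Text is assumed to be an atom, and Set is assumed to be a list of character codes or not(List). When every character belongs to Set, this model assigns the whole text to LenA, which is a choice made by the sketch rather than documented behaviour.

```prolog
:- use_module(library(lists)).          % member/2, reverse/2

% in_set(+Code, +Set): Set is a list of character codes, or not(List)
% for its complement (the representation assumed by this sketch).
in_set(Code, not(Codes)) :- !, \+ member(Code, Codes).
in_set(Code, Codes) :- member(Code, Codes), !.

% strip(+Codes, +Set, -Rest, -N): remove the N leading elements of Codes
% that belong to Set, leaving Rest.
strip([C|Cs], Set, Rest, N) :-
    in_set(C, Set), !,
    strip(Cs, Set, Rest, N0),
    N is N0 + 1.
strip(Rest, _, Rest, 0).

% model_span_trim(+Atom, +Set, -LenA, -LenB, -LenC)
model_span_trim(Atom, Set, LenA, LenB, LenC) :-
    atom_codes(Atom, Codes),
    strip(Codes, Set, Tail, LenA),      % trim Set characters on the left
    reverse(Tail, Reversed),
    strip(Reversed, Set, Middle, LenC), % then trim them on the right
    length(Middle, LenB).
```

With this model, the query model_span_trim('  abc    ', [32], B, L, A) binds B = 2, L = 3 and A = 4, where 32 is the character code of a space.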
     

There is a special case of span_trim/5 that enables you to strip particular characters from both ends of a string. These unwanted characters are designated in Set in span_trim/3:

     span_trim(String, Set, Trimmed) :-
             span_trim(String, Set, Before, Length, After),
             substring(String, Trimmed, Before, Length, After).
     

A further specialization, span_trim/2, is intended for trimming blanks from fixed-length records:

     span_trim(String, Trimmed) :-
             span_trim(String, " ", Before, Length, After),
             substring(String, Trimmed, Before, Length, After).
     

For example,

     | ?- span_trim('  abc    ', " ", B, L, A).
     B = 2
     L = 3
     A = 4
     
     | ?- substring('  abc    ', Trimmed, 2, 3, 4).
     Trimmed = abc
     
     | ?- span_trim(' an   example ', Trimmed).
     Trimmed = 'an   example'
     

Note that the last example leaves the group of three internal blanks intact. There are no predicates in library(strings) for compressing such blanks.
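
Should you need to compress such blanks, a small predicate over character codes is enough. The sketch below is an illustration in plain Prolog and is not part of library(strings). The names compress_blanks/2 and squeeze/2 are hypothetical, the text is assumed to be an atom, and only the space character (code 32) is treated as a blank.

```prolog
% squeeze(+Codes, -Squeezed): replace every run of consecutive spaces
% in Codes by a single space.
squeeze([], []).
squeeze([32,32|Cs], Squeezed) :- !,
    squeeze([32|Cs], Squeezed).         % drop one space of an adjacent pair
squeeze([C|Cs], [C|Squeezed]) :-
    squeeze(Cs, Squeezed).

% compress_blanks(+Atom, -Compressed): hypothetical helper.
compress_blanks(Atom, Compressed) :-
    atom_codes(Atom, Codes),
    squeeze(Codes, Squeezed),
    atom_codes(Compressed, Squeezed).
```

With this sketch, compress_blanks('an   example', X) binds X = 'an example'. Leading and trailing blanks are reduced to one blank each, not removed, so trim the text first if that matters.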

When manipulating text objects, remember that the "span" family can be combined with subchars/[4,5] or midstring/[3,4,5,6] as well as with substring/[4,5].