The midstring, substring, and subchars families give you a way of taking strings apart when you know the lengths of the substrings you want, or when you know a particular substring.
The "span" family gives you another way of taking strings apart.
The family contains three sub-families: span_left/[3,4,5], which scan from the left, span_right/[3,4,5], which scan from the right, and span_trim/[2,3,5], which scan from both ends towards the middle.
span_left(+Text, +Set, ?LenA, ?LenB, ?LenC)
span_left(+Text, +Set, ?LenA, ?LenB)
span_left(+Text, +Set, ?LenA)
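The Set is one of the following:
    an atom, in which case a character belongs to the Set if and only if it occurs in the name of the atom (the atom '' represents an empty Set);
    a non-empty list of characters, in which case a character belongs to the Set if and only if it occurs in the list;
    not(X), where X is an atom or non-empty list of characters, in which case a character belongs to the Set if and only if it does not belong to the set X.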
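The three forms of Set can be compared side by side. The following is only a sketch: set_forms_demo/1 is a hypothetical name introduced for illustration, and the choice of the decimal digits as the Set is an assumption. The first two goals describe the same Set, so they are expected to agree on all three lengths; the third uses the complement.

```prolog
:- use_module(library(strings)).

% set_forms_demo(+Text)
% Hypothetical helper, not part of library(strings).
% Fails if Text contains no digit, or no non-digit, because
% span_left/5 needs at least one character belonging to Set.
set_forms_demo(Text) :-
    % Set written as an atom: the characters occurring in its name
    span_left(Text, '0123456789', A1, B1, C1),
    % Set written as a non-empty list of characters: the same Set,
    % so the same three lengths must be accepted
    span_left(Text, "0123456789", A1, B1, C1),
    % Set written as not(X): every character that is not a digit
    span_left(Text, not("0123456789"), A2, B2, C2),
    write(digits(A1, B1, C1)), nl,
    write(non_digits(A2, B2, C2)), nl.
```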
The first two arguments must be instantiated. Given them, the remaining three arguments are uniquely determined. The last three arguments give you a picture of how the text is divided:
          |   LenA    |    LenB     |  LenC   |
    Text=  a a a a a a B B B B B B B c c c c c
                       \____Set____/
where Set embraces the characters in the B substring.
By design, the Set argument occupies the same position in the argument list of this predicate that B does in the argument list of substring/[4,5] or midstring/[3,4,5,6]. The fact that the last three arguments of span_left/5 follow this convention means that you can use midstring/[3,4,5,6], substring/[4,5], or subchars/[4,5] to extract whichever substring interests you.
For example, to skip leading spaces in String, yielding Trimmed, you would write
| ?- span_left(String, not(" "), Before),
|    substring(String, Trimmed, Before, _, 0).
Note that this fails if there are no non-blank characters in String. To extract the first blank-delimited Token from String, yielding a Token and the Rest of the string, you would write
| ?- span_left(String, not(" "), Before, Length, After),
|    substring(String, Token, Before, Length, After),
|    substring(String, Rest, _, After, 0).
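The token-extraction goals above can be repeated to split a whole string into blank-delimited tokens. The sketch below wraps them in a hypothetical predicate tokens/2, which is not part of library(strings). It assumes that substring/5 returns an empty text object when Length and After are both 0, and that span_left/5 fails on such an empty text object, which is what ends the recursion.

```prolog
:- use_module(library(strings)).

% tokens(+String, -Tokens)
% Hypothetical helper: Tokens is the list of blank-delimited
% tokens of String, in left-to-right order.
tokens(String, Tokens) :-
    (   span_left(String, not(" "), Before, Length, After)
    ->  % the first run of non-blank characters is the next token
        substring(String, Token, Before, Length, After),
        % Rest is the last After characters of String
        substring(String, Rest, _, After, 0),
        Tokens = [Token|Tokens1],
        tokens(Rest, Tokens1)
    ;   % no non-blank characters remain
        Tokens = []
    ).
```

Each recursive call works on a strictly shorter Rest, because a successful span_left/5 always consumes at least one character.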
span_right(+Text, +Set, ?LenA, ?LenB, ?LenC)
span_right(+Text, +Set, ?LenB, ?LenC)
span_right(+Text, +Set, ?LenC)
These three predicates are exactly like span_left/[3,4,5] except that they work from right to left instead of from left to right. In particular, the picture
          |   LenA    |    LenB     |  LenC   |
    Text=  a a a a a a B B B B B B B c c c c c
                       \____Set____/
applies.
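By symmetry with the leading-spaces example for span_left/3, span_right/3 can be used to drop trailing spaces. This is a sketch only: strip_trailing_blanks/2 is a hypothetical name, and the code assumes that the single length argument of span_right/3 is LenC, as the argument lists above indicate.

```prolog
:- use_module(library(strings)).

% strip_trailing_blanks(+String, -Trimmed)
% Hypothetical helper, not part of library(strings).
% Fails if String contains no non-blank characters.
strip_trailing_blanks(String, Trimmed) :-
    % After is the number of characters to the right of the
    % rightmost run of non-blank characters
    span_right(String, not(" "), After),
    % keep everything from the start of String up to that point
    substring(String, Trimmed, 0, _, After).
```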
Finally, there are predicates that scan from both ends:
span_trim(+Text, +Set, ?LenA, ?LenB, ?LenC)
The Set argument of span_trim/5 has the same form as the Set argument of span_left/[3,4,5] or span_right/[3,4,5], but there is an important difference in how it is used: in span_trim/5 the Set specifies the characters that are to be trimmed away. The picture is
          |   LenA    |    LenB     |  LenC   |
    Text=  a a a a a a B B B B B B B c c c c c
           \___Set___/               \__Set__/
There is a special case of span_trim/5 that enables you to strip particular characters from both ends of a string. These unwanted characters are designated in Set in span_trim/3:
span_trim(String, Set, Trimmed) :-
        span_trim(String, Set, Before, Length, After),
        substring(String, Trimmed, Before, Length, After).
A further specialization, span_trim/2, is intended for trimming blanks from fixed-length records:
span_trim(String, Trimmed) :-
        span_trim(String, " ", Before, Length, After),
        substring(String, Trimmed, Before, Length, After).
For example,
| ?- span_trim('  abc    ', " ", B, L, A).
B = 2
L = 3
A = 4
| ?- substring('  abc    ', Trimmed, 2, 3, 4).
Trimmed = abc
| ?- span_trim(' an   example ', Trimmed).
Trimmed = 'an   example'
Note that the last example leaves the group of three internal
blanks intact. There are no predicates in library(strings)
for compressing such blanks.
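If you do need to compress internal blanks, one possible approach is to work on a list of character codes. The following is a minimal sketch in plain Prolog, not a library(strings) predicate: compress_blanks/2 and skip_blanks/2 are hypothetical names, and converting between text objects and code lists is left to whatever built-in your system provides.

```prolog
% compress_blanks(+Codes, -Compressed)
% Hypothetical helper: Compressed is Codes with every run of
% blanks (character code 32) replaced by a single blank.
compress_blanks([], []).
compress_blanks([C|Cs], [C|Ds]) :-
    (   C =:= 32                 % a blank: drop the blanks that follow it
    ->  skip_blanks(Cs, Cs1)
    ;   Cs1 = Cs
    ),
    compress_blanks(Cs1, Ds).

% skip_blanks(+Codes, -Rest)
% Rest is Codes without its leading blanks.
skip_blanks([32|Cs], Rest) :- !,
    skip_blanks(Cs, Rest).
skip_blanks(Cs, Cs).
```

Applied to the codes of 'an   example', this keeps one blank between the two words. Leading and trailing blanks are reduced to one blank each rather than removed, so trimming with span_trim/2 first is still worthwhile.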
In manipulating text objects, do not neglect the possibility of combining the "span" family with subchars/[4,5] or midstring/[3,4,5,6].
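As an illustration of such a combination, the three lengths found by span_left/5 can be handed to midstring/6 to remove a token and keep what surrounds it. This is a sketch under assumptions: delete_first_token/3 is a hypothetical name, and the code assumes that the third argument of midstring/6 is the text left over once the middle part B has been taken out, with the last three arguments being the same three lengths as in the pictures above.

```prolog
:- use_module(library(strings)).

% delete_first_token(+String, -Token, -Others)
% Hypothetical helper, not part of library(strings).
% Token is the first blank-delimited token of String; Others is
% String with that token removed (assumed midstring/6 behaviour).
% Fails if String contains no non-blank characters.
delete_first_token(String, Token, Others) :-
    span_left(String, not(" "), Before, Length, After),
    midstring(String, Token, Others, Before, Length, After).
```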