Case Conversion -- library(caseconv)

There are no case conversion operations in the supported library(strings) package. In the ASCII and EBCDIC character sets, case conversion is well defined, because every letter is either an uppercase letter with a unique lowercase equivalent or a lowercase letter with a unique uppercase equivalent. Not all character sets have this property.

There are, however, two case conversion predicates in library(ctypes):


to_lower(?Upper, ?Lower)
which is true when Upper and Lower are valid character codes, and either Upper is the code of an uppercase letter that has a unique lowercase equivalent and Lower is the code of that equivalent, or Upper is the code of some other character and Lower is the same as Upper. Note that this means that if Lower is the code of a lowercase letter that is the unique equivalent of some uppercase letter, there are two solutions for Upper: the uppercase letter and the lowercase letter itself.
to_upper(?Lower, ?Upper)
which is true when Lower and Upper are valid character codes, and either Lower is the code of a lowercase letter that has a unique uppercase equivalent and Upper is the code of that equivalent, or Lower is the code of some other character and Upper is the same as Lower. Note that this means that if Upper is the code of an uppercase letter that is the unique equivalent of some lowercase letter, there are two solutions for Lower: the lowercase letter and the uppercase letter itself.
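
The two-solution behaviour described above can be seen in a minimal ASCII-only sketch. The predicates ascii_to_lower/2, ascii_to_upper/2, ascii_code/1 and enum/3 are hypothetical names introduced here for illustration; they are not the library(ctypes) source, and they assume that character codes are the integers 0..127.

```prolog
% ascii_code(?C): C is an ASCII character code (0..127).
% Checks a bound integer, or enumerates all codes when C is unbound.
ascii_code(C) :- integer(C), !, C >= 0, C =< 127.
ascii_code(C) :- var(C), enum(0, 127, C).

% enum(+Low, +High, -X): enumerate the integers Low..High on backtracking.
enum(L, H, L) :- L =< H.
enum(L, H, X) :- L < H, L1 is L + 1, enum(L1, H, X).

% ascii_to_lower(?Upper, ?Lower): hypothetical ASCII model of to_lower/2.
% An uppercase letter maps to its lowercase equivalent;
% every other code maps to itself.
ascii_to_lower(Upper, Lower) :-
    ascii_code(Upper),
    (   Upper >= 0'A, Upper =< 0'Z
    ->  Lower is Upper + 32
    ;   Lower = Upper
    ).

% ascii_to_upper(?Lower, ?Upper): hypothetical ASCII model of to_upper/2.
ascii_to_upper(Lower, Upper) :-
    ascii_code(Lower),
    (   Lower >= 0'a, Lower =< 0'z
    ->  Upper is Lower - 32
    ;   Upper = Lower
    ).
```

With this sketch, the query ascii_to_lower(U, 0'a) has the two solutions U = 65 (the code of A) and U = 97 (the code of a), while ascii_to_lower(0'A, L) has the single solution L = 97.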

In the ASCII and EBCDIC character sets, these definitions behave as one would expect. But consider the case of Greek.

Capital sigma is undeniably an uppercase letter; yet it has two lowercase equivalents: one for use at the end of words and one for use elsewhere. This means that to_upper/2 would map both medial and final lowercase sigma to uppercase sigma, but that to_lower/2 would leave uppercase sigma unchanged. A similar problem exists in German, where ß is a lowercase letter whose uppercase equivalent is the pair of letters SS.
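
The sigma case can be worked through with a small table-driven sketch. The table case_pair/2 and the predicates built on it are hypothetical, and the Greek letters are written as Unicode code points purely for illustration (931 is capital sigma, 963 is medial lowercase sigma, 962 is final lowercase sigma); the first argument of each conversion predicate must be bound.

```prolog
% case_pair(Lower, Upper): hypothetical table of lowercase/uppercase pairs.
case_pair(97, 65).     % a / A
case_pair(963, 931).   % medial lowercase sigma / capital sigma
case_pair(962, 931).   % final lowercase sigma  / capital sigma

% unique_upper(+Lower, -Upper): Lower has exactly one uppercase equivalent.
unique_upper(Lower, Upper) :-
    findall(U, case_pair(Lower, U), [Upper]).

% unique_lower(+Upper, -Lower): Upper has exactly one lowercase equivalent.
unique_lower(Upper, Lower) :-
    findall(L, case_pair(L, Upper), [Lower]).

% table_to_upper(+Lower, -Upper): convert only when the equivalent is unique.
table_to_upper(Lower, Upper) :-
    (   unique_upper(Lower, U) -> Upper = U ; Upper = Lower ).

% table_to_lower(+Upper, -Lower): convert only when the equivalent is unique.
table_to_lower(Upper, Lower) :-
    (   unique_lower(Upper, L) -> Lower = L ; Lower = Upper ).
```

Here table_to_upper(963, X) and table_to_upper(962, X) both give X = 931, but table_to_lower(931, X) gives X = 931, because capital sigma has two lowercase candidates and so no unique equivalent.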

Because of such problems, library(caseconv) is only adequate for ASCII or EBCDIC. This package defines two groups of predicates. The predicates in the first group test the case of a name. Those in the second group convert the case of a name or a non-empty list of character codes.


lower(+Text)
is true if Text contains no uppercase letters.
upper(+Text)
is true if Text contains no lowercase letters.
mixed(+Text)
is true if Text contains at least one lowercase letter and at least one uppercase letter.

In each case, Text may contain characters other than letters. If mixed(Text) is true, then lower(Text) and upper(Text) must both be false. However, lower(Text) and upper(Text) can both be true if Text contains no letters at all. Examples:

     | ?- lower(a).
     
     yes
     
     | ?- lower(quixotic).
     
     yes
     
     | ?- lower('Quixotic').
     
     no
     
     | ?- lower(**).
     
     yes
     
     | ?- upper(a).
     
     no
     
     | ?- upper('QUIXOTIC').
     
     yes
     
     | ?- upper('Quixotic').
     
     no
     
     | ?- upper(**).
     
     yes
     
     | ?- mixed(quixotic).
     
     no
     
     | ?- mixed('QUIXOTIC').
     
     no
     
     | ?- mixed('!$Quixotic<<<').
     
     yes
     
     | ?- mixed(**).
     
     no
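
The three tests can be modelled by a short ASCII-only sketch. The names sketch_lower/1, sketch_upper/1, sketch_mixed/1 and their helpers are hypothetical and are not the library(caseconv) source; the sketch also assumes that atom_codes/2 is available, as in ISO Prolog.

```prolog
% text_codes(+Text, -Codes): accept an atom or a non-empty list of codes.
text_codes(Text, Codes) :- atom(Text), !, atom_codes(Text, Codes).
text_codes([C|Cs], [C|Cs]).

% ASCII letter classification (an assumption of this sketch).
is_upper_code(C) :- C >= 0'A, C =< 0'Z.
is_lower_code(C) :- C >= 0'a, C =< 0'z.

% code_in(?C, +Codes): C is an element of Codes.
code_in(C, [C|_]).
code_in(C, [_|Cs]) :- code_in(C, Cs).

% sketch_lower(+Text): Text contains no uppercase letters.
sketch_lower(Text) :-
    text_codes(Text, Cs),
    \+ ( code_in(C, Cs), is_upper_code(C) ).

% sketch_upper(+Text): Text contains no lowercase letters.
sketch_upper(Text) :-
    text_codes(Text, Cs),
    \+ ( code_in(C, Cs), is_lower_code(C) ).

% sketch_mixed(+Text): Text contains at least one lowercase letter
% and at least one uppercase letter.
sketch_mixed(Text) :-
    text_codes(Text, Cs),
    code_in(L, Cs), is_lower_code(L),
    code_in(U, Cs), is_upper_code(U),
    !.
```

Under these definitions, text with no letters at all, such as '**', satisfies both sketch_lower/1 and sketch_upper/1 but not sketch_mixed/1, which matches the behaviour shown in the examples above.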
     

lower(+Given, ?Lower)
unifies Lower with a lowercase version of Given. Uppercase letters are converted to lowercase, and no other changes are made. lower(Lower) is true.
upper(+Given, ?Upper)
unifies Upper with an uppercase version of Given. Lowercase letters are converted to uppercase, and no other changes are made. upper(Upper) is true.
mixed(+Given, ?Mixed)
unifies Mixed with a mixed-case version of Given. In each block of consecutive letters, the first letter is converted to uppercase and the following letters are converted to lowercase. No other changes are made. mixed(Mixed) is true if and only if Given contained at least two adjacent letters; otherwise upper(Mixed) is true.

In each of these predicates, Given may be an atom or a non-empty list of character codes. If Given is a number, these predicates will quietly fail. The action of these predicates on other terms is not defined. The second argument is unified with a term of the same type as Given, containing the same number of characters.
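
The conversions, and in particular the block-by-block rule for mixed case, can be sketched as follows. The names conv_lower/2, conv_upper/2, conv_mixed/2 and their helpers are hypothetical, the sketch handles ASCII letters only, and it assumes ISO atom_codes/2; it is an illustration of the description above, not the library(caseconv) source.

```prolog
% Hypothetical stand-ins for lower/2, upper/2 and mixed/2.
conv_lower(Given, Lower) :- convert(Given, all_lower, Lower).
conv_upper(Given, Upper) :- convert(Given, all_upper, Upper).
conv_mixed(Given, Mixed) :- convert(Given, capitalise, Mixed).

% convert(+Given, +Mode, ?Result): Given is an atom or a non-empty code
% list; Result has the same type. Numbers match neither clause and fail.
convert(Given, Mode, Result) :-
    atom(Given), !,
    atom_codes(Given, Cs0),
    map_codes(Mode, Cs0, Cs),
    atom_codes(Result, Cs).
convert([C|Cs0], Mode, Result) :-
    map_codes(Mode, [C|Cs0], Result).

map_codes(all_lower, Cs0, Cs) :- lower_codes(Cs0, Cs).
map_codes(all_upper, Cs0, Cs) :- upper_codes(Cs0, Cs).
map_codes(capitalise, Cs0, Cs) :- mixed_codes(Cs0, start, Cs).

lower_codes([], []).
lower_codes([C0|Cs0], [C|Cs]) :- lower_code(C0, C), lower_codes(Cs0, Cs).

upper_codes([], []).
upper_codes([C0|Cs0], [C|Cs]) :- upper_code(C0, C), upper_codes(Cs0, Cs).

% Single-code conversions; non-letters are left unchanged.
lower_code(C0, C) :- ( C0 >= 0'A, C0 =< 0'Z -> C is C0 + 32 ; C = C0 ).
upper_code(C0, C) :- ( C0 >= 0'a, C0 =< 0'z -> C is C0 - 32 ; C = C0 ).

is_letter(C) :- ( C >= 0'A, C =< 0'Z -> true ; C >= 0'a, C =< 0'z ).

% mixed_codes(+Codes0, +State, -Codes): State is 'start' when the next
% letter begins a block of letters, and 'inside' after its first letter.
mixed_codes([], _, []).
mixed_codes([C0|Cs0], State, [C|Cs]) :-
    (   is_letter(C0)
    ->  (   State == start -> upper_code(C0, C) ; lower_code(C0, C) ),
        Next = inside
    ;   C = C0,
        Next = start
    ),
    mixed_codes(Cs0, Next, Cs).
```

Carrying a two-valued state through the list keeps the mixed-case rule to a single pass: any non-letter resets the state, so the next letter is treated as the first of a new block.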

Examples (assuming that library(printchars) has been loaded):

     | ?- lower("Are other character sets a REAL problem?", X).
     
     X = "are other character sets a real problem?"
     
     | ?- upper('Yes, they are!', X).
     
     X = 'YES, THEY ARE!'
     
     | ?- mixed('what a nuisance', X).
     
     X = 'What A Nuisance'
     
     | ?- upper(1.2e3, X).
     
     no
     
     | ?- lower('1.2E3', X).
     
     X = '1.2e3'
     

Note that numbers cannot be converted by these predicates.