10.10 Transliterating automatic strings

10.10.1 Case fiddling

Libretto provides a group of functions for altering the case of strings. These functions use the ANSI case-handling functions (‘isupper’, ‘islower’, ‘toupper’ and ‘tolower’); their exact behaviour can therefore depend on the current locale.

— Function: void astr_upcase (Autostr *astr)
— Function: void astr_downcase (Autostr *astr)

Converts every character in astr to uppercase or lowercase respectively (according to the ANSI ‘toupper’ and ‘tolower’ functions).

— Function: void astr_flipcase (Autostr *astr)

Converts every uppercase character in astr to lowercase, and vice versa.

— Function: void astr_upcase_initials (Autostr *astr)
— Function: void astr_capitalise (Autostr *astr)

For every sequence of alphabetic characters in astr, ‘astr_upcase_initials’ converts the first character to uppercase. ‘astr_capitalise’ behaves similarly, but the remaining characters in each sequence are converted to lowercase, rather than being left as they were.
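
The difference between the two functions only shows up on mixed-case input. The following is a minimal sketch of the behaviour described above, written against a plain NUL-terminated buffer rather than an Autostr. The name ‘sketch_case_initials’ and its ‘lower_rest’ flag are hypothetical and not part of Libretto; the library's own implementation may differ.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical stand-in for the documented behaviour, NOT Libretto code.
   lower_rest == 0 mimics the description of astr_upcase_initials;
   lower_rest != 0 mimics the description of astr_capitalise. */
void sketch_case_initials(char *s, int lower_rest)
{
    size_t i, len;
    int in_word = 0;   /* are we inside a run of alphabetic characters? */

    assert(s != NULL);
    len = strlen(s);
    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char) s[i];

        if (!isalpha(c)) {
            in_word = 0;                 /* the run ends at any non-letter */
        } else if (!in_word) {
            s[i] = (char) toupper(c);    /* first letter of a run */
            in_word = 1;
        } else if (lower_rest) {
            s[i] = (char) tolower(c);    /* later letters, capitalise only */
        }
    }
}
```

Under this sketch, the input "hello wORLD" becomes "Hello WORLD" when ‘lower_rest’ is zero and "Hello World" when it is non-zero.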

10.10.2 Generalised transliteration

— Function pointer: Astr_totype_f int (*) (int c)

The type for a function which can transliterate one character into another. An example would be the standard C function ‘toupper’.

— Function: void astr_translit_c (Autostr *astr, int from, int to)

Replaces every occurrence of the character from in astr with the character to. Neither from nor to may be zero.

— Function: void astr_translit_f (Autostr *astr, Astr_istype_f from, Astr_totype_f to)

For every character in astr for which from returns a non-zero value, replaces that character with the result of calling to with that character as its argument. If to returns zero, the character is left unchanged.
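
The predicate-and-mapping rule can be illustrated with a short sketch on a plain NUL-terminated buffer. Everything named ‘sketch_…’ here is hypothetical and not part of Libretto. The sketch also assumes that ‘Astr_istype_f’ has the same shape as the standard ‘isspace’ family, int (*) (int), which this section does not state.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical local typedefs; the Astr_istype_f shape is an assumption. */
typedef int (*sketch_istype_f)(int c);
typedef int (*sketch_totype_f)(int c);

/* Hypothetical stand-in for the documented rule, NOT Libretto code:
   where from(c) is non-zero, replace c with to(c), unless to(c) is zero. */
void sketch_translit_f(char *s, sketch_istype_f from, sketch_totype_f to)
{
    size_t i, len;

    assert(s != NULL && from != NULL && to != NULL);
    len = strlen(s);
    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char) s[i];

        if (from(c)) {
            int r = to(c);
            if (r != 0)          /* a zero result leaves the character as is */
                s[i] = (char) r;
        }
    }
}

/* Example mapping function: turn any character into an underscore. */
int sketch_to_underscore(int c)
{
    (void) c;
    return '_';
}
```

With this sketch, passing ‘isspace’ and ‘sketch_to_underscore’ turns "a b" into "a_b", and passing ‘islower’ and ‘toupper’ uppercases only the lowercase letters.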

— Function: void astr_translit (Autostr *astr, const Autostr *from_set, const Autostr *to_set)
— Function: void astr_translit_s (Autostr *astr, const char *from_set, const char *to_set)

Transliterates each character in astr that is in from_set to the corresponding member of to_set. from_set and to_set contain textual descriptions of the characters desired, in a manner similar to that used by the tr(1) program. Specifically, these `sets' consist of a sequence of elements, where each element is either a single character or a character range. A character range is two characters x and y separated by a hyphen ‘-’, where y (as an ‘unsigned char’) is greater than x (as an ‘unsigned char’). Specifying a character range as an element is equivalent to specifying all the individual characters in the inclusive interval from x to y.

If from_set specifies more characters than to_set, the behaviour is as if to_set ends with as many copies of its last character as are needed. If to_set specifies more characters than from_set, the additional characters are ignored.

The following examples illustrate these rules.

astr_translit_s (astr, "a", "j")
     Transliterates each `a' into a `j'.
astr_translit_s (astr, "ab", "jk")
     Transliterates each `a' into a `j' and each `b' into a `k'.
astr_translit_s (astr, "a-c", "j-l")
astr_translit_s (astr, "a-c", "j-z")
     Both calls transliterate each `a' into a `j', each `b' into a `k', and each `c' into an `l'. In the second call, the surplus characters in to_set are ignored.
astr_translit_s (astr, "a-e", "x")
     Transliterates each `a', `b', `c', `d' and `e' into an `x'.
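
The set rules can also be sketched in plain C. The code below is an illustration under stated assumptions, not Libretto's implementation: all ‘sketch_…’ names and the ‘SKETCH_SET_MAX’ limit are hypothetical. Three behaviours are guesses because the text above does not specify them: a hyphen that does not form a valid range is treated as a literal character, a character repeated in from_set takes its last mapping, and an empty to_set leaves the string unchanged.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_SET_MAX 1024   /* arbitrary cap on an expanded set (assumption) */

/* Expand a tr-like set such as "a-cx" into its individual characters.
   Returns the number of characters written to out (at most cap). */
size_t sketch_expand_set(const char *set, unsigned char *out, size_t cap)
{
    size_t len = strlen(set), i = 0, n = 0;

    while (i < len) {
        unsigned char x = (unsigned char) set[i];

        if (i + 2 < len && set[i + 1] == '-'
            && (unsigned char) set[i + 2] > x) {
            unsigned char y = (unsigned char) set[i + 2];
            unsigned int c;

            for (c = x; c <= y && n < cap; c++)   /* inclusive range x..y */
                out[n++] = (unsigned char) c;
            i += 3;
        } else {
            if (n < cap)
                out[n++] = x;                     /* single literal character */
            i++;
        }
    }
    return n;
}

/* Hypothetical stand-in for the documented rule, on a plain C string. */
void sketch_translit_s(char *s, const char *from_set, const char *to_set)
{
    unsigned char from[SKETCH_SET_MAX], to[SKETCH_SET_MAX], map[256];
    size_t nf, nt, i;

    assert(s != NULL && from_set != NULL && to_set != NULL);
    nf = sketch_expand_set(from_set, from, SKETCH_SET_MAX);
    nt = sketch_expand_set(to_set, to, SKETCH_SET_MAX);
    if (nt == 0)
        return;                                   /* assumption: no-op */

    for (i = 0; i < 256; i++)
        map[i] = (unsigned char) i;               /* identity by default */
    for (i = 0; i < nf; i++)                      /* pad with last of to_set */
        map[from[i]] = to[i < nt ? i : nt - 1];

    for (; *s != '\0'; s++)
        *s = (char) map[(unsigned char) *s];
}
```

Building the 256-entry table once and then making a single pass over the string is one plausible reading of "the form used internally"; the actual internal form is not described in this section.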

Note that parsing the sets into the form used internally carries some overhead. For many applications, especially those working on small strings, it is better to use ‘astr_translit_f’ with appropriate functions.