CrystalSpace

Public API Reference

csUnicodeTransform Class Reference
[Utilities]

Contains functions to convert between several UTF encodings.

#include <csutil/csuctransform.h>

Static Public Member Functions

UTF Decoders
int UTF8Decode (const utf8_char *str, size_t strlen, utf32_char &ch, bool *isValid=0, bool returnNonChar=false)
 Decode a Unicode character encoded in UTF-8.
int UTF16Decode (const utf16_char *str, size_t strlen, utf32_char &ch, bool *isValid=0, bool returnNonChar=false)
 Decode a Unicode character encoded in UTF-16.
int UTF32Decode (const utf32_char *str, size_t strlen, utf32_char &ch, bool *isValid=0, bool returnNonChar=false)
 Decode a Unicode character encoded in UTF-32.
int Decode (const utf8_char *str, size_t strlen, utf32_char &ch, bool *isValid=0, bool returnNonChar=false)
 Decode a Unicode character encoded in UTF-8.
int Decode (const utf16_char *str, size_t strlen, utf32_char &ch, bool *isValid=0, bool returnNonChar=false)
 Decode a Unicode character encoded in UTF-16.
int Decode (const utf32_char *str, size_t strlen, utf32_char &ch, bool *isValid=0, bool returnNonChar=false)
 Decode a Unicode character encoded in UTF-32.
UTF Encoders
int EncodeUTF8 (const utf32_char ch, utf8_char *buf, size_t bufsize, bool allowNonchars=false)
 Encode a Unicode character to UTF-8.
int EncodeUTF16 (const utf32_char ch, utf16_char *buf, size_t bufsize, bool allowNonchars=false)
 Encode a Unicode character to UTF-16.
int EncodeUTF32 (const utf32_char ch, utf32_char *buf, size_t bufsize, bool allowNonchars=false)
 Encode a Unicode character to UTF-32.
int Encode (const utf32_char ch, utf8_char *buf, size_t bufsize, bool allowNonchars=false)
 Encode a Unicode character to UTF-8.
int Encode (const utf32_char ch, utf16_char *buf, size_t bufsize, bool allowNonchars=false)
 Encode a Unicode character to UTF-16.
int Encode (const utf32_char ch, utf32_char *buf, size_t bufsize, bool allowNonchars=false)
 Encode a Unicode character to UTF-32.
Converters between strings in different UTF encodings
size_t UTF8to16 (utf16_char *dest, size_t destSize, const utf8_char *source, size_t srcSize=(size_t)-1)
 Convert UTF-8 to UTF-16.
size_t UTF8to32 (utf32_char *dest, size_t destSize, const utf8_char *source, size_t srcSize=(size_t)-1)
 Convert UTF-8 to UTF-32.
size_t UTF16to8 (utf8_char *dest, size_t destSize, const utf16_char *source, size_t srcSize=(size_t)-1)
 Convert UTF-16 to UTF-8.
size_t UTF16to32 (utf32_char *dest, size_t destSize, const utf16_char *source, size_t srcSize=(size_t)-1)
 Convert UTF-16 to UTF-32.
size_t UTF32to8 (utf8_char *dest, size_t destSize, const utf32_char *source, size_t srcSize=(size_t)-1)
 Convert UTF-32 to UTF-8.
size_t UTF32to16 (utf16_char *dest, size_t destSize, const utf32_char *source, size_t srcSize=(size_t)-1)
 Convert UTF-32 to UTF-16.
Converters between UTF and platform-specific wchar_t
size_t UTF8toWC (wchar_t *dest, size_t destSize, const utf8_char *source, size_t srcSize)
 Convert UTF-8 to platform-specific wide chars.
size_t UTF16toWC (wchar_t *dest, size_t destSize, const utf16_char *source, size_t srcSize)
 Convert UTF-16 to platform-specific wide chars.
size_t UTF32toWC (wchar_t *dest, size_t destSize, const utf32_char *source, size_t srcSize)
 Convert UTF-32 to platform-specific wide chars.
size_t WCtoUTF8 (utf8_char *dest, size_t destSize, const wchar_t *source, size_t srcSize)
 Convert platform-specific wide chars to UTF-8.
size_t WCtoUTF16 (utf16_char *dest, size_t destSize, const wchar_t *source, size_t srcSize)
 Convert platform-specific wide chars to UTF-16.
size_t WCtoUTF32 (utf32_char *dest, size_t destSize, const wchar_t *source, size_t srcSize)
 Convert platform-specific wide chars to UTF-32.
int Decode (const wchar_t *str, size_t strlen, utf32_char &ch, bool *isValid=0, bool returnNonChar=false)
 Decode a Unicode character encoded as wchar_t.
int Encode (const utf32_char ch, wchar_t *buf, size_t bufsize, bool allowNonchars=false)
 Encode a Unicode character to wchar_t.
Helpers to skip encoded chars in different UTF encodings
int UTF8Skip (const utf8_char *str, size_t maxSkip)
 Determine how many characters in a UTF-8 buffer need to be skipped to get to the next encoded char.
int UTF8Rewind (const utf8_char *str, size_t maxRew)
 Determine how many characters in a UTF-8 buffer need to be skipped back to get to the start of the previous encoded character.
int UTF16Skip (const utf16_char *str, size_t maxSkip)
 Determine how many characters in a UTF-16 buffer need to be skipped to get to the next encoded char.
int UTF16Rewind (const utf16_char *str, size_t maxRew)
 Determine how many characters in a UTF-16 buffer need to be skipped back to get to the start of the previous encoded character.
int UTF32Skip (const utf32_char *str, size_t maxSkip)
 Determine how many characters in a UTF-32 buffer need to be skipped to get to the next encoded char.
int UTF32Rewind (const utf32_char *str, size_t maxRew)
 Determine how many characters in a UTF-32 buffer need to be skipped back to get to the start of the previous encoded character.
Character mappings
size_t MapToUpper (const utf32_char ch, utf32_char *dest, size_t destSize)
 Map a character to its upper case equivalent(s).
size_t MapToLower (const utf32_char ch, utf32_char *dest, size_t destSize)
 Map a character to its lower case equivalent(s).
size_t MapToFold (const utf32_char ch, utf32_char *dest, size_t destSize)
 Map a character to its fold equivalent(s).


Detailed Description

Contains functions to convert between several UTF encodings.

Definition at line 46 of file csuctransform.h.


Member Function Documentation

int csUnicodeTransform::Decode (const wchar_t *str, size_t strlen, utf32_char &ch, bool *isValid=0, bool returnNonChar=false) [inline, static]

Decode a Unicode character encoded as wchar_t.

Parameters:
str Pointer to the encoded character.
strlen Number of chars in the string.
ch Decoded character.
isValid When an error occurred during decoding, ch contains the replacement character (CS_UC_CHAR_REPLACER), and the bool pointed to by isValid is set to false. The parameter can be 0, but then there is no way to tell whether a decoded replacement character was caused by erroneous source data.
returnNonChar Whether decoded non-characters and high and low surrogates are returned as such. Normally, those code points are replaced with CS_UC_CHAR_REPLACER to signal an invalid encoded code point.
Returns:
The number of characters in str that have to be skipped to retrieve the next encoded character.

Definition at line 706 of file csuctransform.h.

References utf16_char.

int csUnicodeTransform::Decode (const utf32_char *str, size_t strlen, utf32_char &ch, bool *isValid=0, bool returnNonChar=false) [inline, static]

Decode a Unicode character encoded in UTF-32.

Parameters:
str Pointer to the encoded character.
strlen Number of chars in the string.
ch Decoded character.
isValid When an error occurred during decoding, ch contains the replacement character (CS_UC_CHAR_REPLACER), and the bool pointed to by isValid is set to false. The parameter can be 0, but then there is no way to tell whether a decoded replacement character was caused by erroneous source data.
returnNonChar Whether decoded non-characters and high and low surrogates are returned as such. Normally, those code points are replaced with CS_UC_CHAR_REPLACER to signal an invalid encoded code point.
Returns:
The number of characters in str that have to be skipped to retrieve the next encoded character.

Definition at line 262 of file csuctransform.h.

int csUnicodeTransform::Decode (const utf16_char *str, size_t strlen, utf32_char &ch, bool *isValid=0, bool returnNonChar=false) [inline, static]

Decode a Unicode character encoded in UTF-16.

Parameters:
str Pointer to the encoded character.
strlen Number of chars in the string.
ch Decoded character.
isValid When an error occurred during decoding, ch contains the replacement character (CS_UC_CHAR_REPLACER), and the bool pointed to by isValid is set to false. The parameter can be 0, but then there is no way to tell whether a decoded replacement character was caused by erroneous source data.
returnNonChar Whether decoded non-characters and high and low surrogates are returned as such. Normally, those code points are replaced with CS_UC_CHAR_REPLACER to signal an invalid encoded code point.
Returns:
The number of characters in str that have to be skipped to retrieve the next encoded character.

Definition at line 253 of file csuctransform.h.

int csUnicodeTransform::Decode (const utf8_char *str, size_t strlen, utf32_char &ch, bool *isValid=0, bool returnNonChar=false) [inline, static]

Decode a Unicode character encoded in UTF-8.

Parameters:
str Pointer to the encoded character.
strlen Number of chars in the string.
ch Decoded character.
isValid When an error occurred during decoding, ch contains the replacement character (CS_UC_CHAR_REPLACER), and the bool pointed to by isValid is set to false. The parameter can be 0, but then there is no way to tell whether a decoded replacement character was caused by erroneous source data.
returnNonChar Whether decoded non-characters and high and low surrogates are returned as such. Normally, those code points are replaced with CS_UC_CHAR_REPLACER to signal an invalid encoded code point.
Returns:
The number of characters in str that have to be skipped to retrieve the next encoded character.

Definition at line 244 of file csuctransform.h.

Referenced by csFmtDefaultReader< T >::GetNext().

int csUnicodeTransform::Encode (const utf32_char ch, wchar_t *buf, size_t bufsize, bool allowNonchars=false) [inline, static]

Encode a Unicode character to wchar_t.

Parameters:
ch Character to encode.
buf Pointer to the buffer receiving the encoded character.
bufsize Number of chars in the buffer.
allowNonchars Whether non-characters and high and low surrogates are encoded. Normally, those code points are rejected to prevent the generation of invalid encoded strings.
Returns:
The number of characters needed to encode ch.
Remarks:
The buffer will be filled up as much as possible. Compare the returned value with bufsize to determine whether the encoded character fit into the buffer.

Definition at line 715 of file csuctransform.h.

References utf16_char.

int csUnicodeTransform::Encode (const utf32_char ch, utf32_char *buf, size_t bufsize, bool allowNonchars=false) [inline, static]

Encode a Unicode character to UTF-32.

Parameters:
ch Character to encode.
buf Pointer to the buffer receiving the encoded character.
bufsize Number of chars in the buffer.
allowNonchars Whether non-characters and high and low surrogates are encoded. Normally, those code points are rejected to prevent the generation of invalid encoded strings.
Returns:
The number of characters needed to encode ch.
Remarks:
The buffer will be filled up as much as possible. Compare the returned value with bufsize to determine whether the encoded character fit into the buffer.

Definition at line 422 of file csuctransform.h.

int csUnicodeTransform::Encode (const utf32_char ch, utf16_char *buf, size_t bufsize, bool allowNonchars=false) [inline, static]

Encode a Unicode character to UTF-16.

Parameters:
ch Character to encode.
buf Pointer to the buffer receiving the encoded character.
bufsize Number of chars in the buffer.
allowNonchars Whether non-characters and high and low surrogates are encoded. Normally, those code points are rejected to prevent the generation of invalid encoded strings.
Returns:
The number of characters needed to encode ch.
Remarks:
The buffer will be filled up as much as possible. Compare the returned value with bufsize to determine whether the encoded character fit into the buffer.

Definition at line 413 of file csuctransform.h.

int csUnicodeTransform::Encode (const utf32_char ch, utf8_char *buf, size_t bufsize, bool allowNonchars=false) [inline, static]

Encode a Unicode character to UTF-8.

Parameters:
ch Character to encode.
buf Pointer to the buffer receiving the encoded character.
bufsize Number of chars in the buffer.
allowNonchars Whether non-characters and high and low surrogates are encoded. Normally, those code points are rejected to prevent the generation of invalid encoded strings.
Returns:
The number of characters needed to encode ch.
Remarks:
The buffer will be filled up as much as possible. Compare the returned value with bufsize to determine whether the encoded character fit into the buffer.

Definition at line 404 of file csuctransform.h.

Referenced by csFmtDefaultWriter< T >::Put().

int csUnicodeTransform::EncodeUTF16 (const utf32_char ch, utf16_char *buf, size_t bufsize, bool allowNonchars=false) [inline, static]

Encode a Unicode character to UTF-16.

Parameters:
ch Character to encode.
buf Pointer to the buffer receiving the encoded character.
bufsize Number of chars in the buffer.
allowNonchars Whether non-characters and high and low surrogates are encoded. Normally, those code points are rejected to prevent the generation of invalid encoded strings.
Returns:
The number of characters needed to encode ch.
Remarks:
The buffer will be filled up as much as possible. Compare the returned value with bufsize to determine whether the encoded character fit into the buffer.

Definition at line 355 of file csuctransform.h.

References CS_UC_CHAR_HIGH_SURROGATE_FIRST, CS_UC_CHAR_LOW_SURROGATE_FIRST, CS_UC_IS_NONCHARACTER, CS_UC_IS_SURROGATE, utf16_char, and utf32_char.
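
The contract above (return the number of units needed, write only what fits) can be illustrated with a standalone sketch. It is not the Crystal Space implementation: SketchEncodeUtf16 is a hypothetical name, returning 0 for a rejected code point is an assumption of the sketch, and the allowNonchars switch is left out.

```cpp
#include <cstddef>
#include <cstdint>

// Encodes one code point as UTF-16.
// Returns the number of 16-bit units the full encoding needs, or 0 if the
// code point is rejected (an assumption of this sketch). Only as many units
// as fit into buf are written.
int SketchEncodeUtf16(std::uint32_t ch, std::uint16_t* buf, std::size_t bufsize)
{
  const bool surrogate = ch >= 0xD800 && ch <= 0xDFFF;
  if (surrogate || ch > 0x10FFFF) return 0;

  if (ch < 0x10000)
  {
    // Basic Multilingual Plane: a single unit.
    if (bufsize >= 1) buf[0] = static_cast<std::uint16_t>(ch);
    return 1;
  }

  // Supplementary plane: split the 20 payload bits into a surrogate pair.
  const std::uint32_t v = ch - 0x10000;
  if (bufsize >= 1) buf[0] = static_cast<std::uint16_t>(0xD800 + (v >> 10));
  if (bufsize >= 2) buf[1] = static_cast<std::uint16_t>(0xDC00 + (v & 0x3FF));
  return 2;
}
```

A caller that passes a one-unit buffer for a supplementary-plane character gets 2 back and only the high surrogate written, which is why the return value has to be compared with bufsize.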

int csUnicodeTransform::EncodeUTF32 (const utf32_char ch, utf32_char *buf, size_t bufsize, bool allowNonchars=false) [inline, static]

Encode a Unicode character to UTF-32.

Parameters:
ch Character to encode.
buf Pointer to the buffer receiving the encoded character.
bufsize Number of chars in the buffer.
allowNonchars Whether non-characters and high and low surrogates are encoded. Normally, those code points are rejected to prevent the generation of invalid encoded strings.
Returns:
The number of characters needed to encode ch.
Remarks:
The buffer will be filled up as much as possible. Compare the returned value with bufsize to determine whether the encoded character fit into the buffer.

Definition at line 386 of file csuctransform.h.

References CS_UC_IS_NONCHARACTER, and CS_UC_IS_SURROGATE.

int csUnicodeTransform::EncodeUTF8 (const utf32_char ch, utf8_char *buf, size_t bufsize, bool allowNonchars=false) [inline, static]

Encode a Unicode character to UTF-8.

Parameters:
ch Character to encode.
buf Pointer to the buffer receiving the encoded character.
bufsize Number of chars in the buffer.
allowNonchars Whether non-characters and high and low surrogates are encoded. Normally, those code points are rejected to prevent the generation of invalid encoded strings.
Returns:
The number of characters needed to encode ch.
Remarks:
The buffer will be filled up as much as possible. Compare the returned value with bufsize to determine whether the encoded character fit into the buffer.

Definition at line 298 of file csuctransform.h.

References CS_UC_IS_NONCHARACTER, CS_UC_IS_SURROGATE, and utf8_char.
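
As a rough illustration of the "fill as much as possible, then check the return value" remark, the following standalone sketch encodes one code point to UTF-8. The names SketchEncodeUtf8 and EncodedFits are hypothetical, the 0 return for rejected code points is an assumption, and the code is not taken from csuctransform.h.

```cpp
#include <cstddef>
#include <cstdint>

// Encodes one code point as UTF-8.
// Returns the number of bytes the full encoding needs, or 0 if the code
// point is rejected (an assumption of this sketch). Only the bytes that fit
// into buf are written.
int SketchEncodeUtf8(std::uint32_t ch, unsigned char* buf, std::size_t bufsize)
{
  if ((ch >= 0xD800 && ch <= 0xDFFF) || ch > 0x10FFFF) return 0;

  unsigned char out[4];
  int n;
  if (ch < 0x80)
  {
    out[0] = static_cast<unsigned char>(ch);
    n = 1;
  }
  else if (ch < 0x800)
  {
    out[0] = static_cast<unsigned char>(0xC0 | (ch >> 6));
    out[1] = static_cast<unsigned char>(0x80 | (ch & 0x3F));
    n = 2;
  }
  else if (ch < 0x10000)
  {
    out[0] = static_cast<unsigned char>(0xE0 | (ch >> 12));
    out[1] = static_cast<unsigned char>(0x80 | ((ch >> 6) & 0x3F));
    out[2] = static_cast<unsigned char>(0x80 | (ch & 0x3F));
    n = 3;
  }
  else
  {
    out[0] = static_cast<unsigned char>(0xF0 | (ch >> 18));
    out[1] = static_cast<unsigned char>(0x80 | ((ch >> 12) & 0x3F));
    out[2] = static_cast<unsigned char>(0x80 | ((ch >> 6) & 0x3F));
    out[3] = static_cast<unsigned char>(0x80 | (ch & 0x3F));
    n = 4;
  }

  // Copy the part of the encoding that fits into the caller's buffer.
  for (int i = 0; i < n && static_cast<std::size_t>(i) < bufsize; ++i)
    buf[i] = out[i];
  return n;
}

// True if the complete encoding was written, i.e. it was accepted and fit.
bool EncodedFits(int needed, std::size_t bufsize)
{
  return needed > 0 && static_cast<std::size_t>(needed) <= bufsize;
}
```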

size_t csUnicodeTransform::MapToFold (const utf32_char ch, utf32_char *dest, size_t destSize) [static]

Map a character to its fold equivalent(s).

Fold mapping is useful for case-insensitive binary comparison of two Unicode strings.

Parameters:
ch Char to be mapped.
dest Destination buffer.
destSize Number of characters the destination buffer can hold.
Returns:
Number of characters the complete mapping result would require.

size_t csUnicodeTransform::MapToLower (const utf32_char ch, utf32_char *dest, size_t destSize) [static]

Map a character to its lower case equivalent(s).

Parameters:
ch Char to be mapped.
dest Destination buffer.
destSize Number of characters the destination buffer can hold.
Returns:
Number of characters the complete mapping result would require.

size_t csUnicodeTransform::MapToUpper (const utf32_char ch, utf32_char *dest, size_t destSize) [static]

Map a character to its upper case equivalent(s).

Parameters:
ch Char to be mapped.
dest Destination buffer.
destSize Number of characters the destination buffer can hold.
Returns:
Number of characters the complete mapping result would require.
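
The mapping functions take a destination buffer because one character may map to several. A toy sketch of that calling convention follows. It is not the library's mapping table: SketchMapToUpper is a hypothetical name, it only knows ASCII letters plus U+00DF (whose full uppercase mapping in Unicode is "SS"), and truncating the output to destSize is an assumption.

```cpp
#include <cstddef>
#include <cstdint>

// Toy upper-case mapping: ASCII letters and U+00DF only.
// Returns the number of characters the complete mapping needs and writes at
// most destSize of them to dest.
std::size_t SketchMapToUpper(std::uint32_t ch, std::uint32_t* dest,
                             std::size_t destSize)
{
  std::uint32_t out[2] = {ch, 0};
  std::size_t n = 1;
  if (ch == 0x00DF)
  {
    // LATIN SMALL LETTER SHARP S maps to two characters: "SS".
    out[0] = 'S';
    out[1] = 'S';
    n = 2;
  }
  else if (ch >= 'a' && ch <= 'z')
  {
    out[0] = ch - ('a' - 'A');
  }

  for (std::size_t i = 0; i < n && i < destSize; ++i)
    dest[i] = out[i];
  return n;
}
```

Calling it with destSize 0 yields the required length without writing anything, so a caller can size its buffer first.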

int csUnicodeTransform::UTF16Decode (const utf16_char *str, size_t strlen, utf32_char &ch, bool *isValid=0, bool returnNonChar=false) [inline, static]

Decode a Unicode character encoded in UTF-16.

Parameters:
str Pointer to the encoded character.
strlen Number of chars in the string.
ch Decoded character.
isValid When an error occurred during decoding, ch contains the replacement character (CS_UC_CHAR_REPLACER), and the bool pointed to by isValid is set to false. The parameter can be 0, but then there is no way to tell whether a decoded replacement character was caused by erroneous source data.
returnNonChar Whether decoded non-characters and high and low surrogates are returned as such. Normally, those code points are replaced with CS_UC_CHAR_REPLACER to signal an invalid encoded code point.
Returns:
The number of characters in str that have to be skipped to retrieve the next encoded character.

Definition at line 181 of file csuctransform.h.

References CS_UC_IS_HIGH_SURROGATE, CS_UC_IS_LOW_SURROGATE, CS_UC_IS_NONCHARACTER, CS_UC_IS_SURROGATE, and utf16_char.

int csUnicodeTransform::UTF16Rewind (const utf16_char *str, size_t maxRew) [inline, static]

Determine how many characters in a UTF-16 buffer need to be skipped back to get to the start of the previous encoded character.

Parameters:
str Pointer to the encoded character that follows the character to be skipped back over.
maxRew The number of characters to go back at max. Typically, this is the number of chars from str to the start of the buffer.
Returns:
Number of chars to skip back in the buffer. Returns 0 if maxRew is 0.

Definition at line 892 of file csuctransform.h.

References CS_UC_IS_HIGH_SURROGATE, CS_UC_IS_SURROGATE, and utf16_char.

int csUnicodeTransform::UTF16Skip (const utf16_char *str, size_t maxSkip) [inline, static]

Determine how many characters in a UTF-16 buffer need to be skipped to get to the next encoded char.

Parameters:
str Pointer to buffer with encoded character.
maxSkip The number of characters to skip at max. Usually, this is the number of chars from str to the end of the buffer.
Returns:
Number of chars to skip in the buffer. Returns 0 if maxSkip is 0.

Definition at line 879 of file csuctransform.h.

References CS_UC_IS_HIGH_SURROGATE.
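
A typical use of a skip helper is to walk a buffer one encoded character at a time. The sketch below is standalone and only mirrors the documented contract. SketchUtf16Skip and CountCodePoints are hypothetical names, and checking for a following low surrogate is a choice made here, not something the reference states.

```cpp
#include <cstddef>
#include <cstdint>

// Returns how many 16-bit units to advance to reach the next encoded
// character: 2 for a surrogate pair, otherwise 1, and 0 if maxSkip is 0.
int SketchUtf16Skip(const std::uint16_t* str, std::size_t maxSkip)
{
  if (maxSkip == 0) return 0;
  const bool high = str[0] >= 0xD800 && str[0] <= 0xDBFF;
  if (high && maxSkip >= 2 && str[1] >= 0xDC00 && str[1] <= 0xDFFF)
    return 2;
  return 1;
}

// Counts encoded characters by repeatedly skipping to the next one.
std::size_t CountCodePoints(const std::uint16_t* str, std::size_t len)
{
  std::size_t count = 0;
  std::size_t pos = 0;
  while (pos < len)
  {
    // The skip is at least 1 here because len - pos is non-zero.
    pos += static_cast<std::size_t>(SketchUtf16Skip(str + pos, len - pos));
    ++count;
  }
  return count;
}
```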

size_t csUnicodeTransform::UTF16to32 (utf32_char *dest, size_t destSize, const utf16_char *source, size_t srcSize=(size_t)-1) [inline, static]

Convert UTF-16 to UTF-32.

Parameters:
dest Destination buffer.
destSize Number of characters the destination buffer can hold.
source Source buffer.
srcSize Number of characters contained in the source buffer. If this is (size_t)-1, the length will be determined automatically.
Returns:
Number of characters in the complete converted string, including null terminator.
Remarks:
If the complete converted string does not fit into the destination buffer, it is truncated but still null-terminated. Hence, if the destination buffer has a size of 1, you get an empty string. The returned value is the number of characters needed for the *whole* converted string.

Definition at line 516 of file csuctransform.h.
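
Because the return value is the size of the whole converted string, a caller can size its buffer in a first pass and convert in a second. The standalone sketch below mimics that contract for UTF-16 to UTF-32. SketchUtf16To32 and ConvertAll are hypothetical names, the source length is always passed explicitly, and accepting a null dest with destSize 0 is an assumption of the sketch rather than documented library behaviour.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Converts srcSize UTF-16 units to UTF-32. Output is truncated to destSize
// but always null-terminated when destSize > 0. Returns the number of
// UTF-32 units the complete result needs, including the terminator.
std::size_t SketchUtf16To32(std::uint32_t* dest, std::size_t destSize,
                            const std::uint16_t* src, std::size_t srcSize)
{
  std::size_t needed = 0;
  std::size_t i = 0;
  while (i < srcSize)
  {
    std::uint32_t cp = src[i++];
    const bool high = cp >= 0xD800 && cp <= 0xDBFF;
    const bool low = cp >= 0xDC00 && cp <= 0xDFFF;
    if (high && i < srcSize && src[i] >= 0xDC00 && src[i] <= 0xDFFF)
      cp = 0x10000 + ((cp - 0xD800) << 10) + (src[i++] - 0xDC00);
    else if (high || low)
      cp = 0xFFFD;  // unpaired surrogate: substitute U+FFFD
    if (needed + 1 < destSize) dest[needed] = cp;  // keep room for the 0
    ++needed;
  }
  if (destSize > 0)
    dest[needed < destSize ? needed : destSize - 1] = 0;
  return needed + 1;
}

// Two-pass pattern: ask for the size first, then convert into a buffer
// that is exactly large enough.
std::vector<std::uint32_t> ConvertAll(const std::uint16_t* src, std::size_t n)
{
  const std::size_t needed = SketchUtf16To32(nullptr, 0, src, n);
  std::vector<std::uint32_t> out(needed);
  SketchUtf16To32(out.data(), out.size(), src, n);
  return out;
}
```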

size_t csUnicodeTransform::UTF16to8 (utf8_char *dest, size_t destSize, const utf16_char *source, size_t srcSize=(size_t)-1) [inline, static]

Convert UTF-16 to UTF-8.

Parameters:
dest Destination buffer.
destSize Number of characters the destination buffer can hold.
source Source buffer.
srcSize Number of characters contained in the source buffer. If this is (size_t)-1, the length will be determined automatically.
Returns:
Number of characters in the complete converted string, including null terminator.
Remarks:
If the complete converted string does not fit into the destination buffer, it is truncated but still null-terminated. Hence, if the destination buffer has a size of 1, you get an empty string. The returned value is the number of characters needed for the *whole* converted string.

Definition at line 511 of file csuctransform.h.

size_t csUnicodeTransform::UTF16toWC (wchar_t *dest, size_t destSize, const utf16_char *source, size_t srcSize) [inline, static]

Convert UTF-16 to platform-specific wide chars.

Parameters:
dest Destination buffer.
destSize Number of characters the destination buffer can hold.
source Source buffer.
srcSize Number of characters contained in the source buffer. If this is (size_t)-1, the length will be determined automatically.
Returns:
Number of characters in the complete converted string, including null terminator.
Remarks:
If the complete converted string does not fit into the destination buffer, it is truncated but still null-terminated. Hence, if the destination buffer has a size of 1, you get an empty string. The returned value is the number of characters needed for the *whole* converted string.

Definition at line 627 of file csuctransform.h.

References utf16_char.

int csUnicodeTransform::UTF32Decode (const utf32_char *str, size_t strlen, utf32_char &ch, bool *isValid=0, bool returnNonChar=false) [inline, static]

Decode a Unicode character encoded in UTF-32.

Parameters:
str Pointer to the encoded character.
strlen Number of chars in the string.
ch Decoded character.
isValid When an error occurred during decoding, ch contains the replacement character (CS_UC_CHAR_REPLACER), and the bool pointed to by isValid is set to false. The parameter can be 0, but then there is no way to tell whether a decoded replacement character was caused by erroneous source data.
returnNonChar Whether decoded non-characters and high and low surrogates are returned as such. Normally, those code points are replaced with CS_UC_CHAR_REPLACER to signal an invalid encoded code point.
Returns:
The number of characters in str that have to be skipped to retrieve the next encoded character.

Definition at line 224 of file csuctransform.h.

References CS_UC_IS_NONCHARACTER, and CS_UC_IS_SURROGATE.

int csUnicodeTransform::UTF32Rewind (const utf32_char *str, size_t maxRew) [inline, static]

Determine how many characters in a UTF-32 buffer need to be skipped back to get to the start of the previous encoded character.

Parameters:
str Pointer to the encoded character that follows the character to be skipped back over.
maxRew The number of characters to go back at max. Typically, this is the number of chars from str to the start of the buffer.
Returns:
Number of chars to skip back in the buffer. Returns 0 if maxRew is 0.

Definition at line 923 of file csuctransform.h.

int csUnicodeTransform::UTF32Skip (const utf32_char *str, size_t maxSkip) [inline, static]

Determine how many characters in a UTF-32 buffer need to be skipped to get to the next encoded char.

Parameters:
str Pointer to buffer with encoded character.
maxSkip The number of characters to skip at max. Usually, this is the number of chars from str to the end of the buffer.
Returns:
Number of chars to skip in the buffer. Returns 0 if maxSkip is 0.

Definition at line 913 of file csuctransform.h.

size_t csUnicodeTransform::UTF32to16 (utf16_char *dest, size_t destSize, const utf32_char *source, size_t srcSize=(size_t)-1) [inline, static]

Convert UTF-32 to UTF-16.

Parameters:
dest Destination buffer.
destSize Number of characters the destination buffer can hold.
source Source buffer.
srcSize Number of characters contained in the source buffer. If this is (size_t)-1, the length will be determined automatically.
Returns:
Number of characters in the complete converted string, including null terminator.
Remarks:
If the complete converted string does not fit into the destination buffer, it is truncated but still null-terminated. Hence, if the destination buffer has a size of 1, you get an empty string. The returned value is the number of characters needed for the *whole* converted string.

Definition at line 527 of file csuctransform.h.

References utf8_char.

size_t csUnicodeTransform::UTF32to8 (utf8_char *dest, size_t destSize, const utf32_char *source, size_t srcSize=(size_t)-1) [inline, static]

Convert UTF-32 to UTF-8.

Parameters:
dest Destination buffer.
destSize Number of characters the destination buffer can hold.
source Source buffer.
srcSize Number of characters contained in the source buffer. If this is (size_t)-1, the length will be determined automatically.
Returns:
Number of characters in the complete converted string, including null terminator.
Remarks:
If the complete converted string does not fit into the destination buffer, it is truncated but still null-terminated. Hence, if the destination buffer has a size of 1, you get an empty string. The returned value is the number of characters needed for the *whole* converted string.

Definition at line 522 of file csuctransform.h.

size_t csUnicodeTransform::UTF32toWC (wchar_t *dest, size_t destSize, const utf32_char *source, size_t srcSize) [inline, static]

Convert UTF-32 to platform-specific wide chars.

Parameters:
dest Destination buffer.
destSize Number of characters the destination buffer can hold.
source Source buffer.
srcSize Number of characters contained in the source buffer. If this is (size_t)-1, the length will be determined automatically.
Returns:
Number of characters in the complete converted string, including null terminator.
Remarks:
If the complete converted string does not fit into the destination buffer, it is truncated but still null-terminated. Hence, if the destination buffer has a size of 1, you get an empty string. The returned value is the number of characters needed for the *whole* converted string.

Definition at line 650 of file csuctransform.h.

References utf16_char.

int csUnicodeTransform::UTF8Decode (const utf8_char *str, size_t strlen, utf32_char &ch, bool *isValid=0, bool returnNonChar=false) [inline, static]

Decode a Unicode character encoded in UTF-8.

Parameters:
str Pointer to the encoded character.
strlen Number of chars in the string.
ch Decoded character.
isValid When an error occurred during decoding, ch contains the replacement character (CS_UC_CHAR_REPLACER), and the bool pointed to by isValid is set to false. The parameter can be 0, but then there is no way to tell whether a decoded replacement character was caused by erroneous source data.
returnNonChar Whether decoded non-characters and high and low surrogates are returned as such. Normally, those code points are replaced with CS_UC_CHAR_REPLACER to signal an invalid encoded code point.
Returns:
The number of characters in str that have to be skipped to retrieve the next encoded character.

Definition at line 90 of file csuctransform.h.

References CS_UC_IS_NONCHARACTER, CS_UC_IS_SURROGATE, and utf8_char.
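
The decoding contract described above (consume one encoded character, report how many units to skip, substitute a replacement character on bad input) can be illustrated with a standalone sketch. It is not the Crystal Space implementation: SketchUtf8Decode and kReplacer are hypothetical names, U+FFFD is assumed as the value of the replacement character, and the returnNonChar switch is left out.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed stand-in for CS_UC_CHAR_REPLACER (U+FFFD REPLACEMENT CHARACTER).
const std::uint32_t kReplacer = 0xFFFD;

// Decodes one code point from a UTF-8 buffer of len bytes.
// Returns the number of bytes consumed. On malformed input, ch is set to
// kReplacer and, if isValid is non-null, *isValid is set to false.
int SketchUtf8Decode(const unsigned char* str, std::size_t len,
                     std::uint32_t& ch, bool* isValid = nullptr)
{
  ch = kReplacer;
  if (isValid) *isValid = false;
  if (len == 0) return 0;

  const unsigned char lead = str[0];
  int extra = 0;               // continuation bytes expected
  std::uint32_t value = 0;     // payload bits collected so far
  std::uint32_t minValue = 0;  // smallest code point allowed for this length
  if (lead < 0x80)                { value = lead; }
  else if ((lead & 0xE0) == 0xC0) { extra = 1; value = lead & 0x1F; minValue = 0x80; }
  else if ((lead & 0xF0) == 0xE0) { extra = 2; value = lead & 0x0F; minValue = 0x800; }
  else if ((lead & 0xF8) == 0xF0) { extra = 3; value = lead & 0x07; minValue = 0x10000; }
  else return 1;               // stray continuation byte or invalid lead byte

  int used = 1;
  for (int i = 0; i < extra; ++i)
  {
    if (static_cast<std::size_t>(used) >= len || (str[used] & 0xC0) != 0x80)
      return used;             // truncated or broken sequence
    value = (value << 6) | (str[used] & 0x3F);
    ++used;
  }

  // Reject overlong forms, surrogates, and values beyond U+10FFFF.
  const bool overlong = value < minValue;
  const bool surrogate = value >= 0xD800 && value <= 0xDFFF;
  if (overlong || surrogate || value > 0x10FFFF) return used;

  ch = value;
  if (isValid) *isValid = true;
  return used;
}
```

Passing a null isValid still yields the replacement character on error, but the caller can then no longer distinguish a damaged sequence from a genuine U+FFFD in the input.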

int csUnicodeTransform::UTF8Rewind (const utf8_char *str, size_t maxRew) [inline, static]

Determine how many characters in a UTF-8 buffer need to be skipped back to get to the start of the previous encoded character.

Parameters:
str Pointer to the encoded character that follows the character to be skipped back over.
maxRew The number of characters to go back at max. Typically, this is the number of chars from str to the start of the buffer.
Returns:
Number of chars to skip back in the buffer. Returns 0 if maxRew is 0.

Definition at line 852 of file csuctransform.h.

References utf8_char.
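
Rewinding in UTF-8 amounts to stepping back over continuation bytes (bit pattern 10xxxxxx) until a lead byte is found. A minimal standalone sketch of that idea follows. SketchUtf8Rewind is a hypothetical name, and the code does not validate the sequence it steps over.

```cpp
#include <cstddef>

// str points just past the character to rewind over; maxRew is the number
// of bytes available before str. Returns how many bytes to step back,
// or 0 if maxRew is 0.
int SketchUtf8Rewind(const unsigned char* str, std::size_t maxRew)
{
  if (maxRew == 0) return 0;
  int back = 1;
  // A UTF-8 sequence has at most 3 continuation bytes, so stop at 4.
  while (static_cast<std::size_t>(back) < maxRew && back < 4 &&
         (str[-back] & 0xC0) == 0x80)
    ++back;
  return back;
}
```

For a buffer holding "A" followed by the three-byte encoding of U+20AC, rewinding from the end steps back 3 bytes, and rewinding again from that position steps back 1.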

int csUnicodeTransform::UTF8Skip (const utf8_char *str, size_t maxSkip) [inline, static]

Determine how many characters in a UTF-8 buffer need to be skipped to get to the next encoded char.

Parameters:
str Pointer to buffer with encoded character.
maxSkip The number of characters to skip at max. Usually, this is the number of chars from str to the end of the buffer.
Returns:
Number of chars to skip in the buffer. Returns 0 if maxSkip is 0.

Definition at line 811 of file csuctransform.h.

size_t csUnicodeTransform::UTF8to16 (utf16_char *dest, size_t destSize, const utf8_char *source, size_t srcSize=(size_t)-1) [inline, static]

Convert UTF-8 to UTF-16.

Parameters:
dest Destination buffer.
destSize Number of characters the destination buffer can hold.
source Source buffer.
srcSize Number of characters contained in the source buffer. If this is (size_t)-1, the length will be determined automatically.
Returns:
Number of characters in the complete converted string, including null terminator.
Remarks:
If the complete converted string does not fit into the destination buffer, it is truncated but still null-terminated. Hence, if the destination buffer has a size of 1, you get an empty string. The returned value is the number of characters needed for the *whole* converted string.

Definition at line 500 of file csuctransform.h.

size_t csUnicodeTransform::UTF8to32 (utf32_char *dest, size_t destSize, const utf8_char *source, size_t srcSize=(size_t)-1) [inline, static]

Convert UTF-8 to UTF-32.

Parameters:
dest Destination buffer.
destSize Number of characters the destination buffer can hold.
source Source buffer.
srcSize Number of characters contained in the source buffer. If this is (size_t)-1, the length will be determined automatically.
Returns:
Number of characters in the complete converted string, including null terminator.
Remarks:
If the complete converted string does not fit into the destination buffer, it is truncated but still null-terminated. Hence, if the destination buffer has a size of 1, you get an empty string. The returned value is the number of characters needed for the *whole* converted string.

Definition at line 505 of file csuctransform.h.

size_t csUnicodeTransform::UTF8toWC (wchar_t *dest, size_t destSize, const utf8_char *source, size_t srcSize) [inline, static]

Convert UTF-8 to platform-specific wide chars.

Parameters:
dest Destination buffer.
destSize Number of characters the destination buffer can hold.
source Source buffer.
srcSize Number of characters contained in the source buffer. If this is (size_t)-1, the length will be determined automatically.
Returns:
Number of characters in the complete converted string, including null terminator.
Remarks:
If the complete converted string does not fit into the destination buffer, it is truncated but still null-terminated. Hence, if the destination buffer has a size of 1, you get an empty string. The returned value is the number of characters needed for the *whole* converted string.

Definition at line 617 of file csuctransform.h.

References utf16_char.

size_t csUnicodeTransform::WCtoUTF16 (utf16_char *dest, size_t destSize, const wchar_t *source, size_t srcSize) [inline, static]

Convert platform-specific wide chars to UTF-16.

Parameters:
dest Destination buffer.
destSize Number of characters the destination buffer can hold.
source Source buffer.
srcSize Number of characters contained in the source buffer. If this is (size_t)-1, the length will be determined automatically.
Returns:
Number of characters in the complete converted string, including null terminator.
Remarks:
If the complete converted string does not fit into the destination buffer, it is truncated but still null-terminated. Hence, if the destination buffer has a size of 1, you get an empty string. The returned value is the number of characters needed for the *whole* converted string.

Definition at line 670 of file csuctransform.h.

size_t csUnicodeTransform::WCtoUTF32 (utf32_char *dest, size_t destSize, const wchar_t *source, size_t srcSize) [inline, static]

Convert platform-specific wide chars to UTF-32.

Parameters:
dest Destination buffer.
destSize Number of characters the destination buffer can hold.
source Source buffer.
srcSize Number of characters contained in the source buffer. If this is (size_t)-1, the length will be determined automatically.
Returns:
Number of characters in the complete converted string, including null terminator.
Remarks:
If the complete converted string does not fit into the destination buffer, it is truncated but still null-terminated. Hence, if the destination buffer has a size of 1, you get an empty string. The returned value is the number of characters needed for the *whole* converted string.

Definition at line 693 of file csuctransform.h.

References utf16_char.

size_t csUnicodeTransform::WCtoUTF8 (utf8_char *dest, size_t destSize, const wchar_t *source, size_t srcSize) [inline, static]

Convert platform-specific wide chars to UTF-8.

Parameters:
dest Destination buffer.
destSize Number of characters the destination buffer can hold.
source Source buffer.
srcSize Number of characters contained in the source buffer. If this is (size_t)-1, the length will be determined automatically.
Returns:
Number of characters in the complete converted string, including null terminator.
Remarks:
If the complete converted string does not fit into the destination buffer, it is truncated but still null-terminated. Hence, if the destination buffer has a size of 1, you get an empty string. The returned value is the number of characters needed for the *whole* converted string.

Definition at line 660 of file csuctransform.h.

References utf16_char.


The documentation for this class was generated from the file csuctransform.h.
Generated for Crystal Space by doxygen 1.3.9.1