
Transliterator Class Reference

Transliterator is an abstract class that transliterates text from one format to another. More...

#include <translit.h>

Inheritance diagram for Transliterator: UObject, UMemory.

Public Types

typedef Transliterator *(* Factory )(const UnicodeString &ID, Token context)
 A function that creates and returns a Transliterator. More...


Public Methods

virtual ~Transliterator ()
 Destructor. More...

virtual Transliterator * clone () const
 Implements Cloneable. More...

virtual int32_t transliterate (Replaceable &text, int32_t start, int32_t limit) const
 Transliterates a segment of a string, with optional filtering. More...

virtual void transliterate (Replaceable &text) const
 Transliterates an entire string in place. More...

virtual void transliterate (Replaceable &text, UTransPosition &index, const UnicodeString &insertion, UErrorCode &status) const
 Transliterates the portion of the text buffer that can be transliterated unambiguously after new text has been inserted, typically as a result of a keyboard event. More...

virtual void transliterate (Replaceable &text, UTransPosition &index, UChar32 insertion, UErrorCode &status) const
 Transliterates the portion of the text buffer that can be transliterated unambiguously after a new character has been inserted, typically as a result of a keyboard event. More...

virtual void transliterate (Replaceable &text, UTransPosition &index, UErrorCode &status) const
 Transliterates the portion of the text buffer that can be transliterated unambiguously. More...

virtual void finishTransliteration (Replaceable &text, UTransPosition &index) const
 Finishes any pending transliterations that were waiting for more characters. More...

int32_t getMaximumContextLength (void) const
 Returns the length of the longest context required by this transliterator. More...

virtual const UnicodeString & getID (void) const
 Returns a programmatic identifier for this transliterator. More...

const UnicodeFilter * getFilter (void) const
 Returns the filter used by this transliterator, or NULL if this transliterator uses no filter. More...

UnicodeFilter * orphanFilter (void)
 Returns the filter used by this transliterator, or NULL if this transliterator uses no filter. More...

void adoptFilter (UnicodeFilter *adoptedFilter)
 Changes the filter used by this transliterator. More...

Transliterator * createInverse (UErrorCode &status) const
 Returns this transliterator's inverse. More...

virtual UnicodeString & toRules (UnicodeString &result, UBool escapeUnprintable) const
 Create a rule string that can be passed to createFromRules() to recreate this transliterator. More...

UnicodeSet & getSourceSet (UnicodeSet &result) const
 Returns the set of all characters that may be modified in the input text by this Transliterator. More...

virtual void handleGetSourceSet (UnicodeSet &result) const
 Framework method that returns the set of all characters that may be modified in the input text by this Transliterator, ignoring the effect of this object's filter. More...

virtual UnicodeSet & getTargetSet (UnicodeSet &result) const
 Returns the set of all characters that may be generated as replacement text by this transliterator. More...

virtual UClassID getDynamicClassID (void) const=0
 Returns a unique class ID polymorphically. More...


Static Public Methods

Token integerToken (int32_t)
 Return a token containing an integer. More...

Token pointerToken (void *)
 Return a token containing a pointer. More...

UnicodeString & getDisplayName (const UnicodeString &ID, UnicodeString &result)
 Returns a name for this transliterator that is appropriate for display to the user in the default locale. More...

UnicodeString & getDisplayName (const UnicodeString &ID, const Locale &inLocale, UnicodeString &result)
 Returns a name for this transliterator that is appropriate for display to the user in the given locale. More...

Transliterator * createInstance (const UnicodeString &ID, UTransDirection dir, UParseError &parseError, UErrorCode &status)
 Returns a Transliterator object given its ID. More...

Transliterator * createInstance (const UnicodeString &ID, UTransDirection dir, UErrorCode &status)
 Returns a Transliterator object given its ID. More...

Transliterator * createFromRules (const UnicodeString &ID, const UnicodeString &rules, UTransDirection dir, UParseError &parseError, UErrorCode &status)
 Returns a Transliterator object constructed from the given rule string. More...

void registerFactory (const UnicodeString &id, Factory factory, Token context)
 Registers a factory function that creates transliterators of a given ID. More...

void registerInstance (Transliterator *adoptedObj)
 Registers an instance obj of a subclass of Transliterator with the system. More...

void unregister (const UnicodeString &ID)
 Unregisters a transliterator or class. More...

int32_t countAvailableIDs (void)
 Return the number of IDs currently registered with the system. More...

const UnicodeString & getAvailableID (int32_t index)
 Return the index-th available ID. More...

int32_t countAvailableSources (void)
 Return the number of registered source specifiers. More...

UnicodeString & getAvailableSource (int32_t index, UnicodeString &result)
 Return a registered source specifier. More...

int32_t countAvailableTargets (const UnicodeString &source)
 Return the number of registered target specifiers for a given source specifier. More...

UnicodeString & getAvailableTarget (int32_t index, const UnicodeString &source, UnicodeString &result)
 Return a registered target specifier for a given source. More...

int32_t countAvailableVariants (const UnicodeString &source, const UnicodeString &target)
 Return the number of registered variant specifiers for a given source-target pair. More...

UnicodeString & getAvailableVariant (int32_t index, const UnicodeString &source, const UnicodeString &target, UnicodeString &result)
 Return a registered variant specifier for a given source-target pair. More...


Protected Methods

 Transliterator (const UnicodeString &ID, UnicodeFilter *adoptedFilter)
 Default constructor. More...

 Transliterator (const Transliterator &)
 Copy constructor. More...

Transliterator & operator= (const Transliterator &)
 Assignment operator. More...

virtual void handleTransliterate (Replaceable &text, UTransPosition &pos, UBool incremental) const=0
 Abstract method that concrete subclasses define to implement their transliteration algorithm. More...

virtual void filteredTransliterate (Replaceable &text, UTransPosition &index, UBool incremental) const
 Transliterate a substring of text, as specified by index, taking filters into account. More...

void setMaximumContextLength (int32_t maxContextLength)
 Method for subclasses to use to set the maximum context length. More...

void setID (const UnicodeString &id)
 Set the ID of this transliterator. More...


Static Protected Methods

Transliterator * createBasicInstance (const UnicodeString &id, const UnicodeString *canon)
 Create a transliterator from a basic ID. More...

void _registerFactory (const UnicodeString &id, Factory factory, Token context)
void _registerInstance (Transliterator *adoptedObj)
void _registerSpecialInverse (const UnicodeString &target, const UnicodeString &inverseTarget, UBool bidirectional)
 Register two targets as being inverses of one another. More...

int32_t _countAvailableSources (void)
 Non-mutexed internal method. More...

UnicodeString & _getAvailableSource (int32_t index, UnicodeString &result)
 Non-mutexed internal method. More...

int32_t _countAvailableTargets (const UnicodeString &source)
 Non-mutexed internal method. More...

UnicodeString & _getAvailableTarget (int32_t index, const UnicodeString &source, UnicodeString &result)
 Non-mutexed internal method. More...

int32_t _countAvailableVariants (const UnicodeString &source, const UnicodeString &target)
 Non-mutexed internal method. More...

UnicodeString & _getAvailableVariant (int32_t index, const UnicodeString &source, const UnicodeString &target, UnicodeString &result)
 Non-mutexed internal method. More...


Private Methods

void _transliterate (Replaceable &text, UTransPosition &index, const UnicodeString *insertion, UErrorCode &status) const
 This internal method does incremental transliteration. More...

virtual void filteredTransliterate (Replaceable &text, UTransPosition &index, UBool incremental, UBool rollback) const
 Top-level transliteration method, handling filtering, incremental and non-incremental transliteration, and rollback. More...


Static Private Methods

UBool initializeRegistry (void)

Private Attributes

UnicodeString ID
 Programmatic name, e.g., "Latin-Arabic". More...

UnicodeFilter * filter
 This transliterator's filter. More...

int32_t maximumContextLength

Friends

class TransliteratorParser
class TransliteratorIDParser
class CompoundTransliterator
class AnyTransliterator

Detailed Description

Transliterator is an abstract class that transliterates text from one format to another.

The most common kind of transliterator is a script, or alphabet, transliterator. For example, a Russian to Latin transliterator changes Russian text written in Cyrillic characters to phonetically equivalent Latin characters. It does not translate Russian to English! Transliteration, unlike translation, operates on characters, without reference to the meanings of words and sentences.

Although script conversion is its most common use, a transliterator can actually perform a more general class of tasks. In fact, Transliterator defines a very general API which specifies only that a segment of the input text is replaced by new text. The particulars of this conversion are determined entirely by subclasses of Transliterator.

Transliterators are stateless

Transliterator objects are stateless; they retain no information between calls to transliterate(). (However, this does not mean that threads may share transliterators without synchronizing them. Transliterators are not immutable, so they must be synchronized when shared between threads.) This might seem to limit the complexity of the transliteration operation. In practice, subclasses perform complex transliterations by delaying the replacement of text until it is known that no other replacements are possible. In other words, although the Transliterator objects are stateless, the source text itself embodies all the needed information, and delayed operation allows arbitrary complexity.

Batch transliteration

The simplest way to perform transliteration is all at once, on a string of existing text. This is referred to as batch transliteration. For example, given a UnicodeString input (UnicodeString is a Replaceable) and a transliterator t, the call

t.transliterate(input);

will transliterate it in place. Other methods allow the client to specify a substring to be transliterated and to use other Replaceable objects instead of strings, in order to preserve out-of-band information (such as text styles).
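As a rough illustration of batch transliteration over a substring, the sketch below models the shape of transliterate(text, start, limit) with plain std::string. The rule (every 'x' becomes "ks") and the names transliterateSegment and transliterateAll are hypothetical and not part of ICU. That the return value is the limit after replacement is an assumption of this sketch.

```cpp
#include <cassert>
#include <string>

// Toy rule standing in for a real transliterator: every 'x' becomes "ks".
// Edits the segment [start, limit) in place and returns the limit of that
// segment after replacement (assumed behaviour for this sketch).
int transliterateSegment(std::string& text, int start, int limit) {
    int i = start;
    while (i < limit) {
        if (text[i] == 'x') {
            text.replace(i, 1, "ks");  // one character grows to two
            i += 2;                    // skip over the replacement
            limit += 1;                // the segment is now one longer
        } else {
            ++i;
        }
    }
    return limit;
}

// Whole-string convenience, analogous to transliterate(Replaceable&).
void transliterateAll(std::string& text) {
    transliterateSegment(text, 0, static_cast<int>(text.size()));
}
```

Calling transliterateSegment on "taxi box" with start 0 and limit 4 rewrites only the first word and leaves "box" untouched, because the second 'x' lies outside the segment.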

Keyboard transliteration

Somewhat more involved is keyboard, or incremental transliteration. This is the transliteration of text that is arriving from some source (typically the user's keyboard) one character at a time, or in some other piecemeal fashion.

In keyboard transliteration, a Replaceable buffer stores the text. As text is inserted, as much as possible is transliterated on the fly. This means a GUI that displays the contents of the buffer may show text being modified as each new character arrives.

Consider the simple RuleBasedTransliterator:

th>{theta}
t>{tau}

When the user types 't', nothing will happen, since the transliterator is waiting to see if the next character is 'h'. To remedy this, we introduce the notion of a cursor, marked by a '|' in the output string:

t>|{tau}
{tau}h>{theta}

Now when the user types 't', tau appears, and if the next character is 'h', the tau changes to a theta. This is accomplished by maintaining a cursor position (independent of the insertion point, and invisible in the GUI) across calls to transliterate(). Typically, the cursor will be coincident with the insertion point, but in a case like the one above, it will precede the insertion point.

Keyboard transliteration methods maintain a set of three indices that are updated with each call to transliterate(): the cursor, start, and limit. Since these indices are changed by the method, they are passed in a UTransPosition structure. The START index marks the beginning of the substring that the transliterator will look at. It is advanced as text becomes committed (but it is not the committed index; that's the CURSOR). The CURSOR index, described above, marks the point at which the transliterator last stopped, either because it reached the end, or because it required more characters to disambiguate between possible inputs. The CURSOR can also be explicitly set by rules in a RuleBasedTransliterator. Any characters before the CURSOR index are frozen; future keyboard transliteration calls within this input sequence will not change them. New text is inserted at the LIMIT index, which marks the end of the substring that the transliterator looks at.

Because keyboard transliteration assumes that more characters are to arrive, it is conservative in its operation. It only transliterates when it can do so unambiguously. Otherwise it waits for more characters to arrive. When the client code knows that no more characters are forthcoming, perhaps because the user has performed some input termination operation, then it should call finishTransliteration() to complete any pending transliterations.
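The conservative behaviour described above can be sketched with a toy model of the first rule set (th > theta, t > tau). Everything here is hypothetical: Position is a simplified stand-in for UTransPosition with only a cursor and a limit, and process, insertAndTransliterate, and finish are illustrative names rather than ICU functions.

```cpp
#include <cassert>
#include <string>

// Simplified position record for this sketch only.
struct Position {
    int cursor = 0;  // text before this index is frozen
    int limit = 0;   // end of the text under consideration; new text goes here
};

// Applies  th > theta ; t > tau  to text[cursor, limit).
// In incremental mode a trailing 't' is left pending, since the next
// character might still turn out to be 'h'.
void process(std::u32string& text, Position& pos, bool incremental) {
    while (pos.cursor < pos.limit) {
        if (text[pos.cursor] == U't') {
            bool hasNext = pos.cursor + 1 < pos.limit;
            if (hasNext && text[pos.cursor + 1] == U'h') {
                text.replace(pos.cursor, 2, 1, U'\u03B8');  // th > theta
                pos.limit -= 1;
            } else if (!hasNext && incremental) {
                return;  // ambiguous: wait for more input
            } else {
                text[pos.cursor] = U'\u03C4';  // t > tau
            }
        }
        ++pos.cursor;
    }
}

// Roughly analogous to transliterate(text, index, insertion, status).
void insertAndTransliterate(std::u32string& text, Position& pos, char32_t ch) {
    text.insert(pos.limit, 1, ch);
    pos.limit += 1;
    process(text, pos, true);
}

// Roughly analogous to finishTransliteration(): no more input will arrive.
void finish(std::u32string& text, Position& pos) {
    process(text, pos, false);
}
```

Typing 't' leaves the buffer unchanged with the cursor still at 0. Typing 'h' next converts the pair to theta. A trailing 't' is only converted to tau once finish() is called.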

Inverses

Pairs of transliterators may be inverses of one another. For example, if transliterator A transliterates characters by incrementing their Unicode value (so "abc" -> "def"), and transliterator B decrements character values, then A is an inverse of B and vice versa. If we compose A with B in a compound transliterator, the result is the identity transliterator, that is, a transliterator that does not change its input text.

The Transliterator method createInverse() returns a transliterator's inverse, if one exists, or null otherwise. However, the result of createInverse() usually will not be a true mathematical inverse. This is because true inverse transliterators are difficult to formulate. For example, consider two transliterators: AB, which transliterates the character 'A' to 'B', and BA, which transliterates 'B' to 'A'. It might seem that these are exact inverses, since

"A" x AB -> "B"
"B" x BA -> "A"

where 'x' represents transliteration. However,

"ABCD" x AB -> "BBCD"
"BBCD" x BA -> "AACD"

so AB composed with BA is not the identity. Nonetheless, BA may be usefully considered to be AB's inverse, and it is on this basis that AB.createInverse() could legitimately return BA.
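The AB/BA example can be checked with a few lines of standalone code. This is only a sketch using std::string; mapChar, applyAB, and applyBA are hypothetical helpers and do not correspond to ICU classes.

```cpp
#include <cassert>
#include <string>

// Toy transliterator: replaces every occurrence of `from` with `to`.
std::string mapChar(std::string text, char from, char to) {
    for (char& c : text) {
        if (c == from) c = to;
    }
    return text;
}

// AB maps 'A' to 'B'; BA maps 'B' to 'A'.
std::string applyAB(const std::string& s) { return mapChar(s, 'A', 'B'); }
std::string applyBA(const std::string& s) { return mapChar(s, 'B', 'A'); }
```

The round trip succeeds for "A" but not for "ABCD", because AB merges the original 'B' with the converted 'A' and BA cannot tell them apart afterwards.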

IDs and display names

A transliterator is designated by a short identifier string or ID. IDs follow the format source-destination, where source describes the entity being replaced, and destination describes the entity replacing source. The entities may be the names of scripts, particular sequences of characters, or whatever else it is that the transliterator converts to or from. For example, a transliterator from Russian to Latin might be named "Russian-Latin". A transliterator from keyboard escape sequences to Latin-1 characters might be named "KeyboardEscape-Latin1". By convention, system entity names are in English, with the initial letters of words capitalized; user entity names may follow any format so long as they do not contain dashes.

In addition to programmatic IDs, transliterator objects have display names for presentation in user interfaces, returned by getDisplayName.

Factory methods and registration

In general, client code should use the factory method createInstance to obtain an instance of a transliterator given its ID. Valid IDs may be enumerated using countAvailableIDs() and getAvailableID(). Since transliterators are mutable, multiple calls to createInstance with the same ID will return distinct objects.

In addition to the system transliterators registered at startup, user transliterators may be registered by calling registerInstance() at run time. A registered instance acts as a template; future calls to createInstance with the ID of the registered object return clones of that object. Thus any object passed to registerInstance() must implement clone() properly. To register a transliterator subclass without instantiating it (until it is needed), users may call registerFactory(). In this case, the objects are instantiated by invoking the registered factory function.
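The registration scheme can be pictured as a map from ID to a factory function plus a stored context value. The sketch below is a toy model under that assumption. ToyTransliterator, ToyFactory, registerToyFactory, createToyInstance, and shiftFactory are all hypothetical names, and an int stands in for the Token context.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Toy stand-in for a transliterator object.
struct ToyTransliterator {
    std::string id;
    int shift;  // behaviour parameter supplied through the factory context
};

// Same shape as the Factory typedef: receives the ID and the stored context.
using ToyFactory = std::unique_ptr<ToyTransliterator> (*)(const std::string& id, int context);

struct Entry {
    ToyFactory factory;
    int context;
};

std::map<std::string, Entry>& registry() {
    static std::map<std::string, Entry> r;
    return r;
}

void registerToyFactory(const std::string& id, ToyFactory factory, int context) {
    registry()[id] = Entry{factory, context};
}

// Returns a fresh object on every call, or nullptr if the ID is unknown.
std::unique_ptr<ToyTransliterator> createToyInstance(const std::string& id) {
    auto it = registry().find(id);
    if (it == registry().end()) return nullptr;
    return it->second.factory(id, it->second.context);
}

// One factory registered under several IDs, parameterized by its context.
std::unique_ptr<ToyTransliterator> shiftFactory(const std::string& id, int context) {
    return std::unique_ptr<ToyTransliterator>(new ToyTransliterator{id, context});
}
```

Because the object is created only when an ID is requested, two requests for the same ID yield two distinct objects, which matches the statement that createInstance returns distinct objects.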

Subclassing

Subclasses must implement the abstract method handleTransliterate().

Subclasses should override the transliterate() methods taking a Replaceable if the performance of these methods can be improved over the performance obtained by the default implementations in this class.

Author:
Alan Liu
Stable:
ICU 2.0

Definition at line 234 of file translit.h.


Member Typedef Documentation

typedef Transliterator*(* Transliterator::Factory)(const UnicodeString& ID, Token context)
 

A function that creates and returns a Transliterator.

When invoked, it will be passed the ID string that is being instantiated, together with the context pointer that was passed in when the factory function was first registered. Many factory functions will ignore both parameters; however, functions that are registered to more than one ID may use the ID or the context parameter to parameterize the transliterator they create.

Parameters:
ID  the string identifier for this transliterator
context  a context pointer that will be stored and later passed to the factory function when an ID matching the registration ID is being instantiated with this factory.
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.


Constructor & Destructor Documentation

Transliterator::Transliterator (const UnicodeString &ID, UnicodeFilter *adoptedFilter) [protected]
 

Default constructor.

Parameters:
ID  the string identifier for this transliterator
adoptedFilter  the filter. Any character for which filter.contains() returns false will not be altered by this transliterator. If filter is null then no filtering is applied.
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

Transliterator::Transliterator (const Transliterator &) [protected]
 

Copy constructor.

Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

virtual Transliterator::~Transliterator () [virtual]
 

Destructor.

Stable:
ICU 2.0


Member Function Documentation

int32_t Transliterator::_countAvailableSources (void) [static, protected]
 

Non-mutexed internal method.

Internal:
For internal use only.

int32_t Transliterator::_countAvailableTargets (const UnicodeString &source) [static, protected]
 

Non-mutexed internal method.

Internal:
For internal use only.

int32_t Transliterator::_countAvailableVariants (const UnicodeString &source, const UnicodeString &target) [static, protected]
 

Non-mutexed internal method.

Internal:
For internal use only.

UnicodeString& Transliterator::_getAvailableSource (int32_t index, UnicodeString &result) [static, protected]
 

Non-mutexed internal method.

Internal:
For internal use only.

UnicodeString& Transliterator::_getAvailableTarget (int32_t index, const UnicodeString &source, UnicodeString &result) [static, protected]
 

Non-mutexed internal method.

Internal:
For internal use only.

UnicodeString& Transliterator::_getAvailableVariant (int32_t index, const UnicodeString &source, const UnicodeString &target, UnicodeString &result) [static, protected]
 

Non-mutexed internal method.

Internal:
For internal use only.

void Transliterator::_registerFactory (const UnicodeString &id, Factory factory, Token context) [static, protected]
 

Internal:
Parameters:
id  the ID being registered
factory  a function pointer that will be copied and called later when the given ID is passed to createInstance()
context  a context pointer that will be stored and later passed to the factory function when an ID matching the registration ID is being instantiated with this factory.

void Transliterator::_registerInstance (Transliterator *adoptedObj) [static, protected]
 

Internal:
For internal use only.

void Transliterator::_registerSpecialInverse (const UnicodeString &target, const UnicodeString &inverseTarget, UBool bidirectional) [static, protected]
 

Register two targets as being inverses of one another.

For example, calling _registerSpecialInverse("NFC", "NFD", true) causes Transliterator to form the following inverse relationships:

NFC => NFD
 Any-NFC => Any-NFD
 NFD => NFC
 Any-NFD => Any-NFC

(Without the special inverse registration, the inverse of NFC would be NFC-Any.) Note that NFD is shorthand for Any-NFD, but that the presence or absence of "Any-" is preserved.

The relationship is symmetrical; registering (a, b) is equivalent to registering (b, a).

The relevant IDs must still be registered separately as factories or classes.

Only the targets are specified. Special inverses always have the form Any-Target1 <=> Any-Target2. The target should have canonical casing (the casing desired to be produced when an inverse is formed) and should contain no whitespace or other extraneous characters.
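The inverse lookup described above can be approximated as follows. This is a sketch under simplifying assumptions: variants (the /V part of an ID) are ignored, and the names specialInverses, registerSpecialInverseToy, and inverseIdToy are hypothetical. It is not a rendering of ICU's internal tables.

```cpp
#include <cassert>
#include <map>
#include <string>

// Table of special inverses, keyed by target name.
std::map<std::string, std::string>& specialInverses() {
    static std::map<std::string, std::string> m;
    return m;
}

void registerSpecialInverseToy(const std::string& target,
                               const std::string& inverseTarget,
                               bool bidirectional) {
    specialInverses()[target] = inverseTarget;
    if (bidirectional) specialInverses()[inverseTarget] = target;
}

// Computes the ID of the inverse of `id`.
std::string inverseIdToy(const std::string& id) {
    const std::string any = "Any-";
    bool hasAny = id.compare(0, any.size(), any) == 0;
    std::size_t dash = id.find('-');
    // A bare target such as "NFC" is shorthand for "Any-NFC".
    std::string source = (dash == std::string::npos) ? std::string("Any") : id.substr(0, dash);
    std::string target = (dash == std::string::npos) ? id : id.substr(dash + 1);
    if (source == "Any") {
        auto it = specialInverses().find(target);
        if (it != specialInverses().end()) {
            // Preserve the presence or absence of the "Any-" prefix.
            std::string prefix = hasAny ? any : std::string();
            return prefix + it->second;
        }
    }
    return target + "-" + source;  // default: swap the two entities
}
```

Without a registration, the bare ID "NFC" inverts to "NFC-Any". After registering the pair ("NFC", "NFD") bidirectionally, "NFC" inverts to "NFD" and "Any-NFC" inverts to "Any-NFD".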

Parameters:
target  the target against which to register the inverse
inverseTarget  the inverse of target, that is Any-target.getInverse() => Any-inverseTarget
bidirectional  if true, register the reverse relation as well, that is, Any-inverseTarget.getInverse() => Any-target
Internal:
For internal use only.

void Transliterator::_transliterate (Replaceable &text, UTransPosition &index, const UnicodeString *insertion, UErrorCode &status) const [private]
 

This internal method does incremental transliteration.

If the 'insertion' is non-null then we append it to 'text' before proceeding. This method calls through to the pure virtual framework method handleTransliterate() to do the actual work.

Parameters:
text  the buffer holding transliterated and untransliterated text
index  the position indices. See transliterate(Replaceable&, UTransPosition&, const UnicodeString&, UErrorCode&).
insertion  text to be inserted and possibly transliterated into the transliteration buffer at index.limit.
status  Output param to be filled in with a success or an error.

void Transliterator::adoptFilter (UnicodeFilter *adoptedFilter)
 

Changes the filter used by this transliterator.

If the filter is set to null then no filtering will occur.

Callers must take care if a transliterator is in use by multiple threads. The filter should not be changed by one thread while another thread may be transliterating.

Parameters:
adoptedFilter  the new filter to be adopted.
Stable:
ICU 2.0

virtual Transliterator* Transliterator::clone () const [inline, virtual]
 

Implements Cloneable.

All subclasses are encouraged to implement this method if it is possible and reasonable to do so. Subclasses that are to be registered with the system using registerInstance() are required to implement this method. If a subclass does not implement clone() properly and is registered with the system using registerInstance(), then the default clone() implementation will return null, and calls to createInstance() will fail.

Returns:
a copy of the object.
See also:
registerInstance
Stable:
ICU 2.0

Definition at line 368 of file translit.h.

int32_t Transliterator::countAvailableIDs (void) [static]
 

Return the number of IDs currently registered with the system.

To retrieve the actual IDs, call getAvailableID(i) with i from 0 to countAvailableIDs() - 1.
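The count-then-index enumeration pattern looks roughly like this. The sketch uses a hypothetical in-memory list in place of the real registry; the IDs shown and the function names countToyIDs, getToyID, and listToyIDs are illustrative only. The fallback to index 0 follows the behaviour documented for getAvailableID().

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical list standing in for the system registry.
const std::vector<std::string>& toyIDs() {
    static const std::vector<std::string> ids = {"Latin-Greek", "Greek-Latin", "Any-NFC"};
    return ids;
}

int countToyIDs() {
    return static_cast<int>(toyIDs().size());
}

// An out-of-range index falls back to entry 0.
const std::string& getToyID(int index) {
    if (index < 0 || index >= countToyIDs()) index = 0;
    return toyIDs()[index];
}

// Collects every ID by looping from 0 to countToyIDs() - 1.
std::vector<std::string> listToyIDs() {
    std::vector<std::string> out;
    for (int i = 0; i < countToyIDs(); ++i) {
        out.push_back(getToyID(i));
    }
    return out;
}
```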

Returns:
the number of IDs currently registered with the system.
Stable:
ICU 2.0

int32_t Transliterator::countAvailableSources (void) [static]
 

Return the number of registered source specifiers.

Returns:
the number of registered source specifiers.
Stable:
ICU 2.0

int32_t Transliterator::countAvailableTargets (const UnicodeString &source) [static]
 

Return the number of registered target specifiers for a given source specifier.

Parameters:
source  the given source specifier.
Returns:
the number of registered target specifiers for a given source specifier.
Stable:
ICU 2.0

int32_t Transliterator::countAvailableVariants (const UnicodeString &source, const UnicodeString &target) [static]
 

Return the number of registered variant specifiers for a given source-target pair.

Parameters:
source  the source specifiers.
target  the target specifiers.
Stable:
ICU 2.0

Transliterator* Transliterator::createBasicInstance (const UnicodeString &id, const UnicodeString *canon) [static, protected]
 

Create a transliterator from a basic ID.

This is an ID containing only the forward direction source, target, and variant.

Parameters:
id  a basic ID of the form S-T or S-T/V.
canon  canonical ID to assign to the object, or NULL to leave the ID unchanged
Returns:
a newly created Transliterator or null if the ID is invalid.
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

Transliterator* Transliterator::createFromRules (const UnicodeString &ID, const UnicodeString &rules, UTransDirection dir, UParseError &parseError, UErrorCode &status) [static]
 

Returns a Transliterator object constructed from the given rule string.

This will be a RuleBasedTransliterator, if the rule string contains only rules, or a CompoundTransliterator, if it contains ID blocks, or a NullTransliterator, if it contains ID blocks which parse as empty for the given direction.

Parameters:
ID  the id for the transliterator.
rules  rules, separated by ';'
dir  either FORWARD or REVERSE.
parseError  Struct to receive information on position of error if an error is encountered
status  Output param set to success/failure code.
Stable:
ICU 2.0

Transliterator* Transliterator::createInstance (const UnicodeString &ID, UTransDirection dir, UErrorCode &status) [static]
 

Returns a Transliterator object given its ID.

The ID must be either a system transliterator ID or an ID registered using registerInstance().

Parameters:
ID  a valid ID, as enumerated by countAvailableIDs() and getAvailableID()
dir  either FORWARD or REVERSE.
status  Output param to be filled in with a success or an error.
Returns:
A Transliterator object with the given ID
Stable:
ICU 2.0

Transliterator* Transliterator::createInstance (const UnicodeString &ID, UTransDirection dir, UParseError &parseError, UErrorCode &status) [static]
 

Returns a Transliterator object given its ID.

The ID must be either a system transliterator ID or an ID registered using registerInstance().

Parameters:
ID  a valid ID, as enumerated by countAvailableIDs() and getAvailableID()
dir  either FORWARD or REVERSE.
parseError  Struct to receive information on position of error if an error is encountered
status  Output param to be filled in with a success or an error.
Returns:
A Transliterator object with the given ID
See also:
registerInstance , getAvailableIDs , getID
Stable:
ICU 2.0

Transliterator* Transliterator::createInverse (UErrorCode &status) const
 

Returns this transliterator's inverse.

See the class documentation for details. This implementation simply inverts the two entities in the ID and attempts to retrieve the resulting transliterator. That is, if getID() returns "A-B", then this method will return the result of createInstance("B-A"), or null if that call fails.

Subclasses with knowledge of their inverse may wish to override this method.

Parameters:
status  Output param to be filled in with a success or an error.
Returns:
a transliterator that is an inverse, not necessarily exact, of this transliterator, or null if no such transliterator is registered.
See also:
registerInstance
Stable:
ICU 2.0

virtual void Transliterator::filteredTransliterate (Replaceable &text, UTransPosition &index, UBool incremental, UBool rollback) const [private, virtual]
 

Top-level transliteration method, handling filtering, incremental and non-incremental transliteration, and rollback.

All transliteration public API methods eventually call this method with a rollback argument of TRUE. Other entities may call this method but rollback should be FALSE.

If this transliterator has a filter, break up the input text into runs of unfiltered characters. Pass each run to <subclass>.handleTransliterate().

In incremental mode, if rollback is TRUE, perform a special incremental procedure in which several passes are made over the input text, adding one character at a time, and committing successful transliterations as they occur. Unsuccessful transliterations are rolled back and retried with additional characters to give correct results.
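The run-splitting step for filtered transliteration can be sketched as below. This is a simplified, non-incremental model that builds a new string instead of editing a Replaceable in place. filteredTransliterateToy, isVowel, and upperRun are hypothetical names, and the handle callback merely plays the role of a subclass's handleTransliterate().

```cpp
#include <cassert>
#include <cctype>
#include <functional>
#include <string>

// Passes each maximal run of filter-accepted characters to `handle`;
// characters rejected by the filter are copied through unchanged.
std::string filteredTransliterateToy(
        const std::string& text,
        const std::function<bool(char)>& filter,
        const std::function<std::string(const std::string&)>& handle) {
    std::string out;
    std::size_t i = 0;
    while (i < text.size()) {
        if (!filter(text[i])) {
            out += text[i++];
            continue;
        }
        std::size_t runStart = i;
        while (i < text.size() && filter(text[i])) ++i;
        out += handle(text.substr(runStart, i - runStart));
    }
    return out;
}

// Example filter: accept only lower-case ASCII vowels.
bool isVowel(char c) {
    return std::string("aeiou").find(c) != std::string::npos;
}

// Example handler: upper-case the whole run.
std::string upperRun(const std::string& run) {
    std::string r = run;
    for (char& c : r) {
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    return r;
}
```

With the vowel filter, "queue" is split into the rejected character 'q' and the single run "ueue", so the handler is invoked once.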

Parameters:
text  the text to be transliterated
index  the position indices
incremental  if TRUE, then assume more characters may be inserted at index.limit, and postpone processing to accommodate future incoming characters
rollback  if TRUE and if incremental is TRUE, then perform special incremental processing, as described above, and undo partial transliterations where necessary. If incremental is FALSE then this parameter is ignored.

virtual void Transliterator::filteredTransliterate (Replaceable &text, UTransPosition &index, UBool incremental) const [protected, virtual]
 

Transliterate a substring of text, as specified by index, taking filters into account.

This method is for subclasses that need to delegate to another transliterator, such as CompoundTransliterator.

Parameters:
text  the text to be transliterated
index  the position indices
incremental  if TRUE, then assume more characters may be inserted at index.limit, and postpone processing to accommodate future incoming characters
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

virtual void Transliterator::finishTransliteration (Replaceable &text, UTransPosition &index) const [virtual]
 

Finishes any pending transliterations that were waiting for more characters.

Clients should call this method as the last call after a sequence of one or more calls to transliterate().

Parameters:
text  the buffer holding transliterated and untransliterated text.
index  the position indices previously passed to transliterate()
Stable:
ICU 2.0

const UnicodeString& Transliterator::getAvailableID (int32_t index) [static]
 

Return the index-th available ID.

index must be between 0 and countAvailableIDs() - 1, inclusive. If index is out of range, the result of getAvailableID(0) is returned.

Parameters:
index  the given ID index.
Returns:
the index-th available ID. index must be between 0 and countAvailableIDs() - 1, inclusive. If index is out of range, the result of getAvailableID(0) is returned.
Stable:
ICU 2.0

UnicodeString& Transliterator::getAvailableSource (int32_t index, UnicodeString &result) [static]
 

Return a registered source specifier.

Parameters:
index  which specifier to return, from 0 to n-1, where n = countAvailableSources()
result  fill-in parameter to receive the source specifier. If index is out of range, result will be empty.
Returns:
reference to result
Stable:
ICU 2.0

UnicodeString& Transliterator::getAvailableTarget (int32_t index, const UnicodeString &source, UnicodeString &result) [static]
 

Return a registered target specifier for a given source.

Parameters:
index  which specifier to return, from 0 to n-1, where n = countAvailableTargets(source)
source  the source specifier
result  fill-in parameter to receive the target specifier. If source is invalid or if index is out of range, result will be empty.
Returns:
reference to result
Stable:
ICU 2.0

UnicodeString& Transliterator::getAvailableVariant (int32_t index, const UnicodeString &source, const UnicodeString &target, UnicodeString &result) [static]
 

Return a registered variant specifier for a given source-target pair.

Parameters:
index  which specifier to return, from 0 to n-1, where n = countAvailableVariants(source, target)
source  the source specifier
target  the target specifier
result  fill-in parameter to receive the variant specifier. If source is invalid, target is invalid, or index is out of range, result will be empty.
Returns:
reference to result
Stable:
ICU 2.0

UnicodeString& Transliterator::getDisplayName (const UnicodeString &ID, const Locale &inLocale, UnicodeString &result) [static]
 

Returns a name for this transliterator that is appropriate for display to the user in the given locale.

This name is taken from the locale resource data in the standard manner of the java.text package.

If no localized names exist in the system resource bundles, a name is synthesized using a localized MessageFormat pattern from the resource data. The arguments to this pattern are an integer followed by one or two strings. The integer is the number of strings, either 1 or 2. The strings are formed by splitting the ID for this transliterator at the first '-'. If there is no '-', then the entire ID forms the only string.

Parameters:
ID  the string identifier for this transliterator
inLocale  the Locale in which the display name should be localized.
result  Output param to receive the display name
Returns:
A reference to 'result'.
Stable:
ICU 2.0
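
The ID-splitting step described for the synthesized display name can be illustrated with a short sketch. It uses std::string instead of UnicodeString so that it compiles without ICU, and splitTransliteratorId is a hypothetical helper name, not part of the Transliterator API. The size of the returned vector corresponds to the integer argument (1 or 2) mentioned in the description.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not an ICU function): splits a transliterator ID at
// the first '-' and returns one or two parts, mirroring the string arguments
// that the description says are passed to the MessageFormat pattern.
std::vector<std::string> splitTransliteratorId(const std::string& id) {
    std::vector<std::string> parts;
    const std::string::size_type dash = id.find('-');
    if (dash == std::string::npos) {
        parts.push_back(id);                   // no '-': the whole ID is the only string
    } else {
        parts.push_back(id.substr(0, dash));   // text before the first '-'
        parts.push_back(id.substr(dash + 1));  // everything after it, later '-' included
    }
    return parts;
}
```

For an ID such as "Latin-Arabic" the sketch yields two parts, "Latin" and "Arabic"; an ID without a '-' yields a single part.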

UnicodeString& Transliterator::getDisplayName (const UnicodeString &ID, UnicodeString &result) [static]
 

Returns a name for this transliterator that is appropriate for display to the user in the default locale.

See getDisplayName for details.

Parameters:
ID  the string identifier for this transliterator
result  Output param to receive the display name
Returns:
A reference to 'result'.
Stable:
ICU 2.0

virtual UClassID Transliterator::getDynamicClassID (void) const [pure virtual]
 

Returns a unique class ID polymorphically.

This method is to implement a simple version of RTTI, since not all C++ compilers support genuine RTTI. Polymorphic operator==() and clone() methods call this method.

Concrete subclasses of Transliterator that wish clients to be able to identify them should implement getDynamicClassID() and also a static method and data member:
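
A minimal sketch of the pattern, under stated assumptions: UClassID is redefined locally as a stand-in so the example compiles without ICU headers, and the classes Base, Derived and Plain are hypothetical. The base class here supplies a default getDynamicClassID() purely to illustrate the fallback behaviour.

```cpp
// Stand-in for ICU's UClassID so the sketch compiles without ICU headers.
typedef void* UClassID;

// Hypothetical base class playing the role of Transliterator.
class Base {
public:
    virtual ~Base() {}
    static UClassID getStaticClassID() { return static_cast<UClassID>(&fgClassID); }
    virtual UClassID getDynamicClassID() const { return getStaticClassID(); }
private:
    static char fgClassID;   // only its address matters: it is the class ID
};
char Base::fgClassID = 0;

// Hypothetical concrete subclass that wants clients to be able to identify it:
// it adds the static method, the static data member, and the virtual override.
class Derived : public Base {
public:
    static UClassID getStaticClassID() { return static_cast<UClassID>(&fgClassID); }
    virtual UClassID getDynamicClassID() const { return Derived::getStaticClassID(); }
private:
    static char fgClassID;
};
char Derived::fgClassID = 0;

// Subclass that does not implement the method: it reports the base class ID.
class Plain : public Base {};
```

A client holding a Base pointer can then compare getDynamicClassID() against Derived::getStaticClassID() to check the concrete type.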

Subclasses that do not implement this method will have a dynamic class ID of Transliterator::getStaticClassID().

Returns:
The class ID for this object. All objects of a given class have the same class ID. Objects of other classes have different class IDs.
Stable:
ICU 2.0

Reimplemented from UObject.

const UnicodeFilter* Transliterator::getFilter (void) const
 

Returns the filter used by this transliterator, or NULL if this transliterator uses no filter.

Returns:
the filter used by this transliterator, or NULL if this transliterator uses no filter.
Stable:
ICU 2.0

virtual const UnicodeString& Transliterator::getID (void) const [virtual]
 

Returns a programmatic identifier for this transliterator.

If this identifier is passed to createInstance(), it will return this object, if it has been registered.

Returns:
a programmatic identifier for this transliterator.
See also:
registerInstance, registerClass, getAvailableIDs
Stable:
ICU 2.0

int32_t Transliterator::getMaximumContextLength (void) const [inline]
 

Returns the length of the longest context required by this transliterator.

This is preceding context. The default implementation supplied by Transliterator returns zero; subclasses that use preceding context should override this method to return the correct value. For example, if a transliterator translates "ddd" (where d is any digit) to "555" when preceded by "(ddd)", then the preceding context length is 5, the length of "(ddd)".

Returns:
The maximum number of preceding context characters this transliterator needs to examine
Stable:
ICU 2.0

Definition at line 1208 of file translit.h.

UnicodeSet& Transliterator::getSourceSet (UnicodeSet &result) const
 

Returns the set of all characters that may be modified in the input text by this Transliterator.

This incorporates this object's current filter; if the filter is changed, the return value of this function will change. The default implementation returns an empty set. Some subclasses may override handleGetSourceSet to return a more precise result. The return result is approximate in any case and is intended for use by tests, tools, or utilities.

Parameters:
result  receives result set; previous contents lost
Returns:
a reference to result
See also:
getTargetSet, handleGetSourceSet
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

virtual UnicodeSet& Transliterator::getTargetSet (UnicodeSet &result) const [virtual]
 

Returns the set of all characters that may be generated as replacement text by this transliterator.

The default implementation returns the empty set. Some subclasses may override this method to return a more precise result. The return result is approximate in any case and is intended for use by tests, tools, or utilities requiring such meta-information.

Parameters:
result  receives result set; previous contents lost
Returns:
a reference to result
See also:
getTargetSet
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

virtual void Transliterator::handleGetSourceSet (UnicodeSet &result) const [virtual]
 

Framework method that returns the set of all characters that may be modified in the input text by this Transliterator, ignoring the effect of this object's filter.

The base class implementation returns the empty set. Subclasses that wish to implement this should override this method.

Returns:
the set of characters that this transliterator may modify. The set may be modified, so subclasses should return a newly-created object.
Parameters:
result  receives result set; previous contents lost
See also:
getSourceSet, getTargetSet
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

virtual void Transliterator::handleTransliterate (Replaceable &text, UTransPosition &pos, UBool incremental) const [protected, pure virtual]
 

Abstract method that concrete subclasses define to implement their transliteration algorithm.

This method handles both incremental and non-incremental transliteration. Let originalStart refer to the value of pos.start upon entry.

  • If incremental is false, then this method should transliterate all characters between pos.start and pos.limit. Upon return pos.start must == pos.limit.

  • If incremental is true, then this method should transliterate all characters between pos.start and pos.limit that can be unambiguously transliterated, regardless of future insertions of text at pos.limit. Upon return, pos.start should be in the range [originalStart, pos.limit). pos.start should be positioned such that characters [originalStart, pos.start) will not be changed in the future by this transliterator and characters [pos.start, pos.limit) are unchanged.

Implementations of this method should also obey the following invariants:

  • pos.limit and pos.contextLimit should be updated to reflect changes in length of the text between pos.start and pos.limit. The difference pos.contextLimit - pos.limit should not change.

  • pos.contextStart should not change.

  • Upon return, neither pos.start nor pos.limit should be less than originalStart.

  • Text before originalStart and text after pos.limit should not change.

  • Text before pos.contextStart and text after pos.contextLimit should be ignored.

Subclasses may safely assume that all characters in [pos.start, pos.limit) are filtered. In other words, the filter has already been applied by the time this method is called. See filteredTransliterate().

This method is not for public consumption. Calling this method directly will transliterate [pos.start, pos.limit) without applying the filter. End user code should call transliterate() instead of this method. Subclass code should call filteredTransliterate() instead of this method.

Parameters:
text  the buffer holding transliterated and untransliterated text
pos  the indices indicating the start, limit, context start, and context limit of the text.
incremental  if true, assume more text may be inserted at pos.limit and act accordingly. Otherwise, transliterate all text between pos.start and pos.limit and move pos.start up to pos.limit.
See also:
transliterate
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.
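
The handleTransliterate() contract above can be modelled with a small stand-alone sketch. Everything in it is an assumption made for illustration: std::string stands in for Replaceable, the Position struct only borrows the four field names of UTransPosition, and the two toy rules ("ch" becomes "X", any other "c" becomes "k") are invented. It is not how any ICU subclass is implemented. The point is the index bookkeeping: a trailing "c" is left pending in incremental mode, and limit and contextLimit move together when the text changes length.

```cpp
#include <string>

// Field names mirror ICU's UTransPosition; the struct itself is a stand-in.
struct Position {
    int contextStart;
    int contextLimit;
    int start;
    int limit;
};

// Toy rules (an assumption for illustration): "ch" -> "X", any other "c" -> "k".
void handleTransliterateSketch(std::string& text, Position& pos, bool incremental) {
    while (pos.start < pos.limit) {
        if (text[pos.start] != 'c') {
            ++pos.start;                       // no rule matches: step over it
            continue;
        }
        const bool hasNext = pos.start + 1 < pos.limit;
        if (!hasNext && incremental) {
            break;                             // an 'h' may still arrive: leave "c" pending
        }
        if (hasNext && text[pos.start + 1] == 'h') {
            text.replace(pos.start, 2, "X");   // two characters become one
            --pos.limit;                       // limit and contextLimit shrink together,
            --pos.contextLimit;                // so contextLimit - limit is unchanged
        } else {
            text[pos.start] = 'k';             // same length: no index adjustment
        }
        ++pos.start;                           // move past the replacement
    }
}
```

Calling the sketch with incremental set to true on "chic" rewrites the leading "ch" and stops at the final "c"; a second call with incremental set to false resolves that "c" and leaves pos.start equal to pos.limit.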

UBool Transliterator::initializeRegistry (void) [static, private]
 

Transliterator::Token Transliterator::integerToken (int32_t i) [inline, static]
 

Return a token containing an integer.

Returns:
a token containing an integer.
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

Definition at line 1218 of file translit.h.

Transliterator& Transliterator::operator= (const Transliterator &) [protected]
 

Assignment operator.

Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

UnicodeFilter* Transliterator::orphanFilter (void)
 

Returns the filter used by this transliterator, or NULL if this transliterator uses no filter.

The caller must eventually delete the result. After this call, this transliterator's filter is set to NULL.

Returns:
the filter used by this transliterator, or NULL if this transliterator uses no filter.
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

Transliterator::Token Transliterator::pointerToken (void *p) [inline, static]
 

Return a token containing a pointer.

Returns:
a token containing a pointer.
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

Definition at line 1224 of file translit.h.

void Transliterator::registerFactory (const UnicodeString &id, Factory factory, Token context) [static]
 

Registers a factory function that creates transliterators of a given ID.

Parameters:
id  the ID being registered
factory  a function pointer that will be copied and called later when the given ID is passed to createInstance()
context  a context pointer that will be stored and later passed to the factory function when an ID matching the registration ID is being instantiated with this factory.
Stable:
ICU 2.0

void Transliterator::registerInstance (Transliterator *adoptedObj) [static]
 

Registers an instance obj of a subclass of Transliterator with the system.

When createInstance() is called with an ID string that is equal to obj->getID(), then obj->clone() is returned.

After this call the Transliterator class owns the adoptedObj and will delete it.

Parameters:
adoptedObj  an instance of subclass of Transliterator that defines clone()
See also:
createInstance, registerClass, unregister
Stable:
ICU 2.0

void Transliterator::setID (const UnicodeString &id) [inline, protected]
 

Sets the ID of this transliterator.

Subclasses shouldn't do this, unless the underlying script behavior has changed.

Parameters:
id  the new ID to be set.
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

Definition at line 1212 of file translit.h.

void Transliterator::setMaximumContextLength (int32_t maxContextLength) [protected]
 

Method for subclasses to use to set the maximum context length.

Parameters:
maxContextLength  the new value to be set.
See also:
getMaximumContextLength
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

virtual UnicodeString& Transliterator::toRules (UnicodeString &result, UBool escapeUnprintable) const [virtual]
 

Create a rule string that can be passed to createFromRules() to recreate this transliterator.

Parameters:
result  the string to receive the rules. Previous contents will be deleted.
escapeUnprintable  if TRUE then convert unprintable characters to their hex escape representations, \uxxxx or \Uxxxxxxxx. Unprintable characters are those other than U+000A, U+0020..U+007E.
Stable:
ICU 2.0
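
The escaping rule described for escapeUnprintable can be sketched as a stand-alone function. This is only an illustration of the rule as worded above, not the code toRules() uses: escapeUnprintableSketch is a hypothetical name, std::u32string stands in for UnicodeString, and the choice of uppercase hex digits is an assumption.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: escapes code points following the escapeUnprintable
// description. "Printable" here means U+000A or U+0020..U+007E.
std::string escapeUnprintableSketch(const std::u32string& input) {
    std::string out;
    for (char32_t c : input) {
        const bool printable = (c == 0x000A) || (c >= 0x0020 && c <= 0x007E);
        char buf[16];
        if (printable) {
            out += static_cast<char>(c);       // passed through unchanged
        } else if (c <= 0xFFFF) {
            // BMP code point: four hex digits after \u
            std::snprintf(buf, sizeof buf, "\\u%04X", static_cast<unsigned>(c));
            out += buf;
        } else {
            // supplementary code point: eight hex digits after \U
            std::snprintf(buf, sizeof buf, "\\U%08X", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}
```

Under this rule a tab (U+0009) is escaped while a line feed (U+000A) is not, since U+000A is the only control character on the printable list.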

virtual void Transliterator::transliterate (Replaceable &text, UTransPosition &index, UErrorCode &status) const [virtual]
 

Transliterates the portion of the text buffer that can be transliterated unambiguously.

This is a convenience method; see transliterate(Replaceable, UTransPosition, UnicodeString, UErrorCode) for details.

Parameters:
text  the buffer holding transliterated and untransliterated text
index  the position indices. See transliterate(Replaceable, UTransPosition, UnicodeString, UErrorCode).
status  Output parameter to be filled in with a success or an error code.
See also:
transliterate(Replaceable, UTransPosition, UnicodeString, UErrorCode)
Stable:
ICU 2.0

virtual void Transliterator::transliterate (Replaceable &text, UTransPosition &index, UChar32 insertion, UErrorCode &status) const [virtual]
 

Transliterates the portion of the text buffer that can be transliterated unambiguously after a new character has been inserted, typically as a result of a keyboard event.

This is a convenience method; see transliterate(Replaceable, UTransPosition, UnicodeString, UErrorCode) for details.

Parameters:
text  the buffer holding transliterated and untransliterated text
index  the position indices. See transliterate(Replaceable, UTransPosition, UnicodeString, UErrorCode).
insertion  the character to be inserted into the buffer at index.limit and possibly transliterated.
status  Output parameter to be filled in with a success or an error code.
See also:
transliterate(Replaceable, UTransPosition, UnicodeString, UErrorCode)
Stable:
ICU 2.0

virtual void Transliterator::transliterate (Replaceable &text, UTransPosition &index, const UnicodeString &insertion, UErrorCode &status) const [virtual]
 

Transliterates the portion of the text buffer that can be transliterated unambiguously after new text has been inserted, typically as a result of a keyboard event.

The new text in insertion will be inserted into text at index.limit, advancing index.limit by insertion.length(). Then the transliterator will try to transliterate characters of text between index.cursor and index.limit. Characters before index.cursor will not be changed.

Upon return, values in index will be updated. index.start will be advanced to the first character that future calls to this method will read. index.cursor and index.limit will be adjusted to delimit the range of text that future calls to this method may change.

Typical usage of this method begins with an initial call with index.start and index.limit set to indicate the portion of text to be transliterated, and index.cursor == index.start. Thereafter, index can be used without modification in future calls, provided that all changes to text are made via this method.

This method assumes that future calls may be made that will insert new text into the buffer. As a result, it only performs unambiguous transliterations. After the last call to this method, there may be untransliterated text that is waiting for more input to resolve an ambiguity. In order to perform these pending transliterations, clients should call finishTransliteration after the last call to this method has been made.

Parameters:
text  the buffer holding transliterated and untransliterated text
index  an array of three integers.
  • index.start: the beginning index, inclusive; 0 <= index.start <= index.limit.

  • index.limit: the ending index, exclusive; index.start <= index.limit <= text.length(). insertion is inserted at index.limit.

  • index.cursor: the next character to be considered for transliteration; index.start <= index.cursor <= index.limit. Characters before index.cursor will not be changed by future calls to this method.
insertion  text to be inserted and possibly transliterated into the translation buffer at index.limit. If null then no text is inserted.
status  Output parameter to be filled in with a success or an error code.
See also:
handleTransliterate
Exceptions:
IllegalArgumentException  if index is invalid
See also:
UTransPosition
Stable:
ICU 2.0
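
The calling sequence described for incremental transliteration (insert at the limit, transliterate what is unambiguous, then finish) can be sketched without ICU as follows. All names are hypothetical: Position borrows the four field names of UTransPosition rather than the start/cursor/limit wording used above, std::string stands in for Replaceable, and StepFn is a callback playing the role of the subclass's transliteration logic.

```cpp
#include <string>

// Stand-in for ICU's UTransPosition, with the same four field names.
struct Position {
    int contextStart;
    int contextLimit;
    int start;
    int limit;
};

// Hypothetical callback playing the role of handleTransliterate().
typedef void (*StepFn)(std::string& text, Position& pos, bool incremental);

// Models one keyboard event: insert at pos.limit, then run an incremental pass.
void insertAndTransliterate(std::string& text, Position& pos,
                            const std::string& insertion, StepFn step) {
    const int added = static_cast<int>(insertion.size());
    text.insert(static_cast<std::string::size_type>(pos.limit), insertion);
    pos.limit += added;          // limit advances by the length of the insertion
    pos.contextLimit += added;   // trailing context shifts by the same amount
    step(text, pos, true);       // only unambiguous transliterations are performed
}

// Models finishTransliteration(): one final, non-incremental pass that
// resolves whatever the incremental calls left pending.
void finishSketch(std::string& text, Position& pos, StepFn step) {
    step(text, pos, false);
}
```

A client would call insertAndTransliterate() once per keystroke with the same Position object, and finishSketch() exactly once after the last keystroke.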

virtual void Transliterator::transliterate (Replaceable &text) const [virtual]
 

Transliterates an entire string in place.

Convenience method.

Parameters:
text  the string to be transliterated
Stable:
ICU 2.0

virtual int32_t Transliterator::transliterate (Replaceable &text, int32_t start, int32_t limit) const [virtual]
 

Transliterates a segment of a string, with optional filtering.

Parameters:
text  the string to be transliterated
start  the beginning index, inclusive; 0 <= start <= limit.
limit  the ending index, exclusive; start <= limit <= text.length().
Returns:
The new limit index. The text previously occupying [start, limit) has been transliterated, possibly to a string of a different length, at [start, new-limit), where new-limit is the return value. If the input offsets are out of bounds, the returned value is -1 and the input string remains unchanged.
Stable:
ICU 2.0
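
The return-value convention of the segment form (new limit on success, -1 with the text untouched when the offsets are out of bounds) can be shown with a toy stand-in. The function name and its single rule, which rewrites '&' as "and", are assumptions chosen only so that the segment changes length; std::string stands in for Replaceable.

```cpp
#include <string>

// Hypothetical stand-in for transliterate(text, start, limit): rewrites
// every '&' in [start, limit) as "and" and returns the new limit.
int transliterateSegmentSketch(std::string& text, int start, int limit) {
    if (start < 0 || limit < start || limit > static_cast<int>(text.size())) {
        return -1;                        // offsets out of bounds: text is not modified
    }
    int i = start;
    while (i < limit) {
        if (text[i] == '&') {
            text.replace(i, 1, "and");    // one character becomes three
            i += 3;                       // skip the replacement text
            limit += 2;                   // the segment grew by two characters
        } else {
            ++i;
        }
    }
    return limit;                         // end of the transliterated segment
}
```

Text after the original limit is never examined, so a '&' beyond the segment is left as it was.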

void Transliterator::unregister (const UnicodeString &ID) [static]
 

Unregisters a transliterator or class.

This may be either a system transliterator or a user transliterator or class. Any attempt to construct an unregistered transliterator based on its ID will fail.

Parameters:
ID  the ID of the transliterator or class
Returns:
the Object that was registered with ID, or null if none was
See also:
registerInstance, registerClass
Stable:
ICU 2.0


Friends And Related Function Documentation

friend class AnyTransliterator [friend]
 

Definition at line 637 of file translit.h.

friend class CompoundTransliterator [friend]
 

Definition at line 636 of file translit.h.

friend class TransliteratorIDParser [friend]
 

Definition at line 344 of file translit.h.

friend class TransliteratorParser [friend]
 

Definition at line 343 of file translit.h.


Member Data Documentation

UnicodeString Transliterator::ID [private]
 

Programmatic name, e.g., "Latin-Arabic".

Definition at line 241 of file translit.h.

UnicodeFilter* Transliterator::filter [private]
 

This transliterator's filter.

Any character for which filter.contains() returns false will not be altered by this transliterator. If filter is null then no filtering is applied.

Definition at line 249 of file translit.h.

int32_t Transliterator::maximumContextLength [private]
 

Definition at line 251 of file translit.h.


The documentation for this class was generated from the following file:
Generated on Mon Nov 24 14:36:58 2003 for ICU 2.8 by doxygen1.2.11.1 written by Dimitri van Heesch, © 1997-2001