
CanonicalIterator Class Reference

This class allows one to iterate through all the strings that are canonically equivalent to a given string.

#include <caniter.h>

Inheritance diagram for CanonicalIterator: CanonicalIterator inherits from UObject, which inherits from UMemory.

Public Methods

 CanonicalIterator (const UnicodeString &source, UErrorCode &status)
 Construct a CanonicalIterator object.

 ~CanonicalIterator ()
 Destructor. Cleans pieces.

UnicodeString getSource ()
 Gets the NFD form of the current source we are iterating over.

void reset ()
 Resets the iterator so that one can start again from the beginning.

UnicodeString next ()
 Get the next canonically equivalent string.

void setSource (const UnicodeString &newSource, UErrorCode &status)
 Set a new source for this iterator.

virtual UClassID getDynamicClassID () const
 ICU "poor man's RTTI", returns a UClassID for the actual class.


Static Public Methods

void permute (UnicodeString &source, UBool skipZeros, Hashtable *result, UErrorCode &status)
 Dumb recursive implementation of permutation.

UClassID getStaticClassID ()
 ICU "poor man's RTTI", returns a UClassID for this class.


Private Methods

 CanonicalIterator ()
 CanonicalIterator (const CanonicalIterator &other)
 Copy constructor.

CanonicalIterator & operator= (const CanonicalIterator &other)
 Assignment operator.

UnicodeString * getEquivalents (const UnicodeString &segment, int32_t &result_len, UErrorCode &status)
Hashtable * getEquivalents2 (const UChar *segment, int32_t segLen, UErrorCode &status)
Hashtable * extract (UChar32 comp, const UChar *segment, int32_t segLen, int32_t segmentPos, UErrorCode &status)
 See if the decomposition of comp is at segment starting at segmentPos (with canonical rearrangement). If so, take the remainder and return the equivalents.

void cleanPieces ()

Private Attributes

UnicodeString source
UBool done
UnicodeString ** pieces
int32_t pieces_length
int32_t * pieces_lengths
int32_t * current
int32_t current_length
UnicodeString buffer

Detailed Description

This class allows one to iterate through all the strings that are canonically equivalent to a given string.

For example, here are some sample results:

Results for: {LATIN CAPITAL LETTER A WITH RING ABOVE}{LATIN SMALL LETTER D}{COMBINING DOT ABOVE}{COMBINING CEDILLA}

1: \u0041\u030A\u0064\u0307\u0327 = {LATIN CAPITAL LETTER A}{COMBINING RING ABOVE}{LATIN SMALL LETTER D}{COMBINING DOT ABOVE}{COMBINING CEDILLA}
2: \u0041\u030A\u0064\u0327\u0307 = {LATIN CAPITAL LETTER A}{COMBINING RING ABOVE}{LATIN SMALL LETTER D}{COMBINING CEDILLA}{COMBINING DOT ABOVE}
3: \u0041\u030A\u1E0B\u0327 = {LATIN CAPITAL LETTER A}{COMBINING RING ABOVE}{LATIN SMALL LETTER D WITH DOT ABOVE}{COMBINING CEDILLA}
4: \u0041\u030A\u1E11\u0307 = {LATIN CAPITAL LETTER A}{COMBINING RING ABOVE}{LATIN SMALL LETTER D WITH CEDILLA}{COMBINING DOT ABOVE}
5: \u00C5\u0064\u0307\u0327 = {LATIN CAPITAL LETTER A WITH RING ABOVE}{LATIN SMALL LETTER D}{COMBINING DOT ABOVE}{COMBINING CEDILLA}
6: \u00C5\u0064\u0327\u0307 = {LATIN CAPITAL LETTER A WITH RING ABOVE}{LATIN SMALL LETTER D}{COMBINING CEDILLA}{COMBINING DOT ABOVE}
7: \u00C5\u1E0B\u0327 = {LATIN CAPITAL LETTER A WITH RING ABOVE}{LATIN SMALL LETTER D WITH DOT ABOVE}{COMBINING CEDILLA}
8: \u00C5\u1E11\u0307 = {LATIN CAPITAL LETTER A WITH RING ABOVE}{LATIN SMALL LETTER D WITH CEDILLA}{COMBINING DOT ABOVE}
9: \u212B\u0064\u0307\u0327 = {ANGSTROM SIGN}{LATIN SMALL LETTER D}{COMBINING DOT ABOVE}{COMBINING CEDILLA}
10: \u212B\u0064\u0327\u0307 = {ANGSTROM SIGN}{LATIN SMALL LETTER D}{COMBINING CEDILLA}{COMBINING DOT ABOVE}
11: \u212B\u1E0B\u0327 = {ANGSTROM SIGN}{LATIN SMALL LETTER D WITH DOT ABOVE}{COMBINING CEDILLA}
12: \u212B\u1E11\u0307 = {ANGSTROM SIGN}{LATIN SMALL LETTER D WITH CEDILLA}{COMBINING DOT ABOVE}

Note: the code is intended for use with small strings and is not suitable for larger ones, since it has not been optimized for that situation. CanonicalIterator is not intended to be subclassed.
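
The twelve sample results above can be read as every combination of three spellings of the first segment (A with ring) and four spellings of the second (d with dot above and cedilla). The following is a minimal sketch of that counting argument using only the C++ standard library. It does not call ICU, and the names combineSegments and sampleEquivalents are hypothetical helpers introduced for illustration. The per-segment alternatives are copied by hand from the sample output rather than computed, and the index-vector walk is an assumption about how such combinations might be enumerated, not a description of the ICU implementation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of ICU): builds every string formed by
// choosing one alternative from each segment and concatenating the choices.
std::vector<std::u16string> combineSegments(
        const std::vector<std::vector<std::u16string>>& segments) {
    std::vector<std::u16string> results;
    if (segments.empty()) return results;
    for (const auto& alternatives : segments) {
        if (alternatives.empty()) return results;  // nothing to combine
    }

    // One index per segment, advanced like an odometer (last segment fastest).
    std::vector<std::size_t> current(segments.size(), 0);
    for (;;) {
        std::u16string combined;
        for (std::size_t i = 0; i < segments.size(); ++i) {
            combined += segments[i][current[i]];
        }
        results.push_back(combined);

        // Advance the odometer; return once the first index wraps around.
        std::size_t i = segments.size();
        while (i > 0) {
            --i;
            if (++current[i] < segments[i].size()) break;
            current[i] = 0;
            if (i == 0) return results;
        }
    }
}

// The alternatives below are transcribed from the sample results.
std::vector<std::u16string> sampleEquivalents() {
    const std::vector<std::vector<std::u16string>> segments = {
        {u"A\u030A", u"\u00C5", u"\u212B"},
        {u"d\u0307\u0327", u"d\u0327\u0307", u"\u1E0B\u0327", u"\u1E11\u0307"},
    };
    return combineSegments(segments);
}
```

Calling sampleEquivalents() yields 3 x 4 = 12 strings. The order produced here is a property of this sketch only; next() makes no promise about the order of its results.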

Author:
M. Davis, C++ port by V. Weinstein
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

Definition at line 65 of file caniter.h.


Constructor & Destructor Documentation

CanonicalIterator::CanonicalIterator (const UnicodeString & source, UErrorCode & status)
 

Construct a CanonicalIterator object.

Parameters:
source  string to get results for
status  Fill-in parameter which receives the status of this operation.
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

CanonicalIterator::~CanonicalIterator ()
 

Destructor. Cleans pieces.

Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

CanonicalIterator::CanonicalIterator () [private]
 

CanonicalIterator::CanonicalIterator (const CanonicalIterator & other) [private]
 

Copy constructor.

Private for now.

Internal:
For internal use only.


Member Function Documentation

void CanonicalIterator::cleanPieces () [private]
 

Hashtable* CanonicalIterator::extract (UChar32 comp, const UChar * segment, int32_t segLen, int32_t segmentPos, UErrorCode & status) [private]
 

See if the decomposition of comp is at segment starting at segmentPos (with canonical rearrangement). If so, take the remainder and return the equivalents.

virtual UClassID CanonicalIterator::getDynamicClassID (void) const [virtual]
 

ICU "poor man's RTTI", returns a UClassID for the actual class.

Stable:
ICU 2.2

Reimplemented from UObject.

UnicodeString* CanonicalIterator::getEquivalents (const UnicodeString & segment, int32_t & result_len, UErrorCode & status) [private]
 

Hashtable* CanonicalIterator::getEquivalents2 (const UChar * segment, int32_t segLen, UErrorCode & status) [private]
 

UnicodeString CanonicalIterator::getSource ()
 

Gets the NFD form of the current source we are iterating over.

Returns:
the source string. Note that this is the NFD form of the source, not necessarily the string originally passed in.
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

UClassID CanonicalIterator::getStaticClassID () [static]
 

ICU "poor man's RTTI", returns a UClassID for this class.

Stable:
ICU 2.2

UnicodeString CanonicalIterator::next (void)
 

Get the next canonically equivalent string.


Warning: The strings are not guaranteed to be in any particular order.

Returns:
the next string that is canonically equivalent. A bogus string is returned when the iteration is done.
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

CanonicalIterator& CanonicalIterator::operator= (const CanonicalIterator & other) [private]
 

Assignment operator.

Private for now.

Internal:
For internal use only.

void CanonicalIterator::permute (UnicodeString & source, UBool skipZeros, Hashtable * result, UErrorCode & status) [static]
 

Dumb recursive implementation of permutation.

TODO: optimize

Parameters:
source  the string to find permutations for
skipZeros  whether to skip zeros, that is, to leave characters with canonical combining class zero out of the permutation
result  the results in a set.
status  Fill-in parameter which receives the status of this operation.
Internal:
For internal use only.
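
As an illustration of what a recursive permutation that collects its results in a set looks like, here is a minimal sketch using only the C++ standard library. It is not the ICU code: permuteSketch and isCombiningMark are hypothetical names, std::set stands in for the Hashtable result, the string is treated as a sequence of single UTF-16 code units (no surrogate pairs), and the combining-class test is a crude stand-in that treats only U+0300..U+036F as having a non-zero class. The handling of skipZeros (skip a class-zero character unless it is the first one) is likewise an assumption made for the sketch.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Hypothetical stand-in for a canonical combining class lookup:
// only the Combining Diacritical Marks block counts as non-zero here.
static bool isCombiningMark(char16_t c) {
    return c >= 0x0300 && c <= 0x036F;
}

// Hypothetical sketch: adds every permutation of `source` to `result`.
// With skipZeros set, a non-mark character is never moved to the front
// unless it is already the first character.
void permuteSketch(const std::u16string& source, bool skipZeros,
                   std::set<std::u16string>& result) {
    // Base case: zero or one character has exactly one ordering.
    if (source.size() <= 1) {
        result.insert(source);
        return;
    }
    for (std::size_t i = 0; i < source.size(); ++i) {
        const char16_t c = source[i];
        if (skipZeros && i != 0 && !isCombiningMark(c)) {
            continue;  // leave class-zero characters where they are
        }
        // Permute the remaining characters, then put c in front of each.
        const std::u16string rest = source.substr(0, i) + source.substr(i + 1);
        std::set<std::u16string> sub;
        permuteSketch(rest, skipZeros, sub);
        for (const std::u16string& s : sub) {
            result.insert(std::u16string(1, c) + s);
        }
    }
}
```

Because the results go into a set, repeated characters do not produce duplicate entries. The number of permutations grows factorially with the length of the input, which is consistent with the note that the class is intended for small strings.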

void CanonicalIterator::reset ()
 

Resets the iterator so that one can start again from the beginning.

Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.

void CanonicalIterator::setSource (const UnicodeString & newSource, UErrorCode & status)
 

Set a new source for this iterator.

Allows object reuse.

Parameters:
newSource  the source string to iterate against. This allows the same iterator to be used while changing the source string, saving object creation.
status  Fill-in parameter which receives the status of this operation.
Draft:
This API has been introduced in ICU 2.4. It is still in draft state and may be modified in a future release.


Member Data Documentation

UnicodeString CanonicalIterator::buffer [private]
 

Definition at line 170 of file caniter.h.

int32_t* CanonicalIterator::current [private]
 

Definition at line 166 of file caniter.h.

int32_t CanonicalIterator::current_length [private]
 

Definition at line 167 of file caniter.h.

UBool CanonicalIterator::done [private]
 

Definition at line 157 of file caniter.h.

UnicodeString** CanonicalIterator::pieces [private]
 

Definition at line 161 of file caniter.h.

int32_t CanonicalIterator::pieces_length [private]
 

Definition at line 162 of file caniter.h.

int32_t* CanonicalIterator::pieces_lengths [private]
 

Definition at line 163 of file caniter.h.

UnicodeString CanonicalIterator::source [private]
 

Definition at line 156 of file caniter.h.


The documentation for this class was generated from the following file:
caniter.h

Generated on Mon Nov 24 14:36:21 2003 for ICU 2.8 by doxygen 1.2.11.1 written by Dimitri van Heesch, © 1997-2001