it.unimi.dsi.mg4j.index
Interface IndexIterator

All Superinterfaces:
DocumentIterator, IntIterator, Iterable<Interval>, Iterator<Integer>
All Known Implementing Classes:
AbstractIndexIterator, BitStreamHPIndexReader.BitStreamHPIndexReaderIndexIterator, BitStreamIndexReader.BitStreamIndexReaderIndexIterator, DocumentalConcatenatedClusterIndexIterator, DocumentalMergedClusterIndexIterator, GammaDeltaGammaDeltaBitStreamHPIndexReader.BitStreamHPIndexReaderIndexIterator, GammaDeltaGammaDeltaBitStreamIndexReader.BitStreamIndexReaderIndexIterator, Index.EmptyIndexIterator, MultiTermIndexIterator, SkipGammaDeltaGammaDeltaBitStreamIndexReader.BitStreamIndexReaderIndexIterator

public interface IndexIterator
extends DocumentIterator

An iterator over an inverted list.

An index iterator scans the inverted list of an indexed term. Each integer returned by nextDocument() is the index of a document containing the term. If the index contains counts, the count for the current document can be obtained after each call to DocumentIterator.nextDocument() using count(). If the index also contains positions, they can be obtained as an array using positionArray(), as an iterator using positions(), or stored into a caller-supplied array using positions(int[]).
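
The scanning pattern described above can be sketched as follows. This is a minimal, self-contained illustration, not the library's code: Postings and its in-memory factory of(...) are hypothetical stand-ins that mirror only nextDocument(), count() and positionArray(). The sketch also assumes that nextDocument() returns -1 once the list is exhausted, which this page does not state. The documented count() and positionArray() declare IOException, which the stand-in omits.

```java
import java.util.ArrayList;
import java.util.List;

class ScanSketch {
    /** Hypothetical stand-in mirroring a few documented methods; not the MG4J interface. */
    interface Postings {
        int nextDocument();    // assumption: returns -1 when no documents are left
        int count();           // occurrences of the term in the current document
        int[] positionArray(); // first count() elements are the positions
    }

    /** Builds an in-memory stand-in from parallel arrays of documents and positions. */
    static Postings of(int[] docs, int[][] positions) {
        return new Postings() {
            private int i = -1;
            public int nextDocument() { return ++i < docs.length ? docs[i] : -1; }
            public int count() { return positions[i].length; }
            public int[] positionArray() { return positions[i]; }
        };
    }

    /** Scans the whole list, reading the count and then the positions of each document. */
    static List<String> scan(Postings it) {
        List<String> out = new ArrayList<>();
        int doc;
        while ((doc = it.nextDocument()) != -1) {
            int count = it.count();
            int[] pos = it.positionArray();
            StringBuilder sb = new StringBuilder("doc " + doc + ": count=" + count + " positions=");
            // Only the first count elements are meaningful.
            for (int k = 0; k < count; k++) sb.append(k == 0 ? "" : ",").append(pos[k]);
            out.add(sb.toString());
        }
        return out;
    }
}
```

With a real index, the same loop shape would apply to an iterator obtained from IndexReader.documents(CharSequence), with IOException handled by the caller.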

Note that this interface extends DocumentIterator. The intervals returned for a document are exactly the length-one intervals corresponding to the positions returned by positions(). If the index to which an instance of this interface refers does not contain positions, requesting intervals throws an UnsupportedOperationException.
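
As a small illustration of the correspondence between positions and intervals, the sketch below turns the first count positions into length-one intervals. Representing an interval as a two-element {left, right} array is an assumption made for this example; it is not the Interval class used by the library.

```java
class UnitIntervalsSketch {
    /**
     * Maps each of the first count positions p to the length-one interval [p, p].
     * Intervals are modelled as {left, right} arrays (an assumption of this sketch).
     */
    static int[][] toUnitIntervals(int[] positions, int count) {
        int[][] intervals = new int[count][];
        for (int k = 0; k < count; k++) {
            intervals[k] = new int[] { positions[k], positions[k] };
        }
        return intervals;
    }
}
```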

Additionally, this interface strengthens DocumentIterator.weight(double) so that it returns an index iterator.


Method Summary
 int count()
          Returns the count, that is, the number of occurrences of the term in the current document.
 int frequency()
          Returns the frequency, that is, the number of documents that will be returned by this iterator.
 int id()
          Returns the id of this index iterator.
 IndexIterator id(int id)
          Sets the id of this index iterator.
 Index index()
          Returns the index over which this iterator is built.
 Payload payload()
          Returns the payload, if any, associated with the current document.
 int[] positionArray()
          Returns the positions at which the term appears in the current document in an array.
 IntIterator positions()
          Returns the positions at which the term appears in the current document.
 int positions(int[] positions)
          Stores the positions at which the term appears in the current document in a given array.
 String term()
          Returns the term whose inverted list is returned by this index iterator.
 IndexIterator term(CharSequence term)
          Sets the term whose inverted list is returned by this index iterator.
 int termNumber()
          Returns the number of the term whose inverted list is returned by this index iterator.
 IndexIterator weight(double weight)
          Sets the weight of this index iterator.
 
Methods inherited from interface it.unimi.dsi.mg4j.search.DocumentIterator
accept, acceptOnTruePaths, dispose, document, indices, intervalIterator, intervalIterator, intervalIterators, iterator, nextDocument, nextInt, skipTo, weight
 
Methods inherited from interface it.unimi.dsi.fastutil.ints.IntIterator
skip
 
Methods inherited from interface java.util.Iterator
hasNext, next, remove
 

Method Detail

index

Index index()
Returns the index over which this iterator is built.

Returns:
the index over which this iterator is built.

termNumber

int termNumber()
Returns the number of the term whose inverted list is returned by this index iterator.

Usually, the term number is automatically set by IndexReader.documents(CharSequence) or IndexReader.documents(int).

Returns:
the number of the term over which this iterator is built.
Throws:
IllegalStateException - if no term was set when the iterator was created.
See Also:
term()

term

String term()
Returns the term whose inverted list is returned by this index iterator.

Usually, the term is automatically set by IndexReader.documents(CharSequence) or IndexReader.documents(int), but you can supply your own term with term(CharSequence).

Returns:
the term over which this iterator is built, as a string.
Throws:
IllegalStateException - if no term was set when the iterator was created.
See Also:
termNumber()

term

IndexIterator term(CharSequence term)
Sets the term whose inverted list is returned by this index iterator.

Usually, the term is automatically set by Index.documents(CharSequence) or by IndexReader.documents(CharSequence), but you can use this method to ensure that term() doesn't throw an exception.

Parameters:
term - a character sequence (that will be defensively copied) that will be assumed to be the term whose inverted list is returned by this index iterator.
Returns:
this index iterator.
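
The contract of the term getter and setter can be modelled with a small stand-in. TermHolder below is hypothetical and only mimics what this page documents: the setter copies its argument and returns the receiver, and the getter throws IllegalStateException when no term was set. How an actual implementation stores the term is not specified here.

```java
class TermSetterSketch {
    /** Hypothetical stand-in for the documented term() / term(CharSequence) pair. */
    static final class TermHolder {
        private String term; // null until a term is supplied

        /** Defensively copies the sequence, so later changes to it are not observed. */
        TermHolder term(CharSequence t) {
            this.term = t.toString();
            return this; // fluent style: returns the receiver
        }

        String term() {
            if (term == null) throw new IllegalStateException("no term was set");
            return term;
        }
    }
}
```

Because of the copy, a caller can reuse a mutable buffer such as a StringBuilder after passing it to the setter without affecting the stored term.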

frequency

int frequency()
              throws IOException
Returns the frequency, that is, the number of documents that will be returned by this iterator.

Returns:
the number of documents that will be returned by this iterator.
Throws:
IOException

payload

Payload payload()
                throws IOException
Returns the payload, if any, associated with the current document.

Returns:
the payload associated with the current document.
Throws:
IOException

count

int count()
          throws IOException
Returns the count, that is, the number of occurrences of the term in the current document.

Returns:
the count (number of occurrences) of the term in the current document.
Throws:
UnsupportedOperationException - if the index of this iterator does not contain counts.
IOException

positions

IntIterator positions()
                      throws IOException
Returns the positions at which the term appears in the current document.

Returns:
the positions in the current document at which the term appears.
Throws:
UnsupportedOperationException - if the index of this iterator does not contain positions.
IOException

positions

int positions(int[] positions)
              throws IOException
Stores the positions at which the term appears in the current document in a given array.

If the array is not large enough (i.e., it has room for fewer than count() elements), this method returns a negative number: the count with its sign changed.

Parameters:
positions - an array that will be used to store positions.
Returns:
the count, negated if the given array cannot hold all positions.
Throws:
UnsupportedOperationException - if the index of this iterator does not contain positions.
IOException
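
The negative return value lets a caller size the array and retry. The sketch below shows that pattern against a hypothetical stand-in, PositionSource, which implements only the documented return convention and omits IOException. It assumes that calling positions(int[]) a second time for the same document is allowed, which this page does not state.

```java
import java.util.Arrays;

class PositionsBufferSketch {
    /** Hypothetical stand-in for the documented positions(int[]) contract. */
    interface PositionSource {
        int positions(int[] buffer);
    }

    /** In-memory source: fills the buffer if it fits, otherwise returns the negated count. */
    static PositionSource of(int... stored) {
        return buffer -> {
            if (buffer.length < stored.length) return -stored.length;
            System.arraycopy(stored, 0, buffer, 0, stored.length);
            return stored.length;
        };
    }

    /** Reads all positions, allocating a larger buffer once if the given one is too small. */
    static int[] read(PositionSource source, int[] buffer) {
        int n = source.positions(buffer);
        if (n < 0) {
            buffer = new int[-n];         // -n is the count reported by the failed call
            n = source.positions(buffer); // second attempt now fits
        }
        return Arrays.copyOf(buffer, n);  // trim to exactly n positions
    }
}
```

In a scanning loop, keeping the enlarged buffer for later documents would avoid reallocating on every call.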

positionArray

int[] positionArray()
                    throws IOException
Returns the positions at which the term appears in the current document in an array.

Implementations are allowed to return the same array across different calls to this method.

Returns:
an array whose first count() elements contain the document positions.
Throws:
UnsupportedOperationException - if the index of this iterator does not contain positions.
IOException
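
Since the same array may be returned across calls and only its first count() elements are meaningful, a caller that needs to keep the positions should copy them. The sketch below demonstrates this with ReusingReader, a hypothetical class that reuses one fixed-size internal buffer; it is not an MG4J implementation.

```java
import java.util.Arrays;

class PositionArrayCopySketch {
    /** Hypothetical reader that reuses one internal buffer (capacity 8 in this sketch). */
    static final class ReusingReader {
        private final int[] shared = new int[8];
        private int count;

        /** Simulates moving to a document with the given positions (at most 8). */
        void load(int... positions) {
            count = positions.length;
            System.arraycopy(positions, 0, shared, 0, count);
        }

        int count() { return count; }

        /** Always returns the same array; only the first count() elements are valid. */
        int[] positionArray() { return shared; }
    }

    /** Copies exactly the valid prefix, so the result survives later calls. */
    static int[] snapshot(ReusingReader reader) {
        return Arrays.copyOf(reader.positionArray(), reader.count());
    }
}
```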

id

IndexIterator id(int id)
Sets the id of this index iterator.

The id is an integer associated with each index iterator. It has no specific semantics and can be used differently in different contexts. A typical usage pattern is to assign a unique number to each index iterator contained in a composite document iterator (say, by numbering the leaves of the composite consecutively).

Parameters:
id - the new id for this index iterator.
Returns:
this index iterator.
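
The consecutive numbering mentioned above can be sketched as follows. Leaf is a hypothetical stand-in that carries only the documented id(int) and id() pair plus a term; how the leaves of a real composite iterator are collected is outside the scope of this example.

```java
import java.util.List;

class LeafNumberingSketch {
    /** Hypothetical stand-in with the documented fluent setter and getter for the id. */
    static final class Leaf {
        private final String term;
        private int id;

        Leaf(String term) { this.term = term; }

        Leaf id(int id) { this.id = id; return this; } // returns the receiver, as documented
        int id() { return id; }
        String term() { return term; }
    }

    /** Assigns 0, 1, 2, ... to the leaves in list order. */
    static void numberLeaves(List<Leaf> leaves) {
        int next = 0;
        for (Leaf leaf : leaves) leaf.id(next++);
    }
}
```

After numbering, the id can serve as an index into per-leaf arrays, for example to accumulate a score for each term.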

id

int id()
Returns the id of this index iterator.

Returns:
the id of this index iterator.
See Also:
id(int)

weight

IndexIterator weight(double weight)
Sets the weight of this index iterator.

Specified by:
weight in interface DocumentIterator
Parameters:
weight - the new weight of this index iterator.
Returns:
this index iterator.
See Also:
DocumentIterator.weight(double)