it.unimi.dsi.mg4j.search.score
Interface Scorer

All Superinterfaces:
FlyweightPrototype<Scorer>, IntIterator, Iterator<Integer>
All Known Subinterfaces:
DelegatingScorer
All Known Implementing Classes:
AbstractAggregator, AbstractIndexScorer, AbstractScorer, AbstractWeightedScorer, BM25Scorer, ClarkeCormackScorer, ConstantScorer, CountScorer, DecreasingDocumentRankScorer, DocumentRankScorer, LinearAggregator, TfIdfScorer, VignaScorer

public interface Scorer
extends IntIterator, FlyweightPrototype<Scorer>

A wrapper for a DocumentIterator returning scored document pointers.

Typically, a scorer may have one or more constructors, but all scorers should provide a constructor that takes only strings as arguments, so that they can be instantiated easily from the command line or similar interfaces.
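The string-only constructor convention can be illustrated with a minimal sketch. ToyParamScorer, its two parameters and the fromSpec helper are hypothetical names invented for this example; they are not part of MG4J, and the comma-separated format is an assumption.

```java
// Hypothetical class used only to illustrate the convention; not an MG4J scorer.
final class ToyParamScorer {
    final double k1;
    final double b;

    // Typed constructor, convenient when building the scorer from code.
    ToyParamScorer(double k1, double b) {
        this.k1 = k1;
        this.b = b;
    }

    // String-only constructor: delegates to the typed one after parsing,
    // so that generic tooling can instantiate the scorer from text.
    ToyParamScorer(String k1, String b) {
        this(Double.parseDouble(k1), Double.parseDouble(b));
    }

    // Assumed helper: builds a scorer from a comma-separated argument list
    // such as "1.2, 0.5", as a command-line front end might do.
    static ToyParamScorer fromSpec(String spec) {
        String[] args = spec.split(",");
        if (args.length != 2) {
            throw new IllegalArgumentException("Expected two arguments: " + spec);
        }
        return new ToyParamScorer(args[0].trim(), args[1].trim());
    }
}
```

Because every argument arrives as a string, a front end does not need to know the parameter types of each scorer in advance.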

To be (re)used, a scorer must first wrap an underlying DocumentIterator. This phase usually involves some preprocessing around properties of the document iterator to be scored. After wrapping, calls to nextDocument() and score() (or possibly score(Index)) will return the next document pointer and its score. Note that these methods are not usually idempotent, as they modify the state of the underlying iterator (e.g., they consume intervals).
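The wrap-then-iterate protocol described above can be sketched as follows. To keep the example self-contained, SketchDocumentIterator, SketchScorer and ToyScorer are hypothetical stand-ins reduced to the methods this page describes; they are not the MG4J types, they omit the checked IOException, and the scoring formula is arbitrary.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for DocumentIterator: returns -1 when exhausted.
interface SketchDocumentIterator {
    int nextDocument();
}

// Hypothetical stand-in for Scorer, limited to wrap/nextDocument/score.
interface SketchScorer {
    void wrap(SketchDocumentIterator documentIterator);
    int nextDocument();
    double score();
}

// Toy scorer: the score is an arbitrary function of the document pointer.
final class ToyScorer implements SketchScorer {
    private SketchDocumentIterator it;
    private int current = -1;

    public void wrap(SketchDocumentIterator documentIterator) {
        this.it = documentIterator; // reset state so the scorer can be reused
        this.current = -1;
    }

    public int nextDocument() {
        current = it.nextDocument(); // advances the underlying iterator
        return current;
    }

    public double score() {
        return 0.5 * current;
    }
}

final class ScorerLoopSketch {
    // Builds an iterator over a fixed list of document pointers.
    static SketchDocumentIterator over(final int... docs) {
        return new SketchDocumentIterator() {
            private int i = 0;
            public int nextDocument() {
                return i < docs.length ? docs[i++] : -1;
            }
        };
    }

    // Wraps first, then alternates nextDocument() and score() until -1.
    static Map<Integer, Double> scoreAll(SketchScorer scorer, SketchDocumentIterator it) {
        Map<Integer, Double> scores = new LinkedHashMap<>();
        scorer.wrap(it);
        int doc;
        while ((doc = scorer.nextDocument()) != -1) {
            scores.put(doc, scorer.score());
        }
        return scores;
    }

    public static void main(String[] args) {
        System.out.println(scoreAll(new ToyScorer(), over(2, 4, 7)));
    }
}
```

Note that score() is called exactly once per document here; since a real scorer may consume state of the underlying iterator, calling it twice for the same document is not guaranteed to return the same value.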

Scores returned by a scorer might depend on some weights associated with each index.

Optionally, a scorer might be a DelegatingScorer.

Warning: implementations of this interface are not required to be thread-safe, but they provide flyweight copies. The copy() method is strengthened so as to return an object implementing this interface.
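One way to use flyweight copies safely is to give each thread its own copy of a prototype scorer. The following is a minimal sketch under that assumption; CountingScorer and FlyweightSketch are hypothetical classes written for this example and are not MG4J types.

```java
// Hypothetical scorer with read-only shared data and mutable per-instance state.
final class CountingScorer {
    private final double[] table; // read-only, shared between copies
    private int calls;            // mutable, private to each copy

    CountingScorer(double[] table) {
        this.table = table;
    }

    // Flyweight copy: shares the immutable table, starts with fresh mutable state.
    CountingScorer copy() {
        return new CountingScorer(table);
    }

    double score(int doc) {
        calls++; // not thread-safe: must not be shared across threads
        return table[doc % table.length];
    }

    int calls() {
        return calls;
    }
}

final class FlyweightSketch {
    // Runs one worker per thread, each on its own copy of the prototype,
    // and returns the number of score() calls recorded by each copy.
    static int[] runInParallel(CountingScorer prototype, int threads, final int callsPerThread) {
        final CountingScorer[] copies = new CountingScorer[threads];
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final CountingScorer mine = prototype.copy();
            copies[t] = mine;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    mine.score(i);
                }
            });
            workers[t].start();
        }
        int[] counts = new int[threads];
        try {
            for (int t = 0; t < threads; t++) {
                workers[t].join(); // join makes the worker's writes visible here
                counts[t] = copies[t].calls();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for workers", e);
        }
        return counts;
    }
}
```

The prototype itself is never used for scoring, so its mutable state stays untouched while the copies work independently.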


Method Summary
 Scorer copy()
           
 int nextDocument()
          Returns the next document provided by this scorer, or -1 if no more documents are available.
 int nextInt()
          Deprecated. As of MG4J 1.2, the suggested way of iterating over a scorer is nextDocument(), which provides fully lazy iteration. After a couple of releases, however, this annotation will be removed, as it is very practical to have scorers implementing IntIterator. Its main purpose is to let people know about nextDocument(), which solves the same issues as DocumentIterator.nextDocument().
 double score()
          Returns a score for the current document of the last document iterator given to wrap(DocumentIterator).
 double score(Index index)
          Returns a score for the current document of the last document iterator given to wrap(DocumentIterator), but considering only a given index (optional operation).
 boolean setWeights(Reference2DoubleMap<Index> index2Weight)
          Sets the weight map for this scorer (if applicable).
 boolean usesIntervals()
          Whether this scorer uses intervals.
 void wrap(DocumentIterator documentIterator)
          Wraps a document iterator and prepares the internal state of this scorer to work with it.
 
Methods inherited from interface it.unimi.dsi.fastutil.ints.IntIterator
skip
 
Methods inherited from interface java.util.Iterator
hasNext, next, remove
 

Method Detail

score

double score()
             throws IOException
Returns a score for the current document of the last document iterator given to wrap(DocumentIterator).

Returns:
the score.
Throws:
IOException

score

double score(Index index)
             throws IOException
Returns a score for the current document of the last document iterator given to wrap(DocumentIterator), but considering only a given index (optional operation).

Parameters:
index - the only index to be considered.
Returns:
the score.
Throws:
IOException

setWeights

boolean setWeights(Reference2DoubleMap<Index> index2Weight)
Sets the weight map for this scorer (if applicable).

The given map will be copied internally, so the caller can keep using (and modifying) it without affecting the scorer's behaviour. Implementing classes should rescale the weights so that they sum to one.

Indices not appearing in the map will have weight equal to 0.

Parameters:
index2Weight - a map from indices to weights.
Returns:
true if this scorer supports weights.
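The contract above (internal copy, rescaling to sum one, weight 0 for missing indices) can be sketched as follows. WeightSketch is a hypothetical class; it uses java.util.IdentityHashMap as a stand-in for a reference-based map and plain Object keys in place of Index. Rejecting a non-positive sum is an assumption of this sketch, not something the interface specifies.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical holder for per-index weights; not an MG4J class.
final class WeightSketch {
    // Keys are compared by reference, as in a Reference2DoubleMap.
    private final Map<Object, Double> weights = new IdentityHashMap<>();

    // Copies the given map internally, rescaling the weights to sum to one.
    boolean setWeights(Map<Object, Double> index2Weight) {
        double sum = 0;
        for (double w : index2Weight.values()) {
            sum += w;
        }
        if (sum <= 0) {
            // Assumption: the sketch refuses maps that cannot be rescaled.
            throw new IllegalArgumentException("Weights must have a positive sum");
        }
        weights.clear();
        for (Map.Entry<Object, Double> e : index2Weight.entrySet()) {
            weights.put(e.getKey(), e.getValue() / sum);
        }
        return true; // this sketch always supports weights
    }

    // Indices not appearing in the map have weight 0.
    double weightOf(Object index) {
        Double w = weights.get(index);
        return w == null ? 0.0 : w;
    }
}
```

For instance, weights 3 and 1 for two indices are stored as 0.75 and 0.25, and later changes to the caller's map do not alter the stored values.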

wrap

void wrap(DocumentIterator documentIterator)
          throws IOException
Wraps a document iterator and prepares the internal state of this scorer to work with it.

Subsequent calls to score() and score(Index) will use documentIterator to compute the score.

Parameters:
documentIterator - the document iterator that will be used in subsequent calls to score() and score(Index).
Throws:
IOException

usesIntervals

boolean usesIntervals()
Whether this scorer uses intervals.

This method is essential when aggregating scorers, because if several scorers need intervals, a CachingDocumentIterator will be necessary.

Returns:
true if this scorer uses intervals.
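An aggregator might use this method along the following lines. The sketch reads "several" as "more than one", which is an assumption; IntervalAware and AggregationSketch are hypothetical names, and the actual wrapping in a CachingDocumentIterator is left out.

```java
import java.util.List;

// Hypothetical stand-in exposing only the usesIntervals() method of a scorer.
interface IntervalAware {
    boolean usesIntervals();
}

final class AggregationSketch {
    // Returns true if more than one scorer consumes intervals, in which case
    // the intervals must be cached so that each scorer can read them.
    static boolean needsCaching(List<? extends IntervalAware> scorers) {
        int intervalUsers = 0;
        for (IntervalAware s : scorers) {
            if (s.usesIntervals()) {
                intervalUsers++;
            }
        }
        return intervalUsers > 1;
    }
}
```

A single interval-based scorer can consume the intervals directly, so caching is only requested when they have to be read more than once.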

nextInt

@Deprecated
int nextInt()
Deprecated. As of MG4J 1.2, the suggested way of iterating over a scorer is nextDocument(), which provides fully lazy iteration. After a couple of releases, however, this annotation will be removed, as it is very practical to have scorers implementing IntIterator. Its main purpose is to let people know about nextDocument(), which solves the same issues as DocumentIterator.nextDocument().

Returns the next document.

Specified by:
nextInt in interface IntIterator
See Also:
nextDocument()

nextDocument

int nextDocument()
                 throws IOException
Returns the next document provided by this scorer, or -1 if no more documents are available.

Returns:
the next document, or -1 if no more documents are available.
Throws:
IOException

copy

Scorer copy()
Specified by:
copy in interface FlyweightPrototype<Scorer>