java.lang.Object
  it.unimi.dsi.mg4j.search.visitor.AbstractDocumentIteratorVisitor
    it.unimi.dsi.mg4j.search.visitor.TermCollectionVisitor
public class TermCollectionVisitor
A visitor collecting information about terms appearing in a DocumentIterator.
The purpose of this visitor is to explore the structure of a DocumentIterator before iteration, to count how many terms are actually used, and to set up some preliminary access data. More precisely, we count the distinct index/term pairs appearing in all leaves of nonzero frequency (the latter condition is used to skip empty iterators). For this visitor to work, all leaves of nonzero frequency must return a non-null value on a call to IndexIterator.term().
During the visit, we keep track of which index/term pairs have already been seen. Each pair is assigned a distinct offset (a number between zero and the overall number of distinct pairs), which is stored into the id of each index iterator and is used afterwards to quickly access data about the pair. Note that duplicate index/term pairs get the same offset. The overall number of distinct pairs is returned by numberOfPairs() after a visit.
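The offset assignment described above can be illustrated with a small self-contained model. This is only a sketch, not MG4J code: the Leaf class, the use of plain strings for indices, and the choice of handing out offsets in first-seen order starting from zero are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of offset assignment to index/term pairs (hypothetical, not part of MG4J). */
final class PairOffsetSketch {

    /** Hypothetical stand-in for a leaf: an index name, a term and a frequency. */
    static final class Leaf {
        final String index;
        final String term;
        final int frequency;

        Leaf(String index, String term, int frequency) {
            this.index = index;
            this.term = term;
            this.frequency = frequency;
        }
    }

    /** Maps each distinct index/term pair of a nonzero-frequency leaf to an offset. */
    static Map<List<String>, Integer> assignOffsets(List<Leaf> leaves) {
        Map<List<String>, Integer> offsets = new LinkedHashMap<>();
        for (Leaf leaf : leaves) {
            if (leaf.frequency == 0) continue; // empty iterators are skipped
            List<String> pair = Arrays.asList(leaf.index, leaf.term);
            // An already-seen pair keeps its offset, so duplicates share it.
            offsets.putIfAbsent(pair, offsets.size());
        }
        return offsets;
    }

    public static void main(String[] args) {
        Map<List<String>, Integer> offsets = assignOffsets(Arrays.asList(
                new Leaf("text", "cat", 3),
                new Leaf("title", "cat", 1),
                new Leaf("text", "cat", 3),   // duplicate pair
                new Leaf("text", "dog", 0))); // zero frequency
        System.out.println(offsets + ", pairs = " + offsets.size());
    }
}
```

In this model the size of the returned map plays the role of numberOfPairs().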
During the visit, the indices actually appearing in some nonzero-frequency leaf are gathered; they are accessible as a vector returned by indices(), and indexMap() provides the inverse mapping, from each index to its position in that vector.
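The relation between the vector of indices and its inverse map can be sketched as follows. The example is a stand-alone model under stated assumptions: plain Object instances stand in for Index, and java.util.IdentityHashMap stands in for the reference-keyed Reference2IntMap, since both compare keys by reference rather than by equals(). The class and method names are hypothetical.

```java
import java.util.IdentityHashMap;
import java.util.Map;

/** Toy model of inverting a vector of indices (hypothetical, not part of MG4J). */
final class IndexPositionSketch {

    /** Returns a reference-keyed map sending each element to its position in the array. */
    static Map<Object, Integer> invert(Object[] indices) {
        Map<Object, Integer> positions = new IdentityHashMap<>();
        for (int i = 0; i < indices.length; i++) {
            positions.put(indices[i], i);
        }
        return positions;
    }

    public static void main(String[] args) {
        Object textIndex = new Object();
        Object titleIndex = new Object();
        Object[] indices = { textIndex, titleIndex };
        Map<Object, Integer> positions = invert(indices);
        // Looking up an index gives back the position at which it sits in the vector.
        for (Object index : indices) {
            System.out.println(indices[positions.get(index)] == index);
        }
    }
}
```

Because keys are compared by reference, two distinct objects that happen to be equal would be treated as different indices in this model.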
The offset assigned to each index/term pair is returned by offset(Index, String). Should you need to know the terms associated to each index, they are returned by terms(Index).
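The per-index term lists can be modeled in the same spirit. This sketch assumes, as the method documentation states, that terms are reported in order of first appearance, without duplicates, and that null is returned for an index with no nonzero-frequency leaf. The class, its visitLeaf method and the use of strings for indices are hypothetical and not MG4J API.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of per-index term collection (hypothetical, not part of MG4J). */
final class TermsPerIndexSketch {
    private final Map<String, Set<String>> termsByIndex = new LinkedHashMap<>();

    /** Records the term of a leaf, ignoring leaves of zero frequency. */
    void visitLeaf(String index, String term, int frequency) {
        if (frequency == 0) return;
        // LinkedHashSet keeps first-appearance order and drops duplicates.
        termsByIndex.computeIfAbsent(index, k -> new LinkedHashSet<>()).add(term);
    }

    /** Returns the terms seen for the given index, or null if none was seen. */
    String[] terms(String index) {
        Set<String> terms = termsByIndex.get(index);
        return terms == null ? null : terms.toArray(new String[0]);
    }

    public static void main(String[] args) {
        TermsPerIndexSketch sketch = new TermsPerIndexSketch();
        sketch.visitLeaf("text", "cat", 2);
        sketch.visitLeaf("text", "dog", 1);
        sketch.visitLeaf("text", "cat", 2);  // duplicate, ignored
        sketch.visitLeaf("title", "cat", 0); // zero frequency, ignored
        System.out.println(String.join(",", sketch.terms("text")));
        System.out.println(sketch.terms("title") == null);
    }
}
```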
After a term collection, counters are usually set up by a visit of CounterSetupVisitor.
Constructor Summary | |
---|---|
TermCollectionVisitor() Creates a new term-collection visitor. |
Method Summary | |
---|---|
Reference2IntMap<Index> | indexMap() Returns a map from indices met during term collection to their position in indices(). |
Index[] | indices() Returns the indices met during pair collection. |
int | numberOfPairs() Returns the number of distinct index/term pairs corresponding to nonzero-frequency index iterators in the last visit. |
int | offset(Index index, String term) Returns the offset associated to a given pair index/term. |
TermCollectionVisitor | prepare() Prepares the internal state of this visitor for a(nother) visit. |
String[] | terms(Index index) Returns the terms associated to the given index. |
String | toString() |
boolean | visit(IndexIterator indexIterator) Visits a leaf. |
Methods inherited from class it.unimi.dsi.mg4j.search.visitor.AbstractDocumentIteratorVisitor |
---|
visitPost, visitPre |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
public TermCollectionVisitor()
Method Detail |
---|
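public TermCollectionVisitor prepare()
Description copied from interface: DocumentIteratorVisitor
Prepares the internal state of this visitor for a(nother) visit. By specification, it must be safe to call this method any number of times.
Specified by: prepare in interface DocumentIteratorVisitor
Overrides: prepare in class AbstractDocumentIteratorVisitor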
public boolean visit(IndexIterator indexIterator) throws IOException
Description copied from interface: DocumentIteratorVisitor
Visits a leaf.
Parameters: indexIterator - the leaf to be visited.
Throws: IOException
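public int numberOfPairs()
Returns the number of distinct index/term pairs corresponding to nonzero-frequency index iterators in the last visit.
public Index[] indices()
Returns the indices met during pair collection.
Note that the returned array does not include indices only associated to index iterators of zero frequency.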
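public Reference2IntMap<Index> indexMap()
Returns a map from indices met during term collection to their position in indices().
Note that the returned map does not include indices only associated to index iterators of zero frequency.
Returns: a map from indices met during term collection to their position in indices().
public String[] terms(Index index)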
Returns the terms associated to the given index.
Parameters: index - an index.
Returns: the terms associated to index, in the same order in which they appeared during the visit, skipping duplicates, if some nonzero-frequency iterator based on index was found; null otherwise.
public int offset(Index index, String term)
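Returns the offset associated to a given pair index/term.
Parameters: index - an index appearing in indices().
term - a term appearing in the array returned by terms(Index) with argument index.
Returns: the offset associated to the pair index/term.
public String toString()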
Overrides: toString in class Object