Class IndexedDISI
- java.lang.Object
  - org.apache.lucene.search.DocIdSetIterator
    - org.apache.lucene.codecs.lucene70.IndexedDISI
final class IndexedDISI extends DocIdSetIterator
Disk-based implementation of a DocIdSetIterator which can return the index of the current document, i.e. the ordinal of the current document among the list of documents that this iterator can return. This is useful to implement sparse doc values by only having to encode values for documents that actually have a value.
Implementation-wise, this DocIdSetIterator is inspired by roaring bitmaps. It encodes ranges of 65536 documents independently and picks between 3 encodings depending on the density of the range:
- ALL if the range contains 65536 documents exactly,
- DENSE if the range contains 4096 documents or more; in that case documents are stored in a bit set,
- SPARSE otherwise, and the lower 16 bits of the doc IDs are stored in a short.
Only ranges that contain at least one value are encoded.
This implementation uses 6 bytes per document in the worst case, which happens when all ranges contain exactly one document.
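The density rule above can be sketched in a few lines. This is a minimal illustration, not the Lucene source: the class name BlockEncodingSketch, the enum, the constants and the helper methods are all hypothetical, and only the thresholds (65536 and 4096) come from the description.

```java
/** Hypothetical sketch of per-range encoding selection; not the Lucene source. */
final class BlockEncodingSketch {
    enum Encoding { ALL, DENSE, SPARSE }

    static final int BLOCK_SIZE = 1 << 16;      // 65536 documents per range
    static final int DENSE_THRESHOLD = 1 << 12; // 4096 documents

    /** Picks an encoding from the number of documents present in one range. */
    static Encoding choose(int cardinality) {
        // Ranges without any document are never encoded, so 0 is rejected too.
        if (cardinality <= 0 || cardinality > BLOCK_SIZE) {
            throw new IllegalArgumentException("cardinality must be in [1, 65536]: " + cardinality);
        }
        if (cardinality == BLOCK_SIZE) {
            return Encoding.ALL;    // every document of the range is present
        }
        if (cardinality >= DENSE_THRESHOLD) {
            return Encoding.DENSE;  // stored as a bit set
        }
        return Encoding.SPARSE;     // lower 16 bits of each doc ID stored as a short
    }

    /** Range number of a doc ID (its upper 16 bits). */
    static int blockOf(int docId) {
        return docId >>> 16;
    }

    /** Position of a doc ID inside its range (its lower 16 bits). */
    static int lowBits(int docId) {
        return docId & 0xFFFF;
    }

    public static void main(String[] args) {
        System.out.println(choose(65536)); // ALL
        System.out.println(choose(4096));  // DENSE
        System.out.println(choose(4095));  // SPARSE
        System.out.println(blockOf(70000) + " " + lowBits(70000)); // 1 4464
    }
}
```

Splitting a doc ID into a range number and its lower 16 bits is what lets the SPARSE encoding store each document in a single short.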
-
-
Nested Class Summary
Nested Classes
Modifier and Type                Class                Description
(package private) static class   IndexedDISI.Method
-
Field Summary
Fields
Modifier and Type                     Field              Description
private int                           block
private long                          blockEnd
private long                          cost
private int                           doc
(package private) boolean             exists
private int                           gap
private int                           index
(package private) static int          MAX_ARRAY_LENGTH
(package private) IndexedDISI.Method  method
private int                           nextBlockIndex
private int                           numberOfOnes
private IndexInput                    slice              The slice that stores the DocIdSetIterator.
private long                          word
private int                           wordIndex
-
Fields inherited from class org.apache.lucene.search.DocIdSetIterator
NO_MORE_DOCS
-
-
Constructor Summary
Constructors
Constructor                                                       Description
IndexedDISI(IndexInput slice, long cost)
IndexedDISI(IndexInput in, long offset, long length, long cost)
-
Method Summary
Modifier and Type                Method
int                              advance(int target)
    Advances to the first document beyond the current one whose document number is greater than or equal to target, and returns the document number itself.
private void                     advanceBlock(int targetBlock)
boolean                          advanceExact(int target)
long                             cost()
    Returns the estimated cost of this DocIdSetIterator.
int                              docID()
    Returns the following: -1 if DocIdSetIterator.nextDoc() or DocIdSetIterator.advance(int) were not called yet.
private static void              flush(int block, FixedBitSet buffer, int cardinality, IndexOutput out)
int                              index()
int                              nextDoc()
    Advances to the next document in the set and returns the doc it is currently on, or DocIdSetIterator.NO_MORE_DOCS if there are no more docs in the set.
    NOTE: after the iterator has exhausted you should not call this method, as it may result in unpredictable behavior.
private void                     readBlockHeader()
(package private) static void    writeBitSet(DocIdSetIterator it, IndexOutput out)
-
Methods inherited from class org.apache.lucene.search.DocIdSetIterator
all, empty, range, slowAdvance
-
Field Detail
-
MAX_ARRAY_LENGTH
static final int MAX_ARRAY_LENGTH
- See Also:
- Constant Field Values
-
slice
private final IndexInput slice
The slice that stores the DocIdSetIterator.
-
cost
private final long cost
-
block
private int block
-
blockEnd
private long blockEnd
-
nextBlockIndex
private int nextBlockIndex
-
method
IndexedDISI.Method method
-
doc
private int doc
-
index
private int index
-
exists
boolean exists
-
word
private long word
-
wordIndex
private int wordIndex
-
numberOfOnes
private int numberOfOnes
-
gap
private int gap
-
-
Constructor Detail
-
IndexedDISI
IndexedDISI(IndexInput in, long offset, long length, long cost) throws java.io.IOException
- Throws:
java.io.IOException
-
IndexedDISI
IndexedDISI(IndexInput slice, long cost) throws java.io.IOException
- Throws:
java.io.IOException
-
-
Method Detail
-
flush
private static void flush(int block, FixedBitSet buffer, int cardinality, IndexOutput out) throws java.io.IOException
- Throws:
java.io.IOException
-
writeBitSet
static void writeBitSet(DocIdSetIterator it, IndexOutput out) throws java.io.IOException
- Throws:
java.io.IOException
-
docID
public int docID()
Description copied from class: DocIdSetIterator
Returns the following:
- -1 if DocIdSetIterator.nextDoc() or DocIdSetIterator.advance(int) were not called yet.
- DocIdSetIterator.NO_MORE_DOCS if the iterator has exhausted.
- Otherwise it should return the doc ID it is currently on.
- Specified by:
docID in class DocIdSetIterator
-
advance
public int advance(int target) throws java.io.IOException
Description copied from class: DocIdSetIterator
Advances to the first document beyond the current one whose document number is greater than or equal to target, and returns the document number itself. Exhausts the iterator and returns DocIdSetIterator.NO_MORE_DOCS if target is greater than the highest document number in the set.
The behavior of this method is undefined when called with target ≤ current, or after the iterator has exhausted. Both cases may result in unpredictable behavior.
When target > current it behaves as if written:
    int advance(int target) {
      int doc;
      while ((doc = nextDoc()) < target) {
      }
      return doc;
    }
Some implementations are considerably more efficient than that.
NOTE: this method may be called with DocIdSetIterator.NO_MORE_DOCS for efficiency by some Scorers. If your implementation cannot efficiently determine that it should exhaust, it is recommended that you check for that value in each call to this method.
- Specified by:
advance in class DocIdSetIterator
- Throws:
java.io.IOException
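The reference loop in the description can be tried out against a toy iterator. The sketch below is an assumption-laden stand-in: AdvanceSketch is a hypothetical array-backed class that mimics the docID(), nextDoc() and advance(int) contract without depending on Lucene, and it uses the linear scan rather than the block-skipping an on-disk implementation would be expected to use.

```java
/** Toy in-memory iterator, a hypothetical stand-in for a DocIdSetIterator. */
final class AdvanceSketch {
    // Same sentinel value as DocIdSetIterator.NO_MORE_DOCS.
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    private final int[] docs; // sorted, distinct doc IDs
    private int pos = -1;     // -1 until nextDoc() or advance() is called

    AdvanceSketch(int... docs) {
        this.docs = docs;
    }

    int docID() {
        if (pos < 0) {
            return -1;
        }
        return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
    }

    int nextDoc() {
        if (pos < docs.length) {
            pos++;
        }
        return docID();
    }

    /** Linear-scan advance, written exactly like the loop in the description. */
    int advance(int target) {
        int doc;
        while ((doc = nextDoc()) < target) {
            // keep scanning until a doc ID >= target is reached
        }
        return doc;
    }

    public static void main(String[] args) {
        AdvanceSketch it = new AdvanceSketch(3, 10, 42);
        System.out.println(it.docID());      // -1
        System.out.println(it.advance(5));   // 10
        System.out.println(it.advance(11));  // 42
        System.out.println(it.advance(100) == NO_MORE_DOCS); // true
    }
}
```

Because NO_MORE_DOCS is the largest int, the loop also terminates when the target itself is NO_MORE_DOCS.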
-
advanceExact
public boolean advanceExact(int target) throws java.io.IOException
- Throws:
java.io.IOException
-
advanceBlock
private void advanceBlock(int targetBlock) throws java.io.IOException
- Throws:
java.io.IOException
-
readBlockHeader
private void readBlockHeader() throws java.io.IOException
- Throws:
java.io.IOException
-
nextDoc
public int nextDoc() throws java.io.IOException
Description copied from class: DocIdSetIterator
Advances to the next document in the set and returns the doc it is currently on, or DocIdSetIterator.NO_MORE_DOCS if there are no more docs in the set.
NOTE: after the iterator has exhausted you should not call this method, as it may result in unpredictable behavior.
- Specified by:
nextDoc in class DocIdSetIterator
- Throws:
java.io.IOException
-
index
public int index()
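The class description defines the index as the ordinal of the current document among the documents the iterator can return, which is what makes sparse doc values possible. The sketch below illustrates that idea only: SparseValuesSketch is hypothetical, it keeps everything in memory, and the behavior given to advanceExact (positioning on a target and reporting whether it has a value) is an assumption, since this page does not document that method.

```java
import java.util.Arrays;

/** Hypothetical in-memory model of sparse doc values; not the Lucene source. */
final class SparseValuesSketch {
    private final int[] docsWithValue; // sorted doc IDs that have a value
    private final long[] values;       // values[i] belongs to docsWithValue[i]
    private int index = -1;            // ordinal of the current document

    SparseValuesSketch(int[] docsWithValue, long[] values) {
        if (docsWithValue.length != values.length) {
            throw new IllegalArgumentException("one value per document is required");
        }
        this.docsWithValue = docsWithValue;
        this.values = values;
    }

    /** Assumed behavior: returns true and updates the ordinal if target has a value. */
    boolean advanceExact(int target) {
        int i = Arrays.binarySearch(docsWithValue, target);
        if (i < 0) {
            return false; // index() is only meaningful after a true result
        }
        index = i;
        return true;
    }

    int index() {
        return index;
    }

    /** Reads the value of the current document through its ordinal. */
    long value() {
        return values[index];
    }

    public static void main(String[] args) {
        SparseValuesSketch dv =
            new SparseValuesSketch(new int[] {2, 5, 9}, new long[] {20L, 50L, 90L});
        for (int doc = 0; doc < 10; doc++) {
            if (dv.advanceExact(doc)) {
                System.out.println("doc " + doc + " -> ordinal " + dv.index() + ", value " + dv.value());
            }
        }
    }
}
```

Only three values are stored for ten documents here; the ordinal is what maps a doc ID to its slot in the dense values array.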
-
cost
public long cost()
Description copied from class: DocIdSetIterator
Returns the estimated cost of this DocIdSetIterator.
This is generally an upper bound of the number of documents this iterator might match, but may be a rough heuristic, hardcoded value, or otherwise completely inaccurate.
- Specified by:
cost in class DocIdSetIterator
-
-