it.unimi.dsi.mg4j.tool
Class Merge

java.lang.Object
  extended by it.unimi.dsi.mg4j.tool.Combine
      extended by it.unimi.dsi.mg4j.tool.Merge

public class Merge
extends Combine

Merges several indices.

This class merges indices by performing a simple ordered list merge of their inverted lists. A document that appears in more than one input index causes an error.
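
The ordered list merge described above can be illustrated with a minimal, self-contained sketch. It is not the code of this class: the name OrderedMergeSketch, the use of java.util.PriorityQueue, and the representation of each inverted list as a strictly increasing int[] of nonnegative document pointers are all assumptions made for illustration. The sketch only shows the general technique, including the error raised when the same document shows up in two lists.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration; not part of MG4J.
final class OrderedMergeSketch {

    /**
     * Merges strictly increasing lists of nonnegative document pointers.
     * A pointer that occurs in two lists is reported as an error.
     */
    public static int[] merge(int[][] lists) {
        // Each queue entry is { current pointer, list number, offset in that list }.
        PriorityQueue<int[]> queue =
                new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        for (int i = 0; i < lists.length; i++)
            if (lists[i].length > 0) queue.add(new int[] { lists[i][0], i, 0 });

        List<Integer> merged = new ArrayList<>();
        int last = -1; // Assumption: pointers are nonnegative, so -1 means "none yet".
        while (!queue.isEmpty()) {
            int[] top = queue.poll();
            // The smallest pointer equals the previous one only if two lists share it.
            if (top[0] == last)
                throw new IllegalStateException("Document " + top[0] + " appears in two indices");
            last = top[0];
            merged.add(top[0]);
            // Advance the list the pointer came from and re-enqueue it if not exhausted.
            int next = top[2] + 1;
            if (next < lists[top[1]].length)
                queue.add(new int[] { lists[top[1]][next], top[1], next });
        }
        return merged.stream().mapToInt(Integer::intValue).toArray();
    }
}
```

Under these assumptions, merge(new int[][] { { 0, 3, 7 }, { 1, 4 }, { 2, 5, 6 } }) returns the pointers 0 through 7 in increasing order, while merge(new int[][] { { 1, 2 }, { 2, 3 } }) throws an IllegalStateException because document 2 occurs in both lists.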

Since:
1.0
Author:
Sebastiano Vigna

Nested Class Summary
 
Nested classes/interfaces inherited from class it.unimi.dsi.mg4j.tool.Combine
Combine.GammaCodedIntIterator
 
Field Summary
protected  int[] doc
          The reference array of the document queue.
protected  IntHeapSemiIndirectPriorityQueue documentQueue
          The queue containing document pointers (for remapped indices).
 
Fields inherited from class it.unimi.dsi.mg4j.tool.Combine
DEFAULT_BUFFER_SIZE, frequency, hasCounts, hasPayloads, hasPositions, index, indexIterator, indexReader, indexWriter, inputBasename, maxCount, numberOfDocuments, numberOfOccurrences, numIndices, position, size, termQueue, usedIndex
 
Constructor Summary
Merge(String outputBasename, String[] inputBasename, boolean metadataOnly, int bufferSize, Map<CompressionFlags.Component,CompressionFlags.Coding> writerFlags, boolean interleaved, boolean skips, int quantum, int height, int skipBufferSize, long logInterval)
           
 
Method Summary
protected  int combine(int numUsedIndices)
          Combines several indices.
protected  int combineNumberOfDocuments()
          Combines the number of documents.
protected  int combineSizes()
          Combines size lists.
static void main(String[] arg)
           
 
Methods inherited from class it.unimi.dsi.mg4j.tool.Combine
getIndex, main, run, sizes
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

doc

protected int[] doc
The reference array of the document queue.


documentQueue

protected IntHeapSemiIndirectPriorityQueue documentQueue
The queue containing document pointers (for remapped indices).

Constructor Detail

Merge

public Merge(String outputBasename,
             String[] inputBasename,
             boolean metadataOnly,
             int bufferSize,
             Map<CompressionFlags.Component,CompressionFlags.Coding> writerFlags,
             boolean interleaved,
             boolean skips,
             int quantum,
             int height,
             int skipBufferSize,
             long logInterval)
      throws IOException,
             ConfigurationException,
             URISyntaxException,
             ClassNotFoundException,
             SecurityException,
             InstantiationException,
             IllegalAccessException,
             InvocationTargetException,
             NoSuchMethodException
Throws:
IOException
ConfigurationException
URISyntaxException
ClassNotFoundException
SecurityException
InstantiationException
IllegalAccessException
InvocationTargetException
NoSuchMethodException
Method Detail

combineNumberOfDocuments

protected int combineNumberOfDocuments()
Description copied from class: Combine
Combines the number of documents.

Specified by:
combineNumberOfDocuments in class Combine
Returns:
the number of documents of the combined index.

combineSizes

protected int combineSizes()
                    throws IOException
Description copied from class: Combine
Combines size lists.

Specified by:
combineSizes in class Combine
Returns:
the maximum size of a document in the combined index.
Throws:
IOException

combine

protected int combine(int numUsedIndices)
               throws IOException
Description copied from class: Combine
Combines several indices.

When this method is called, exactly numUsedIndices entries of Combine.usedIndex contain, in increasing order, the numbers of the indices that contain an inverted list for the current term. Implementations of this method must combine those inverted lists, save the total global count for the current term, and return the resulting frequency.

Specified by:
combine in class Combine
Parameters:
numUsedIndices - the number of valid entries in Combine.usedIndex.
Returns:
the frequency of the combined lists.
Throws:
IOException
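
The contract of combine can be pictured with a small standalone sketch. Everything in it is hypothetical: the class CombineContractSketch, the static field globalCount standing in for "the total global count", and the plain arrays pointers and counts replacing the real index readers and writers. It only shows how the first numUsedIndices entries of a usedIndex array select the lists to merge, and how the frequency (the number of documents in the combined list) and the summed count fall out of the merge.

```java
import java.util.PriorityQueue;

// Hypothetical illustration of the combine contract; not the MG4J implementation.
final class CombineContractSketch {

    /** Stand-in for the total global count saved for the current term. */
    public static long globalCount;

    /**
     * pointers[i] and counts[i] hold the postings of index i for the current term:
     * strictly increasing nonnegative document pointers and their occurrence counts.
     * Only the first numUsedIndices entries of usedIndex are valid.
     */
    public static int combine(int[][] pointers, int[][] counts,
                              int[] usedIndex, int numUsedIndices) {
        // Each queue entry is { current pointer, index number, offset in that list }.
        PriorityQueue<int[]> queue =
                new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        for (int k = 0; k < numUsedIndices; k++) {
            int i = usedIndex[k];
            if (pointers[i].length > 0) queue.add(new int[] { pointers[i][0], i, 0 });
        }

        int frequency = 0;
        long total = 0;
        int last = -1; // Assumption: pointers are nonnegative.
        while (!queue.isEmpty()) {
            int[] top = queue.poll();
            if (top[0] == last)
                throw new IllegalStateException("Document " + top[0] + " appears in two indices");
            last = top[0];
            frequency++;                      // One more document contains the term.
            total += counts[top[1]][top[2]];  // Accumulate occurrences of the term.
            int next = top[2] + 1;
            if (next < pointers[top[1]].length)
                queue.add(new int[] { pointers[top[1]][next], top[1], next });
        }
        globalCount = total;
        return frequency;
    }
}
```

For example, with pointers { { 0, 4 }, { 1 }, { 2, 3 } }, counts { { 2, 1 }, { 5 }, { 1, 1 } }, usedIndex { 0, 2, 0 } and numUsedIndices equal to 2, only indices 0 and 2 take part: the sketch returns a frequency of 4 and leaves 5 in globalCount.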

main

public static void main(String[] arg)
                 throws ConfigurationException,
                        SecurityException,
                        com.martiansoftware.jsap.JSAPException,
                        IOException,
                        URISyntaxException,
                        ClassNotFoundException,
                        InstantiationException,
                        IllegalAccessException,
                        InvocationTargetException,
                        NoSuchMethodException
Throws:
ConfigurationException
SecurityException
com.martiansoftware.jsap.JSAPException
IOException
URISyntaxException
ClassNotFoundException
InstantiationException
IllegalAccessException
InvocationTargetException
NoSuchMethodException