it.unimi.dsi.mg4j.tool
Class Occurrence

java.lang.Object
  extended by it.unimi.dsi.mg4j.tool.Occurrence
All Implemented Interfaces:
Comparable

public class Occurrence
extends Object
implements Comparable

A class denoting an occurrence.

An instance of this class stores a single occurrence: the index of the term, the index of the document, and the position (starting from 0) of the occurrence within the document.

Since:
0.6
Author:
Sebastiano Vigna

Field Summary
 int docIndex
          The document index.
 int docPosition
          The position of this occurrence of term termIndex in document docIndex.
 int termIndex
          The term index.
 
Constructor Summary
Occurrence()
          Creates a new occurrence with all fields initialised to zero.
Occurrence(int termIndex, int docIndex, int docPosition)
          Creates a new occurrence with given indices.
 
Method Summary
 int compareTo(Object o)
          Compares this occurrence with another object.
static void countSortOnDocuments(Occurrence[] in, Occurrence[] out, int len, int[] count, int n)
          Performs a distribution-counting sort over a vector of occurrences using only docIndex as key.
static void countSortOnTerms(Occurrence[] in, Occurrence[] out, int len, int[] count, int n)
          Performs a distribution-counting sort over a vector of occurrences using only termIndex as key.
 boolean equals(Object o)
           
 int hashCode()
           
static int readOccurrences(Occurrence[] occurrence, int len, InputBitStream in)
          Reads a compressed stream of occurrences into a vector.
 void set(int termIndex, int docIndex, int docPosition)
          Sets the fields of this occurrence.
 String toString()
           
static void writeOccurrences(Occurrence[] occurrence, int len, OutputBitStream out)
          Writes in compressed form a vector of sorted occurrences.
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Field Detail

termIndex

public int termIndex
The term index.


docIndex

public int docIndex
The document index.


docPosition

public int docPosition
The position of this occurrence of term termIndex in document docIndex.

Constructor Detail

Occurrence

public Occurrence()
Creates a new occurrence with all fields initialised to zero.


Occurrence

public Occurrence(int termIndex,
                  int docIndex,
                  int docPosition)
Creates a new occurrence with given indices.

Parameters:
termIndex - the term index.
docIndex - the document index.
docPosition - the position of the occurrence in the document.
Method Detail

set

public void set(int termIndex,
                int docIndex,
                int docPosition)
Sets the fields of this occurrence.

Parameters:
termIndex - the term index.
docIndex - the document index.
docPosition - the position of the occurrence in the document.

hashCode

public int hashCode()

equals

public boolean equals(Object o)

compareTo

public int compareTo(Object o)
Compares this occurrence with another object.

Comparison between occurrences is lexicographical w.r.t. termIndex, docIndex and docPosition, in this order.

Specified by:
compareTo in interface Comparable
Parameters:
o - an occurrence.
Returns:
a negative integer, zero, or a positive integer as this occurrence is less than, equal to, or greater than the specified occurrence.
Throws:
ClassCastException - if the argument is not an Occurrence.
See Also:
compareTo(Object)
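
The lexicographical order described above can be illustrated with a minimal sketch. Occ below is a hypothetical stand-in written for this example, not the MG4J source; it uses a generic Comparable<Occ>, whereas the documented method takes an Object. Only the field names and the comparison order (termIndex, then docIndex, then docPosition) are taken from this page.

```java
import java.util.Arrays;

// Hypothetical stand-in for Occurrence (an assumption, not the MG4J class).
final class Occ implements Comparable<Occ> {
    final int termIndex;
    final int docIndex;
    final int docPosition;

    Occ(int termIndex, int docIndex, int docPosition) {
        this.termIndex = termIndex;
        this.docIndex = docIndex;
        this.docPosition = docPosition;
    }

    // Compare by termIndex first, then docIndex, then docPosition.
    @Override
    public int compareTo(Occ o) {
        if (termIndex != o.termIndex) return Integer.compare(termIndex, o.termIndex);
        if (docIndex != o.docIndex) return Integer.compare(docIndex, o.docIndex);
        return Integer.compare(docPosition, o.docPosition);
    }

    @Override
    public String toString() {
        return "(" + termIndex + ", " + docIndex + ", " + docPosition + ")";
    }
}

class LexicographicOrderSketch {
    public static void main(String[] args) {
        Occ[] occurrences = {
            new Occ(1, 2, 3), new Occ(0, 5, 1), new Occ(1, 2, 0), new Occ(1, 0, 7)
        };
        // Arrays.sort uses compareTo, so the result is grouped by term,
        // then by document, then by position within the document.
        Arrays.sort(occurrences);
        System.out.println(Arrays.toString(occurrences));
    }
}
```

Under this order, sorting an array of occurrences groups all occurrences of a term together, with documents and positions ascending inside each group.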

toString

public String toString()

countSortOnTerms

public static void countSortOnTerms(Occurrence[] in,
                                    Occurrence[] out,
                                    int len,
                                    int[] count,
                                    int n)
Performs a distribution-counting sort over a vector of occurrences using only termIndex as key.

Parameters:
in - a vector of occurrences to be sorted.
out - a vector to store the sorted permutation of in; its length must be at least len.
len - the number of valid occurrences in in.
count - a vector to perform the counting; its length must be greater than n.
n - the number of keys (terms).
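
As an illustration of the technique, here is a minimal sketch of a distribution-counting sort keyed on termIndex. It is one common textbook formulation and not necessarily how MG4J implements the method. KeyedOcc and CountSortSketch are hypothetical names introduced for this example; the parameter list mirrors the signature above, and the sketch assumes every termIndex lies in the range 0 to n - 1 and that count has at least n + 1 elements.

```java
import java.util.Arrays;

// Hypothetical stand-in for Occurrence (an assumption, not the MG4J class).
final class KeyedOcc {
    final int termIndex;
    final int docIndex;
    final int docPosition;

    KeyedOcc(int termIndex, int docIndex, int docPosition) {
        this.termIndex = termIndex;
        this.docIndex = docIndex;
        this.docPosition = docPosition;
    }
}

final class CountSortSketch {
    // Sorts in[0..len) into out[0..len) by termIndex only.
    // Assumes 0 <= termIndex < n and count.length > n.
    static void countSortOnTerms(KeyedOcc[] in, KeyedOcc[] out, int len, int[] count, int n) {
        // Reset the counters this sketch uses.
        Arrays.fill(count, 0, n + 1, 0);
        // count[k + 1] becomes the number of occurrences with key k.
        for (int i = 0; i < len; i++) count[in[i].termIndex + 1]++;
        // Prefix sums: count[k] becomes the first output slot for key k.
        for (int k = 0; k < n; k++) count[k + 1] += count[k];
        // Distribute in input order, which keeps the sort stable.
        for (int i = 0; i < len; i++) out[count[in[i].termIndex]++] = in[i];
    }
}
```

Because elements are distributed in input order, occurrences with the same termIndex keep their relative order. The running time is linear in len + n, which is why the key range n is passed explicitly.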

countSortOnDocuments

public static void countSortOnDocuments(Occurrence[] in,
                                        Occurrence[] out,
                                        int len,
                                        int[] count,
                                        int n)
Performs a distribution-counting sort over a vector of occurrences using only docIndex as key.

Parameters:
in - a vector of occurrences to be sorted.
out - a vector to store the sorted permutation of in; its length must be at least len.
len - the number of valid occurrences in in.
count - a vector to perform the counting; its length must be greater than n.
n - the number of keys (documents).

writeOccurrences

public static void writeOccurrences(Occurrence[] occurrence,
                                    int len,
                                    OutputBitStream out)
                             throws IOException
Writes in compressed form a vector of sorted occurrences.

Parameters:
occurrence - a vector of occurrences.
len - the number of valid occurrences in occurrence.
out - an already opened bit stream where the output will be sent.
Throws:
IOException

readOccurrences

public static int readOccurrences(Occurrence[] occurrence,
                                  int len,
                                  InputBitStream in)
                           throws IOException
Reads a compressed stream of occurrences into a vector.

Parameters:
occurrence - a vector of occurrences.
len - the maximum number of occurrences to be read.
in - an already opened input bit stream.
Returns:
the number of occurrences actually read.
Throws:
IOException