java.lang.Object
it.unimi.dsi.mg4j.index.AbstractTermMap
it.unimi.dsi.mg4j.util.MinimalPerfectHash
it.unimi.dsi.mg4j.util.SignedMinimalPerfectHash
@Deprecated public abstract class SignedMinimalPerfectHash
Signed order-preserving minimal perfect hash tables.
Minimal perfect hash tables always return a result, even for terms that were not present in the collection indexed by the table. Sometimes you may prefer to single out such terms.
For this purpose, MG4J provides signed minimal perfect hash tables. In a signed table, every term in the collection gets a signature that is used to detect false positives. Signatures range from the simple hashcode-based signatures provided by the HashCodeSignedMinimalPerfectHash class, to sophisticated cryptographic signatures, to (at the other extreme) a class that actually stores the terms (and thus completely avoids false positives), such as LiterallySignedMinimalPerfectHash.
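The difference between an unsigned and a signed lookup can be sketched as follows. This is a toy illustration, not MG4J code: the class `SignedLookupSketch`, its methods, and the use of `hashCode() mod n` as a stand-in for the real minimal perfect hash are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;

/** Toy stand-in for a signed table; none of this is the MG4J implementation. */
final class SignedLookupSketch {
    private final int n;
    private final int[] signature; // one hashcode-based signature per slot

    SignedLookupSketch(List<String> terms) {
        n = terms.size();
        signature = new int[n];
        boolean[] used = new boolean[n];
        for (String t : terms) {
            int slot = unsigned(t);
            // The toy hash is only usable if it happens to be collision-free.
            if (used[slot]) throw new IllegalArgumentException("toy hash collides on " + t);
            used[slot] = true;
            signature[slot] = t.hashCode();
        }
    }

    /** Unsigned behaviour: every input gets some slot, whether it was indexed or not. */
    int unsigned(CharSequence term) {
        return Math.floorMod(term.toString().hashCode(), n);
    }

    /** Signed behaviour: -1 unless the signature stored for the slot matches. */
    int signed(CharSequence term) {
        int slot = unsigned(term);
        return signature[slot] == term.toString().hashCode() ? slot : -1;
    }

    public static void main(String[] args) {
        SignedLookupSketch table = new SignedLookupSketch(Arrays.asList("a", "b", "c"));
        System.out.println(table.unsigned("d")); // prints 1, the slot of "a"
        System.out.println(table.signed("d"));   // prints -1
    }
}
```

A hashcode-based signature still lets some false positives through: `"Aa"` and `"BB"` share the same `String.hashCode()`, so a table built from `"Aa"` alone would accept `"BB"`. Storing the terms themselves removes that possibility at the cost of space.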
A signed table extends this class and provides an initSignatures(Iterable) method that sets up the necessary data structures, and a checkSignature(CharSequence,int) method that checks a given character sequence against the signature stored for the term with the given index. A byte-array variant, checkSignature(byte[],int,int,int), must be implemented as well.
It is good practice, of course, to replicate the constructors of this class in all implementing subclasses (by simply invoking super with the same arguments). Moreover, to be useful, classes extending this class must be serialisable.
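The contract described above (implement the two hooks, replicate the constructor, stay serialisable) might look like the sketch below. To keep it runnable without MG4J, it uses a hypothetical stand-in base class, `SignedBaseSketch`, whose toy hash is `hashCode() mod n`; only the method names `initSignatures` and `checkSignature` come from this page.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.List;

/** Hypothetical stand-in for the abstract base; NOT the MG4J class. */
abstract class SignedBaseSketch implements Serializable {
    private static final long serialVersionUID = 1L;
    protected final int n;

    protected SignedBaseSketch(List<? extends CharSequence> terms) {
        n = terms.size();
        initSignatures(terms); // mirrors the documented constructor behaviour
    }

    /** Toy replacement for the real hash; assumed collision-free on the terms. */
    final int toyHash(CharSequence term) {
        return Math.floorMod(term.toString().hashCode(), n);
    }

    public int getNumber(CharSequence term) {
        int index = toyHash(term);
        return checkSignature(term, index) ? index : -1;
    }

    protected abstract void initSignatures(Iterable<? extends CharSequence> terms);
    protected abstract boolean checkSignature(CharSequence term, int index);
}

final class HashCodeSignedSketch extends SignedBaseSketch {
    private static final long serialVersionUID = 1L;
    private int[] signature; // filled in by initSignatures while super(...) runs

    HashCodeSignedSketch(List<? extends CharSequence> terms) {
        super(terms); // replicate the superclass constructor
    }

    @Override protected void initSignatures(Iterable<? extends CharSequence> terms) {
        signature = new int[n];
        for (CharSequence t : terms) signature[toyHash(t)] = t.toString().hashCode();
    }

    @Override protected boolean checkSignature(CharSequence term, int index) {
        return signature[index] == term.toString().hashCode();
    }
}

final class SerialRoundTrip {
    /** Serialises and deserialises an object in memory. */
    static <T extends Serializable> T copy(T obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked") T result = (T) in.readObject();
                return result;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The round trip through `SerialRoundTrip.copy` checks that the signature data travels with the table, which is what makes a stored table usable after it is reloaded.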
Field Summary | | |
---|---|---|
static long | serialVersionUID | Deprecated. |
Fields inherited from class it.unimi.dsi.mg4j.util.MinimalPerfectHash |
---|
ENLARGEMENT_FACTOR, g, init, m, n, n4, NODE_OVERHEAD, rightShift, t, TERM_THRESHOLD, WEIGHT_UNKNOWN, WEIGHT_UNKNOWN_SORTED_TERMS, weight0, weight1, weight2, weightLength |
Constructor Summary | |
---|---|
SignedMinimalPerfectHash(Iterable<? extends CharSequence> terms) | Deprecated. Creates a new signed order-preserving minimal perfect hash table for the given terms, using as many weights as the longest term in the collection. |
SignedMinimalPerfectHash(Iterable<? extends CharSequence> terms, int weightLength) | Deprecated. Creates a new signed order-preserving minimal perfect hash table for the given terms using the given number of weights. |
SignedMinimalPerfectHash(String termFile, String encoding) | Deprecated. Creates a new signed order-preserving minimal perfect hash table for the given file of terms. |
SignedMinimalPerfectHash(String termFile, String encoding, boolean zipped) | Deprecated. Creates a new signed order-preserving minimal perfect hash table for the (possibly gzip'd) given file of terms. |
SignedMinimalPerfectHash(String termFile, String encoding, int weightLength) | Deprecated. Creates a new signed order-preserving minimal perfect hash table for the given file of terms using the given number of weights. |
SignedMinimalPerfectHash(String termFile, String encoding, int weightLength, boolean zipped) | Deprecated. Creates a new signed order-preserving minimal perfect hash table for the (possibly gzip'd) given file of terms using the given number of weights. |
Method Summary | | |
---|---|---|
MinimalPerfectHash | asUnsigned() | Deprecated. Returns an unsigned view of this signed minimal perfect hash. |
protected abstract boolean | checkSignature(byte[] a, int off, int len, int index) | Deprecated. Checks a signature against a byte-array fragment. |
protected abstract boolean | checkSignature(CharSequence term, int index) | Deprecated. Checks a signature against a character sequence. |
int | getNumber(byte[] a, int off, int len) | Deprecated. Hashes a term given as a byte-array fragment interpreted in the ISO-8859-1 charset encoding. |
int | getNumber(CharSequence term) | Deprecated. Hashes a given term. |
int | getNumber(MutableString term) | Deprecated. Hashes a given term. |
protected abstract void | initSignatures(Iterable<? extends CharSequence> terms) | Deprecated. Sets up the signature system from a collection. |
static void | main(String[] arg) | Deprecated. |
Methods inherited from class it.unimi.dsi.mg4j.util.MinimalPerfectHash |
---|
getFromT, getNumber, hash, hasTerms, main, size, weightLength |
Methods inherited from class it.unimi.dsi.mg4j.index.AbstractTermMap |
---|
getIndex, getTerm, getTerm |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final long serialVersionUID
Constructor Detail |
---|
public SignedMinimalPerfectHash(Iterable<? extends CharSequence> terms)
After calling the corresponding constructor of MinimalPerfectHash, this constructor will invoke initSignatures(Iterable).
Parameters:
terms - some terms to hash; it is assumed that they do not contain duplicates.
See Also:
MinimalPerfectHash.MinimalPerfectHash(Iterable)
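Because the constructor invokes initSignatures(Iterable) before the subclass constructor body runs, Java's initialisation order matters for implementers: instance field initialisers of the subclass execute only after the superclass constructor returns. The sketch below illustrates this general language rule with hypothetical classes (`BaseCallsHookSketch`, `LosesState`, `KeepsState`); it does not reproduce MG4J code.

```java
/** Hypothetical base whose constructor calls an overridable hook. */
abstract class BaseCallsHookSketch {
    BaseCallsHookSketch() {
        initHook(); // plays the role of initSignatures(Iterable)
    }

    abstract void initHook();
}

final class LosesState extends BaseCallsHookSketch {
    // This initialiser runs AFTER the superclass constructor has returned,
    // so it overwrites whatever initHook() stored.
    int[] signature = new int[0];

    @Override void initHook() {
        signature = new int[] { 42 };
    }
}

final class KeepsState extends BaseCallsHookSketch {
    // No initialiser: the array stored by initHook() survives construction.
    int[] signature;

    @Override void initHook() {
        signature = new int[] { 42 };
    }
}

final class InitOrderDemo {
    public static void main(String[] args) {
        System.out.println(new LosesState().signature.length); // prints 0
        System.out.println(new KeepsState().signature.length); // prints 1
    }
}
```

In practice this means a subclass should declare its signature fields without initialisers and assign them inside initSignatures.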
public SignedMinimalPerfectHash(Iterable<? extends CharSequence> terms, int weightLength)
After calling the corresponding constructor of MinimalPerfectHash, this constructor will invoke initSignatures(Iterable).
Parameters:
terms - some terms to hash; it is assumed that no terms share a common prefix of weightLength characters.
weightLength - the number of weights used to generate the intermediate hash functions.
See Also:
MinimalPerfectHash.MinimalPerfectHash(Iterable, int)
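The prefix assumption can be checked before building a table. The helper below is a sketch under one reading of that assumption: two terms clash when their first weightLength characters coincide, and a term shorter than weightLength is compared whole. How the library itself treats short terms is not stated on this page, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class PrefixCheckSketch {
    /**
     * Returns the first term whose leading weightLength characters repeat
     * those of an earlier term, or null if no such term exists.
     */
    static String firstPrefixClash(Iterable<? extends CharSequence> terms, int weightLength) {
        Set<String> seen = new HashSet<>();
        for (CharSequence t : terms) {
            int end = Math.min(t.length(), weightLength);
            String prefix = t.subSequence(0, end).toString();
            if (!seen.add(prefix)) return t.toString();
        }
        return null;
    }

    public static void main(String[] args) {
        Iterable<String> terms = Arrays.asList("index", "indexing", "inverse");
        System.out.println(firstPrefixClash(terms, 5)); // prints indexing
        System.out.println(firstPrefixClash(terms, 8)); // prints null
    }
}
```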
public SignedMinimalPerfectHash(String termFile, String encoding, int weightLength, boolean zipped)
After calling the corresponding constructor of MinimalPerfectHash, this constructor will invoke initSignatures(Iterable).
Parameters:
termFile - a file containing one term on each line; it is assumed that it does not contain terms with a common prefix of weightLength characters.
encoding - the encoding of termFile; if null, it is assumed to be the platform default encoding.
weightLength - the number of weights used to generate the intermediate hash functions.
zipped - if true, the provided file is zipped and will be opened using a GZIPInputStream.
See Also:
MinimalPerfectHash.MinimalPerfectHash(String,String,int,boolean)
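The way the termFile, encoding and zipped parameters combine can be illustrated with plain JDK classes. This is a sketch of the documented behaviour (one term per line, null encoding meaning the platform default, gzip input read through a GZIPInputStream), not the library's own reading code; `TermFileSketch` and `readTerms` are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;

final class TermFileSketch {
    /** Reads one term per line, optionally through a GZIPInputStream. */
    static List<String> readTerms(String termFile, String encoding, boolean zipped)
            throws IOException {
        // A null encoding falls back to the platform default, as documented.
        Charset charset = encoding == null ? Charset.defaultCharset() : Charset.forName(encoding);
        List<String> terms = new ArrayList<>();
        try (InputStream raw = new FileInputStream(termFile);
             InputStream in = zipped ? new GZIPInputStream(raw) : raw;
             BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            for (String line; (line = reader.readLine()) != null; ) {
                terms.add(line);
            }
        }
        return terms;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java TermFileSketch terms.txt.gz
        for (String term : readTerms(args[0], null, args[0].endsWith(".gz"))) {
            System.out.println(term);
        }
    }
}
```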
public SignedMinimalPerfectHash(String termFile, String encoding, boolean zipped)
After calling the corresponding constructor of MinimalPerfectHash, this constructor will invoke initSignatures(Iterable).
Parameters:
termFile - a file containing one term on each line; it is assumed that it does not contain terms with a common prefix of weightLength characters.
encoding - the encoding of termFile; if null, it is assumed to be the platform default encoding.
zipped - if true, the provided file is zipped and will be opened using a GZIPInputStream.
See Also:
MinimalPerfectHash.MinimalPerfectHash(String,String,boolean)
public SignedMinimalPerfectHash(String termFile, String encoding, int weightLength)
After calling the corresponding constructor of MinimalPerfectHash, this constructor will invoke initSignatures(Iterable).
Parameters:
termFile - a file containing one term on each line; it is assumed that it does not contain terms with a common prefix of weightLength characters.
encoding - the encoding of termFile; if null, it is assumed to be the platform default encoding.
weightLength - the number of weights used to generate the intermediate hash functions.
See Also:
MinimalPerfectHash.MinimalPerfectHash(String,String,int)
public SignedMinimalPerfectHash(String termFile, String encoding)
After calling the corresponding constructor of MinimalPerfectHash, this constructor will invoke initSignatures(Iterable).
Parameters:
termFile - a file containing one term on each line; it is assumed that it does not contain terms with a common prefix of weightLength characters.
encoding - the encoding of termFile; if null, it is assumed to be the platform default encoding.
See Also:
MinimalPerfectHash.MinimalPerfectHash(String,String)
Method Detail |
---|
public int getNumber(CharSequence term)
Specified by:
getNumber in interface TermMap
Overrides:
getNumber in class MinimalPerfectHash
Parameters:
term - a term to hash.
public int getNumber(MutableString term)
Overrides:
getNumber in class MinimalPerfectHash
Parameters:
term - a term to hash.
public int getNumber(byte[] a, int off, int len)
Overrides:
getNumber in class MinimalPerfectHash
Parameters:
a - a byte array.
off - the first valid byte in a.
len - the number of bytes composing the term, starting at off.
Returns:
the position of the term composed of the len bytes starting at off (interpreted as ISO-8859-1 characters) in the generating collection, starting from 0, if the term was in the original collection; otherwise, -1.
protected abstract void initSignatures(Iterable<? extends CharSequence> terms)
This abstract method must be overridden by implementing subclasses. It must set up all data structures that are necessary to handle signatures; in particular, it will usually compute signatures for all terms in the given collection.
Parameters:
terms - the collection of terms given to the constructor of this class.
See Also:
HashCodeSignedMinimalPerfectHash.initSignatures(Iterable), LiterallySignedMinimalPerfectHash.initSignatures(Iterable)
protected abstract boolean checkSignature(CharSequence term, int index)
This abstract method must be overridden by implementing subclasses. It must check whether the signature of the given character sequence matches the one stored for the index-th term.
Note that this method and checkSignature(byte[], int, int, int) must be coherent.
Parameters:
term - a character sequence.
index - an integer denoting a term in the indexed collection.
Returns:
true if the signature of the given character sequence matches the one stored for the index-th term.
See Also:
HashCodeSignedMinimalPerfectHash.checkSignature(CharSequence, int), LiterallySignedMinimalPerfectHash.checkSignature(CharSequence,int)
protected abstract boolean checkSignature(byte[] a, int off, int len, int index)
This abstract method must be overridden by implementing subclasses. It must check whether the signature of the given byte-array fragment (interpreted as an ISO-8859-1 string) matches the one stored for the index-th term.
Note that this method and checkSignature(CharSequence, int) must be coherent.
Parameters:
a - a byte array.
off - the first valid byte in a.
len - the number of bytes composing the term, starting at off.
index - an integer denoting a term in the indexed collection.
Returns:
true if the signature of the len bytes starting at off (interpreted as ISO-8859-1 characters) matches the one stored for the index-th term.
See Also:
HashCodeSignedMinimalPerfectHash.checkSignature(CharSequence, int), LiterallySignedMinimalPerfectHash.checkSignature(CharSequence,int)
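Coherence between the two checkSignature variants means that a term must produce the same signature whether it arrives as a character sequence or as ISO-8859-1 bytes. One way to obtain that, sketched here with a hypothetical `CoherentSignatureSketch` class and a signature formula borrowed from `String.hashCode()`, is to widen each byte to its unsigned value, since in ISO-8859-1 the byte value equals the character code.

```java
import java.nio.charset.StandardCharsets;

final class CoherentSignatureSketch {
    /** Signature over characters; same polynomial as String.hashCode(). */
    static int signature(CharSequence term) {
        int h = 0;
        for (int i = 0; i < term.length(); i++) {
            h = 31 * h + term.charAt(i);
        }
        return h;
    }

    /** Signature over a byte-array fragment read as ISO-8859-1. */
    static int signature(byte[] a, int off, int len) {
        int h = 0;
        for (int i = 0; i < len; i++) {
            // Mask to 0..255: Java bytes are signed, ISO-8859-1 code points are not.
            h = 31 * h + (a[off + i] & 0xFF);
        }
        return h;
    }

    public static void main(String[] args) {
        String term = "caf\u00e9";
        byte[] bytes = term.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(signature(term) == signature(bytes, 0, bytes.length)); // prints true
    }
}
```

Without the `& 0xFF` mask, bytes above 0x7F would be sign-extended to negative values and the two variants would disagree on any term containing such characters.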
public MinimalPerfectHash asUnsigned()
public static void main(String[] arg) throws InstantiationException, IllegalAccessException, InvocationTargetException, NoSuchMethodException, IOException, com.martiansoftware.jsap.JSAPException, ClassNotFoundException
Throws:
InstantiationException
IllegalAccessException
InvocationTargetException
NoSuchMethodException
IOException
com.martiansoftware.jsap.JSAPException
ClassNotFoundException