java.lang.Object
  it.unimi.dsi.fastutil.objects.AbstractObjectCollection
    it.unimi.dsi.fastutil.objects.AbstractObjectList
      it.unimi.dsi.mg4j.util.FrontCodedStringList
Compact storage of strings using front-coding compression.
This class stores a list of strings using front-coding compression. The compression is effective only if the list is sorted, but instances of this class can also serve simply as a handy way to manage a large number of strings. It implements an immutable ObjectList that returns the i-th string (as a MutableString) when the get(int) method is called with argument i. The returned mutable string may be freely modified.
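To make the idea concrete, here is a minimal, self-contained sketch of front coding. It is not the mg4j or fastutil implementation: the class FrontCodingSketch, its Entry type, and the methods encode and get are hypothetical names, and the sketch assumes that the ratio is the interval at which a string is stored in full, so a lookup never replays more than ratio - 1 entries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative front coding over a list of strings; all names here are hypothetical. */
final class FrontCodingSketch {

    /** One stored entry: number of leading chars shared with the previous string, plus the remainder. */
    static final class Entry {
        final int prefixLength;
        final String suffix;

        Entry(int prefixLength, String suffix) {
            this.prefixLength = prefixLength;
            this.suffix = suffix;
        }
    }

    /** Length of the longest common prefix of a and b. */
    static int commonPrefixLength(String a, String b) {
        int limit = Math.min(a.length(), b.length());
        int i = 0;
        while (i < limit && a.charAt(i) == b.charAt(i)) i++;
        return i;
    }

    /** Encodes the list; every ratio-th string (the assumed meaning of "ratio") is stored whole. */
    static List<Entry> encode(List<String> strings, int ratio) {
        List<Entry> entries = new ArrayList<>(strings.size());
        for (int i = 0; i < strings.size(); i++) {
            String current = strings.get(i);
            if (i % ratio == 0) {
                entries.add(new Entry(0, current));
            } else {
                int shared = commonPrefixLength(strings.get(i - 1), current);
                entries.add(new Entry(shared, current.substring(shared)));
            }
        }
        return entries;
    }

    /** Rebuilds the string at the given index by replaying entries from the nearest whole string. */
    static String get(List<Entry> entries, int ratio, int index) {
        int start = index - index % ratio;
        StringBuilder sb = new StringBuilder(entries.get(start).suffix);
        for (int i = start + 1; i <= index; i++) {
            Entry e = entries.get(i);
            sb.setLength(e.prefixLength); // keep only the prefix shared with the previous string
            sb.append(e.suffix);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("car", "card", "care", "cart", "cat");
        List<Entry> entries = encode(words, 4);
        for (int i = 0; i < words.size(); i++) {
            Entry e = entries.get(i);
            System.out.println(e.prefixLength + " + \"" + e.suffix + "\" -> " + get(entries, 4, i));
        }
    }
}
```

On a sorted list, neighbouring strings share long prefixes, so most entries shrink to a small count and a short suffix. On an unsorted list the shared prefixes are mostly empty and little is saved.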
As a convenience, this class provides a main method that reads a sequence of newline-separated words from standard input and writes the corresponding serialized front-coded string list.
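The read-then-serialize flow described here can be sketched with JDK classes only. This is a stand-in under stated assumptions, not the actual main method: a plain ArrayList takes the place of the front-coded list, and the names WordListSerializationSketch, readWords, serialize and deserialize are hypothetical.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;

/** Reads newline-separated words and serializes them; an ArrayList stands in for the real list. */
final class WordListSerializationSketch {

    /** Collects one word per line until the reader is exhausted. */
    static ArrayList<String> readWords(Reader in) throws IOException {
        ArrayList<String> words = new ArrayList<>();
        BufferedReader reader = new BufferedReader(in);
        String line;
        while ((line = reader.readLine()) != null) words.add(line);
        return words;
    }

    /** Serializes the list with standard Java serialization. */
    static byte[] serialize(ArrayList<String> words) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(words);
        }
        return bytes.toByteArray();
    }

    /** Reads back a list written by serialize. */
    @SuppressWarnings("unchecked")
    static ArrayList<String> deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (ArrayList<String>) in.readObject();
        }
    }

    public static void main(String[] args) throws IOException {
        // Words come from standard input; the serialized bytes go to standard output.
        ArrayList<String> words = readWords(new InputStreamReader(System.in, StandardCharsets.UTF_8));
        System.out.write(serialize(words));
        System.out.flush();
    }
}
```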
To store the list of strings, we use either a UTF-8 coded ByteArrayFrontCodedList or a CharArrayFrontCodedList, depending on the value of the utf8 parameter at creation time. In the first case, if the strings are ASCII-oriented the resulting array will be much smaller, but access will be considerably slower, as each string must be UTF-8 decoded before being returned.
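The size side of this trade-off can be illustrated with a small JDK-only sketch. It compares only the raw payload of a string stored as UTF-8 bytes against the same string stored as 16-bit chars, ignoring prefix compression and any per-entry overhead. The class Utf8SizeSketch and its methods are hypothetical names, not part of mg4j or fastutil.

```java
import java.nio.charset.StandardCharsets;

/** Compares the raw payload size of a string as UTF-8 bytes and as 16-bit chars. */
final class Utf8SizeSketch {

    /** Number of bytes needed to store s as UTF-8. */
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Number of bytes needed to store s as an array of 16-bit chars. */
    static int charArrayBytes(String s) {
        return 2 * s.length();
    }

    /** The decoding step that a UTF-8 backed store has to perform on every access. */
    static String decode(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // An ASCII word, a word with one accented letter, and a three-character CJK word.
        String[] samples = { "search", "caff\u00e8", "\u65e5\u672c\u8a9e" };
        for (String s : samples) {
            System.out.println(s + ": " + utf8Bytes(s) + " bytes as UTF-8, "
                    + charArrayBytes(s) + " bytes as chars");
        }
    }
}
```

For pure ASCII text the UTF-8 form is half the size of the char form. The advantage narrows with accented letters and reverses for scripts whose characters take three bytes each in UTF-8, which is why the saving is tied to ASCII-oriented strings.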
Field Summary | |
protected ByteArrayFrontCodedList | byteFrontCodedList | The underlying ByteArrayFrontCodedList, or null. |
protected CharArrayFrontCodedList | charFrontCodedList | The underlying CharArrayFrontCodedList, or null. |
protected ObjectList | frontCodedList | The underlying front-coded list (either a ByteArrayFrontCodedList or a CharArrayFrontCodedList, depending on the value of utf8). |
static long | serialVersionUID | |
protected boolean | utf8 | Whether this front-coded list is UTF-8 encoded. |
Constructor Summary | |
FrontCodedStringList(Collection c, int ratio, boolean utf8) | Creates a new front-coded string list containing the character sequences contained in the given collection. |
FrontCodedStringList(Iterator words, int ratio, boolean utf8) | Creates a new front-coded string list containing the character sequences returned by the given iterator. |
Method Summary | |
Object | get(int index) | Returns the element at the specified position in this front-coded list as a mutable string. |
void | get(int index, MutableString s) | Returns the element at the specified position in this front-coded list by storing it in a mutable string. |
static void | main(String[] arg) | |
ObjectListIterator | objectListIterator(int k) | |
int | ratio() | Returns the ratio of the underlying front-coded list. |
int | size() | |
boolean | utf8() | Returns whether this front-coded string list is storing its strings as UTF-8 encoded bytes. |
Methods inherited from class it.unimi.dsi.fastutil.objects.AbstractObjectList |
add, addAll, addAll, addAll, addAll, addAll, addAll, addElements, addElements, compareTo, contains, ensureIndex, ensureRestrictedIndex, equals, getElements, hashCode, indexOf, lastIndexOf, listIterator, listIterator, objectIterator, objectListIterator, objectSubList, peek, pop, push, rem, remove, removeElements, set, size, subList, top, toString |
Methods inherited from class it.unimi.dsi.fastutil.objects.AbstractObjectCollection |
add, clear, containsAll, isEmpty, iterator, remove, removeAll, retainAll, toArray, toArray |
Methods inherited from class java.lang.Object |
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Methods inherited from interface java.util.List |
add, clear, containsAll, isEmpty, iterator, remove, removeAll, retainAll, toArray, toArray |
Methods inherited from interface it.unimi.dsi.fastutil.Stack |
isEmpty |
Field Detail |
public static final long serialVersionUID
protected ObjectList frontCodedList
The underlying front-coded list (either a ByteArrayFrontCodedList or a CharArrayFrontCodedList, depending on the value of utf8).
protected transient ByteArrayFrontCodedList byteFrontCodedList
The underlying ByteArrayFrontCodedList, or null.
protected transient CharArrayFrontCodedList charFrontCodedList
The underlying CharArrayFrontCodedList, or null.
protected boolean utf8
Whether this front-coded list is UTF-8 encoded.
Constructor Detail |
public FrontCodedStringList(Iterator words, int ratio, boolean utf8)
Parameters:
words - an iterator returning character sequences.
ratio - the desired ratio.
utf8 - if true, the strings will be stored as UTF-8 byte arrays.
public FrontCodedStringList(Collection c, int ratio, boolean utf8)
Parameters:
c - a collection containing character sequences.
ratio - the desired ratio.
utf8 - if true, the strings will be stored as UTF-8 byte arrays.
Method Detail |
public boolean utf8()
public int ratio()
public Object get(int index)
Specified by: get in interface List
Parameters:
index - an index in the list.
Returns: a MutableString that will contain the string at the specified position. The string may be freely modified.
public void get(int index, MutableString s)
Parameters:
index - an index in the list.
s - a mutable string that will contain the string at the specified position.
public ObjectListIterator objectListIterator(int k)
Specified by: objectListIterator in interface ObjectList
public int size()
Specified by: size in interface List
public static void main(String[] arg) throws FileNotFoundException, IOException
Throws:
FileNotFoundException
IOException