ibis::bord::column Class Reference

An in-memory version of ibis::column.

#include <bord.h>

Inherits ibis::column.

Public Member Functions

void addCounts (uint32_t)
 Extend the buffer to have nr elements.
virtual long append (const char *dt, const char *df, const uint32_t nold, const uint32_t nnew, uint32_t nbuf, char *buf)
 Append new data in directory df to the end of existing data in dt.
virtual long append (const void *vals, const ibis::bitvector &msk)
 Append user supplied data to the current column.
virtual long append (const ibis::column &scol, const ibis::bitvector &msk)
 Append selected values from the given column to the current column.
virtual long append (const ibis::column &scol, const ibis::qContinuousRange &cnd)
 Append selected values from the given column to the current column.
void append (const void *, uint32_t)
 Append a value.
void append (const void *, uint32_t, const void *, uint32_t, ibis::selectClause::AGREGADO)
 Append the value generated from the operation on the incoming columns.
 column (const ibis::bord *tbl, ibis::TYPE_T t, const char *name, void *buf=0, const char *desc="", double low=DBL_MAX, double high=-DBL_MAX)
 Constructor.
 column (const ibis::bord *, const ibis::column &, void *buf)
 Constructor.
 column (const column &rhs)
 Copy constructor. Performs a shallow copy of the storage buffer.
virtual void computeMinMax ()
 Compute the actual min/max values.
virtual void computeMinMax (const char *dir)
 Compute the actual min/max values.
virtual void computeMinMax (const char *, double &min, double &max) const
 Compute the actual min/max of the data in directory dir.
int dump (std::ostream &out, uint32_t i) const
bool equal_to (const column &) const
 Does this column have the same values as the other?
bool equal_to (const column &, uint32_t, uint32_t) const
 Is the ith value of this column equal to the jth value of other?
virtual long evaluateRange (const ibis::qContinuousRange &cmp, const ibis::bitvector &mask, ibis::bitvector &res) const
 Compute the exact answer.
virtual long evaluateRange (const ibis::qDiscreteRange &cmp, const ibis::bitvector &mask, ibis::bitvector &res) const
 Compute the exact answer to a discrete range expression.
void *& getArray ()
void * getArray () const
const ibis::dictionary * getDictionary () const
 Return the dictionary associated with the column.
virtual ibis::fileManager::storage * getRawData () const
 Retrieve the raw data buffer as an ibis::fileManager::storage.
virtual void getString (uint32_t i, std::string &val) const
 Return the string at the ith row.
virtual int getValuesArray (void *vals) const
 Makes a copy of the in-memory data.
virtual long keywordSearch (const char *, ibis::bitvector &) const
 Find the given keyword and return the rows.
virtual long keywordSearch (const char *) const
 Return an upper bound on the number of matches.
bool less_than (const column &, uint32_t, uint32_t) const
 Is the ith value of this column less than the jth value of other?
int limit (uint32_t nr)
virtual long patternSearch (const char *) const
 Compute an estimate of the maximum number of possible matches.
virtual long patternSearch (const char *, ibis::bitvector &) const
int restoreCategoriesAsStrings (const ibis::category &)
 Convert the integer representation back to the string representation.
void reverseRows ()
virtual array_t< signed char > * selectBytes (const ibis::bitvector &) const
 Retrieve selected 1-byte integer values.
virtual array_t< double > * selectDoubles (const ibis::bitvector &) const
 Put the selected values into an array as doubles.
virtual array_t< float > * selectFloats (const ibis::bitvector &) const
 Put selected values of a float column into an array.
virtual array_t< int32_t > * selectInts (const ibis::bitvector &) const
 Return selected rows of the column in an array_t object.
virtual array_t< int64_t > * selectLongs (const ibis::bitvector &) const
 Can be called on all integral types.
virtual array_t< int16_t > * selectShorts (const ibis::bitvector &) const
 Return selected rows of the column in an array_t object.
virtual std::vector< std::string > * selectStrings (const bitvector &mask) const
 Output the selected values as strings.
virtual array_t< unsigned char > * selectUBytes (const ibis::bitvector &) const
 Return selected rows of the column in an array_t object.
virtual array_t< uint32_t > * selectUInts (const ibis::bitvector &) const
 Can be called on columns of unsigned integral types, UINT, CATEGORY, USHORT, and UBYTE.
virtual array_t< uint64_t > * selectULongs (const ibis::bitvector &) const
 Return selected rows of the column in an array_t object.
virtual array_t< uint16_t > * selectUShorts (const ibis::bitvector &) const
 Return selected rows of the column in an array_t object.
void setDictionary (const ibis::dictionary *d)
 Assign the dictionary to use.
virtual long stringSearch (const char *, ibis::bitvector &) const
 Locate the strings that match the given string.
virtual long stringSearch (const std::vector< std::string > &, ibis::bitvector &) const
virtual long stringSearch (const char *) const
 Compute an estimate of the maximum number of possible matches.
virtual long stringSearch (const std::vector< std::string > &) const
 Compute an estimate of the maximum number of possible matches.

Static Public Member Functions

template<typename T >
static int addIncoreData (array_t< T > *&to, uint32_t nold, const array_t< T > &from, const T special)
 Append new data (in from) to a larger array (pointed to by to).
static int addStrings (std::vector< std::string > *&, uint32_t, const std::vector< std::string > &)

Protected Member Functions

column & operator= (const column &)

Protected Attributes

void * buffer
 The in-memory storage.
const ibis::dictionary * dic
 A dictionary.

Detailed Description

An in-memory version of ibis::column.

For integers and floating-point values, the buffer (with type void*) points to an ibis::array_t<T> where the type T is designated by the column type. For a string-valued column, the buffer (with type void*) is std::vector<std::string>*.

Note:
Since the in-memory data tables are typically created at run-time through select operations, the data type associated with a column is only known at run-time. Casting to void* is an ugly option; the developers welcome suggestions for a replacement.
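
The type-erased buffer described above can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in, not FastBit code: ColType replaces ibis::TYPE_T, std::vector<T> replaces ibis::array_t<T>, and ErasedColumn and rowCount are names invented for this example. It only shows the pattern of recovering the concrete container from a void* by switching on a run-time type tag.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins: a tiny type tag and a column that stores a void*.
enum class ColType { Int, Double, Text };

struct ErasedColumn {
    ColType type;   // decided at run time
    void*   buffer; // std::vector<T>* for numbers, std::vector<std::string>* for text
};

// Recover the element count by casting the buffer according to the type tag.
std::size_t rowCount(const ErasedColumn& c) {
    if (c.buffer == nullptr) return 0;
    switch (c.type) {
    case ColType::Int:
        return static_cast<const std::vector<int>*>(c.buffer)->size();
    case ColType::Double:
        return static_cast<const std::vector<double>*>(c.buffer)->size();
    case ColType::Text:
        return static_cast<const std::vector<std::string>*>(c.buffer)->size();
    }
    return 0;
}
```

Every operation on such a column repeats this switch, which is the cost of the void* design. A tagged union such as std::variant would be one possible replacement, though that is a suggestion of this sketch and not something the source proposes.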

Constructor & Destructor Documentation

ibis::bord::column::column ( const ibis::bord *  tbl,
const ibis::column &  old,
void *  st 
)

Constructor.

Note:
Transfer the ownership of st to the new column object.

Member Function Documentation

void ibis::bord::column::addCounts ( uint32_t  nr)

Extend the buffer to have nr elements.

All new elements have the value 1U.

References ibis::part::m_name, ibis::array_t< T >::size(), and ibis::UINT.

Referenced by ibis::bord::append().
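
As a rough illustration of what addCounts is documented to do, the sketch below grows a buffer of counts to nr elements and gives every new element the value 1U. It uses std::vector<uint32_t> in place of the column's ibis::array_t<uint32_t> buffer, and the name extendCounts is hypothetical. Leaving an already long enough buffer untouched is an assumption of this sketch.

```cpp
#include <cstdint>
#include <vector>

// Grow `counts` to `nr` elements; every newly added element is 1.
// Assumption: a buffer that already holds nr or more elements is left untouched.
void extendCounts(std::vector<std::uint32_t>& counts, std::uint32_t nr) {
    if (counts.size() < nr) counts.resize(nr, 1U);
}
```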

template<typename T >
int ibis::bord::column::addIncoreData ( array_t< T > *&  to,
uint32_t  nold,
const array_t< T > &  from,
const T  special 
) [static]

Append new data (in from) to a larger array (pointed to by to).

References ibis::array_t< T >::copy(), ibis::array_t< T >::reserve(), and ibis::array_t< T >::size().
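
The signature of addIncoreData suggests one plausible reading, sketched here with std::vector<T> standing in for array_t<T>. The behaviour shown is an assumption, not taken from the FastBit source: if the destination holds fewer than nold rows, the gap is filled with special before the new values are appended, and a null destination is allocated on demand. The name appendInCore and the size_t return value are also hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Append `from` to `*to`, first padding `*to` with `special` up to `nold` rows.
// Returns the resulting number of elements. Allocates *to when it is null.
template <typename T>
std::size_t appendInCore(std::vector<T>*& to, std::size_t nold,
                         const std::vector<T>& from, const T special) {
    if (to == nullptr) to = new std::vector<T>();
    to->reserve(nold + from.size());
    if (to->size() < nold) to->resize(nold, special);  // fill missing old rows
    to->insert(to->end(), from.begin(), from.end());
    return to->size();
}
```

The caller owns the vector that `to` points to after the call and is responsible for deleting it.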

long ibis::bord::column::append ( const char *  dt,
const char *  df,
const uint32_t  nold,
const uint32_t  nnew,
uint32_t  nbuf,
char *  buf 
) [virtual]

Append new data in directory df to the end of existing data in dt.

Append the content of the file in df to the end of the file in dt.

It returns the number of rows appended or a negative number to indicate error.

Note:
The directories dt and df cannot be the same.
This function does not update the minimum and the maximum of the column.

Reimplemented from ibis::column.

References ibis::bord::append().

Referenced by ibis::bord::append().

long ibis::bord::column::append ( const void *  vals,
const ibis::bitvector &  msk 
) [virtual]

Append user supplied data to the current column.

The incoming values are carried by a void* which is cast to the same type as the buffer used by the column. The mask is used to indicate which values in the incoming array are valid.

Reimplemented from ibis::column.

References ibis::BYTE, ibis::CATEGORY, ibis::bitvector::cnt(), ibis::DOUBLE, ibis::FLOAT, ibis::gVerbose, ibis::INT, ibis::LONG, ibis::part::m_name, ibis::SHORT, ibis::bitvector::size(), ibis::TEXT, ibis::TYPESTRING, ibis::UBYTE, ibis::UINT, ibis::ULONG, and ibis::USHORT.

long ibis::bord::column::append ( const ibis::column &  scol,
const ibis::bitvector &  msk 
) [virtual]

Append selected values from the given column to the current column.

This function extracts the values from scol using the given mask, and then appends the values to the current column. The type of scol must be legitimately convertible to the type of this column. It returns the number of values added to the column on success, or a negative number to indicate errors.

References ibis::BYTE, ibis::CATEGORY, ibis::bitvector::cnt(), dic, ibis::DOUBLE, ibis::FLOAT, ibis::gVerbose, ibis::INT, ibis::LONG, ibis::part::m_name, ibis::column::selectBytes(), ibis::column::selectDoubles(), ibis::column::selectFloats(), ibis::column::selectInts(), ibis::column::selectLongs(), ibis::column::selectShorts(), ibis::column::selectStrings(), ibis::column::selectUBytes(), ibis::column::selectUInts(), ibis::column::selectULongs(), ibis::column::selectUShorts(), ibis::SHORT, ibis::bitvector::size(), ibis::TEXT, ibis::TYPESTRING, ibis::UBYTE, ibis::UINT, ibis::ULONG, and ibis::USHORT.
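
The mask-driven append can be sketched as follows. This is a simplified stand-in, not the FastBit implementation: std::vector<bool> replaces ibis::bitvector, an int source column is appended to a double destination to show a legitimate type conversion, and the name appendSelected and the size check are assumptions. The return convention follows the description above: the number of values added, or a negative number on error.

```cpp
#include <cstddef>
#include <vector>

// Append src[i] to dst for every i whose mask bit is set.
// Returns the number of values added, or -1 if mask and source sizes differ.
long appendSelected(std::vector<double>& dst, const std::vector<int>& src,
                    const std::vector<bool>& mask) {
    if (mask.size() != src.size()) return -1;
    long added = 0;
    for (std::size_t i = 0; i < src.size(); ++i) {
        if (mask[i]) {
            dst.push_back(static_cast<double>(src[i]));  // int -> double conversion
            ++added;
        }
    }
    return added;
}
```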

long ibis::bord::column::append ( const ibis::column &  scol,
const ibis::qContinuousRange &  cnd 
) [virtual]

Append selected values from the given column to the current column.

This function extracts the values from scol using the given range condition, and then appends the values to the current column. The type of scol must be legitimately convertible to the type of this column. It returns 0 to indicate success, or a negative number to indicate an error.

References ibis::BYTE, ibis::DOUBLE, ibis::FLOAT, ibis::gVerbose, ibis::INT, ibis::LONG, ibis::part::m_name, ibis::column::selectValues(), ibis::SHORT, ibis::TYPESTRING, ibis::UBYTE, ibis::UINT, ibis::ULONG, and ibis::USHORT.

void ibis::bord::column::append ( const void *  c1,
uint32_t  i1 
) [inline]

Append a value.

Note:
The first argument c1 is expected to be an array_t object with the same data type as this column.

References ibis::BYTE, ibis::DOUBLE, ibis::FLOAT, ibis::INT, ibis::LONG, ibis::array_t< T >::push_back(), ibis::SHORT, ibis::UBYTE, ibis::UINT, ibis::ULONG, and ibis::USHORT.

void ibis::bord::column::append ( const void *  c1,
uint32_t  i1,
const void *  c2,
uint32_t  i2,
ibis::selectClause::AGREGADO  agg 
) [inline]

Append the value generated from the operation on the incoming columns.

Note:
Both arguments c1 and c2 are expected to be array_t objects with the same data type as this column.

References ibis::BYTE, ibis::DOUBLE, ibis::FLOAT, ibis::INT, ibis::LONG, ibis::array_t< T >::push_back(), ibis::SHORT, ibis::UBYTE, ibis::UINT, ibis::ULONG, and ibis::USHORT.

virtual void ibis::bord::column::computeMinMax ( ) [inline, virtual]

Compute the actual min/max values.

It actually goes through all the values. This function reads the data in the active data directory and modifies the member variables to record the actual min/max.

Reimplemented from ibis::column.

References ibis::part::currentDataDir(), ibis::column::lower, ibis::column::thePart, and ibis::column::upper.

Referenced by computeMinMax().

virtual void ibis::bord::column::computeMinMax ( const char *  dir) [inline, virtual]

Compute the actual min/max values.

It actually goes through all the values. This function reads the data in the given directory and modifies the member variables to record the actual min/max.

Reimplemented from ibis::column.

References computeMinMax(), ibis::column::lower, and ibis::column::upper.

void ibis::bord::column::computeMinMax ( const char *  dir,
double &  min,
double &  max 
) const [virtual]

Compute the actual min/max of the data in directory dir.

Report the actual min/max found back through output arguments min and max. This version does not modify the min/max recorded in this column object.

Reimplemented from ibis::column.

References ibis::column::actualMinMax(), ibis::BYTE, ibis::DOUBLE, ibis::FLOAT, ibis::gVerbose, ibis::INT, ibis::LONG, ibis::part::m_name, ibis::SHORT, ibis::TYPESTRING, ibis::UBYTE, ibis::UINT, ibis::ULONG, and ibis::USHORT.
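
The three-argument computeMinMax reports its result through output arguments and leaves the column unchanged. A minimal sketch of that calling pattern, using a std::vector<double> in place of the column data and a hypothetical name scanMinMax, might look like this. The handling of empty input is an assumption, chosen to match the defaults low=DBL_MAX and high=-DBL_MAX in the constructor signature.

```cpp
#include <cfloat>
#include <vector>

// Scan all values and report the extremes through the output arguments.
// For an empty input, min stays at DBL_MAX and max at -DBL_MAX (assumed
// convention, mirroring the constructor defaults).
void scanMinMax(const std::vector<double>& vals, double& min, double& max) {
    min = DBL_MAX;
    max = -DBL_MAX;
    for (double v : vals) {
        if (v < min) min = v;
        if (v > max) max = v;
    }
}
```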

long ibis::bord::column::evaluateRange ( const ibis::qContinuousRange &  cmp,
const ibis::bitvector &  mask,
ibis::bitvector &  low 
) const [virtual]

Compute the exact answer.

Attempts to use the index if one is available, otherwise use the base data.

Returns a negative value to indicate an error, 0 to indicate no hit, and a positive value to indicate there are one or more hits.

Reimplemented from ibis::column.

References ibis::bitvector::adjustSize(), ibis::BYTE, ibis::part::doScan(), ibis::DOUBLE, ibis::FLOAT, ibis::gVerbose, ibis::INT, ibis::LONG, ibis::part::m_name, ibis::SHORT, ibis::TYPESTRING, ibis::UBYTE, ibis::UINT, ibis::ULONG, and ibis::USHORT.
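
A scan over the base data with this kind of return convention can be sketched as below. It is a simplification under stated assumptions: std::vector<bool> stands in for ibis::bitvector, the range is the half-open interval lo <= x < hi instead of a general ibis::qContinuousRange, and the function name scanRange is hypothetical. Returning the exact hit count as the positive value is also an assumption of the sketch.

```cpp
#include <cstddef>
#include <vector>

// Mark rows i with mask[i] set and lo <= vals[i] < hi.
// Returns a negative value on error, otherwise the number of hits (0 = no hit).
long scanRange(const std::vector<double>& vals, double lo, double hi,
               const std::vector<bool>& mask, std::vector<bool>& res) {
    if (mask.size() != vals.size()) return -1;
    res.assign(vals.size(), false);
    long hits = 0;
    for (std::size_t i = 0; i < vals.size(); ++i) {
        if (mask[i] && lo <= vals[i] && vals[i] < hi) {
            res[i] = true;
            ++hits;
        }
    }
    return hits;
}
```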

const ibis::dictionary * ibis::bord::column::getDictionary ( ) const

Return the dictionary associated with the column.

A dictionary is associated with a column that was originally stored as ibis::category but has been converted to an integer column of type ibis::UINT.

References dic.

Referenced by ibis::bord::bord(), ibis::bord::copyColumn(), ibis::bord::cursor::cursor(), ibis::bord::groupbya(), ibis::jNatural::select(), ibis::jRange::select(), and ibis::bord::xgroupby().

ibis::fileManager::storage * ibis::bord::column::getRawData ( ) const [virtual]

Retrieve the raw data buffer as an ibis::fileManager::storage.

Since this function exposes the internal storage representation, it should not be relied upon for general uses. This is mostly a convenience thing for FastBit internal development!

Note:
Only fixed-size columns are stored using ibis::fileManager::storage objects. It will return a nil pointer for string-valued columns.

Reimplemented from ibis::column.

References ibis::BYTE, ibis::DOUBLE, ibis::FLOAT, ibis::INT, ibis::LONG, ibis::OID, ibis::SHORT, ibis::UBYTE, ibis::UINT, ibis::ULONG, and ibis::USHORT.

void ibis::bord::column::getString ( uint32_t  i,
std::string &  val 
) const [virtual]

Return the string at the ith row.

If the raw data is not present, but a dictionary is present, then this function returns the string value corresponding to the integer value i. Note that this fall-back option does not conform to the original intention of this function.

Reimplemented from ibis::column.

References ibis::CATEGORY, and ibis::TEXT.
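
The fall-back behaviour of getString can be illustrated with a small sketch. The types are stand-ins: a std::vector<std::string> plays the role of both the raw string buffer and the dictionary (ibis::dictionary has its own interface, which is not reproduced here), and stringAt is a hypothetical name. Clearing the output when no value can be supplied is an assumption.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Fill `val` with the string for row i when raw strings are available;
// otherwise treat i as an integer code and look it up in the dictionary.
// `val` is cleared when neither source can supply a value.
void stringAt(const std::vector<std::string>* raw,
              const std::vector<std::string>* dict,
              std::uint32_t i, std::string& val) {
    val.clear();
    if (raw != nullptr) {
        if (i < raw->size()) val = (*raw)[i];
    } else if (dict != nullptr && i < dict->size()) {
        val = (*dict)[i];
    }
}
```

Note how the meaning of i changes between the two branches: a row number in the first, a dictionary code in the second. That is the mismatch the description above warns about.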

int ibis::bord::column::getValuesArray ( void *  vals) const [virtual]

Makes a copy of the in-memory data.

Uses a shallow copy for ibis::array_t objects, but a deep copy for the string values.

Reimplemented from ibis::column.

References ibis::BYTE, ibis::CATEGORY, ibis::util::copy(), ibis::DOUBLE, ibis::FLOAT, ibis::gVerbose, ibis::INT, ibis::LONG, ibis::part::m_name, ibis::SHORT, ibis::TEXT, ibis::TYPESTRING, ibis::UBYTE, ibis::UINT, ibis::ULONG, and ibis::USHORT.
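
The difference between the two copy modes can be shown with standard containers. This sketch assumes that a shallow copy means the copy shares storage with the original, modelled here with std::shared_ptr; it does not reproduce how ibis::array_t manages its storage. The names shallowCopy and deepCopy are hypothetical.

```cpp
#include <memory>
#include <string>
#include <vector>

// Shallow copy: the new handle shares the same underlying numeric storage.
std::shared_ptr<std::vector<int>>
shallowCopy(const std::shared_ptr<std::vector<int>>& src) {
    return src;  // only the reference count changes
}

// Deep copy: every string is duplicated into independent storage.
std::vector<std::string> deepCopy(const std::vector<std::string>& src) {
    return std::vector<std::string>(src);
}
```

A practical consequence is that writing through a shallow copy is visible through the original handle, while a deep copy can be modified freely.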

long ibis::bord::column::patternSearch ( const char *  pat) const [virtual]

Compute an estimate of the maximum number of possible matches.

This is a trivial implementation that does not actually perform any meaningful checks. It simply returns the number of strings in memory as the estimate.

Reimplemented from ibis::column.

References ibis::CATEGORY, ibis::gVerbose, ibis::part::m_name, ibis::array_t< T >::size(), ibis::TEXT, and ibis::TYPESTRING.

int ibis::bord::column::restoreCategoriesAsStrings ( const ibis::category & )

Convert the integer representation back to the string representation.

The existing data type must be ibis::UINT and the column with the same name in the given ibis::part prt must be of type ibis::CATEGORY.

References ibis::CATEGORY, ibis::category::getString(), ibis::array_t< T >::size(), and ibis::UINT.

ibis::array_t< signed char > * ibis::bord::column::selectBytes ( const ibis::bitvector &  mask) const [virtual]

Retrieve selected 1-byte integer values.

Note that unsigned integers are simply treated as signed integers.

Note:
The caller is responsible for freeing the returned array from any of the selectTypes functions.

Reimplemented from ibis::column.

References ibis::BYTE, ibis::bitvector::cnt(), ibis::horometer::CPUTime(), ibis::gVerbose, ibis::bitvector::indexSet::indices(), ibis::bitvector::indexSet::isRange(), ibis::bitvector::indexSet::nIndices(), ibis::horometer::realTime(), ibis::array_t< T >::resize(), ibis::array_t< T >::size(), ibis::bitvector::size(), ibis::horometer::start(), ibis::horometer::stop(), and ibis::array_t< T >::swap().

Referenced by ibis::bord::backup().
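
Because the caller must free the array returned by the selectTypes functions, one defensive pattern is to hand the raw pointer to a smart pointer right away. The sketch below shows this with a hypothetical selector, selectBytesSketch, that mimics the ownership convention using std::vector in place of array_t and std::vector<bool> in place of ibis::bitvector. Neither function is part of FastBit.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical selector in the style of the selectTypes functions:
// returns a heap-allocated array that the caller owns.
std::vector<signed char>* selectBytesSketch(const std::vector<signed char>& col,
                                            const std::vector<bool>& mask) {
    auto* out = new std::vector<signed char>();
    for (std::size_t i = 0; i < col.size() && i < mask.size(); ++i)
        if (mask[i]) out->push_back(col[i]);
    return out;
}

// Wrap the raw pointer immediately so it is deleted on every exit path.
std::size_t countSelected(const std::vector<signed char>& col,
                          const std::vector<bool>& mask) {
    std::unique_ptr<std::vector<signed char>> owned(selectBytesSketch(col, mask));
    return owned->size();
}
```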

ibis::array_t< double > * ibis::bord::column::selectDoubles ( const ibis::bitvector &  mask) const [virtual]

Put the selected values into an array as doubles.

Note:
Any column type can be selected as doubles. The other selectXXs functions only work on the same data type. This is the only function that allows one to convert to a different type. This is mainly to

Reimplemented from ibis::column.

References ibis::BYTE, ibis::CATEGORY, ibis::bitvector::cnt(), ibis::horometer::CPUTime(), ibis::DOUBLE, ibis::FLOAT, ibis::gVerbose, ibis::bitvector::indexSet::indices(), ibis::INT, ibis::bitvector::indexSet::isRange(), ibis::bitvector::indexSet::nIndices(), ibis::horometer::realTime(), ibis::array_t< T >::resize(), ibis::SHORT, ibis::array_t< T >::size(), ibis::bitvector::size(), ibis::horometer::start(), ibis::horometer::stop(), ibis::array_t< T >::swap(), ibis::UBYTE, ibis::UINT, and ibis::USHORT.

Referenced by ibis::bord::backup().

ibis::array_t< int32_t > * ibis::bord::column::selectInts ( const ibis::bitvector &  mask) const [virtual]

ibis::array_t< int64_t > * ibis::bord::column::selectLongs ( const ibis::bitvector &  mask) const [virtual]

Can be called on all integral types.

Note that 64-bit unsigned integers are simply treated as signed integers. This may cause the values to be interpreted incorrectly. Shorter versions of unsigned integers are treated correctly as positive values.

Reimplemented from ibis::column.

References ibis::BYTE, ibis::CATEGORY, ibis::bitvector::cnt(), ibis::horometer::CPUTime(), ibis::gVerbose, ibis::bitvector::indexSet::indices(), ibis::INT, ibis::bitvector::indexSet::isRange(), ibis::LONG, ibis::bitvector::indexSet::nIndices(), ibis::horometer::realTime(), ibis::array_t< T >::resize(), ibis::SHORT, ibis::array_t< T >::size(), ibis::bitvector::size(), ibis::horometer::start(), ibis::horometer::stop(), ibis::array_t< T >::swap(), ibis::TEXT, ibis::UBYTE, ibis::UINT, and ibis::USHORT.

Referenced by ibis::bord::backup().
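
The caveat about 64-bit unsigned values can be made concrete with plain integer arithmetic. The helper names below are hypothetical and the code does not touch FastBit; it assumes that treating a value as signed means reinterpreting the same 64 bits, which is the usual meaning on two's-complement int64_t.

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret the bits of a 64-bit unsigned value as a signed 64-bit value.
std::int64_t asSigned64(std::uint64_t u) {
    std::int64_t s;
    std::memcpy(&s, &u, sizeof s);
    return s;
}

// A 32-bit unsigned value always fits in int64_t, so widening keeps it positive.
std::int64_t widen32(std::uint32_t u) {
    return static_cast<std::int64_t>(u);
}
```

Any unsigned 64-bit value with the top bit set comes out negative after reinterpretation, while every 8-, 16-, or 32-bit unsigned value survives the widening unchanged.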

ibis::array_t< int16_t > * ibis::bord::column::selectShorts ( const ibis::bitvector &  mask) const [virtual]

Return selected rows of the column in an array_t object.

Can convert all integers 2 bytes or less in length. Note that unsigned integers are simply treated as signed integers. Shorter types of unsigned integers are treated correctly as positive values.

Note:
The caller is responsible for freeing the returned array from any of the selectTypes functions.

Reimplemented from ibis::column.

References ibis::BYTE, ibis::bitvector::cnt(), ibis::horometer::CPUTime(), ibis::gVerbose, ibis::bitvector::indexSet::indices(), ibis::bitvector::indexSet::isRange(), ibis::bitvector::indexSet::nIndices(), ibis::horometer::realTime(), ibis::array_t< T >::resize(), ibis::SHORT, ibis::array_t< T >::size(), ibis::bitvector::size(), ibis::horometer::start(), ibis::horometer::stop(), ibis::array_t< T >::swap(), and ibis::UBYTE.

Referenced by ibis::bord::backup().

std::vector< std::string > * ibis::bord::column::selectStrings ( const bitvector &  mask) const [virtual]

ibis::array_t< unsigned char > * ibis::bord::column::selectUBytes ( const ibis::bitvector &  mask) const [virtual]

ibis::array_t< uint32_t > * ibis::bord::column::selectUInts ( const ibis::bitvector &  mask) const [virtual]

ibis::array_t< uint64_t > * ibis::bord::column::selectULongs ( const ibis::bitvector &  mask) const [virtual]

ibis::array_t< uint16_t > * ibis::bord::column::selectUShorts ( const ibis::bitvector &  mask) const [virtual]

long ibis::bord::column::stringSearch ( const char *  str,
ibis::bitvector &  hits 
) const [virtual]

Locate the strings that match the given string.

The comparison is case sensitive. If the incoming string is a nil pointer, it matches nothing.

Reimplemented from ibis::column.

References ibis::bitvector::adjustSize(), ibis::CATEGORY, ibis::bitvector::clear(), ibis::bitvector::cnt(), ibis::gVerbose, ibis::part::m_name, ibis::bitvector::set(), ibis::bitvector::setBit(), ibis::array_t< T >::size(), ibis::TEXT, and ibis::TYPESTRING.
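
The matching rules stated above (case-sensitive comparison, a nil pointer matches nothing) can be sketched over a plain vector of strings. std::vector<bool> stands in for ibis::bitvector and exactSearch is a hypothetical name. Returning the number of matching rows is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Set hits[i] for every row equal to `str` (case-sensitive, exact match).
// A null `str` matches nothing. Returns the number of matching rows.
long exactSearch(const std::vector<std::string>& rows, const char* str,
                 std::vector<bool>& hits) {
    hits.assign(rows.size(), false);
    if (str == nullptr) return 0;
    long n = 0;
    for (std::size_t i = 0; i < rows.size(); ++i) {
        if (rows[i] == str) {
            hits[i] = true;
            ++n;
        }
    }
    return n;
}
```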

long ibis::bord::column::stringSearch ( const char *  str) const [virtual]

Compute an estimate of the maximum number of possible matches.

This is a trivial implementation that does not actually perform any meaningful checks. It simply returns the number of strings in memory as the estimate.

Reimplemented from ibis::column.

References ibis::CATEGORY, ibis::gVerbose, ibis::part::m_name, ibis::array_t< T >::size(), ibis::TEXT, and ibis::TYPESTRING.

long ibis::bord::column::stringSearch ( const std::vector< std::string > &  str) const [virtual]

Compute an estimate of the maximum number of possible matches.

This is a trivial implementation that does not actually perform any meaningful checks. It simply returns the number of strings in memory as the estimate.

Reimplemented from ibis::column.

References ibis::CATEGORY, ibis::gVerbose, ibis::part::m_name, ibis::array_t< T >::size(), ibis::TEXT, and ibis::TYPESTRING.


Member Data Documentation

void* ibis::bord::column::buffer [protected]

The in-memory storage.

A pointer to an array_t<T> or std::vector<std::string> depending on data type.

Referenced by column(), equal_to(), and less_than().

const ibis::dictionary* ibis::bord::column::dic [protected]

A dictionary.

This dictionary was originally associated with an ibis::category, but has been converted through ibis::bundle to ibis::UINT.

Referenced by append(), getDictionary(), and setDictionary().

