ibis::column Class Reference

The class to represent a column of a data partition.

#include <column.h>

Inheritance diagram for ibis::column (classes shown): ibis::blob, ibis::bord::column, ibis::text, ibis::category.


Classes

class  indexLock
 A class for controlling access of the index object of a column.
class  info
 Some basic information about a column.
class  mutexLock
 Provide a mutual exclusion lock on an ibis::column.
class  readLock
 Provide a read lock on an ibis::column object.
class  softWriteLock
 Provide a write lock on an ibis::column object.
class  writeLock
 Provide a write lock on an ibis::column object.

Public Member Functions

virtual long append (const char *dt, const char *df, const uint32_t nold, const uint32_t nnew, uint32_t nbuf, char *buf)
 Append new data in directory df to the end of existing data in dt.
virtual long append (const void *vals, const ibis::bitvector &msk)
 Append the records in vals to the current working dataset.
void binWeights (std::vector< uint32_t > &) const
 Retrieve the number of rows in each bin.
template<typename T >
long castAndWrite (const array_t< double > &vals, ibis::bitvector &mask, const T special)
 Cast the incoming array into the specified type T before writing the values to the file for this column.
 column (const part *tbl, FILE *file)
 Reconstitute a column from the content of a file.
 column (const part *tbl, ibis::TYPE_T t, const char *name, const char *desc="", double low=DBL_MAX, double high=-DBL_MAX)
 Construct a new column of specified type.
 column (const column &rhs)
 copy constructor
virtual void computeMinMax ()
 Compute the actual min/max values.
virtual void computeMinMax (const char *dir)
 Compute the actual min/max values.
virtual void computeMinMax (const char *dir, double &min, double &max) const
 Compute the actual min/max of the data in directory dir.
int contractRange (ibis::qContinuousRange &rng) const
 Contract the range expression so that the new range falls exactly on the bin boundaries.
const char * dataFileName (std::string &fname, const char *dir=0) const
 Name of the data file in the given data directory.
const char * description () const
 Description of the column. Can be an arbitrary string.
void description (const char *d)
int elementSize () const
 Size of a data element in bytes.
virtual double estimateCost (const ibis::qContinuousRange &cmp) const
 Estimate the cost of evaluating the query expression.
virtual double estimateCost (const ibis::qDiscreteRange &cmp) const
 Estimate the cost of evaluating a discrete range expression.
virtual double estimateCost (const ibis::qIntHod &cmp) const
 Estimate the cost of evaluating a discrete range expression.
virtual double estimateCost (const ibis::qUIntHod &cmp) const
 Estimate the cost of evaluating a discrete range expression.
virtual double estimateCost (const ibis::qString &) const
 Estimate the cost of evaluating a string lookup.
virtual double estimateCost (const ibis::qMultiString &) const
 Estimate the cost of looking up a group of strings.
virtual long estimateRange (const ibis::qContinuousRange &cmp, ibis::bitvector &low, ibis::bitvector &high) const
 Compute a lower bound and an upper bound on the number of hits using the bitmap index.
virtual long estimateRange (const ibis::qDiscreteRange &cmp, ibis::bitvector &low, ibis::bitvector &high) const
 Compute a lower bound and an upper bound for hits.
virtual long estimateRange (const ibis::qIntHod &cmp, ibis::bitvector &low, ibis::bitvector &high) const
 Compute a lower bound and an upper bound for hits.
virtual long estimateRange (const ibis::qUIntHod &cmp, ibis::bitvector &low, ibis::bitvector &high) const
 Compute a lower bound and an upper bound for hits.
virtual long estimateRange (const ibis::qContinuousRange &cmp) const
 Use the index of the column to compute an upper bound on the number of hits.
virtual long estimateRange (const ibis::qDiscreteRange &cmp) const
virtual long estimateRange (const ibis::qIntHod &cmp) const
 Compute an upper bound on the number of hits.
virtual long estimateRange (const ibis::qUIntHod &cmp) const
 Compute an upper bound on the number of hits.
virtual long evaluateAndSelect (const ibis::qContinuousRange &, const ibis::bitvector &, void *, ibis::bitvector &) const
 Evaluate a range condition and retrieve the selected values.
virtual long evaluateRange (const ibis::qContinuousRange &cmp, const ibis::bitvector &mask, ibis::bitvector &res) const
 Compute the exact answer.
virtual long evaluateRange (const ibis::qDiscreteRange &cmp, const ibis::bitvector &mask, ibis::bitvector &res) const
 Compute the exact answer to a discrete range expression.
virtual long evaluateRange (const ibis::qIntHod &cmp, const ibis::bitvector &mask, ibis::bitvector &res) const
 Compute the exact answer to a discrete range expression.
virtual long evaluateRange (const ibis::qUIntHod &cmp, const ibis::bitvector &mask, ibis::bitvector &res) const
 Compute the exact answer to a discrete range expression.
int expandRange (ibis::qContinuousRange &rng) const
 Expand the range expression so that the new range falls exactly on the bin boundaries.
virtual const char * findString (const char *) const
 Determine if the input string has appeared in this data partition.
array_t< double > * getDoubleArray () const
 Return all rows of the column as an array_t object.
array_t< float > * getFloatArray () const
 Return all rows of the column as an array_t object.
array_t< int32_t > * getIntArray () const
 Return all rows of the column as an array_t object.
void getNullMask (bitvector &mask) const
 If there is a null mask stored already, return a shallow copy of it in mask.
virtual
ibis::fileManager::storage
getRawData () const
 Return the content of base data file as a storage object.
virtual void getString (uint32_t, std::string &) const
 Return the string value for the ith row.
virtual float getUndecidable (const ibis::qContinuousRange &cmp, ibis::bitvector &iffy) const
 Compute the locations of the rows that cannot be decided by the index.
virtual float getUndecidable (const ibis::qDiscreteRange &cmp, ibis::bitvector &iffy) const
 Find rows that can not be decided with the existing index.
virtual float getUndecidable (const ibis::qIntHod &cmp, ibis::bitvector &iffy) const
 Find rows that can not be decided with the existing index.
virtual float getUndecidable (const ibis::qUIntHod &cmp, ibis::bitvector &iffy) const
 Find rows that can not be decided with the existing index.
virtual int getValuesArray (void *vals) const
 Copy all rows of the column into an array_t object.
uint32_t indexedRows () const
 Compute the number of rows captured by the index of this column.
virtual long indexSize () const
 Compute the index size (in bytes).
const char * indexSpec () const
 Retrieve the index specification.
void indexSpec (const char *spec)
 Set the index specification.
void indexSpeedTest () const
 Perform a set of built-in tests to determine the speed of common operations.
bool isFloat () const
 Are they floating-point values?
bool isInteger () const
 Are they integer values?
bool isNumeric () const
 Are they numerical values?
bool isSignedInteger () const
 Are they signed integer values?
bool isSorted () const
 Are the values sorted?
void isSorted (bool)
 Change the flag m_sorted.
bool isUnsignedInteger () const
 Are they unsigned integer values?
virtual long keywordSearch (const char *, ibis::bitvector &) const
virtual long keywordSearch (const char *) const
virtual void loadIndex (const char *iopt=0, int ropt=0) const throw ()
 Load the index associated with the column.
void logMessage (const char *event, const char *fmt,...) const
 Log messages using printf syntax.
void logWarning (const char *event, const char *fmt,...) const
 Log warning messages using printf syntax.
const double & lowerBound () const
 The lower bound of the values.
void lowerBound (double d)
const char * name () const
 Name of the column.
void name (const char *nm)
 Rename the column.
const char * nullMaskName (std::string &fname) const
 Name of the NULL mask file.
uint32_t numBins () const
 Retrieve the number of bins used.
const part * partition () const
virtual long patternSearch (const char *) const
virtual long patternSearch (const char *, ibis::bitvector &) const
void preferredBounds (std::vector< double > &) const
 Retrieve the bin boundaries of the index currently in use.
virtual void print (std::ostream &out) const
 Print some basic information about this column.
void purgeIndexFile (const char *dir=0) const
 Purge the index files associated with the current column.
virtual long saveSelected (const ibis::bitvector &sel, const char *dest, char *buf, uint32_t nbuf)
 Write the selected records to the specified directory.
virtual array_t< signed char > * selectBytes (const bitvector &mask) const
 Retrieve selected 1-byte integer values.
virtual array_t< double > * selectDoubles (const bitvector &mask) const
 Put the selected values into an array as doubles.
virtual array_t< float > * selectFloats (const bitvector &mask) const
 Put selected values of a float column into an array.
virtual array_t< int32_t > * selectInts (const bitvector &mask) const
 Return selected rows of the column in an array_t object.
virtual array_t< int64_t > * selectLongs (const bitvector &mask) const
 Return selected rows of the column in an array_t object.
virtual array_t< int16_t > * selectShorts (const bitvector &mask) const
 Return selected rows of the column in an array_t object.
virtual std::vector
< std::string > * 
selectStrings (const bitvector &mask) const
 Return the selected rows as strings.
virtual array_t< unsigned char > * selectUBytes (const bitvector &mask) const
 Return selected rows of the column in an array_t object.
virtual array_t< uint32_t > * selectUInts (const bitvector &mask) const
 Return selected rows of the column in an array_t object.
virtual array_t< uint64_t > * selectULongs (const bitvector &mask) const
 Return selected rows of the column in an array_t object.
virtual array_t< uint16_t > * selectUShorts (const bitvector &mask) const
 Return selected rows of the column in an array_t object.
long selectValues (const bitvector &, void *) const
 Return selected rows of the column in an array_t object.
long selectValues (const bitvector &, void *, array_t< uint32_t > &) const
 Return selected rows of the column in an array_t object along with their positions.
long selectValues (const ibis::qContinuousRange &, void *) const
 Select the values satisfying the specified range condition.
int setNullMask (const bitvector &)
 Change the null mask to the user specified one.
virtual long stringSearch (const char *, ibis::bitvector &) const
virtual long stringSearch (const std::vector< std::string > &, ibis::bitvector &) const
virtual long stringSearch (const char *) const
virtual long stringSearch (const std::vector< std::string > &) const
virtual long truncateData (const char *dir, uint32_t nent, ibis::bitvector &mask) const
 Truncate the number of records in the named dir to nent.
ibis::TYPE_T type () const
 Type of the data.
virtual void unloadIndex () const
 Unload the index associated with the column.
const double & upperBound () const
 The upper bound of the values.
void upperBound (double d)
virtual void write (FILE *file) const
 Write the metadata entry.
virtual long writeData (const char *dir, uint32_t nold, uint32_t nnew, ibis::bitvector &mask, const void *va1, void *va2=0)
 Write the content in array va1 to directory dir.
virtual double getActualMin () const
 A group of functions to compute some basic statistics for the attribute values.
virtual double getActualMax () const
 Compute the actual maximum value by reading the data or examining the index.
virtual double getSum () const
 Compute the sum of all values by reading the data.
long getCumulativeDistribution (std::vector< double > &bounds, std::vector< uint32_t > &counts) const
 Compute the actual data distribution.
long getDistribution (std::vector< double > &bbs, std::vector< uint32_t > &counts) const
 Count the number of records in each bin.

Protected Member Functions

void actualMinMax (const char *fname, const ibis::bitvector &mask, double &min, double &max) const
 Given the name of the data file, compute the actual minimum and the maximum value.
template<typename T >
void actualMinMax (const array_t< T > &vals, const ibis::bitvector &mask, double &min, double &max) const
 Compute the minimum and maximum of the values in the array.
long appendStrings (const std::vector< std::string > &, const ibis::bitvector &)
 Append the strings to the current data.
template<typename T >
long appendValues (const array_t< T > &, const ibis::bitvector &)
 Append the content of incoming array to the current data.
double computeMax () const
 Read the base data to compute the maximum value.
template<typename T >
T computeMax (const array_t< T > &vals, const ibis::bitvector &mask) const
 Compute the maximum value in the array.
double computeMin () const
 Read the data values and compute the minimum value.
template<typename T >
T computeMin (const array_t< T > &vals, const ibis::bitvector &mask) const
 Compute the minimum value in the array.
double computeSum () const
 Read the base data to compute the total sum.
template<typename T >
double computeSum (const array_t< T > &vals, const ibis::bitvector &mask) const
 Compute the sum of values in the array.
template<typename T >
uint32_t findLower (int fdes, const uint32_t nr, const T tgt) const
 Find the smallest value >= tgt.
template<typename T >
uint32_t findUpper (int fdes, const uint32_t nr, const T tgt) const
 Find the smallest value > tgt.
void logError (const char *event, const char *fmt,...) const
 Print messages started with "Error" and throw a string exception.
virtual int searchSorted (const ibis::qContinuousRange &, ibis::bitvector &) const
 Resolve a continuous range condition on a sorted column.
virtual int searchSorted (const ibis::qDiscreteRange &, ibis::bitvector &) const
 Resolve a discrete range condition on a sorted column.
virtual int searchSorted (const ibis::qIntHod &, ibis::bitvector &) const
 Resolve a discrete range condition on a sorted column.
virtual int searchSorted (const ibis::qUIntHod &, ibis::bitvector &) const
 Resolve a discrete range condition on a sorted column.
template<typename T >
int searchSortedICC (const array_t< T > &vals, const ibis::qContinuousRange &rng, ibis::bitvector &hits) const
 Resolve a continuous range condition on an array of values.
template<typename T >
int searchSortedICD (const array_t< T > &vals, const ibis::qDiscreteRange &rng, ibis::bitvector &hits) const
 Resolve a discrete range condition on an array of values.
template<typename T >
int searchSortedICD (const array_t< T > &vals, const ibis::qIntHod &rng, ibis::bitvector &hits) const
 Resolve a discrete range condition on an array of values.
template<typename T >
int searchSortedICD (const array_t< T > &vals, const ibis::qUIntHod &rng, ibis::bitvector &hits) const
 Resolve a discrete range condition on an array of values.
template<typename T >
int searchSortedOOCC (const char *fname, const ibis::qContinuousRange &rng, ibis::bitvector &hits) const
 Resolve a continuous range condition using file operations.
template<typename T >
int searchSortedOOCD (const char *fname, const ibis::qDiscreteRange &rng, ibis::bitvector &hits) const
 Resolve a discrete range condition using file operations.
template<typename T >
int searchSortedOOCD (const char *fname, const ibis::qIntHod &rng, ibis::bitvector &hits) const
 Resolve a discrete range condition using file operations.
template<typename T >
int searchSortedOOCD (const char *fname, const ibis::qUIntHod &rng, ibis::bitvector &hits) const
 Resolve a discrete range condition using file operations.
template<typename T >
long selectToStrings (const char *, const bitvector &, std::vector< std::string > &) const
 Extract the values masked 1 and convert them to strings.
template<typename T >
long selectValuesT (const char *, const bitvector &, array_t< T > &) const
 Select values marked in the bitvector mask.
template<typename T >
long selectValuesT (const char *, const bitvector &mask, array_t< T > &vals, array_t< uint32_t > &inds) const
 Select the values marked in the bitvector mask.
long string2int (int fptr, dictionary &dic, uint32_t nbuf, char *buf, array_t< uint32_t > &out) const
 Convert strings in the opened file to a list of integers with the aid of a dictionary.

Protected Attributes

ibis::index * idx
 The index for this column. It is not considered a must-have member.
ibis::util::sharedInt32 idxcnt
 The number of functions using the index.
double lower
 The minimum value.
std::string m_bins
 Index/binning specification.
std::string m_desc
 Free-form description of the column.
std::string m_name
 Name of the column.
bool m_sorted
 Are the column values in ascending order?
ibis::TYPE_T m_type
ibis::bitvector mask_
 The entries marked 1 are valid.
const part * thePart
 Data partition containing this column.
double upper
 The maximum value.

Friends

class indexLock
class mutexLock
class readLock
class softWriteLock
class writeLock

Detailed Description

The class to represent a column of a data partition.

FastBit represents user data as tables (each table may be divided into multiple partitions) where each table consists of a number of columns. Internally, the data values for each column are stored separately from the others. In relational algebra terms, this is equivalent to projecting out each attribute of a relation separately. It increases the efficiency of searching on a relatively small number of attributes compared to the horizontal data organization used in typical relational database systems.
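
The following is a minimal sketch of the vertical organization described above, not FastBit code. The names Row, ColumnTable, toColumns and countAbove are hypothetical, and std::vector stands in for the per-column data files.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Row-wise (horizontal) layout: all attributes of a record are stored together.
struct Row {
    std::int32_t id;
    double energy;
    std::string label;
};

// Column-wise (vertical) layout: one array per attribute (hypothetical names).
struct ColumnTable {
    std::vector<std::int32_t> id;
    std::vector<double> energy;
    std::vector<std::string> label;
};

// Project each attribute of the row-wise records into its own array.
ColumnTable toColumns(const std::vector<Row>& rows) {
    ColumnTable t;
    for (const Row& r : rows) {
        t.id.push_back(r.id);
        t.energy.push_back(r.energy);
        t.label.push_back(r.label);
    }
    return t;
}

// A search on one attribute only touches that attribute's array.
std::size_t countAbove(const std::vector<double>& column, double threshold) {
    std::size_t n = 0;
    for (double v : column) {
        if (v > threshold) ++n;
    }
    return n;
}
```

A condition on a single attribute, such as countAbove(t.energy, 1.0), reads only the energy array and never touches the id or label values.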


Constructor & Destructor Documentation

ibis::column::column ( const part *  tbl,
FILE *  file 
)

Reconstitute a column from the content of a file.

Read the basic information about a column from file.

Note:
Assume the calling program has read "Begin Property/Column" already.
A well-formed column must have a valid name, i.e., ! m_name.empty().

References ibis::BLOB, ibis::BYTE, ibis::CATEGORY, ibis::DOUBLE, ibis::FLOAT, getString(), ibis::gVerbose, ibis::INT, ibis::resource::isStringTrue(), logMessage(), ibis::LONG, lower, m_bins, m_desc, m_name, m_sorted, m_type, ibis::part::name(), ibis::SHORT, ibis::TEXT, ibis::TYPESTRING, ibis::UBYTE, ibis::UDT, ibis::UINT, ibis::ULONG, ibis::UNKNOWN_TYPE, upper, and ibis::USHORT.

ibis::column::column ( const part *  tbl,
ibis::TYPE_T  t,
const char *  name,
const char *  desc = "",
double  low = DBL_MAX,
double  high = -DBL_MAX 
)

Construct a new column of specified type.

Construct a new column object based on type and name.

References ibis::gVerbose, m_desc, m_name, m_type, ibis::part::name(), and ibis::TYPESTRING.

ibis::column::column ( const column &  rhs)

copy constructor

The copy constructor.

Note:
The rwlock can not be copied.
The index is not copied because of reference counting difficulties.

References ibis::gVerbose, m_name, m_type, ibis::part::name(), thePart, and ibis::TYPESTRING.


Member Function Documentation

void ibis::column::actualMinMax ( const char *  name,
const ibis::bitvector &  mask,
double &  min,
double &  max 
) const [protected]

Given the name of the data file, compute the actual minimum and the maximum value.

Compute the actual minimum and maximum values.

Given a data file name, read its content to compute the actual minimum and maximum of the data values. It only deals with four types of values: unsigned int, signed int, float, and double.

References ibis::BYTE, ibis::DOUBLE, ibis::FLOAT, ibis::fileManager::getFile(), ibis::gVerbose, ibis::fileManager::instance(), ibis::INT, ibis::LONG, ibis::SHORT, ibis::TYPESTRING, ibis::UBYTE, ibis::UINT, ibis::ULONG, and ibis::USHORT.

Referenced by ibis::bord::column::computeMinMax().
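
As an illustration of computing the minimum and maximum under a validity mask, here is a small sketch. It is not the FastBit implementation: std::vector<T> and std::vector<bool> stand in for array_t<T> and ibis::bitvector, and the name maskedMinMax is hypothetical. The starting values DBL_MAX and -DBL_MAX mirror the default bounds in the constructor signature above.

```cpp
#include <cfloat>
#include <cstddef>
#include <vector>

// Compute min/max over the entries whose mask bit is set.
// If no entry is valid, min stays DBL_MAX and max stays -DBL_MAX.
template <typename T>
void maskedMinMax(const std::vector<T>& vals, const std::vector<bool>& mask,
                  double& min, double& max) {
    min = DBL_MAX;
    max = -DBL_MAX;
    // Only positions covered by both the values and the mask are examined.
    const std::size_t n = vals.size() < mask.size() ? vals.size() : mask.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (!mask[i]) continue;  // skip entries marked as invalid (NULL)
        const double v = static_cast<double>(vals[i]);
        if (v < min) min = v;
        if (v > max) max = v;
    }
}
```

Because an empty or fully masked input leaves min greater than max, a caller can detect that no valid value was seen.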

long ibis::column::append ( const char *  dt,
const char *  df,
const uint32_t  nold,
const uint32_t  nnew,
uint32_t  nbuf,
char *  buf 
) [virtual]

Append new data in directory df to the end of existing data in dt.

Append the content of the file in df to the end of the file in dt.

It returns the number of rows appended or a negative number to indicate error.

Note:
The directories dt and df cannot be the same.
This function does not update the minimum and the maximum of the column.

Reimplemented in ibis::bord::column, ibis::blob, ibis::category, and ibis::text.

References ibis::bitvector::adjustSize(), ibis::index::append(), ibis::bitvector::cnt(), ibis::util::copy(), ibis::index::create(), FASTBIT_DIRSEP, ibis::fileManager::flushFile(), ibis::util::getFileSize(), ibis::index::getNRows(), ibis::gVerbose, ibis::fileManager::instance(), ibis::OID, ibis::index::print(), ibis::bitvector::read(), ibis::bitvector::size(), UnixOpen, ibis::bitvector::write(), and ibis::index::write().

Referenced by ibis::part::appendToBackup().

long ibis::column::append ( const void *  vals,
const ibis::bitvector &  msk 
) [virtual]

Append the records in vals to the current working dataset.

The 'void*' in this function follows the convention of the function getValuesArray (not writeData), i.e., for the ten fixed-size elementary data types, it is array_t<type>* and for string-valued columns it is std::vector<std::string>*.

Return the number of entries actually written to disk or a negative number to indicate error conditions.

Reimplemented in ibis::bord::column, ibis::blob, ibis::category, and ibis::text.

References ibis::BYTE, ibis::CATEGORY, ibis::DOUBLE, ibis::FLOAT, ibis::gVerbose, ibis::INT, ibis::LONG, ibis::SHORT, ibis::TEXT, ibis::TYPESTRING, ibis::UBYTE, ibis::UINT, ibis::ULONG, and ibis::USHORT.
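
The 'void*' convention described above can be pictured with the following sketch. It is an assumption-laden illustration, not the library code: ColType is a hypothetical three-value subset of ibis::TYPE_T, std::vector<T> stands in for array_t<T>, and countEntries is a made-up helper that only reports how many entries the pointer refers to.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical subset of column types, for illustration only.
enum class ColType { Int32, Double, Text };

// Interpret vals according to the column type and return the number of
// entries it holds, or a negative number to indicate an error.
long countEntries(ColType type, const void* vals) {
    if (vals == nullptr) return -1;
    switch (type) {
    case ColType::Int32:  // fixed-size type: pointer to an array of that type
        return static_cast<long>(
            static_cast<const std::vector<std::int32_t>*>(vals)->size());
    case ColType::Double:
        return static_cast<long>(
            static_cast<const std::vector<double>*>(vals)->size());
    case ColType::Text:   // string-valued column: pointer to a vector of strings
        return static_cast<long>(
            static_cast<const std::vector<std::string>*>(vals)->size());
    }
    return -2;  // unknown type
}
```

The caller is responsible for passing a pointer whose real type matches the column type; the function has no way to verify it.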

long ibis::column::appendStrings ( const std::vector< std::string > &  vals,
const ibis::bitvector &  msk 
) [protected]

Append the strings to the current data.

This function attempts to fill the existing data file with null values based on the content of the validity mask.

It then writes the strings in vals and extends the validity mask.

References ibis::bitvector::adjustSize(), FASTBIT_DIRSEP, ibis::gVerbose, and UnixOpen.

template<typename T >
long ibis::column::appendValues ( const array_t< T > &  vals,
const ibis::bitvector &  msk 
) [protected]

Append the content of incoming array to the current data.

This function attempts to fill the data file with NULL values if the existing data file is shorter than expected.

It writes the data in vals and extends the existing validity mask.

References ibis::bitvector::adjustSize(), FASTBIT_DIRSEP, ibis::gVerbose, ibis::array_t< T >::size(), and UnixOpen.

template<typename T >
long ibis::column::castAndWrite ( const array_t< double > &  vals,
ibis::bitvector &  mask,
const T  special 
)

Cast the incoming array into the specified type T before writing the values to the file for this column.

This function uses assignment statements to perform the casting operations. Warning: this function does not check that the cast values are equal to the incoming values!

References ibis::bitvector::indexSet::nIndices(), and ibis::bitvector::size().

Referenced by ibis::part::addColumn().
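
To make the warning above concrete, the sketch below casts doubles to a target type T without checking that the results equal the inputs. It is not the FastBit implementation: castValues is a hypothetical name, std::vector replaces array_t and ibis::bitvector, nothing is written to a file, and storing special for masked-out entries is an assumption about how that argument is used.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Cast each valid double to T; no check is made that the cast value equals
// the original. Entries not marked valid receive `special` (an assumption
// about the role of that argument).
template <typename T>
std::vector<T> castValues(const std::vector<double>& vals,
                          const std::vector<bool>& mask, T special) {
    std::vector<T> out(vals.size());
    for (std::size_t i = 0; i < vals.size(); ++i) {
        if (i < mask.size() && mask[i]) {
            out[i] = static_cast<T>(vals[i]);  // fractional part is silently dropped
        } else {
            out[i] = special;
        }
    }
    return out;
}
```

With T = int32_t, the input 3.7 is stored as 3, which is exactly the kind of silent change the warning refers to. Values outside the range of T should be avoided entirely, since converting them is undefined behavior in C++.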

void ibis::column::computeMinMax ( ) [virtual]

Compute the actual min/max values.

It actually goes through all the values. This function reads the data in the active data directory and modifies the member variables to record the actual min/max.

Reimplemented in ibis::bord::column, and ibis::blob.

Referenced by ibis::index::create(), ibis::column::info::info(), ibis::part::quickTest(), and ibis::part::testRangeOperators().

void ibis::column::computeMinMax ( const char *  dir) [virtual]

Compute the actual min/max values.

It actually goes through all the values. This function reads the data in the given directory and modifies the member variables to record the actual min/max.

Reimplemented in ibis::bord::column, and ibis::blob.

void ibis::column::computeMinMax ( const char *  dir,
double &  min,
double &  max 
) const [virtual]

Compute the actual min/max of the data in directory dir.

Report the actual min/max found back through output arguments min and max. This version does not modify the min/max recorded in this column object.

Reimplemented in ibis::bord::column, and ibis::blob.

int ibis::column::contractRange ( ibis::qContinuousRange &  rng) const

Contract the range expression so that the new range falls exactly on the bin boundaries.

Referenced by ibis::query::doContract().
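
The idea of snapping a range to bin boundaries can be sketched as follows; expanding a range is the mirror operation. This is an illustration under stated assumptions rather than the library's algorithm: Range is a hypothetical half-open interval lo <= x < hi, bounds is a sorted list of bin boundaries, and contractToBins and expandToBins are made-up names.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical half-open range: lo <= x < hi.
struct Range {
    double lo;
    double hi;
};

// Shrink r to the largest boundary-aligned range contained in it.
// If r lies strictly inside one bin the result has lo > hi, i.e. it is empty.
Range contractToBins(const std::vector<double>& bounds, Range r) {
    auto lo = std::lower_bound(bounds.begin(), bounds.end(), r.lo);  // first boundary >= lo
    auto hi = std::upper_bound(bounds.begin(), bounds.end(), r.hi);  // first boundary > hi
    if (lo != bounds.end()) r.lo = *lo;
    if (hi != bounds.begin()) r.hi = *(hi - 1);  // last boundary <= hi
    return r;
}

// Grow r to the smallest boundary-aligned range that contains it.
Range expandToBins(const std::vector<double>& bounds, Range r) {
    auto lo = std::upper_bound(bounds.begin(), bounds.end(), r.lo);  // first boundary > lo
    auto hi = std::lower_bound(bounds.begin(), bounds.end(), r.hi);  // first boundary >= hi
    if (lo != bounds.begin()) r.lo = *(lo - 1);  // last boundary <= lo
    if (hi != bounds.end()) r.hi = *hi;
    return r;
}
```

With boundaries 0, 10, 20, 30 and the range [5, 27), contracting yields [10, 20) and expanding yields [0, 30). A range aligned this way can be answered from whole bins without consulting the raw data.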

const char * ibis::column::dataFileName ( std::string &  fname,
const char *  dir = 0 
) const

Name of the data file in the given data directory.

If the directory name is not given, the directory is assumed to be the current data directory of the data partition. There is no need for the caller to free the pointer returned by this function. Upon successful completion of this function, it returns fname.c_str(); otherwise, it returns the nil pointer.

References FASTBIT_DIRSEP.

Referenced by ibis::part::doScan(), and ibis::part::negativeScan().

long ibis::column::estimateRange ( const ibis::qContinuousRange &  cmp,
ibis::bitvector &  low,
ibis::bitvector &  high 
) const [virtual]

Compute a lower bound and an upper bound on the number of hits using the bitmap index.

If no index is available, a new one will be built. If no index can be built, the lower bound will contain nothing and the upper bound will contain everything. The two bounds are returned as bitmaps that mark the qualified rows with ones: the lower bound is stored in 'low' and the upper bound in 'high'. If the bitvector 'high' has fewer bits than 'low', the bitvector 'low' is assumed to hold an exact solution. This function always returns zero (0).

References ibis::bitvector::adjustSize(), ibis::bitvector::copy(), ibis::gVerbose, ibis::bitvector::set(), and ibis::bitvector::size().

Referenced by ibis::part::doScan(), ibis::part::estimateRange(), and ibis::part::negativeScan().
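
The relationship between the two bounds can be illustrated with a toy binned index. Everything in this sketch is an assumption made for illustration: BinnedIndex and estimateRangeSketch are hypothetical, std::vector<bool> replaces ibis::bitvector, and the condition is fixed to lo <= x < hi. Rows in bins that lie entirely inside the range go into both bounds, and rows in bins that only overlap the range go into the upper bound alone.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a binned index: bin b covers
// [bounds[b], bounds[b + 1]) and binOf[i] is the bin holding row i.
struct BinnedIndex {
    std::vector<double> bounds;
    std::vector<std::size_t> binOf;
};

// Estimate the condition lo <= x < hi for nRows rows.
// Without an index (idx == nullptr), low marks nothing and high marks everything.
void estimateRangeSketch(const BinnedIndex* idx, std::size_t nRows,
                         double lo, double hi,
                         std::vector<bool>& low, std::vector<bool>& high) {
    low.assign(nRows, false);
    if (idx == nullptr) {
        high.assign(nRows, true);
        return;
    }
    high.assign(nRows, false);
    for (std::size_t i = 0; i < nRows && i < idx->binOf.size(); ++i) {
        const double bmin = idx->bounds[idx->binOf[i]];
        const double bmax = idx->bounds[idx->binOf[i] + 1];
        low[i] = (lo <= bmin && bmax <= hi);   // the whole bin satisfies the condition
        high[i] = (bmin < hi && lo < bmax);    // some value in the bin may satisfy it
    }
}
```

Rows marked in high but not in low are the ones whose raw values would have to be examined to obtain an exact answer.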

long ibis::column::estimateRange ( const ibis::qDiscreteRange &  cmp,
ibis::bitvector &  low,
ibis::bitvector &  high 
) const [virtual]

Compute a lower bound and an upper bound for hits.

Compute an upper bound on the number of hits.

Estimating hits for a discrete range is actually done with evaluateRange.

References ibis::bitvector::clear().

long ibis::column::estimateRange ( const ibis::qIntHod &  cmp,
ibis::bitvector &  low,
ibis::bitvector &  high 
) const [virtual]

Compute a lower bound and an upper bound for hits.

Estimating hits for a discrete range.

Does nothing useful in this implementation.

References ibis::bitvector::copy(), ibis::bitvector::set(), and ibis::bitvector::sloppyCount().

long ibis::column::estimateRange ( const ibis::qUIntHod &  cmp,
ibis::bitvector &  low,
ibis::bitvector &  high 
) const [virtual]

Compute a lower bound and an upper bound for hits.

Estimating hits for a discrete range. Does nothing in this implementation.

References ibis::bitvector::copy(), ibis::bitvector::set(), and ibis::bitvector::sloppyCount().

long ibis::column::estimateRange ( const ibis::qContinuousRange &  cmp) const [virtual]

Use the index of the column to compute an upper bound on the number of hits.

If no index can be computed, it will return the number of rows as the upper bound.

long ibis::column::estimateRange ( const ibis::qIntHod &  cmp) const [virtual]

Compute an upper bound on the number of hits.

A dummy function to estimate the number of possible hits.

It always returns the number of rows in the data partition.

long ibis::column::estimateRange ( const ibis::qUIntHod &  cmp) const [virtual]

Compute an upper bound on the number of hits.

A dummy function to estimate the number of possible hits.

It always returns the number of rows in the data partition.

long ibis::column::evaluateAndSelect ( const ibis::qContinuousRange &  cmp,
const ibis::bitvector &  mask,
void *  vals,
ibis::bitvector &  low 
) const [virtual]

Evaluate a range condition and retrieve the selected values.

This is a combination of evaluateRange and selectTypes. This combination allows some optimizations to reduce the I/O operations.

Note that the third argument vals must be a valid pointer to the correct type. The acceptable types are as follows (same as required by in-memory data partitions):

  • if the column type has a fixed size such as integers and floating-point values, vals must be a pointer to an ibis::array_t with the matching integer type or floating-point type
  • if the column type is one of the string values, such as TEXT or CATEGORY, vals must be a pointer to std::vector<std::string>.

If vals is a nil pointer, this function simply calls evaluateRange.

References ibis::bitvector::clear(), ibis::bitvector::cnt(), ibis::gVerbose, ibis::OID, ibis::bitvector::set(), ibis::bitvector::size(), ibis::TEXT, and ibis::TYPESTRING.

long ibis::column::evaluateRange ( const ibis::qContinuousRange &  cmp,
const ibis::bitvector &  mask,
ibis::bitvector &  low 
) const [virtual]

Compute the exact answer.

Attempts to use the index if one is available, otherwise use the base data.

Returns a negative value to indicate an error, 0 to indicate no hits, and a positive value to indicate that there are zero or more hits.

Reimplemented in ibis::bord::column.

References ibis::bitvector::adjustSize(), ibis::bitvector::clear(), ibis::bitvector::cnt(), ibis::bitvector::copy(), ibis::util::envLock, ibis::bitvector::flip(), ibis::gVerbose, ibis::fileManager::iBeat(), ibis::OID, ibis::bitvector::set(), ibis::bitvector::size(), ibis::bitvector::sloppyCount(), ibis::TEXT, and ibis::TYPESTRING.

Referenced by ibis::part::evaluateRange().

int ibis::column::expandRange ( ibis::qContinuousRange &  rng) const

Expand the range expression so that the new range falls exactly on the bin boundaries.

Referenced by ibis::query::doExpand().

template<typename T >
uint32_t ibis::column::findLower ( int  fdes,
const uint32_t  nr,
const T  tgt 
) const [protected]

Find the smallest value >= tgt.

An equivalent of array_t<T>::find.

It reads the open file one word at a time and therefore is likely to be very slow.

References ibis::gVerbose, ibis::fileManager::instance(), and ibis::fileManager::recordPages().

virtual const char* ibis::column::findString ( const char *  ) const [inline, virtual]

Determine if the input string has appeared in this data partition.

If yes, return the pointer to the incoming string, otherwise return nil.

Reimplemented in ibis::text.

template<typename T >
uint32_t ibis::column::findUpper ( int  fdes,
const uint32_t  nr,
const T  tgt 
) const [protected]

Find the smallest value > tgt.

An equivalent of array_t<T>::find_upper.

It reads the open file one word at a time and therefore is likely to be very slow.

References ibis::gVerbose, ibis::fileManager::instance(), and ibis::fileManager::recordPages().
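
The two searches documented above (findLower and findUpper) behave like binary searches that fetch one element per probe. The sketch below shows that idea under several assumptions: it reads from a std::istream instead of a file descriptor, it assumes the return value is the position of the value found (nr when there is none), and the names readAt, findLowerSketch, findUpperSketch and packBinary are hypothetical.

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Read element i from a binary stream of tightly packed T values.
template <typename T>
bool readAt(std::istream& in, std::uint32_t i, T& out) {
    in.clear();
    in.seekg(static_cast<std::streamoff>(i) *
             static_cast<std::streamoff>(sizeof(T)));
    in.read(reinterpret_cast<char*>(&out), sizeof(T));
    return static_cast<bool>(in);
}

// Position of the first value >= tgt among nr sorted values; nr if none.
template <typename T>
std::uint32_t findLowerSketch(std::istream& in, std::uint32_t nr, T tgt) {
    std::uint32_t lo = 0, hi = nr;
    while (lo < hi) {
        const std::uint32_t mid = lo + (hi - lo) / 2;
        T v;
        if (!readAt(in, mid, v)) return nr;  // read failure: report "not found"
        if (v < tgt) lo = mid + 1; else hi = mid;
    }
    return lo;
}

// Position of the first value > tgt among nr sorted values; nr if none.
template <typename T>
std::uint32_t findUpperSketch(std::istream& in, std::uint32_t nr, T tgt) {
    std::uint32_t lo = 0, hi = nr;
    while (lo < hi) {
        const std::uint32_t mid = lo + (hi - lo) / 2;
        T v;
        if (!readAt(in, mid, v)) return nr;
        if (!(tgt < v)) lo = mid + 1; else hi = mid;
    }
    return lo;
}

// Demo helper: pack values into the byte layout the searches expect.
template <typename T>
std::string packBinary(const std::vector<T>& vals) {
    return std::string(reinterpret_cast<const char*>(vals.data()),
                       vals.size() * sizeof(T));
}
```

Each probe costs one seek and one small read, which is why a search of this kind is slow compared with searching an array already in memory.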

double ibis::column::getActualMax ( ) const [virtual]

Compute the actual maximum value by reading the data or examining the index.

It returns -DBL_MAX in case of error.

Reimplemented in ibis::blob.

Referenced by ibis::whereClause::amplify(), ibis::part::coarsenBins(), ibis::part::get1DDistribution(), ibis::part::get2DDistributionI(), ibis::part::get2DDistributionU(), ibis::part::getActualMax(), and ibis::mensa::getColumnMax().

double ibis::column::getActualMin ( ) const [virtual]

A group of functions to compute some basic statistics for the attribute values.

Compute the actual minimum value by reading the data or examining the index. It returns DBL_MAX in case of error.

Reimplemented in ibis::blob.

Referenced by ibis::whereClause::amplify(), ibis::part::get1DDistribution(), ibis::part::get2DDistributionI(), ibis::part::get2DDistributionU(), ibis::part::getActualMin(), and ibis::mensa::getColumnMin().

long ibis::column::getCumulativeDistribution ( std::vector< double > &  bounds,
std::vector< uint32_t > &  counts 
) const

Compute the actual data distribution.

It will generate an index for the column if one is not already available. The value in counts[i] is the number of values less than bounds[i]. If there are no NULL values in the column, the array counts will start with 0 and end with the number of rows in the data. The array bounds will end with a value that is greater than the actual maximum value.
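
The meaning of the two output arrays can be checked with a small sketch that computes the counts directly from the values. It is not how the library works (the text above says an index is used); cumulativeCounts is a hypothetical helper that takes the boundaries as input.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// For each boundary b, count the values strictly less than b.
std::vector<std::uint32_t> cumulativeCounts(std::vector<double> values,
                                            const std::vector<double>& bounds) {
    std::sort(values.begin(), values.end());
    std::vector<std::uint32_t> counts;
    counts.reserve(bounds.size());
    for (double b : bounds) {
        // lower_bound points at the first value >= b, so its offset is
        // the number of values < b.
        counts.push_back(static_cast<std::uint32_t>(
            std::lower_bound(values.begin(), values.end(), b) - values.begin()));
    }
    return counts;
}
```

For the values 1, 2, 2, 5 and the boundaries 1, 2, 3, 6, the counts are 0, 1, 3, 4: they start at 0 because the first boundary equals the minimum, and end at the number of rows because the last boundary exceeds the maximum.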

long ibis::column::getDistribution ( std::vector< double > &  bbs,
std::vector< uint32_t > &  counts 
) const

Count the number of records in each bin.

The array bbs contains bin boundaries that define the following bins:

    (..., bbs[0]) [bbs[0], bbs[1]) ... [bbs.back(), ...).

Because of the two open bins at the end, N bin boundaries define N+1 bins. The array counts has one more element than bbs. This function returns the number of bins. If this function executes successfully, the return value should be the same as the size of array counts, and one larger than the size of array bbs.
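
The binning rule above can be sketched with a standard binary search over the boundaries. The helper binCounts is hypothetical and operates on in-memory values. It only illustrates how N boundaries produce N+1 counts with left-closed, right-open bins.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of FastBit.
// Returns bbs.size() + 1 counts for the bins
// (..., bbs[0]) [bbs[0], bbs[1]) ... [bbs.back(), ...).
std::vector<std::uint32_t> binCounts(const std::vector<double>& values,
                                     const std::vector<double>& bbs) {
    assert(std::is_sorted(bbs.begin(), bbs.end()));
    std::vector<std::uint32_t> counts(bbs.size() + 1, 0);
    for (double v : values) {
        // upper_bound finds the first boundary > v.  Its offset is the bin:
        // 0 when v < bbs[0], and bbs.size() when v >= bbs.back().
        const std::size_t bin = static_cast<std::size_t>(
            std::upper_bound(bbs.begin(), bbs.end(), v) - bbs.begin());
        ++counts[bin];
    }
    return counts;
}
```

A value equal to a boundary lands in the bin that starts at that boundary, which matches the half-open intervals in the description.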

Referenced by ibis::part::get1DDistribution().

ibis::array_t< int32_t > * ibis::column::getIntArray ( ) const

Return all rows of the column as an array_t object.

Caller is responsible for deleting the returned object.

References ibis::fileManager::getFile(), ibis::fileManager::instance(), ibis::INT, and ibis::UINT.

virtual void ibis::column::getString ( uint32_t  ,
std::string &   
) const [inline, virtual]

Return the string value for the ith row.

Only implemented for ibis::text and ibis::category.

See also:
ibis::text

Reimplemented in ibis::bord::column, ibis::category, and ibis::text.

Referenced by column().

float ibis::column::getUndecidable ( const ibis::qContinuousRange cmp,
ibis::bitvector iffy 
) const [virtual]

Compute the locations of the rows that cannot be decided by the index.

Returns the fraction of rows that might satisfy the specified range condition. If there is no index, nothing can be decided.

Referenced by ibis::part::getUndecidable().

float ibis::column::getUndecidable ( const ibis::qIntHod cmp,
ibis::bitvector iffy 
) const [virtual]

Find rows that cannot be decided with the existing index.

A dummy implementation.

It always returns 1.0 to indicate that every row is undecidable.

float ibis::column::getUndecidable ( const ibis::qUIntHod cmp,
ibis::bitvector iffy 
) const [virtual]

Find rows that cannot be decided with the existing index.

A dummy implementation.

It always returns 1.0 to indicate that every row is undecidable.

int ibis::column::getValuesArray ( void *  vals) const [virtual]
uint32_t ibis::column::indexedRows ( ) const

Compute the number of rows captured by the index of this column.

This function loads the metadata about the index into memory through ibis::column::indexLock.

long ibis::column::indexSize ( ) const [virtual]

Compute the index size (in bytes).

Return a negative value if the index file does not exist.

Reimplemented in ibis::blob.

References ibis::util::getFileSize().

Referenced by ibis::part::get2DDistribution().

Perform a set of built-in tests to determine the speed of common operations.

void ibis::column::isSorted ( bool  iss)

Change the flag m_sorted.

If the flag m_sorted is set to true, the caller should have sorted the data file. An incorrect flag will lead to wrong answers to queries. This operation invokes a write lock on the column object.

void ibis::column::loadIndex ( const char *  iopt = 0,
int  ropt = 0 
) const throw () [virtual]

Load the index associated with the column.

Parameters:
iopt  This option is passed to ibis::index::create to be used if a new index is to be created.
ropt  This option is passed to ibis::index::create to control the reading operations for reconstituting the index object from an index file.
Note:
Accesses to this function are serialized through a write lock on the column.

Reimplemented in ibis::blob, ibis::category, and ibis::text.

References ibis::index::create(), ibis::index::getMax(), ibis::index::getMin(), ibis::index::getNRows(), ibis::gParameters(), ibis::gVerbose, name(), ibis::index::name(), and ibis::index::print().

Referenced by ibis::bord::bord(), fastbit_build_index(), ibis::column::indexLock::indexLock(), and ibis::text::loadIndex().

const char * ibis::column::nullMaskName ( std::string &  fname) const

Name of the NULL mask file.

On successful completion of this function, the return value is the result of fname.c_str(); otherwise the return value is a nil pointer to indicate error.

References FASTBIT_DIRSEP.

uint32_t ibis::column::numBins ( ) const

Retrieve the number of bins used.

long ibis::column::saveSelected ( const ibis::bitvector sel,
const char *  dest,
char *  buf,
uint32_t  nbuf 
) [virtual]

Write the selected records to the specified directory.

Save only the rows marked 1. Replace the data file in dest. Return the number of rows written to the new file or a negative number to indicate error.

Reimplemented in ibis::text.

References ibis::fileManager::buffer< T >::address(), FASTBIT_DIRSEP, ibis::fileManager::flushFile(), ibis::gVerbose, ibis::fileManager::instance(), ibis::bitvector::indexSet::nIndices(), ibis::fileManager::buffer< T >::size(), and ibis::bitvector::subset().

Referenced by ibis::part::purgeInactive().

template<typename T >
int ibis::column::searchSortedOOCC ( const char *  fname,
const ibis::qContinuousRange rng,
ibis::bitvector hits 
) const [protected]

Resolve a continuous range condition using file operations.

The backup option for searchSortedIC.

This function opens the named file and reads its content one word at a time, which is likely to be very slow. It assumes the content of the file is sorted in ascending order and performs binary searches.
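
To show why sorted data lets a range condition be answered with binary searches, here is a minimal in-memory sketch. The helper rangeHitsSorted is hypothetical. It assumes a condition of the form lo <= x < hi, whereas an ibis::qContinuousRange may use other comparison operators on either side, and it uses std::vector<bool> in place of ibis::bitvector.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper, not part of FastBit.
// Marks the positions i with lo <= sorted[i] < hi.
std::vector<bool> rangeHitsSorted(const std::vector<double>& sorted,
                                  double lo, double hi) {
    assert(std::is_sorted(sorted.begin(), sorted.end()));
    // Two binary searches bound the run of qualifying rows.
    const auto first = std::lower_bound(sorted.begin(), sorted.end(), lo);
    const auto last = std::lower_bound(first, sorted.end(), hi);
    std::vector<bool> hits(sorted.size(), false);
    std::fill(hits.begin() + (first - sorted.begin()),
              hits.begin() + (last - sorted.begin()), true);
    return hits;
}
```

Because the hits form one contiguous run, only the two end points need to be searched for. The rest of the result is a single fill.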

References ibis::bitvector::adjustSize(), ibis::bitvector::appendFill(), ibis::bitvector::clear(), ibis::gVerbose, ibis::fileManager::instance(), ibis::qContinuousRange::leftBound(), ibis::fileManager::recordPages(), ibis::qContinuousRange::rightBound(), ibis::util::round_up(), ibis::bitvector::set(), ibis::bitvector::sloppyCount(), and UnixOpen.

template<typename T >
int ibis::column::searchSortedOOCD ( const char *  fname,
const ibis::qDiscreteRange rng,
ibis::bitvector hits 
) const [protected]

Resolve a discrete range condition using file operations.

This version of the search function reads the content of the data file through explicit read operations.

It sequentially reads the content of the data file. Note the content of the data file is assumed to be sorted in ascending order as elementary data type T.

References ibis::fileManager::buffer< T >::address(), ibis::bitvector::adjustSize(), ibis::bitvector::clear(), ibis::qDiscreteRange::colName(), ibis::qDiscreteRange::getValues(), ibis::gVerbose, ibis::fileManager::instance(), ibis::fileManager::recordPages(), ibis::bitvector::reserve(), ibis::bitvector::setBit(), ibis::array_t< T >::size(), ibis::fileManager::buffer< T >::size(), and UnixOpen.

template<typename T >
int ibis::column::searchSortedOOCD ( const char *  fname,
const ibis::qIntHod rng,
ibis::bitvector hits 
) const [protected]

Resolve a discrete range condition using file operations.

This version of the search function reads the content of the data file through explicit read operations.

It sequentially reads the content of the data file. Note the content of the data file is assumed to be sorted in ascending order as elementary data type T.

References ibis::fileManager::buffer< T >::address(), ibis::bitvector::adjustSize(), ibis::bitvector::clear(), ibis::qIntHod::colName(), ibis::qIntHod::getValues(), ibis::gVerbose, ibis::fileManager::instance(), ibis::fileManager::recordPages(), ibis::bitvector::reserve(), ibis::bitvector::setBit(), ibis::array_t< T >::size(), ibis::fileManager::buffer< T >::size(), and UnixOpen.

template<typename T >
int ibis::column::searchSortedOOCD ( const char *  fname,
const ibis::qUIntHod rng,
ibis::bitvector hits 
) const [protected]

Resolve a discrete range condition using file operations.

This version of the search function reads the content of the data file through explicit read operations.

It sequentially reads the content of the data file. Note the content of the data file is assumed to be sorted in ascending order as elementary data type T.

References ibis::fileManager::buffer< T >::address(), ibis::bitvector::adjustSize(), ibis::bitvector::clear(), ibis::qUIntHod::colName(), ibis::qUIntHod::getValues(), ibis::gVerbose, ibis::fileManager::instance(), ibis::fileManager::recordPages(), ibis::bitvector::reserve(), ibis::bitvector::setBit(), ibis::array_t< T >::size(), ibis::fileManager::buffer< T >::size(), and UnixOpen.

ibis::array_t< signed char > * ibis::column::selectBytes ( const bitvector mask) const [virtual]
ibis::array_t< double > * ibis::column::selectDoubles ( const bitvector mask) const [virtual]

Put the selected values into an array as doubles.

Note:
Any numerical value can be converted to a double; however, for 64-bit integers this conversion may cause loss of precision.
The caller is responsible for freeing the returned array from any of the selectTypes functions.
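
The precision caveat for 64-bit integers can be checked with a small stand-alone test. The helper survivesDoubleRoundTrip is hypothetical and assumes the usual IEEE 754 double with a 53-bit significand.

```cpp
#include <cstdint>

// Hypothetical helper, not part of FastBit.
// Returns true when v is unchanged after a trip through double.
bool survivesDoubleRoundTrip(std::int64_t v) {
    const double d = static_cast<double>(v);
    // 2^63 does not fit in int64_t, so casting it back would be undefined.
    if (d >= 9223372036854775808.0) return false;
    return static_cast<std::int64_t>(d) == v;
}
```

Under that assumption, every integer of magnitude up to 2^53 (9007199254740992) survives, while 2^53 + 1 does not.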

Reimplemented in ibis::bord::column, and ibis::blob.

References ibis::BYTE, ibis::CATEGORY, ibis::bitvector::cnt(), ibis::horometer::CPUTime(), ibis::DOUBLE, ibis::FLOAT, ibis::fileManager::getFile(), ibis::gVerbose, ibis::bitvector::indexSet::indices(), ibis::fileManager::instance(), ibis::INT, ibis::bitvector::indexSet::isRange(), ibis::bitvector::indexSet::nIndices(), ibis::horometer::realTime(), ibis::SHORT, ibis::array_t< T >::size(), ibis::bitvector::size(), ibis::horometer::start(), ibis::horometer::stop(), ibis::UBYTE, ibis::UINT, and ibis::USHORT.

Referenced by ibis::bord::column::append(), ibis::part::fill2DBins2(), ibis::part::fill2DBinsWeighted2(), ibis::part::fill3DBins2(), ibis::part::fill3DBins3(), ibis::part::fill3DBinsWeighted2(), ibis::part::fill3DBinsWeighted3(), ibis::part::get1DBins(), ibis::part::get1DBins_(), ibis::part::get1DDistribution(), ibis::part::get2DBins(), ibis::part::get2DDistribution(), ibis::part::get2DDistributionA(), ibis::part::get2DDistributionU(), ibis::part::get3DBins(), ibis::part::get3DDistribution(), ibis::part::get3DDistributionA(), ibis::part::get3DDistributionA1(), ibis::part::get3DDistributionA2(), ibis::part::getCumulativeDistribution(), ibis::part::getDistribution(), ibis::part::getJointDistribution(), ibis::part::old2DDistribution(), and ibis::part::selectDoubles().

ibis::array_t< float > * ibis::column::selectFloats ( const bitvector mask) const [virtual]

Put selected values of a float column into an array.

Note:
Only safe conversions are performed. Conversions from 32-bit integers, 64-bit integers and 64-bit floating-point values are not allowed. A nil array will be returned if the current column cannot be converted.
The caller is responsible for freeing the returned array from any of the selectTypes functions.

Reimplemented in ibis::bord::column, and ibis::blob.

References ibis::BYTE, ibis::bitvector::cnt(), ibis::horometer::CPUTime(), ibis::FLOAT, ibis::fileManager::getFile(), ibis::gVerbose, ibis::bitvector::indexSet::indices(), ibis::fileManager::instance(), ibis::bitvector::indexSet::isRange(), ibis::bitvector::indexSet::nIndices(), ibis::horometer::realTime(), ibis::SHORT, ibis::array_t< T >::size(), ibis::bitvector::size(), ibis::horometer::start(), ibis::horometer::stop(), ibis::UBYTE, and ibis::USHORT.

Referenced by ibis::bord::column::append(), ibis::part::fill2DBins2(), ibis::part::fill2DBinsWeighted2(), ibis::part::fill3DBins2(), ibis::part::fill3DBins3(), ibis::part::fill3DBinsWeighted2(), ibis::part::fill3DBinsWeighted3(), ibis::part::get1DBins(), ibis::part::get1DBins_(), ibis::part::get1DDistribution(), ibis::part::get2DBins(), ibis::part::get2DDistribution(), ibis::part::get2DDistributionA(), ibis::part::get2DDistributionU(), ibis::part::get3DBins(), ibis::part::get3DDistribution(), ibis::part::get3DDistributionA(), ibis::part::get3DDistributionA1(), ibis::part::get3DDistributionA2(), ibis::part::getCumulativeDistribution(), ibis::part::getDistribution(), ibis::part::getJointDistribution(), ibis::part::old2DDistribution(), and ibis::part::selectFloats().

ibis::array_t< int32_t > * ibis::column::selectInts ( const bitvector mask) const [virtual]
ibis::array_t< int64_t > * ibis::column::selectLongs ( const bitvector mask) const [virtual]

Return selected rows of the column in an array_t object.

Can be called on all integral types. Note that 64-bit unsigned integers are simply treated as signed integers. This may cause the values to be interpreted incorrectly. Shorter unsigned integer types are treated correctly as positive values.
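
The difference between the 64-bit unsigned case and the shorter unsigned types can be seen in a few lines. Both helpers below are hypothetical. They only demonstrate the effect of reusing a bit pattern as a signed value versus widening a shorter unsigned value.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers, not part of FastBit.

// Reuse the 64 bits of an unsigned value as a signed value.
std::int64_t reinterpretAsSigned(std::uint64_t u) {
    std::int64_t s;
    std::memcpy(&s, &u, sizeof s);  // copy the bit pattern unchanged
    return s;
}

// Widening a 32-bit unsigned value always preserves it.
std::int64_t widenUInt32(std::uint32_t u) {
    return static_cast<std::int64_t>(u);
}
```

Any unsigned 64-bit value of 2^63 or more comes out negative, while every 32-bit unsigned value keeps its magnitude after widening.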

Note:
The caller is responsible for freeing the returned array from any of the selectTypes functions.

Reimplemented in ibis::bord::column, ibis::blob, and ibis::text.

References ibis::BYTE, ibis::CATEGORY, ibis::bitvector::cnt(), ibis::horometer::CPUTime(), ibis::fileManager::getFile(), ibis::gVerbose, ibis::bitvector::indexSet::indices(), ibis::fileManager::instance(), ibis::INT, ibis::bitvector::indexSet::isRange(), ibis::LONG, ibis::bitvector::indexSet::nIndices(), ibis::horometer::realTime(), ibis::SHORT, ibis::array_t< T >::size(), ibis::bitvector::size(), ibis::horometer::start(), ibis::horometer::stop(), ibis::TEXT, ibis::UBYTE, ibis::UINT, ibis::ULONG, and ibis::USHORT.

Referenced by ibis::bord::column::append(), ibis::part::fill2DBins2(), ibis::part::fill2DBinsWeighted2(), ibis::part::fill3DBins2(), ibis::part::fill3DBins3(), ibis::part::fill3DBinsWeighted2(), ibis::part::fill3DBinsWeighted3(), ibis::part::get1DBins(), ibis::part::get1DBins_(), ibis::part::get1DDistribution(), ibis::part::get2DBins(), ibis::part::get2DDistribution(), ibis::part::get2DDistributionA(), ibis::part::get2DDistributionU(), ibis::part::get3DBins(), ibis::part::get3DDistribution(), ibis::part::get3DDistributionA(), ibis::part::get3DDistributionA1(), ibis::part::get3DDistributionA2(), and ibis::part::selectLongs().

ibis::array_t< int16_t > * ibis::column::selectShorts ( const bitvector mask) const [virtual]
std::vector< std::string > * ibis::column::selectStrings ( const bitvector mask) const [virtual]

Return the selected rows as strings.

This version returns a std::vector<std::string>, which provides wholly self-contained string values. It may take more memory than necessary, and the memory usage of std::string is not tracked by FastBit. The advantage is that it should work regardless of the actual data type of the column.

Reimplemented in ibis::bord::column, ibis::blob, ibis::category, and ibis::text.

References ibis::BYTE, ibis::bitvector::cnt(), ibis::DOUBLE, ibis::FLOAT, ibis::gVerbose, ibis::INT, ibis::LONG, ibis::OID, ibis::SHORT, ibis::TYPESTRING, ibis::UBYTE, ibis::UINT, ibis::ULONG, and ibis::USHORT.

Referenced by ibis::bord::column::append(), ibis::colStrings::colStrings(), ibis::bord::evaluateTerms(), and ibis::part::selectStrings().

ibis::array_t< unsigned char > * ibis::column::selectUBytes ( const bitvector mask) const [virtual]
ibis::array_t< uint32_t > * ibis::column::selectUInts ( const bitvector mask) const [virtual]

Return selected rows of the column in an array_t object.

Can be called on columns of unsigned integral types, UINT, CATEGORY, USHORT, and UBYTE.

Note:
The caller is responsible for freeing the returned array from any of the selectTypes functions.

Reimplemented in ibis::bord::column, ibis::blob, ibis::category, and ibis::text.

References ibis::CATEGORY, ibis::bitvector::cnt(), ibis::horometer::CPUTime(), ibis::fileManager::getFile(), ibis::gVerbose, ibis::bitvector::indexSet::indices(), ibis::fileManager::instance(), ibis::bitvector::indexSet::isRange(), ibis::bitvector::indexSet::nIndices(), ibis::horometer::realTime(), ibis::array_t< T >::size(), ibis::bitvector::size(), ibis::horometer::start(), ibis::horometer::stop(), ibis::TEXT, ibis::UBYTE, ibis::UINT, and ibis::USHORT.

Referenced by ibis::bord::column::append(), ibis::part::fill2DBins2(), ibis::part::fill2DBinsWeighted2(), ibis::part::fill3DBins2(), ibis::part::fill3DBins3(), ibis::part::fill3DBinsWeighted2(), ibis::part::fill3DBinsWeighted3(), ibis::part::get1DBins(), ibis::part::get1DBins_(), ibis::part::get1DDistribution(), ibis::part::get2DBins(), ibis::part::get2DDistribution(), ibis::part::get2DDistributionA(), ibis::part::get2DDistributionU(), ibis::part::get3DBins(), ibis::part::get3DDistribution(), ibis::part::get3DDistributionA(), ibis::part::get3DDistributionA1(), ibis::part::get3DDistributionA2(), ibis::part::getCumulativeDistribution(), ibis::part::getDistribution(), ibis::part::getJointDistribution(), ibis::part::old2DDistribution(), and ibis::part::selectUInts().

ibis::array_t< uint64_t > * ibis::column::selectULongs ( const bitvector mask) const [virtual]
ibis::array_t< uint16_t > * ibis::column::selectUShorts ( const bitvector mask) const [virtual]
long ibis::column::selectValues ( const bitvector mask,
void *  vals 
) const

Return selected rows of the column in an array_t object.

The caller must provide the correct array_t<type>* for vals! No type casting is performed in this function. Only elementary numerical types are supported.

References ibis::BYTE, ibis::CATEGORY, ibis::DOUBLE, ibis::FLOAT, ibis::gVerbose, ibis::INT, ibis::LONG, ibis::OID, ibis::SHORT, ibis::TYPESTRING, ibis::UBYTE, ibis::UINT, ibis::ULONG, and ibis::USHORT.

Referenced by ibis::bord::column::append(), ibis::bord::evaluateTerms(), ibis::mensa::cursor::fillBuffer(), ibis::part::selectValues(), ibis::query::sortEquiJoin(), and ibis::query::sortRangeJoin().

long ibis::column::selectValues ( const bitvector mask,
void *  vals,
ibis::array_t< uint32_t > &  inds 
) const

Return selected rows of the column in an array_t object along with their positions.

The caller must provide the correct array_t<type>* for vals! No type casting is performed in this function. Only elementary numerical types are supported.

References ibis::BYTE, ibis::CATEGORY, ibis::DOUBLE, ibis::FLOAT, ibis::gVerbose, ibis::INT, ibis::LONG, ibis::OID, ibis::SHORT, ibis::TYPESTRING, ibis::UBYTE, ibis::UINT, ibis::ULONG, and ibis::USHORT.

template<typename T >
long ibis::column::selectValuesT ( const char *  dfn,
const bitvector mask,
ibis::array_t< T > &  vals 
) const [protected]

Select values marked in the bitvector mask.

Pack them into the output array vals.

Upon successful execution, it returns the number of values selected. If it returns zero (0), the contents of vals are not modified. If it returns a negative number, the contents of array vals are not guaranteed to be in any particular state.

References ibis::array_t< T >::clear(), ibis::bitvector::cnt(), ibis::fileManager::getFile(), ibis::util::getFileSize(), ibis::gVerbose, ibis::fileManager::instance(), ibis::bitvector::indexSet::nIndices(), ibis::array_t< T >::push_back(), ibis::fileManager::recordPages(), ibis::array_t< T >::reserve(), ibis::array_t< T >::resize(), ibis::array_t< T >::size(), ibis::bitvector::size(), ibis::fileManager::tryGetFile(), and UnixOpen.

template<typename T >
long ibis::column::selectValuesT ( const char *  dfn,
const bitvector mask,
ibis::array_t< T > &  vals,
ibis::array_t< uint32_t > &  inds 
) const [protected]

Select the values marked in the bitvector mask.

Pack them into the output array vals and fill the array inds with the positions of the values selected.

Upon successful execution, it returns the number of values selected. If it returns zero (0), the contents of vals and inds are not modified. If it returns a negative number, the contents of arrays vals and inds are not guaranteed to be in any particular state.
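
The return-value contract above can be sketched on in-memory data. The helper selectWithPositions is hypothetical. It uses std::vector in place of ibis::array_t and ibis::bitvector, and it treats a mask whose size differs from the column as an error, which is an assumption rather than documented behavior.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of FastBit.
// Packs the values marked in mask into vals and their positions into inds.
template <typename T>
long selectWithPositions(const std::vector<T>& column,
                         const std::vector<bool>& mask,
                         std::vector<T>& vals,
                         std::vector<std::uint32_t>& inds) {
    if (mask.size() != column.size()) return -1;  // assumed error condition
    std::vector<T> v;
    std::vector<std::uint32_t> p;
    for (std::size_t i = 0; i < column.size(); ++i) {
        if (mask[i]) {
            v.push_back(column[i]);
            p.push_back(static_cast<std::uint32_t>(i));
        }
    }
    if (v.empty()) return 0;  // nothing selected: leave vals and inds untouched
    vals.swap(v);
    inds.swap(p);
    return static_cast<long>(vals.size());
}
```

Building the results in temporaries and swapping at the end is one way to keep the outputs untouched when nothing is selected.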

References ibis::array_t< T >::clear(), ibis::bitvector::cnt(), ibis::util::getFileSize(), ibis::gVerbose, ibis::fileManager::instance(), ibis::bitvector::indexSet::nIndices(), ibis::array_t< T >::push_back(), ibis::fileManager::recordPages(), ibis::array_t< T >::reserve(), ibis::array_t< T >::resize(), ibis::array_t< T >::size(), ibis::bitvector::size(), ibis::fileManager::tryGetFile(), and UnixOpen.

int ibis::column::setNullMask ( const bitvector msk)

Change the null mask to the user specified one.

The incoming mask should have as many bits as the number of rows in the data partition. Upon a successful completion of this function, the return value is >= 0, otherwise it is less than 0.

References ibis::gVerbose, and ibis::bitvector::size().

Referenced by ibis::tafel::toTable().

long ibis::column::string2int ( int  fptr,
dictionary dic,
uint32_t  nbuf,
char *  buf,
array_t< uint32_t > &  out 
) const [protected]

Convert strings in the opened file to a list of integers with the aid of a dictionary.

The return value indicates the outcome of the read:

  • return 0 if there are no more elements in the file.
  • return a positive value if more bytes remain in the file.
  • return a negative value if an error is encountered during the read operation.
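
The dictionary-based conversion can be sketched without any file handling. The helper encodeWithDictionary is hypothetical. It uses std::unordered_map in place of ibis::dictionary and assigns codes starting at 0 in order of first appearance, which is a choice made for this illustration. The numbering FastBit uses is not stated here.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper, not part of FastBit.
// Replaces each string with its integer code, adding unseen strings to dic.
std::vector<std::uint32_t> encodeWithDictionary(
    const std::vector<std::string>& strs,
    std::unordered_map<std::string, std::uint32_t>& dic) {
    std::vector<std::uint32_t> out;
    out.reserve(strs.size());
    for (const std::string& s : strs) {
        // emplace inserts only when s is new.  Either way the returned
        // iterator points at the entry for s.
        const auto it =
            dic.emplace(s, static_cast<std::uint32_t>(dic.size())).first;
        out.push_back(it->second);
    }
    return out;
}
```

Repeated strings map to the same code, so the output needs one fixed-size integer per row regardless of string length.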

References ibis::array_t< T >::clear(), ibis::gVerbose, ibis::dictionary::insert(), ibis::fileManager::instance(), ibis::array_t< T >::push_back(), ibis::fileManager::recordPages(), and ibis::array_t< T >::size().

long ibis::column::truncateData ( const char *  dir,
uint32_t  nent,
ibis::bitvector mask 
) const [virtual]

Truncate the number of records in the named dir to nent.

It truncates the file if the current file has more than nent entries, and it appends NULL values if the current file is shorter. The null mask is adjusted accordingly.
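
The two cases can be sketched on in-memory data. The helper truncateOrPad is hypothetical. It assumes that a mask bit of 1 marks a valid row and 0 marks a NULL, and it uses std::vector in place of the data file and ibis::bitvector.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of FastBit.
// Makes rows and mask exactly nent entries long.
template <typename T>
void truncateOrPad(std::vector<T>& rows, std::vector<bool>& mask,
                   std::uint32_t nent) {
    assert(rows.size() == mask.size());
    rows.resize(nent);         // drops extra rows or appends placeholder values
    mask.resize(nent, false);  // appended rows are flagged as NULL (assumed 0)
}
```

Growing a three-row column to five rows leaves the first three mask bits set and adds two cleared bits. Shrinking it simply drops the trailing rows and bits.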

References ibis::bitvector::adjustSize(), ibis::CATEGORY, ibis::bitvector::cnt(), FASTBIT_DIRSEP, ibis::fileManager::flushFile(), ibis::fileManager::getFile(), ibis::util::getFileSize(), ibis::gVerbose, ibis::fileManager::instance(), ibis::bitvector::size(), ibis::TEXT, and ibis::bitvector::write().

ibis::TYPE_T ibis::column::type ( ) const [inline]

Type of the data.

Note:
The type shall not be changed.

References m_type.

Referenced by ibis::bord::append(), ibis::bord::backup(), ibis::bord::bord(), ibis::colStrings::colStrings(), ibis::bord::column::column(), ibis::part::combineNames(), ibis::bord::copyColumn(), ibis::part::countHits(), ibis::colValues::create(), ibis::index::create(), ibis::mensa::cursor::cursor(), ibis::bord::cursor::cursor(), ibis::direkte::direkte(), ibis::part::doScan(), ibis::bord::evaluateTerms(), ibis::part::fill2DBins2(), ibis::part::fill2DBinsWeighted2(), ibis::part::fill3DBins2(), ibis::part::fill3DBins3(), ibis::part::fill3DBinsWeighted2(), ibis::part::fill3DBinsWeighted3(), ibis::part::get1DBins(), ibis::part::get1DBins_(), ibis::part::get1DDistribution(), ibis::part::get2DBins(), ibis::part::get2DDistribution(), ibis::part::get2DDistributionA(), ibis::part::get2DDistributionU(), ibis::part::get3DBins(), ibis::part::get3DDistribution(), ibis::part::get3DDistributionA(), ibis::part::get3DDistributionA1(), ibis::part::get3DDistributionA2(), ibis::bord::getColumnAsBytes(), ibis::bord::getColumnAsDoubles(), ibis::bord::getColumnAsFloats(), ibis::bord::getColumnAsInts(), ibis::bord::getColumnAsLongs(), ibis::bord::getColumnAsShorts(), ibis::bord::getColumnAsStrings(), ibis::bord::getColumnAsUBytes(), ibis::bord::getColumnAsUInts(), ibis::bord::getColumnAsULongs(), ibis::bord::getColumnAsUShorts(), ibis::part::getJointDistribution(), ibis::bord::groupbya(), ibis::jNatural::jNatural(), ibis::keywords::keywords(), ibis::part::lookforString(), ibis::bord::merge(), ibis::bord::merge10(), ibis::bord::merge11(), ibis::bord::merge12(), ibis::bord::merge20(), ibis::bord::merge20T1(), ibis::bord::merge21(), ibis::bord::merge21T1(), ibis::bord::merge21T2(), ibis::part::negativeScan(), ibis::part::old2DDistribution(), ibis::part::quickTest(), ibis::part::readMetaData(), ibis::part::recursiveQuery(), ibis::relic::relic(), ibis::jNatural::select(), ibis::jRange::select(), ibis::query::sortEquiJoin(), ibis::query::sortRangeJoin(), ibis::whereClause::verifyExpr(), and 
ibis::tafel::write().

void ibis::column::write ( FILE *  file) const [virtual]

Write the metadata entry.

Write the current content to the metadata file -part.txt of the data partition.

Reimplemented in ibis::blob, ibis::category, and ibis::text.

References ibis::BYTE, ibis::DOUBLE, ibis::FLOAT, ibis::INT, ibis::LONG, ibis::SHORT, ibis::TYPESTRING, ibis::UBYTE, ibis::UINT, ibis::ULONG, and ibis::USHORT.

long ibis::column::writeData ( const char *  dir,
uint32_t  nold,
uint32_t  nnew,
ibis::bitvector mask,
const void *  va1,
void *  va2 = 0 
) [virtual]

Write the content in array va1 to directory dir.

Extend the mask. The void* is internally cast into a pointer to the fixed-size elementary data types according to the type of column. Therefore, there is no way this function can handle string values.

  • Normally: record the content in array va1 to the directory dir.
  • Special case 1: the OID column writes the second array va2 only.
  • Special case 2: for string values, va2 is recast to be the number of bytes in va1.

Returns the number of entries actually written to the file. If writing was completely successful, the return value should match nnew. The function also extends the mask and writes it out if not all of its bits are set.

Reimplemented in ibis::blob.

References ibis::bitvector::adjustSize(), ibis::BYTE, ibis::CATEGORY, ibis::bitvector::cnt(), ibis::DOUBLE, FASTBIT_DIRSEP, ibis::FLOAT, ibis::fileManager::flushFile(), ibis::util::getFileSize(), ibis::gVerbose, ibis::fileManager::instance(), ibis::INT, ibis::OID, ibis::SHORT, ibis::bitvector::size(), ibis::TEXT, ibis::TYPESTRING, ibis::UBYTE, ibis::UINT, ibis::USHORT, and ibis::bitvector::write().

Referenced by ibis::part::addColumn().


Member Data Documentation

double ibis::column::upper [protected]
