ibis::tablex Class Reference

The class for expandable tables.

#include <table.h>

Inheritance diagram for ibis::tablex:
ibis::tafel

Public Member Functions

virtual int addColumn (const char *cname, ibis::TYPE_T ctype, const char *cdesc=0, const char *idx=0)=0
 Add a column.
virtual int append (const char *cname, uint64_t begin, uint64_t end, void *values)=0
 Add values to the named column.
virtual int appendRow (const ibis::table::row &)=0
 Add one row.
virtual int appendRow (const char *line, const char *delimiters=0)=0
 Append a row stored in ASCII form.
virtual int appendRows (const std::vector< ibis::table::row > &)=0
 Add multiple rows.
virtual uint32_t capacity () const
 Capacity of the memory cache.
virtual void clearData ()=0
 Remove all data recorded.
virtual void describe (std::ostream &) const =0
 Print a description of the table to the specified output stream.
virtual uint32_t mColumns () const =0
 The number of columns in this table.
virtual uint32_t mRows () const =0
 The maximum number of rows in any column.
virtual int parseNamesAndTypes (const char *txt)
 Parse names and data types in string form.
virtual int readCSV (const char *inputfile, int maxrows=0, const char *outputdir=0, const char *delimiters=0)=0
 Read the content of the named file as comma-separated values.
virtual int readNamesAndTypes (const char *filename)
 Read a file containing the names and types of columns.
virtual int readSQLDump (const char *inputfile, std::string &tname, int maxrows=0, const char *outputdir=0)=0
 Read a SQL dump from database systems such as MySQL.
virtual int32_t reserveSpace (uint32_t)
 Reserve enough space for the specified number of rows.
virtual table * toTable (const char *nm=0, const char *de=0)=0
 Stop expanding the current set of data records.
virtual int write (const char *dir, const char *tname=0, const char *tdesc=0, const char *idx=0, const char *nvpairs=0) const =0
 Write the in-memory data records to the specified directory and update the metadata on disk.
virtual int writeMetaData (const char *dir, const char *tname=0, const char *tdesc=0, const char *idx=0, const char *nvpairs=0) const =0
 Write out the information about the columns.

Static Public Member Functions

static ibis::tablex * create ()
 Create a minimalistic table exclusively for entering new records.

Detailed Description

The class for expandable tables.

It is designed to temporarily store data in memory and then write the records out through the function write. After creating an object of this type, the user must first add columns by calling addColumn. New data records may then be added one column at a time or one row at a time. An example of using this class is in examples/ardea.cpp.

Note:
Most functions that return an integer return 0 on success, a negative value on error, and a positive number as advisory information.
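
The convention in the note above can be checked with a small helper. The following is a minimal standalone sketch, not part of FastBit: the names Outcome, classify, label, and requireNoError are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <string>

// Outcome categories, following the return-code convention in the note.
enum class Outcome { Success, Error, Advisory };

// 0 is success, a negative value is an error, and a positive value is
// advisory information (for example, a count of rows or values).
inline Outcome classify(int rc) {
    if (rc == 0) return Outcome::Success;
    return rc < 0 ? Outcome::Error : Outcome::Advisory;
}

// Human-readable label, handy for log messages.
inline std::string label(Outcome o) {
    switch (o) {
        case Outcome::Success:  return "success";
        case Outcome::Error:    return "error";
        case Outcome::Advisory: return "advisory";
    }
    return "unknown";
}

// Debug-build guard: abort on an error code, otherwise pass the code
// through so that advisory values remain available to the caller.
inline int requireNoError(int rc) {
    assert(rc >= 0 && "call reported an error");
    return rc;
}
```

A caller would typically wrap each integer-returning call, for example passing the result of an append or write call to classify before deciding whether to continue.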

Member Function Documentation

virtual int ibis::tablex::append ( const char *  cname,
uint64_t  begin,
uint64_t  end,
void *  values 
) [pure virtual]

Add values to the named column.

The column name must be in the table already. The first value is to be placed at row begin (the row numbers start with 0) and the last value before row end. The array values must contain (end - begin) values of the type specified through addColumn.

The expected types of values are "const std::vector<std::string>*" for string-valued columns, and "const T*" for a fixed-size column of type T. For example, if the column type is float, the type of values is "const float*"; if the column type is category, the type of values is "const std::vector<std::string>*".

Note:
Since each column may have a different number of rows filled, the number of rows in the table is taken to be the maximum number of rows filled among all columns.
This function cannot be used to introduce new columns in a table. A new column must be added with addColumn.
See also:
appendRow

Implemented in ibis::tafel.

Referenced by fastbit_add_values().
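
The half-open row range and the row-counting rule described for append can be illustrated with a toy model. This is a sketch under stated assumptions, not FastBit code: RowTracker is a hypothetical stand-in that only counts rows, stores no values, and (unlike the real class) accepts any column name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Toy stand-in that tracks only how many rows each column has filled.
// It is NOT ibis::tablex; it mirrors the counting rule described above.
struct RowTracker {
    std::map<std::string, std::uint64_t> filled;  // column name -> rows filled

    // Record values for rows [begin, end): begin is inclusive and end is
    // exclusive. Returns the number of values the caller must supply.
    std::uint64_t append(const std::string& cname,
                         std::uint64_t begin, std::uint64_t end) {
        assert(begin <= end);
        std::uint64_t& rows = filled[cname];
        rows = std::max(rows, end);
        return end - begin;
    }

    // The table's row count is the maximum filled across all columns.
    std::uint64_t rows() const {
        std::uint64_t m = 0;
        for (const auto& kv : filled) m = std::max(m, kv.second);
        return m;
    }
};

// Usage: two columns filled to different depths.
inline void checkRowTracker() {
    RowTracker t;
    assert(t.append("a", 0, 5) == 5);  // rows 0..4 of column a
    assert(t.append("b", 0, 3) == 3);  // rows 0..2 of column b
    assert(t.rows() == 5);             // max(5, 3)
}
```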

virtual int ibis::tablex::appendRow ( const ibis::table::row &  ) [pure virtual]

Add one row.

If an array of names has the same number of elements as the array of values, the names are used as column names. If the names are not specified explicitly, the values are assigned to the columns of the same data type in the order in which the columns were specified through addColumn, or in the order in which they were recreated from an existing dataset (which is typically alphabetical).

Return the number of values added to the new row.

Note:
The column names are not case-sensitive.
Like append, this function cannot be used to introduce new columns in a table. A new column must be added with addColumn.
Since the various columns may have different numbers of rows filled, the number of rows in the table is assumed to be the largest number of rows filled so far. The new row appended here increases the number of rows in the table by 1. The unfilled rows are assumed to be null.
A null value is denoted internally with a mask kept separate from the data values. However, since the rows corresponding to null values must still be filled with some value in this implementation, they are filled as follows. A null value of an integer column is filled with the maximum possible value of that integer type. A null value of a floating-point column is filled with a quiet NaN (Not-a-Number). A null value of a string-valued column is filled with an empty string.
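
The fill values listed in the last note map directly onto std::numeric_limits. The sketch below is a standalone illustration of those three rules; the helper name nullFill is hypothetical and is not a FastBit function. Note that a data value equal to the fill value is indistinguishable from null without the mask.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>
#include <string>
#include <type_traits>

// Integer columns: the maximum value of the integer type.
template <typename T>
typename std::enable_if<std::is_integral<T>::value, T>::type nullFill() {
    return std::numeric_limits<T>::max();
}

// Floating-point columns: a quiet NaN.
template <typename T>
typename std::enable_if<std::is_floating_point<T>::value, T>::type nullFill() {
    return std::numeric_limits<T>::quiet_NaN();
}

// String-valued columns: an empty string.
template <typename T>
typename std::enable_if<std::is_same<T, std::string>::value, T>::type nullFill() {
    return std::string();
}

// Usage: the placeholder stored in a row whose value is null.
inline void checkNullFill() {
    assert(nullFill<std::int32_t>() == 2147483647);
    assert(nullFill<std::uint8_t>() == 255);
    assert(std::isnan(nullFill<double>()));
    assert(nullFill<std::string>().empty());
}
```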

Implemented in ibis::tafel.

virtual int ibis::tablex::appendRow ( const char *  line,
const char *  delimiters = 0 
) [pure virtual]

Append a row stored in ASCII form.

The values in ASCII form are assumed to be separated by a comma (,) or a space, but additional delimiters may be specified through the second argument.

Return the number of values added to the new row.

Implemented in ibis::tafel.

virtual int ibis::tablex::appendRows ( const std::vector< ibis::table::row > &  ) [pure virtual]

Add multiple rows.

Rows in the incoming vector are processed one after another. The ordering of the values in earlier rows is automatically carried over to later rows until another set of names is specified.

Return the number of new rows added.

See also:
appendRow

Implemented in ibis::tafel.

virtual uint32_t ibis::tablex::capacity ( ) const [inline, virtual]

Capacity of the memory cache.

Report the maximum number of rows that can be stored with this object before more memory will be allocated. A return value of zero (0) may also indicate that the object does not know its capacity.

Note:
For string-valued columns, the reservation does not necessarily allocate the space required for the actual string values. Thus it is possible to run out of memory before the number of rows reported by mRows reaches the value returned by this function.

Reimplemented in ibis::tafel.

virtual void ibis::tablex::clearData ( ) [pure virtual]

Remove all data recorded.

Keeps the information about columns. It is intended to prepare for new rows after invoking the function write.

Implemented in ibis::tafel.

ibis::tablex * ibis::tablex::create ( ) [static]

Create a minimalistic table exclusively for entering new records.

Create a tablex for entering new data.

int ibis::tablex::parseNamesAndTypes ( const char *  txt) [virtual]

Parse names and data types in string form.

A column name must start with a letter or an underscore (_); it can be followed by any number of alphanumeric characters (including underscores).

For each built-in data type, the type names recognized are as follows:

If it cannot find a type but a valid name is found, the type is assumed to be int.

Note:
Column names are not case-sensitive and all types should be specified in lower case letters.

Characters following '#' or '--' on a line will be treated as comments and discarded.
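
The naming rule and the comment rule above are simple enough to sketch in a few lines. This is a standalone illustration and not the parser FastBit uses; the function names isValidColumnName and stripComment are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// True if name starts with a letter or an underscore and continues with
// letters, digits, or underscores only.
inline bool isValidColumnName(const std::string& name) {
    if (name.empty()) return false;
    const unsigned char first = static_cast<unsigned char>(name[0]);
    if (!(std::isalpha(first) || first == '_')) return false;
    for (std::size_t i = 1; i < name.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(name[i]);
        if (!(std::isalnum(c) || c == '_')) return false;
    }
    return true;
}

// Drop everything from the first '#' or "--" to the end of the line.
inline std::string stripComment(const std::string& line) {
    const std::size_t hash = line.find('#');
    const std::size_t dash = line.find("--");
    const std::size_t cut = std::min(hash, dash);  // npos is the largest value
    return cut == std::string::npos ? line : line.substr(0, cut);
}

// Usage: a few accepted and rejected inputs.
inline void checkNameRules() {
    assert(isValidColumnName("_row_id"));
    assert(!isValidColumnName("2nd"));
    assert(stripComment("abc # note") == "abc ");
}
```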

References ibis::tafel::addColumn(), ibis::BLOB, ibis::BYTE, ibis::CATEGORY, ibis::DOUBLE, ibis::FLOAT, ibis::gVerbose, ibis::INT, ibis::LONG, ibis::SHORT, ibis::TEXT, ibis::UBYTE, ibis::UINT, ibis::ULONG, and ibis::USHORT.

Referenced by readNamesAndTypes().

virtual int ibis::tablex::readCSV ( const char *  inputfile,
int  maxrows = 0,
const char *  outputdir = 0,
const char *  delimiters = 0 
) [pure virtual]

Read the content of the named file as comma-separated values.

Append the records to this table. If the argument maxrows is greater than 0, this function will reserve space to read this many records. If the total number of records is more than maxrows and the output directory name is specified, then the records will be written to outputdir and the memory made available for later records. If outputdir is not specified, this function attempts to expand the memory allocated, which may exhaust the available memory. Furthermore, repeated allocations can be time-consuming.

By default the records are delimited by comma (,) and blank space. One may specify alternative delimiters using the last argument.

Upon successful completion of this function, the return value is the number of rows processed. However, not all of them may remain in memory because earlier rows may have been written to disk.

Note:
Information about column names and types must be provided before calling this function.
The return value is intentionally left as a 32-bit integer, which limits the maximum number of rows that can be correctly handled.
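
The difference between the rows processed (the return value) and the rows still in memory can be seen in a small simulation. This is a sketch under an explicit assumption: the buffer is flushed only when it is full and another record arrives. The actual flush timing inside FastBit may differ, and LoadStats and simulateLoad are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Counters for one simulated load.
struct LoadStats {
    std::uint64_t processed;  // what the function would return
    std::uint64_t inMemory;   // what mRows would report afterwards
    std::uint64_t flushes;    // number of writes to the output directory
};

// Simulate reading `total` records with a buffer of `maxrows` rows.
// ASSUMPTION: a flush happens only when the buffer is full and another
// record arrives, and only if an output directory was given.
inline LoadStats simulateLoad(std::uint64_t total, std::uint64_t maxrows,
                              bool haveOutputDir) {
    LoadStats s = {0, 0, 0};
    for (std::uint64_t i = 0; i < total; ++i) {
        if (maxrows > 0 && haveOutputDir && s.inMemory == maxrows) {
            s.inMemory = 0;  // buffer written out, memory reused
            ++s.flushes;
        }
        ++s.inMemory;
        ++s.processed;
    }
    return s;
}

// Usage: 10 records through a 4-row buffer.
inline void checkSimulateLoad() {
    const LoadStats s = simulateLoad(10, 4, true);
    assert(s.processed == 10);  // all rows are counted in the return value
    assert(s.inMemory == 2);    // only the tail remains in memory
    assert(s.flushes == 2);
}
```

Without an output directory the same call keeps all 10 rows in memory, which corresponds to the case where the function must expand its allocation instead.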

Implemented in ibis::tafel.

virtual int ibis::tablex::readSQLDump ( const char *  inputfile,
std::string &  tname,
int  maxrows = 0,
const char *  outputdir = 0 
) [pure virtual]

Read a SQL dump from database systems such as MySQL.

The entire file will be read into memory in one shot unless both maxrows and outputdir are specified. In cases where both maxrows and outputdir are specified, this function reads a maximum of maxrows rows before writing the data to outputdir under the name tname, which leaves no more than maxrows rows in memory. The value returned from this function is the number of rows processed, including those written to disk. Use function mRows to determine how many are still in memory.

If the SQL dump file contains a statement to create a table, then the existing metadata is overwritten. Otherwise, it reads insert statements and converts the ASCII data into binary format in memory.

Implemented in ibis::tafel.

virtual int32_t ibis::tablex::reserveSpace ( uint32_t  ) [inline, virtual]

Reserve enough space for the specified number of rows.

Return the number of rows that can be stored, or a negative number to indicate an error. Since the return value is a 32-bit signed integer, it cannot represent a number greater than or equal to 2^31 (~2 billion); the caller must not attempt to reserve space for 2^31 rows or more.

The intention is to minimize the number of dynamic memory allocations needed to expand the memory used to hold the data. Implementing this function is optional, and the user is not required to call it.
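
A caller that holds a 64-bit row count can guard against the 2^31 limit before making a reservation request. The following is a minimal standalone sketch; fitsReserveRequest and clampReserveRequest are hypothetical helper names, not FastBit functions.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// True if the row count is below 2^31, so that it can be reported back
// through a signed 32-bit return value.
inline bool fitsReserveRequest(std::uint64_t nrows) {
    return nrows <=
        static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());
}

// Largest request that is safe to pass on: min(nrows, 2^31 - 1).
inline std::uint32_t clampReserveRequest(std::uint64_t nrows) {
    const std::uint64_t limit =
        static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());
    return static_cast<std::uint32_t>(nrows < limit ? nrows : limit);
}

// Usage: values on either side of the limit.
inline void checkReserveGuard() {
    assert(fitsReserveRequest(2147483647ULL));       // 2^31 - 1 is allowed
    assert(!fitsReserveRequest(2147483648ULL));      // 2^31 is not
    assert(clampReserveRequest(5000000000ULL) == 2147483647u);
}
```

Clamping is only one possible policy; rejecting an oversized request outright may be preferable when the caller needs to know that the full reservation cannot be made.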

Reimplemented in ibis::tafel.

virtual table* ibis::tablex::toTable ( const char *  nm = 0,
const char *  de = 0 
) [pure virtual]

Stop expanding the current set of data records.

Convert a tablex object into a table object so that the records can participate in queries. The data records held by the tablex object are transferred to the table object; however, the metadata remains with this object.

Implemented in ibis::tafel.

virtual int ibis::tablex::write ( const char *  dir,
const char *  tname = 0,
const char *  tdesc = 0,
const char *  idx = 0,
const char *  nvpairs = 0 
) const [pure virtual]

Write the in-memory data records to the specified directory and update the metadata on disk.

If the table name (tname) is a null string or an empty string, the last component of the directory name is used. If the description (tdesc) is a null string or an empty string, a time stamp will be printed in its place. If the specified directory already contains data, the new records will be appended to the existing data. In this case, the table name specified here will overwrite the existing name, but the existing name and description will be retained if the current arguments are null strings or empty strings. The data type associated with this table will overwrite the existing data type information. If the index specification is not null, the existing index specification will be overwritten.

  • dir The output directory name. Must be a valid directory name. The named directory will be created if it does not already exist.
  • tname Table name. Should be a valid string; otherwise, a random name is generated, as FastBit requires a name for each table.
  • tdesc Table description. An optional description of the table. It can be an arbitrary string.
  • idx Indexing option for all columns of the table that lack their own indexing option. More information about indexing options is available elsewhere.
  • nvpairs An arbitrary list of name-value pairs to be associated with the data table. Any number of name-value pairs may be given here; however, FastBit may not be able to do much with them. One useful pair, of the form "columnShape=(nd1, ..., ndk)", tells FastBit that the table is defined on a simple regular k-dimensional mesh of size nd1 x ... x ndk. Internally, the name-value pairs associated with a data table are known as meta tags or simply tags.
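
The columnShape pair mentioned for nvpairs can be assembled from a list of mesh dimensions. This is a minimal standalone sketch that follows the "columnShape=(nd1, ..., ndk)" form quoted above; the helper name columnShapeTag is hypothetical, and how FastBit treats whitespace inside the parentheses is not covered here.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Build a string of the form "columnShape=(nd1, ..., ndk)" from the mesh
// dimensions, in the order given.
inline std::string columnShapeTag(const std::vector<std::size_t>& dims) {
    std::ostringstream out;
    out << "columnShape=(";
    for (std::size_t i = 0; i < dims.size(); ++i) {
        if (i > 0) out << ", ";
        out << dims[i];
    }
    out << ")";
    return out.str();
}

// Usage: a 100 x 200 mesh.
inline void checkColumnShapeTag() {
    const std::vector<std::size_t> dims = {100, 200};
    assert(columnShapeTag(dims) == "columnShape=(100, 200)");
}
```

The resulting string would be passed as the nvpairs argument, for example via its c_str() member.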

Implemented in ibis::tafel.

Referenced by fastbit_flush_buffer().

virtual int ibis::tablex::writeMetaData ( const char *  dir,
const char *  tname = 0,
const char *  tdesc = 0,
const char *  idx = 0,
const char *  nvpairs = 0 
) const [pure virtual]

Write out the information about the columns.

It will write the metadata file containing the column information and index specifications if no metadata file already exists. It returns the number of columns written to the metadata file upon successful completion, returns 0 if a metadata file already exists, and returns a negative number to indicate errors. If there is no column in memory, nothing is written to the output directory.

Note:
The formal arguments of this function are exactly the same as those of ibis::tablex::write.
Warning:
This function does not preserve the existing metadata! Use with care.

Implemented in ibis::tafel.

