Berkeley DB Reference Guide:
Access Methods

Duplicate data items

The Btree and Hash access methods support storing multiple data items under a single key item. By default, duplicate data items are not permitted, and each database store operation overwrites any data item already stored for that key. To configure Berkeley DB for duplicate data items, call the DB->set_flags function with the DB_DUP flag.

By default, Berkeley DB stores duplicates in the order in which they were added. This default behavior can be overridden by using the DBcursor->c_put function with one of the DB_AFTER, DB_BEFORE, DB_KEYFIRST or DB_KEYLAST flags. Alternatively, Berkeley DB may be configured to sort duplicate data items.

When stepping through the database sequentially, each duplicate data item is returned individually as a key/data pair, and the key item changes only after the last duplicate data item for that key has been returned. The DB->get function always returns the first of the duplicate data items for a key, so it cannot be used to reach the others. Duplicate data items should be retrieved using the Berkeley DB cursor interface, DBcursor->c_get.

Duplicate data items can also be maintained in sorted order, which reduces the effort needed to search them and to perform logical joins on them. To configure Berkeley DB to sort duplicate data items, the application must call the DB->set_flags function with the DB_DUPSORT flag (in addition to the DB_DUP flag). Additionally, a custom sort may be specified using the DB->set_dup_compare function. If the DB_DUPSORT flag is given but no comparison routine is specified, Berkeley DB defaults to the same lexicographical sorting used for Btree keys, with shorter items collating before longer items.
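To make the default ordering concrete, here is a sketch of a comparison routine that follows one reading of the rule above: items are compared byte by byte, and when one item is a prefix of the other, the shorter item sorts first. It is an illustration, not the library's own routine. Because the callback signature accepted by DB->set_dup_compare differs between releases, the sketch uses a stand-in type, item_t, that mirrors only the data and size fields of a DBT. Both item_t and lexical_compare are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the data/size fields of a DBT. */
typedef struct {
    const void *data;
    size_t size;
} item_t;

/*
 * Return a negative value, zero, or a positive value if item a sorts
 * before, equal to, or after item b.
 */
int lexical_compare(const item_t *a, const item_t *b)
{
    size_t common = a->size < b->size ? a->size : b->size;
    int c = 0;

    /* Compare the bytes both items have in common, as unsigned chars. */
    if (common > 0)
        c = memcmp(a->data, b->data, common);
    if (c != 0)
        return c < 0 ? -1 : 1;

    /* One item is a prefix of the other: the shorter one sorts first. */
    if (a->size == b->size)
        return 0;
    return a->size < b->size ? -1 : 1;
}
```

Under this routine "abc" sorts before "abcd", and "abcd" sorts before "abd". An application-specific routine, for example one that compares data items as integers, would follow the same return-value convention.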

For information on how searching and insertion behave in the presence of duplicates (sorted or not), see the DB->get, DB->put, DBcursor->c_get and DBcursor->c_put documentation.

Copyright Sleepycat Software