2.4 Release Overview

The major focus areas for release 2.4 include:

Berkeley DB XML 2.4.11 Change Log

Upgrade Requirements

Upgrade is only required for containers created using release 2.2.13 or earlier. Containers created using 2.3.X do not require upgrade. Most queries will benefit from reindexing 2.3.X-based containers to add new structural statistics information used in query cost analysis. Reindexing is required in order to enable the substring index to be used on 1- and 2-character strings (a new feature in 2.4).

If an upgrade is performed (e.g. from 2.2.13) it is recommended that the resulting container be run through dbxml_dump/dbxml_load to reduce its file size.

New Features:

  1. Conformance to Last-Call Working Draft of XQuery Update 1.0
  2. Added the ability to use "document projection" when querying whole-document containers. This performance and memory optimization results in only materializing that portion of the document relevant to the query.
  3. Partial document modifications will now only reindex those portions of the document(s) affected by the modification itself. This is a significant performance enhancement for partial update of large documents.

API Changes:

Unless otherwise noted, the API additions apply to all language bindings, and all bindings use the same method name.
  1. Added the DBXML_DOCUMENT_PROJECTION flag to the various query interfaces to enable use of this feature. In Java, this behavior is controlled by XmlDocumentConfig.setDocumentProjection()
  2. Added a new XQuery extension function, dbxml:contains(), that allows case- and diacritic-insensitive string searches and can be optimized by a substring index
  3. Removed all C++ interfaces that used Xerces-C DOM, including:
  4. XmlModify is now a deprecated (but still-supported) class. XQuery Update should be used instead. One method has been removed as it is no longer supported by the internal infrastructure: XmlModify::setNewEncoding()
  5. Added a new constructor to XmlEventReaderToWriter that allows an XmlEventWriter instance to be multiple-use. This makes it possible to write to it from multiple reader sources to concatenate content, for example.
  6. Removed XmlUpdateContext::{get,set}ApplyChangesToContainers() methods. This behavior is no longer controllable. Changes to documents that are in containers will always be written. If transient changes are required, content must be copied (XQuery Update has syntax to do this directly)
  7. Removed unused variant of XmlValue::asString(std::string &encoding). This variant never actually changed the encoding [#15822]
  8. Added DBXML_STATISTICS and DBXML_NO_STATISTICS flags to enable/disable creation of an additional statistics database that is used for query optimization. The default is to create the database. Upgraded containers will NOT have a statistics database added unless they are explicitly reindexed. The cost of this optimization is a bit of extra work during document insertion. In Java this behavior is controlled by XmlContainerConfig.setStatisticsEnabled()
  9. Added the DBXML_NO_AUTO_COMMIT flag, which can be specified to the XmlQueryExpression::execute() methods to turn off auto-commit of update queries when it is not appropriate.
  10. Some XmlException error codes have changed - DOM_PARSER_ERROR and NO_VARIABLE_BINDING have been removed, XPATH_PARSER_ERROR is now called QUERY_PARSER_ERROR, and XPATH_EVALUATION_ERROR is now called QUERY_EVALUATION_ERROR. [#15792]
  11. The enumeration XmlQueryContext::DeadValues has been removed. The related method, XmlQueryContext::setReturnType(), remains but is a no-op. All results are LiveValues. This will not affect the vast majority of applications.
  12. Added XmlQueryExpression::isUpdateExpression() to allow users to know whether an expression is updating or not
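
To make the "case- and diacritic-insensitive" matching described for dbxml:contains() (item 2 above) concrete, the following is a minimal Python sketch of that style of comparison. It does not use BDB XML and is not the library's implementation. The helper names fold() and insensitive_contains() are hypothetical, and the exact folding rules applied by dbxml:contains() may differ from the Unicode normalization assumed here.

```python
import unicodedata


def fold(text: str) -> str:
    """Return text with combining diacritical marks and case removed."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks, then apply Unicode case folding.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def insensitive_contains(haystack: str, needle: str) -> bool:
    """Hypothetical helper: substring test ignoring case and diacritics."""
    return fold(needle) in fold(haystack)


print(insensitive_contains("Crème Brûlée", "creme brulee"))  # prints True
print(insensitive_contains("Crème Brûlée", "tart"))          # prints False
```

Under these assumptions, a search for "creme brulee" matches stored text "Crème Brûlée" without the application normalizing either string itself.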

Changes That May Require Application Modification:

  1. Some of the API changes above may result in the need to make minor code changes
  2. Not all modification patterns that use XmlModify will continue to work given the new infrastructure. Specifically, operations that would both copy and delete the same content (emulating a "move" operation) may not work. In all cases, such code can be rewritten to use XQuery Update directly, resulting in simpler code. Special attention should be paid to multi-step operations that include such side effects.
  3. Changed default indexing type on containers to be node indexes for node storage containers and document indexes for document storage containers [#15863].
  4. Java only -- the XmlContainer.getNode() function has changed its signature and will require code change if used. See details below under "Java-specific Functionality Changes."

General Functionality Changes:

  1. Partial document modification will result in only reindexing those portions of document(s) affected by the modification
  2. The system now keeps better cost information and statistics and the query optimizer uses this information to perform more effective cost-based optimization
  3. The content processing internals have been reworked to make heavy use of iterators and temporary Berkeley DB databases to significantly reduce the memory footprint of query handling as well as reduce the number of objects created and destroyed by a query. This leads to a more scalable, high-performance system
  4. Substring indexes will now work on any length search string (e.g. 2-char) rather than be restricted to a 3-character minimum. Reindexing the container is required to get this functionality.
  5. Various fixes and memory leak elimination in XmlEventWriter [#15405]
  6. Fixed a problem where removing a default index could remove index entries for an overlapping non-default index [#15412]
  7. Changed semantics of XmlQueryContext::setNamespace() to treat an empty namespace prefix as the default element namespace [#15630]
  8. Fixed problem in XmlModify that could result in malformed XML if a prefixed element name were used without a mapping for that prefix [#15586]
  9. Fixed URI resolution code to not add the base URI when the URI being resolved is absolute [#15583]
  10. Fixed code to force explicit transactions (vs auto-commit) when using XmlContainer::putDocumentAsEventWriter. This is necessary because of the 2-part nature of this interface [#15578]
  11. Fixed a crash that could occur if XmlResults.next() were called at the end of a result set [#15621]
  12. Fixed a bug where XQuery expressions involving unused global variables were not being optimized correctly [#15661]
  13. Fixed problem in XmlEventReader::nextTag() where it would mistakenly throw an exception on character data. Also changed semantics of XmlEventReader to always return start and end document events so that callers can know when content starts and ends [#15686]
  14. Fixed a case where the '>' character was not being escaped properly (according to the XML specification). The case arises when '>' occurs in the sequence "]]>" [#15739]
  15. Fixed a problem in XmlModify where removing a node that was the last child and had leading text could cause a SEGV [#15615]
  16. Enhanced XmlEventReaderToWriter API to not unconditionally close the XmlEventWriter object, allowing a single XmlEventWriter to be used more than once via that API. This allows XmlEventReaderToWriter to be used, for example, to coalesce a number of results into one document [#15446]
  17. Fixed a problem with XmlIndexLookup where a GT lookup that happens to start with the last entry in the index might return results when it should return none [#15408]
  18. Fixed a bug which incorrectly reported an error for fractional seconds when the seconds field was "59" [#15389]
  19. Fixed a problem in statistics calculation for substring indexes that could cause a crash in fn:contains() [#15823]
  20. Fixed a bad exception that might be thrown when inserting a schema-invalid document, due to the length of the error message [#15824]
  21. Fixed a problem where a query that uses an XmlDocument that has just been "put" into a container as context for the query might hang if done in the same transaction as the putDocument call [#15905]
  22. Fixed a problem where querying empty CDATA sections could cause an assertion failure or bad memory reference [#15906]
  23. Fixed a latent bug that could result in missing index entries after an updateDocument or modifyDocument call. This is very obscure and has never been seen by a user. It requires an odd combination of indexes and updates [#15943]
  24. Fixed an open/close race condition on XmlContainer. An application that concurrently opened/closed XmlContainer objects (not recommended...) could possibly reference bad memory [#15890]
  25. Fixed several memory leaks that could occur if deadlock exceptions are thrown during document processing (most likely put and delete)
  26. All update operations now work inside internal child transactions to ensure that they are properly aborted if necessary. This is not user visible
  27. Internal buffer size for DB get operations on nodes is tuned to the calling operation (bulk vs single get) [#15607]
  28. Fixed a bug in XmlEventWriter where the behavior was dependent on an uninitialized variable [#15968]
  29. Changed dbxml_load_container to take a '-e' flag that causes the program to stop document loading in the event of a parse error. The default is to continue with the next document [#15777].
  30. Fixed problem in XmlModify where newly-inserted element content could cause a bad memory reference and/or crash while calculating a new node id [#15974].
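
The escaping rule behind item 14 can be illustrated with a short sketch. In XML character data, '>' may normally appear literally, but it must be escaped when it ends the sequence "]]>" outside a CDATA section. The Python function below is a hypothetical illustration of that rule only. It is not BDB XML's serializer, and the name escape_char_data() is an assumption made for this example.

```python
def escape_char_data(text: str) -> str:
    """Escape character data for XML element content (illustrative sketch)."""
    # '&' must be escaped first so the entities added below are not re-escaped.
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;")
    # '>' may appear literally in content except as the end of "]]>".
    return text.replace("]]>", "]]&gt;")


print(escape_char_data("a > b"))   # prints a > b
print(escape_char_data("x]]>y"))   # prints x]]&gt;y
```

A lone '>' is left untouched, while the '>' that completes "]]>" is replaced with an entity so the output remains well-formed.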

Utility Changes:

  1. The dbxml shell added commands:
  2. The dbxml shell can be invoked using the #! syntax in a *nix shell command, e.g. with the first line:
    #!/dbxml -s
  3. Handling of '#' comment lines in the dbxml shell has been improved so that they can occur anywhere in a line [#15689]

Java-specific Functionality Changes:

  1. Fixed a problem where committing or aborting a transaction that was already committed or aborted could crash, especially after a failed XmlManager.openContainer() call [#15729]
  2. It is no longer necessary to explicitly delete objects of type XmlValue, XmlDocument, XmlQueryContext, XmlMetaData, XmlMetaDataIterator and XmlUpdateContext. They are implemented entirely as pure Java objects with no native memory to release. It is still necessary to explicitly delete other Java objects to release native memory. In general, the validity of XmlValue and XmlDocument objects returned via XmlResults (queries, index lookups, etc.) is under control of the XmlResults object. When the XmlResults object is deleted, node values that were associated with that object may no longer be accessible, and an exception will be thrown if they are accessed [#15194]
  3. Added -source 1.5 -target 1.5 to Java builds to be explicit, especially for Windows binary build. The current code *will* work with 1.4 or 1.6 if the arguments are changed (manually) [#15986]
  4. The XmlContainer.getNode() function has changed its signature. Instead of XmlValue it now returns XmlResults. The XmlValue that was previously returned can be retrieved using XmlResults.next(). There will never be more than one value in this result. It is necessary to explicitly delete the returned XmlResults object (XmlResults.delete()) when the application no longer needs access to the returned value. Once deleted, the information in the XmlValue may no longer be accessible.

Python-specific Functionality Changes:

  1. Fixed XmlEventReader in Python so that methods returning unsigned char * would be mapped properly into Python strings [#15608]
  2. Changed implementation of XmlException and related classes to make them part of the dbxml (vs _dbxml) module [#15617]
  3. Changed names of XmlException attributes to start with lower-case letters. See src/python/README.exceptions.
  4. Moved examples to dbxml/examples/python directory and added some additional basic examples

PHP-specific Functionality Changes:

  1. Fixed code that resulted in build and runtime errors on 64-bit platforms. One symptom was "std::bad_alloc" exceptions. The issue was a mix of 64- and 32-bit types resulting in attempts to allocate huge amounts of memory [#15587]
  2. Fixed compilation problems in a threaded (ZTS) environment related to the use of incorrect macros in a few places [#15746]
  3. Moved examples to dbxml/examples/php
  4. Fixed XmlValue constructor to accept explicitly typed strings [#15996]

Perl-specific Functionality Changes:

  1. Moved examples to dbxml/examples/perl

Example Code Changes

  1. Added examples/cxx/xerces directory with example code that provides the same functionality that the Xerces-C DOM interfaces previously provided. They are written as example code to illustrate an integration with Xerces-C DOM and to also illustrate use of the XmlEvent* classes for such an adapter
  2. Moved examples for all languages to dbxml/examples/* to consolidate them and make packaging simpler

Configuration, Documentation, Portability and Build Changes:

  1. XQilla 2.0 is bundled. XQilla 2.0 is released under a permissive (Apache) license
  2. Windows static build projects are included
  3. Project and solution files for Visual Studio version 8.00 have been added for use by Visual Studio 2005 and later releases. The new solution file is BDBXML_all_vs8.sln.
  4. Added Berkeley DB project files to the BDB XML build_windows directory for Visual Studio 7.1 and 8 builds. This means that the included DB projects will be built directly in the BDB XML tree and not in the Berkeley DB tree. This does not apply to the VC6 projects and workspace and does not affect where the default build installs executables and libraries. VS7.1 project files for Berkeley DB examples are no longer included.