It is possible to link the DAKOTA toolkit into another application for use as an algorithm library. This section describes facilities which permit this type of integration.
As part of the normal DAKOTA build process, where Dakota/configure --prefix=PREFIX has been run prior to make and make install, a libdakota.a is created and a copy of it is placed in PREFIX/lib (PREFIX defaults to /usr/local/Dakota). This library contains all source files from Dakota/src except for the main.C, restart_util.C, and library_mode.C main programs. This library may be linked with another application through inclusion of -ldakota on the link line. Library and header paths may also be specified using the -L and -I compiler options (using PREFIX/lib and PREFIX/include, respectively). Depending on the configuration used when building this library, other libraries for the vendor optimizers and vendor packages will also be needed to resolve DAKOTA symbols for DOT, NPSOL, OPT++, NCSUOpt, LHS, Teuchos, etc. Copies of these libraries are also placed in Dakota/lib. Refer to Linking against the DAKOTA library for additional information.
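For example, compiling and linking a parent application might resemble the following sketch, where the file names are illustrative and "..." stands for the additional vendor libraries required by your configuration:

g++ -I${PREFIX}/include -c my_app.C
g++ my_app.o -L${PREFIX}/lib -ldakota ... -o my_app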
To learn by example, refer to the files PluginSerialDirectApplicInterface.[CH] and PluginParallelDirectApplicInterface.[CH] in Dakota/src for simple examples of serial and parallel plug-in interfaces. The file library_mode.C in Dakota/src provides example usage of these plug-ins within a mock simulator program that demonstrates the required object instantiation syntax in combination with the three problem database population approaches (input file parsing, data node insertion, and mixed mode). All of this code may be compiled and tested by configuring DAKOTA using the --with-plugin option.
The procedure for utilizing DAKOTA as a library within another application involves a number of steps that are similar to those used in the stand-alone DAKOTA application. The stand-alone procedure can be viewed in the file main.C, and the differences for the library approach are most easily explained with reference to that file. The basic steps of executing DAKOTA include instantiating the ParallelLibrary, CommandLineHandler, and ProblemDescDB objects; managing the DAKOTA input file (ProblemDescDB::manage_inputs()); specifying restart files and output streams (ParallelLibrary::specify_outputs_restart()); and instantiating the Strategy and running it (Strategy::run_strategy()). When using DAKOTA as an algorithm library, the operations are quite similar, although command line information (argc, argv, and therefore CommandLineHandler) will not in general be accessible. In particular, main.C can pass argc and argv into the ParallelLibrary and CommandLineHandler constructors and then pass the CommandLineHandler object into ProblemDescDB::manage_inputs() and ParallelLibrary::specify_outputs_restart(). In an algorithm library approach, a CommandLineHandler object is not instantiated and overloaded forms of the ParallelLibrary constructor, ProblemDescDB::manage_inputs(), and ParallelLibrary::specify_outputs_restart() are used.
The overloaded forms of these functions are as follows. For instantiation of the ParallelLibrary object, the default constructor may be used. This constructor assumes that MPI is administered by the parent application such that the MPI configuration will be detected rather than explicitly created (i.e., DAKOTA will not call MPI_Init or MPI_Finalize). In code, the instantiation
ParallelLibrary parallel_lib(argc, argv);
is replaced with
ParallelLibrary parallel_lib;
In the case of specifying restart files and output streams, the call to
parallel_lib.specify_outputs_restart(cmd_line_handler);
should be replaced with its overloaded form in order to pass the required information through the parameter list
parallel_lib.specify_outputs_restart(std_output_filename, std_error_filename, read_restart_filename, write_restart_filename, stop_restart_evals);
where file names for standard output and error, file names for restart read and write, and the integer number of restart evaluations are passed through the parameter list rather than read from the command line of the main DAKOTA program. The definition of these attributes is performed elsewhere in the parent application (e.g., specified in the parent application input file or GUI). In this function call, specify NULL for any files not in use, which will elicit the desired subset of the following defaults: standard output and standard error directed to the terminal, no restart input, and restart output to the file dakota.rst. The stop_restart_evals specification is an optional parameter with a default of 0, which indicates that restart processing should process all records. If no overrides of these defaults are intended, the call to specify_outputs_restart() may be omitted entirely.
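For instance, the following sketch (the file name is illustrative) directs standard output to a file, leaves standard error on the terminal, reads no restart input, and accepts the default restart output:

parallel_lib.specify_outputs_restart("my_app_dakota.out", NULL, NULL, NULL);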
With respect to alternate forms of ProblemDescDB::manage_inputs(), the following section describes different approaches to populating data within DAKOTA's problem description database. It is this database from which all DAKOTA objects draw data upon instantiation.
Now that the ProblemDescDB object has been instantiated, we must populate it with data, either via parsing an input file, direct data insertion, or a mixed approach, as described in the following sections.
The simplest approach to linking an application with the DAKOTA library is to rely on DAKOTA's normal parsing system to populate DAKOTA's problem database (ProblemDescDB) through the reading of an input file. The disadvantage to this approach is the requirement for an additional input file beyond those already required by the parent application.
In this approach, the main.C call to
problem_db.manage_inputs(cmd_line_handler);
would be replaced with its overloaded form
problem_db.manage_inputs(dakota_input_file);
where the file name for the DAKOTA input is passed through the parameter list rather than read from the command line of the main DAKOTA program. Again, the definition of the DAKOTA input file name is performed elsewhere in the parent application (e.g., specified in the parent application input file or GUI). Refer to run_dakota_parse() in library_mode.C for a complete example listing.
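Putting these pieces together, a condensed sketch of this mode mirrors the structure of run_dakota_parse(); the input file name is illustrative, and the ProblemDescDB constructor signature is assumed from library_mode.C:

ParallelLibrary parallel_lib;                   // detect, rather than create, MPI config
ProblemDescDB problem_db(parallel_lib);         // the problem database
problem_db.manage_inputs("my_dakota_input.in"); // parse, check, broadcast, post-process
Strategy selected_strategy(problem_db);         // construct and run the strategy
selected_strategy.run_strategy();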
ProblemDescDB::manage_inputs() invokes ProblemDescDB::parse_inputs() (which in turn invokes ProblemDescDB::check_input()), ProblemDescDB::broadcast(), and ProblemDescDB::post_process(), which are lower level functions that will be important in the following two sections. Thus, the input file parsing approach may employ a single coarse grain function to coordinate all aspects of problem database population, whereas the two approaches to follow will use lower level functions to accomplish a finer grain of control.
This approach is more involved than the previous approach, but it allows the application to publish all needed data to DAKOTA's database directly, thereby eliminating the need for the parsing of a separate DAKOTA input file. In this case, ProblemDescDB::manage_inputs() is not called. Rather, DataStrategy, DataMethod, DataModel, DataVariables, DataInterface, and DataResponses objects are instantiated and populated with the desired problem data. These objects are then published to the problem database using ProblemDescDB::insert_node(), e.g.:
// instantiate the data object
DataMethod data_method;
// set the attributes within the data object
data_method.methodName = "nond_sampling";
...
// publish the data object to the ProblemDescDB
problem_db.insert_node(data_method);
The data objects are populated with their default values upon instantiation, so only the non-default values need to be specified. Refer to the DataStrategy, DataMethod, DataModel, DataVariables, DataInterface, and DataResponses class documentation and source code for lists of attributes and their defaults.
The default strategy is single_method, which runs a single iterator on a single model, and the default model is single, so it is not necessary to instantiate and publish a DataStrategy or DataModel object if advanced multi-component capabilities are not required. Rather, instantiation and insertion of a single DataMethod, DataVariables, DataInterface, and DataResponses object is sufficient for basic DAKOTA capabilities.
Once the data objects have been published to the ProblemDescDB object, calls to
problem_db.check_input();
problem_db.broadcast();
problem_db.post_process();
will perform basic database error checking, broadcast a packed MPI buffer of the specification data to other processors, and post-process specification data to fill in vector defaults (scalar defaults are handled in the Data class constructors), respectively. For parallel applications, processor rank 0 should be responsible for Data node population and insertion and for the call to ProblemDescDB::check_input(), while all processors should participate in ProblemDescDB::broadcast() and ProblemDescDB::post_process(). Moreover, preserving the order shown ensures that large default vectors are not transmitted by MPI. Refer to run_dakota_data() in library_mode.C for a complete example listing.
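The following condensed sketch illustrates this rank-aware ordering; the world_rank() accessor on ParallelLibrary is assumed here, so verify the name against your DAKOTA version:

if (parallel_lib.world_rank() == 0) {
  DataMethod data_method;                   // populate Data objects on rank 0
  data_method.methodName = "nond_sampling";
  problem_db.insert_node(data_method);
  // ... insert DataVariables, DataInterface, DataResponses similarly ...
  problem_db.check_input();                 // basic error checking on rank 0 only
}
problem_db.broadcast();                     // all ranks participate
problem_db.post_process();                  // all ranks participate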
In this case, we will combine the parsing of a DAKOTA input file with some direct database updates. The motivation for this approach arises in large-scale applications where large vectors can be awkward to specify in a DAKOTA input file. The first step is to parse the input file, but rather than using
problem_db.manage_inputs(dakota_input_file);
as described in Input file parsing, we will use the lower level function
problem_db.parse_inputs(dakota_input_file);
to provide a finer grain of control. The passed input file dakota_input_file must contain all required inputs. Since vector data (variable values/bounds/tags, linear/nonlinear constraint coefficients/bounds, etc.) are optional, these potentially large vector specifications can be omitted from the input file. Only the variable/response counts, e.g.:
method
  linear_inequality_constraints = 500
variables
  continuous_design = 1000
responses
  objective_functions = 1
  nonlinear_inequality_constraints = 100000
are required in this case. To update the omitted data from their defaults, one uses the ProblemDescDB::set() family of overloaded functions, e.g.:
Dakota::RealVector drv(1000, 1.); // vector of length 1000, values initialized to 1.
problem_db.set("variables.continuous_design.initial_point", drv);
where the string identifiers are the same identifiers used when pulling information from the database using one of the get_<datatype>() functions (refer to the source code of ProblemDescDB.C for a full list). However, the supported ProblemDescDB::set() options are a restricted subset of the database attributes, focused on vector inputs that can be large scale.
If performing these updates within the constructor of a DirectApplicInterface extension/derivation (see Defining the direct application interface), then this code is sufficient since the database is unlocked, the active list nodes of the ProblemDescDB have been set for you, and the correct strategy/method/model/variables/interface/responses specification instance will get updated. The difficulty in this case stems from the order of instantiation. Since the Variables and Response instances are constructed in the base Model class, prior to construction of Interface instances in derived Model classes, database information related to Variables and Response objects will have already been extracted by the time the Interface constructor is invoked and the database update will not propagate.
Therefore, it is preferred to perform these operations at a higher level (e.g., within your main program), prior to Strategy instantiation and execution, such that instantiation order is not an issue. However, in this case, it is necessary to explicitly manage the list nodes of the ProblemDescDB using a specification instance identifier that corresponds to an identifier from the input file, e.g.:
problem_db.set_db_variables_node("MY_VARIABLES_ID");
Dakota::RealVector drv(1000, 1.); // vector of length 1000, values initialized to 1.
problem_db.set("variables.continuous_design.initial_point", drv);
Alternatively, rather than setting just a single data node, all data nodes may be set using a method specification identifier:
problem_db.set_db_list_nodes("MY_METHOD_ID");
since the method specification is responsible for identifying a model specification, which in turn identifies variables, interface, and responses specifications. If hardwiring specification identifiers is undesirable, then
problem_db.resolve_top_method();
can also be used to deduce the active method specification and set all list nodes based on it. This is most appropriate in the case where only single specifications exist for method/model/variables/interface/responses. In each of these cases, setting list nodes unlocks the corresponding portions of the database, allowing set/get operations.
Once all direct database updates have been performed in this manner, calls to ProblemDescDB::broadcast() and ProblemDescDB::post_process() should be used on all processors. The former will broadcast a packed MPI buffer with the aggregated set of specification data from rank 0 to other processors, and the latter will post-process specification data to fill in any vector defaults that have not yet been provided through either file parsing or direct updates (Note: scalar defaults are handled in the Data class constructors). Refer to run_dakota_mixed() in library_mode.C for a complete example listing.
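A condensed sketch of the full mixed-mode sequence follows; the input file name and variables identifier are illustrative, and in a parallel context the parse and set steps should be performed on rank 0 as discussed above:

problem_db.parse_inputs("my_dakota_input.in"); // counts only; large vectors omitted
problem_db.set_db_variables_node("MY_VARIABLES_ID");
Dakota::RealVector drv(1000, 1.);
problem_db.set("variables.continuous_design.initial_point", drv);
problem_db.broadcast();    // all ranks participate
problem_db.post_process(); // fill in any remaining vector defaults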
With the ProblemDescDB object populated with problem data, we may now instantiate the strategy.
// instantiate the strategy
Strategy selected_strategy(problem_db);
Following strategy construction, all MPI communicator partitioning has been performed and the ParallelLibrary instance may be interrogated for parallel configuration data. For example, the lowest level communicators in DAKOTA's multilevel parallel partitioning are the analysis communicators, which can be retrieved using:
// retrieve the set of analysis communicators for simulation initialization:
// one analysis comm per ParallelConfiguration (PC), one PC per Model.
Array<MPI_Comm> analysis_comms = parallel_lib.analysis_intra_communicators();
These communicators can then be used for initializing parallel simulation instances, where the number of MPI communicators in the array corresponds to one communicator per ParallelConfiguration instance.
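For instance, the parent application might initialize one simulation instance per configuration; my_simulation_init() is a hypothetical routine in the parent application, and the size() accessor on the communicator array is assumed:

for (size_t i = 0; i < analysis_comms.size(); ++i)
  my_simulation_init(analysis_comms[i]); // hypothetical parent-app routine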
When employing a library interface to DAKOTA, it is frequently desirable to also use a direct interface between DAKOTA and the simulation. There are two approaches to defining this direct interface.
The first approach involves extending the existing DirectApplicInterface class to support additional direct simulation interfaces. In this case, a new simulation interface function can be added to Dakota/src/DirectApplicInterface.[CH] for the simulation of interest. If the new function will not be a member function, then the following prototype should be used in order to pass the required data:
int sim(const Dakota::Variables& vars, const Dakota::ActiveSet& set, Dakota::Response& response);
If the new function will be a member function, then this can be simplified to
int sim();
since the data access can be performed through the DirectApplicInterface class attributes.
This simulation can then be added to the logic blocks in DirectApplicInterface::derived_map_ac(). In addition, DirectApplicInterface::derived_map_if() and DirectApplicInterface::derived_map_of() can be extended to perform pre- and post-processing tasks if desired, but this is not required.
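As a hedged sketch, the added dispatch for the member-function case might look like the following, where "my_sim" is a hypothetical driver name, my_sim() is the corresponding new member function, and fail_code follows the existing integer return convention:

// inside DirectApplicInterface::derived_map_ac(), alongside the existing tests
else if (ac_name == "my_sim")
  fail_code = my_sim(); // data access via class attributes, per above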
While this approach is the simplest, it has the disadvantage that the DAKOTA library may need to be recompiled when the simulation or its direct interface is modified. If it is desirable to maintain the independence of the DAKOTA library from the host application, then the following derivation approach should be employed.
The second approach is to derive a new interface from DirectApplicInterface in order to redefine several virtual functions. A typical derived class declaration might be
namespace SIM {

class SerialDirectApplicInterface: public Dakota::DirectApplicInterface
{
public:

  // Constructor and destructor
  SerialDirectApplicInterface(const Dakota::ProblemDescDB& problem_db);
  ~SerialDirectApplicInterface();

protected:

  // Virtual function redefinitions
  int derived_map_if(const Dakota::String& if_name);
  int derived_map_ac(const Dakota::String& ac_name);
  int derived_map_of(const Dakota::String& of_name);

private:

  // Data
};

} // namespace SIM
where the new derived class resides in the simulation's namespace. Similar to the case of Extension, the DirectApplicInterface::derived_map_ac() function is the required redefinition, and DirectApplicInterface::derived_map_if() and DirectApplicInterface::derived_map_of() are optional.
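A minimal sketch of the required redefinition appears below. The sum-of-squares response is purely illustrative, and the attribute names used (xC, directFnASV, fnVals) are assumed from the plug-in examples in Dakota/src, so verify them against your DAKOTA version:

int SIM::SerialDirectApplicInterface::derived_map_ac(const Dakota::String& ac_name)
{
  // illustrative response: f = sum of squares of the continuous variables
  Dakota::Real fn_val = 0.;
  for (int i = 0; i < xC.length(); ++i)
    fn_val += xC[i] * xC[i];
  if (directFnASV[0] & 1) // function value requested
    fnVals[0] = fn_val;
  return 0; // no failure
}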
The new derived interface object (from namespace SIM) must now be plugged into the strategy. In the simplest case of a single model and interface, one could use
// retrieve the interface of interest
ModelList& all_models  = problem_db.model_list();
Model&     first_model = *all_models.begin();
Interface& interface   = first_model.interface();
// plug in the new direct interface instance (DB does not need to be set)
interface.assign_rep(new SIM::SerialDirectApplicInterface(problem_db), false);
from within the Dakota namespace. In a more advanced case of multiple models and multiple interface plug-ins, one might use
// retrieve the list of Models from the Strategy
ModelList& models = problem_db.model_list();
// iterate over the Model list
for (ModelLIter ml_iter = models.begin(); ml_iter != models.end(); ml_iter++) {
  Interface& interface = ml_iter->interface();
  if (interface.interface_type() == "direct" &&
      interface.analysis_drivers().contains("SIM")) {
    // set the correct list nodes within the DB prior to new instantiations
    problem_db.set_db_model_nodes(ml_iter->model_id());
    // plug in the new direct interface instance
    interface.assign_rep(new SIM::SerialDirectApplicInterface(problem_db), false);
  }
}
In the case where the simulation interface instance should manage parallel simulations within the context of an MPI communicator, one should pass in the relevant analysis communicator(s) to the derived constructor. For the latter case of looping over a set of models, the simplest approach of passing a single analysis communicator would use code similar to
const ParallelLevel& ea_level
  = ml_iter->parallel_configuration_iterator()->ea_parallel_level();
const MPI_Comm& analysis_comm = ea_level.server_intra_communicator();
interface.assign_rep(new
  SIM::ParallelDirectApplicInterface(problem_db, analysis_comm), false);
Since Models may be used in multiple parallel contexts and may therefore have a set of parallel configurations, a more general approach would extract and pass an array of analysis communicators to allow initialization for each of the parallel configurations.
New derived direct interface instances inherit various attributes of use in configuring the simulation. In particular, the ApplicationInterface::parallelLib reference provides access to MPI communicator data (e.g., the analysis communicators discussed in Instantiating the strategy), DirectApplicInterface::analysisDrivers provides the analysis driver names specified by the user in the input file, and DirectApplicInterface::analysisComponents provides additional analysis component identifiers (such as mesh file names) provided by the user which can be used to distinguish different instances of the same simulation interface. It is worth noting that inherited attributes that are set as part of the parallel configuration (instead of being extracted from the ProblemDescDB) will be set to their defaults following construction of the base class instance for the derived class plug-in. It is not until run-time (i.e., within derived_map_if/derived_map_ac/derived_map_of) that the parallel configuration settings are re-propagated to the plug-in instance. This is the reason that the analysis communicator should be passed in to the constructor of a parallel plug-in, if the constructor will be responsible for parallel application initialization.
As part of strategy instantiation, all problem specification data is extracted from ProblemDescDB as various objects are constructed. Therefore, any updates that need to be performed following strategy instantiation must be performed through direct set operations on the constructed objects. In the previous section, the process for updating the Interface object used within a Model was shown. To update other data such as variable values/bounds/tags or response bounds/targets/tags, refer to the set functions documented in Iterator and Model. As an example, the following code updates the active continuous variable values, which will be employed as the initial guess for certain classes of Iterators:
ModelList& all_models = problem_db.model_list();
Model& first_model = *all_models.begin();
Dakota::RealVector drv(1000, 1.); // vector of length 1000, values initialized to 1.
first_model.continuous_variables(drv);
Finally, with simulation configuration and plug-ins completed, we execute the strategy:
// run the strategy
selected_strategy.run_strategy();
After executing the strategy, final results can be obtained through the use of Strategy::variables_results() and Strategy::response_results(), e.g.:
// retrieve the final parameter values
const Variables& vars = selected_strategy.variables_results();
// retrieve the final response values
const Response& resp = selected_strategy.response_results();
In the case of optimization, the final design is returned, and in the case of uncertainty quantification, the final statistics are returned.
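These results can then be consumed by the parent application, e.g. written to its own reporting stream (stream insertion support for Variables and Response is assumed here):

std::cout << "Final variables:\n" << vars << "Final responses:\n" << resp;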
This section presumes DAKOTA has been compiled with configure/make and installed to PREFIX using 'make install'. While the DAKOTA build system offers the most up-to-date guidance for what libraries are needed to link against a particular version of DAKOTA, a typical case is presented here. Note that depending on how you configured DAKOTA, some of the following libraries may not be available (for example NPSOL, DOT, NLPQL) -- check which appear in $PREFIX/lib.
As of DAKOTA 5.0, -levidence is no longer required and -lgsl is optional (discouraged due to GPL), depending on how DAKOTA was configured.
Post DAKOTA 5.0, -lquadrature has been renamed to -lsparsegrid. In addition, the DFFTPACK library should be integrated into libpecos, so -ldfftpack should not be needed, and NKM should be integrated into libsurfpack, so -lnkm should not be needed.
Note that as of DAKOTA 5.2, -lnewmat is no longer required, but additional Boost libraries are needed (-lboost_regex -lboost_filesystem -lboost_system) as a result of migration from legacy DAKOTA utilities to more modern Boost components. It should also be noted that DAKOTA relies on VERSION 2 of Boost.Filesystem, which is provided in the source distribution under packages/boost (Boost.Filesystem VERSION 3 is NOT supported at this time).
DAKOTA_LIBS = -L${PREFIX}/lib -ldakota -lteuchos -lpecos -llhs \
  -lsparsegrid -lsurfpack -lconmin -lddace -ldot -lfsudace \
  -ljega -lcport -lnlpql -lnpsol -loptpp -lpsuade \
  -lncsuopt -lcolin -linterfaces -lmomh -lscolib -lpebbl \
  -ltinyxml -lutilib -l3po -lhopspack -lnidr -lamplsolver \
  -lboost_signals -lboost_regex -lboost_filesystem \
  -lboost_system -llapack -lblas
You may also need funcadd0.o, -lfl, -lexpat, and, if linking with system-provided GSL, -lgslcblas. The AMPL solver library may require -ldl. System compiler and math libraries may also need to be included. If configuring with graphics, you will need to add -lDGraphics and system X libraries (partial list here):
-lXpm -lXm -lXt -lXmu -lXp -lXext -lX11 -lSM -lICE
CMake support for library users is experimental. At present, library names and inclusion requirements differ slightly when using CMake, e.g., libpecos_src and libdakota_src instead of libpecos and libdakota, respectively. A representative (non-authoritative) set of necessary link libraries is shown here for illustration purposes only:
-ldakota_src -lteuchos -lnidr -lpecos -lpecos_src -llhs -lmods -lmod
-ldfftpack -lsparsegrid -lsurfpack -lsurfpack -lutilib -lcolin
-linterfaces -lscolib -l3po -lpebbl -ltinyxml -lconmin -ldace
-lanalyzer -lrandom -lsampling -lbose -ldot -lfsudace -lhopspack
-ljega -ljega_fe -lmoga -lsoga -leutils -lutilities -lncsuopt -lnlpql
-lcport -lnpsol -loptpp -lpsuade -lDGraphics -lamplsolver
-lboost_regex -lboost_signals -lboost_filesystem -lboost_system
-lSM -lICE -lX11 -lXext -lXm -lXpm -lXmu -lpthread -llapack -lcurl
-ldl -lutilib -ltinyxml -lm -lboost_regex -lboost_filesystem
-lboost_system -lexpat -lboost_signals -ljega -lteuchos -lblas -llapack
We have experienced problems with the creation of libamplsolver.a on some platforms. Please use the DAKOTA mailing lists for help with any problems.
Finally, it is important to use the same C++ compiler (possibly an MPI wrapper) for compiling DAKOTA and your application, and to declare the compiler define -DHAVE_CONFIG_H when including header files from DAKOTA. This ensures that the platform configuration settings are properly propagated.
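For example, an illustrative build sketch using an MPI compiler wrapper and the DAKOTA_LIBS list shown above (the file names are illustrative):

mpicxx -DHAVE_CONFIG_H -I${PREFIX}/include -c my_application.C
mpicxx my_application.o ${DAKOTA_LIBS} -o my_application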
To utilize the DAKOTA library within a parent software application, the basic steps of main.C and the order of invocation of these steps should be mimicked from within the parent application. Of these steps, ParallelLibrary instantiation, ProblemDescDB::manage_inputs() and ParallelLibrary::specify_outputs_restart() require the use of overloaded forms in order to function in an environment without direct command line access and, potentially, without file parsing. Additional optional steps not performed in main.C include the extension/derivation of the direct interface and the retrieval of strategy results after a run.
DAKOTA's library mode is now in production use within several Sandia and external simulation codes/frameworks.