The optimization package provides algorithms to optimize (i.e. either minimize or maximize) some objective or cost function. The package is split into several sub-packages dedicated to different kinds of functions or algorithms.
The top level optimization package provides common interfaces for the optimization algorithms provided in sub-packages. The main interfaces define optimizers and convergence checkers. The functions that are optimized by the algorithms provided by this package and its sub-packages are a subset of the ones defined in the analysis package, namely the real and vector valued functions. These functions are called objective functions here. When the goal is to minimize, such functions are often called cost functions, but this name is not used in this package. The type of goal, i.e. minimization or maximization, is defined by the enumerated type GoalType, which has only two values: MAXIMIZE and MINIMIZE.
Optimizers are the algorithms that will either minimize or maximize the objective function by changing its input variable set until an optimal set is found. There are only four interfaces defining the common behavior of optimizers, one for each supported type of objective function.
Although there are only four types of supported optimizers, it is possible to optimize a non-differentiable multivariate vectorial function by converting it to a non-differentiable multivariate real function thanks to the LeastSquaresConverter helper class. The transformed function can be optimized using any implementation of the MultivariateRealOptimizer interface.
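For illustration, here is a minimal sketch of such a conversion, assuming the Commons Math 2.x API; the vectorial model and the observed values are made up for the example:

// hypothetical vectorial model with two components depending on two free parameters
MultivariateVectorialFunction model = new MultivariateVectorialFunction() {
    public double[] value(double[] p) {
        return new double[] { p[0] + p[1], p[0] - p[1] };
    }
};

// observed values that the model components should match (made-up numbers)
double[] observations = new double[] { 2.0, 0.5 };

// wrap the vectorial function into a scalar sum-of-squared-residuals function
MultivariateRealFunction wrapped = new LeastSquaresConverter(model, observations);
// the wrapped function can now be passed to any MultivariateRealOptimizer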
For each of the four types of supported optimizers, there is a special implementation that wraps a classical optimizer in order to add a multi-start feature to it. This feature calls the underlying optimizer several times in sequence with different starting points and returns the best optimum found, or all optima if desired. This is a classical way to prevent being trapped in a local extremum when looking for a global one.
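For example, a multi-start wrapping could be sketched as follows, assuming the Commons Math 2.x classes MultiStartMultivariateRealOptimizer and NelderMead; the number of starts and the random generator settings are arbitrary choices for the example:

// generate random 2-dimensional start points around the origin
RandomVectorGenerator generator =
    new UncorrelatedRandomVectorGenerator(new double[] { 0.0, 0.0 },
                                          new double[] { 1.0, 1.0 },
                                          new GaussianRandomGenerator(new JDKRandomGenerator()));

// restart the underlying Nelder-Mead optimizer 10 times from these random points
MultivariateRealOptimizer optimizer =
    new MultiStartMultivariateRealOptimizer(new NelderMead(), 10, generator);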
A UnivariateRealOptimizer is used to find the minimal values of a univariate real-valued function f. The usage of these algorithms is very similar to the usage of the root-finding algorithms explained in the analysis package. The main difference is that the solve methods of the root-finding algorithms are replaced by optimize methods.
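For instance, a univariate minimization with the BrentOptimizer implementation could be sketched as follows, assuming the Commons Math 2.x API; the function and the search interval are made up for the example, and checked exceptions are omitted for brevity:

// function to minimize on the interval [-10, 10]: f(x) = (x - 2)^2 + 1
UnivariateRealFunction f = new UnivariateRealFunction() {
    public double value(double x) {
        return (x - 2.0) * (x - 2.0) + 1.0;
    }
};

UnivariateRealOptimizer optimizer = new BrentOptimizer();
double xmin = optimizer.optimize(f, GoalType.MINIMIZE, -10.0, 10.0);  // abscissa of the minimum
double fmin = optimizer.getFunctionValue();                           // function value at the minimum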
This package provides an implementation of George Dantzig's simplex algorithm for solving linear optimization problems with linear equality and inequality constraints.
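As an illustration, a small linear program could be set up as sketched below, assuming the Commons Math 2.x API; the objective coefficients and constraints are made up for the example:

// maximize 2x + 3y subject to x + y <= 10 and x - y >= 0, with x >= 0 and y >= 0
LinearObjectiveFunction f = new LinearObjectiveFunction(new double[] { 2, 3 }, 0);
Collection<LinearConstraint> constraints = new ArrayList<LinearConstraint>();
constraints.add(new LinearConstraint(new double[] { 1,  1 }, Relationship.LEQ, 10));
constraints.add(new LinearConstraint(new double[] { 1, -1 }, Relationship.GEQ,  0));

RealPointValuePair solution =
    new SimplexSolver().optimize(f, constraints, GoalType.MAXIMIZE, true);
double[] point = solution.getPoint();  // optimal values of x and y
double max = solution.getValue();      // objective function value at the optimum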
Direct search methods only use cost function values; they don't need derivatives and don't try to compute approximations of the derivatives either. According to a 1996 paper by Margaret H. Wright (Direct Search Methods: Once Scorned, Now Respectable), they are used when either the computation of the derivative is impossible (noisy functions, unpredictable discontinuities) or difficult (complexity, computation cost). In the former case, rather than an optimum, a not-too-bad point is desired. In the latter case, an optimum is desired but cannot reasonably be found. In all cases direct search methods can be useful.
Simplex-based direct search methods are based on comparison of the cost function values at the vertices of a simplex (which is a set of n+1 points in dimension n) that is updated by the algorithm's steps.
The instances can be built either in single-start or in multi-start mode. Multi-start is a traditional way to try to avoid being trapped in a local minimum and missing the global minimum of a function. It can also be used to verify the convergence of an algorithm. In multi-start mode, the minimizes method returns the best minimum found after all starts, and the getMinima method can be used to retrieve all minima from all starts (including the one already provided by the minimizes method).
The direct package provides two solvers. The first one is the classical Nelder-Mead method. The second one is Virginia Torczon's multi-directional method.
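For illustration, a minimization with the Nelder-Mead solver could be sketched as follows, assuming the Commons Math 2.x API; the function and the starting point are made up for the example:

// Rosenbrock function of two variables, whose minimum is 0 at (1, 1)
MultivariateRealFunction rosenbrock = new MultivariateRealFunction() {
    public double value(double[] p) {
        double a = p[1] - p[0] * p[0];
        double b = 1.0 - p[0];
        return 100.0 * a * a + b * b;
    }
};

NelderMead optimizer = new NelderMead();
RealPointValuePair optimum =
    optimizer.optimize(rosenbrock, GoalType.MINIMIZE, new double[] { -1.2, 1.0 });
double[] point = optimum.getPoint();  // approximately { 1, 1 }
double value = optimum.getValue();    // approximately 0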
The general package deals with non-linear vectorial optimization problems when the partial derivatives of the objective function are available.
One important class of estimation problems is weighted least squares problems. They basically consist in finding the values for some parameters pk such that a cost function J = sum(wi (targeti - modeli)2) is minimized. The various (targeti - modeli(pk)) terms are called residuals. They represent the deviation between a set of target values targeti and theoretical values computed from models modeli depending on free parameters pk. The wi factors are weights. One classical use case is when the target values are experimental observations or measurements.
Solving a least-squares problem means finding the free parameters pk of the theoretical models such that the model values are close to the target values, i.e. such that the residuals are small.
Two optimizers are available in the general package, both devoted to least-squares problems. The first one is based on the Gauss-Newton method. The second one is the Levenberg-Marquardt method.
In order to solve a vectorial optimization problem, the user must provide it as an object implementing the DifferentiableMultivariateVectorialFunction interface. The object will be provided to the optimize method of the optimizer, along with the target and weight arrays, thus allowing the optimizer to compute the residuals at will. The last parameter to the optimize method is the point from which the optimizer will start its search for the optimal point.
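As an illustration, a tiny weighted least-squares problem solved with the Levenberg-Marquardt optimizer could be sketched as follows, assuming the Commons Math 2.x API; the linear model, target values and weights are made up for the example:

// model with two components a*x + b evaluated at x = 0 and x = 1,
// the free parameters being p = { a, b }
DifferentiableMultivariateVectorialFunction model =
    new DifferentiableMultivariateVectorialFunction() {
        public double[] value(double[] p) {
            return new double[] { p[1], p[0] + p[1] };
        }
        public MultivariateMatrixFunction jacobian() {
            return new MultivariateMatrixFunction() {
                public double[][] value(double[] p) {
                    return new double[][] { { 0, 1 }, { 1, 1 } };
                }
            };
        }
    };

double[] target  = { 1.0, 3.0 };  // observed values
double[] weights = { 1.0, 1.0 };  // equal confidence in both observations
double[] start   = { 0.0, 0.0 };  // initial guess for { a, b }

LevenbergMarquardtOptimizer optimizer = new LevenbergMarquardtOptimizer();
VectorialPointValuePair optimum = optimizer.optimize(model, target, weights, start);
double[] fitted = optimum.getPoint();  // approximately { 2, 1 }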
In addition to least squares solving, the NonLinearConjugateGradientOptimizer class provides a non-linear conjugate gradient algorithm to optimize DifferentiableMultivariateRealFunction. Both the Fletcher-Reeves and the Polak-Ribière search direction update methods are supported. It is also possible to set up a preconditioner or to change the line-search algorithm of the inner loop if desired (the default one is a Brent solver).
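For example, a gradient-based minimization could be sketched as follows, assuming the Commons Math 2.x API; the quadratic function and the starting point are made up for the example:

// differentiable function of two variables: f(x, y) = (x - 1)^2 + (y + 2)^2
DifferentiableMultivariateRealFunction f =
    new DifferentiableMultivariateRealFunction() {
        public double value(double[] p) {
            return (p[0] - 1) * (p[0] - 1) + (p[1] + 2) * (p[1] + 2);
        }
        public MultivariateVectorialFunction gradient() {
            return new MultivariateVectorialFunction() {
                public double[] value(double[] p) {
                    return new double[] { 2 * (p[0] - 1), 2 * (p[1] + 2) };
                }
            };
        }
        public MultivariateRealFunction partialDerivative(final int k) {
            return new MultivariateRealFunction() {
                public double value(double[] p) {
                    return (k == 0) ? 2 * (p[0] - 1) : 2 * (p[1] + 2);
                }
            };
        }
    };

NonLinearConjugateGradientOptimizer optimizer =
    new NonLinearConjugateGradientOptimizer(ConjugateGradientFormula.POLAK_RIBIERE);
RealPointValuePair optimum =
    optimizer.optimize(f, GoalType.MINIMIZE, new double[] { 0.0, 0.0 });
double[] point = optimum.getPoint();  // approximately { 1, -2 }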
The fitting package deals with curve fitting for univariate real functions. When a univariate real function y = f(x) depends on some unknown parameters p0, p1 ... pn-1, curve fitting can be used to find these parameters. It does this by fitting the curve so it remains very close to a set of observed points (x0, y0), (x1, y1) ... (xk-1, yk-1). This fitting is done by finding the parameter values that minimize the objective function sum((yi - f(xi))2). This is really a least squares problem.
For all provided curve fitters, the operating principle is the same. Users must first create an instance of the fitter, then add the observed points, and once the complete sample of observed points has been added they must call the fit method, which will compute the parameters that best fit the sample. A weight is associated with each observed point; this allows taking into account uncertainty on some points, for example when they come from noisy measurements. If no such information exists and all points should be treated the same, it is safe to put 1.0 as the weight for all points.
The CurveFitter class provides curve fitting for general curves. Users must provide their own implementation of the curve template as a class implementing the ParametricRealFunction interface, and they must provide the initial guess of the parameters. The more specialized PolynomialFitter and HarmonicFitter classes require neither an implementation of the parametric real function nor an initial guess, as they are able to compute them by themselves.
An example of fitting a polynomial is given here:
PolynomialFitter fitter = new PolynomialFitter(degree, new LevenbergMarquardtOptimizer());
fitter.addObservedPoint(-1.00,  2.021170021833143);
fitter.addObservedPoint(-0.99,  2.221135431136975);
fitter.addObservedPoint(-0.98,  2.09985277659314);
fitter.addObservedPoint(-0.97,  2.0211192647627025);
// lots of lines omitted
fitter.addObservedPoint( 0.99, -2.4345814727089854);
PolynomialFunction fitted = fitter.fit();