Package mdp
The Modular toolkit for Data Processing (MDP) is a library of widely
used data processing algorithms that can be combined according to a
pipeline analogy to build more complex data processing software.
From the user's perspective, MDP consists of a collection of
supervised and unsupervised learning algorithms, and other data
processing units (nodes) that can be combined into data processing
sequences (flows) and more complex feed-forward network
architectures. Given a set of input data, MDP takes care of
successively training or executing all nodes in the network. This
allows the user to specify complex algorithms as a series of simpler
data processing steps in a natural way.
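The node and flow idea described above can be sketched in plain Python. The classes below (ToyNode, CenterNode, ScaleNode, ToyFlow) are hypothetical stand-ins written for this illustration. They are not the MDP classes and only mimic the general train-then-execute pattern.

```python
# Toy illustration only: these classes are NOT part of the MDP API.

class ToyNode:
    """A processing unit with an optional training phase."""

    def train(self, data):
        pass  # stateless nodes have nothing to learn

    def stop_training(self):
        pass

    def execute(self, data):
        raise NotImplementedError


class CenterNode(ToyNode):
    """Learns the mean of the training data and subtracts it."""

    def __init__(self):
        self.total = 0.0
        self.count = 0
        self.mean = None

    def train(self, data):
        self.total += sum(data)
        self.count += len(data)

    def stop_training(self):
        self.mean = self.total / self.count

    def execute(self, data):
        return [v - self.mean for v in data]


class ScaleNode(ToyNode):
    """Learns the largest absolute value and rescales into [-1, 1]."""

    def __init__(self):
        self.peak = 0.0

    def train(self, data):
        self.peak = max([self.peak] + [abs(v) for v in data])

    def execute(self, data):
        return [v / self.peak for v in data]


class ToyFlow:
    """Trains and executes a sequence of nodes, one after the other."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def train(self, data):
        # Node i is trained on the data as transformed by nodes 0..i-1.
        for i, node in enumerate(self.nodes):
            current = data
            for earlier in self.nodes[:i]:
                current = earlier.execute(current)
            node.train(current)
            node.stop_training()

    def execute(self, data):
        for node in self.nodes:
            data = node.execute(data)
        return data


flow = ToyFlow([CenterNode(), ScaleNode()])
flow.train([1.0, 2.0, 3.0, 6.0])
result = flow.execute([1.0, 2.0, 3.0, 6.0])
```

The caller only builds the sequence and hands over the data. The flow decides which node is trained on which intermediate result.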
The collection of available algorithms is steadily growing and includes,
to name only the most common, Principal Component Analysis (PCA and
NIPALS), several Independent Component Analysis algorithms (CuBICA,
FastICA, TDSEP, JADE, and XSFA), Slow Feature Analysis, Gaussian
Classifiers, Restricted Boltzmann Machines, and Locally Linear
Embedding.
Particular care has been taken to make computations efficient in terms
of speed and memory. To reduce memory requirements, it is possible to
perform learning using batches of data, and to define the internal
parameters of the nodes to be single precision, which makes the usage
of very large data sets possible. Moreover, the 'parallel' subpackage
offers a parallel implementation of the basic nodes and flows.
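To illustrate why learning from batches keeps memory use low, here is a minimal sketch that accumulates the covariance of two variables one batch at a time. The class RunningCovariance is a hypothetical helper written for this example, not an MDP class. Only a few running sums are kept, never the full data set.

```python
# Toy illustration only: RunningCovariance is NOT part of the MDP API.

class RunningCovariance:
    """Accumulates the covariance of two variables from batches of pairs."""

    def __init__(self):
        self.n = 0
        self.sum_x = 0.0
        self.sum_y = 0.0
        self.sum_xy = 0.0

    def update(self, batch):
        # Each batch is a list of (x, y) observations.
        for x, y in batch:
            self.n += 1
            self.sum_x += x
            self.sum_y += y
            self.sum_xy += x * y

    def covariance(self):
        # Population covariance: E[xy] - E[x] * E[y].
        mean_x = self.sum_x / self.n
        mean_y = self.sum_y / self.n
        return self.sum_xy / self.n - mean_x * mean_y


acc = RunningCovariance()
acc.update([(1.0, 2.0), (2.0, 4.0)])   # first batch
acc.update([(3.0, 6.0), (4.0, 8.0)])   # second batch
cov = acc.covariance()
```

This simple sum-based formula is enough for a sketch. A production implementation would usually prefer a numerically more stable update.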
From the developer's perspective, MDP is a framework that makes the
implementation of new supervised and unsupervised learning algorithms
easy and straightforward. The basic class, 'Node', takes care of
tedious tasks like numerical type and dimensionality checking, leaving
the developer free to concentrate on the implementation of the
learning and execution phases. Because of the common interface, the
node then automatically integrates with the rest of the library and
can be used in a network together with other nodes. A node can have
multiple training phases and even an undetermined number of phases.
This allows the implementation of algorithms that need to collect some
statistics on the whole input before proceeding with the actual
training, and others that need to iterate over a training phase until
a convergence criterion is satisfied. The ability to train each phase
using chunks of input data is maintained if the chunks are generated
with iterators. Moreover, crash recovery is optionally available: in
case of failure, the current state of the flow is saved for later
inspection.
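A node with more than one training phase can be sketched as follows. The names TwoPhaseNode, make_chunks, and train_all_phases are hypothetical and chosen for this illustration only. The point is that every phase needs its own pass over the data, so the chunks come from a function that returns a fresh iterator on each call.

```python
# Toy illustration only: none of these names belong to the MDP API.

class TwoPhaseNode:
    """Phase 1 learns the mean; phase 2 learns the mean absolute
    deviation about that mean, which requires the finished mean."""

    def __init__(self):
        self._sum = 0.0
        self._count = 0
        self._dev_sum = 0.0
        self.mean = None
        self.spread = None
        # One (train, finish) pair per training phase, run in order.
        self.phases = [
            (self._collect_sum, self._finish_mean),
            (self._collect_deviation, self._finish_spread),
        ]

    def _collect_sum(self, chunk):
        self._sum += sum(chunk)
        self._count += len(chunk)

    def _finish_mean(self):
        self.mean = self._sum / self._count

    def _collect_deviation(self, chunk):
        self._dev_sum += sum(abs(v - self.mean) for v in chunk)

    def _finish_spread(self):
        self.spread = self._dev_sum / self._count


def make_chunks():
    """Return a fresh iterator over the data chunks on every call."""
    yield [1.0, 2.0]
    yield [3.0]
    yield [6.0]


def train_all_phases(node, chunk_source):
    # Each phase consumes its own full pass over the chunks.
    for train_chunk, finish in node.phases:
        for chunk in chunk_source():
            train_chunk(chunk)
        finish()


node = TwoPhaseNode()
train_all_phases(node, make_chunks)
```

A plain generator object would be exhausted after the first phase, which is why the sketch passes the generator function itself.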
MDP has been written in the context of theoretical research in
neuroscience, but it has been designed to be helpful in any context
where trainable data processing algorithms are used. Its simplicity on
the user side, together with the reusability of the implemented nodes,
also makes it a useful educational tool.
http://mdp-toolkit.sourceforge.net
Version:
2.6
Author:
Pietro Berkes, Henning Sprekeler, Niko Wilbert, and Tiziano Zito
Contact:
mdp-toolkit-users AT lists.sourceforge.net
Copyright:
(c) 2003-2009 Pietro Berkes, Henning Sprekeler, Niko Wilbert,
Tiziano Zito
License:
LGPL v3, http://www.gnu.org/licenses/lgpl.html
|
CheckpointFlow
Subclass of Flow class that allows user-supplied checkpoint functions
to be executed at the end of each phase, for example to
save the internal structures of a node for later analysis.
|
|
CheckpointFunction
Base class for checkpoint functions.
|
|
CheckpointSaveFunction
This checkpoint function saves the node in pickle format.
|
|
CrashRecoveryException
Class to handle crash recovery.
|
|
Cumulator
A Cumulator is a Node whose training phase simply collects
all input data.
|
|
ExtensionNode
Base class for extension nodes.
|
|
ExtensionNodeMetaclass
This is the metaclass for node extension superclasses.
|
|
Flow
A 'Flow' is a sequence of nodes that are trained and executed
together to form a more complex algorithm.
|
|
FlowException
Base class for exceptions in Flow subclasses.
|
|
FlowExceptionCR
Class to handle flow-crash recovery.
|
|
IsNotInvertibleException
Raised when the 'inverse' method is called although the
node is not invertible.
|
|
IsNotTrainableException
Raised when the 'train' method is called although the
node is not trainable.
|
|
MDPException
Base class for exceptions in MDP.
|
|
MDPWarning
Base class for warnings in MDP.
|
|
Node
A 'Node' is the basic building block of an MDP application.
|
|
NodeException
Base class for exceptions in Node subclasses.
|
|
TrainingException
Base class for exceptions in the training phase.
|
|
TrainingFinishedException
Raised when the 'train' method is called although the
training phase is closed.
|
|
activate_extension(extension_name)
Activate the extension by injecting the extension methods. |
|
|
|
extension_method(ext_name,
node_cls,
method_name=None)
Returns a decorator that registers a function as an extension method. |
|
|
|
get_eta(x,
**kwargs)
Compute eta values (a slowness measure) of the input data. |
|
|
|
pca(x,
**kwargs)
Filters multidimensional input data through its principal components. |
|
|
|
sfa(x,
**kwargs)
Perform Slow Feature Analysis on input data using the SFA
algorithm by Laurenz Wiskott. |
|
|
|
test(suitename='all',
verbosity=2,
seed=None,
testname=None) |
|
|
|
whitening(x,
**kwargs)
Filters multidimensional input data through its principal components,
rescaling the output signals such that they have unit variance. |
|
|
|
with_extension(extension_name)
Return a wrapper function to activate and deactivate the extension. |
|
|
activate_extension(extension_name)
|
|
Activate the extension by injecting the extension methods.
|
activate_extensions(extension_names)
|
|
Activate all the extensions for the given list of names.
|
deactivate_extension(extension_name)
|
|
Deactivate the extension by removing the injected methods.
|
deactivate_extensions(extension_names)
|
|
Deactivate all the extensions for the given list of names.
extension_names -- Sequence of extension names.
|
extension_method(ext_name,
node_cls,
method_name=None)
|
|
Returns a decorator that registers a function as an extension method.
This function is intended to be used with the decorator syntax.
ext_name -- String with the name of the extension.
node_cls -- Node class for which the method should be registered.
method_name -- Name of the extension method (default value is None).
If no value is provided then the name of the function is used.
Note that the registered function can directly call other extension
functions, call extension methods in other node classes, or use super in
the normal way (the function will be called as a method of the node class).
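The register, activate, and deactivate cycle can be illustrated with a small stand-alone sketch. Everything prefixed with toy_ below is hypothetical and written for this example. It assumes that "injecting" means setting the registered function as an attribute on the target class, and it does not reproduce MDP's actual implementation.

```python
# Toy illustration only: this is NOT the MDP extension machinery.

_registry = {}        # extension name -> {(class, method name): function}
_saved = {}           # remembers what each injected attribute replaced
_MISSING = object()   # marks "the class had no such attribute before"


def toy_extension_method(ext_name, node_cls, method_name=None):
    """Return a decorator that registers a function for later injection."""
    def register(func):
        name = method_name or func.__name__
        _registry.setdefault(ext_name, {})[(node_cls, name)] = func
        return func
    return register


def toy_activate(ext_name):
    """Inject every registered function into its target class."""
    for (cls, name), func in _registry[ext_name].items():
        _saved[(ext_name, cls, name)] = cls.__dict__.get(name, _MISSING)
        setattr(cls, name, func)


def toy_deactivate(ext_name):
    """Remove the injected functions and restore any originals."""
    for (cls, name) in _registry[ext_name]:
        original = _saved.pop((ext_name, cls, name))
        if original is _MISSING:
            delattr(cls, name)
        else:
            setattr(cls, name, original)


class ToyNode:
    def __init__(self, label):
        self.label = label


@toy_extension_method("describe", ToyNode)
def describe(self):
    # Called as a method of ToyNode once the extension is active.
    return "node " + self.label


node = ToyNode("a")
before = hasattr(node, "describe")
toy_activate("describe")
during = node.describe()
toy_deactivate("describe")
after = hasattr(node, "describe")
```

Because the function is set on the class rather than on an instance, existing instances gain and lose the method together with the activation state.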
|
get_eta(x,
**kwargs)
|
|
Compute eta values (a slowness measure) of the input data.
The delta value of a signal is a measure of its temporal
variation, and is defined as the mean of the squared time derivative,
i.e. delta(x) = mean((dx/dt)^2). delta(x) is zero if
x is a constant signal, and increases if the temporal variation
of the signal is larger.
The eta value is a more intuitive measure of temporal variation,
defined as
eta(x) = T/(2*pi) * sqrt(delta(x))
If x is a signal of length T which consists of a sine function
that completes exactly N oscillations, then eta(x) = N.
Input data are normalized to have unit variance, such that it is
possible to compare the temporal variation of two signals
independently from their scaling.
Observations of the same variable are stored on rows, different variables
are stored on columns.
This is a shortcut function for the corresponding node EtaComputerNode.
If any keyword arguments are specified, they are passed to its constructor.
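The formulas above can be checked numerically with a short sketch. The function eta_value is a hypothetical helper that transcribes the definitions directly for a single one-dimensional signal, assuming a unit time step and a finite-difference derivative. It is not the EtaComputerNode implementation and does not accept the two-dimensional arrays that get_eta works on.

```python
import math

# Toy illustration only: a direct transcription of the formulas,
# NOT the MDP implementation.

def eta_value(signal):
    """Eta of a 1-D, non-constant signal sampled with unit time step."""
    T = len(signal)
    mean = sum(signal) / T
    var = sum((v - mean) ** 2 for v in signal) / T
    std = math.sqrt(var)  # zero for a constant signal, which is not handled
    # Normalize to zero mean and unit variance so scaling does not matter.
    x = [(v - mean) / std for v in signal]
    # Approximate dx/dt by the difference of consecutive samples.
    diffs = [b - a for a, b in zip(x, x[1:])]
    delta = sum(d * d for d in diffs) / len(diffs)
    return T / (2 * math.pi) * math.sqrt(delta)


T, N = 1000, 5
sine = [math.sin(2 * math.pi * N * t / T) for t in range(T)]
eta = eta_value(sine)                           # close to N
eta_scaled = eta_value([3.0 * v for v in sine])  # scaling has no effect
```

The result is only approximately N because the derivative is estimated from finite differences. The approximation improves as T grows relative to N.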
|
get_extensions()
|
|
Return a dictionary of the currently registered extensions.
Be careful that this is not a copy, so if you change anything in this dict
then the whole extension mechanism will be affected. If you just want the
names of the available extensions use get_extensions().keys().
|
pca(x,
**kwargs)
|
|
Filters multidimensional input data through its principal components.
Observations of the same variable are stored on rows, different variables
are stored on columns.
This is a shortcut function for the corresponding node PCANode. If any
keyword arguments are specified, they are passed to its constructor.
This is equivalent to mdp.nodes.PCANode(**kwargs)(x)
|
sfa(x,
**kwargs)
|
|
Perform Slow Feature Analysis on input data using the SFA
algorithm by Laurenz Wiskott.
Observations of the same variable are stored on rows, different variables
are stored on columns.
This is a shortcut function for the corresponding node SFANode.
If any keyword arguments are specified, they are passed to its constructor.
This is equivalent to mdp.nodes.SFANode(**kwargs)(x)
|
test(suitename='all',
verbosity=2,
seed=None,
testname=None)
|
|
|
whitening(x,
**kwargs)
|
|
Filters multidimensional input data through its principal components,
rescaling the output signals such that they have unit variance.
Observations of the same variable are stored on rows, different variables
are stored on columns.
This is a shortcut function for the corresponding node WhiteningNode.
If any keyword arguments are specified, they are passed to its constructor.
This is equivalent to mdp.nodes.WhiteningNode(**kwargs)(x)
|
with_extension(extension_name)
|
|
Return a wrapper function to activate and deactivate the extension.
This function is intended to be used with the decorator syntax.
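The activate-then-deactivate wrapping can be sketched with a plain decorator. The names toy_with_extension, toy_activate, toy_deactivate, and the set active are hypothetical and exist only in this example. In particular, how MDP treats an extension that is already active when the wrapped function is called is not covered here.

```python
import functools

# Toy illustration only: this is NOT the MDP implementation.

active = set()  # names of the currently active toy extensions


def toy_activate(name):
    active.add(name)


def toy_deactivate(name):
    active.discard(name)


def toy_with_extension(name):
    """Return a decorator that keeps the extension active during the call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            toy_activate(name)
            try:
                return func(*args, **kwargs)
            finally:
                # Deactivate even if the wrapped function raises.
                toy_deactivate(name)
        return wrapper
    return decorator


@toy_with_extension("parallel")
def is_parallel_active():
    return "parallel" in active


inside = is_parallel_active()
outside = "parallel" in active
```

The try/finally block is the important design choice: the extension is switched off again regardless of how the wrapped call ends.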
|