MLPACK  1.0.4
mlpack::hmm::HMM< Distribution > Class Template Reference

A class that represents a Hidden Markov Model with an arbitrary type of emission distribution.

Public Member Functions

 HMM (const size_t states, const Distribution emissions)
 Create the Hidden Markov Model with the given number of hidden states and the given default distribution for emissions.
 HMM (const arma::mat &transition, const std::vector< Distribution > &emission)
 Create the Hidden Markov Model with the given transition matrix and the given emission distributions.
size_t Dimensionality () const
 Get the dimensionality of observations.
size_t & Dimensionality ()
 Return a modifiable reference to the dimensionality of observations, which can be used to set it.
const std::vector< Distribution > & Emission () const
 Return the emission distributions.
std::vector< Distribution > & Emission ()
 Return a modifiable reference to the emission distributions.
double Estimate (const arma::mat &dataSeq, arma::mat &stateProb, arma::mat &forwardProb, arma::mat &backwardProb, arma::vec &scales) const
 Estimate the probabilities of each hidden state at each time step for each given data observation, using the Forward-Backward algorithm.
double Estimate (const arma::mat &dataSeq, arma::mat &stateProb) const
 Estimate the probabilities of each hidden state at each time step of each given data observation, using the Forward-Backward algorithm.
void Generate (const size_t length, arma::mat &dataSequence, arma::Col< size_t > &stateSequence, const size_t startState=0) const
 Generate a random data sequence of the given length.
double LogLikelihood (const arma::mat &dataSeq) const
 Compute the log-likelihood of the given data sequence.
double Predict (const arma::mat &dataSeq, arma::Col< size_t > &stateSeq) const
 Compute the most probable hidden state sequence for the given data sequence, using the Viterbi algorithm, returning the log-likelihood of the most likely state sequence.
void Train (const std::vector< arma::mat > &dataSeq)
 Train the model using the Baum-Welch algorithm, with only the given unlabeled observations.
void Train (const std::vector< arma::mat > &dataSeq, const std::vector< arma::Col< size_t > > &stateSeq)
 Train the model using the given labeled observations; the transition and emission matrices are directly estimated.
const arma::mat & Transition () const
 Return the transition matrix.
arma::mat & Transition ()
 Return a modifiable transition matrix reference.

Private Member Functions

void Backward (const arma::mat &dataSeq, const arma::vec &scales, arma::mat &backwardProb) const
 The Backward algorithm (part of the Forward-Backward algorithm).
void Forward (const arma::mat &dataSeq, arma::vec &scales, arma::mat &forwardProb) const
 The Forward algorithm (part of the Forward-Backward algorithm).

Private Attributes

size_t dimensionality
 Dimensionality of observations.
std::vector< Distribution > emission
 Set of emission probability distributions; one for each state.
arma::mat transition
 Transition probability matrix.

Detailed Description

template<typename Distribution = distribution::DiscreteDistribution>
class mlpack::hmm::HMM< Distribution >

A class that represents a Hidden Markov Model with an arbitrary type of emission distribution.

This HMM class supports training (supervised and unsupervised), prediction of state sequences via the Viterbi algorithm, estimation of state probabilities, generation of random sequences, and calculation of the log-likelihood of a given sequence.

The template parameter, Distribution, specifies the distribution which the emissions follow. The class should implement the following functions:

 class Distribution
 {
  public:
   // The type of observation used by this distribution.
   typedef something DataType;

   // Return the probability of the given observation.
   double Probability(const DataType& observation) const;

   // Estimate the distribution based on the given observations.
   void Estimate(const std::vector<DataType>& observations);

   // Estimate the distribution based on the given observations, given also
   // the probability of each observation coming from this distribution.
   void Estimate(const std::vector<DataType>& observations,
                 const std::vector<double>& probabilities);
 };

See the mlpack::distribution::DiscreteDistribution class for an example. One would use the DiscreteDistribution class when the observations are non-negative integers. Other distributions could be Gaussians, a mixture of Gaussians (GMM), or any other probability distribution implementing the Distribution interface shown above.
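As an illustration of the interface above, here is a minimal sketch of a class that satisfies it using only the standard library. The name ToyDiscrete, its constructor, and its internal storage are hypothetical and are not part of mlpack; the real DiscreteDistribution may differ in every detail beyond the functions listed above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy distribution over the symbols 0 .. numSymbols - 1.
// It is not mlpack code; it only mirrors the interface listed above.
class ToyDiscrete
{
 public:
  // The type of observation used by this distribution.
  typedef std::size_t DataType;

  // Start from a uniform distribution over numSymbols symbols.
  explicit ToyDiscrete(const std::size_t numSymbols) :
      probabilities(numSymbols, 1.0 / numSymbols)
  {
    assert(numSymbols > 0);
  }

  // Return the probability of the given observation.
  double Probability(const DataType& observation) const
  {
    assert(observation < probabilities.size());
    return probabilities[observation];
  }

  // Estimate the distribution from observations, all weighted equally.
  void Estimate(const std::vector<DataType>& observations)
  {
    Estimate(observations, std::vector<double>(observations.size(), 1.0));
  }

  // Estimate the distribution from observations, weighting each one by the
  // probability that it came from this distribution.
  void Estimate(const std::vector<DataType>& observations,
                const std::vector<double>& weights)
  {
    assert(observations.size() == weights.size());
    std::vector<double> totals(probabilities.size(), 0.0);
    double sum = 0.0;
    for (std::size_t i = 0; i < observations.size(); ++i)
    {
      assert(observations[i] < totals.size());
      totals[observations[i]] += weights[i];
      sum += weights[i];
    }

    // With no usable weight there is nothing to learn; keep the old estimate.
    if (sum <= 0.0)
      return;

    for (std::size_t k = 0; k < totals.size(); ++k)
      probabilities[k] = totals[k] / sum;
  }

 private:
  std::vector<double> probabilities;
};
```

The weighted overload is the one an unsupervised training procedure would plausibly call, since each observation then belongs to each hidden state only with some probability.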

Usage of the HMM class generally involves either training an HMM or loading an already-known HMM and taking probability measurements of sequences. Example code for supervised training of a Gaussian HMM (that is, where the emission output distribution is a single Gaussian for each hidden state) is given below.

 extern arma::mat observations; // Each column is an observation.
 extern arma::Col<size_t> states; // Hidden states for each observation.

 // Create an untrained HMM with 5 hidden states and default (N(0, 1))
 // Gaussian distributions with the dimensionality of the dataset.
 HMM<GaussianDistribution> hmm(5, GaussianDistribution(observations.n_rows));

 // Train() takes vectors of sequences, so wrap the single sequence and its
 // labels in one-element vectors.
 std::vector<arma::mat> sequences(1, observations);
 std::vector<arma::Col<size_t> > labels(1, states);

 // Train the HMM (the labels could be omitted to perform unsupervised
 // training).
 hmm.Train(sequences, labels);

Once initialized, the HMM can evaluate the probability of a certain sequence (with LogLikelihood()), predict the most likely sequence of hidden states (with Predict()), generate a sequence (with Generate()), or estimate the probabilities of each state for a sequence of observations (with Estimate()).

Template Parameters:
Distribution: Type of emission distribution for this HMM.

Definition at line 93 of file hmm.hpp.


Constructor & Destructor Documentation

template<typename Distribution = distribution::DiscreteDistribution>
mlpack::hmm::HMM< Distribution >::HMM ( const size_t  states,
const Distribution  emissions 
)

Create the Hidden Markov Model with the given number of hidden states and the given default distribution for emissions.

Parameters:
states: Number of states.
emissions: Default distribution for emissions.
template<typename Distribution = distribution::DiscreteDistribution>
mlpack::hmm::HMM< Distribution >::HMM ( const arma::mat &  transition,
const std::vector< Distribution > &  emission 
)

Create the Hidden Markov Model with the given transition matrix and the given emission distributions.

The transition matrix should be such that T(i, j) is the probability of transition to state i from state j. The columns of the matrix should sum to 1.

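The emission vector should contain one distribution for each hidden state, so that emission[j] is the distribution of observations emitted while in state j. For discrete emissions this corresponds to a matrix E in which E(i, j) is the probability of emission i while in state j, with columns that sum to 1.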

Parameters:
transition: Transition matrix.
emission: Emission distributions.
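
The column convention for the transition matrix (T(i, j) is the probability of moving to state i from state j, so each column sums to 1) is easy to get backwards. The following is a minimal sketch of a check for it, using nested std::vector in place of arma::mat. The Matrix typedef, the function name IsColumnStochastic, and the tolerance are assumptions made for illustration and are not part of mlpack.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Stand-in for arma::mat in this sketch: m[i][j] is row i, column j.
typedef std::vector<std::vector<double> > Matrix;

// Return true if every entry is non-negative and every column sums to 1
// (within the given tolerance).
bool IsColumnStochastic(const Matrix& m, const double tolerance = 1e-9)
{
  assert(!m.empty());
  const std::size_t cols = m[0].size();
  for (std::size_t j = 0; j < cols; ++j)
  {
    double sum = 0.0;
    for (std::size_t i = 0; i < m.size(); ++i)
    {
      assert(m[i].size() == cols);
      if (m[i][j] < 0.0)
        return false;
      sum += m[i][j];
    }
    if (std::fabs(sum - 1.0) > tolerance)
      return false;
  }
  return true;
}
```

A matrix whose rows, rather than columns, sum to 1 fails this check, which is the usual symptom of passing a transposed transition matrix.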

Member Function Documentation

template<typename Distribution = distribution::DiscreteDistribution>
void mlpack::hmm::HMM< Distribution >::Backward ( const arma::mat &  dataSeq,
const arma::vec &  scales,
arma::mat &  backwardProb 
) const [private]

The Backward algorithm (part of the Forward-Backward algorithm).

Computes backward probabilities for each state for each observation in the given data sequence, using the scaling factors found (presumably) by Forward(). The returned matrix has rows equal to the number of hidden states and columns equal to the number of observations.

Parameters:
dataSeq: Data sequence to compute probabilities for.
scales: Vector of scaling factors.
backwardProb: Matrix in which backward probabilities will be saved.
template<typename Distribution = distribution::DiscreteDistribution>
size_t mlpack::hmm::HMM< Distribution >::Dimensionality ( ) const [inline]

Get the dimensionality of observations.

Definition at line 255 of file hmm.hpp.

References mlpack::hmm::HMM< Distribution >::dimensionality.

template<typename Distribution = distribution::DiscreteDistribution>
size_t& mlpack::hmm::HMM< Distribution >::Dimensionality ( ) [inline]

Return a modifiable reference to the dimensionality of observations, which can be used to set it.

Definition at line 257 of file hmm.hpp.

References mlpack::hmm::HMM< Distribution >::dimensionality.

template<typename Distribution = distribution::DiscreteDistribution>
const std::vector<Distribution>& mlpack::hmm::HMM< Distribution >::Emission ( ) const [inline]

Return the emission distributions.

Definition at line 247 of file hmm.hpp.

References mlpack::hmm::HMM< Distribution >::emission.

template<typename Distribution = distribution::DiscreteDistribution>
std::vector<Distribution>& mlpack::hmm::HMM< Distribution >::Emission ( ) [inline]

Return a modifiable reference to the emission distributions.

Definition at line 252 of file hmm.hpp.

References mlpack::hmm::HMM< Distribution >::emission.

template<typename Distribution = distribution::DiscreteDistribution>
double mlpack::hmm::HMM< Distribution >::Estimate ( const arma::mat &  dataSeq,
arma::mat &  stateProb,
arma::mat &  forwardProb,
arma::mat &  backwardProb,
arma::vec &  scales 
) const

Estimate the probabilities of each hidden state at each time step for each given data observation, using the Forward-Backward algorithm.

Each matrix which is returned has columns equal to the number of data observations, and rows equal to the number of hidden states in the model. The log-likelihood of the most probable sequence is returned.

Parameters:
dataSeq: Sequence of observations.
stateProb: Matrix in which the probabilities of each state at each time interval will be stored.
forwardProb: Matrix in which the forward probabilities of each state at each time interval will be stored.
backwardProb: Matrix in which the backward probabilities of each state at each time interval will be stored.
scales: Vector in which the scaling factors at each time interval will be stored.
Returns:
Log-likelihood of most likely state sequence.
template<typename Distribution = distribution::DiscreteDistribution>
double mlpack::hmm::HMM< Distribution >::Estimate ( const arma::mat &  dataSeq,
arma::mat &  stateProb 
) const

Estimate the probabilities of each hidden state at each time step of each given data observation, using the Forward-Backward algorithm.

The returned matrix of state probabilities has columns equal to the number of data observations, and rows equal to the number of hidden states in the model. The log-likelihood of the most probable sequence is returned.

Parameters:
dataSeq: Sequence of observations.
stateProb: Probabilities of each state at each time interval.
Returns:
Log-likelihood of most likely state sequence.
template<typename Distribution = distribution::DiscreteDistribution>
void mlpack::hmm::HMM< Distribution >::Forward ( const arma::mat &  dataSeq,
arma::vec &  scales,
arma::mat &  forwardProb 
) const [private]

The Forward algorithm (part of the Forward-Backward algorithm).

Computes forward probabilities for each state for each observation in the given data sequence. The returned matrix has rows equal to the number of hidden states and columns equal to the number of observations.

Parameters:
dataSeq: Data sequence to compute probabilities for.
scales: Vector in which scaling factors will be saved.
forwardProb: Matrix in which forward probabilities will be saved.
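
To make the role of the scaling factors concrete, here is a minimal sketch of a scaled forward pass for discrete observations, written against the standard library only. It is not mlpack's implementation. The explicit initial-state distribution, the choice of scale (the sum of each unnormalized column), the Matrix typedef, and the name ForwardSketch are all assumptions made for illustration. The output matrix follows the layout described above: rows are hidden states and columns are observations.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Stand-in for arma::mat in this sketch: m[i][j] is row i, column j.
typedef std::vector<std::vector<double> > Matrix;

// Scaled forward pass. transition[i][j] is the probability of moving to
// state i from state j; emission[s][k] is the probability that state s
// emits symbol k. Returns the log-likelihood of the observations.
double ForwardSketch(const Matrix& transition,
                     const Matrix& emission,
                     const std::vector<double>& initial,
                     const std::vector<std::size_t>& observations,
                     Matrix& forwardProb,
                     std::vector<double>& scales)
{
  const std::size_t n = transition.size();
  const std::size_t len = observations.size();
  assert(len > 0 && emission.size() == n && initial.size() == n);

  forwardProb.assign(n, std::vector<double>(len, 0.0));
  scales.assign(len, 0.0);
  double logLikelihood = 0.0;

  for (std::size_t t = 0; t < len; ++t)
  {
    for (std::size_t i = 0; i < n; ++i)
    {
      // Probability of being in state i before seeing observation t.
      double prior = 0.0;
      if (t == 0)
        prior = initial[i];
      else
        for (std::size_t j = 0; j < n; ++j)
          prior += transition[i][j] * forwardProb[j][t - 1];

      forwardProb[i][t] = prior * emission[i][observations[t]];
      scales[t] += forwardProb[i][t];
    }

    // Normalize the column so that it sums to 1; this avoids underflow on
    // long sequences. A zero scale means the observation is impossible.
    assert(scales[t] > 0.0);
    for (std::size_t i = 0; i < n; ++i)
      forwardProb[i][t] /= scales[t];

    logLikelihood += std::log(scales[t]);
  }

  return logLikelihood;
}
```

Under this convention the log-likelihood of the whole sequence is the sum of the logarithms of the scaling factors, which is why a forward pass alone is enough to evaluate a sequence.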
template<typename Distribution = distribution::DiscreteDistribution>
void mlpack::hmm::HMM< Distribution >::Generate ( const size_t  length,
arma::mat &  dataSequence,
arma::Col< size_t > &  stateSequence,
const size_t  startState = 0 
) const

Generate a random data sequence of the given length.

The data sequence is stored in the dataSequence parameter, and the state sequence is stored in the stateSequence parameter.

Parameters:
length: Length of random sequence to generate.
dataSequence: Matrix to store data in.
stateSequence: Vector to store states in.
startState: Hidden state to start sequence in (default 0).
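
The following is a rough sketch of how a sequence can be sampled from a discrete HMM, using the standard library's random facilities instead of mlpack. It assumes that the first state in the sequence is startState itself and that each column of the transition matrix is the distribution over next states. The name GenerateSketch, the Matrix typedef, and the explicit random engine argument are hypothetical; mlpack's Generate() may behave differently.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Stand-in for arma::mat in this sketch: m[i][j] is row i, column j.
typedef std::vector<std::vector<double> > Matrix;

// Sample `length` states and observations. transition[i][j] is the
// probability of moving to state i from state j; emission[s][k] is the
// probability that state s emits symbol k.
void GenerateSketch(const Matrix& transition,
                    const Matrix& emission,
                    const std::size_t length,
                    const std::size_t startState,
                    std::mt19937& rng,
                    std::vector<std::size_t>& dataSequence,
                    std::vector<std::size_t>& stateSequence)
{
  const std::size_t n = transition.size();
  assert(startState < n && emission.size() == n);

  dataSequence.assign(length, 0);
  stateSequence.assign(length, 0);

  std::size_t current = startState;
  for (std::size_t t = 0; t < length; ++t)
  {
    if (t > 0)
    {
      // Column `current` holds the probabilities of each possible next state.
      std::vector<double> column(n);
      for (std::size_t i = 0; i < n; ++i)
        column[i] = transition[i][current];
      std::discrete_distribution<int> next(column.begin(), column.end());
      current = static_cast<std::size_t>(next(rng));
    }
    stateSequence[t] = current;

    // Draw an observation from the current state's emission distribution.
    std::discrete_distribution<int> emit(emission[current].begin(),
                                         emission[current].end());
    dataSequence[t] = static_cast<std::size_t>(emit(rng));
  }
}
```

Passing the random engine in explicitly keeps the sketch reproducible: the same seed yields the same sequence.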
template<typename Distribution = distribution::DiscreteDistribution>
double mlpack::hmm::HMM< Distribution >::LogLikelihood ( const arma::mat &  dataSeq) const

Compute the log-likelihood of the given data sequence.

Parameters:
dataSeq: Data sequence to evaluate the likelihood of.
Returns:
Log-likelihood of the given sequence.
template<typename Distribution = distribution::DiscreteDistribution>
double mlpack::hmm::HMM< Distribution >::Predict ( const arma::mat &  dataSeq,
arma::Col< size_t > &  stateSeq 
) const

Compute the most probable hidden state sequence for the given data sequence, using the Viterbi algorithm, returning the log-likelihood of the most likely state sequence.

Parameters:
dataSeq: Sequence of observations.
stateSeq: Vector in which the most probable state sequence will be stored.
Returns:
Log-likelihood of most probable state sequence.
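
For readers unfamiliar with the Viterbi algorithm, the following is a minimal sketch of it for discrete observations, working in log space and using only the standard library. It is not mlpack's implementation. The explicit initial-state distribution, the Matrix typedef, and the name ViterbiSketch are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Stand-in for arma::mat in this sketch: m[i][j] is row i, column j.
typedef std::vector<std::vector<double> > Matrix;

// transition[i][j] is the probability of moving to state i from state j;
// emission[s][k] is the probability that state s emits symbol k.
// Stores the most probable state sequence in `path` and returns its
// log-probability (jointly with the observations).
double ViterbiSketch(const Matrix& transition,
                     const Matrix& emission,
                     const std::vector<double>& initial,
                     const std::vector<std::size_t>& observations,
                     std::vector<std::size_t>& path)
{
  const std::size_t n = transition.size();
  const std::size_t len = observations.size();
  assert(len > 0 && emission.size() == n && initial.size() == n);

  // score[t][i]: best log-probability of any path ending in state i at t.
  Matrix score(len, std::vector<double>(n, 0.0));
  // back[t][i]: the state at t - 1 on that best path.
  std::vector<std::vector<std::size_t> > back(
      len, std::vector<std::size_t>(n, 0));

  for (std::size_t i = 0; i < n; ++i)
    score[0][i] = std::log(initial[i]) +
        std::log(emission[i][observations[0]]);

  for (std::size_t t = 1; t < len; ++t)
  {
    for (std::size_t i = 0; i < n; ++i)
    {
      std::size_t best = 0;
      double bestScore = score[t - 1][0] + std::log(transition[i][0]);
      for (std::size_t j = 1; j < n; ++j)
      {
        const double candidate =
            score[t - 1][j] + std::log(transition[i][j]);
        if (candidate > bestScore)
        {
          bestScore = candidate;
          best = j;
        }
      }
      score[t][i] = bestScore + std::log(emission[i][observations[t]]);
      back[t][i] = best;
    }
  }

  // Pick the best final state, then follow the back-pointers.
  std::size_t last = 0;
  for (std::size_t i = 1; i < n; ++i)
    if (score[len - 1][i] > score[len - 1][last])
      last = i;

  path.assign(len, 0);
  path[len - 1] = last;
  for (std::size_t t = len - 1; t > 0; --t)
    path[t - 1] = back[t][path[t]];

  return score[len - 1][last];
}
```

Working with sums of logarithms rather than products of probabilities keeps long sequences from underflowing; ties between states are broken in favor of the lower state index in this sketch.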
template<typename Distribution = distribution::DiscreteDistribution>
void mlpack::hmm::HMM< Distribution >::Train ( const std::vector< arma::mat > &  dataSeq)

Train the model using the Baum-Welch algorithm, with only the given unlabeled observations.

To supply an initial guess for the transition matrix and emission distributions, pass them to the constructor rather than to this function.

Note:
Train() can be called multiple times with different sequences; each time it is called, it uses the current parameters of the HMM as a starting point for training.
Parameters:
dataSeq: Vector of observation sequences.
template<typename Distribution = distribution::DiscreteDistribution>
void mlpack::hmm::HMM< Distribution >::Train ( const std::vector< arma::mat > &  dataSeq,
const std::vector< arma::Col< size_t > > &  stateSeq 
)

Train the model using the given labeled observations; the transition and emission matrices are directly estimated.

Note:
Train() can be called multiple times with different sequences; each time it is called, it uses the current parameters of the HMM as a starting point for training.
Parameters:
dataSeq: Vector of observation sequences.
stateSeq: Vector of state sequences, corresponding to each observation.
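
Direct estimation from labeled data amounts to counting. As a minimal sketch of the transition part only, the function below counts the observed state-to-state transitions and normalizes each column, matching the convention that T(i, j) is the probability of moving to state i from state j. It uses only the standard library and is not mlpack's implementation; the name EstimateTransitionSketch, the Matrix typedef, and the handling of states with no outgoing transitions (their column is left at zero) are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for arma::mat in this sketch: m[i][j] is row i, column j.
typedef std::vector<std::vector<double> > Matrix;

// Estimate a column-stochastic transition matrix from labeled state
// sequences by counting transitions.
Matrix EstimateTransitionSketch(
    const std::vector<std::vector<std::size_t> >& stateSeqs,
    const std::size_t numStates)
{
  Matrix counts(numStates, std::vector<double>(numStates, 0.0));

  // Count each transition from seq[t - 1] to seq[t].
  for (std::size_t s = 0; s < stateSeqs.size(); ++s)
  {
    const std::vector<std::size_t>& seq = stateSeqs[s];
    for (std::size_t t = 1; t < seq.size(); ++t)
    {
      assert(seq[t - 1] < numStates && seq[t] < numStates);
      counts[seq[t]][seq[t - 1]] += 1.0;
    }
  }

  // Normalize each column so that it sums to 1. Columns for states that
  // were never left are kept at zero in this sketch.
  for (std::size_t j = 0; j < numStates; ++j)
  {
    double total = 0.0;
    for (std::size_t i = 0; i < numStates; ++i)
      total += counts[i][j];
    if (total > 0.0)
      for (std::size_t i = 0; i < numStates; ++i)
        counts[i][j] /= total;
  }

  return counts;
}
```

The emission distributions would be fitted in the same spirit, by handing each state the observations labeled with it, for example through the Estimate() function of the Distribution interface.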
template<typename Distribution = distribution::DiscreteDistribution>
const arma::mat& mlpack::hmm::HMM< Distribution >::Transition ( ) const [inline]

Return the transition matrix.

Definition at line 237 of file hmm.hpp.

References mlpack::hmm::HMM< Distribution >::transition.

template<typename Distribution = distribution::DiscreteDistribution>
arma::mat& mlpack::hmm::HMM< Distribution >::Transition ( ) [inline]

Return a modifiable transition matrix reference.

Definition at line 242 of file hmm.hpp.

References mlpack::hmm::HMM< Distribution >::transition.


Member Data Documentation

template<typename Distribution = distribution::DiscreteDistribution>
size_t mlpack::hmm::HMM< Distribution >::dimensionality [private]

Dimensionality of observations.

Definition at line 292 of file hmm.hpp.

Referenced by mlpack::hmm::HMM< Distribution >::Dimensionality().

template<typename Distribution = distribution::DiscreteDistribution>
std::vector<Distribution> mlpack::hmm::HMM< Distribution >::emission [private]

Set of emission probability distributions; one for each state.

Definition at line 100 of file hmm.hpp.

Referenced by mlpack::hmm::HMM< Distribution >::Emission().

template<typename Distribution = distribution::DiscreteDistribution>
arma::mat mlpack::hmm::HMM< Distribution >::transition [private]

Transition probability matrix.

Definition at line 97 of file hmm.hpp.

Referenced by mlpack::hmm::HMM< Distribution >::Transition().


The documentation for this class was generated from the following file: