Class NaiveBayes
  ClassifyI --+
              |
AbstractClassify --+
                   |
                  NaiveBayes
The Naive Bayes Classifier is a supervised classifier.
It needs to be trained with representative examples of
each class. From these examples the classifier
calculates the most probable classification of a new sample.
                      P(class) * P(features|class)
P(class|features) = --------------------------------
                              P(features)
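As a small worked illustration of the formula above, the following sketch computes P(class|features) for two classes. The class names, feature names and all numbers are invented for the example and do not come from nltk_lite. It assumes the usual naive independence assumption, so P(features|class) is the product of the per-feature likelihoods, and P(features) is the sum of the numerators over all classes.

```python
# Hypothetical priors and per-feature likelihoods; every number is invented.
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"has_link": 0.7, "all_caps": 0.5},
    "ham": {"has_link": 0.2, "all_caps": 0.1},
}


def posterior(observed_features):
    """Return P(class | features) for every class using Bayes' rule."""
    numerators = {}
    for cls, prior in priors.items():
        joint = prior  # P(class)
        for feature in observed_features:
            # Naive independence: multiply the per-feature likelihoods.
            joint *= likelihoods[cls][feature]
        numerators[cls] = joint  # P(class) * P(features|class)
    evidence = sum(numerators.values())  # P(features)
    return {cls: value / evidence for cls, value in numerators.items()}


result = posterior(["has_link", "all_caps"])
```

Dividing by P(features) only rescales the scores so that they sum to one; the ranking of the classes is already fixed by the numerators.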
Internal data structures:
_feature_detector:
    holds the feature detector function
_classes:
    holds the list of classes supplied during training
_cls_prob_dist:
    holds a probability distribution, namely GoodTuringProbDist;
    this structure is defined in probability.py in nltk_lite;
    this structure is indexed by class names
_feat_prob_dist:
    holds a conditional probability distribution; the conditions are
    the class name and the feature type name;
    the individual probability distributions are indexed by feature values;
    this structure is defined in probability.py in nltk_lite
__init__(self,
feature_detector)
(Constructor)
- Parameters:
feature_detector - the feature detector function, which takes a sample of
the object to be classified (e.g. a string or a list of words) and returns
a list of tuples (feature_type_name, list of values of this
feature type)
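To make the expected return shape concrete, here is a minimal sketch of a feature detector. The function name word_features and the two feature types "word" and "length" are hypothetical and chosen only for illustration; the one thing taken from the description above is the return format, a list of (feature_type_name, list of values) tuples.

```python
def word_features(sample):
    """Hypothetical feature detector for plain strings."""
    words = sample.lower().split()
    return [
        ("word", words),                # every lowercased token in the sample
        ("length", [str(len(words))]),  # token count, kept as a string value
    ]


features = word_features("Buy cheap pills")
```

A function of this shape would then be handed to the constructor, presumably as NaiveBayes(word_features).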
train(self,
gold)
- Parameters:
gold - dictionary mapping class names to representative examples.
The function takes the representative examples of each class and
creates frequency distributions from them; these frequency
distributions are then used to create probability distributions.
- Overrides:
ClassifyI.train
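The counting step of training can be sketched as follows. This is not the nltk_lite implementation: it uses collections.Counter in place of the library's frequency distributions, stops before any smoothing such as GoodTuringProbDist, and assumes each example counts once towards its class. The names detect and count_gold are hypothetical.

```python
from collections import Counter, defaultdict


def detect(sample):
    """Hypothetical feature detector with a single 'word' feature type."""
    return [("word", sample.lower().split())]


def count_gold(gold, feature_detector):
    """Count classes and feature values from a {class_name: [examples]} dict."""
    class_counts = Counter()
    # Keyed by (class_name, feature_type_name), then counted by feature value.
    feature_counts = defaultdict(Counter)
    for cls, examples in gold.items():
        for example in examples:
            class_counts[cls] += 1  # assumption: one count per example
            for feature_type, values in feature_detector(example):
                feature_counts[(cls, feature_type)].update(values)
    return class_counts, feature_counts


gold = {
    "spam": ["buy cheap pills", "cheap pills now"],
    "ham": ["see you at lunch"],
}
class_counts, feature_counts = count_gold(gold, detect)
```

Keying the feature counts by (class name, feature type name) mirrors the conditions described for _feat_prob_dist, so each count table can later be turned into one probability distribution over feature values.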
|
- Parameters:
sample - sample to be classified
|
- Parameters:
sample - sample to be tested
- Returns:
dictionary mapping each class to its probability
Naive Bayes classifier:
creates a probability distribution based on the sample string;
for each class and feature type, sums the log probabilities
of each feature value
and combines the sum with the probability of the respective class
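The scoring described above can be sketched in log space as shown below. This is an illustrative sketch under stated assumptions, not the library's code: the function name classify, the floor value for unseen feature values, and the choice to add the log of the class probability to the summed feature log probabilities are all assumptions, and the probability tables are invented.

```python
import math


def classify(sample_features, class_probs, feature_probs, floor=1e-6):
    """Return {class_name: probability} for one sample.

    sample_features: list of (feature_type_name, list_of_values) tuples
    class_probs:     {class_name: P(class)}
    feature_probs:   {(class_name, feature_type_name): {value: P(value|class)}}
    floor:           assumed stand-in probability for unseen feature values
    """
    log_scores = {}
    for cls, prior in class_probs.items():
        total = math.log(prior)  # assumption: class probability enters as a log
        for feature_type, values in sample_features:
            dist = feature_probs.get((cls, feature_type), {})
            for value in values:
                total += math.log(dist.get(value, floor))
        log_scores[cls] = total
    # Subtract the maximum before exponentiating to avoid underflow.
    top = max(log_scores.values())
    unnormalized = {cls: math.exp(s - top) for cls, s in log_scores.items()}
    norm = sum(unnormalized.values())
    return {cls: value / norm for cls, value in unnormalized.items()}


# Invented probability tables for two classes and one feature type.
class_probs = {"spam": 0.5, "ham": 0.5}
feature_probs = {
    ("spam", "word"): {"cheap": 0.6, "pills": 0.3, "lunch": 0.1},
    ("ham", "word"): {"cheap": 0.1, "pills": 0.1, "lunch": 0.8},
}
scores = classify([("word", ["cheap", "pills"])], class_probs, feature_probs)
```

Summing logarithms instead of multiplying raw probabilities keeps the scores representable when a sample has many feature values, since a long product of small numbers would otherwise underflow to zero.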