Package nltk_lite :: Package tag :: Module brill :: Class BrillTrainer

Class BrillTrainer

object --+
         |
        BrillTrainer

A trainer for Brill taggers. It learns an ordered list of transformation rules that correct the output of an initial tagger.

Instance Methods
 
__init__(self, initial_tagger, templates, trace=0)
x.__init__(...) initializes x; see x.__class__.__doc__ for signature
 
train(self, train_tokens, max_rules=200, min_score=2)
Trains the Brill tagger on the corpus train_tokens, producing at most max_rules transformations, each of which reduces the net number of errors in the corpus by at least min_score.
 
_best_rule(self, test_tokens, train_tokens)
 
_find_rules(self, test_tokens, train_tokens)
Find all rules that correct at least one token's tag in test_tokens.
 
_find_rules_at(self, test_tokens, train_tokens, i)
Returns (Set): the set of all rules (based on the templates) that correct token i's tag in test_tokens.
 
_trace_header(self)
 
_trace_rule(self, rule, score, fixscore, numchanges)

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __repr__, __setattr__, __str__

Properties

Inherited from object: __class__

Method Details

__init__(self, initial_tagger, templates, trace=0)
(Constructor)

x.__init__(...) initializes x; see x.__class__.__doc__ for signature

Overrides: object.__init__
(inherited documentation)

train(self, train_tokens, max_rules=200, min_score=2)

Trains the Brill tagger on the corpus train_tokens, producing at most max_rules transformations, each of which reduces the net number of errors in the corpus by at least min_score.

Parameters:
  • train_tokens (list of tuple) - The corpus of tagged tokens
  • max_rules (int) - The maximum number of transformations to be created
  • min_score (int) - The minimum acceptable net error reduction that each transformation must produce in the corpus.
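
The roles of max_rules and min_score can be illustrated with a small greedy loop. This is a self-contained sketch, not the module's implementation: the rule shape (old_tag, new_tag, prev_tag) and the helpers apply_rule, count_errors, candidate_rules and greedy_train are hypothetical names invented for illustration. It assumes that the net score of a rule is the number of errors before applying it minus the number of errors after.

```python
# Hypothetical toy rule: (old_tag, new_tag, prev_tag) means
# "change old_tag to new_tag when the previous token is tagged prev_tag".

def apply_rule(rule, tokens):
    """Return a retagged copy of tokens, a list of (word, tag) tuples."""
    old, new, prev = rule
    result = list(tokens)
    # Conditions are checked against the input tagging, not the partial result.
    for i in range(1, len(tokens)):
        if tokens[i][1] == old and tokens[i - 1][1] == prev:
            result[i] = (tokens[i][0], new)
    return result

def count_errors(test_tokens, train_tokens):
    """Count positions where the current tag differs from the gold tag."""
    return sum(1 for t, g in zip(test_tokens, train_tokens) if t[1] != g[1])

def candidate_rules(test_tokens, train_tokens):
    """Propose one rule for every token that is currently mistagged."""
    rules = set()
    for i in range(1, len(test_tokens)):
        if test_tokens[i][1] != train_tokens[i][1]:
            rules.add((test_tokens[i][1], train_tokens[i][1], test_tokens[i - 1][1]))
    return rules

def greedy_train(test_tokens, train_tokens, max_rules=200, min_score=2):
    """Greedily pick rules until max_rules is reached or no rule scores min_score."""
    learned = []
    while len(learned) < max_rules:
        errors = count_errors(test_tokens, train_tokens)
        best_rule, best_score = None, 0
        for rule in sorted(candidate_rules(test_tokens, train_tokens)):
            # Net error reduction: fixes minus newly introduced errors.
            score = errors - count_errors(apply_rule(rule, test_tokens), train_tokens)
            if score > best_score:
                best_rule, best_score = rule, score
        if best_rule is None or best_score < min_score:
            break
        learned.append(best_rule)
        test_tokens = apply_rule(best_rule, test_tokens)
    return learned, test_tokens

# Gold-standard corpus and the output of a naive initial tagging.
gold = [("the", "DT"), ("run", "NN"), ("to", "TO"), ("run", "VB"),
        ("a", "DT"), ("walk", "NN"), ("to", "TO"), ("walk", "VB")]
initial = [("the", "DT"), ("run", "NN"), ("to", "TO"), ("run", "NN"),
           ("a", "DT"), ("walk", "NN"), ("to", "TO"), ("walk", "NN")]

learned, final = greedy_train(initial, gold, max_rules=200, min_score=2)
```

In this toy corpus the only candidate rule fixes two tokens and breaks none, so it meets min_score=2 and is learned. Raising min_score to 3 would stop training before any rule is accepted.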

_find_rules(self, test_tokens, train_tokens)

Find all rules that correct at least one token's tag in test_tokens.

Returns:
A list of tuples (rule, fixscore), where rule is a Brill rule and fixscore is the number of tokens whose tag the rule corrects. Note that fixscore does not count tokens whose tags the rule changes to incorrect values.
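
The difference between fixscore and the net effect of a rule can be shown with a short sketch. The rule shape (old_tag, new_tag, prev_tag) and the functions find_rules and count_broken are hypothetical stand-ins, not the module's code, and the ordering of the returned list is an assumption made for readability.

```python
from collections import Counter

def find_rules(test_tokens, train_tokens):
    """Return (rule, fixscore) pairs for rules that correct at least one tag."""
    fixscores = Counter()
    for i in range(1, len(test_tokens)):
        current, correct = test_tokens[i][1], train_tokens[i][1]
        if current != correct:
            # Hypothetical rule shape: (old_tag, new_tag, prev_tag).
            fixscores[(current, correct, test_tokens[i - 1][1])] += 1
    # Assumed ordering: highest fixscore first, ties broken by rule.
    return sorted(fixscores.items(), key=lambda item: (-item[1], item[0]))

def count_broken(rule, test_tokens, train_tokens):
    """Count tokens that are currently correct but that the rule would change."""
    old, _new, prev = rule
    return sum(
        1
        for i in range(1, len(test_tokens))
        if test_tokens[i][1] == old
        and test_tokens[i - 1][1] == prev
        and test_tokens[i][1] == train_tokens[i][1]
    )

test = [("to", "TO"), ("run", "NN"), ("to", "TO"), ("walk", "NN"),
        ("to", "TO"), ("school", "NN")]
gold = [("to", "TO"), ("run", "VB"), ("to", "TO"), ("walk", "VB"),
        ("to", "TO"), ("school", "NN")]

scored = find_rules(test, gold)
rule, fixscore = scored[0]
broken = count_broken(rule, test, gold)
net = fixscore - broken
```

Here the rule corrects "run" and "walk" (fixscore 2) but would also mistag "school", so its net error reduction is 1. A caller that compares against min_score has to account for the broken tokens separately.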

_find_rules_at(self, test_tokens, train_tokens, i)

Returns (Set):
The set of all rules (based on the templates) that correct token i's tag in test_tokens.
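
A minimal sketch of how templates could yield rules at a single position follows. It assumes that a template inspects the context around token i and produces a condition, and that a position whose tag is already correct yields no rules. The template functions and the explicit templates argument are hypothetical; the documented method takes only test_tokens, train_tokens and i.

```python
def prev_tag_template(tokens, i):
    """Hypothetical template: condition on the tag of the preceding token."""
    return ("prev_tag", tokens[i - 1][1]) if i > 0 else None

def next_tag_template(tokens, i):
    """Hypothetical template: condition on the tag of the following token."""
    return ("next_tag", tokens[i + 1][1]) if i + 1 < len(tokens) else None

def find_rules_at(test_tokens, train_tokens, i, templates):
    """Return the set of rules that would correct token i's tag."""
    current, correct = test_tokens[i][1], train_tokens[i][1]
    if current == correct:
        # Nothing to correct at this position.
        return set()
    rules = set()
    for template in templates:
        condition = template(test_tokens, i)
        if condition is not None:
            # Rule shape: change `current` to `correct` when `condition` holds.
            rules.add((current, correct, condition))
    return rules

templates = [prev_tag_template, next_tag_template]
test = [("to", "TO"), ("run", "NN"), (".", ".")]
gold = [("to", "TO"), ("run", "VB"), (".", ".")]

rules_at_1 = find_rules_at(test, gold, 1, templates)
rules_at_0 = find_rules_at(test, gold, 0, templates)
```

Returning a set means that two templates producing the same rule at one position contribute it only once.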