Module nltk_lite.corpora.brown

Read tokens from the Brown Corpus.

Brown Corpus: A Standard Corpus of Present-Day Edited American English, for use with Digital Computers, by W. N. Francis and H. Kucera (1964), Department of Linguistics, Brown University, Providence, Rhode Island, USA. Revised 1971, Revised and Amplified 1979. Distributed with NLTK with the permission of the copyright holder. Source: http://www.hit.uib.no/icame/brown/bcm.html

The Brown Corpus is divided into the following files:

a. press: reportage
b. press: editorial
c. press: reviews
d. religion
e. skill and hobbies
f. popular lore
g. belles-lettres
h. miscellaneous: government & house organs
j. learned
k. fiction: general
l. fiction: mystery
m. fiction: science
n. fiction: adventure
p. fiction: romance
r. humor
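The file list above amounts to a lookup table from a one-letter file id to a section name. The sketch below rebuilds that table from the list in this description only. The names SECTION_NAMES and describe are hypothetical and are not part of the module, and the table is not copied from the module's own (truncated) items and item_name values.

```python
# Hypothetical table, rebuilt from the file list in the module description.
# Note that the letters i, o and q are not used as file ids.
SECTION_NAMES = {
    'a': 'press: reportage',
    'b': 'press: editorial',
    'c': 'press: reviews',
    'd': 'religion',
    'e': 'skill and hobbies',
    'f': 'popular lore',
    'g': 'belles-lettres',
    'h': 'miscellaneous: government & house organs',
    'j': 'learned',
    'k': 'fiction: general',
    'l': 'fiction: mystery',
    'm': 'fiction: science',
    'n': 'fiction: adventure',
    'p': 'fiction: romance',
    'r': 'humor',
}


def describe(section_ids):
    """Return 'id: name' labels for the given file ids, in the order given."""
    return ['%s: %s' % (s, SECTION_NAMES[s]) for s in section_ids]


if __name__ == '__main__':
    for label in describe(['a', 'j', 'r']):
        print(label)
```

A plain dictionary is enough here because the ids are fixed and few; an unknown id such as 'i' raises KeyError rather than silently returning nothing.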

Functions

_read(files, conversion_function)

raw(files=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'j', 'k', 'l', 'm', '...)

tagged(files=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'j', 'k', 'l', 'm', '...)

demo()
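The signatures above suggest that raw() and tagged() share one reader, _read, and differ only in the conversion function applied to each token. The following is a minimal sketch of that pattern, not the module's implementation. It assumes tokens are stored as word/tag separated by a slash, with one sentence per line. The names read_sentences, to_raw, to_tagged and SAMPLE are hypothetical, and the sample text is invented for illustration rather than taken from the corpus files.

```python
def to_tagged(token, sep='/'):
    """Split 'word/tag' on the LAST separator and return (word, tag).

    Assumption: tokens look like 'jury/nn'. A token with no separator
    is returned with a tag of None.
    """
    head, found, tag = token.rpartition(sep)
    if not found:
        return (token, None)
    return (head, tag)


def to_raw(token, sep='/'):
    """Return only the word part of a 'word/tag' token."""
    return to_tagged(token, sep)[0]


def read_sentences(text, conversion_function):
    """Yield one list of converted tokens per non-empty line of text."""
    for line in text.splitlines():
        tokens = line.split()
        if tokens:
            yield [conversion_function(t) for t in tokens]


# Invented sample in word/tag style; NOT actual corpus content.
SAMPLE = "The/at jury/nn said/vbd ./.\n\nIt/pps was/bedz late/jj ./.\n"

if __name__ == '__main__':
    for sent in read_sentences(SAMPLE, to_raw):
        print(sent)
    for sent in read_sentences(SAMPLE, to_tagged):
        print(sent)
```

Splitting on the last separator keeps words that themselves contain a slash intact, and writing the reader as a generator means sentences are produced one at a time instead of loading everything into a list.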

Variables
  items = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'j', 'k', 'l'...
  item_name = {'a': 'press: reportage', 'b': 'press: editorial',...
Variables Details

items

Value:
['a',
 'b',
 'c',
 'd',
 'e',
 'f',
 'g',
 'h',
...

item_name

Value:
{'a': 'press: reportage',
 'b': 'press: editorial',
 'c': 'press: reviews',
 'd': 'religion',
 'e': 'skill and hobbies',
 'f': 'popular lore',
 'g': 'belles-lettres',
 'h': 'miscellaneous: government & house organs',
...