This module provides tools for parsing and manipulating the contents of a Shoebox text without reference to its metadata.
|
|||
Word This class defines a word object, which consists of a fixed number of attributes: a wordform, a gloss, a part of speech, and a list of morphemes. |
|||
Morpheme This class defines a morpheme object, which consists of a fixed number of attributes: a surface form, an underlying form, a gloss, and a part of speech. |
|||
Line This class defines a line of interlinear glossing, such as: |
|||
Paragraph This class defines a unit of analysis above the line and below the text. |
|||
Text This class defines an interlinearized text, which consists of a collection of Paragraph objects. |
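The attribute lists given for Word and Morpheme can be pictured with a minimal sketch. The names `MorphemeRecord` and `WordRecord`, and all field names, are hypothetical stand-ins chosen for illustration; they are not the module's actual classes or attribute names, and the word-level gloss and part of speech in the example are assumed values.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MorphemeRecord:
    """Hypothetical stand-in: the four attributes listed for a morpheme."""
    surface_form: str
    underlying_form: str
    gloss: str
    part_of_speech: str


@dataclass
class WordRecord:
    """Hypothetical stand-in: the four attributes listed for a word."""
    wordform: str
    gloss: str
    part_of_speech: str
    morphemes: List[MorphemeRecord] = field(default_factory=list)


# Illustrative values only: the Dutch word "goede" split into "goed" + "-e".
goede = WordRecord(
    wordform="goede",
    gloss="good",                # assumed word-level gloss
    part_of_speech="ADJECTIVE",  # assumed word-level category
    morphemes=[
        MorphemeRecord("goed", "goed", "good", "ADJECTIVE"),
        MorphemeRecord("-e", "-e", "-ADJ", "-SUFF"),
    ],
)
```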
This method finds the indices for the leftmost boundaries of the units in a line of aligned text.

Given the field \um, this function will find the indices identifying leftmost word boundaries, as follows:

            0    5  8   12              <- indices
            |    |  |   |
            |||||||||||||||||||||||||||
        \sf dit  is een goede           <- surface form
        \um dit  is een goed      -e    <- underlying morphemes
        \mg this is a   good      -ADJ  <- morpheme gloss
        \gc DEM  V  ART ADJECTIVE -SUFF <- grammatical categories
        \ft This is a good explanation. <- free translation

The function walks through the line char by char:

        c   flag.before  flag.after  index?
        --  -----------  ----------  ------
        0   1            0           yes
        1   0            1           no
        2   1            0           no
        3   0            1           no
        4   1            0           no
        5   1            0           yes
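The idea of locating leftmost boundaries in an aligned line can be sketched as follows. This is a simplified illustration, not the module's flag-based implementation: the function name `leftmost_boundaries` is hypothetical, the input is assumed to be the field content with the marker (such as `\sf`) already stripped, and a boundary is assumed to be any non-space character that starts the line or follows whitespace.

```python
def leftmost_boundaries(field_text):
    """Return the index of every non-space character that starts a unit.

    Hypothetical helper: a unit is assumed to start at position 0 or
    directly after one or more whitespace characters.
    """
    indices = []
    previous_was_space = True  # treat the start of the line as a boundary
    for position, char in enumerate(field_text):
        is_space = char.isspace()
        if not is_space and previous_was_space:
            indices.append(position)
        previous_was_space = is_space
    return indices


surface_form = "dit  is een goede"
underlying = "dit  is een goed      -e"

print(leftmost_boundaries(surface_form))  # [0, 5, 8, 12]
print(leftmost_boundaries(underlying))    # [0, 5, 8, 12, 22]
```

Under these assumptions the surface-form line yields the four word boundaries shown in the diagram, while the underlying-morpheme line also reports the suffix at column 22.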
|
Given a string and a list of indices, this function returns a list of the substrings defined by those indices. For example, given the arguments:

        str='antidisestablishmentarianism', indices=[4, 7, 16, 20, 25]

this function returns the list:

        ['anti', 'dis', 'establish', 'ment', 'arian', 'ism']
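The slicing behaviour in this example can be reproduced with a short sketch. The name `slice_by_indices` is hypothetical, and the indices are assumed to be sorted interior cut points (neither 0 nor the string length is included), as in the example above.

```python
def slice_by_indices(text, indices):
    """Split text at each index, returning the pieces in order.

    Hypothetical helper: indices are assumed to be sorted cut points
    strictly inside the string.
    """
    starts = [0] + list(indices)
    ends = list(indices) + [len(text)]
    return [text[start:end] for start, end in zip(starts, ends)]


pieces = slice_by_indices("antidisestablishmentarianism", [4, 7, 16, 20, 25])
print(pieces)  # ['anti', 'dis', 'establish', 'ment', 'arian', 'ism']
```

Note that passing a list that begins with 0 would produce an empty first substring in this sketch, so a leading 0 would need to be dropped first.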
|
Generated by Epydoc 3.0beta1 on Wed May 16 22:47:18 2007 | http://epydoc.sourceforge.net |