Module Parser
Implement Martel parsers.
The classes in this module are used by other Martel modules and not
typically by external users.
There are two major parsers, 'Parser' and 'RecordParser'. The first
is the standard one, which parses the file as one string in memory and
then generates the SAX events. The other reads one record at a time using
a RecordReader and generates events after each read. Both parsers
generate identical event callbacks.
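The difference between the two strategies can be sketched in a few lines. This is a minimal illustration, not Martel code: `events_for_record`, `parse_whole` and `parse_by_record` are hypothetical names, and a "record" is assumed to be a single line, whereas a real RecordReader splits on format-specific boundaries.

```python
import io


def events_for_record(record):
    """Hypothetical per-record grammar: wrap each record in one element."""
    return [("start", "record"), ("chars", record), ("end", "record")]


def parse_whole(text):
    """Whole-file strategy: the full text is in memory before any event."""
    events = []
    for record in text.splitlines(keepends=True):
        events.extend(events_for_record(record))
    return events


def parse_by_record(infile):
    """Record-at-a-time strategy: events are produced after each read."""
    for record in infile:  # assumption: one line is one record
        yield from events_for_record(record)


data = "first\nsecond\n"
whole = parse_whole(data)
streamed = list(parse_by_record(io.StringIO(data)))
```

Both calls yield the same event sequence; only the amount of input held in memory at once differs.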
At some level, both parsers use "_do_callback" to convert
mxTextTools tags into SAX events.
XXX finish this documentation
XXX need a better way to get closer to the likely error position when
parsing.
XXX need to implement Locator
_do_callback(s, begin, end, taglist, cont_handler, attrlookup)
internal function to convert the tagtable into ContentHandler events
_do_dispatch_callback(s, begin, end, taglist, start_table_get, cont_handler, save_stack, end_table_get, attrlookup)
internal function to convert the tagtable into ContentHandler events
_parse_elements(s, tagtable, cont_handler, debug_level, attrlookup)
parse the string with the tagtable and send the ContentHandler events
_match_group = {}
_attribute_list = MartelAttributeList([])
_do_callback(s, begin, end, taglist, cont_handler, attrlookup)
Internal function to convert the tagtable into ContentHandler
events.
's' is the input text
'begin' is the current position in the text
'end' is 1 past the last position of the text allowed to be parsed
'taglist' is the tag list from mxTextTools.parse
'cont_handler' is the SAX ContentHandler
'attrlookup' is a dict mapping the encoded tag name to the element info
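The conversion described above can be sketched as a recursive walk over the tag list. This is a simplified illustration under stated assumptions, not the module's implementation: each taglist entry is assumed to be a `(tag, left, right, subtags)` tuple, the `attrlookup` handling of encoded tag names is left out, and `RecordingHandler` and `walk_taglist` are hypothetical names.

```python
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl


class RecordingHandler(ContentHandler):
    """Hypothetical handler that records the events it receives."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        self.events.append(("chars", content))


def walk_taglist(s, begin, end, taglist, cont_handler):
    """Illustrative only: emit SAX events for a nested tag list."""
    for tag, left, right, subtags in taglist:
        # Untagged text before this tag is passed on as characters.
        if begin < left:
            cont_handler.characters(s[begin:left])
        cont_handler.startElement(tag, AttributesImpl({}))
        if subtags:
            # Children are handled within the parent's text range.
            walk_taglist(s, left, right, subtags, cont_handler)
        else:
            cont_handler.characters(s[left:right])
        cont_handler.endElement(tag)
        begin = right
    # Untagged text after the last tag, up to (not including) 'end'.
    if begin < end:
        cont_handler.characters(s[begin:end])


text = "ID 42\n"
# Hand-built tag list standing in for the result of a tagging pass.
taglist = [("record", 0, 6, [("key", 0, 2, None), ("value", 3, 5, None)])]
handler = RecordingHandler()
walk_taglist(text, 0, len(text), taglist, handler)
```

Here 'begin' advances past each tag as it is processed, so the space between "ID" and "42" and the trailing newline both reach the handler as characters events inside the "record" element.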
_do_dispatch_callback(s, begin, end, taglist, start_table_get, cont_handler, save_stack, end_table_get, attrlookup)
Internal function to convert the tagtable into ContentHandler
events.
THIS IS A SPECIAL CASE FOR Dispatch.Dispatcher objects
's' is the input text
'begin' is the current position in the text
'end' is 1 past the last position of the text allowed to be parsed
'taglist' is the tag list from mxTextTools.parse
'start_table_get' is the Dispatcher._start_table
'cont_handler' is the Dispatcher, which serves as the SAX ContentHandler
'end_table_get' is the Dispatcher._end_table
'attrlookup' is a dict mapping the encoded tag name to the element info
_parse_elements(s, tagtable, cont_handler, debug_level, attrlookup)
Parse the string with the tagtable and send the ContentHandler
events.
Specifically, it sends the startElement, endElement and characters
events but not startDocument and endDocument.
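Because the document-level events are not sent at this level, a caller has to supply them. A minimal sketch of that division of labour follows; `emit_elements` is a hypothetical stand-in for an element-level parse step, and `parse_document` and `DocumentLog` are likewise invented for illustration.

```python
from xml.sax.handler import ContentHandler


class DocumentLog(ContentHandler):
    """Hypothetical handler that logs the name of each callback."""

    def __init__(self):
        super().__init__()
        self.log = []

    def startDocument(self):
        self.log.append("startDocument")

    def endDocument(self):
        self.log.append("endDocument")

    def startElement(self, name, attrs):
        self.log.append("startElement:" + name)

    def endElement(self, name):
        self.log.append("endElement:" + name)

    def characters(self, content):
        self.log.append("characters")


def emit_elements(s, cont_handler):
    """Stand-in element-level step: sends element and character events only."""
    cont_handler.startElement("line", {})
    cont_handler.characters(s)
    cont_handler.endElement("line")


def parse_document(s, cont_handler):
    """Caller-side wrapper that adds the document-level events."""
    cont_handler.startDocument()
    emit_elements(s, cont_handler)
    cont_handler.endDocument()


doc_handler = DocumentLog()
parse_document("hello\n", doc_handler)
```

Keeping the document events in the caller lets the same element-level step be reused for a whole file or for each record in turn.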