Small, Fast S-Expression Library
pcont Struct Reference

#include <sexp.h>

Data Fields

faststack_t * stack
sexp_t * last_sexp
char * val
size_t val_allocated
size_t val_used
char * vcur
char * lastPos
char * sbuffer
unsigned int depth
unsigned int qdepth
unsigned int state
unsigned int esc
unsigned int squoted
sexp_errcode_t error
parsermode_t mode
size_t binexpected
size_t binread
char * bindata
parser_event_handlers_t * event_handlers

Detailed Description

A continuation is used by the parser to save and restore state between invocations, which supports partial parsing of strings. For example, if we pass the string "(foo bar)(goo car)" to the parser, we want to retrieve the s-expressions one at a time; returning all of them at once would be difficult without knowing in advance how many there are, and would require more memory management than we want. With a continuation-based parser, we can call it with this string and have it return a continuation once it has parsed the first s-expression. After processing that s-expression (accessible through the last_sexp field of the continuation), we can call the parser again with the same string and continuation, and it will pick up where it left off.
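The resume-where-it-left-off behaviour described above can be illustrated with a deliberately tiny stand-in. The sketch below is not the library's parser: toy_cont_t, toy_next and toy_count are hypothetical names, the struct borrows only the sbuffer, lastPos and depth field names from pcont, and the scanner ignores quoting, escapes and atoms outside parentheses. It also assumes the same string is passed on every call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature continuation; only a few field names mirror pcont. */
typedef struct toy_cont {
    const char  *sbuffer; /* start of the string being parsed */
    const char  *lastPos; /* where to resume; NULL means start of string */
    const char  *open;    /* outermost '(' of the expression in progress */
    unsigned int depth;   /* left parens seen and not yet closed */
    const char  *start;   /* last complete expression, or NULL */
    size_t       len;     /* length of that expression in bytes */
} toy_cont_t;

/* Scan for the next balanced (...) group. Returns 1 and fills start/len when
 * one is complete, or 0 when the input ran out first. */
int toy_next(toy_cont_t *pc, const char *s, size_t len)
{
    const char *end;
    const char *p;

    assert(pc != NULL && s != NULL);
    end = s + len;
    p = (pc->lastPos != NULL) ? pc->lastPos : s;
    pc->sbuffer = s;
    pc->start = NULL;
    pc->len = 0;

    for (; p < end; p++) {
        if (*p == '(') {
            if (pc->depth == 0)
                pc->open = p;
            pc->depth++;
        } else if (*p == ')' && pc->depth > 0) {
            pc->depth--;
            if (pc->depth == 0) {
                pc->start = pc->open;
                pc->len = (size_t)(p - pc->open) + 1;
                pc->lastPos = p + 1; /* resume right after this expression */
                return 1;
            }
        }
    }
    pc->lastPos = p; /* nothing complete yet; remember how far we got */
    return 0;
}

/* Count expressions by calling toy_next repeatedly with the same string
 * and the same continuation. */
size_t toy_count(const char *s, size_t len)
{
    toy_cont_t pc = {0};
    size_t n = 0;

    while (toy_next(&pc, s, len))
        n++;
    return n;
}
```

Because all scanning state lives in the toy_cont_t value rather than in static variables, several strings could be scanned in an interleaved fashion by keeping one such value per string.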

We use continuations instead of a stateful parser so that multiple strings can be parsed concurrently simply by maintaining a set of continuations. Calling the continuation-based parser directly requires manipulating continuations by hand. This is not recommended unless you are willing to deal with potential errors and to learn exactly how the continuation relates to the internals of the parser. A simpler approach is to use either the parse_sexp function, which returns an s-expression without exposing the continuation, or the iparse_sexp function, which iteratively pops one s-expression at a time from a string containing one or more s-expressions. Refer to the documentation for each parsing function for further details on behavior and usage.


Field Documentation

char* bindata

Pointer to the memory containing the binary data being read in.

size_t binexpected

Expected length of the binary data currently being read in. This is also the size of the memory allocated to hold that data.

size_t binread

Number of bytes of the binary blob that have already been read in.
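As a rough illustration of how bindata, binexpected and binread could cooperate when a binary blob arrives in several pieces, here is a minimal sketch. The bin_state_t struct and the bin_feed and bin_done functions are hypothetical and exist only for this example; the library's own bookkeeping may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in holding only the three binary-related fields. */
typedef struct bin_state {
    char  *bindata;     /* destination buffer, binexpected bytes long */
    size_t binexpected; /* total number of bytes the blob will contain */
    size_t binread;     /* bytes copied into bindata so far */
} bin_state_t;

/* Copy as much of chunk as the blob still needs and return the number of
 * bytes consumed. Any bytes past the return value are not part of the blob. */
size_t bin_feed(bin_state_t *st, const char *chunk, size_t n)
{
    size_t want, take;

    assert(st != NULL && st->binread <= st->binexpected);
    want = st->binexpected - st->binread;
    take = (n < want) ? n : want;
    if (take > 0)
        memcpy(st->bindata + st->binread, chunk, take);
    st->binread += take;
    return take;
}

/* The blob is complete once every expected byte has been read. */
int bin_done(const bin_state_t *st)
{
    return st->binread == st->binexpected;
}
```

Feeding "abc" and then "defg" into a state with binexpected set to 5 consumes three bytes, then two; the remaining "fg" would be left for ordinary parsing.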

unsigned int depth

This is the depth of parentheses (the number of left parens encountered) that the parser is currently working with.

sexp_errcode_t error

Error code. Used to indicate that the continuation being returned does not represent a successful parse, so its contents are not of much value.

unsigned int esc

This is a flag indicating whether the next character to be processed should be assumed to have been prefaced with a '\' character to escape it.
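The reason such a flag has to live in the continuation is that a backslash may be the last byte of one input chunk while the character it escapes is the first byte of the next. The sketch below shows that idea only; copy_unescaped is a hypothetical helper, and whether the library keeps or drops the backslash in the stored atom is not something this example claims.

```c
#include <stddef.h>

/* Hypothetical helper: copies n input bytes to out, dropping each escaping
 * backslash and keeping the byte it protects. *esc carries the "previous
 * byte was a backslash" flag from one call to the next. Returns the number
 * of bytes written; out must have room for n bytes. */
size_t copy_unescaped(unsigned int *esc, const char *in, size_t n, char *out)
{
    size_t i, w = 0;

    for (i = 0; i < n; i++) {
        if (*esc) {
            out[w++] = in[i]; /* escaped byte is taken literally */
            *esc = 0;
        } else if (in[i] == '\\') {
            *esc = 1;         /* remember the escape, emit nothing yet */
        } else {
            out[w++] = in[i];
        }
    }
    return w;
}
```

If the first call ends on a backslash, the flag is still set when the second call begins, so the first byte of the second chunk is copied literally.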

parser_event_handlers_t* event_handlers

Pointer to a structure holding handlers for sexpr events. NULL for normal parser operation. This field is NOT freed by destroy_continuation and must be freed by the user: the handlers are always allocated outside the library, so they are the user's responsibility.

sexp_t* last_sexp

The last full s-expression encountered by the parser. If this is NULL, the parser has not encountered a full s-expression and more data is required for the current s-expression being parsed. If this is non-NULL, then the parser has encountered one s-expression and may be partway through parsing the next s-expression.

char* lastPos

Pointer to the position in the string being parsed at which the parser last stopped. When the parser is called with the continuation, this is the first character that will be processed. If this is NULL, the parser will start parsing at the beginning of the string passed into the parser.

parsermode_t mode

Mode. The parser's specialized behaviours can be activated by changing the mode setting. There are currently two available: normal and inline_binary. Inline_binary treats atoms that start with #b# specially, assuming that they have the structure:

#b#s#data

Where s is a positive (greater than 0) integer representing the length of the data, and data is s bytes of binary data following the # sign. After the s bytes, it is assumed normal s-expression data continues.
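To make the #b#s#data layout concrete, the following sketch decodes one such atom from a buffer. It is not the library's code: bin_atom_t and read_inline_binary are hypothetical names, and the sketch assumes that s is written as ASCII decimal digits, which the description above does not state explicitly.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result type: where the payload starts and how long it is. */
typedef struct bin_atom {
    const char *data;   /* first byte of the binary payload */
    size_t      length; /* the value of s */
} bin_atom_t;

/* Decode "#b#s#data" at the start of buf. Returns 1 on success, 0 if the
 * header is malformed, s is zero, or fewer than s payload bytes are present.
 * Assumption: s is ASCII decimal. Overflow of s is not checked here. */
int read_inline_binary(const char *buf, size_t buflen, bin_atom_t *out)
{
    size_t i = 3, s = 0;

    if (buflen < 3 || memcmp(buf, "#b#", 3) != 0)
        return 0;
    if (i >= buflen || buf[i] < '0' || buf[i] > '9')
        return 0;                      /* need at least one digit */
    while (i < buflen && buf[i] >= '0' && buf[i] <= '9') {
        s = s * 10 + (size_t)(buf[i] - '0');
        i++;
    }
    if (s == 0)
        return 0;                      /* length must be greater than 0 */
    if (i >= buflen || buf[i] != '#')
        return 0;                      /* missing '#' before the payload */
    i++;
    if (buflen - i < s)
        return 0;                      /* payload truncated */
    out->data = buf + i;
    out->length = s;
    return 1;
}
```

Note that the payload is taken by length alone, so bytes such as ')' inside it are not interpreted; normal s-expression parsing would resume at data + length.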

unsigned int qdepth

This is the depth of parentheses encountered after a single quote (tick), tracked when the character immediately following the tick was a left paren.

char* sbuffer

This is a pointer to the beginning of the current string being processed. lastPos is a pointer to some value inside the string that this points to.

unsigned int squoted

Flag indicating whether we are processing an atom that was preceded by a single quote.

faststack_t* stack

The parser stack used for iterative parsing.

unsigned int state

This is the state ID of the current state of the parser in the DFA representing the parser. The current parser is DFA-based, which simplifies restoring the proper state from a continuation.

char* val

Pointer to a temporary buffer used to store atom values during parsing.

size_t val_allocated

Current number of bytes allocated for val.

size_t val_used

Current number of used bytes in val.

char* vcur

Pointer to the character following the last character in the current atom value being parsed.
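The relationship between val, val_allocated, val_used and vcur can be pictured as an ordinary growable byte buffer. The sketch below is an assumption-laden illustration, not the library's allocation code: atom_buf_t and atom_buf_push are hypothetical, and the initial size of 16 bytes and the doubling policy are choices made only for this example.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in holding only the four atom-buffer fields. */
typedef struct atom_buf {
    char  *val;           /* temporary buffer for the atom being read */
    size_t val_allocated; /* bytes currently allocated for val */
    size_t val_used;      /* bytes of val currently in use */
    char  *vcur;          /* one past the last stored character */
} atom_buf_t;

/* Append one character, growing the buffer when it is full.
 * Returns 1 on success and 0 if memory could not be obtained. */
int atom_buf_push(atom_buf_t *b, char c)
{
    if (b->val_used == b->val_allocated) {
        /* Assumed policy: start at 16 bytes, then double. */
        size_t cap = b->val_allocated ? b->val_allocated * 2 : 16;
        char *p = realloc(b->val, cap);

        if (p == NULL)
            return 0;     /* the old buffer is still valid */
        b->val = p;
        b->val_allocated = cap;
    }
    b->val[b->val_used++] = c;
    b->vcur = b->val + b->val_used; /* recompute: realloc may move val */
    return 1;
}
```

Recomputing vcur from val after every append keeps it valid even when realloc moves the buffer, which is the main hazard of caching a pointer into growable storage.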