The class which parses text zones in a Mac MS Works v4 document.
#include <MSK4Text.hxx>
Classes | |
struct | DataFOD |
structure which retrieves the data information corresponding to a text position | |
Public Member Functions | |
MSK4Text (MSK4Zone &parser) | |
constructor | |
~MSK4Text () | |
destructor | |
void | setDefault (MWAWFont &font) |
sets the default font | |
int | numPages () const |
returns the number of pages | |
void | flushExtra (MWAWInputStreamPtr) |
sends the data which has not yet been sent: currently does nothing | |
Protected Types | |
typedef bool(MSK4Text::* | FDPParser )(MWAWInputStreamPtr &input, long endPos, int &id, std::string &mess) |
callback when a new attribute is found in an FDPP/FDPC entry | |
typedef bool(MSK4Text::* | DataParser )(MWAWInputStreamPtr input, long endPos, long bot, long eot, int id, std::string &mess) |
definition of the plc data parser (low level) | |
Protected Member Functions | |
bool | readStructures (MWAWInputStreamPtr input, bool mainOle) |
finds and parses all structures which correspond to the text | |
bool | readText (MWAWInputStreamPtr input, MWAWEntry const &entry, bool mainOle) |
reads a text section and sends it to the listener | |
bool | readFootNote (MWAWInputStreamPtr input, int id) |
sends the text which corresponds to footnote id to the listener | |
bool | readPLC (MWAWInputStreamPtr input, MWAWEntry const &entry, std::vector< long > &textPtrs, std::vector< long > &listValues, DataParser parser=&MSK4Text::defDataParser) |
reads a PLC (Pointer List Composant ?) in zone entry | |
bool | readSimplePLC (MWAWInputStreamPtr &input, MWAWEntry const &entry, std::vector< long > &textPtrs, std::vector< long > &listValues) |
reads a PLC (Pointer List Composant ?) in zone entry | |
bool | defDataParser (MWAWInputStreamPtr input, long endPos, long bot, long eot, int id, std::string &mess) |
the default parser (does nothing) | |
bool | readFontNames (MWAWInputStreamPtr input, MWAWEntry const &entry) |
reads the font names entry : FONT | |
bool | readFont (MWAWInputStreamPtr &input, long endPos, int &id, std::string &mess) |
reads font properties | |
void | setProperty (MSK4TextInternal::Paragraph const &tabs) |
sends paragraph properties to the listener | |
bool | readParagraph (MWAWInputStreamPtr &input, long endPos, int &id, std::string &mess) |
reads paragraph properties | |
bool | ftntDataParser (MWAWInputStreamPtr input, long endPos, long bot, long eot, int id, std::string &mess) |
parses the footnote position : FTNT | |
bool | eobjDataParser (MWAWInputStreamPtr input, long endPos, long bot, long eot, int id, std::string &mess) |
parses the object position : EOBJ | |
bool | toknDataParser (MWAWInputStreamPtr input, long endPos, long bot, long eot, int id, std::string &mess) |
parses the field properties entries : TOKN. | |
bool | pgdDataParser (MWAWInputStreamPtr input, long endPos, long, long, int id, std::string &mess) |
parses the page break position entries : PGD | |
void | flushNote (int noteId) |
sends to the listener the text which corresponds to noteId | |
MSK4Zone const * | mainParser () const |
returns the main parser | |
MSK4Zone * | mainParser () |
returns the main parser | |
std::vector< DataFOD > | mergeSortedLists (std::vector< DataFOD > const &lst1, std::vector< DataFOD > const &lst2) const |
function which takes two lists of attributes sorted by text position and returns the merged list | |
bool | readFDP (MWAWInputStreamPtr &input, MWAWEntry const &entry, std::vector< DataFOD > &fods, FDPParser parser) |
parses an FDPP or FDPC entry (which contains a list of ATTR_TEXT/ATTR_PARAG with their definitions) and adds the data found to listFODs | |
bool | findFDPStructures (MWAWInputStreamPtr &input, int which) |
Fills the vector of (FDPCs/FDPPs) paragraph/character structures. | |
bool | findFDPStructuresByHand (MWAWInputStreamPtr &input, int which) |
Fills the vector of (FDPCs/FDPPs) paragraph/character structures; a function to call when the normal way fails. | |
Protected Attributes | |
MWAWParserStatePtr | m_parserState |
the parser state | |
MSK4Zone * | m_mainParser |
the main parser | |
MWAWEntry | m_textPositions |
an entry which corresponds to the complete text zone | |
shared_ptr < MSK4TextInternal::State > | m_state |
the internal state | |
std::vector< DataFOD > | m_FODsList |
the list of FODs | |
std::vector< MWAWEntry const * > | m_FDPCs |
the list of FDPC entries | |
std::vector< MWAWEntry const * > | m_FDPPs |
the list of FDPP entries | |
Private Member Functions | |
MSK4Text (MSK4Text const &orig) | |
MSK4Text & | operator= (MSK4Text const &orig) |
Friends | |
class | MSK4Zone |
The class which parses text zones in a Mac MS Works v4 document.
This class must be associated with a MSK4Zone. It reads the entries:
typedef bool(MSK4Text::* MSK4Text::DataParser)(MWAWInputStreamPtr input, long endPos, long bot, long eot, int id, std::string &mess) [protected] |
definition of the plc data parser (low level)
endPos | the end of the properties' definition |
bot, eot | define the text zone corresponding to these properties |
id | the number of these properties |
mess | a string which can be filled to indicate unparsed data |
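The callback mechanism behind DataParser can be sketched as follows. This is only an illustration of a pointer-to-member-function parser with a do-nothing default: the class PlcReader, the method rangeDataParser, the helper readAll and the struct Range are hypothetical names, and the input stream argument is left out to keep the sketch self-contained.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for MSK4Text: only the callback mechanism is modelled.
class PlcReader {
public:
  // Same shape as DataParser, minus the input stream (omitted in this sketch).
  typedef bool (PlcReader::*DataParser)(long endPos, long bot, long eot, int id, std::string &mess);

  // One recorded text zone: property number and [bot, eot) limits.
  struct Range {
    int id;
    long bot, eot;
  };

  // Default parser: accepts the record and does nothing.
  bool defDataParser(long, long, long, int, std::string &) { return true; }

  // Example parser: stores the text zone covered by each property.
  bool rangeDataParser(long, long bot, long eot, int id, std::string &mess) {
    if (eot < bot) {
      mess = "bad range"; // report data which could not be interpreted
      return false;
    }
    Range r = {id, bot, eot};
    m_ranges.push_back(r);
    return true;
  }

  // Calls the parser once for each pair of consecutive text positions.
  // Returns the number of records the parser accepted.
  int readAll(std::vector<long> const &textPtrs, long endPos,
              DataParser parser = &PlcReader::defDataParser) {
    int accepted = 0;
    for (std::size_t i = 0; i + 1 < textPtrs.size(); ++i) {
      std::string mess;
      if ((this->*parser)(endPos, textPtrs[i], textPtrs[i + 1], int(i), mess))
        ++accepted;
    }
    return accepted;
  }

  std::vector<Range> m_ranges;
};
```

Calling readAll(ptrs, endPos) uses the default parser and leaves m_ranges empty, while readAll(ptrs, endPos, &PlcReader::rangeDataParser) fills it with one Range per zone.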
typedef bool(MSK4Text::* MSK4Text::FDPParser)(MWAWInputStreamPtr &input, long endPos, int &id, std::string &mess) [protected] |
callback when a new attribute is found in an FDPP/FDPC entry
input, endPos | define the zone in the file |
MSK4Text::MSK4Text | ( | MSK4Zone & | parser | ) |
constructor
MSK4Text::~MSK4Text | ( | ) |
destructor
MSK4Text::MSK4Text | ( | MSK4Text const & | orig | ) | [private] |
bool MSK4Text::defDataParser | ( | MWAWInputStreamPtr | input, |
long | endPos, | ||
long | bot, | ||
long | eot, | ||
int | id, | ||
std::string & | mess | ||
) | [protected] |
the default parser (does nothing)
bool MSK4Text::eobjDataParser | ( | MWAWInputStreamPtr | input, |
long | endPos, | ||
long | bot, | ||
long | eot, | ||
int | id, | ||
std::string & | mess | ||
) | [protected] |
parses the object position : EOBJ
Referenced by readStructures().
bool MSK4Text::findFDPStructures | ( | MWAWInputStreamPtr & | input, |
int | which | ||
) | [protected] |
Fills the vector of (FDPCs/FDPPs) paragraph/character structures.
Uses the BTEC/BTEP entries (the normal way) and calls readSimplePLC on each entry to check that the parsing is correct.
input | the file input |
which | = 0 : paragraph structures |
which | = 1 : character structures |
Referenced by readStructures().
bool MSK4Text::findFDPStructuresByHand | ( | MWAWInputStreamPtr & | input, |
int | which | ||
) | [protected] |
Fills the vector of (FDPCs/FDPPs) paragraph/character structures; a function to call when the normal way fails.
Uses all FDPC/FDPP entries and calls readSimplePLC on each entry to check that the parsing is correct.
input | the file input |
which | = 0 : paragraph structures |
which | = 1 : character structures |
Referenced by readStructures().
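The relation between the two lookup functions (try the normal way first, then fall back to scanning by hand) could be sketched as below. Everything here is an assumption made for illustration: the Entry struct, its parsesCleanly flag (standing in for the readSimplePLC check), the mapping of which to the FDPP/BTEP and FDPC/BTEC names, and the helper names are hypothetical and do not describe how readStructures() actually chains the calls.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical directory entry: a four-letter type name and a flag telling
// whether its content validates (readSimplePLC plays that role in the class).
struct Entry {
  std::string type;
  bool parsesCleanly;
};

// Assumed mapping: which == 0 selects paragraph entries, which == 1 character entries.
std::string fdpName(int which) { return which == 0 ? "FDPP" : "FDPC"; }
std::string indexName(int which) { return which == 0 ? "BTEP" : "BTEC"; }

// Collects every entry named fdpName(which); fails if one does not validate.
bool collect(std::vector<Entry> const &dir, int which, std::vector<Entry const *> &out) {
  out.clear();
  for (std::size_t i = 0; i < dir.size(); ++i) {
    if (dir[i].type != fdpName(which)) continue;
    if (!dir[i].parsesCleanly) {
      out.clear();
      return false;
    }
    out.push_back(&dir[i]);
  }
  return !out.empty();
}

// "Normal way" (sketch): requires a valid BTEP/BTEC index entry before collecting.
bool findNormal(std::vector<Entry> const &dir, int which, std::vector<Entry const *> &out) {
  for (std::size_t i = 0; i < dir.size(); ++i) {
    if (dir[i].type == indexName(which) && dir[i].parsesCleanly)
      return collect(dir, which, out);
  }
  out.clear();
  return false;
}

// "By hand" (sketch): ignores the index and scans all FDPP/FDPC entries directly.
bool findByHand(std::vector<Entry> const &dir, int which, std::vector<Entry const *> &out) {
  return collect(dir, which, out);
}

// Fallback chain: the by-hand scan runs only if the normal way fails.
bool findWithFallback(std::vector<Entry> const &dir, int which, std::vector<Entry const *> &out) {
  return findNormal(dir, which, out) || findByHand(dir, which, out);
}
```

The pointers stored in out refer to elements of dir, so dir must outlive out and must not be resized while out is in use.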
void MSK4Text::flushExtra | ( | MWAWInputStreamPtr | ) | [inline] |
sends the data which has not yet been sent: currently does nothing
void MSK4Text::flushNote | ( | int | noteId | ) | [protected] |
sends to the listener the text which corresponds to noteId
bool MSK4Text::ftntDataParser | ( | MWAWInputStreamPtr | input, |
long | endPos, | ||
long | bot, | ||
long | eot, | ||
int | id, | ||
std::string & | mess | ||
) | [protected] |
parses the footnote position : FTNT
Referenced by readStructures().
MSK4Zone const* MSK4Text::mainParser | ( | ) | const [inline, protected] |
returns the main parser
Referenced by findFDPStructures(), findFDPStructuresByHand(), readStructures(), and readText().
MSK4Zone* MSK4Text::mainParser | ( | ) | [inline, protected] |
returns the main parser
std::vector< MSK4Text::DataFOD > MSK4Text::mergeSortedLists | ( | std::vector< DataFOD > const & | lst1, |
std::vector< DataFOD > const & | lst2 | ||
) | const [protected] |
function which takes two lists of attributes sorted by text position and returns the merged list.
Referenced by readPLC(), and readStructures().
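Merging two lists which are already sorted by text position can be done in linear time. The following is a minimal sketch using std::merge, not the class's own implementation: the struct Fod is a hypothetical, reduced stand-in for DataFOD (only a position and an id), and mergeSorted and byPos are illustrative names.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical, reduced DataFOD: a text position and an attribute id.
struct Fod {
  long pos;
  int id;
};

// Orders two attributes by text position.
bool byPos(Fod const &a, Fod const &b) { return a.pos < b.pos; }

// Merges two lists which are each sorted by position into one sorted list.
// std::merge is stable: for equal positions, elements of lst1 come first.
std::vector<Fod> mergeSorted(std::vector<Fod> const &lst1, std::vector<Fod> const &lst2) {
  std::vector<Fod> res;
  res.reserve(lst1.size() + lst2.size());
  std::merge(lst1.begin(), lst1.end(), lst2.begin(), lst2.end(),
             std::back_inserter(res), byPos);
  return res;
}
```

Both inputs must already be sorted by position; otherwise the result of std::merge is not guaranteed to be sorted.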
int MSK4Text::numPages | ( | ) | const |
returns the number of pages
bool MSK4Text::pgdDataParser | ( | MWAWInputStreamPtr | input, |
long | endPos, | ||
long | , | ||
long | , | ||
int | id, | ||
std::string & | mess | ||
) | [protected] |
parses the page break position entries : PGD
Referenced by readStructures().
bool MSK4Text::readFDP | ( | MWAWInputStreamPtr & | input, |
MWAWEntry const & | entry, | ||
std::vector< DataFOD > & | fods, | ||
MSK4Text::FDPParser | parser | ||
) | [protected] |
parses an FDPP or FDPC entry (which contains a list of ATTR_TEXT/ATTR_PARAG with their definitions) and adds the data found to listFODs
Referenced by readStructures().
bool MSK4Text::readFont | ( | MWAWInputStreamPtr & | input, |
long | endPos, | ||
int & | id, | ||
std::string & | mess | ||
) | [protected] |
reads font properties
Referenced by readStructures().
bool MSK4Text::readFontNames | ( | MWAWInputStreamPtr | input, |
MWAWEntry const & | entry | ||
) | [protected] |
reads the font names entry : FONT
Referenced by readStructures().
bool MSK4Text::readFootNote | ( | MWAWInputStreamPtr | input, |
int | id | ||
) | [protected] |
sends the text which corresponds to footnote id to the listener
bool MSK4Text::readParagraph | ( | MWAWInputStreamPtr & | input, |
long | endPos, | ||
int & | id, | ||
std::string & | mess | ||
) | [protected] |
reads paragraph properties
Referenced by readStructures().
bool MSK4Text::readPLC | ( | MWAWInputStreamPtr | input, |
MWAWEntry const & | entry, | ||
std::vector< long > & | textPtrs, | ||
std::vector< long > & | listValues, | ||
MSK4Text::DataParser | parser = &MSK4Text::defDataParser | ||
) | [protected] |
reads a PLC (Pointer List Composant ?) in zone entry
input | the file's input |
entry | the zone which contains the plc |
textPtrs | list of offsets in the text zone where the properties change |
listValues | list of property values (filled only if the values are simple types: int, ...) |
parser | the parser to use to read the values |
Referenced by readSimplePLC(), and readStructures().
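How a PLC-like table splits into textPtrs and listValues can be illustrated with a toy layout. The layout used here is an assumption chosen for the example and is NOT taken from the MS Works format: a 32-bit little-endian count N, then N+1 text positions, then N simple values. The names readU32 and readPlcSketch are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Reads a little-endian 32-bit value at offset off; the caller checks bounds.
std::uint32_t readU32(std::vector<unsigned char> const &buf, std::size_t off) {
  return std::uint32_t(buf[off]) | (std::uint32_t(buf[off + 1]) << 8) |
         (std::uint32_t(buf[off + 2]) << 16) | (std::uint32_t(buf[off + 3]) << 24);
}

// Assumed layout (illustration only): N, then N+1 text positions, then N values,
// all stored as 32-bit little-endian integers.
bool readPlcSketch(std::vector<unsigned char> const &buf,
                   std::vector<long> &textPtrs, std::vector<long> &listValues) {
  textPtrs.clear();
  listValues.clear();
  if (buf.size() < 4) return false;
  std::size_t n = readU32(buf, 0);
  // reject counts which cannot fit, before computing the needed size
  if (n > (buf.size() - 4) / 8) return false;
  std::size_t needed = 4 + 4 * (n + 1) + 4 * n;
  if (buf.size() < needed) return false;

  // N+1 positions: zone i spans [textPtrs[i], textPtrs[i+1])
  for (std::size_t i = 0; i <= n; ++i)
    textPtrs.push_back(long(readU32(buf, 4 + 4 * i)));
  // the positions must not decrease
  for (std::size_t i = 1; i < textPtrs.size(); ++i) {
    if (textPtrs[i] < textPtrs[i - 1]) {
      textPtrs.clear();
      return false;
    }
  }
  // N simple values, one per zone
  std::size_t valuesPos = 4 + 4 * (n + 1);
  for (std::size_t i = 0; i < n; ++i)
    listValues.push_back(long(readU32(buf, valuesPos + 4 * i)));
  return true;
}
```

In this sketch the values are plain integers, which matches the case where listValues is filled; for structured values a parser callback would take over instead.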
bool MSK4Text::readSimplePLC | ( | MWAWInputStreamPtr & | input, |
MWAWEntry const & | entry, | ||
std::vector< long > & | textPtrs, | ||
std::vector< long > & | listValues | ||
) | [inline, protected] |
reads a PLC (Pointer List Composant ?) in zone entry
input | the file's input |
entry | the zone which contains the plc |
textPtrs | list of offsets in the text zone where the properties change |
listValues | list of property values (filled only if the values are simple types: int, ...) |
Referenced by findFDPStructures().
bool MSK4Text::readStructures | ( | MWAWInputStreamPtr | input, |
bool | mainOle | ||
) | [protected] |
finds and parses all structures which correspond to the text
More precisely the TEXT, FONT, FDPC/FDPP, BTEC/BTEP, FTNT, PGD, TOKN entries
EOBJ and RBIL seem to be linked (and associated with a 0xc6 symbol in the file). RBIL can store a chart, a calendar, ...
bool MSK4Text::readText | ( | MWAWInputStreamPtr | input, |
MWAWEntry const & | entry, | ||
bool | mainOle | ||
) | [protected] |
reads a text section and sends it to the listener
Referenced by readFootNote().
void MSK4Text::setDefault | ( | MWAWFont & | font | ) |
sets the default font
void MSK4Text::setProperty | ( | MSK4TextInternal::Paragraph const & | tabs | ) | [protected] |
sends paragraph properties to the listener
Referenced by readText().
bool MSK4Text::toknDataParser | ( | MWAWInputStreamPtr | input, |
long | endPos, | ||
long | bot, | ||
long | eot, | ||
int | id, | ||
std::string & | mess | ||
) | [protected] |
parses the field properties entries : TOKN.
Referenced by readStructures().
friend class MSK4Zone [friend] |
std::vector<MWAWEntry const *> MSK4Text::m_FDPCs [protected] |
the list of FDPC entries
Referenced by findFDPStructures(), findFDPStructuresByHand(), and readStructures().
std::vector<MWAWEntry const *> MSK4Text::m_FDPPs [protected] |
the list of FDPP entries
Referenced by findFDPStructures(), findFDPStructuresByHand(), and readStructures().
std::vector<DataFOD> MSK4Text::m_FODsList [protected] |
the list of FODs
Referenced by numPages(), readPLC(), readStructures(), and readText().
MSK4Zone* MSK4Text::m_mainParser [protected] |
the main parser
Referenced by mainParser(), readFDP(), readFontNames(), readParagraph(), readPLC(), readText(), and toknDataParser().
MWAWParserStatePtr MSK4Text::m_parserState [protected] |
the parser state
Referenced by readFont(), readFontNames(), readFootNote(), readParagraph(), readText(), and setProperty().
shared_ptr<MSK4TextInternal::State> MSK4Text::m_state [mutable, protected] |
the internal state
Referenced by eobjDataParser(), ftntDataParser(), MSK4Text(), numPages(), readFont(), readFontNames(), readFootNote(), readParagraph(), readPLC(), readStructures(), readText(), setDefault(), and setProperty().
MWAWEntry MSK4Text::m_textPositions [protected] |
an entry which corresponds to the complete text zone
Referenced by readFDP(), readFootNote(), readPLC(), readStructures(), readText(), and toknDataParser().