API Docs for SAX and DOM
 


SAXParser.hpp

00001 /*
00002  * The Apache Software License, Version 1.1
00003  *
00004  * Copyright (c) 1999-2001 The Apache Software Foundation.  All rights
00005  * reserved.
00006  *
00007  * Redistribution and use in source and binary forms, with or without
00008  * modification, are permitted provided that the following conditions
00009  * are met:
00010  *
00011  * 1. Redistributions of source code must retain the above copyright
00012  *    notice, this list of conditions and the following disclaimer.
00013  *
00014  * 2. Redistributions in binary form must reproduce the above copyright
00015  *    notice, this list of conditions and the following disclaimer in
00016  *    the documentation and/or other materials provided with the
00017  *    distribution.
00018  *
00019  * 3. The end-user documentation included with the redistribution,
00020  *    if any, must include the following acknowledgment:
00021  *       "This product includes software developed by the
00022  *        Apache Software Foundation (http://www.apache.org/)."
00023  *    Alternately, this acknowledgment may appear in the software itself,
00024  *    if and wherever such third-party acknowledgments normally appear.
00025  *
00026  * 4. The names "Xerces" and "Apache Software Foundation" must
00027  *    not be used to endorse or promote products derived from this
00028  *    software without prior written permission. For written
00029  *    permission, please contact apache@apache.org.
00030  *
00031  * 5. Products derived from this software may not be called "Apache",
00032  *    nor may "Apache" appear in their name, without prior written
00033  *    permission of the Apache Software Foundation.
00034  *
00035  * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
00036  * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
00037  * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
00038  * DISCLAIMED.  IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR
00039  * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
00040  * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
00041  * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
00042  * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
00043  * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
00044  * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
00045  * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
00046  * SUCH DAMAGE.
00047  * ====================================================================
00048  *
00049  * This software consists of voluntary contributions made by many
00050  * individuals on behalf of the Apache Software Foundation, and was
00051  * originally based on software copyright (c) 1999, International
00052  * Business Machines, Inc., http://www.ibm.com .  For more information
00053  * on the Apache Software Foundation, please see
00054  * <http://www.apache.org/>.
00055  */
00056 
00057 /*
00058  * $Log: SAXParser.hpp,v $
00059  * Revision 1.18  2001/07/16 12:52:09  tng
00060  * APIDocs fix: default for schema processing in DOMParser, IDOMParser, and SAXParser should be false.
00061  *
00062  * Revision 1.17  2001/06/23 14:13:16  tng
00063  * Remove getScanner from the Parser headers as this is not needed and Scanner is not internal class.
00064  *
00065  * Revision 1.16  2001/06/03 19:26:20  jberry
00066  * Add support for querying error count following parse; enables simple parse without requiring error handler.
00067  *
00068  * Revision 1.15  2001/05/11 13:26:22  tng
00069  * Copyright update.
00070  *
00071  * Revision 1.14  2001/05/03 19:09:25  knoaman
00072  * Support Warning/Error/FatalError messaging.
00073  * Validity constraints errors are treated as errors, with the ability by user to set
00074  * validity constraints as fatal errors.
00075  *
00076  * Revision 1.13  2001/03/30 16:46:57  tng
00077  * Schema: Use setDoSchema instead of setSchemaValidation which makes more sense.
00078  *
00079  * Revision 1.12  2001/03/21 21:56:09  tng
00080  * Schema: Add Schema Grammar, Schema Validator, and split the DTDValidator into DTDValidator, DTDScanner, and DTDGrammar.
00081  *
00082  * Revision 1.11  2001/02/15 15:56:29  tng
00083  * Schema: Add setSchemaValidation and getSchemaValidation for DOMParser and SAXParser.
00084  * Add feature "http://apache.org/xml/features/validation/schema" for SAX2XMLReader.
00085  * New data field  fSchemaValidation in XMLScanner as the flag.
00086  *
00087  * Revision 1.10  2001/01/12 21:23:41  tng
00088  * Documentation Enhancement: explain values of Val_Scheme
00089  *
00090  * Revision 1.9  2000/08/02 18:05:15  jpolast
00091  * changes required for sax2
00092  * (changed private members to protected)
00093  *
00094  * Revision 1.8  2000/04/12 22:58:30  roddey
00095  * Added support for 'auto validate' mode.
00096  *
00097  * Revision 1.7  2000/03/03 01:29:34  roddey
00098  * Added a scanReset()/parseReset() method to the scanner and
00099  * parsers, to allow for reset after early exit from a progressive parse.
00100  * Added calls to new Terminate() call to all of the samples. Improved
00101  * documentation in SAX and DOM parsers.
00102  *
00103  * Revision 1.6  2000/02/17 03:54:27  rahulj
00104  * Added some new getters to query the parser state and
00105  * clarified the documentation.
00106  *
00107  * Revision 1.5  2000/02/16 03:42:58  rahulj
00108  * Finished documenting the SAX Driver implementation.
00109  *
00110  * Revision 1.4  2000/02/15 04:47:37  rahulj
00111  * Documenting the SAXParser framework. Not done yet.
00112  *
00113  * Revision 1.3  2000/02/06 07:47:56  rahulj
00114  * Year 2K copyright swat.
00115  *
00116  * Revision 1.2  1999/12/15 19:57:48  roddey
00117  * Got rid of redundant 'const' on boolean return value. Some compilers choke
00118  * on this and its useless.
00119  *
00120  * Revision 1.1.1.1  1999/11/09 01:07:51  twl
00121  * Initial checkin
00122  *
00123  * Revision 1.6  1999/11/08 20:44:54  rahul
00124  * Swat for adding in Product name and CVS comment log variable.
00125  *
00126  */
00127 
00128 #if !defined(SAXPARSER_HPP)
00129 #define SAXPARSER_HPP
00130 
00131 #include <sax/Parser.hpp>
00132 #include <internal/VecAttrListImpl.hpp>
00133 #include <framework/XMLDocumentHandler.hpp>
00134 #include <framework/XMLElementDecl.hpp>
00135 #include <framework/XMLEntityHandler.hpp>
00136 #include <framework/XMLErrorReporter.hpp>
00137 #include <validators/DTD/DocTypeHandler.hpp>
00138 
00139 class DocumentHandler;
00140 class EntityResolver;
00141 class XMLPScanToken;
00142 class XMLScanner;
00143 class XMLValidator;
00144 
00145 
00155 
00156 class  SAXParser :
00157 
00158     public Parser
00159     , public XMLDocumentHandler
00160     , public XMLErrorReporter
00161     , public XMLEntityHandler
00162     , public DocTypeHandler
00163 {
00164 public :
00165     // -----------------------------------------------------------------------
00166     //  Class types
00167     // -----------------------------------------------------------------------
00168     enum ValSchemes
00169     {
00170         Val_Never
00171         , Val_Always
00172         , Val_Auto
00173     };
00174 
00175 
00176     // -----------------------------------------------------------------------
00177     //  Constructors and Destructor
00178     // -----------------------------------------------------------------------
00179 
00182 
00187     SAXParser(XMLValidator* const valToAdopt = 0);
00188 
00192     ~SAXParser();
00194 
00195 
00198 
00204     DocumentHandler* getDocumentHandler();
00205 
00212     const DocumentHandler* getDocumentHandler() const;
00213 
00220     EntityResolver* getEntityResolver();
00221 
00228     const EntityResolver* getEntityResolver() const;
00229 
00236     ErrorHandler* getErrorHandler();
00237 
00244     const ErrorHandler* getErrorHandler() const;
00245 
00252     const XMLValidator& getValidator() const;
00253 
00261     ValSchemes getValidationScheme() const;
00262 
00273     bool getDoSchema() const;
00274 
00285     int getErrorCount() const;
00286 
00296     bool getDoNamespaces() const;
00297 
00307     bool getExitOnFirstFatalError() const;
00308 
00319     bool getValidationConstraintFatal() const;
00321 
00322 
00323     // -----------------------------------------------------------------------
00324     //  Setter methods
00325     // -----------------------------------------------------------------------
00326 
00329 
00346     void setDoNamespaces(const bool newState);
00347 
00364     void setValidationScheme(const ValSchemes newScheme);
00365 
00379     void setDoSchema(const bool newState);
00380 
00381 
00397     void setExitOnFirstFatalError(const bool newState);
00398 
00414     void setValidationConstraintFatal(const bool newState);
00416 
00417 
00418     // -----------------------------------------------------------------------
00419     //  Advanced document handler list maintenance methods
00420     // -----------------------------------------------------------------------
00421 
00424 
00437     void installAdvDocHandler(XMLDocumentHandler* const toInstall);
00438 
00448     bool removeAdvDocHandler(XMLDocumentHandler* const toRemove);
00450 
00451 
00452     // -----------------------------------------------------------------------
00453     //  Implementation of the SAX Parser interface
00454     // -----------------------------------------------------------------------
00455 
00458 
00470     virtual void parse(const InputSource& source, const bool reuseGrammar = false);
00471 
00484     virtual void parse(const XMLCh* const systemId, const bool reuseGrammar = false);
00485 
00496     virtual void parse(const char* const systemId, const bool reuseGrammar = false);
00497 
00508     virtual void setDocumentHandler(DocumentHandler* const handler);
00509 
00519     virtual void setDTDHandler(DTDHandler* const handler);
00520 
00531     virtual void setErrorHandler(ErrorHandler* const handler);
00532 
00544     virtual void setEntityResolver(EntityResolver* const resolver);
00546 
00547 
00548     // -----------------------------------------------------------------------
00549     //  Progressive scan methods
00550     // -----------------------------------------------------------------------
00551 
00554 
00585     bool parseFirst
00586     (
00587         const   XMLCh* const    systemId
00588         ,       XMLPScanToken&  toFill
00589         , const bool            reuseGrammar = false
00590     );
00591 
00622     bool parseFirst
00623     (
00624         const   char* const     systemId
00625         ,       XMLPScanToken&  toFill
00626         , const bool            reuseGrammar = false
00627     );
00628 
00659     bool parseFirst
00660     (
00661         const   InputSource&    source
00662         ,       XMLPScanToken&  toFill
00663         , const bool            reuseGrammar = false
00664     );
00665 
00690     bool parseNext(XMLPScanToken& token);
00691 
00713     void parseReset(XMLPScanToken& token);
00714 
00716 
00717 
00718 
00719     // -----------------------------------------------------------------------
00720     //  Implementation of the DocTypeHandler Interface
00721     // -----------------------------------------------------------------------
00722 
00725 
00739     virtual void attDef
00740     (
00741         const   DTDElementDecl& elemDecl
00742         , const DTDAttDef&      attDef
00743         , const bool            ignoring
00744     );
00745 
00755     virtual void doctypeComment
00756     (
00757         const   XMLCh* const    comment
00758     );
00759 
00776     virtual void doctypeDecl
00777     (
00778         const   DTDElementDecl& elemDecl
00779         , const XMLCh* const    publicId
00780         , const XMLCh* const    systemId
00781         , const bool            hasIntSubset
00782     );
00783 
00797     virtual void doctypePI
00798     (
00799         const   XMLCh* const    target
00800         , const XMLCh* const    data
00801     );
00802 
00814     virtual void doctypeWhitespace
00815     (
00816         const   XMLCh* const    chars
00817         , const unsigned int    length
00818     );
00819 
00832     virtual void elementDecl
00833     (
00834         const   DTDElementDecl& decl
00835         , const bool            isIgnored
00836     );
00837 
00848     virtual void endAttList
00849     (
00850         const   DTDElementDecl& elemDecl
00851     );
00852 
00859     virtual void endIntSubset();
00860 
00867     virtual void endExtSubset();
00868 
00883     virtual void entityDecl
00884     (
00885         const   DTDEntityDecl&  entityDecl
00886         , const bool            isPEDecl
00887         , const bool            isIgnored
00888     );
00889 
00894     virtual void resetDocType();
00895 
00908     virtual void notationDecl
00909     (
00910         const   XMLNotationDecl&    notDecl
00911         , const bool                isIgnored
00912     );
00913 
00924     virtual void startAttList
00925     (
00926         const   DTDElementDecl& elemDecl
00927     );
00928 
00935     virtual void startIntSubset();
00936 
00943     virtual void startExtSubset();
00944 
00957     virtual void TextDecl
00958     (
00959         const   XMLCh* const    versionStr
00960         , const XMLCh* const    encodingStr
00961     );
00963 
00964 
00965     // -----------------------------------------------------------------------
00966     //  Implementation of the XMLDocumentHandler interface
00967     // -----------------------------------------------------------------------
00968 
00971 
00986     virtual void docCharacters
00987     (
00988         const   XMLCh* const    chars
00989         , const unsigned int    length
00990         , const bool            cdataSection
00991     );
00992 
01002     virtual void docComment
01003     (
01004         const   XMLCh* const    comment
01005     );
01006 
01026     virtual void docPI
01027     (
01028         const   XMLCh* const    target
01029         , const XMLCh* const    data
01030     );
01031 
01043     virtual void endDocument();
01044 
01061     virtual void endElement
01062     (
01063         const   XMLElementDecl& elemDecl
01064         , const unsigned int    urlId
01065         , const bool            isRoot
01066     );
01067 
01078     virtual void endEntityReference
01079     (
01080         const   XMLEntityDecl&  entDecl
01081     );
01082 
01102     virtual void ignorableWhitespace
01103     (
01104         const   XMLCh* const    chars
01105         , const unsigned int    length
01106         , const bool            cdataSection
01107     );
01108 
01113     virtual void resetDocument();
01114 
01125     virtual void startDocument();
01126 
01153     virtual void startElement
01154     (
01155         const   XMLElementDecl&         elemDecl
01156         , const unsigned int            urlId
01157         , const XMLCh* const            elemPrefix
01158         , const RefVectorOf<XMLAttr>&   attrList
01159         , const unsigned int            attrCount
01160         , const bool                    isEmpty
01161         , const bool                    isRoot
01162     );
01163 
01173     virtual void startEntityReference
01174     (
01175         const   XMLEntityDecl&  entDecl
01176     );
01177 
01195     virtual void XMLDecl
01196     (
01197         const   XMLCh* const    versionStr
01198         , const XMLCh* const    encodingStr
01199         , const XMLCh* const    standaloneStr
01200         , const XMLCh* const    actualEncodingStr
01201     );
01203 
01204 
01205     // -----------------------------------------------------------------------
01206     //  Implementation of the XMLErrorReporter interface
01207     // -----------------------------------------------------------------------
01208 
01211 
01234     virtual void error
01235     (
01236         const   unsigned int                errCode
01237         , const XMLCh* const                msgDomain
01238         , const XMLErrorReporter::ErrTypes  errType
01239         , const XMLCh* const                errorText
01240         , const XMLCh* const                systemId
01241         , const XMLCh* const                publicId
01242         , const unsigned int                lineNum
01243         , const unsigned int                colNum
01244     );
01245 
01254     virtual void resetErrors();
01256 
01257 
01258     // -----------------------------------------------------------------------
01259     //  Implementation of the XMLEntityHandler interface
01260     // -----------------------------------------------------------------------
01261 
01264 
01275     virtual void endInputSource(const InputSource& inputSource);
01276 
01291     virtual bool expandSystemId
01292     (
01293         const   XMLCh* const    systemId
01294         ,       XMLBuffer&      toFill
01295     );
01296 
01304     virtual void resetEntities();
01305 
01320     virtual InputSource* resolveEntity
01321     (
01322         const   XMLCh* const    publicId
01323         , const XMLCh* const    systemId
01324     );
01325 
01337     virtual void startInputSource(const InputSource& inputSource);
01339 
01340 
01343 
01353     bool getDoValidation() const;
01354 
01368     void setDoValidation(const bool newState);
01370 
01371 
01372 protected :
01373     // -----------------------------------------------------------------------
01374     //  Unimplemented constructors and operators
01375     // -----------------------------------------------------------------------
01376     SAXParser(const SAXParser&);
01377     void operator=(const SAXParser&);
01378 
01379 
01380     // -----------------------------------------------------------------------
01381     //  Protected data members
01382     //
01383     //  fAttrList
01384     //      A temporary implementation of the basic SAX attribute list
01385     //      interface. We use this one over and over on each startElement
01386     //      event to allow SAX-like access to the element attributes.
01387     //
01388     //  fDocHandler
01389     //      The installed SAX doc handler, if any. Null if none.
01390     //
01391     //  fDTDHandler
01392     //      The installed SAX DTD handler, if any. Null if none.
01393     //
01394     //  fElemDepth
01395     //      This is used to track the element nesting depth, so that we can
01396     //      know when we are inside content. This is so we can ignore char
01397     //      data outside of content.
01398     //
01399     //  fEntityResolver
01400     //      The installed SAX entity resolver, if any. Null if none.
01401     //
01402     //  fErrorHandler
01403     //      The installed SAX error handler, if any. Null if none.
01404     //
01405     //  fAdvDHCount
01406     //  fAdvDHList
01407     //  fAdvDHListSize
01408     //      This is an array of pointers to XMLDocumentHandlers, which is
01409     //      how we see installed advanced document handlers. There will
01410     //      usually not be very many at all, so a simple array is used
01411     //      instead of a collection, for performance. It will grow if needed,
01412     //      but that is unlikely.
01413     //
01414     //      The count is how many handlers are currently installed. The size
01415     //      is how big the array itself is (for expansion purposes.) When
01416     //      count == size, it is time to expand.
01417     //
01418     //  fParseInProgress
01419     //      This flag is set once a parse starts. It is used to prevent
01420     //      multiple entrance or reentrance of the parser.
01421     //
01422     //  fScanner
01423     //      The scanner being used by this parser. It is created internally
01424     //      during construction.
01425     //
01426     // -----------------------------------------------------------------------
01427     VecAttrListImpl         fAttrList;
01428     DocumentHandler*        fDocHandler;
01429     DTDHandler*             fDTDHandler;
01430     unsigned int            fElemDepth;
01431     EntityResolver*         fEntityResolver;
01432     ErrorHandler*           fErrorHandler;
01433     unsigned int            fAdvDHCount;
01434     XMLDocumentHandler**    fAdvDHList;
01435     unsigned int            fAdvDHListSize;
01436     bool                    fParseInProgress;
01437     XMLScanner*             fScanner;
01438 };
01439 
01440 
01441 // ---------------------------------------------------------------------------
01442 //  SAXParser: Getter methods
01443 // ---------------------------------------------------------------------------
01444 inline DocumentHandler* SAXParser::getDocumentHandler()
01445 {
01446     return fDocHandler;
01447 }
01448 
01449 inline const DocumentHandler* SAXParser::getDocumentHandler() const
01450 {
01451     return fDocHandler;
01452 }
01453 
01454 inline EntityResolver* SAXParser::getEntityResolver()
01455 {
01456     return fEntityResolver;
01457 }
01458 
01459 inline const EntityResolver* SAXParser::getEntityResolver() const
01460 {
01461     return fEntityResolver;
01462 }
01463 
01464 inline ErrorHandler* SAXParser::getErrorHandler()
01465 {
01466     return fErrorHandler;
01467 }
01468 
01469 inline const ErrorHandler* SAXParser::getErrorHandler() const
01470 {
01471     return fErrorHandler;
01472 }
01473 
01474 #endif
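
The comment on fAdvDHCount, fAdvDHList and fAdvDHListSize describes a plain array of handler pointers that grows when the count reaches the size. The header does not show how installAdvDocHandler and removeAdvDocHandler are implemented, so the following is only a minimal sketch of that bookkeeping. The names Handler and HandlerList, the initial size of 2, the doubling growth factor, and the shift-down removal are all assumptions made for illustration, not Xerces code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for XMLDocumentHandler; only pointer identity matters.
struct Handler {};

// Sketch of the count / list / size bookkeeping described in the header comment.
class HandlerList
{
public:
    explicit HandlerList(const std::size_t initialSize = 2)
        : fCount(0)
        , fList(0)
        , fListSize(initialSize)
    {
        assert(initialSize > 0);
        fList = new Handler*[fListSize];
    }

    // Only the array is released; the handlers are not owned (assumption).
    ~HandlerList() { delete [] fList; }

    void install(Handler* const toInstall)
    {
        // When count == size, it is time to expand (doubling is an assumption).
        if (fCount == fListSize)
        {
            const std::size_t newSize = fListSize * 2;
            Handler** newList = new Handler*[newSize];
            for (std::size_t i = 0; i < fCount; ++i)
                newList[i] = fList[i];
            delete [] fList;
            fList = newList;
            fListSize = newSize;
        }
        fList[fCount++] = toInstall;
    }

    // Returns true if the handler was found and removed, false otherwise.
    bool remove(Handler* const toRemove)
    {
        for (std::size_t i = 0; i < fCount; ++i)
        {
            if (fList[i] == toRemove)
            {
                // Close the gap by shifting the later entries down one slot.
                for (std::size_t j = i + 1; j < fCount; ++j)
                    fList[j - 1] = fList[j];
                --fCount;
                return true;
            }
        }
        return false;
    }

    std::size_t count() const { return fCount; }
    std::size_t capacity() const { return fListSize; }

private:
    // Unimplemented copy operations, in the same style as the header.
    HandlerList(const HandlerList&);
    void operator=(const HandlerList&);

    std::size_t fCount;
    Handler**   fList;
    std::size_t fListSize;
};
```

Installing a third handler into a list created with size 2 triggers one expansion; removing a handler that is not installed simply returns false, which matches the bool return type of removeAdvDocHandler.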

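
The fParseInProgress comment says the flag is set once a parse starts and is used to prevent multiple entrance or reentrance. How the parser reacts to a reentrant call is not visible in this header, so the sketch below is an illustration of the general pattern only. GuardedParser, FlagReset and the use of std::logic_error are hypothetical choices, and the lambda in the usage note requires C++11, which is newer than the header itself.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical parser shell showing a reentrancy guard built on a bool flag.
class GuardedParser
{
public:
    GuardedParser() : fParseInProgress(false) {}

    // 'body' stands in for the real scan; it receives the parser so that a
    // handler could (incorrectly) try to start a nested parse.
    template <typename Body>
    void parse(Body body)
    {
        if (fParseInProgress)
            throw std::logic_error("parse already in progress");

        // Sets the flag now and clears it on every exit path, including
        // exceptions thrown by the body.
        FlagReset reset(fParseInProgress);
        body(*this);
    }

    bool parseInProgress() const { return fParseInProgress; }

private:
    struct FlagReset
    {
        explicit FlagReset(bool& flag) : fFlag(flag) { fFlag = true; }
        ~FlagReset()
        {
            assert(fFlag);   // nothing else should have cleared the flag
            fFlag = false;
        }
        bool& fFlag;
    };

    bool fParseInProgress;
};
```

A call such as parser.parse([](GuardedParser& self) { self.parse([](GuardedParser&) {}); }) would throw from the inner call in this sketch, and the flag is cleared again once the outer call unwinds.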

Copyright © 2000 The Apache Software Foundation. All Rights Reserved.