public final class PdfReader
extends java.lang.Object
PdfManager
. This class is synchronized.

Modifier and Type | Class and Description |
---|---|
protected class | PdfReader.ArrayEnd: A placeholder used by the PDF parser to mark the end of an array. |
protected class | PdfReader.DictionaryEnd: A placeholder used by the PDF parser to mark the end of a dictionary. |
protected class | PdfReader.DictionaryEndStream: A placeholder used by the PDF parser to mark the end of a dictionary that is also followed by a stream. |
protected class | PdfReader.ParserObject: The superclass of inner classes used by this PdfReader to mark positions while parsing PDF objects. |

Modifier and Type | Field and Description |
---|---|
protected static java.util.regex.Pattern | _patHeader: The regular expression that matches a PDF header. |
protected static java.util.regex.Pattern | _patObjIntro: The regular expression that matches the beginning of an indirect object (specifically, the object number and generation number followed by "obj"). |
protected static java.util.regex.Pattern | _patPdfObject: The regular expression that matches a PDF (direct) object. |
protected static java.util.regex.Pattern | _patStartxref: The regular expression that matches a startxref section. |
protected static java.util.regex.Pattern | _patXref: The regular expression that matches the beginning of an xref section (specifically, the "xref" keyword). |
protected static java.util.regex.Pattern | _patXrefEof: The regular expression that matches an entire xref table section, including the "trailer" keyword. |
protected static java.util.regex.Pattern | _patXrefSub: The regular expression that matches the introduction to a subsection of an xref section (specifically, an integer pair) or the "trailer" keyword. |
protected static java.util.regex.Pattern | _patXrefTable: The regular expression that matches an entire xref table section, including the "trailer" keyword. |
protected PdfInput | _pdfInput |
protected static PdfName | PDFNAME_LENGTH: A PdfName object representing the name Length. |
protected static PdfName | PDFNAME_PREV: A PdfName object representing the name Prev. |
protected static PdfName | PDFNAME_SIZE: A PdfName object representing the name Size. |
protected static java.lang.String | REGEX_ANY_CHAR: The regular expression that matches literally any character. |
protected static java.lang.String | REGEX_COMMENT: The regular expression that matches a comment in PDF. |
protected static java.lang.String | REGEX_DELIMITER: The regular expression that matches a delimiter in PDF. |
protected static java.lang.String | REGEX_EOL: The regular expression that matches an end-of-line (EOL) marker in PDF. |
protected static java.lang.String | REGEX_REGULAR: The regular expression that matches a regular character in PDF. |
protected static java.lang.String | REGEX_STOP: The regular expression that matches a white-space or delimiter (stopping syntactic entities) in PDF. |
protected static java.lang.String | REGEX_WHITESPACE: The regular expression that matches general white-space in PDF. |
protected static int | STARTXREF_RETRY_COUNT: Number of times to try scanning for startxref. |
protected static int | STARTXREF_RETRY_SCAN: The number of bytes from the end of a PDF document at which to start scanning for startxref. |
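
The REGEX_* constants describe PDF's lexical character classes, but their values are not shown in this reference. The sketch below is one plausible set of definitions based on the character classes in the PDF specification. The class name PdfLexSketch, the constant values, and the helper methods are assumptions for illustration and are not taken from this library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-ins for the REGEX_* constants (values assumed, not the library's). */
final class PdfLexSketch {
    // White-space: NUL, HT, LF, FF, CR, SP.
    static final String WHITESPACE = "[\\x00\\t\\n\\f\\r ]";
    // Delimiters: ( ) < > [ ] { } / %
    static final String DELIMITER = "[()<>\\[\\]{}/%]";
    // Regular characters: anything that is neither white-space nor a delimiter.
    static final String REGULAR = "[^\\x00\\t\\n\\f\\r ()<>\\[\\]{}/%]";
    // End-of-line marker: CR LF, a lone CR, or a lone LF.
    static final String EOL = "(?:\\r\\n|\\r|\\n)";
    // Comment: '%' up to, but not including, the end of the line.
    static final String COMMENT = "%[^\\r\\n]*";

    /** True if c belongs to the regular character class. */
    static boolean isRegular(char c) {
        return Pattern.matches(REGULAR, String.valueOf(c));
    }

    /** Returns the leading run of regular characters, or "" if there is none. */
    static String leadingToken(String s) {
        Matcher m = Pattern.compile("^" + REGULAR + "+").matcher(s);
        return m.find() ? m.group() : "";
    }
}
```

A "stop" class in the sense of REGEX_STOP could then be formed as the union of the white-space and delimiter classes, since either one ends a token.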

Constructor and Description |
---|
PdfReader(PdfInput pdfInput): Creates a reader for a PDF document to be read from a PdfInput source. |

Modifier and Type | Method and Description |
---|---|
void | close(): Closes the PDF document and releases any system resources associated with it. |
PdfInput | getInput(): Returns the PdfInput instance associated with this document. |
protected PdfInput | getPdfInput() |
protected PdfObject | parseObject(long start, long end, java.nio.CharBuffer cbuf, XrefTable xt): Parses and returns a PDF object from the input source. |
java.lang.String | readHeader(): Reads the header of the PDF document. |
PdfObject | readObject(long start, long end, boolean indirect, XrefTable xt): Reads a PDF object from the document. |
protected XrefTable | readPartialXrefTable(XrefTable xt, long startxref, long[] prev): Reads an individual (partial) cross-reference table and trailer dictionary from the PDF document. |
long | readStartxref(): Reads the startxref value from the PDF document. |
XrefTable | readXrefTable(long startxref): Reads and compiles all cross-reference tables and trailer dictionaries from the PDF document beginning at a specified position. |

protected PdfInput _pdfInput
protected static java.util.regex.Pattern _patHeader
protected static final java.util.regex.Pattern _patObjIntro
protected static final java.util.regex.Pattern _patPdfObject
protected static final java.util.regex.Pattern _patStartxref
protected static final java.util.regex.Pattern _patXref
protected static final java.util.regex.Pattern _patXrefSub
protected static final java.util.regex.Pattern _patXrefTable
protected static final java.util.regex.Pattern _patXrefEof
protected static final PdfName PDFNAME_LENGTH
A PdfName object representing the name Length.
protected static final PdfName PDFNAME_PREV
A PdfName object representing the name Prev.
protected static final PdfName PDFNAME_SIZE
A PdfName object representing the name Size.
protected static final java.lang.String REGEX_ANY_CHAR
protected static final java.lang.String REGEX_COMMENT
protected static final java.lang.String REGEX_DELIMITER
protected static final java.lang.String REGEX_EOL
protected static final java.lang.String REGEX_REGULAR
protected static final java.lang.String REGEX_STOP
protected static final java.lang.String REGEX_WHITESPACE
protected static final int STARTXREF_RETRY_COUNT
protected static final int STARTXREF_RETRY_SCAN
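
public PdfReader(PdfInput pdfInput)
Creates a reader for a PDF document to be read from a PdfInput source.
Parameters:
pdfInput - the source to read the PDF document from.

public PdfInput getInput()
Returns the PdfInput instance associated with this document.

protected PdfInput getPdfInput()

public void close() throws java.io.IOException
Closes the PDF document and releases any system resources associated with it.
Throws:
java.io.IOException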

protected PdfObject parseObject(long start, long end, java.nio.CharBuffer cbuf, XrefTable xt) throws java.io.IOException, PdfFormatException
Parses and returns a PDF object from the input source. [...] PdfReaderFilter. It is possible for this method to return null if the filtering method discards all objects. This method is intended to be called from readObject(), which advances the buffer position past the introduction if the object is indirect.
Parameters:
start - the offset where the object starts.
end - the offset where the object ends.
cbuf - the character buffer cached from readObject().
xt - the cross-reference table; used for resolving indirect references.
Throws:
PdfFormatException
java.io.IOException

protected XrefTable readPartialXrefTable(XrefTable xt, long startxref, long[] prev) throws java.io.IOException, PdfFormatException
Reads an individual (partial) cross-reference table and trailer dictionary from the PDF document. [...] PdfReaderFilter. This method should be made public.
Parameters:
xrefTrailer - an existing xrefTrailer object to add data to; assumed to be the "subsequent" to the new XrefTrailer that is to be read. Only non-existing entries are modified. The trailer is not modified.
startxref - the xref start position.
filter - the filter.
prev - the current Prev offset.
Throws:
java.io.IOException
PdfFormatException
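
The rule that only non-existing entries are modified can be illustrated with a small, self-contained sketch. It parses the text of one classic xref section (subsections introduced by an integer pair, ended by the "trailer" keyword) and adds entries to an existing map only where no entry is present yet. XrefMergeSketch, the map-based table, and the FREE marker are hypothetical stand-ins, not the library's XrefTable.

```java
import java.util.Map;
import java.util.Scanner;

final class XrefMergeSketch {
    /** Marker stored for free ("f") entries so they still shadow older entries. */
    static final long FREE = -1L;

    /**
     * Parses one xref section (the text after the "xref" keyword, up to and
     * including "trailer") and adds its entries to table. Entries already
     * present are left untouched, because they came from a newer section.
     */
    static void mergeSection(Map<Integer, Long> table, String section) {
        Scanner in = new Scanner(section);
        while (in.hasNext()) {
            if (!in.hasNextInt()) {
                // Anything that is not a subsection header must be "trailer".
                if (!"trailer".equals(in.next())) {
                    throw new IllegalArgumentException("malformed xref section");
                }
                return;
            }
            int first = in.nextInt();   // first object number of the subsection
            int count = in.nextInt();   // number of entries that follow
            for (int i = 0; i < count; i++) {
                long offset = in.nextLong();
                in.nextInt();           // generation number, unused in this sketch
                boolean inUse = "n".equals(in.next());
                table.putIfAbsent(first + i, inUse ? offset : FREE);
            }
        }
    }
}
```

Using putIfAbsent means the caller can load the newest section first and then merge progressively older ones without overwriting newer offsets.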

public java.lang.String readHeader() throws java.io.IOException, PdfException
Reads the header of the PDF document.
Throws:
java.io.IOException
PdfException
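
For orientation, a PDF header is a comment line of the form %PDF-1.7 at the start of the file. The sketch below shows one way such a line could be matched with java.util.regex. The pattern is an assumption (the actual value of _patHeader is not shown in this reference), and HeaderSketch and readVersion are hypothetical names.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HeaderSketch {
    // Assumed pattern: "%PDF-" followed by a major.minor version number.
    static final Pattern HEADER = Pattern.compile("%PDF-(\\d+\\.\\d+)");

    /** Returns the version (for example "1.7") if the text starts with a PDF header. */
    static Optional<String> readVersion(CharSequence head) {
        Matcher m = HEADER.matcher(head);
        // lookingAt() anchors the match at the start without requiring a full match.
        return m.lookingAt() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

Where this sketch returns an empty Optional, readHeader() declares PdfException, which presumably covers the case of a missing or malformed header.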

public PdfObject readObject(long start, long end, boolean indirect, XrefTable xt) throws java.io.IOException, PdfFormatException
Reads a PDF object from the document. [...] PdfReaderFilter. It is possible for this method to return null if the filtering method discards all objects.
Parameters:
start - the offset where the object starts.
end - the offset where the object ends.
indirect - true if the object is preceded by the object number, generation, and "obj".
xt - the PDF document's cross-reference table.
filter - the object filter.
Throws:
java.io.IOException
PdfFormatException
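
The indirect flag says whether the object text begins with an introduction such as "12 0 obj". A minimal sketch of skipping that introduction follows. The pattern is an assumed equivalent of _patObjIntro, and ObjIntroSketch and skipIntro are hypothetical. Note that Java's \s is only an approximation of PDF white-space (it does not include NUL).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ObjIntroSketch {
    // Assumed pattern: object number, generation number, then the "obj" keyword.
    static final Pattern OBJ_INTRO = Pattern.compile("\\s*(\\d+)\\s+(\\d+)\\s+obj");

    /**
     * Returns the index just past the "N G obj" introduction when indirect is
     * true, or 0 when the object is direct. Throws if the introduction is missing.
     */
    static int skipIntro(CharSequence buf, boolean indirect) {
        if (!indirect) {
            return 0;
        }
        Matcher m = OBJ_INTRO.matcher(buf);
        if (!m.lookingAt()) {
            throw new IllegalArgumentException("expected \"N G obj\" introduction");
        }
        return m.end();
    }
}
```

Parsing of the object body would then start at the returned index, which matches the description of parseObject() being called after the buffer position has moved past the introduction.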

public long readStartxref() throws java.io.IOException, PdfFormatException
Reads the startxref value from the PDF document.
Throws:
java.io.IOException
PdfFormatException
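
The two STARTXREF_RETRY_* fields suggest that the reader scans a window at the end of the file and retries if the keyword is not found. The sketch below shows one way that could work on an in-memory byte array. The constant values, the window-doubling strategy, and the names StartxrefSketch and findStartxref are all assumptions, since this reference does not describe the actual retry logic.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class StartxrefSketch {
    // Hypothetical values; the library's actual constants are not shown here.
    static final int RETRY_COUNT = 3;
    static final int RETRY_SCAN = 64;

    // "startxref", white-space, then the decimal byte offset of the last xref section.
    static final Pattern STARTXREF = Pattern.compile("startxref\\s+(\\d+)");

    /** Returns the last startxref offset found near the end of the file, or -1. */
    static long findStartxref(byte[] pdf) {
        int window = RETRY_SCAN;
        for (int attempt = 0; attempt < RETRY_COUNT; attempt++) {
            int from = Math.max(0, pdf.length - window);
            // ISO-8859-1 maps every byte to exactly one char, so offsets are preserved.
            String tail = new String(pdf, from, pdf.length - from, StandardCharsets.ISO_8859_1);
            Matcher m = STARTXREF.matcher(tail);
            long found = -1;
            while (m.find()) {
                found = Long.parseLong(m.group(1)); // keep the last match
            }
            if (found >= 0) {
                return found;
            }
            if (from == 0) {
                break; // the whole file has already been scanned
            }
            window *= 2; // assumed strategy: widen the window and retry
        }
        return -1;
    }
}
```

Keeping the last match rather than the first matters for incrementally updated files, where several startxref keywords can appear and the final one points at the newest xref section.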
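
public XrefTable readXrefTable(long startxref) throws java.io.IOException, PdfFormatException
Reads and compiles all cross-reference tables and trailer dictionaries from the PDF document beginning at a specified position. [...] PdfReaderFilter.
Parameters:
startxref - the xref start position.
filter - the filter.
Throws:
java.io.IOException
PdfFormatException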
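
Compiling all cross-reference tables amounts to starting at the newest section and following each trailer's Prev offset back to older sections. The sketch below models that walk over in-memory data. PrevChainSketch, Section, and the map that stands in for the file are hypothetical, and the loop guard is an added precaution rather than documented behavior of this class.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class PrevChainSketch {
    /** Hypothetical stand-in for one parsed xref section plus its trailer's Prev value. */
    static final class Section {
        final Map<Integer, Long> offsets;
        final long prev; // offset of the previous section, or -1 if there is none

        Section(Map<Integer, Long> offsets, long prev) {
            this.offsets = offsets;
            this.prev = prev;
        }
    }

    /**
     * Compiles one table by walking from the newest section back through Prev.
     * sectionsByOffset plays the role of the file: it maps a byte offset to the
     * section that would be parsed at that position.
     */
    static Map<Integer, Long> compile(Map<Long, Section> sectionsByOffset, long startxref) {
        Map<Integer, Long> table = new HashMap<>();
        Set<Long> visited = new HashSet<>();
        long pos = startxref;
        while (pos >= 0) {
            if (!visited.add(pos)) {
                throw new IllegalStateException("Prev chain loops back to offset " + pos);
            }
            Section s = sectionsByOffset.get(pos);
            if (s == null) {
                throw new IllegalStateException("no xref section at offset " + pos);
            }
            // Newer sections were merged first, so only missing entries are added.
            s.offsets.forEach(table::putIfAbsent);
            pos = s.prev;
        }
        return table;
    }
}
```

With a newer section at offset 1000 that redefines object 2 and points back to an older section at offset 200, the compiled table keeps the newer offset for object 2 and takes the remaining objects from the older section.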