org.htmlparser.lexer

Class Source

public abstract class Source extends Reader implements Serializable

A buffered source of characters. A Source is very similar to a Reader, like:
 new InputStreamReader (connection.getInputStream (), charset)
 
It differs from the above in three ways:
Field Summary
static int EOF
Return value when the source is exhausted.
Method Summary
abstract int available()
Get the number of available characters.
abstract void close()
Does nothing.
abstract void destroy()
Close the source.
abstract char getCharacter(int offset)
Retrieve a character again.
abstract void getCharacters(char[] array, int offset, int start, int end)
Retrieve characters again.
abstract void getCharacters(StringBuffer buffer, int offset, int length)
Append characters already read into a StringBuffer.
abstract String getEncoding()
Get the encoding being used to convert characters.
abstract String getString(int offset, int length)
Retrieve a string comprised of characters already read.
abstract void mark(int readAheadLimit)
Mark the present position.
abstract boolean markSupported()
Tell whether this source supports the mark() operation.
abstract int offset()
Get the position (in characters).
abstract int read()
Read a single character.
abstract int read(char[] cbuf, int off, int len)
Read characters into a portion of an array.
abstract int read(char[] cbuf)
Read characters into an array.
abstract boolean ready()
Tell whether this source is ready to be read.
abstract void reset()
Reset the source.
abstract void setEncoding(String character_set)
Set the encoding to the given character set.
abstract long skip(long n)
Skip characters.
abstract void unread()
Undo the read of a single character.

Field Detail

EOF

public static final int EOF
Return value when the source is exhausted. Has a value of -1.

Method Detail

available

public abstract int available()
Get the number of available characters.

Returns: The number of characters that can be read without blocking.

close

public abstract void close()
Does nothing. It's supposed to close the source, but use destroy() instead.

Throws: IOException Not used.

See Also: destroy()

destroy

public abstract void destroy()
Close the source. Once a source has been closed, further read, ready, mark, reset, skip, unread, getCharacter or getString invocations will throw an IOException. Closing a previously-closed source, however, has no effect.

Throws: IOException If an I/O error occurs.

getCharacter

public abstract char getCharacter(int offset)
Retrieve a character again.

Parameters: offset The offset of the character.

Returns: The character at offset.

Throws: IOException If the source is closed or the offset is beyond offset().
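One way to picture "retrieve a character again" is a reader that retains every character it has handed out. The sketch below is a hypothetical stand-in, not the library's implementation: the class name RetainingSource and its internals are assumptions made for illustration, and it assumes that an offset at or past the current read position is rejected with an IOException.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** Hypothetical illustration only; not the library's Source implementation. */
public class RetainingSource {
    private final Reader in;
    // Every character read so far, kept so it can be retrieved again.
    private final StringBuilder seen = new StringBuilder();

    public RetainingSource(Reader in) {
        this.in = in;
    }

    /** Read one character and remember it; returns -1 when the input is exhausted. */
    public int read() throws IOException {
        int c = in.read();
        if (c != -1) {
            seen.append((char) c);
        }
        return c;
    }

    /** Number of characters read so far. */
    public int offset() {
        return seen.length();
    }

    /** Return a character that was already read (assumption: unread offsets are rejected). */
    public char getCharacter(int offset) throws IOException {
        if (offset < 0 || offset >= seen.length()) {
            throw new IOException("offset " + offset + " is beyond " + seen.length());
        }
        return seen.charAt(offset);
    }

    public static void main(String[] args) throws IOException {
        RetainingSource source = new RetainingSource(new StringReader("<p>"));
        source.read();
        source.read();
        System.out.println(source.getCharacter(0)); // the first character, retrieved again
        System.out.println(source.offset());        // two characters have been read
    }
}
```

Keeping the characters in memory is what makes the lookup cheap; the trade-off is that the buffer grows with the amount of input consumed.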

getCharacters

public abstract void getCharacters(char[] array, int offset, int start, int end)
Retrieve characters again.

Parameters: array The array of characters. offset The starting position in the array where characters are to be placed. start The starting position, zero based. end The ending position (exclusive, i.e. the character at the ending position is not included), zero based.

Throws: IOException If the source is closed or the start or end is beyond offset().

getCharacters

public abstract void getCharacters(StringBuffer buffer, int offset, int length)
Append characters already read into a StringBuffer.

Parameters: buffer The buffer to append to. offset The offset of the first character. length The number of characters to retrieve.

Throws: IOException If the source is closed or the offset or (offset + length) is beyond offset().

getEncoding

public abstract String getEncoding()
Get the encoding being used to convert characters.

Returns: The current encoding.

getString

public abstract String getString(int offset, int length)
Retrieve a string comprised of characters already read.

Parameters: offset The offset of the first character. length The number of characters to retrieve.

Returns: A string containing length characters starting at offset.

Throws: IOException If the source is closed.

mark

public abstract void mark(int readAheadLimit)
Mark the present position. Subsequent calls to reset() will attempt to reposition the source to this point. Not all sources support the mark() operation.

Parameters: readAheadLimit The minimum number of characters that can be read before this mark becomes invalid.

Throws: IOException If an I/O error occurs.

markSupported

public abstract boolean markSupported()
Tell whether this source supports the mark() operation.

Returns: true if and only if this source supports the mark operation.

offset

public abstract int offset()
Get the position (in characters).

Returns: The number of characters that have already been read, or EOF if the source is closed.

read

public abstract int read()
Read a single character. This method will block until a character is available, an I/O error occurs, or the source is exhausted.

Returns: The character read, as an integer in the range 0 to 65535 (0x00-0xffff), or EOF if the source is exhausted.

Throws: IOException If an I/O error occurs.
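Because Source extends Reader, the usual read-until-exhausted loop applies. The sketch below uses a java.io.StringReader as a stand-in for a concrete Source and assumes the end-of-input sentinel is -1, as it is for java.io.Reader; the class name ReadLoopSketch and the helper drain are hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class ReadLoopSketch {
    // Assumed to match Source.EOF; java.io.Reader also signals exhaustion with -1.
    static final int EOF = -1;

    /** Read single characters until the reader reports that it is exhausted. */
    static String drain(Reader source) throws IOException {
        StringBuilder out = new StringBuilder();
        int c;
        while ((c = source.read()) != EOF) {
            out.append((char) c); // safe cast: read() returns 0 to 65535 before EOF
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(drain(new StringReader("<html></html>")));
    }
}
```

The return type is int rather than char precisely so that the sentinel cannot collide with a legitimate character value.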

read

public abstract int read(char[] cbuf, int off, int len)
Read characters into a portion of an array. This method will block until some input is available, an I/O error occurs, or the source is exhausted.

Parameters: cbuf Destination buffer. off Offset at which to start storing characters. len Maximum number of characters to read.

Returns: The number of characters read, or EOF if the source is exhausted.

Throws: IOException If an I/O error occurs.

read

public abstract int read(char[] cbuf)
Read characters into an array. This method will block until some input is available, an I/O error occurs, or the source is exhausted.

Parameters: cbuf Destination buffer.

Returns: The number of characters read, or EOF if the source is exhausted.

Throws: IOException If an I/O error occurs.

ready

public abstract boolean ready()
Tell whether this source is ready to be read.

Returns: true if the next read() is guaranteed not to block for input, false otherwise. Note that returning false does not guarantee that the next read will block.

Throws: IOException If an I/O error occurs.

reset

public abstract void reset()
Reset the source. Repositions the read point to begin at zero.

setEncoding

public abstract void setEncoding(String character_set)
Set the encoding to the given character set. If the current encoding is the same as the requested encoding, this method is a no-op. Otherwise any subsequent characters read from this source will have been decoded using the given character set.

If characters have already been consumed from this source, an exception is expected to be thrown if the characters read so far would have been different had the new encoding been used from the start.

Parameters: character_set The character set to use to convert characters.

Throws: ParserException If a character mismatch occurs between characters already provided and those that would have been returned had the new character set been in effect from the beginning. An exception is also thrown if the character set is not recognized.
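The mismatch condition can be illustrated with plain JDK classes: decode the same raw bytes under the old and the new character set and compare the characters consumed so far. This is only a sketch of the idea, not the library's algorithm; the class name EncodingSwitchSketch, the helper prefixUnchanged and its parameters are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingSwitchSketch {
    /**
     * Returns true if the first `consumed` characters are identical when the
     * raw bytes are decoded with either character set.
     */
    static boolean prefixUnchanged(byte[] raw, int consumed, Charset oldCs, Charset newCs) {
        String before = new String(raw, oldCs);
        String after = new String(raw, newCs);
        if (consumed > before.length() || consumed > after.length()) {
            return false; // one decoding does not even yield that many characters
        }
        return before.regionMatches(0, after, 0, consumed);
    }

    public static void main(String[] args) {
        // "<html>" followed by e-acute, encoded as a single ISO-8859-1 byte (0xE9).
        byte[] raw = "<html>\u00e9".getBytes(StandardCharsets.ISO_8859_1);

        // Only ASCII consumed so far: both decodings agree, so a switch would be safe.
        System.out.println(prefixUnchanged(raw, 6,
                StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8));

        // The accented character has been consumed: the decodings disagree.
        System.out.println(prefixUnchanged(raw, 7,
                StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8));
    }
}
```

In practice this is why switching encodings early in a document, while only ASCII markup has been read, tends to succeed, whereas switching after non-ASCII text has been consumed may not.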

skip

public abstract long skip(long n)
Skip characters. This method will block until some characters are available, an I/O error occurs, or the source is exhausted. Note: n is treated as an int.

Parameters: n The number of characters to skip.

Returns: The number of characters actually skipped.

Throws: IOException If an I/O error occurs.

unread

public abstract void unread()
Undo the read of a single character.

Throws: IOException If the source is closed or no characters have been read.
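A common use of a no-argument unread() is one-character lookahead: read a character, inspect it, and push the read position back. The sketch below is a hypothetical in-memory model, not the library's implementation; the class name UnreadSketch and the peek helper are assumptions for illustration.

```java
import java.io.IOException;

/** Hypothetical illustration only; not the library's Source implementation. */
public class UnreadSketch {
    private final String text; // stands in for the already-buffered characters
    private int position;      // index of the next character to read

    public UnreadSketch(String text) {
        this.text = text;
    }

    /** Read one character, or -1 when the text is exhausted. */
    public int read() {
        return position < text.length() ? text.charAt(position++) : -1;
    }

    /** Step back one character; fails if nothing has been read yet. */
    public void unread() throws IOException {
        if (position == 0) {
            throw new IOException("no characters have been read");
        }
        position--;
    }

    /** Look at the next character without consuming it. */
    public int peek() throws IOException {
        int c = read();
        if (c != -1) {
            unread();
        }
        return c;
    }

    public static void main(String[] args) throws IOException {
        UnreadSketch source = new UnreadSketch("<a>");
        System.out.println((char) source.peek()); // looks at the first character
        System.out.println((char) source.read()); // still returns that same character
    }
}
```

Because the characters are buffered, unread() needs no argument: it only moves the read position, unlike java.io.PushbackReader, which must be handed the character to push back.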

HTML Parser is an open source library released under LGPL.