org.w3c.tidy
Class StreamInImpl

java.lang.Object
  extended by org.w3c.tidy.StreamInImpl
All Implemented Interfaces:
StreamIn

public class StreamInImpl
extends java.lang.Object
implements StreamIn

Input stream implementation. This implementation is ported from the C version of Tidy and does not take advantage of Java readers.

Version:
$Revision: 1.28 $ ($Author: fgiust $)
Author:
Dave Raggett dsr@w3.org, Andy Quick ac.quick@sympatico.ca (translation to Java), Fabrizio Giustina

Field Summary
private  int bufpos
          current position in buffer.
private  int[] charbuf
          character buffer.
private static int CHARBUF_SIZE
          number of characters kept in buffer.
private  int curcol
          current column number.
private  int curline
          current line number.
private  int encoding
          Encoding.
private  boolean endOfStream
          has end of stream been reached?
private  EncodingUtils.GetBytes getBytes
          Getter.
private  int lastcol
          last column.
private  Lexer lexer
          needed for error reporting.
private  boolean lookingForBOM
          looking for a UTF BOM?
private  boolean pushed
           
private  int rawBufpos
          current position in rawBytebuf.
private  char[] rawBytebuf
          Private unget buffer for the raw bytes read from the input stream.
private  boolean rawOut
          Avoid mapping values > 127 to entities.
private  boolean rawPushed
          has a raw byte been pushed onto the stack?
private  int state
          FSM for ISO2022.
private  java.io.InputStream stream
          input stream.
private  int tabs
           
private  int tabsize
          tab size in chars.
 
Fields inherited from interface org.w3c.tidy.StreamIn
END_OF_STREAM
 
Constructor Summary
StreamInImpl(java.io.InputStream stream, Configuration configuration)
          Instantiates a new StreamInImpl.
 
Method Summary
 int getCurcol()
          Getter for curcol.
 int getCurline()
          Getter for curline.
 boolean isEndOfStream()
          Has end of stream been reached?
 int readChar()
          Read a char.
 int readCharFromStream()
          Reads a char from the stream.
protected  void readRawBytesFromStream(int[] buf, int[] count, boolean unget)
          Read raw bytes from the stream, reporting a count <= 0 on EOF; or, if "unget" is true, unget the bytes to re-synchronize the input stream. Normally UTF-8 successor bytes are read using this routine.
 void setLexer(Lexer lexer)
          Setter for lexer.
 void ungetChar(int c)
          Unget a char.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

CHARBUF_SIZE

private static final int CHARBUF_SIZE
number of characters kept in buffer.

See Also:
Constant Field Values

lexer

private Lexer lexer
needed for error reporting.


charbuf

private int[] charbuf
character buffer.


bufpos

private int bufpos
current position in buffer.


rawBytebuf

private char[] rawBytebuf
Private unget buffer for the raw bytes read from the input stream. Normally this will only be used by the UTF-8 decoder to resynchronize the input stream after finding an illegal UTF-8 sequence. But it can be used for other purposes when reading bytes in readCharFromStream.
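
The idea of a raw-byte unget buffer can be illustrated with a minimal sketch. RawUngetBuffer, its fixed capacity of 8, and its method names are hypothetical and are not taken from the JTidy source; the sketch only shows the general last-in, first-out pushback technique in front of an InputStream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical helper, not part of JTidy: a small LIFO unget buffer for raw bytes. */
class RawUngetBuffer {
    private final InputStream in;
    private final int[] stack = new int[8]; // capacity is an arbitrary choice for this sketch
    private int top = 0;                    // number of bytes currently pushed back

    RawUngetBuffer(InputStream in) {
        this.in = in;
    }

    /** Returns the most recently ungot byte if any, otherwise the next stream byte (-1 on EOF). */
    int read() {
        if (top > 0) {
            return stack[--top];
        }
        try {
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Pushes a byte back so that the next read() returns it. */
    void unget(int b) {
        if (top == stack.length) {
            throw new IllegalStateException("unget buffer full");
        }
        stack[top++] = b;
    }
}
```

A decoder that reads a byte, finds it does not continue the current sequence, and calls unget(b) will see the same byte again on its next read(), which is what lets it restart decoding at that position.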


rawBufpos

private int rawBufpos
current position in rawBytebuf.


rawPushed

private boolean rawPushed
has a raw byte been pushed onto the stack?


lookingForBOM

private boolean lookingForBOM
looking for a UTF BOM?
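
For readers unfamiliar with byte order marks, the following sketch shows what a BOM check at the start of a stream can look like. BomSniffer and detect are hypothetical names, and this is not the logic used by this class; the sketch recognizes only the UTF-8 and UTF-16 marks and ignores UTF-32.

```java
/** Hypothetical helper, not part of JTidy: recognizes a UTF-8 or UTF-16 byte order mark. */
class BomSniffer {
    /** Returns "UTF-8", "UTF-16BE", "UTF-16LE", or null if the leading bytes are not a known BOM. */
    static String detect(byte[] head) {
        // UTF-8 BOM: EF BB BF
        if (head.length >= 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (head.length >= 2) {
            int b0 = head[0] & 0xFF;
            int b1 = head[1] & 0xFF;
            if (b0 == 0xFE && b1 == 0xFF) {
                return "UTF-16BE"; // big-endian BOM: FE FF
            }
            if (b0 == 0xFF && b1 == 0xFE) {
                return "UTF-16LE"; // little-endian BOM: FF FE
            }
        }
        return null;
    }
}
```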


endOfStream

private boolean endOfStream
has end of stream been reached?


pushed

private boolean pushed

tabs

private int tabs

tabsize

private int tabsize
tab size in chars.


state

private int state
FSM for ISO2022.


encoding

private int encoding
Encoding.


curcol

private int curcol
current column number.


lastcol

private int lastcol
last column.


curline

private int curline
current line number.


stream

private java.io.InputStream stream
input stream.


getBytes

private EncodingUtils.GetBytes getBytes
Getter.


rawOut

private boolean rawOut
Avoid mapping values > 127 to entities.

Constructor Detail

StreamInImpl

public StreamInImpl(java.io.InputStream stream,
                    Configuration configuration)
Instantiates a new StreamInImpl.

Parameters:
stream - input stream
configuration - Configuration
Method Detail

getCurcol

public int getCurcol()
Description copied from interface: StreamIn
Getter for curcol.

Specified by:
getCurcol in interface StreamIn
Returns:
Returns the curcol.
See Also:
StreamIn.getCurcol()

getCurline

public int getCurline()
Description copied from interface: StreamIn
Getter for curline.

Specified by:
getCurline in interface StreamIn
Returns:
Returns the curline.
See Also:
StreamIn.getCurline()

setLexer

public void setLexer(Lexer lexer)
Setter for lexer.

Specified by:
setLexer in interface StreamIn
Parameters:
lexer - The lexer to set.

readChar

public int readChar()
Description copied from interface: StreamIn
Read a char.

Specified by:
readChar in interface StreamIn
Returns:
char
See Also:
StreamIn.readChar()

ungetChar

public void ungetChar(int c)
Description copied from interface: StreamIn
Unget a char.

Specified by:
ungetChar in interface StreamIn
Parameters:
c - char
See Also:
StreamIn.ungetChar(int)
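
The interplay of readChar, ungetChar and the position fields (curline, curcol, lastcol) can be sketched as follows. PositionTrackingReader is a hypothetical class, it reads from a String instead of a stream, and it supports only a single pushed-back character; it is an illustration of one plausible bookkeeping scheme, not the implementation documented here.

```java
/** Hypothetical sketch, not part of JTidy: one-character pushback with line/column tracking. */
class PositionTrackingReader {
    static final int END_OF_STREAM = -1; // assumed sentinel value for this sketch

    private final String text;      // stands in for the decoded input
    private int index = 0;          // next character of text to deliver
    private int pushedChar;         // the single pushed-back character
    private boolean pushed = false; // is pushedChar valid?
    private int curline = 1;
    private int curcol = 1;
    private int lastcol = 1;        // column before the most recent read, restored on unget

    PositionTrackingReader(String text) {
        this.text = text;
    }

    int readChar() {
        int c;
        if (pushed) {
            pushed = false;
            c = pushedChar;
        } else if (index < text.length()) {
            c = text.charAt(index++);
        } else {
            return END_OF_STREAM;
        }
        lastcol = curcol;
        if (c == '\n') {
            curline++;
            curcol = 1;
        } else {
            curcol++;
        }
        return c;
    }

    void ungetChar(int c) {
        pushedChar = c;
        pushed = true;
        if (c == '\n') {
            curline--; // undo the line advance made when the newline was read
        }
        curcol = lastcol;
    }

    int getCurline() {
        return curline;
    }

    int getCurcol() {
        return curcol;
    }
}
```

Remembering the previous column is what makes it possible to unget a newline: without lastcol, the reader would know it has to go back one line but not which column it came from.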

isEndOfStream

public boolean isEndOfStream()
Description copied from interface: StreamIn
Has end of stream been reached?

Specified by:
isEndOfStream in interface StreamIn
Returns:
true if end of stream has been reached
See Also:
StreamIn.isEndOfStream()

readCharFromStream

public int readCharFromStream()
Description copied from interface: StreamIn
Reads a char from the stream.

Specified by:
readCharFromStream in interface StreamIn
Returns:
char
See Also:
StreamIn.readCharFromStream()

readRawBytesFromStream

protected void readRawBytesFromStream(int[] buf,
                                      int[] count,
                                      boolean unget)
Read raw bytes from the stream, reporting a count <= 0 on EOF; or, if "unget" is true, unget the bytes to re-synchronize the input stream. Normally UTF-8 successor bytes are read using this routine.

Parameters:
buf - character buffer
count - number of bytes to read
unget - unget bytes
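
Because the method is void and takes count as an array, the array appears to act as an in/out parameter. The sketch below shows how such a contract can work. RawByteReader and readRawBytes are hypothetical names, the unget capacity of 8 is arbitrary, and the convention of storing the negated number of bytes read in count[0] on EOF is an assumption chosen to match the "<= 0 on EOF" wording above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical sketch, not part of JTidy: read or unget a run of raw bytes. */
class RawByteReader {
    static final int END_OF_STREAM = -1;

    private final InputStream stream;
    private final int[] ungetBuf = new int[8]; // arbitrary capacity for this sketch
    private int ungetPos = 0;

    RawByteReader(InputStream stream) {
        this.stream = stream;
    }

    /**
     * unget == false: reads up to count[0] bytes into buf; if EOF is hit after i bytes,
     * count[0] is set to -i (assumed convention) and the method returns.
     * unget == true: pushes buf[0..count[0]-1] back so they are re-read in original order.
     */
    void readRawBytes(int[] buf, int[] count, boolean unget) {
        int n = count[0];
        for (int i = 0; i < n; i++) {
            if (unget) {
                if (ungetPos == ungetBuf.length) {
                    throw new IllegalStateException("unget buffer full");
                }
                // push in reverse so that buf[0] ends up on top of the stack
                ungetBuf[ungetPos++] = buf[n - i - 1];
            } else {
                int c = nextByte();
                if (c == END_OF_STREAM) {
                    count[0] = -i;
                    return;
                }
                buf[i] = c;
            }
        }
    }

    /** Ungot bytes are served first, then the underlying stream. */
    private int nextByte() {
        if (ungetPos > 0) {
            return ungetBuf[--ungetPos];
        }
        try {
            return stream.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A UTF-8 decoder built on this would request the successor bytes of a multi-byte sequence in one call, and if one of them turns out not to be a valid successor, hand the same buffer back with unget set to true so that decoding can resume from those bytes.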