Class BasicTokenIterator

  • All Implemented Interfaces:
    java.util.Iterator<java.lang.Object>, TokenIterator

    public class BasicTokenIterator
    extends java.lang.Object
    implements TokenIterator
    Basic implementation of a TokenIterator. This implementation parses #token sequences as defined by RFC 2616, section 2. It extends that definition somewhat beyond US-ASCII.
    Since:
    4.0
    • Method Summary

      Modifier and Type Method Description
      protected java.lang.String createToken​(java.lang.String value, int start, int end)
      Creates a new token to be returned.
      protected int findNext​(int pos)
      Determines the next token.
      protected int findTokenEnd​(int from)
      Determines the ending position of the current token.
      protected int findTokenSeparator​(int pos)
      Determines the position of the next token separator.
      protected int findTokenStart​(int pos)
      Determines the starting position of the next token.
      boolean hasNext()
      Indicates whether there is another token in this iteration.
      protected boolean isHttpSeparator​(char ch)
      Checks whether a character is an HTTP separator.
      protected boolean isTokenChar​(char ch)
      Checks whether a character is a valid token character.
      protected boolean isTokenSeparator​(char ch)
      Checks whether a character is a token separator.
      protected boolean isWhitespace​(char ch)
      Checks whether a character is a whitespace character.
      java.lang.Object next()
      Returns the next token.
      java.lang.String nextToken()
      Obtains the next token from this iteration.
      void remove()
      Removing tokens is not supported.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
      • Methods inherited from interface java.util.Iterator

        forEachRemaining
    • Field Detail

      • HTTP_SEPARATORS

        public static final java.lang.String HTTP_SEPARATORS
        The HTTP separator characters. Defined in RFC 2616, section 2.2.
        See Also:
        Constant Field Values
      • headerIt

        protected final HeaderIterator headerIt
        The iterator from which to obtain the next header.
      • currentHeader

        protected java.lang.String currentHeader
        The value of the current header. This is the header value that includes currentToken. Undefined if the iteration is over.
      • currentToken

        protected java.lang.String currentToken
        The token to be returned by the next call to nextToken(). null if the iteration is over.
    • Constructor Detail

      • BasicTokenIterator

        public BasicTokenIterator​(HeaderIterator headerIterator)
        Creates a new instance of BasicTokenIterator.
        Parameters:
        headerIterator - the iterator for the headers to tokenize
    • Method Detail

      • hasNext

        public boolean hasNext()
        Description copied from interface: TokenIterator
        Indicates whether there is another token in this iteration.
        Specified by:
        hasNext in interface java.util.Iterator<java.lang.Object>
        Specified by:
        hasNext in interface TokenIterator
        Returns:
        true if there is another token, false otherwise
      • nextToken

        public java.lang.String nextToken()
                                   throws java.util.NoSuchElementException,
                                          ParseException
        Obtains the next token from this iteration.
        Specified by:
        nextToken in interface TokenIterator
        Returns:
        the next token in this iteration
        Throws:
        java.util.NoSuchElementException - if the iteration is already over
        ParseException - if an invalid header value is encountered
      • next

        public final java.lang.Object next()
                                    throws java.util.NoSuchElementException,
                                           ParseException
        Returns the next token. Same as nextToken(), but with generic return type.
        Specified by:
        next in interface java.util.Iterator<java.lang.Object>
        Returns:
        the next token in this iteration
        Throws:
        java.util.NoSuchElementException - if there are no more tokens
        ParseException - if an invalid header value is encountered
      • remove

        public final void remove()
                          throws java.lang.UnsupportedOperationException
        Removing tokens is not supported.
        Specified by:
        remove in interface java.util.Iterator<java.lang.Object>
        Throws:
        java.lang.UnsupportedOperationException - always
      • findNext

        protected int findNext​(int pos)
                        throws ParseException
        Determines the next token. If found, the token is stored in currentToken. The return value indicates the position after the token in currentHeader. If necessary, the next header will be obtained from headerIt. If not found, currentToken is set to null.
        Parameters:
        pos - the position in the current header at which to start the search, -1 to search in the first header
        Returns:
        the position after the found token in the current header, or negative if there was no next token
        Throws:
        ParseException - if an invalid header value is encountered
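The lookahead scheme described for findNext can be illustrated with a small, self-contained sketch. It is not the library source: the class name TokenIteratorSketch is hypothetical, plain strings stand in for the headers supplied by headerIt, and IllegalArgumentException stands in for ParseException. The token character rule used here (letters and digits, plus anything that is neither a control character nor an HTTP separator) is an assumption about how the definition is extended beyond US-ASCII.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Illustrative stand-in, not the library class: tokenizes plain header value strings. */
final class TokenIteratorSketch implements Iterator<Object> {
    private static final String SEPARATORS = " ,;=()<>@:\\\"/[]?{}\t"; // RFC 2616, section 2.2

    private final Iterator<String> headerValues; // plays the role of headerIt
    private String currentHeader;                // header value currently being scanned
    private String currentToken;                 // lookahead token, null once the iteration is over
    private int searchPos;                       // position after currentToken in currentHeader

    TokenIteratorSketch(List<String> values) {
        this.headerValues = values.iterator();
        this.searchPos = findNext(-1);           // -1: start with the first header value
    }

    @Override public boolean hasNext() { return currentToken != null; }
    @Override public Object next() { return nextToken(); }
    @Override public void remove() { throw new UnsupportedOperationException("read-only"); }

    String nextToken() {
        if (currentToken == null) throw new NoSuchElementException("no more tokens");
        String result = currentToken;
        searchPos = findNext(searchPos);         // look ahead before returning
        return result;
    }

    private int findNext(int pos) {
        int from = pos;
        if (from < 0) {
            if (!headerValues.hasNext()) return -1;
            currentHeader = headerValues.next();
            from = 0;
        } else {
            from = findTokenSeparator(from);     // a comma or the end of the value must follow a token
        }
        int start = findTokenStart(from);
        if (start < 0) { currentToken = null; return -1; }
        int end = start + 1;
        while (end < currentHeader.length() && isTokenChar(currentHeader.charAt(end))) end++;
        currentToken = currentHeader.substring(start, end);
        return end;
    }

    private int findTokenSeparator(int pos) {
        for (int i = pos; i < currentHeader.length(); i++) {
            char ch = currentHeader.charAt(i);
            if (ch == ',') return i;
            if (ch != ' ' && ch != '\t')         // anything but whitespace before the comma is an error
                throw new IllegalArgumentException("Missing separator at " + i + ": " + currentHeader);
        }
        return currentHeader.length();           // the end of the value acts as a separator
    }

    private int findTokenStart(int pos) {
        int from = pos;
        while (currentHeader != null) {
            for (; from < currentHeader.length(); from++) {
                char ch = currentHeader.charAt(from);
                if (ch == ',' || ch == ' ' || ch == '\t') continue;  // skip separators and whitespace
                if (isTokenChar(ch)) return from;
                throw new IllegalArgumentException("Invalid character at " + from + ": " + currentHeader);
            }
            currentHeader = headerValues.hasNext() ? headerValues.next() : null; // move to the next header
            from = 0;
        }
        return -1;
    }

    private static boolean isTokenChar(char ch) {
        if (Character.isLetterOrDigit(ch)) return true;
        return !Character.isISOControl(ch) && SEPARATORS.indexOf(ch) < 0;
    }

    public static void main(String[] args) {
        List<String> tokens = new ArrayList<>();
        TokenIteratorSketch it = new TokenIteratorSketch(Arrays.asList("gzip, chunked", "", " identity "));
        while (it.hasNext()) tokens.add(it.nextToken());
        System.out.println(tokens); // prints [gzip, chunked, identity]
    }
}
```

Because the next token is located before the current one is returned, a value such as "gzip chunked" (no comma) fails in this sketch on the first call to nextToken(), not on the second.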
      • createToken

        protected java.lang.String createToken​(java.lang.String value,
                                               int start,
                                               int end)
        Creates a new token to be returned. Called from findNext after the token is identified. The default implementation simply calls String.substring.

        If header values are significantly longer than tokens, and some tokens are permanently referenced by the application, there can be problems with garbage collection. On Java runtimes where String.substring shares the character array of the original string (releases before Java 7 update 6), a substring keeps the full header value reachable and therefore occupies more memory than might be expected. To avoid this, override this method and create a new string instead of a substring.

        Parameters:
        value - the full header value from which to create a token
        start - the index of the first token character
        end - the index after the last token character
        Returns:
        a string representing the token identified by the arguments
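As a minimal sketch of the copying approach suggested above: the static method below takes the same parameters as createToken and returns a string backed only by the token's own characters. The class name CreateTokenSketch is hypothetical. In a real subclass of BasicTokenIterator, the method body could serve as the body of the override.

```java
/** Illustrative helper, not part of the library. */
final class CreateTokenSketch {
    /** Same parameters as createToken; the result does not share storage with value. */
    static String createToken(String value, int start, int end) {
        char[] chars = new char[end - start];
        value.getChars(start, end, chars, 0);   // copy only the token's characters
        return new String(chars);               // new string built from the copy
    }

    public static void main(String[] args) {
        System.out.println(createToken("gzip, chunked", 6, 13)); // prints chunked
    }
}
```

On current Java runtimes String.substring already copies the characters, so an override of this kind mainly matters on older JVMs.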
      • findTokenStart

        protected int findTokenStart​(int pos)
        Determines the starting position of the next token. This method will iterate over headers if necessary.
        Parameters:
        pos - the position in the current header at which to start the search
        Returns:
        the position of the token start in the current header, negative if no token start could be found
      • findTokenSeparator

        protected int findTokenSeparator​(int pos)
        Determines the position of the next token separator. Because of multi-header joining rules, the end of a header value is a token separator. This method therefore does not need to iterate over headers.
        Parameters:
        pos - the position in the current header at which to start the search
        Returns:
        the position of a token separator in the current header, or at the end
        Throws:
        ParseException - if a new token is found before a token separator. RFC 2616, section 2.1 explicitly requires a comma between the tokens of a # list.
      • findTokenEnd

        protected int findTokenEnd​(int from)
        Determines the ending position of the current token. This method will not leave the current header value, since the end of the header value is a token boundary.
        Parameters:
        from - the position of the first character of the token
        Returns:
        the position after the last character of the token. The behavior is undefined if from does not point to a token character in the current header value.
      • isTokenSeparator

        protected boolean isTokenSeparator​(char ch)
        Checks whether a character is a token separator. RFC 2616, section 2.1 defines comma as the separator for #token sequences. The end of a header value will also separate tokens, but that is not a character check.
        Parameters:
        ch - the character to check
        Returns:
        true if the character is a token separator, false otherwise
      • isWhitespace

        protected boolean isWhitespace​(char ch)
        Checks whether a character is a whitespace character. RFC 2616, section 2.2 defines space and horizontal tab as whitespace. The optional preceding line break is irrelevant, since header continuation is handled transparently when parsing messages.
        Parameters:
        ch - the character to check
        Returns:
        true if the character is whitespace, false otherwise
      • isTokenChar

        protected boolean isTokenChar​(char ch)
        Checks whether a character is a valid token character. Whitespace, control characters, and HTTP separators are not valid token characters. The HTTP specification (RFC 2616, section 2.2) defines tokens only for the US-ASCII character set; this method extends the definition to other character sets.
        Parameters:
        ch - the character to check
        Returns:
        true if the character is a valid token character, false otherwise
      • isHttpSeparator

        protected boolean isHttpSeparator​(char ch)
        Checks whether a character is an HTTP separator. The implementation in this class checks only for the HTTP separators defined in RFC 2616, section 2.2. If you need to detect other separators beyond the US-ASCII character set, override this method.
        Parameters:
        ch - the character to check
        Returns:
        true if the character is an HTTP separator, false otherwise
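The four character checks documented above can be sketched as standalone static methods. This is an illustration under stated assumptions, not the library source: the class name HttpCharChecksSketch is hypothetical, SEPARATORS lists the separators from RFC 2616, section 2.2 and is only assumed to correspond to HTTP_SEPARATORS, and the use of Character.isLetterOrDigit and Character.isISOControl is one plausible way to extend the token definition beyond US-ASCII.

```java
/** Illustrative stand-in for the character checks; not the library source. */
final class HttpCharChecksSketch {
    /** Separators from RFC 2616, section 2.2, including space and horizontal tab. */
    static final String SEPARATORS = " ,;=()<>@:\\\"/[]?{}\t";

    /** Comma is the only separator character between the tokens of a # list. */
    static boolean isTokenSeparator(char ch) { return ch == ','; }

    /** Space and horizontal tab count as whitespace. */
    static boolean isWhitespace(char ch) { return ch == ' ' || ch == '\t'; }

    static boolean isHttpSeparator(char ch) { return SEPARATORS.indexOf(ch) >= 0; }

    static boolean isTokenChar(char ch) {
        if (Character.isLetterOrDigit(ch)) return true;  // assumed extension of ALPHA and DIGIT
        if (Character.isISOControl(ch)) return false;    // control characters, including tab
        return !isHttpSeparator(ch);                     // separators, including space
    }

    public static void main(String[] args) {
        char[] samples = {'a', '7', '-', ',', ' ', '\t', '"', '\u00e9', '\u0007'};
        for (char ch : samples) {
            System.out.printf("U+%04X token=%b separator=%b%n",
                    (int) ch, isTokenChar(ch), isHttpSeparator(ch));
        }
    }
}
```

Under these assumptions a non-ASCII letter such as U+00E9 is accepted as a token character, while whitespace, control characters, and the listed separators are rejected.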