Class UniversalDetector


  • public class UniversalDetector
    extends java.lang.Object
    • Field Detail

      • done

        private boolean done
      • start

        private boolean start
      • gotData

        private boolean gotData
      • onlyPrintableASCII

        private boolean onlyPrintableASCII
      • lastChar

        private byte lastChar
      • detectedCharset

        private java.lang.String detectedCharset
    • Constructor Detail

      • UniversalDetector

        public UniversalDetector()
      • UniversalDetector

        public UniversalDetector(CharsetListener listener)
        Parameters:
        listener - a listener object that is notified of the detected encoding. May be null.
    • Method Detail

      • isDone

        public boolean isDone()
      • getDetectedCharset

        public java.lang.String getDetectedCharset()
        Returns:
        The detected encoding, or null if the detector could not determine which encoding was used.
      • handleData

        public void handleData(byte[] buf)
        Feeds the detector with more data.
        Parameters:
        buf - the buffer containing the data
      • handleData

        public void handleData(byte[] buf,
                               int offset,
                               int length)
        Feeds the detector with more data.
        Parameters:
        buf - the buffer containing the data
        offset - the position in buf at which the data starts
        length - the number of bytes of data to read from buf
      • detectCharsetFromBOM

        public static java.lang.String detectCharsetFromBOM(byte[] buf)
      • detectCharsetFromBOM

        private static java.lang.String detectCharsetFromBOM(byte[] buf,
                                                             int offset)
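The page gives no description for detectCharsetFromBOM, so the following is only a standalone sketch of how byte-order-mark sniffing is commonly done, not the library's implementation. The class name BomSniffer, the method name sniff, and the returned charset names are assumptions chosen for illustration; the strings the library actually returns may differ.

```java
/** Hypothetical helper: recognises the common Unicode byte order marks. */
final class BomSniffer {

    /** Returns a charset name for a BOM at the start of buf, or null if none is found. */
    static String sniff(byte[] buf) {
        if (buf == null) {
            return null;
        }
        // Test the 4-byte marks first: the UTF-32LE BOM (FF FE 00 00) begins
        // with the UTF-16LE BOM (FF FE), so the order of checks matters.
        // (UTF-16LE text whose first character is U+0000 is indistinguishable.)
        if (startsWith(buf, 0xFF, 0xFE, 0x00, 0x00)) return "UTF-32LE";
        if (startsWith(buf, 0x00, 0x00, 0xFE, 0xFF)) return "UTF-32BE";
        if (startsWith(buf, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(buf, 0xFE, 0xFF)) return "UTF-16BE";
        if (startsWith(buf, 0xFF, 0xFE)) return "UTF-16LE";
        return null;
    }

    /** Compares the leading bytes of buf, as unsigned values, against prefix. */
    private static boolean startsWith(byte[] buf, int... prefix) {
        if (buf.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((buf[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Masking each byte with 0xFF matters because Java bytes are signed, so a raw comparison against 0xEF or 0xFF would never match.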
      • dataEnd

        public void dataEnd()
        Marks the end of the data and finishes the calculations.
      • reset

        public final void reset()
        Resets the detector so that it can be used again.
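The instance methods above suggest a calling order: feed chunks with handleData until isDone reports true or the input ends, call dataEnd, read the result, then reset. A minimal sketch of that loop follows. To keep it compilable without the library, it is written against a hypothetical Detector interface that mirrors only the documented method signatures; in real use a UniversalDetector instance would be called the same way. The class name StreamingDetection and the 4096-byte buffer size are likewise assumptions.

```java
import java.io.IOException;
import java.io.InputStream;

final class StreamingDetection {

    /** Hypothetical stand-in exposing only the methods documented on this page. */
    interface Detector {
        void handleData(byte[] buf, int offset, int length);
        boolean isDone();
        void dataEnd();
        String getDetectedCharset();
        void reset();
    }

    /** Feeds the stream to the detector chunk by chunk; returns the charset name or null. */
    static String detect(InputStream in, Detector detector) throws IOException {
        byte[] buf = new byte[4096]; // arbitrary chunk size
        int n;
        // Stop reading as soon as the detector reports that it is done.
        while (!detector.isDone() && (n = in.read(buf)) != -1) {
            detector.handleData(buf, 0, n);
        }
        detector.dataEnd();
        // Read the result before reset(), assuming reset() clears it.
        String charset = detector.getDetectedCharset();
        detector.reset();
        return charset;
    }
}
```

Checking isDone before each read avoids consuming the rest of a large input once the detector has already settled on an answer.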
      • detectCharset

        public static java.lang.String detectCharset(java.io.File file)
                                              throws java.io.IOException
        Gets the charset of a File.
        Parameters:
        file - The file to check charset for
        Returns:
        The charset of the file, or null if it cannot be determined
        Throws:
        java.io.IOException - if an I/O error occurs
      • detectCharset

        public static java.lang.String detectCharset(java.nio.file.Path path)
                                              throws java.io.IOException
        Gets the charset of a Path.
        Parameters:
        path - The path of the file to check the charset for
        Returns:
        The charset of the file, or null if it cannot be determined
        Throws:
        java.io.IOException - if an I/O error occurs
      • detectCharset

        public static java.lang.String detectCharset(java.io.InputStream inputStream)
                                              throws java.io.IOException
        Gets the charset of content from InputStream.
        Parameters:
        inputStream - InputStream containing text file
        Returns:
        The charset of the content, or null if it cannot be determined
        Throws:
        java.io.IOException - if an I/O error occurs
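Because the detectCharset methods return a charset name that may be null, callers usually need a fallback before decoding. The sketch below turns such a name into a java.nio.charset.Charset using only the standard library. The class name CharsetFallback and the method name resolve are hypothetical, and the choice of fallback charset is left to the caller.

```java
import java.nio.charset.Charset;

final class CharsetFallback {

    /**
     * Converts a detected charset name into a Charset.
     * Returns fallback when the name is null, malformed, or not supported by this JVM.
     */
    static Charset resolve(String detectedName, Charset fallback) {
        if (detectedName == null) {
            return fallback; // detection failed
        }
        try {
            return Charset.forName(detectedName);
        } catch (IllegalArgumentException e) {
            // Covers IllegalCharsetNameException and UnsupportedCharsetException,
            // both of which extend IllegalArgumentException.
            return fallback;
        }
    }
}
```

A caller might then write, for example, `new String(bytes, CharsetFallback.resolve(name, StandardCharsets.UTF_8))`, where name is the value returned by one of the detectCharset methods.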