de.l3s.boilerpipe.document
Class TextDocument

java.lang.Object
  extended by de.l3s.boilerpipe.document.TextDocument

public class TextDocument
extends java.lang.Object

A text document, consisting of one or more TextBlocks.

Author:
Christian Kohlschütter

Constructor Summary
TextDocument(java.util.List<TextBlock> textBlocks)
          Creates a new TextDocument with given TextBlocks, and no title.
TextDocument(java.lang.String title, java.util.List<TextBlock> textBlocks)
          Creates a new TextDocument with given TextBlocks and given title.
 
Method Summary
 java.lang.String debugString()
          Returns detailed debugging information about the contained TextBlocks.
 java.lang.String getContent()
          Returns the TextDocument's content.
 java.lang.String getText(boolean includeContent, boolean includeNonContent)
          Returns the TextDocument's content, non-content, or both.
 java.util.List<TextBlock> getTextBlocks()
          Returns the TextBlocks of this document.
 java.lang.String getTitle()
          Returns the "main" title for this document, or null if no such title has been set.
 void setTitle(java.lang.String title)
          Updates the "main" title for this document.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TextDocument

public TextDocument(java.util.List<TextBlock> textBlocks)
Creates a new TextDocument with given TextBlocks, and no title.

Parameters:
textBlocks - The text blocks of this document.

TextDocument

public TextDocument(java.lang.String title,
                    java.util.List<TextBlock> textBlocks)
Creates a new TextDocument with given TextBlocks and given title.

Parameters:
title - The "main" title for this text document.
textBlocks - The text blocks of this document.

Method Detail

getTextBlocks

public java.util.List<TextBlock> getTextBlocks()
Returns the TextBlocks of this document.

Returns:
A list of TextBlocks, in sequential order of appearance.

getTitle

public java.lang.String getTitle()
Returns the "main" title for this document, or null if no such title has been set.

Returns:
The "main" title.

setTitle

public void setTitle(java.lang.String title)
Updates the "main" title for this document.

Parameters:
title - The new "main" title for this document.
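
Because getTitle() may return null, callers usually need a fallback before displaying the title. The following is a minimal sketch of that pattern. TitleSketch, Doc, displayTitle, and the "(untitled)" placeholder are hypothetical stand-ins introduced for illustration; they are not part of boilerpipe.

```java
import java.util.Optional;

public class TitleSketch {

    /** Hypothetical stand-in that models only the title accessors. */
    static final class Doc {
        private String title; // stays null until a title is set

        Doc(String title) {
            this.title = title;
        }

        String getTitle() {
            return title;
        }

        void setTitle(String title) {
            this.title = title;
        }
    }

    /** Returns the title, or a placeholder when no title has been set. */
    static String displayTitle(Doc doc) {
        return Optional.ofNullable(doc.getTitle()).orElse("(untitled)");
    }

    public static void main(String[] args) {
        Doc doc = new Doc(null);               // models the constructor without a title
        System.out.println(displayTitle(doc)); // placeholder, since the title is null
        doc.setTitle("Example page");
        System.out.println(displayTitle(doc)); // the title that was just set
    }
}
```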

getContent

public java.lang.String getContent()
Returns the TextDocument's content.

Returns:
The content text.

getText

public java.lang.String getText(boolean includeContent,
                                boolean includeNonContent)
Returns the TextDocument's content, non-content, or both.

Parameters:
includeContent - Whether to include TextBlocks marked as "content".
includeNonContent - Whether to include TextBlocks marked as "non-content".

Returns:
The text.
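
The two flags select blocks according to whether each block is marked as "content". The sketch below models that selection. GetTextSketch and Block are hypothetical stand-ins, and the newline separator between blocks is an assumption; this is not boilerpipe's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class GetTextSketch {

    /** Hypothetical stand-in for a text block: some text plus a content flag. */
    static final class Block {
        final String text;
        final boolean isContent;

        Block(String text, boolean isContent) {
            this.text = text;
            this.isContent = isContent;
        }
    }

    /**
     * Concatenates the text of the selected blocks in document order.
     * One block per line; the separator is an assumption of this sketch.
     */
    static String getText(List<Block> blocks, boolean includeContent, boolean includeNonContent) {
        StringBuilder sb = new StringBuilder();
        for (Block b : blocks) {
            // A content block is governed by includeContent, any other block by includeNonContent.
            boolean selected = b.isContent ? includeContent : includeNonContent;
            if (selected) {
                sb.append(b.text).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Block> blocks = new ArrayList<>();
        blocks.add(new Block("Site navigation", false));
        blocks.add(new Block("Article body.", true));
        blocks.add(new Block("Copyright footer", false));

        System.out.print(getText(blocks, true, false)); // content blocks only
        System.out.print(getText(blocks, false, true)); // non-content blocks only
        System.out.print(getText(blocks, true, true));  // all blocks, in document order
    }
}
```

Under this model, passing (true, false) yields only the blocks marked as content, which is what getContent() is described as returning; passing (false, false) yields an empty string.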

debugString

public java.lang.String debugString()
Returns detailed debugging information about the contained TextBlocks.

Returns:
Debug information.