Packages that use TextDocument | |
---|---|
de.l3s.boilerpipe | The Boilerpipe top-level package. |
de.l3s.boilerpipe.document | The classes in this package represent the simple Boilerpipe document model. |
de.l3s.boilerpipe.extractors | This package contains some standard extractors (i.e., completely piped BoilerpipeFilters) |
de.l3s.boilerpipe.filters.english | The BoilerpipeFilters in this package have only been tested on English text. |
de.l3s.boilerpipe.filters.heuristics | The BoilerpipeFilters in this package are pure heuristics. |
de.l3s.boilerpipe.filters.simple | The BoilerpipeFilters in this package are straightforward and probably not specific to English. |
de.l3s.boilerpipe.sax | Classes related to parsing and producing HTML from/to Boilerpipe TextDocuments. |
Uses of TextDocument in de.l3s.boilerpipe |
---|
Methods in de.l3s.boilerpipe that return TextDocument | |
---|---|
TextDocument |
BoilerpipeInput.getTextDocument()
Returns a TextDocument; how it is obtained is left to the implementation. |
TextDocument |
BoilerpipeDocumentSource.toTextDocument()
|
Methods in de.l3s.boilerpipe with parameters of type TextDocument | |
---|---|
java.lang.String |
BoilerpipeExtractor.getText(TextDocument doc)
Extracts text from the given TextDocument object. |
boolean |
BoilerpipeFilter.process(TextDocument doc)
Processes the given document doc. |
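
The two signatures above can be illustrated with a minimal, self-contained sketch. It does not use the Boilerpipe classes: `Doc`, `Block`, `Filter`, and `minWords` are hypothetical stand-ins that only mirror the shapes of `TextDocument`, `BoilerpipeFilter.process(TextDocument doc)`, and `BoilerpipeExtractor.getText(TextDocument doc)`. It also assumes that the boolean returned by `process` signals whether the filter changed the document.

```java
import java.util.ArrayList;
import java.util.List;

public class FilterContractSketch {
    // Hypothetical stand-in for a text block: its text plus a content/boilerplate flag.
    static final class Block {
        final String text;
        boolean isContent;
        Block(String text, boolean isContent) {
            this.text = text;
            this.isContent = isContent;
        }
    }

    // Hypothetical stand-in for TextDocument: an ordered list of blocks.
    static final class Doc {
        final List<Block> blocks = new ArrayList<>();

        // Concatenates the text of all blocks currently flagged as content.
        String getContent() {
            StringBuilder sb = new StringBuilder();
            for (Block b : blocks) {
                if (b.isContent) {
                    sb.append(b.text).append('\n');
                }
            }
            return sb.toString();
        }
    }

    // Same shape as BoilerpipeFilter.process(TextDocument doc).
    // Assumption: the return value is true if the document was modified.
    interface Filter {
        boolean process(Doc doc);
    }

    // Hypothetical filter: flags content blocks with fewer than minWords words as boilerplate.
    static Filter minWords(int minWords) {
        return doc -> {
            boolean changed = false;
            for (Block b : doc.blocks) {
                if (b.isContent && b.text.trim().split("\\s+").length < minWords) {
                    b.isContent = false;
                    changed = true;
                }
            }
            return changed;
        };
    }

    // Same shape as BoilerpipeExtractor.getText(TextDocument doc):
    // run the filter, then return whatever is still flagged as content.
    static String getText(Filter filter, Doc doc) {
        filter.process(doc);
        return doc.getContent();
    }

    public static void main(String[] args) {
        Doc doc = new Doc();
        doc.blocks.add(new Block("Home | About", true));
        doc.blocks.add(new Block("This paragraph is long enough to count as article text.", true));
        // Prints only the second block; the first has three tokens and is dropped.
        System.out.print(getText(minWords(5), doc));
    }
}
```

Because every filter in this sketch shares one method shape, filters can be chained in sequence over the same document, which is the pattern the extractor classes listed below appear to follow.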
Uses of TextDocument in de.l3s.boilerpipe.document |
---|
Constructors in de.l3s.boilerpipe.document with parameters of type TextDocument | |
---|---|
TextDocumentStatistics(TextDocument doc, boolean contentOnly)
Computes statistics on a given TextDocument. |
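
As a rough illustration of what a `contentOnly` flag might control, the sketch below computes an average words-per-block figure either over all blocks or over content blocks only. `Block`, `Stats`, and `avgNumWords` here are hypothetical stand-ins written for this example; they mirror the constructor shape above but are not the Boilerpipe implementation, and the actual statistics it computes may differ.

```java
import java.util.Arrays;
import java.util.List;

public class StatsSketch {
    // Hypothetical stand-in for a text block: a word count plus a content flag.
    static final class Block {
        final int numWords;
        final boolean isContent;
        Block(int numWords, boolean isContent) {
            this.numWords = numWords;
            this.isContent = isContent;
        }
    }

    // Mirrors the (doc, contentOnly) constructor shape listed above.
    static final class Stats {
        final int numWords;
        final int numBlocks;

        Stats(List<Block> doc, boolean contentOnly) {
            int words = 0;
            int blocks = 0;
            for (Block b : doc) {
                // With contentOnly set, boilerplate blocks are skipped entirely.
                if (contentOnly && !b.isContent) {
                    continue;
                }
                words += b.numWords;
                blocks++;
            }
            this.numWords = words;
            this.numBlocks = blocks;
        }

        // Average words per counted block; 0.0 when no block was counted.
        double avgNumWords() {
            return numBlocks == 0 ? 0.0 : (double) numWords / numBlocks;
        }
    }

    public static void main(String[] args) {
        List<Block> doc = Arrays.asList(
                new Block(3, false),   // e.g. a navigation line
                new Block(40, true),
                new Block(20, true));
        System.out.println(new Stats(doc, false).avgNumWords()); // prints 21.0
        System.out.println(new Stats(doc, true).avgNumWords());  // prints 30.0
    }
}
```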
Uses of TextDocument in de.l3s.boilerpipe.extractors |
---|
Methods in de.l3s.boilerpipe.extractors with parameters of type TextDocument | |
---|---|
java.lang.String |
ExtractorBase.getText(TextDocument doc)
Extracts text from the given TextDocument object. |
boolean |
ArticleSentencesExtractor.process(TextDocument doc)
|
boolean |
CanolaExtractor.process(TextDocument doc)
|
boolean |
KeepEverythingExtractor.process(TextDocument doc)
|
boolean |
ArticleExtractor.process(TextDocument doc)
|
boolean |
KeepEverythingWithMinKWordsExtractor.process(TextDocument doc)
|
boolean |
DefaultExtractor.process(TextDocument doc)
|
boolean |
LargestContentExtractor.process(TextDocument doc)
|
boolean |
NumWordsRulesExtractor.process(TextDocument doc)
|
Uses of TextDocument in de.l3s.boilerpipe.filters.english |
---|
Methods in de.l3s.boilerpipe.filters.english with parameters of type TextDocument | |
---|---|
boolean |
IgnoreBlocksAfterContentFromEndFilter.process(TextDocument doc)
|
boolean |
KeepLargestFulltextBlockFilter.process(TextDocument doc)
|
boolean |
TerminatingBlocksFinder.process(TextDocument doc)
|
boolean |
IgnoreBlocksAfterContentFilter.process(TextDocument doc)
|
boolean |
DensityRulesClassifier.process(TextDocument doc)
|
boolean |
NumWordsRulesClassifier.process(TextDocument doc)
|
boolean |
MinFulltextWordsFilter.process(TextDocument doc)
|
Uses of TextDocument in de.l3s.boilerpipe.filters.heuristics |
---|
Methods in de.l3s.boilerpipe.filters.heuristics with parameters of type TextDocument | |
---|---|
boolean |
ArticleMetadataFilter.process(TextDocument doc)
|
boolean |
DocumentTitleMatchClassifier.process(TextDocument doc)
|
boolean |
ContentFusion.process(TextDocument doc)
|
boolean |
ExpandTitleToContentFilter.process(TextDocument doc)
|
boolean |
KeepLargestBlockFilter.process(TextDocument doc)
|
boolean |
BlockProximityFusion.process(TextDocument doc)
|
boolean |
LabelFusion.process(TextDocument doc)
|
boolean |
AddPrecedingLabelsFilter.process(TextDocument doc)
|
boolean |
SimpleBlockFusionProcessor.process(TextDocument doc)
|
Uses of TextDocument in de.l3s.boilerpipe.filters.simple |
---|
Methods in de.l3s.boilerpipe.filters.simple with parameters of type TextDocument | |
---|---|
boolean |
MinClauseWordsFilter.process(TextDocument doc)
|
boolean |
MarkEverythingContentFilter.process(TextDocument doc)
|
boolean |
BoilerplateBlockFilter.process(TextDocument doc)
|
boolean |
InvertedFilter.process(TextDocument doc)
|
boolean |
SplitParagraphBlocksFilter.process(TextDocument doc)
|
boolean |
MinWordsFilter.process(TextDocument doc)
|
boolean |
LabelToContentFilter.process(TextDocument doc)
|
boolean |
SurroundingToContentFilter.process(TextDocument doc)
|
boolean |
LabelToBoilerplateFilter.process(TextDocument doc)
|
Uses of TextDocument in de.l3s.boilerpipe.sax |
---|
Methods in de.l3s.boilerpipe.sax that return TextDocument | |
---|---|
TextDocument |
BoilerpipeSAXInput.getTextDocument()
Retrieves the TextDocument using a default HTML parser. |
TextDocument |
BoilerpipeSAXInput.getTextDocument(BoilerpipeHTMLParser parser)
Retrieves the TextDocument using the given HTML parser. |
TextDocument |
BoilerpipeHTMLContentHandler.toTextDocument()
Returns a TextDocument containing the extracted TextBlocks. |
TextDocument |
BoilerpipeHTMLParser.toTextDocument()
Returns a TextDocument containing the extracted TextBlocks. |
Methods in de.l3s.boilerpipe.sax with parameters of type TextDocument | |
---|---|
java.lang.String |
HTMLHighlighter.process(TextDocument doc, org.xml.sax.InputSource is)
Processes the given TextDocument and the original HTML text (as an InputSource). |
java.lang.String |
HTMLHighlighter.process(TextDocument doc, java.lang.String origHTML)
Processes the given TextDocument and the original HTML text (as a String). |