Interface Summary | |
---|---|
CloseableVectorStore | Some vector stores (e.g., those that read from the filesystem) claim resources that aren't automatically garbage collected or released. |
VectorStore | Classes implementing this interface are used to represent a collection of object vectors, including i. |
Class Summary | |
---|---|
BuildBilingualIndex | Command line utility for creating bilingual semantic vector indexes. |
BuildIndex | Command line utility for creating semantic vector indexes. |
BuildPositionalIndex | Command line utility for creating semantic vector indexes using the sliding context window approach (see work on HAL, and by Schütze). |
ClusterResults | |
ClusterVectorStore | This class is used for performing kMeans clustering on an entire vector store. |
CompareTerms | Command line term vector comparison utility. |
CompareTermsBatch | Command line term vector comparison utility designed to be run in batch mode. |
CompoundVectorBuilder | This class contains methods for manipulating queries, e.g., taking a list of query terms and producing a (possibly weighted) aggregate query vector. |
DocVectors | Implementation of vector store that collects doc vectors by iterating through all the terms in a term vector store and incrementing document vectors for each of the documents containing that term. |
Flags | Class for representing and parsing global command line flags. |
IncrementalDocVectors | generates document vectors incrementally requires a |
LuceneUtils | Class to support reading extra information from Lucene indexes, including term frequency, doc frequency. |
ObjectVector | This class provides a basic object (e.g., term or document id) and corresponding vector. |
Search | Command line term vector search utility. |
SearchResult | Class to represent search results. |
TermTermVectorsFromLucene | Implementation of vector store that creates term-by-term co-occurrence vectors by iterating through all the documents in a Lucene index. |
TermVectorsFromLucene | Implementation of vector store that creates term vectors by iterating through all the terms in a Lucene index. |
VectorSearcher | Class for searching vector stores using different scoring functions. |
VectorSearcher.BalancedVectorSearcherPerm | Class for searching a permuted vector store using cosine similarity. |
VectorSearcher.VectorSearcherConvolutionSim | Class for searching a vector store using convolution similarity. |
VectorSearcher.VectorSearcherCosine | Class for searching a vector store using cosine similarity. |
VectorSearcher.VectorSearcherCosineSparse | Class for searching a vector store using sparse cosine similarity. |
VectorSearcher.VectorSearcherMaxSim | Class for searching a vector store using minimum distance similarity. |
VectorSearcher.VectorSearcherPerm | Class for searching a permuted vector store using cosine similarity. |
VectorSearcher.VectorSearcherSubspaceSim | Class for searching a vector store using quantum disjunction similarity. |
VectorSearcher.VectorSearcherTensorSim | Class for searching a vector store using tensor product similarity. |
VectorStoreRAM | This class provides methods for reading a VectorStore into memory as an optimization if batching many searches. |
VectorStoreReader | Wrapper class used to get access to underlying VectorStore implementations. |
VectorStoreReaderLucene | This class provides methods for reading a VectorStore from disk. |
VectorStoreReaderText | This class provides methods for reading a VectorStore from a textfile. |
VectorStoreSparseRAM | This class provides methods for reading a VectorStore into memory as an optimization if batching many searches. |
VectorStoreTranslater | Class providing a command-line interface for transforming vector stores between the optimized Lucene format and plain text. |
VectorStoreWriter | This class provides methods for serializing a VectorStore to disk. |
VectorUtils | This class provides standard vector methods, e.g., cosine measure, normalization, tensor utils. |
Exception Summary | |
---|---|
ZeroVectorException | |
Semantic Vector indexes, created by applying a Random Projection algorithm to term-document matrices created using Apache Lucene.
This Semantic Vectors package implements a Random Projection algorithm, a form of automatic semantic analysis similar to Latent Semantic Analysis (LSA) and its variants such as Probabilistic Latent Semantic Analysis (PLSA). Unlike these methods, however, Random Projection does not rely on computationally intensive matrix decomposition algorithms such as Singular Value Decomposition (SVD). This makes Random Projection a much more scalable technique in practice.
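The idea can be illustrated with a minimal sketch, which is not the package's own implementation. It assumes each document is assigned a sparse random vector of +1 and -1 entries, and that a term vector is the count-weighted sum of the random vectors of the documents containing that term. The class name RandomProjectionSketch, its method names, and the parameters dimension, nonZeroEntries and seed are all hypothetical and chosen only for this example.

```java
import java.util.Random;

/** Illustrative sketch only; none of these names come from the package itself. */
public class RandomProjectionSketch {

    /** Builds a sparse random vector: mostly zeros, with a few +1 and -1 entries. */
    static float[] randomTernaryVector(int dimension, int nonZeroEntries, Random random) {
        if (nonZeroEntries > dimension) {
            throw new IllegalArgumentException("nonZeroEntries must not exceed dimension");
        }
        float[] vector = new float[dimension];
        int placed = 0;
        while (placed < nonZeroEntries) {
            int position = random.nextInt(dimension);
            if (vector[position] == 0f) {
                // Alternate signs so that +1 and -1 entries are balanced.
                vector[position] = (placed % 2 == 0) ? 1f : -1f;
                placed++;
            }
        }
        return vector;
    }

    /**
     * Projects a term-document count matrix (rows are terms, columns are documents)
     * into a reduced space by summing the random vectors of the documents
     * in which each term occurs. No matrix decomposition is involved.
     */
    static float[][] projectTerms(int[][] termDocCounts, int dimension,
                                  int nonZeroEntries, long seed) {
        int numDocs = termDocCounts[0].length;
        Random random = new Random(seed);
        float[][] docVectors = new float[numDocs][];
        for (int d = 0; d < numDocs; d++) {
            docVectors[d] = randomTernaryVector(dimension, nonZeroEntries, random);
        }
        float[][] termVectors = new float[termDocCounts.length][dimension];
        for (int t = 0; t < termDocCounts.length; t++) {
            for (int d = 0; d < numDocs; d++) {
                int count = termDocCounts[t][d];
                if (count == 0) {
                    continue;
                }
                for (int i = 0; i < dimension; i++) {
                    termVectors[t][i] += count * docVectors[d][i];
                }
            }
        }
        return termVectors;
    }

    /** Cosine similarity; returns 0 if either vector is all zeros (a choice made for this sketch). */
    static double cosine(float[] a, float[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Terms 0 and 1 occur in the same documents in the same proportions;
        // term 2 occurs in different documents.
        int[][] counts = { {1, 2, 0, 0}, {2, 4, 0, 0}, {0, 0, 3, 1} };
        float[][] termVectors = projectTerms(counts, 200, 10, 42L);
        System.out.println("term0 vs term1: " + cosine(termVectors[0], termVectors[1]));
        System.out.println("term0 vs term2: " + cosine(termVectors[0], termVectors[2]));
    }
}
```

The cost of building the term vectors in this sketch grows linearly with the number of non-zero counts in the matrix, which is the practical advantage over a decomposition such as SVD. Terms that share documents end up with similar vectors, while terms with no documents in common are only approximately orthogonal, because independent sparse random vectors are themselves only nearly orthogonal.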
Our application of Random Projection to Natural Language Processing (NLP) is descended from Pentti Kanerva's work on Sparse Distributed Memory; in semantic analysis and text mining, this method has also been called Random Indexing. A growing number of researchers have applied Random Projection to NLP tasks, demonstrating:
The current package was created as part of a project by the University of Pittsburgh Office of Technology Management to explore the potential for automatically matching related concepts in the technology management domain, e.g., mapping new technologies to potentially interested licensors. This project can be found at http://real.hsls.pitt.edu.
The package requires Apache Ant and Apache Lucene to be installed, and the Lucene classes must be available in your CLASSPATH.
Further documentation and links to articles on Random Projection and related techniques can be found at the package download site, http://code.google.com/p/semanticvectors.