pitt.search.semanticvectors
Class CompoundVectorBuilder

java.lang.Object
  extended by pitt.search.semanticvectors.CompoundVectorBuilder

public class CompoundVectorBuilder
extends java.lang.Object

This class contains methods for manipulating queries, e.g., taking a list of query terms and producing a (possibly weighted) aggregate query vector. The longer-term goal is to support parsing and building queries that include basic (quantum) logical operations. So far, these operations include negation of one or more terms.


Constructor Summary
CompoundVectorBuilder(VectorStore vecReader)
          Constructor that defaults LuceneUtils to null.
CompoundVectorBuilder(VectorStore vecReader, LuceneUtils lUtils)
           
 
Method Summary
protected  float[] getAdditiveQueryVector(java.lang.String[] queryTerms)
          Returns a (possibly weighted) normalized query vector created by adding together vectors retrieved from vector store.
protected  float[] getAdditiveQueryVectorRegex(java.lang.String[] queryTerms)
          Returns a (possibly weighted) normalized query vector created by adding together all vectors retrieved from vector store whose objects match a particular regular expression.
protected  float[] getNegatedQueryVector(java.lang.String[] queryTerms, int split)
          Creates a query vector from the positive terms that is orthogonal to all negated terms.
static float[] getPermutedQueryVector(VectorStore vecReader, LuceneUtils lUtils, java.lang.String[] queryTerms)
          Returns a vector representation containing both content and positional information.
static float[] getQueryVector(VectorStore vecReader, LuceneUtils lUtils, java.lang.String[] queryTerms)
          Gets a query vector from an array of query terms.
static float[] getQueryVectorFromString(VectorStore vecReader, LuceneUtils lUtils, java.lang.String queryString)
          Gets a query vector from a query string, i.e., a space-separated list of query terms.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CompoundVectorBuilder

public CompoundVectorBuilder(VectorStore vecReader,
                             LuceneUtils lUtils)

CompoundVectorBuilder

public CompoundVectorBuilder(VectorStore vecReader)
Constructor that defaults LuceneUtils to null.

Method Detail

getPermutedQueryVector

public static float[] getPermutedQueryVector(VectorStore vecReader,
                                             LuceneUtils lUtils,
                                             java.lang.String[] queryTerms)
                                      throws java.lang.IllegalArgumentException
Returns a vector representation containing both content and positional information.

Parameters:
queryTerms - String array of query terms to look up. Expects a single "?" entry, which denotes the query term position. E.g., "martin ? king" might pick out "luther".
Throws:
java.lang.IllegalArgumentException
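
One way to combine content and position, sketched here under stated assumptions, is to cyclically shift each term vector by its offset from the "?" entry and then add the shifted vectors. The class PermutedQuerySketch, the Map used in place of VectorStore, the choice of a cyclic shift, and the conditions that raise IllegalArgumentException are all hypothetical illustrations, not the library's confirmed behavior.

```java
import java.util.Map;

final class PermutedQuerySketch {
    // Cyclically shifts the coordinates of v by `shift` positions
    // (negative values rotate in the opposite direction).
    static float[] rotate(float[] v, int shift) {
        int n = v.length;
        float[] out = new float[n];
        for (int i = 0; i < n; i++) {
            out[Math.floorMod(i + shift, n)] = v[i];
        }
        return out;
    }

    // Hypothetical stand-in for getPermutedQueryVector; `store` replaces VectorStore.
    // Every stored vector is assumed to have length `dimension`.
    static float[] permutedQueryVector(Map<String, float[]> store,
                                       String[] queryTerms, int dimension) {
        int queryPos = -1;
        for (int i = 0; i < queryTerms.length; i++) {
            if ("?".equals(queryTerms[i])) {
                if (queryPos != -1) {
                    throw new IllegalArgumentException("more than one \"?\" in query");
                }
                queryPos = i;
            }
        }
        if (queryPos == -1) {
            throw new IllegalArgumentException("query needs a \"?\" entry");
        }
        float[] sum = new float[dimension];
        for (int i = 0; i < queryTerms.length; i++) {
            float[] v = store.get(queryTerms[i]);
            if (i == queryPos || v == null) {
                continue; // skip the placeholder and unknown terms (assumption)
            }
            // The offset from "?" decides how far this term's vector is shifted.
            float[] shifted = rotate(v, i - queryPos);
            for (int d = 0; d < dimension; d++) {
                sum[d] += shifted[d];
            }
        }
        return sum; // left unnormalized to keep the sketch short
    }
}
```

Because the shift depends on the offset, the same term contributes a different vector when it appears before the "?" than when it appears after it, which is what lets a query such as "martin ? king" carry positional information.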

getQueryVectorFromString

public static float[] getQueryVectorFromString(VectorStore vecReader,
                                               LuceneUtils lUtils,
                                               java.lang.String queryString)
Gets a query vector from a query string, i.e., a space-separated list of query terms.
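
As a rough illustration of the string form, the sketch below splits a query string on whitespace so that the resulting terms could be passed to an array-based method. The class QueryStringSketch and the helper splitQuery are hypothetical; the library may tokenize the string differently.

```java
final class QueryStringSketch {
    // Hypothetical helper: turns a space-separated query string into query terms.
    static String[] splitQuery(String queryString) {
        String trimmed = queryString.trim();
        if (trimmed.isEmpty()) {
            return new String[0]; // no terms in an empty or blank query
        }
        // Treat any run of whitespace as a single separator.
        return trimmed.split("\\s+");
    }
}
```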


getQueryVector

public static float[] getQueryVector(VectorStore vecReader,
                                     LuceneUtils lUtils,
                                     java.lang.String[] queryTerms)
Gets a query vector from an array of query terms. The method is static and creates its own CompoundVectorBuilder, so client code can call getQueryVector without creating an object first, though this may be slightly less efficient for multiple calls.

Parameters:
vecReader - The vector store reader to use.
lUtils - Lucene utilities for getting term weights.
queryTerms - Query expression, e.g., from command line. If the term NOT appears in queryTerms, terms after that will be negated.
Returns:
queryVector, an array of floats representing the user's query.
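
The NOT convention described for queryTerms can be pictured as a scan for the first NOT token, whose index then separates positive terms from negated ones. This is a minimal sketch: the class NegationSplitSketch and the helper findSplit are hypothetical, and the exact, case-sensitive match on "NOT" is an assumption.

```java
final class NegationSplitSketch {
    // Returns the index of the first "NOT" token, or -1 if the query has none.
    // Assumption: the marker is matched exactly and case-sensitively.
    static int findSplit(String[] queryTerms) {
        for (int i = 0; i < queryTerms.length; i++) {
            if ("NOT".equals(queryTerms[i])) {
                return i;
            }
        }
        return -1;
    }
}
```

A caller could use a result of -1 to choose a purely additive query, and any other value as the split argument of a negated query.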

getAdditiveQueryVector

protected float[] getAdditiveQueryVector(java.lang.String[] queryTerms)
Returns a (possibly weighted) normalized query vector created by adding together vectors retrieved from vector store.

Parameters:
queryTerms - String array of query terms to look up.
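
The additive construction can be sketched as a weighted sum followed by normalization to unit length. In this sketch the class AdditiveQuerySketch, the Map standing in for the vector store, the weights map standing in for term weights, and the decision to ignore unknown terms are all assumptions made for illustration.

```java
import java.util.Map;

final class AdditiveQuerySketch {
    // Adds the (optionally weighted) vectors of all known terms, then normalizes.
    // Every stored vector is assumed to have length `dimension`.
    static float[] additiveQueryVector(Map<String, float[]> store,
                                       Map<String, Float> weights,
                                       String[] queryTerms, int dimension) {
        float[] sum = new float[dimension];
        for (String term : queryTerms) {
            float[] v = store.get(term);
            if (v == null) {
                continue; // unknown terms contribute nothing (assumption)
            }
            float w = weights.getOrDefault(term, 1.0f); // unweighted terms count once
            for (int d = 0; d < dimension; d++) {
                sum[d] += w * v[d];
            }
        }
        return normalize(sum);
    }

    // Scales v to unit Euclidean length; a zero vector is returned unchanged.
    static float[] normalize(float[] v) {
        double norm = 0.0;
        for (float x : v) {
            norm += x * x;
        }
        norm = Math.sqrt(norm);
        if (norm == 0.0) {
            return v;
        }
        float[] out = new float[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = (float) (v[i] / norm);
        }
        return out;
    }
}
```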

getAdditiveQueryVectorRegex

protected float[] getAdditiveQueryVectorRegex(java.lang.String[] queryTerms)
Returns a (possibly weighted) normalized query vector created by adding together all vectors retrieved from vector store whose objects match a particular regular expression.

Parameters:
queryTerms - String array of query terms to look up.
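
The regular-expression variant can be sketched by treating each query term as a pattern and adding every stored vector whose key matches it. The class RegexQuerySketch and the Map standing in for the vector store are hypothetical, and requiring a match of the whole key (rather than a substring) is an assumption.

```java
import java.util.Map;
import java.util.regex.Pattern;

final class RegexQuerySketch {
    // Adds every stored vector whose key fully matches one of the patterns,
    // then normalizes the sum to unit length.
    static float[] additiveQueryVectorRegex(Map<String, float[]> store,
                                            String[] patterns, int dimension) {
        float[] sum = new float[dimension];
        for (String regex : patterns) {
            Pattern pattern = Pattern.compile(regex);
            for (Map.Entry<String, float[]> entry : store.entrySet()) {
                if (pattern.matcher(entry.getKey()).matches()) {
                    float[] v = entry.getValue();
                    for (int d = 0; d < dimension; d++) {
                        sum[d] += v[d];
                    }
                }
            }
        }
        double norm = 0.0;
        for (float x : sum) {
            norm += x * x;
        }
        norm = Math.sqrt(norm);
        if (norm == 0.0) {
            return sum; // nothing matched, so return the zero vector
        }
        for (int d = 0; d < dimension; d++) {
            sum[d] = (float) (sum[d] / norm);
        }
        return sum;
    }
}
```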

getNegatedQueryVector

protected float[] getNegatedQueryVector(java.lang.String[] queryTerms,
                                        int split)
Creates a query vector from the positive terms that is orthogonal to all negated terms.

Parameters:
queryTerms - List of positive and negative terms.
split - Position in this list of the NOT mark: terms before this are positive, those after this are negative.
Returns:
Single query vector, the sum of the positive terms, projected to be orthogonal to all negative terms.
See Also:
VectorUtils.orthogonalizeVectors(java.util.ArrayList)
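
The projection described under Returns can be sketched with Gram-Schmidt orthogonalization: build an orthonormal basis from the negative term vectors, then subtract from the positive sum its component along each basis vector. The class NegationSketch and the Map standing in for the vector store are hypothetical, and this is not a reproduction of VectorUtils.orthogonalizeVectors.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class NegationSketch {
    static double dot(float[] a, float[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) {
            s += a[i] * b[i];
        }
        return s;
    }

    // Removes from v its component along the unit vector `unit`.
    static void subtractProjection(float[] v, float[] unit) {
        double d = dot(v, unit);
        for (int i = 0; i < v.length; i++) {
            v[i] = (float) (v[i] - d * unit[i]);
        }
    }

    // Scales v to unit length; returns false if v is (numerically) zero.
    static boolean normalizeInPlace(float[] v) {
        double norm = Math.sqrt(dot(v, v));
        if (norm < 1e-12) {
            return false;
        }
        for (int i = 0; i < v.length; i++) {
            v[i] = (float) (v[i] / norm);
        }
        return true;
    }

    // Terms before `split` are positive, terms after it are negative.
    static float[] negatedQueryVector(Map<String, float[]> store,
                                      String[] queryTerms, int split, int dimension) {
        float[] result = new float[dimension];
        List<float[]> basis = new ArrayList<>();
        for (int i = 0; i < queryTerms.length; i++) {
            float[] v = store.get(queryTerms[i]);
            if (i == split || v == null) {
                continue; // skip the NOT mark and unknown terms (assumption)
            }
            if (i < split) {
                for (int d = 0; d < dimension; d++) {
                    result[d] += v[d];
                }
            } else {
                float[] r = v.clone();
                for (float[] b : basis) {
                    subtractProjection(r, b);
                }
                if (normalizeInPlace(r)) {
                    basis.add(r); // keep only directions not already covered
                }
            }
        }
        for (float[] b : basis) {
            subtractProjection(result, b);
        }
        normalizeInPlace(result);
        return result;
    }
}
```

Orthonormalizing the negative vectors first matters when they are not mutually orthogonal: subtracting raw projections one after another could leave a residual component along an earlier negative term.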