Base class for “facets”: aspects of a document that can be used for sorting or faceting.
Returns a Categorizer corresponding to this facet.
Base class for categorizer objects which compute a key value for a document based on certain criteria, for use in sorting/faceting.
Categorizers are created by FacetType objects through the FacetType.categorizer() method. The whoosh.searching.Searcher object passed to the categorizer method may be a composite searcher (that is, wrapping a multi-reader), but categorizers are always run per-segment, with segment-relative document numbers.
The collector will call a categorizer’s set_searcher method as it searches each segment to let the categorizer set up whatever segment-specific data it needs.
Categorizer.allow_overlap should be True if the caller should use the keys_for_id method instead of key_for_id, so that documents can be grouped into potentially overlapping groups.
Returns a key for the given segment-relative document number.
Returns a key for the given matcher. The default implementation simply gets the matcher’s current document ID and calls key_for_id, but a subclass can override this if it needs information from the matcher to compute the key.
Returns a representation of the key to be used as a dictionary key in faceting. For example, the sorting key for date fields is a large integer; this method translates it into a datetime object to make the groupings clearer.
Yields a series of keys for the given segment-relative document number. This method will be called instead of key_for_id when Categorizer.allow_overlap is True.
Called by the collector when the collector moves to a new segment. The searcher will be atomic. The docoffset is the offset of the segment’s document numbers relative to the entire index. You can use the offset to get absolute index docnums by adding the offset to segment-relative docnums.
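The per-segment protocol described above can be sketched in plain Python. This is an illustration under stated assumptions, not Whoosh's implementation: FakeSegment, TagCategorizer and group_documents are hypothetical stand-ins, and only the names set_searcher, key_for_id, keys_for_id and allow_overlap are taken from the descriptions above.

```python
from collections import defaultdict


class FakeSegment:
    """Hypothetical stand-in for an atomic (single-segment) searcher."""

    def __init__(self, tags_by_docnum):
        # Maps segment-relative docnum -> list of tags for that document.
        self.tags_by_docnum = tags_by_docnum


class TagCategorizer:
    """Hypothetical categorizer that keys documents by their tags."""

    allow_overlap = True  # ask the caller to use keys_for_id

    def set_searcher(self, segment_searcher, docoffset):
        # Called once per segment: cache the per-segment data.
        self.segment = segment_searcher
        self.docoffset = docoffset

    def key_for_id(self, docid):
        # Single key: the first tag, or None if the document has no tags.
        tags = self.segment.tags_by_docnum.get(docid, [])
        return tags[0] if tags else None

    def keys_for_id(self, docid):
        # Every tag is a key, so one document may land in several groups.
        for tag in self.segment.tags_by_docnum.get(docid, []):
            yield tag


def group_documents(categorizer, segments):
    """Group absolute docnums by key, visiting one segment at a time."""
    groups = defaultdict(list)
    docoffset = 0
    for segment in segments:
        categorizer.set_searcher(segment, docoffset)
        for docid in sorted(segment.tags_by_docnum):
            if categorizer.allow_overlap:
                keys = list(categorizer.keys_for_id(docid))
            else:
                keys = [categorizer.key_for_id(docid)]
            for key in keys:
                # Absolute docnum = segment offset + segment-relative docnum.
                groups[key].append(docoffset + docid)
        docoffset += len(segment.tags_by_docnum)
    return dict(groups)


segments = [
    FakeSegment({0: ["a", "b"], 1: ["b"]}),
    FakeSegment({0: ["a"], 1: []}),
]
groups = group_documents(TagCategorizer(), segments)
```

Both segments number their documents from 0; the offset is what keeps the second segment's document 0 distinct from the first segment's document 0 in the final groups.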
Sorts/facets by the contents of a field.
For example, to sort by the contents of the “path” field in reverse order, and facet by the contents of the “tag” field:
paths = FieldFacet("path", reverse=True)
tags = FieldFacet("tag")
results = searcher.search(myquery, sortedby=paths, groupedby=tags)
This facet returns different categorizers based on the field type.
Sorts/facets based on the results of a series of queries.
Sorts/facets based on numeric ranges. For textual ranges, use QueryFacet.
For example, to facet the “price” field into $100 buckets, up to $1000:
prices = RangeFacet("price", 0, 1000, 100)
results = searcher.search(myquery, groupedby=prices)
The ranges/buckets are always inclusive at the start and exclusive at the end.
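The inclusive-start, exclusive-end rule can be illustrated with a small plain-Python sketch. The helper bucket_for is hypothetical and is not how Whoosh computes its buckets; it also assumes the gap divides the range evenly, as in the price example above.

```python
def bucket_for(value, start, end, gap):
    """Return the (low, high) bucket containing value, or None if outside.

    Buckets are inclusive at the low end and exclusive at the high end.
    Assumes (end - start) is a whole multiple of gap.
    """
    if value < start or value >= end:
        return None
    index = int((value - start) // gap)
    low = start + index * gap
    return (low, low + gap)


on_boundary = bucket_for(100, 0, 1000, 100)
past_the_end = bucket_for(1000, 0, 1000, 100)
```

A price of exactly 100 belongs to the bucket that starts at 100, not the one that ends there, and a price of exactly 1000 falls outside every bucket in this sketch.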
Sorts/facets based on date ranges.
For example, to facet the “birthday” field into year-sized buckets:
startdate = datetime(1920, 1, 1)
enddate = datetime.now()
gap = timedelta(days=365)
bdays = DateRangeFacet("birthday", startdate, enddate, gap)
results = searcher.search(myquery, groupedby=bdays)
The ranges/buckets are always inclusive at the start and exclusive at the end.
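One thing worth knowing about a fixed gap of 365 days is that the bucket boundaries drift away from calendar years whenever a leap year is crossed. The plain-Python sketch below only enumerates boundaries to show the drift; bucket_starts is a hypothetical helper, not part of Whoosh, and it says nothing about how the library builds its ranges internally.

```python
from datetime import datetime, timedelta


def bucket_starts(start, end, gap):
    """List the start of each bucket from start up to (not including) end."""
    starts = []
    current = start
    while current < end:
        starts.append(current)
        current += gap
    return starts


starts = bucket_starts(
    datetime(1920, 1, 1), datetime(1925, 1, 1), timedelta(days=365)
)
```

Because 1920 is a leap year, the second bucket in this sketch starts on 31 December 1920 rather than 1 January 1921.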
Uses a document’s score as a sorting criterion.
For example, to sort by the tag field, and then within that by relative score:
tag_score = MultiFacet(["tag", ScoreFacet()])
results = searcher.search(myquery, sortedby=tag_score)
Lets you pass an arbitrary function that will compute the key. This may be easier than subclassing FacetType and Categorizer to set up the desired behavior.
The function is called with the arguments (searcher, docid), where the searcher may be a composite searcher, and the docid is an absolute index document number (not segment-relative).
For example, to use the number of words in the document’s “content” field as the sorting/faceting key:
fn = lambda s, docid: s.doc_field_length(docid, "content")
lengths = FunctionFacet(fn)
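To make the calling convention concrete, here is a plain-Python sketch in which a key function with the (searcher, docid) signature is used to order documents. FakeSearcher is a hypothetical stand-in that mimics only doc_field_length by counting whitespace-separated words; it is not a Whoosh class, and the sorting loop is not how the collector applies the facet.

```python
class FakeSearcher:
    """Hypothetical stand-in exposing only doc_field_length."""

    def __init__(self, docs):
        # Maps absolute docnum -> {fieldname: text}.
        self.docs = docs

    def doc_field_length(self, docid, fieldname):
        # Approximates field length as the number of words in the text.
        return len(self.docs[docid][fieldname].split())


searcher = FakeSearcher({
    0: {"content": "one two three"},
    1: {"content": "one"},
    2: {"content": "one two"},
})

# Same shape as the key function in the example above.
fn = lambda s, docid: s.doc_field_length(docid, "content")

# Order absolute docnums by the computed key (shortest content first).
by_length = sorted(searcher.docs, key=lambda docid: fn(searcher, docid))
```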
Sorts/facets by the combination of multiple “sub-facets”.
For example, to sort by the value of the “tag” field, and then (for documents where the tag is the same) by the value of the “path” field:
facet = MultiFacet([FieldFacet("tag"), FieldFacet("path")])
results = searcher.search(myquery, sortedby=facet)
As a shortcut, you can pass strings, which will be treated as field names and turned into FieldFacet objects:
facet = MultiFacet(["tag", "path"])
You can also use the add_* methods to add criteria to the multifacet:
facet = MultiFacet()
facet.add_field("tag")
facet.add_field("path", reverse=True)
facet.add_query({"a-m": TermRange("name", "a", "m"), "n-z": TermRange("name", "n", "z")})
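The ordering produced by combining sub-facets is analogous to sorting on a tuple of keys. The sketch below uses plain Python dictionaries in place of indexed documents to show that analogy, including one reversed sub-key; it is an illustration only and makes no claim about how MultiFacet is implemented.

```python
docs = [
    {"tag": "b", "path": "/x"},
    {"tag": "a", "path": "/z"},
    {"tag": "a", "path": "/y"},
]

# Sort by tag, then (where the tag is the same) by path.
ordered = sorted(docs, key=lambda d: (d["tag"], d["path"]))

# Reversing only the second criterion: sort by the minor key first,
# then rely on the stability of sorted() for the major key.
by_path_desc = sorted(docs, key=lambda d: d["path"], reverse=True)
mixed = sorted(by_path_desc, key=lambda d: d["tag"])

ordered_paths = [d["path"] for d in ordered]
mixed_paths = [d["path"] for d in mixed]
```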
Maps facet names to FacetType objects, for creating multiple groupings of documents.
For example, to group by tag, and also group by price range:
facets = Facets()
facets.add_field("tag")
facets.add_facet("price", RangeFacet("price", 0, 1000, 100))
results = searcher.search(myquery, groupedby=facets)
tag_groups = results.groups("tag")
price_groups = results.groups("price")
(To group by the combination of multiple facets, use MultiFacet.)
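The difference between several independent groupings and one combined grouping can be sketched in plain Python. Everything here is hypothetical: group_by and the sample documents are illustrative, and the shape of each grouping (a dictionary mapping a key to a list of document numbers) is an assumption made for the sketch rather than a statement about what results.groups returns.

```python
from collections import defaultdict


def group_by(docs, keyfn):
    """Map each key to the list of document numbers that produced it."""
    groups = defaultdict(list)
    for docnum, doc in enumerate(docs):
        groups[keyfn(doc)].append(docnum)
    return dict(groups)


def price_bucket(doc, gap=100):
    # Inclusive low bound, exclusive high bound.
    low = doc["price"] // gap * gap
    return (low, low + gap)


docs = [
    {"tag": "book", "price": 50},
    {"tag": "book", "price": 250},
    {"tag": "toy", "price": 60},
]

# Each named facet yields its own, independent grouping of the same documents.
facets = {"tag": lambda d: d["tag"], "price": price_bucket}
groupings = {name: group_by(docs, fn) for name, fn in facets.items()}
```

Each document appears once in every grouping, which is what distinguishes this from grouping by a combined key.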
Adds the contents of the given Facets or dict object to this object.
Adds a FieldFacet for the given field name (the field name is automatically used as the facet name).
Adds a QueryFacet under the given name.
Returns a list of (facetname, facetobject) tuples for the facets in this object.