support.dawg module
This module contains classes and functions for working with Directed Acyclic
Word Graphs (DAWGs). This structure is used to efficiently store a list of
words.
This code should be considered an implementation detail and may change in
future releases.
TODO: try to find a way to traverse the term index efficiently to do within()
instead of storing a DAWG separately.
Graph nodes
-
class whoosh.support.dawg.BaseNode
This is the base class for objects representing nodes in a directed
acyclic word graph (DAWG).
- final is a property which is True if this node represents the end of
a word.
- __contains__(label) returns True if the node has an edge with the
given label.
- __iter__() returns an iterator of the labels for the node’s outgoing
edges. keys() is available as a convenient shortcut to get a list.
- __len__() returns the number of outgoing edges.
- edge(label) returns the Node connected to the edge with the given
label.
- all_edges() returns a dictionary of the node’s outgoing edges, where
the keys are the edge labels and the values are the connected nodes.
-
all_edges()
Returns a dictionary mapping outgoing edge labels to nodes.
-
edge(key, expand=True)
Returns the node connected to the outgoing edge with the given
label.
-
edge_count()
Returns the recursive count of edges in this node and the tree under
it.
-
keys()
Returns a list of the outgoing edge labels.
-
class whoosh.support.dawg.UnionNode(a, b)
Makes two graphs appear to be the union of the two graphs.
-
class whoosh.support.dawg.IntersectionNode(a, b)
Makes two graphs appear to be the intersection of the two graphs.
Utility functions
-
whoosh.support.dawg.edge_count(node)
-
whoosh.support.dawg.flatten(node, sofar='')
-
whoosh.support.dawg.dump_dawg(node, tab=0)
-
whoosh.support.dawg.within(node, text, k=1, prefix=0, seen=None)