Class | Riddle::Client |
In: |
lib/riddle/0.9.9/client.rb
lib/riddle/1.10/client.rb lib/riddle/client/response.rb lib/riddle/client/filter.rb lib/riddle/client/message.rb lib/riddle/client.rb |
Parent: | Object |
This class was heavily based on the existing Client API by Dmytro Shteflyuk and Alexy Kovyrin. Their code worked fine, I just wanted something a bit more Ruby-ish (i.e. lowercase and underscored method names). I have also used a few helper classes, just to neaten things up.
Feel free to use it wherever. Send bug reports, patches, comments and suggestions to pat at freelancing-gods dot com.
Most properties of the client are accessible through attribute accessors, and where relevant use symbols instead of the long constants common in other clients. Some examples:
client.sort_mode = :extended
client.sort_by = "birthday DESC"
client.match_mode = :extended
To add a filter, you will need to create a Filter object:
client.filters << Riddle::Client::Filter.new("birthday", Time.utc(1975, 1, 1).to_i..Time.utc(1985, 1, 1).to_i, false)
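The range passed to a timestamp filter is a range of integers (seconds since the Unix epoch). As a minimal sketch, assuming the attribute stores UTC timestamps, the bounds can be built with Time.utc, which takes year, month and day; Time.at, by contrast, expects a number of seconds. The variable names here are illustrative only.

```ruby
# Build an inclusive range of epoch seconds for a timestamp filter.
# Time.utc takes (year, month, day); Time.at expects seconds since the epoch.
from = Time.utc(1975, 1, 1).to_i
to   = Time.utc(1985, 1, 1).to_i
birthday_range = from..to

# A birthday in mid-1980 falls inside the range, one in 1990 does not.
puts birthday_range.cover?(Time.utc(1980, 6, 15).to_i)
puts birthday_range.cover?(Time.utc(1990, 1, 1).to_i)
```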
Commands | = | {
  :search     => 0, # SEARCHD_COMMAND_SEARCH
  :excerpt    => 1, # SEARCHD_COMMAND_EXCERPT
  :update     => 2, # SEARCHD_COMMAND_UPDATE
  :keywords   => 3, # SEARCHD_COMMAND_KEYWORDS
  :persist    => 4, # SEARCHD_COMMAND_PERSIST
  :status     => 5, # SEARCHD_COMMAND_STATUS
  :query      => 6, # SEARCHD_COMMAND_QUERY
  :flushattrs => 7
} |
Versions | = | {
  :search     => 0x113, # VER_COMMAND_SEARCH
  :excerpt    => 0x100, # VER_COMMAND_EXCERPT
  :update     => 0x101, # VER_COMMAND_UPDATE
  :keywords   => 0x100, # VER_COMMAND_KEYWORDS
  :status     => 0x100, # VER_COMMAND_STATUS
  :query      => 0x100, # VER_COMMAND_QUERY
  :flushattrs => 0x100
} |
Statuses | = | {
  :ok      => 0, # SEARCHD_OK
  :error   => 1, # SEARCHD_ERROR
  :retry   => 2, # SEARCHD_RETRY
  :warning => 3
} |
MatchModes | = | {
  :all       => 0, # SPH_MATCH_ALL
  :any       => 1, # SPH_MATCH_ANY
  :phrase    => 2, # SPH_MATCH_PHRASE
  :boolean   => 3, # SPH_MATCH_BOOLEAN
  :extended  => 4, # SPH_MATCH_EXTENDED
  :fullscan  => 5, # SPH_MATCH_FULLSCAN
  :extended2 => 6
} |
RankModes | = | {
  :proximity_bm25 => 0, # SPH_RANK_PROXIMITY_BM25
  :bm25           => 1, # SPH_RANK_BM25
  :none           => 2, # SPH_RANK_NONE
  :wordcount      => 3, # SPH_RANK_WORDCOUNT
  :proximity      => 4, # SPH_RANK_PROXIMITY
  :match_any      => 5, # SPH_RANK_MATCHANY
  :fieldmask      => 6, # SPH_RANK_FIELDMASK
  :sph04          => 7, # SPH_RANK_SPH04
  :total          => 8
} |
SortModes | = | {
  :relevance     => 0, # SPH_SORT_RELEVANCE
  :attr_desc     => 1, # SPH_SORT_ATTR_DESC
  :attr_asc      => 2, # SPH_SORT_ATTR_ASC
  :time_segments => 3, # SPH_SORT_TIME_SEGMENTS
  :extended      => 4, # SPH_SORT_EXTENDED
  :expr          => 5
} |
AttributeTypes | = | {
  :integer   => 1, # SPH_ATTR_INTEGER
  :timestamp => 2, # SPH_ATTR_TIMESTAMP
  :ordinal   => 3, # SPH_ATTR_ORDINAL
  :bool      => 4, # SPH_ATTR_BOOL
  :float     => 5, # SPH_ATTR_FLOAT
  :bigint    => 6, # SPH_ATTR_BIGINT
  :string    => 7, # SPH_ATTR_STRING
  :multi     => 0x40000000
} |
GroupFunctions | = | {
  :day      => 0, # SPH_GROUPBY_DAY
  :week     => 1, # SPH_GROUPBY_WEEK
  :month    => 2, # SPH_GROUPBY_MONTH
  :year     => 3, # SPH_GROUPBY_YEAR
  :attr     => 4, # SPH_GROUPBY_ATTR
  :attrpair => 5
} |
FilterTypes | = | {
  :values      => 0, # SPH_FILTER_VALUES
  :range       => 1, # SPH_FILTER_RANGE
  :float_range => 2
} |
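The constant tables above are what let the client accept symbols in place of Sphinx's numeric constants. The following is only a sketch of that idea, not the library's own code: MATCH_MODES copies the MatchModes values listed above, and match_mode_code is a hypothetical helper that translates a symbol into its wire value.

```ruby
# Values copied from the MatchModes table; the constant and helper names
# are hypothetical and exist only for this illustration.
MATCH_MODES = {
  :all       => 0,
  :any       => 1,
  :phrase    => 2,
  :boolean   => 3,
  :extended  => 4,
  :fullscan  => 5,
  :extended2 => 6
}.freeze

# Translate a symbol such as :extended into the integer Sphinx expects,
# failing loudly on an unknown mode rather than sending nil.
def match_mode_code(mode)
  MATCH_MODES.fetch(mode) do
    raise ArgumentError, "unknown match mode: #{mode.inspect}"
  end
end

puts match_mode_code(:extended)  # prints 4
```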
anchor | [RW] | |
connection | [RW] | |
cut_off | [RW] | |
field_weights | [RW] | |
filters | [RW] | |
group_by | [RW] | |
group_clause | [RW] | |
group_distinct | [RW] | |
group_function | [RW] | |
id_range | [RW] | |
index_weights | [RW] | |
limit | [RW] | |
match_mode | [RW] | |
max_matches | [RW] | |
max_query_time | [RW] | |
offset | [RW] | |
overrides | [RW] | |
port | [RW] | |
queue | [R] | |
rank_mode | [RW] | |
retry_count | [RW] | |
retry_delay | [RW] | |
select | [RW] | |
servers | [RW] | |
sort_by | [RW] | |
sort_mode | [RW] | |
timeout | [RW] | |
weights | [RW] |
Can be instantiated with a specific server and port; otherwise it assumes defaults of localhost and 3312 respectively. All other settings can be accessed and changed via the attribute accessors.
Build excerpts from search terms (the words) and the text of documents. Excerpts are bodies of text that have the words highlighted. They may also be abbreviated to fit within a word limit.
As part of the options hash, you will need to define :docs, :words and :index (as used in the example below).
Optional settings include:
The defaults differ from the official PHP client, as I've opted for semantic HTML markup.
Example:
client.excerpts(:docs => ["Pat Allan, Pat Cash"], :words => 'Pat', :index => 'pats')
#=> ["<span class=\"match\">Pat</span> Allan, <span class=\"match\">Pat</span> Cash"]

lorem_lipsum = "Lorem ipsum dolor..."

client.excerpts(:docs => ["Pat Allan, #{lorem_lipsum} Pat Cash"], :words => 'Pat', :index => 'pats')
#=> ["<span class=\"match\">Pat</span> Allan, Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua … . Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. <span class=\"match\">Pat</span> Cash"]
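To make the shape of the output concrete, here is a pure-Ruby sketch that mimics the highlighting seen in the first example. It is not how Sphinx builds excerpts (the daemon also handles stemming, charsets and abbreviation); the highlight method and its before/after arguments are hypothetical names used only for illustration.

```ruby
# Wrap every whole-word occurrence of +word+ in the given markers.
# Illustrative only: Sphinx does the real work on the server side.
def highlight(doc, word, before = '<span class="match">', after = '</span>')
  doc.gsub(/\b#{Regexp.escape(word)}\b/) { |match| "#{before}#{match}#{after}" }
end

puts highlight("Pat Allan, Pat Cash", "Pat")
# prints: <span class="match">Pat</span> Allan, <span class="match">Pat</span> Cash
```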
Workflow:
Excerpt creation is completely isolated from searching the index. The nominated index is only used to discover encoding and charset information.
Therefore, the workflow goes:
Generates a keyword list for a given query. Each keyword is represented by a hash, with keys :tokenised and :normalised. If return_hits is set to true, it will also report on the number of hits and documents for each keyword (see the :hits and :docs keys respectively).
Query the Sphinx daemon - defaulting to all indexes, but you can nominate a specific one if you wish. The search parameter should be a string following Sphinx's expectations.
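As a sketch of working with that return value, the snippet below uses a hand-written array in the shape described (one hash per keyword, with :hits and :docs present as they would be when return_hits is true). The data is made up for illustration and is not output from a real index.

```ruby
# Hypothetical keyword list in the documented shape; the values are invented.
keywords = [
  { :tokenised => "pat",   :normalised => "pat",   :docs => 12, :hits => 15 },
  { :tokenised => "allan", :normalised => "allan", :docs => 3,  :hits => 4  }
]

# Pick the keyword that occurs most often across the index.
most_frequent = keywords.max_by { |keyword| keyword[:hits] }
puts most_frequent[:normalised]  # prints pat
```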
The object returned from this method is a hash with the following keys:
The key :matches returns an array of hashes - the actual search results. Each hash has the document id (:doc), the result weighting (:weight), and a hash of the attributes for the document (:attributes).
The :fields and :attribute_names keys return lists of the fields and attributes for the documents. The key :attributes will return a hash of attribute name and type pairs, and :words returns a hash of hashes representing the words from the search, with the number of documents and hits for each, along the lines of:
results[:words]["Pat"] #=> {:docs => 12, :hits => 15}
:total, :total_found and :time return the number of matches available, the total number of matches (which may be greater than the maximum available, depending on the number of matches and your Sphinx configuration), and the time in milliseconds that the query took to run.
:status is the error code for the query - and if there was a related warning, it will be under the :warning key. Fatal errors will be described under :error.
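The keys described above can be read like any other Ruby hash. The sketch below uses a hand-built results hash in that shape; the values are invented, and treating a :status of 0 as success is an assumption based on the Statuses table rather than something stated here.

```ruby
# Hypothetical results hash following the documented keys; all values are made up.
results = {
  :matches => [
    { :doc => 1, :weight => 2, :attributes => { "birthday" => 398649600 } },
    { :doc => 7, :weight => 1, :attributes => { "birthday" => 157766400 } }
  ],
  :fields          => ["first_name", "last_name"],
  :attribute_names => ["birthday"],
  :words           => { "Pat" => { :docs => 12, :hits => 15 } },
  :total           => 2,
  :total_found     => 2,
  :status          => 0  # assumed to correspond to :ok in the Statuses table
}

# Collect the document ids in result order.
doc_ids = results[:matches].map { |match| match[:doc] }

# Surface a warning if the daemon attached one.
warn results[:warning] if results[:warning]

puts doc_ids.inspect                    # prints [1, 7]
puts results[:words]["Pat"][:hits]      # prints 15
```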
Set the geo-anchor point - with the names of the attributes that contain the latitude and longitude (in radians), and the reference position. Note that for geocoding to work properly, you must also set match_mode to :extended. To sort results by distance, you will need to set sort_by to '@geodist asc', and sort_mode to :extended (as an example). Sphinx expects latitude and longitude to be returned from your SQL source in radians.
Example:
client.set_anchor('lat', -0.6591741, 'long', 2.530770)
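Since the anchor position must be given in radians, coordinates held in degrees need converting first. A minimal sketch, where to_radians is a hypothetical helper and the coordinates are arbitrary sample values:

```ruby
# Convert decimal degrees to radians (radians = degrees * PI / 180).
def to_radians(degrees)
  degrees * Math::PI / 180.0
end

# Arbitrary sample coordinates in degrees.
lat  = to_radians(-37.8)
long = to_radians(145.0)

# The converted values would then be passed along the lines of:
#   client.set_anchor('lat', lat, 'long', long)
puts lat
puts long
```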
Update attributes - first parameter is the relevant index, second is an array of attributes to be updated, and the third is a hash, where the keys are the document ids, and the values are arrays with the attribute values - in the same order as the second parameter.
Example:
client.update('people', ['birthday'], {1 => [Time.utc(1982, 8, 20).to_i]})
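Because each value array must follow the order of the attribute list, it can help to build the third argument from named values. This is a sketch only: build_update_values is a hypothetical helper, and 'height' is an invented second attribute used to show the ordering.

```ruby
# Turn {doc_id => {attribute_name => value}} into {doc_id => [values...]},
# with values ordered to match +attributes+. Raises KeyError if one is missing.
def build_update_values(attributes, rows)
  rows.each_with_object({}) do |(doc_id, named_values), values|
    values[doc_id] = attributes.map { |name| named_values.fetch(name) }
  end
end

attributes = ['birthday', 'height']  # 'height' is a made-up attribute
rows = {
  1 => { 'height' => 180, 'birthday' => Time.utc(1982, 8, 20).to_i }
}

values = build_update_values(attributes, rows)
puts values.inspect

# The result would then be passed along the lines of:
#   client.update('people', attributes, values)
```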