SphinxQL now lets you do everything querying-related that SphinxAPI did, in a simpler, faster, and more convenient way. For most of the features, the mapping of API calls to SphinxQL syntax is straightforwards (either via SQL syntax or using our OPTION clause). However, a few things, namely time segments, geosearches, overrides, index and weight fields, etc might now be less obvious. So let’s discuss them.
Time segments sorting mode can now be easily repeated using INTERVAL and NOW functions. Moreover, it’s so much more flexible in SphinxQL. In custom SphinxAPI mode, turn-points were hardcoded, and limited to sorting. With SphinxQL, you now can control those turn-points, and also use the result for something else, not just sorting.
Specifically, legacy SphinxAPI mode is completely equivalent to computing this INTERVAL expression and sorting by it:
SELECT *, INTERVAL(time_stamp_attr, NOW()-90*86400, NOW()-30*86400, NOW()-7*86400, NOW()-86400, NOW()-3600) AS time_seg ... ORDER BY time_seg DESC, @weight DESC
86400 is the number of seconds per 24 hours so the turn-points are at 90 days, 30 days, 7 days, 1 day, and 1 hour respectively. That’s exactly how they are hard-coded in SPH_SORT_TIME_SEGMENTS mode.
Note, by the way, that you need not worry about NOW() function used too much or constant multiplication. All that stuff gets optimized by our expression engine and this INTERVAL() expression is simplified down to an efficient search of attribute values in a list of constants.
Geodistance searches are supported through GEODIST(lat1,lon1,lat2,lon2) function. That is similar to SetGeoAnchor() API call but more flexible again. API required that lat1/lon1 are attribute names, and lat2/lon2 are constant floats. SphinxQL does not have that restriction any more, all latitudes and longitudes can be arbitrary expressions!
At the same time we are not sacrificing any speed by making GEODIST() that generic, because we check whether any (or both) of the points are constant expressions, and optimize accordingly. For instance, GEODIST() between two constant points will be computed only once. And GEODIST() with a pair of attributes and a constant anchor will be computed exactly as before, without any extra computational overheads.
Attribute value overrides, depending on how did you use them, can now be implemented using either IF() or IN() functions. For instance, to boost matching posts written by user’s friends, you could use something like:
SELECT *, @weight + IN(author_id,2,3,5,8,13,21)*1000 ... ORDER BY @weight DESC
And, say, giving a 10% discount on a given category of goods to a current user (because he punched in a valid promo code) can be accomplished like:
SELECT *, IF(category_id=123,0.9,1)*price AS adjusted_price ... ORDER BY adjusted_price ASC
In this particular case just cutting 10% off could be expressed without an IF(), for example as price*(1-0.1*(category_id=123)), exploiting the fact that boolean expressions in scalar context evaluate to 0 or 1. However IF() might be more efficient when more complex expressions are involved, because it short-circuits. That is, either 2nd or 3rd argument gets evaluated, but never the both of them.
Index and field weights were, frankly speaking, overlooked a while ago and stayed unimplemented for some time. But as of bleeding edge trunk, they are available, through an OPTION clause:
SELECT ... OPTION field_weights=(title=5,abstract=2,content=1)
To summarize, all the searching things you were able to do with the API are now available from SphinxQL as well, and sometimes in even more flexible and optimal way than before. So be sure to give it a try.
|« May 23, 2010. C++ compiler shootout||July 16, 2010. Sphinx at OSCON Portland July 19-23 »|