Sphinx lets you do facets easily via the multi-query interface, but sometimes you would only need to display a tab or a link, and just ignore the per-facet search results count. Guess what? When you don’t need the counts, such queries can be made even faster using a cutoff feature.
So here’s a query that one would normally use to check whether there are both Public and Private items to display. (This one is an actual query from production.)
SELECT private, COUNT(*) AS cnt
FROM items_category_24 WHERE location_id=641780 AND sort_time BETWEEN 0 AND 1343892300
GROUP BY private ORDER BY private ASC
OPTION max_matches=3;
This query scans and honestly counts all the matches that meet the WHERE condition. However we don’t really need the count here. It simply never gets displayed. Also, we know upfront that “private” is a boolean flag, so there can be two groups, tops. So what we need to know here is just whether we have both groups in the result set, or less than that. So maybe we could stop the query evaluation early when we get those? Yes, we can. Enters cutoff:
SELECT private
FROM items_category_24 WHERE location_id=641780 AND sort_time BETWEEN 0 AND 1343892300 GROUP BY private ORDER BY private ASC
OPTION max_matches=3, cutoff=2;
What cutoff does by design is that it stops the query evaluation once a given number of results was matched. And with GROUP BY our results are groups, not matches themselves. And remember that “private” is boolean, so when we’ve seen 2 groups, we know for a fact that we’ve seen them all. So this query will either terminate early the very moment both groups are found, or work exactly as the previous one in case one of the groups is missing (to ensure that it is, actually, missing). No harm there.
A quick log analysis and a test that we’ve done with a client revealed that 85% of such queries with a “GROUP BY private” clause are not only subject to this optimization, but that the processing time on those drops from ~56 msec/query on average down to ~0-1 msec/query. That’s a nice saving, especially as auxiliary queries go.
| « August 1, 2012. Sphinx 2.0.5-release, Unleashed! | August 14, 2012. Indexing tips & tricks » |

