News
Downloads
Services
Community
Partners
About

SELECT
    select_expr [, select_expr ...]
    FROM index [, index2 ...]
    [WHERE where_condition]
    [GROUP BY {col_name | expr_alias}]
    [ORDER BY {col_name | expr_alias} {ASC | DESC} [, ...]]
    [WITHIN GROUP ORDER BY {col_name | expr_alias} {ASC | DESC}]
    [LIMIT offset, row_count]
    [OPTION opt_name = opt_value [, ...]]

SELECT statement was introduced in version 0.9.9-rc2. It's syntax is based upon regular SQL but adds several Sphinx-specific extensions and has a few omissions (such as (currently) missing support for JOINs). Specifically,

  • Column list clause. Column names, arbitrary expressions, and star ('*') are all allowed (ie. SELECT @id, group_id*123+456 AS expr1 FROM test1 will work). Unlike in regular SQL, all computed expressions must be aliased with a valid identifier. Starting with version 2.0.1-beta, AS is optional. Special names such as @id and @weight should currently be used with leading at-sign. This at-sign requirement will be lifted in the future.

  • FROM clause. FROM clause should contain the list of indexes to search through. Unlike in regular SQL, comma means enumeration of full-text indexes as in Query() API call rather than JOIN.

  • WHERE clause. This clause will map both to fulltext query and filters. Comparison operators (=, !=, <, >, <=, >=), IN, AND, NOT, and BETWEEN are all supported and map directly to filters. OR is not supported yet but will be in the future. MATCH('query') is supported and maps to fulltext query. Query will be interpreted according to full-text query language rules. There must be at most one MATCH() in the clause. Starting with version 2.0.1-beta, {col_name | expr_alias} [NOT] IN @uservar condition syntax is supported. (Refer to Section 7.7, “SET syntax” for a discussion of global user variables.)

  • GROUP BY clause. Currently only supports grouping by a single column. The column however can be a computed expression:

    SELECT *, group_id*1000+article_type AS gkey FROM example GROUP BY gkey
    

    Aggregate functions (AVG(), MIN(), MAX(), SUM()) in column list clause are supported. Arguments to aggregate functions can be either plain attributes or arbitrary expressions. COUNT(*) is implicitly supported as using GROUP BY will add @count column to result set. Explicit support might be added in the future. COUNT(DISTINCT attr) is supported. Currently there can be at most one COUNT(DISTINCT) per query and an argument needs to be an attribute. Both current restrictions on COUNT(DISTINCT) might be lifted in the future.

    SELECT *, AVG(price) AS avgprice, COUNT(DISTINCT storeid)
    FROM products
    WHERE MATCH('ipod')
    GROUP BY vendorid
    

    Starting with 2.0.1-beta, GROUP BY on a string attribute is supported, with respect for current collation (see Section 5.12, “Collations”).

  • WITHIN GROUP ORDER BY clause. This is a Sphinx specific extension that lets you control how the best row within a group will to be selected. The syntax matches that of regular ORDER BY clause:

    SELECT *, INTERVAL(posted,NOW()-7*86400,NOW()-86400) AS timeseg
    FROM example WHERE MATCH('my search query')
    GROUP BY siteid
    WITHIN GROUP ORDER BY @weight DESC
    ORDER BY timeseg DESC, @weight DESC
    

    Starting with 2.0.1-beta, WITHIN GROUP ORDER BY on a string attribute is supported, with respect for current collation (see Section 5.12, “Collations”).

  • ORDER BY clause. Unlike in regular SQL, only column names (not expressions) are allowed and explicit ASC and DESC are required. The columns however can be computed expressions:

    SELECT *, @weight*10+docboost AS skey FROM example ORDER BY skey
    

    Starting with 2.0.1-beta, ORDER BY on a string attribute is supported, with respect for current collation (see Section 5.12, “Collations”).

    Starting with 2.0.2-beta, ORDER BY RAND() syntax is supported. Note that this syntax is actually going to randomize the weight values and then order matches by those randomized weights.

  • LIMIT clause. Both LIMIT N and LIMIT M,N forms are supported. Unlike in regular SQL (but like in Sphinx API), an implicit LIMIT 0,20 is present by default.

  • OPTION clause. This is a Sphinx specific extension that lets you control a number of per-query options. The syntax is:

    OPTION <optionname>=<value> [ , ... ]
    

    Supported options and respectively allowed values are:

    • 'ranker' - any of 'proximity_bm25', 'bm25', 'none', 'wordcount', 'proximity', 'matchany', or 'fieldmask'

    • 'max_matches' - integer (per-query max matches value)

    • 'cutoff' - integer (max found matches threshold)

    • 'max_query_time' - integer (max search time threshold, msec)

    • 'retry_count' - integer (distributed retries count)

    • 'retry_delay' - integer (distributed retry delay, msec)

    • 'field_weights' - a named integer list (per-field user weights for ranking)

    • 'index_weights' - a named integer list (per-index user weights for ranking)

    • 'reverse_scan' - 0 or 1, lets you control the order in which full-scan query processes the rows

    • 'comment' - string, user comment that gets copied to a query log file

    Example:

    SELECT * FROM test WHERE MATCH('@title hello @body world')
    OPTION ranker=bm25, max_matches=3000,
        field_weights=(title=10, body=3)