Oct 15, 2013. Sphinx 2.1.2 is now available

We are happy to announce the stable release of the Sphinx 2.1 series. Release 2.1.2 is not only a maintenance release; it also adds several features. In this post, we’re going to walk you through the highlights.

New features

FLUSH RAMCHUNK command

This command forces the RT memory chunk to be flushed to disk, even if the rt_mem_limit has not been reached. This is useful for debugging and testing. Note that abuse of this command can lead to fragmentation of the index.

mysql> FLUSH RAMCHUNK rt;
Query OK, 0 rows affected (0.05 sec)

SHOW PLAN command

This command displays the execution plan for fulltext matching. It is useful for debugging fulltext matching. To use this command you need to enable profiling first.

mysql> SET profiling=1 \G
Query OK, 0 rows affected (0.00 sec)
 
mysql> SELECT id FROM myindex WHERE MATCH('the i') LIMIT 1 \G
*************************** 1. row ***************************
id: 39815
1 row in set (1.53 sec)
 
mysql> SHOW PLAN \G
*************************** 1. row ***************************
Variable: transformed_tree
   Value: AND(
  AND(KEYWORD(the, querypos=1)),
  AND(KEYWORD(i, querypos=2)))
1 row in set (0.00 sec)

Multiple grouping

It is now possible to perform grouping on multiple columns or computed expressions:

mysql> SELECT * FROM myindex GROUP BY attr1,attr2;
mysql> SELECT *,attr3+10*attr4 as myexpr FROM myindex GROUP BY attr1,attr2,myexpr;
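To make the semantics concrete, here is a minimal Python sketch of grouping on a composite key. It is only an illustration, not how searchd implements it: the rows and the group_by helper are made up, and it simply keeps the first row seen for each group, whereas Sphinx picks the representative row according to the query's ordering.

```python
def group_by(rows, key_funcs):
    """Keep the first row seen for each distinct composite key (a simplification)."""
    groups = {}
    for row in rows:
        # The group key is the tuple of all grouping columns/expressions.
        key = tuple(f(row) for f in key_funcs)
        groups.setdefault(key, row)
    return list(groups.values())

# Hypothetical rows; the attribute names mirror the queries above.
rows = [
    {"id": 1, "attr1": 1, "attr2": 10, "attr3": 5, "attr4": 0},
    {"id": 2, "attr1": 1, "attr2": 10, "attr3": 5, "attr4": 0},
    {"id": 3, "attr1": 1, "attr2": 10, "attr3": 5, "attr4": 1},
    {"id": 4, "attr1": 2, "attr2": 10, "attr3": 5, "attr4": 0},
]

attr1 = lambda r: r["attr1"]
attr2 = lambda r: r["attr2"]
myexpr = lambda r: r["attr3"] + 10 * r["attr4"]

# GROUP BY attr1,attr2
by_two = group_by(rows, [attr1, attr2])
# GROUP BY attr1,attr2,myexpr
by_three = group_by(rows, [attr1, attr2, myexpr])

print([r["id"] for r in by_two])
print([r["id"] for r in by_three])
```

Rows 1 and 3 share attr1 and attr2 but differ in myexpr, so they fall into the same group in the first query and into separate groups in the second.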

New indextool options

--fold INDEXNAME OPTFILE – similar to CALL KEYWORDS. It tokenizes the text in OPTFILE, or standard input if no file is given, and outputs the resulting words.

-q (--quiet) – makes the output more concise and removes the banner. This option should be useful to people who parse the output of indextool in their scripts.
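For scripted use, the two options might be combined roughly as in the Python sketch below. The helper names (fold_argv, fold) are hypothetical, and the sketch assumes that indextool is on the PATH, that it can find its configuration file on its own, and that -q may be placed before --fold; check these against your installation.

```python
import subprocess

def fold_argv(index_name, opt_file=None, quiet=True):
    """Build the indextool command line for --fold (argument order is an assumption)."""
    argv = ["indextool"]
    if quiet:
        argv.append("-q")  # drop the banner so the output is easier to parse
    argv += ["--fold", index_name]
    if opt_file is not None:
        argv.append(opt_file)  # without a file, indextool reads standard input
    return argv

def fold(index_name, text):
    """Feed text on standard input and return indextool's raw output."""
    result = subprocess.run(
        fold_argv(index_name),
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout

print(fold_argv("myindex", "words.txt"))
```

The sketch returns the output unparsed, since its exact format is not described here.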

rt_attr_bool

RT indexes were missing the boolean attribute type. Now, they aren’t. Usage is the same as sql_attr_bool.

LENGTH() for MVA

Returns the number of elements in an MVA set.

mysql> select *,length(mva) from rt_test22;
+------+--------+-------------+
| id   | mva    | length(mva) |
+------+--------+-------------+
|    1 | 34,121 |           2 |
|    4 | 10     |           1 |
|    5 |        |           0 |
+------+--------+-------------+
3 rows in set (0.01 sec)

BM25F in SELECT

BM25F() can be used in SELECT statements when the expression ranker is used. It takes the same parameters as the bm25f ranker function.

mysql> select id,bm25f(2.0,0.75,{title=200,body=10}),weight() from blog where match('sql query')  OPTION ranker=expr('sum(lcs*user_weight)');
+-------+-------------------------------------+----------+
| id    | bm25f(2.0,0.75,{title=200,body=10}) | weight() |
+-------+-------------------------------------+----------+
| 20042 |                            0.645800 |        2 |
| 20033 |                            0.671464 |        1 |
| 20037 |                            0.651077 |        1 |
| 20045 |                            0.643463 |        1 |
| 20055 |                            0.640395 |        1 |
+-------+-------------------------------------+----------+
5 rows in set (0.00 sec)

Optimizations

  • SELECT and UPDATE commands are now up to 3.5x faster on indexes with many attributes (250+)
  • SELECT commands on JSON attributes are 5-20% faster
  • xmlpipe2 indexing is up to 9x faster on some schemas

Deprecations

str2ordinal is deprecated because it has known issues with sorting; string attributes are the proper replacement.

str2wordcount is deprecated as index_field_lengths creates attributes with the counts of words for a field.

We discourage using these attributes as they will be removed in future versions.
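One of the sorting issues with str2ordinal is that ordinals are local to each index, so they cannot be merged across indexes while keeping the proper order. The Python sketch below illustrates this with made-up data; assign_ordinals is a hypothetical helper that only approximates what the indexer does (sort the distinct strings and replace each with its rank).

```python
def assign_ordinals(values):
    """Map each string to its 1-based rank among the distinct sorted values."""
    ranks = {s: i for i, s in enumerate(sorted(set(values)), start=1)}
    return [ranks[v] for v in values]

# Two hypothetical indexes, each assigning ordinals independently.
index_a = ["apple", "zebra"]
index_b = ["banana", "cherry"]
ord_a = assign_ordinals(index_a)
ord_b = assign_ordinals(index_b)

# Merging results from both indexes and sorting by ordinal alone.
pairs = list(zip(ord_a + ord_b, index_a + index_b))
merged_by_ordinal = [name for _, name in sorted(pairs, key=lambda p: p[0])]

print(merged_by_ordinal)               # order implied by the ordinals
print(sorted(index_a + index_b))       # true alphabetical order
```

Here "zebra" and "cherry" both get ordinal 2 in their own index, so the merged order puts "zebra" before "cherry". Sorting on a string attribute compares the actual values and does not have this problem.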

Fixed bugs

Over 40 bugs have been fixed, ranging from lemmatizer issues and agent balancing problems to the DSN line for MSSQL sources, crashes, and memory leaks. Here are some of the notable fixes:

  • #1628 – GROUP_CONCAT() and GROUPBY() now work for distributed indexes
  • #1460 – aggregate functions now work over JSON elements with type conversion
  • #1384 – mssql sources can now use odbc_dsn parameter as well
  • #1399 – incorrect error message when filtering on string attributes
  • #1485 – index_exact_words was not automatically enabled for RT indexes with infixes and morphology
  • #1508 – distributed index query taking too long

We recommend that anyone using 2.1.1-beta or a development version of the 2.1 series upgrade to the 2.1.2 stable release.

Thank you for reading! As usual, if you have any questions, please feel free to ask in the comments below.

Happy Searching!



3 Responses to “Sphinx 2.1.2 is now available”

  1. And when will the next beta be out ?
    With this change :
    http://sphinxsearch.com/bugs/view.php?id=1413

  2. - says:

    I would also like to know when the following will be released please:
    http://sphinxsearch.com/bugs/view.php?id=1413

  3. Laxman says:

    Thank you very much for the stable version of 2.1.1-beta
