Archive for October, 2011

Sphinx performance – know your queries time

Thursday, October 27th, 2011

As you might know, the Sphinx team focuses not only on full-text search improvements, like the blended characters support we introduced in 2.0.1-beta, but also on performance. And one of the main performance questions is how to measure the speed of a single query, especially in a scalable, distributed environment.

dist_threads, the New Right Way to use many cores

Wednesday, October 19th, 2011

One application of distributed indexes in Sphinx is parallelizing queries across many CPU cores, even when running on a single server. There’s a well-known trick: have an agent line (or three) pointing to the very same master searchd instance. The only problem with that approach is that every query entails a bunch of one-off TCP connections, extra forks, and other redundant internal work. That is okay when you’re serving a few heavy queries, but when you’re serving many quick ones, that overhead might burn over 50% of your CPU in system time.

Now that’s a problem, but starting with 1.10-beta there is a solution: the dist_threads directive. So if you’re still doing that agent=localhost trick, and suffering from TCP stack pressure and/or seeing way too much system time in top(1) or vmstat(8), do read on; this post is for you. (As a side note, if you’re still on anything pre-2.0.1, you should seriously consider upgrading, too.)
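
To make the contrast concrete, here is a minimal configuration sketch. It assumes four hypothetical local indexes named chunk1 through chunk4 and the default searchd port 9312; none of these names come from the post itself, so adjust them to your own setup.

```
# Old trick (shown for contrast): every "agent" line makes searchd
# open a TCP connection to itself for each query.
#
# index dist_old
# {
#     type  = distributed
#     local = chunk1
#     agent = localhost:9312:chunk2
#     agent = localhost:9312:chunk3
#     agent = localhost:9312:chunk4
# }

# With dist_threads: list every chunk as "local" and let searchd
# search them in parallel threads within a single request.
index dist
{
    type  = distributed
    local = chunk1      # hypothetical index names
    local = chunk2
    local = chunk3
    local = chunk4
}

searchd
{
    # ... your existing listen, log, pid_file settings ...

    # Max worker threads per parallelizable request.
    # 0 (the default) disables in-request parallelism.
    dist_threads = 4
}
```

A reasonable starting point is to set dist_threads close to the number of CPU cores you want a single query to use, then tune it against your actual query mix.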