RT Performance Degradation Cutoff
|
cbeans
Name: Chris |
2011-06-24 20:37:20
I've recently been running some performance comparisons between Sphinx and Lucene, indexing collections of email inboxes. In the smaller tests, Sphinx was winning across the board and was much faster at processing and indexing the input, but I did notice that its generated indexes were about three times larger than Lucene's. Granted, Lucene has a lot more support for trimming, cropping, and sorting indexes down to just the information you want, but once these indexes began to exceed my 40 GB of RAM, I started to (perhaps predictably) notice some serious performance degradation.

I've seen analysis elsewhere showing cutoffs for RT performance against, say, plain indexes, and I was curious whether it is primarily the rate of growth of the indexes that limits their performance. Has anyone else experienced something similar, or have any advice on how to trim down the indexes to squeeze out better performance?
|
Tomat
Name: Stas Klinov |
to: cbeans, 2011-06-26 17:42:52
> I've recently been running some performance comparisons between using Sphinx and Lucene
> indexing collections of email inboxes. In the smaller tests, Sphinx was ... the rate of growth of the indexes that limits their performance. Has anyone else experienced something similar, or have any advice for how to trim down the indexes to squeeze out better performance?

It's hard to say anything without actual numbers, config, etc. You may not have set rt_mem_limit, which defaults to 32 MB, so pushing a lot of data into such an index gives you many disk chunks, and your performance could degrade greatly.
|
cbeans
Name: Chris |
to: Tomat, 2011-06-27 18:41:13
> Its hard to say something without actual numbers, config etc as you could not set
> rt_mem_limit and it by default is 32 mb so pushing a lot of data to such index gives you
> many disk chunks and you performance could degrade greatly.

As it stands, I've got the indexes broken into 64 partitions, corresponding to the partitions of the mailboxes as they are split across separate MySQL databases. rt_mem_limit is set to 2 GB, so we shouldn't have too many disk chunks. Additionally, how does the size of the indexes when dumped to disk compare with their size when the daemon is using them? Will this 70 GB get reduced at all?
|
Tomat
Name: Stas Klinov |
to: cbeans, 2011-06-28 07:05:34
> As it stands, I've got the indexes broken into 64 partitions, corresponding to the
> partitions of the Mailboxes as they are broken up into separate MySQL databases.
> rt_mem_limit is set to 2gb, so we shouldn't have all too many disk chunks. Additionally,
> how does the size of the indexes when dumped out compare against their size when the
> daemon is using them? Will this 70gb get reduced at all?

Could you provide the disk chunk count for your indexes?
|
cbeans
Name: Chris |
to: Tomat, 2011-06-28 18:52:22
> Could you provide disk chunks count for your indexes?

I can indeed, but first I want to make sure that I'm relaying the information accurately. Is it the case that each index gets an associated .ram file to use as the dump whenever searchd halts, and that further chunks, each a set of files with a trailing .i.sp* for increasing i, are allocated only once this .ram file is about to exceed the chunk size?

If I understand things correctly, then we have the 64 .ram files after the dump, 12 sets of .0.sp* files, 4 sets of .1.sp* files, and 4 sets of .2.sp* files. That would mean we've got 84 chunks.
|
Tomat
Name: Stas Klinov |
to: cbeans, 2011-06-28 20:57:14
> dump whenever searchd halts, and that only once this .ram file is going to exceed the
> chunk size are further chunks allocated with a set of files with a trailing .i.sp* for
> increasing i's? If I understand things correctly, then we have the 64 ram files after the
> dump, 12 sets of .0.sp* files, 4 sets of .1.sp* files, and 4 sets of .2.sp* files. If I
> understand things correctly, that means we've got 84 chunks.

Yes, as explained here: http://sphinxsearch.com/docs/current.html#rt-internals

RT dumps a plain index as a disk chunk when the RAM chunk outgrows rt_mem_limit.
So you have 4 RT indexes with 4 to 12 disk chunks per index.
The daemon searches the disk chunks and then the RAM chunk of each index.

That is why it could be slow if you issue a query to all indexes: in that case the daemon has to search over 64 (RAM) * (4 to 12) (plain) indexes.
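To put rough numbers on that estimate, here is a small Python sketch. It only assumes what is said above: each RT index is searched as its disk chunks plus one RAM chunk. The function name `searches_per_query` is made up for illustration and is not part of Sphinx.

```python
def searches_per_query(disk_chunks_per_index):
    """Count the chunk-level searches for a query that hits every RT index.

    Assumption: each RT index contributes all of its disk chunks
    plus exactly one RAM chunk.
    """
    return sum(disk_chunks + 1 for disk_chunks in disk_chunks_per_index)


# 64 RT indexes, using the 4 to 12 disk chunks per index quoted above.
low = searches_per_query([4] * 64)
high = searches_per_query([12] * 64)
print(f"{low} to {high} chunk searches per query across all indexes")
```

The point is only that the cost of a query to all indexes grows with the total number of chunks, not with the number of RT indexes alone.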
|
Tomat
Name: Stas Klinov |
to: Tomat, 2011-06-28 20:58:30
> RT dumps plain index as disk chunk on ram chunk overgrow rt_mem_limit
> So you have 4 RT index with 4 to 12 disk chunks per index.

I mistyped: not 4 RT indexes but 64 RT indexes.
|
Tomat
Name: Stas Klinov |
to: cbeans, 2011-06-28 21:01:26
> increasing i's? If I understand things correctly, then we have the 64 ram files after the
> dump, 12 sets of .0.sp* files, 4 sets of .1.sp* files, and 4 sets of .2.sp*

It would be better to count RT chunks per index rather than by the .0.*, .1.* suffixes, for example:

rt_index1.ram
rt_index1.0.*
rt_index1.1.*
rt_index1.2.*
rt_index1.3.*

That gives you one RT index with 4 disk chunks (plain indexes).
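A minimal Python sketch of that per-index counting, in case it helps. It assumes the file naming shown above (`<index>.<N>.sp?` for disk chunk files, `<index>.ram` for the RAM chunk); the function name `count_disk_chunks` and the sample file names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed naming: <index>.<chunk number>.sp<letter>, e.g. rt_index1.0.spa
CHUNK_FILE = re.compile(r"^(?P<index>.+)\.(?P<chunk>\d+)\.sp[a-z]$")


def count_disk_chunks(filenames):
    """Return {index name: number of distinct disk chunks} from file names."""
    chunks = defaultdict(set)
    for name in filenames:
        match = CHUNK_FILE.match(name)
        if match:  # .ram and other non-chunk files are skipped
            chunks[match.group("index")].add(int(match.group("chunk")))
    return {index: len(ids) for index, ids in chunks.items()}


files = [
    "rt_index1.ram",
    "rt_index1.0.spa", "rt_index1.0.spd",
    "rt_index1.1.spa",
    "rt_index1.2.spi",
    "rt_index1.3.spd",
    "rt_index2.ram",
    "rt_index2.0.spa",
]
print(count_disk_chunks(files))
```

In practice you would feed it something like `os.listdir()` on the data directory instead of the hard-coded list.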