How to index dash char in sphinx?
|
oscardb
Name: Oscar Del Ben
2009-01-20 11:23:45
Hello, I read that Sphinx treats the dash char ('-') as a word separator. If this is true, how can I override this behavior? Thanks, Oscar Del Ben
|
Arantor
Name: Pete Spicer
to: oscardb, 2009-01-20 12:06:37
> Hello, I read that Sphinx treats the dash char ('-') as a word separator. If this is
> true, how can I override this behavior? Thanks

There are three things you can do:

1. Ignore the character, so that hyphenated words get de-hyphenated (e.g. blu-ray becomes bluray): just add it to the ignore_chars directive. The standard hyphen (not the special em-dashes or en-dashes) is then dropped and no longer treated as word-breaking.

2. Make it part of the list of characters that form words by adding it to charset_table. However, this will treat even a single - on its own as a word.

3. Override specific words by defining each one as an exception. (See the exceptions directive for more.)

Note that in all cases you'll need to reindex your data and, if you're not using 0.9.9, restart searchd too.

Also, options 1 and 2 will disable the negation syntax (e.g. word -word2, which finds documents that contain word but not word2), but you can substitute ! instead (i.e. word !word2).
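For anyone finding this later, here is a rough sketch of how the three options might look in sphinx.conf. The index name, source name, and file paths are made up for illustration, and the charset_table line assumes a basic Latin-only table, so adapt it to whatever table you already use. Options 1 and 2 are alternatives, so enable only one of them.

```conf
index products                      # hypothetical index name
{
    source = products_src           # hypothetical source name
    path   = /var/data/products     # hypothetical path

    # Option 1: drop the hyphen entirely, so "blu-ray" is indexed as "bluray"
    ignore_chars = U+2D

    # Option 2 (use instead of option 1): treat the hyphen as a word character,
    # so "blu-ray" is indexed as the single word "blu-ray"
    # charset_table = 0..9, A..Z->a..z, _, a..z, U+2D

    # Option 3: keep the default tokenizing, but map specific words through
    # an exceptions file (one "map-from => map-to" rule per line,
    # e.g. "blu-ray => blu-ray")
    # exceptions = /var/data/exceptions.txt
}
```

With option 1 or 2 enabled, a query that used to be written as word -word2 would be written as word !word2.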
|
oscardb
Name: Oscar Del Ben
to: Arantor, 2009-01-20 12:50:46
Thank you, very exhaustive and helpful.

> There are three things you can do:
>
> 1. You can ignore the character (so hyphenated words get de-hyphenated, e.g. blu-ray
> becomes bluray) - just add it to the ignore_chars directive. This will then drop, and not
> treat as word-breaking, any standard hyphen (not special em-dashes or en-dashes)
>
> 2. You can make it part of the list of characters that are words - add it to
> charset_table - however this will treat even a single - as a word.
>
> 3. Specific words can be overridden by defining each word as an exception. (See the
> exceptions directive for more)
>
> Note that in all cases, you'll need to reindex your data and additionally if you're not
> using 0.9.9, you'll need to restart searchd too.
>
> Also, options 1 and 2 will disable the negation syntax (e.g. word -word2 where it would
> find documents that contain word but not word2) but you can substitute ! instead (i.e.
> word !word2)
|
rmarscher
Name: Rob Marscher
to: oscardb, 2009-03-18 20:49:00
> Thank you, very exhaustive and helpful.

+1 I needed to figure this out too. Thanks!