Character set encoding type. Optional, default is 'sbcs'. Known values are 'sbcs' and 'utf-8'.
Different encodings map their internal character codes to byte sequences in different ways. The two most common methods in use today are single-byte encoding and UTF-8. Their corresponding charset_type values are 'sbcs' (which stands for Single Byte Character Set) and 'utf-8'. The selected encoding type is used wherever the index is used: when indexing the data, when parsing a query against this index, when generating snippets, and so on.
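To illustrate the difference between the two methods, the following Python sketch encodes the same string once with a single-byte encoding and once with UTF-8. It uses only Python's built-in codecs and is not part of the indexer; the sample word and the choice of windows-1251 (Python codec name "cp1251") are assumptions made for the illustration.

```python
# Illustration only: compare byte lengths of one string in two encodings.
text = "привет"  # sample 6-letter Russian word, chosen for this sketch

sbcs_bytes = text.encode("cp1251")  # windows-1251: one byte per character
utf8_bytes = text.encode("utf-8")   # UTF-8: two bytes per Cyrillic letter

print(len(text), len(sbcs_bytes), len(utf8_bytes))  # prints: 6 6 12
```

The same text therefore arrives as different byte sequences depending on the encoding, which is why charset_type has to match the encoding of the data and of the queries.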
Note that while 'utf-8' implies that the decoded values must be treated as Unicode code point numbers, 'sbcs' covers a whole family of encodings that may treat the same byte value differently, and that must be properly reflected in your charset_table settings. For example, the same byte value of 224 (0xE0 hex) maps to different Russian letters depending on whether the koi8-r or windows-1251 encoding is used.
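The byte 224 example can be checked with a short Python sketch. It relies on Python's standard "koi8_r" and "cp1251" codecs (the latter being windows-1251) and only demonstrates the ambiguity; it does not show how charset_table itself should be written.

```python
# Illustration only: decode the single byte 0xE0 under two sbcs encodings.
raw = bytes([0xE0])  # byte value 224

koi8 = raw.decode("koi8_r")    # KOI8-R interpretation
cp1251 = raw.decode("cp1251")  # windows-1251 interpretation

print(koi8, hex(ord(koi8)))      # prints: Ю 0x42e
print(cp1251, hex(ord(cp1251)))  # prints: а 0x430
```

Under koi8-r the byte is the capital letter Ю (U+042E), while under windows-1251 it is the lowercase letter а (U+0430), so a charset_table written for one encoding would treat data in the other incorrectly.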
charset_type = utf-8