Compound Word Token Filter
Token filters that decompose compound words into their parts. Two types are available: dictionary_decompounder and hyphenation_decompounder.
The following settings can be set for a compound word token filter type:
Setting | Description |
---|---|
word_list | A list of words to use. |
word_list_path | A path (either relative to the config location, or absolute) to a file containing a list of words. |
Here is an example:
index :
    analysis :
        analyzer :
            myAnalyzer2 :
                type : custom
                tokenizer : standard
                filter : [myTokenFilter1, myTokenFilter2]
        filter :
            myTokenFilter1 :
                type : dictionary_decompounder
                word_list: [one, two, three]
            myTokenFilter2 :
                type : hyphenation_decompounder
                word_list_path: path/to/words.txt
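
To give a rough idea of what a dictionary-based decompounder does with a word list, here is a minimal Python sketch. It is not the filter's actual implementation: the function name `decompound` and the `min_subword_size` / `max_subword_size` parameters (and their values) are assumptions made for illustration. The sketch keeps the original token and appends every substring that appears in the word list.

```python
def decompound(token, word_list, min_subword_size=2, max_subword_size=15):
    """Return the original token followed by every dictionary word found inside it.

    Illustrative sketch only; the size limits are assumed values,
    not settings documented on this page.
    """
    dictionary = {w.lower() for w in word_list}
    lowered = token.lower()
    parts = [token]  # the original token is always kept

    # Try every start position and every allowed substring length.
    for start in range(len(lowered)):
        longest = min(max_subword_size, len(lowered) - start)
        for size in range(min_subword_size, longest + 1):
            candidate = lowered[start:start + size]
            if candidate in dictionary:
                parts.append(candidate)
    return parts


print(decompound("onetwothree", ["one", "two", "three"]))
# → ['onetwothree', 'one', 'two', 'three']
```

A token that contains none of the listed words passes through unchanged, which is why a word list that matches the indexed language matters.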