Elasticsearch Analysis Synonym


Elasticsearch Analysis Synonym Plugin provides NGramSynonymTokenizer. For more details, see LUCENE-5252.


Versions in Maven Repository


To report a problem or ask a question, please file an issue. (A Japanese forum is also available.)


For 5.x

$ $ES_HOME/bin/elasticsearch-plugin install org.codelibs:elasticsearch-analysis-synonym:5.3.0

For 2.x

$ $ES_HOME/bin/plugin install org.codelibs/elasticsearch-analysis-synonym/2.4.0

Getting Started

Create synonym.txt File

First, create a synonym dictionary file, synonym.txt, in $ES_CONF (e.g. /etc/elasticsearch). The following content is just a sample.

$ cat /etc/elasticsearch/synonym.txt
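As a minimal sketch of such a dictionary, the snippet below writes a one-line synonym.txt and prints it. The entry itself is a hypothetical example, and the comma-separated, Solr-style line format is an assumption rather than something this README confirms. ES_CONF falls back to a temporary directory here so the snippet can be tried safely; point it at your real config directory (e.g. /etc/elasticsearch) when using it for real.

```shell
# ES_CONF: config directory; defaults to a throwaway temp directory for this sketch
ES_CONF="${ES_CONF:-$(mktemp -d)}"

# Hypothetical sample entry: terms on the same line are assumed to be synonyms
cat > "$ES_CONF/synonym.txt" <<'EOF'
あ,かき
EOF

# Show what was written
cat "$ES_CONF/synonym.txt"
```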

Create Index

NGramSynonymTokenizer is registered as the "ngram_synonym" tokenizer type. Create an index that uses "ngram_synonym" as follows:

$ curl -XPUT "localhost:9200/sample?pretty" -d '
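A request body for this step might look like the sketch below. It is an illustration under stated assumptions, not the plugin's canonical configuration: the tokenizer name my_ngram_synonym, the analyzer name my_synonym_analyzer, and the "text" field type (Elasticsearch 5.x style) are all hypothetical choices. Only the "ngram_synonym" type, the synonyms_path setting, the item type and the msg field come from this README. ES_URL is a hypothetical variable; the request is sent only when it is set.

```shell
# Write the index definition to a temporary file
INDEX_BODY="$(mktemp)"
cat > "$INDEX_BODY" <<'EOF'
{
  "settings": {
    "index": {
      "analysis": {
        "tokenizer": {
          "my_ngram_synonym": {
            "type": "ngram_synonym",
            "synonyms_path": "synonym.txt"
          }
        },
        "analyzer": {
          "my_synonym_analyzer": {
            "type": "custom",
            "tokenizer": "my_ngram_synonym"
          }
        }
      }
    }
  },
  "mappings": {
    "item": {
      "properties": {
        "msg": {
          "type": "text",
          "analyzer": "my_synonym_analyzer"
        }
      }
    }
  }
}
EOF

# Send the request only when ES_URL (hypothetical, e.g. localhost:9200) is set
if [ -n "${ES_URL:-}" ]; then
  curl -XPUT "$ES_URL/sample?pretty" -H 'Content-Type: application/json' --data-binary @"$INDEX_BODY"
fi
```

Keeping the body in a file and passing it with --data-binary avoids shell quoting problems with multi-line JSON.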

Then insert data:

$ curl -XPOST localhost:9200/sample/item/1 -d '
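As a sketch of the document body for this step, the snippet below prepares a single document with a msg field. The field value is a hypothetical example, not necessarily the data the author used. ES_URL is again a hypothetical variable that gates the actual request.

```shell
# Write a hypothetical sample document to a temporary file
DOC_BODY="$(mktemp)"
cat > "$DOC_BODY" <<'EOF'
{
  "msg": "あいうえお"
}
EOF

# Index it as sample/item/1 only when ES_URL (hypothetical, e.g. localhost:9200) is set
if [ -n "${ES_URL:-}" ]; then
  curl -XPOST "$ES_URL/sample/item/1" -H 'Content-Type: application/json' --data-binary @"$DOC_BODY"
fi
```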

Check Search Results

Try searching...

$ curl -XPOST "http://localhost:9200/sample/_search" -d '
{
   "query": {
      "match_phrase": {
         "msg": "あ"
      }
   }
}'

$ curl -XPOST "http://localhost:9200/sample/_search" -d '
{
   "query": {
      "match_phrase": {
         "msg": "あい"
      }
   }
}'

$ curl -XPOST "http://localhost:9200/sample/_search" -d '
{
   "query": {
      "match_phrase": {
         "msg": "かき"
      }
   }
}'

$ curl -XPOST "http://localhost:9200/sample/_search" -d '
{
   "query": {
      "match_phrase": {
         "msg": "かきい"
      }
   }
}'

Reload synonyms_path File Dynamically

When the "dynamic_reload" property is set to true, NGramSynonymTokenizer reloads the synonyms_path file on the fly (the reload actually happens when the reset() method is called). To change how often the file timestamp is checked, set "reload_interval".

$ curl -XPUT "localhost:9200/sample?pretty" -d '
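A tokenizer definition with dynamic reloading enabled might look like the sketch below. The property names dynamic_reload, reload_interval and synonyms_path come from the text above. The "10s" value and its time-string format are assumptions, as are the tokenizer and analyzer names and the ES_URL variable.

```shell
# Write index settings with dynamic reload enabled to a temporary file
RELOAD_BODY="$(mktemp)"
cat > "$RELOAD_BODY" <<'EOF'
{
  "settings": {
    "index": {
      "analysis": {
        "tokenizer": {
          "my_ngram_synonym": {
            "type": "ngram_synonym",
            "synonyms_path": "synonym.txt",
            "dynamic_reload": true,
            "reload_interval": "10s"
          }
        },
        "analyzer": {
          "my_synonym_analyzer": {
            "type": "custom",
            "tokenizer": "my_ngram_synonym"
          }
        }
      }
    }
  }
}
EOF

# Create the index only when ES_URL (hypothetical, e.g. localhost:9200) is set
if [ -n "${ES_URL:-}" ]; then
  curl -XPUT "$ES_URL/sample?pretty" -H 'Content-Type: application/json' --data-binary @"$RELOAD_BODY"
fi
```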