Kind of Chinese Analysis for Elasticsearch

Requirements

- Java 7 update 55 or later

Structure of es-ik

How to use es-ik

Actually, ik-analysis-es-plugin exposes an interface, DictionaryConfiguration, as a kind of SPI. es-ik-sqlite3 implements it so that ik-analysis-es-plugin can read the dictionary's content from SQLite. In other words, you can write your own implementation, for example one that persists the dictionary's content in Redis.

SPI is just a concept. In Java, I use ServiceLoader to implement it. As long as your implementation follows ServiceLoader's conventions, you get a new plugin for ik-analysis-es-plugin without changing the ik-analysis-es-plugin module. :P
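
To make the ServiceLoader idea concrete, here is a minimal sketch. It is not the plugin's actual code: the `getMainDictionary` method, the `InMemoryDictionaryConfiguration` class and the `DictionaryConfigurationLoader` helper are hypothetical names chosen for illustration, and the real DictionaryConfiguration interface may declare different methods.

```java
import java.util.Arrays;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical stand-in for the plugin's SPI; the real DictionaryConfiguration
// interface may declare different methods.
interface DictionaryConfiguration {
    List<String> getMainDictionary();
}

// Hypothetical implementation that keeps its terms in memory. A provider meant
// to be discovered by ServiceLoader would normally be a public class in its own
// file, with a public no-argument constructor.
class InMemoryDictionaryConfiguration implements DictionaryConfiguration {
    @Override
    public List<String> getMainDictionary() {
        return Arrays.asList("林夕", "作词");
    }
}

class DictionaryConfigurationLoader {
    // Returns the first provider that ServiceLoader finds on the classpath,
    // or the in-memory fallback when no provider is registered.
    static DictionaryConfiguration load() {
        for (DictionaryConfiguration candidate : ServiceLoader.load(DictionaryConfiguration.class)) {
            return candidate;
        }
        return new InMemoryDictionaryConfiguration();
    }
}
```

ServiceLoader discovers a provider through a text file on the classpath named `META-INF/services/` followed by the fully qualified name of the interface; the file lists the fully qualified name of each implementation class, one per line. Without such a file, the sketch above falls back to the in-memory implementation.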

How to use es-ik-sqlite3 (current version: 1.0.1)

  1. create songs index

    curl -X PUT -H "Cache-Control: no-cache" -d '{
        "settings":{
            "index":{
                "number_of_shards":1,
                "number_of_replicas": 1
            }
        }
    }' 'http://localhost:9200/songs/'
  2. create the mapping for songs/song

    curl -X PUT -H "Cache-Control: no-cache" -d '{
            "song": {
                "_source": {"enabled": true},
                "_all": {
                    "indexAnalyzer": "ik_analysis",
                    "searchAnalyzer": "ik_analysis",
                    "term_vector": "no",
                    "store": "true"
                },
                "properties":{
                    "title":{
                        "type": "string",
                        "store": "yes",
                        "indexAnalyzer": "ik_analysis",
                        "searchAnalyzer": "ik_analysis",
                        "include_in_all": "true"
                    }
                }
            }
    }' 'http://localhost:9200/songs/_mapping/song'
  3. test it

    curl -X POST  -d '林夕为我们作词' 'http://localhost:9200/songs/_analyze?analyzer=ik_analysis'
    
    response:
    {"tokens":[{"token":"林夕","start_offset":0,"end_offset":2,"type":"CN_WORD","position":1},{"token":"作词","start_offset":5,"end_offset":7,"type":"CN_WORD","position":2}]}

Create an empty sqlite3 db for es-ik-sqlite3

  1. create database

    sqlite3 dictionary.db
  2. create tables

    CREATE TABLE main_dictionary(term TEXT NOT NULL,unique(term));
    CREATE TABLE quantifier_dictionary(term TEXT NOT NULL,unique(term));
    CREATE TABLE stopword_dictionary(term TEXT NOT NULL,unique(term));

For reference, 617052 records ~= 30MB db file.
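
The tables above start out empty. One possible way to fill them is to generate INSERT statements and pipe them into sqlite3. The sketch below assumes the `main_dictionary` schema shown above; the `DictionarySqlGenerator` class and the sample terms are hypothetical and not part of es-ik-sqlite3.

```java
import java.util.Arrays;
import java.util.List;

class DictionarySqlGenerator {
    // Builds one INSERT statement for a term. Single quotes are doubled so the
    // term stays a valid SQL string literal, and OR IGNORE skips terms that
    // would violate the table's unique(term) constraint.
    static String insertStatement(String table, String term) {
        String escaped = term.replace("'", "''");
        return "INSERT OR IGNORE INTO " + table + "(term) VALUES ('" + escaped + "');";
    }

    public static void main(String[] args) {
        // Sample terms for illustration only; a real dictionary would be read from a file.
        List<String> terms = Arrays.asList("林夕", "作词");
        // A single transaction keeps bulk inserts fast in SQLite.
        System.out.println("BEGIN TRANSACTION;");
        for (String term : terms) {
            System.out.println(insertStatement("main_dictionary", term));
        }
        System.out.println("COMMIT;");
    }
}
```

Assuming the class is compiled and the terminal uses UTF-8, the output can be loaded with `java DictionarySqlGenerator | sqlite3 dictionary.db`.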