Configuring the ik Analysis Plugin for Elasticsearch 6.5.4: Installation, Testing, and Custom Dictionaries

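The previous post gave a brief introduction to basic Elasticsearch configuration. This one covers installing and testing the ik analysis plugin, defining a custom extension dictionary, and basic usage. Hopefully it saves later readers a few detours.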

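Note: the ik analyzer version must match the Elasticsearch version. Once it is configured, you can either set ik as the default analyzer or select the ik analyzer when creating an index.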

1. The ik analyzer version must match the Elasticsearch version

My Elasticsearch version is 6.5.4, so the ik analyzer release to download is elasticsearch-analysis-ik v6.5.4.

Download the files:

wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-6.5.4.tar.gz 
wget https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.5.4/elasticsearch-analysis-ik-6.5.4.zip 
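Because the two versions have to agree, one way to keep them in step is to derive both download URLs from a single variable. The sketch below only builds and prints the URLs used in the wget commands above; the variable names (ES_VERSION, ES_TARBALL_URL, IK_ZIP_URL) are illustrative and not part of either project.

```shell
#!/bin/sh
# Single source of truth for the version; Elasticsearch and ik must match.
ES_VERSION="6.5.4"

# Same URL patterns as the wget commands above, with the version substituted.
ES_TARBALL_URL="https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-${ES_VERSION}.tar.gz"
IK_ZIP_URL="https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v${ES_VERSION}/elasticsearch-analysis-ik-${ES_VERSION}.zip"

# Print the URLs for inspection; nothing is downloaded here.
echo "$ES_TARBALL_URL"
echo "$IK_ZIP_URL"
```

To download, the two variables can be passed to wget, for example wget "$ES_TARBALL_URL" "$IK_ZIP_URL".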

Go to the Elasticsearch installation directory; the ik plugin files will be extracted under its plugins directory:

root @ localhost in /data/elasticsearch-6.5.4 [17:18:23]
$ ls -lah
total 4.8M
drwxrwxr-x   9 euser euser  198 Dec 11 11:26 .
drwxr-xr-x.  7 root  root    90 Jan 16 16:35 ..
drwxrwxr-x   3 euser euser 4.0K Dec 11 11:13 bin
drwxrwxr-x   2 euser euser  178 Dec 11 11:32 config
drwxrwxr-x   3 euser euser   19 Dec 11 11:25 data
-rwxrwxr-x   1 euser euser 4.3M Dec  6 22:30 elasticsearch-analysis-ik-6.5.4.zip
drwxrwxr-x   3 euser euser 4.0K Nov 30 08:02 lib
-rwxrwxr-x   1 euser euser  14K Nov 30 07:55 LICENSE.txt
drwxrwxrwx   2 euser euser 8.0K Feb 11 01:30 logs
drwxrwxr-x  28 euser euser 4.0K Nov 30 08:02 modules
-rwxrwxr-x   1 euser euser 395K Nov 30 08:01 NOTICE.txt
drwxrwxr-x   3 euser euser   25 Dec 11 11:29 plugins
-rwxrwxr-x   1 euser euser 8.4K Nov 30 07:55 README.textile

Inside the plugins directory, create a folder for the plugin:

mkdir analysis-ik/ 

Extract the files from the ik zip archive into the analysis-ik directory:

# root @ iZ2zedtbewsc8oa9i1cb4tZ in /data/elasticsearch-6.5.4 [18:01:37] 
$ cd plugins/

# root @ iZ2zedtbewsc8oa9i1cb4tZ in /data/elasticsearch-6.5.4/plugins [18:01:37] 
$ mkdir -p analysis-ik

# root @ iZ2zedtbewsc8oa9i1cb4tZ in /data/elasticsearch-6.5.4/plugins [18:01:37] 
$ unzip ../elasticsearch-analysis-ik-6.5.4.zip -d analysis-ik/

# root @ iZ2zedtbewsc8oa9i1cb4tZ in /data/elasticsearch-6.5.4/plugins [18:04:29] 
$ ls
analysis-ik

# root @ iZ2zedtbewsc8oa9i1cb4tZ in /data/elasticsearch-6.5.4/plugins [18:04:34] 
$ ls -l ./analysis-ik/
total 1432
-rw-r--r-- 1 root root 263965 May  6  2018 commons-codec-1.9.jar
-rw-r--r-- 1 root root  61829 May  6  2018 commons-logging-1.2.jar
drwxr-xr-x 2 root root   4096 Aug 26 17:52 config
-rw-r--r-- 1 root root  54693 Dec 23 11:26 elasticsearch-analysis-ik-6.5.4.jar
-rw-r--r-- 1 root root 736658 May  6  2018 httpclient-4.5.2.jar
-rw-r--r-- 1 root root 326724 May  6  2018 httpcore-4.4.4.jar
-rw-r--r-- 1 root root   1805 Dec 23 11:26 plugin-descriptor.properties
-rw-r--r-- 1 root root    125 Dec 23 11:26 plugin-security.policy
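The plugin-descriptor.properties file in the listing records, in its elasticsearch.version key, which Elasticsearch version the plugin was built for. A small check along the following lines can catch a mismatch before the node is started. The function name ik_matches_es is an assumption made for illustration.

```shell
#!/bin/sh
# ik_matches_es PLUGIN_DIR ES_VERSION
# Succeeds (exit 0) when the plugin descriptor in PLUGIN_DIR declares
# exactly ES_VERSION as its target Elasticsearch version.
ik_matches_es() {
  plugin_dir=$1
  es_version=$2
  descriptor="$plugin_dir/plugin-descriptor.properties"

  if [ ! -f "$descriptor" ]; then
    echo "no plugin-descriptor.properties in $plugin_dir" >&2
    return 2
  fi

  # Take the value after "elasticsearch.version=" and drop any trailing CR.
  declared=$(sed -n 's/^elasticsearch\.version=//p' "$descriptor" | tr -d '\r' | head -n 1)

  if [ "$declared" = "$es_version" ]; then
    echo "ok: plugin targets Elasticsearch $declared"
  else
    echo "mismatch: plugin targets '$declared', node is $es_version" >&2
    return 1
  fi
}
```

For the layout in this article the call would be ik_matches_es /data/elasticsearch-6.5.4/plugins/analysis-ik 6.5.4. Once the node is installed, bin/elasticsearch-plugin list also shows which plugins it has picked up.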

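To make ik the default analyzer for every index, older guides append the parameter index.analysis.analyzer.default.type: ik as the last line of the Elasticsearch configuration file config/elasticsearch.yml. This does not work on 6.5.4: since Elasticsearch 5.0, index-level settings are rejected in elasticsearch.yml and the node refuses to start, and the ik plugin for these versions registers its analyzers as ik_max_word and ik_smart rather than ik. On 6.5.4, set the default analyzer in the settings of each index (or in an index template), or skip this step and select ik per field in the mapping.

# Legacy (pre-5.0) form, kept for reference only. Do not run it against 6.5.4.
# root @ iZ2zedtbewsc8oa9i1cb4tZ in /data/elasticsearch-6.5.4 [18:33:16] 
$ cd config/

# root @ iZ2zedtbewsc8oa9i1cb4tZ in /data/elasticsearch-6.5.4/config [18:33:21] 
$ echo "index.analysis.analyzer.default.type: ik" >> elasticsearch.yml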
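Both ways of using ik on 6.x, as an index-wide default analyzer or per field in the mapping, can be expressed in the body of the index-creation request. The following is a sketch under assumptions: the index name class_ik, the field name content, and the ES_URL variable are made up for illustration. The request is only sent when ES_URL is set; otherwise the body is printed for inspection.

```shell
#!/bin/sh
# Hypothetical index name, chosen so it does not clash with other examples.
INDEX_NAME="class_ik"

# settings: ik_max_word becomes the default analyzer of this index.
# mappings: the "content" field (hypothetical) names its analyzers explicitly.
# Elasticsearch 6.x still expects a mapping type; "_doc" is used here.
BODY=$(cat <<'EOF'
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "default": { "type": "ik_max_word" }
        }
      }
    }
  },
  "mappings": {
    "_doc": {
      "properties": {
        "content": {
          "type": "text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_smart"
        }
      }
    }
  }
}
EOF
)

if [ -n "${ES_URL:-}" ]; then
  # Send the request to a running node, e.g. ES_URL=http://localhost:9200
  curl -XPUT -H "Content-Type: application/json" "$ES_URL/$INDEX_NAME?pretty" -d "$BODY"
else
  # Dry run: show what would be sent.
  printf '%s\n' "$BODY"
fi
```

Indexing with ik_max_word and searching with ik_smart is a pairing often used with this plugin: the index keeps fine-grained tokens while queries are split more coarsely. Either half of the body can be used without the other.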

2. Start Elasticsearch and test the ik analyzer

To make testing easier, start it in the foreground:

./bin/elasticsearch
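A node started in the foreground needs a little time before it accepts requests, so curl commands issued too early from a second terminal can fail. One possible helper, assuming curl is installed and the node listens on the default http://localhost:9200, polls the root endpoint until it answers. The function name wait_for_es and its arguments are hypothetical.

```shell
#!/bin/sh
# wait_for_es URL [ATTEMPTS] [DELAY_SECONDS]
# Polls URL until an HTTP request succeeds. Returns 0 on success,
# 1 if every attempt fails.
wait_for_es() {
  url=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -s: quiet, -f: treat HTTP errors as failure, -o: discard the body
    if curl -sf -o /dev/null --max-time 5 "$url"; then
      echo "Elasticsearch is answering at $url"
      return 0
    fi
    # Pause between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "no answer from $url after $attempts attempts" >&2
  return 1
}
```

A typical call from the second terminal would be wait_for_es http://localhost:9200 before sending the first request.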

Create an index:

curl -XPUT http://localhost:9200/class

Run the ik analyzer and check the result:

curl -XGET -H "Content-Type: application/json" 'http://localhost:9200/class/_analyze?pretty' -d ' { "analyzer": "ik_max_word", "text": "我是中國人,我愛我的祖國和人民" }'
{
  "tokens" : [
    {
      "token" : "我",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "CN_CHAR",
      "position" : 0
    },
    {
      "token" : "是",
      "start_offset" : 1,
      "end_offset" : 2,
      "type" : "CN_CHAR",
      "position" : 1
    },
    {
      "token" : "中國人",
      "start_offset" : 2,
      "end_offset" : 5,
      "type" : "CN_WORD",
      "position" : 2
    },
    {
      "token" : "中國",
      "start_offset" : 2,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 3
    },
    {
      "token" : "國人",
      "start_offset" : 3,
      "end_offset" : 5,
      "type" : "CN_WORD",
      "position" : 4
    },
    {
      "token" : "我",
      "start_offset" : 6,
      "end_offset" : 7,
      "type" : "CN_CHAR",
      "position" : 5
    },
    {
      "token" : "愛我",
      "start_offset" : 7,
      "end_offset" : 9,
      "type" : "CN_WORD",
      "position" : 6
    },
    {
      "token" : "的",
      "start_offset" : 9,
      "end_offset" : 10,
      "type" : "CN_CHAR",
      "position" : 7
    },
    {
      "token" : "祖國",
      "start_offset" : 10,
      "end_offset" : 12,
      "type" : "CN_WORD",
      "position" : 8
    },
    {
      "token" : "和",
      "start_offset" : 12,
      "end_offset" : 13,
      "type" : "CN_CHAR",
      "position" : 9
    },
    {
      "token" : "人民",
      "start_offset" : 13,
      "end_offset" : 15,
      "type" : "CN_WORD",
      "position" : 10
    }
  ]
}
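The _analyze response is verbose when only the token strings matter. As a convenience, the pretty-printed output can be filtered with standard text tools. The sketch below assumes the exact "token" : "..." layout shown above (it is not a general JSON parser), and list_tokens is a made-up name. The sample input is an excerpt of the response above.

```shell
#!/bin/sh
# list_tokens: read a pretty-printed _analyze response on stdin and
# print one token string per line.
list_tokens() {
  grep -o '"token" : "[^"]*"' | sed 's/^"token" : "//; s/"$//'
}

# Demo on two entries copied from the response above.
list_tokens <<'EOF'
{
  "tokens" : [
    {
      "token" : "中國人",
      "start_offset" : 2,
      "end_offset" : 5,
      "type" : "CN_WORD",
      "position" : 2
    },
    {
      "token" : "中國",
      "start_offset" : 2,
      "end_offset" : 4,
      "type" : "CN_WORD",
      "position" : 3
    }
  ]
}
EOF
```

The demo prints 中國人 and 中國 on separate lines. Against a live node, the curl command above can be piped straight into list_tokens.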

3. After testing, start Elasticsearch as a daemon

./bin/elasticsearch -d
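For the custom extension dictionary, a rough sketch follows. It assumes the stock layout of the ik plugin: an IKAnalyzer.cfg.xml file in the plugin's config directory, whose ext_dict entry lists dictionary files relative to that directory. The directory variable IK_CONFIG_DIR, the file name custom/my.dic, and the two sample terms are all made up for illustration. The script writes a complete IKAnalyzer.cfg.xml, so an existing one should be backed up first.

```shell
#!/bin/sh
# Target directory. Defaults to a scratch directory so a trial run cannot
# overwrite a real configuration; for the install in this article it would be
# /data/elasticsearch-6.5.4/plugins/analysis-ik/config.
IK_CONFIG_DIR=${IK_CONFIG_DIR:-$(mktemp -d)}

mkdir -p "$IK_CONFIG_DIR/custom"

# One term per line, UTF-8 encoded. The two terms are sample data.
cat > "$IK_CONFIG_DIR/custom/my.dic" <<'EOF'
分詞器
擴展字典
EOF

# ext_dict lists dictionary files relative to the config directory;
# ext_stopwords is left empty here.
cat > "$IK_CONFIG_DIR/IKAnalyzer.cfg.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
  <comment>IK Analyzer extension configuration</comment>
  <entry key="ext_dict">custom/my.dic</entry>
  <entry key="ext_stopwords"></entry>
</properties>
EOF

echo "wrote $IK_CONFIG_DIR/IKAnalyzer.cfg.xml"
```

A locally configured dictionary is read when the node starts, so Elasticsearch has to be restarted before the new terms show up in _analyze results.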

To be continued ......
