Elasticsearch's built-in analyzers do not handle Chinese word segmentation well, so we install the open-source IK analyzer to solve this. Enter the plugins directory, download the analyzer, unzip it, and restart ES. The concrete steps are below.

Note: the Elasticsearch version and the IK analyzer version must match, otherwise the restart will fail. All releases are listed at https://github.com/medcl/elasticsearch-analysis-ik/releases; find the one that matches your version and copy its link address.
```bash
docker exec -it elasticsearch /bin/bash
cd /usr/share/elasticsearch/plugins/
elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.5.1/elasticsearch-analysis-ik-7.5.1.zip
exit
docker restart elasticsearch
```
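Before restarting, you can confirm the plugin was actually installed by listing the installed plugins (a quick check, assuming the container name `elasticsearch` used above):

```bash
# List plugins known to this Elasticsearch installation;
# "analysis-ik" should appear in the output.
docker exec -it elasticsearch elasticsearch-plugin list
```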
Because the method above may fail due to network problems, you can also install the plugin offline:
1. Download the package for your version from https://github.com/medcl/elasticsearch-analysis-ik/releases
2. Create an `ik` folder under the ES plugins directory (/usr/share/elasticsearch/plugins/):
```bash
cd /usr/share/elasticsearch/plugins/
mkdir ik
```
3. Copy the downloaded package into this folder and unzip it (see the sketch after this list).
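A minimal sketch of step 3, assuming the ZIP was downloaded to the host's current directory and the container is named `elasticsearch` as above (if `unzip` is not available inside the container, unzip on the host first and `docker cp` the extracted folder instead):

```bash
# Copy the downloaded archive from the host into the ik plugin folder
docker cp elasticsearch-analysis-ik-7.5.1.zip elasticsearch:/usr/share/elasticsearch/plugins/ik/

# Unzip it inside the container and remove the archive afterwards
docker exec -it elasticsearch /bin/bash
cd /usr/share/elasticsearch/plugins/ik/
unzip elasticsearch-analysis-ik-7.5.1.zip
rm elasticsearch-analysis-ik-7.5.1.zip
exit

# Restart so Elasticsearch picks up the plugin
docker restart elasticsearch
```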
Note: installing the IK analyzer for ES requires a JDK to be installed.
Test:
```
POST http://localhost:9200/_analyze?pretty=true
{
    "analyzer": "ik_max_word",
    "text": "中國人民的兒子"
}
```
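The same request can be issued from the shell with curl (a sketch, assuming ES listens on localhost:9200 without authentication):

```bash
curl -X POST "http://localhost:9200/_analyze?pretty=true" \
  -H 'Content-Type: application/json' \
  -d '{
    "analyzer": "ik_max_word",
    "text": "中國人民的兒子"
  }'
```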
Result:
{ "tokens" : [ { "token" : "中國人民", "start_offset" : 0, "end_offset" : 4, "type" : "CN_WORD", "position" : 0 }, { "token" : "中國人", "start_offset" : 0, "end_offset" : 3, "type" : "CN_WORD", "position" : 1 }, { "token" : "中國", "start_offset" : 0, "end_offset" : 2, "type" : "CN_WORD", "position" : 2 }, { "token" : "國人", "start_offset" : 1, "end_offset" : 3, "type" : "CN_WORD", "position" : 3 }, { "token" : "人民", "start_offset" : 2, "end_offset" : 4, "type" : "CN_WORD", "position" : 4 }, { "token" : "的", "start_offset" : 4, "end_offset" : 5, "type" : "CN_CHAR", "position" : 5 }, { "token" : "兒子", "start_offset" : 5, "end_offset" : 7, "type" : "CN_WORD", "position" : 6 } ] }