Elasticsearch's built-in analyzers do not handle Chinese word segmentation well, so we install the open-source IK analyzer to solve this. Enter the plugins directory, download the analyzer, unzip it, and restart ES. The concrete steps are below.

Note: the Elasticsearch version and the IK analyzer version must match, otherwise the restart will fail. All releases are listed at https://github.com/medcl/elasticsearch-analysis-ik/releases; find the one that matches your version and copy its link address.
```bash
docker exec -it elasticsearch /bin/bash
cd /usr/share/elasticsearch/plugins/
elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.5.1/elasticsearch-analysis-ik-7.5.1.zip
exit
docker restart elasticsearch
```
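Before restarting, you can confirm the plugin was actually installed by listing the installed plugins (a quick check, assuming the container name `elasticsearch` used above):

```bash
# List plugins known to this Elasticsearch installation;
# "analysis-ik" should appear in the output.
docker exec -it elasticsearch elasticsearch-plugin list
```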
Because the method above may fail due to network problems, you can also install the plugin offline:
1. Download the package for your version from https://github.com/medcl/elasticsearch-analysis-ik/releases
2. Create an `ik` folder under the ES plugins directory (/usr/share/elasticsearch/plugins/):
```bash
cd /usr/share/elasticsearch/plugins/
mkdir ik
```
3. Copy the downloaded package into this folder and unzip it (see the sketch after this list).
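A minimal sketch of step 3, assuming the ZIP was downloaded to the host's current directory and the container is named `elasticsearch` as above (if `unzip` is not available inside the container, unzip on the host first and `docker cp` the extracted folder instead):

```bash
# Copy the downloaded archive from the host into the ik plugin folder
docker cp elasticsearch-analysis-ik-7.5.1.zip elasticsearch:/usr/share/elasticsearch/plugins/ik/

# Unzip it inside the container and remove the archive afterwards
docker exec -it elasticsearch /bin/bash
cd /usr/share/elasticsearch/plugins/ik/
unzip elasticsearch-analysis-ik-7.5.1.zip
rm elasticsearch-analysis-ik-7.5.1.zip
exit

# Restart so Elasticsearch picks up the plugin
docker restart elasticsearch
```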
Note: installing the IK analyzer for ES requires a JDK to be installed.
Test:
```
POST http://localhost:9200/_analyze?pretty=true
{
    "analyzer": "ik_max_word",
    "text": "中國人民的兒子"
}
```
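The same request can be issued from the shell with curl (a sketch, assuming ES listens on localhost:9200 without authentication):

```bash
curl -X POST "http://localhost:9200/_analyze?pretty=true" \
  -H 'Content-Type: application/json' \
  -d '{
    "analyzer": "ik_max_word",
    "text": "中國人民的兒子"
  }'
```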
Result:
{ "tokens" : [ { "token" : "中國人民", "start_offset" : 0, "end_offset" : 4, "type" : "CN_WORD", "position" : 0 }, { "token" : "中國人", "start_offset" : 0, "end_offset" : 3, "type" : "CN_WORD", "position" : 1 }, { "token" : "中國", "start_offset" : 0, "end_offset" : 2, "type" : "CN_WORD", "position" : 2 }, { "token" : "國人", "start_offset" : 1, "end_offset" : 3, "type" : "CN_WORD", "position" : 3 }, { "token" : "人民", "start_offset" : 2, "end_offset" : 4, "type" : "CN_WORD", "position" : 4 }, { "token" : "的", "start_offset" : 4, "end_offset" : 5, "type" : "CN_CHAR", "position" : 5 }, { "token" : "兒子", "start_offset" : 5, "end_offset" : 7, "type" : "CN_WORD", "position" : 6 } ] }