1. Find the matching version
IK version | ES version |
---|---|
6.1.1 | 6.1.1 |
5.6.4 | 5.6.4 |
5.5.3 | 5.5.3 |
5.4.3 | 5.4.3 |
5.3.3 | 5.3.3 |
5.2.2 | 5.2.2 |
5.1.2 | 5.1.2 |
1.10.6 | 2.4.6 |
1.9.5 | 2.3.5 |
1.8.1 | 2.2.1 |
1.7.0 | 2.1.1 |
1.5.0 | 2.0.0 |
1.2.6 | 1.0.0 |
1.2.5 | 0.90.x |
1.1.3 | 0.20.x |
1.0.0 | 0.16.2 -> 0.19.0 |
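The lookup in the table can be sketched as a small helper. This is only an illustration: the function name `ik_version_for` is hypothetical, and treating every 5.x and later release as "IK version equals ES version" is an assumption extrapolated from the rows listed above, not something the table guarantees for unlisted releases.

```python
# Rows copied from the table above for exact ES releases older than 5.x,
# where IK used its own numbering scheme.
LEGACY_IK_FOR_ES = {
    "2.4.6": "1.10.6",
    "2.3.5": "1.9.5",
    "2.2.1": "1.8.1",
    "2.1.1": "1.7.0",
    "2.0.0": "1.5.0",
    "1.0.0": "1.2.6",
}


def ik_version_for(es_version: str) -> str:
    """Return the IK plugin version to download for an Elasticsearch version."""
    major = int(es_version.split(".")[0])
    if major >= 5:
        # Assumption: from 5.x on, the IK release number mirrors the ES one,
        # as it does for every 5.x and 6.x row in the table.
        return es_version
    if es_version in LEGACY_IK_FOR_ES:
        return LEGACY_IK_FOR_ES[es_version]
    raise KeyError(f"no IK version listed for Elasticsearch {es_version}")


print(ik_version_for("6.1.1"))
print(ik_version_for("2.3.5"))
```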
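2. Download the corresponding zip package
Get the zip package for the matching version.
In the plugins directory under the local Elasticsearch root, create a folder named ik and extract the contents of the zip package into it.
Then restart ES and the plugin is ready to use.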
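The extraction step can also be scripted. The following is a minimal sketch using only the Python standard library; the function name `install_ik` and both path arguments are hypothetical, and it assumes the zip has already been downloaded and that ES is restarted separately afterwards.

```python
import zipfile
from pathlib import Path


def install_ik(zip_path, es_home):
    """Extract the IK zip into <es_home>/plugins/ik and return that folder."""
    target = Path(es_home) / "plugins" / "ik"
    # Create plugins/ik if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target
```

A call would look like `install_ik("elasticsearch-analysis-ik.zip", "/path/to/elasticsearch")`, where both arguments are placeholders for your own download location and ES root.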
3. Install Kibana and run some tests
Create the index and the type mapping:

    PUT /yf-springboot-es-ik1

    POST /yf-springboot-es-ik1/springboot-test/_mapping
    {
      "springboot-test": {
        "properties": {
          "keyword": {
            "type": "text",
            "analyzer": "ik_max_word",
            "search_analyzer": "ik_max_word"
          }
        }
      }
    }
Test in a browser:
http://localhost:9200/yf-springboot-es-ik1/springboot-test/14/_termvectors?fields=keyword
The URL is made up of ip + port + index + type + id + _termvectors + ?fields=<field to inspect>.
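To make the URL pattern concrete, here is a small sketch that assembles it from its parts. The function name `termvectors_url` is hypothetical, and plain `http` is assumed, as in the example URL above.

```python
from urllib.parse import quote, urlencode


def termvectors_url(host, port, index, doc_type, doc_id, field):
    """Build ip + port + index + type + id + _termvectors + ?fields=<field>."""
    # Percent-encode each path segment so unusual ids do not break the URL.
    path = "/".join(quote(str(part), safe="") for part in (index, doc_type, doc_id))
    query = urlencode({"fields": field})
    return f"http://{host}:{port}/{path}/_termvectors?{query}"


print(termvectors_url("localhost", 9200, "yf-springboot-es-ik1",
                      "springboot-test", "14", "keyword"))
```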
The document with id 14 has to be indexed before its term vectors can be requested:

    POST /yf-springboot-es-ik1/springboot-test/14
    {"keyword":"中國駐洛杉磯領事館遭亞裔男子槍擊 嫌犯已自首"}

The _termvectors request then returns:

    {
      "_index": "yf-springboot-es-ik1",
      "_type": "springboot-test",
      "_id": "14",
      "_version": 4,
      "found": true,
      "took": 0,
      "term_vectors": {
        "keyword": {
          "field_statistics": { "sum_doc_freq": 14, "doc_count": 1, "sum_ttf": 14 },
          "terms": {
            "中國": { "term_freq": 1, "tokens": [ { "position": 0, "start_offset": 0, "end_offset": 2 } ] },
            "亞裔": { "term_freq": 1, "tokens": [ { "position": 7, "start_offset": 10, "end_offset": 12 } ] },
            "嫌犯": { "term_freq": 1, "tokens": [ { "position": 11, "start_offset": 17, "end_offset": 19 } ] },
            "子槍": { "term_freq": 1, "tokens": [ { "position": 9, "start_offset": 13, "end_offset": 15 } ] },
            "已": { "term_freq": 1, "tokens": [ { "position": 12, "start_offset": 19, "end_offset": 20 } ] },
            "槍擊": { "term_freq": 1, "tokens": [ { "position": 10, "start_offset": 14, "end_offset": 16 } ] },
            "洛杉磯": { "term_freq": 1, "tokens": [ { "position": 2, "start_offset": 3, "end_offset": 6 } ] },
            "男子": { "term_freq": 1, "tokens": [ { "position": 8, "start_offset": 12, "end_offset": 14 } ] },
            "自首": { "term_freq": 1, "tokens": [ { "position": 13, "start_offset": 20, "end_offset": 22 } ] },
            "遭": { "term_freq": 1, "tokens": [ { "position": 6, "start_offset": 9, "end_offset": 10 } ] },
            "領事": { "term_freq": 1, "tokens": [ { "position": 4, "start_offset": 6, "end_offset": 8 } ] },
            "領事館": { "term_freq": 1, "tokens": [ { "position": 3, "start_offset": 6, "end_offset": 9 } ] },
            "館": { "term_freq": 1, "tokens": [ { "position": 5, "start_offset": 8, "end_offset": 9 } ] },
            "駐": { "term_freq": 1, "tokens": [ { "position": 1, "start_offset": 2, "end_offset": 3 } ] }
          }
        }
      }
    }
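The response lists terms in dictionary order, which makes it hard to see how the sentence was actually segmented. As a rough sketch, the helper below (the name `tokens_in_order` is hypothetical) re-sorts the terms by token position. The sample dictionary is a trimmed copy of the first three positions from the response above, not a full response.

```python
def tokens_in_order(response, field):
    """Return (term, start_offset, end_offset) tuples sorted by token position."""
    terms = response["term_vectors"][field]["terms"]
    found = []
    for term, info in terms.items():
        for tok in info["tokens"]:
            found.append((tok["position"], tok["start_offset"], tok["end_offset"], term))
    # Sorting the tuples orders them by position first.
    return [(term, start, end) for _, start, end, term in sorted(found)]


# Trimmed sample taken from the term vectors response shown above.
sample = {
    "term_vectors": {
        "keyword": {
            "terms": {
                "中國": {"term_freq": 1, "tokens": [{"position": 0, "start_offset": 0, "end_offset": 2}]},
                "洛杉磯": {"term_freq": 1, "tokens": [{"position": 2, "start_offset": 3, "end_offset": 6}]},
                "駐": {"term_freq": 1, "tokens": [{"position": 1, "start_offset": 2, "end_offset": 3}]},
            }
        }
    }
}

for term, start, end in tokens_in_order(sample, "keyword"):
    print(term, start, end)
```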
Add documents and query them from Kibana:

    POST /yf-springboot-es-ik1/springboot-test/11
    {"keyword":"美國留給伊拉克的是個爛攤子嗎"}

    POST /yf-springboot-es-ik1/springboot-test/12
    {"keyword":"公安部:各地校車將享最高路權"}

    POST /yf-springboot-es-ik1/springboot-test/13
    {"keyword":"中韓漁警衝突調查:韓警平均天天扣1艘中國漁船"}

    POST /yf-springboot-es-ik1/springboot-test/14
    {"keyword":"中國駐洛杉磯領事館遭亞裔男子槍擊 嫌犯已自首"}

    POST /yf-springboot-es-ik1/springboot-test/_search
    {
      "query": { "match": { "keyword": "中國" } },
      "highlight": {
        "pre_tags": ["<tag1>", "<tag2>"],
        "post_tags": ["</tag1>", "</tag2>"],
        "fields": { "keyword": {} }
      }
    }
Query result:

    {
      "took": 0,
      "timed_out": false,
      "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
      "hits": {
        "total": 2,
        "max_score": 0.5480699,
        "hits": [
          {
            "_index": "yf-springboot-es-ik1",
            "_type": "springboot-test",
            "_id": "13",
            "_score": 0.5480699,
            "_source": { "keyword": "中韓漁警衝突調查:韓警平均天天扣1艘中國漁船" },
            "highlight": {
              "keyword": [ "中韓漁警衝突調查:韓警平均天天扣1艘<tag1>中國</tag1>漁船" ]
            }
          },
          {
            "_index": "yf-springboot-es-ik1",
            "_type": "springboot-test",
            "_id": "14",
            "_score": 0.2876821,
            "_source": { "keyword": "中國駐洛杉磯領事館遭亞裔男子槍擊 嫌犯已自首" },
            "highlight": {
              "keyword": [ "<tag1>中國</tag1>駐洛杉磯領事館遭亞裔男子槍擊 嫌犯已自首" ]
            }
          }
        ]
      }
    }
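When the search is called from application code rather than Kibana, usually only the id, score and highlighted fragments are needed. The sketch below shows one way to pull them out of a parsed response. The function name `highlights` is hypothetical, and the sample is a trimmed copy of the result above.

```python
def highlights(response, field):
    """Return (id, score, highlighted fragments) for every hit."""
    return [
        # A hit has no "highlight" key when nothing in it was highlighted.
        (hit["_id"], hit["_score"], hit.get("highlight", {}).get(field, []))
        for hit in response["hits"]["hits"]
    ]


# Trimmed sample taken from the query result shown above.
sample_result = {
    "hits": {
        "total": 2,
        "hits": [
            {"_id": "13", "_score": 0.5480699,
             "highlight": {"keyword": ["中韓漁警衝突調查:韓警平均天天扣1艘<tag1>中國</tag1>漁船"]}},
            {"_id": "14", "_score": 0.2876821,
             "highlight": {"keyword": ["<tag1>中國</tag1>駐洛杉磯領事館遭亞裔男子槍擊 嫌犯已自首"]}},
        ],
    }
}

for doc_id, score, fragments in highlights(sample_result, "keyword"):
    print(doc_id, score, fragments)
```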
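4. What is the difference between ik_max_word and ik_smart?
ik_max_word: splits the text at the finest granularity. For example, "中華人民共和國國歌" is split into "中華人民共和國, 中華人民, 中華, 華人, 人民共和國, 人民, 人, 民, 共和國, 共和, 和, 國國, 國歌", exhausting every possible combination.
ik_smart: splits the text at the coarsest granularity. For example, "中華人民共和國國歌" is split into "中華人民共和國, 國歌".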
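One way to see the difference on your own text is to send the same string to the Elasticsearch `_analyze` API once per analyzer. The following is a minimal sketch using only the standard library; the function names and the base URL `http://localhost:9200` are assumptions, and it presumes a running node with the IK plugin installed. No output is shown because the exact tokens depend on the IK version and dictionary.

```python
import json
from urllib.request import Request, urlopen


def analyze_request(base_url, analyzer, text):
    """Build a POST <base_url>/_analyze request for the given analyzer."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return Request(
        f"{base_url}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def analyze(base_url, analyzer, text):
    """Send the request and return only the token strings."""
    with urlopen(analyze_request(base_url, analyzer, text)) as resp:
        payload = json.load(resp)
    return [tok["token"] for tok in payload["tokens"]]


if __name__ == "__main__":
    # Assumed local node; adjust the URL to your own setup.
    for name in ("ik_max_word", "ik_smart"):
        print(name, analyze("http://localhost:9200", name, "中華人民共和國國歌"))
```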