docker for Windows : 18.03.1-ce-win65 (17513)
springBoot : 2.2.2.RELEASE
springDataElasticSearch : 3.2.3
elasticSearch Image : 6.8.5
elasticSearch-analysis-ik : 6.8.5
mySql : 5.6.40-log
JDK : 1.8
gradle : 6.0.1
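Why learn Elasticsearch? Because it is fast, because it offers good Chinese word segmentation, because it is distributed, and because Spring Boot already integrates it. The real trigger was that our project recently integrated roughly one million product records from JD.com, and some existing queries started taking more than ten seconds to load. That made me re-optimize the SQL (splitting joins, adding composite indexes, making requests asynchronous), which was a good review of indexing, and it made me want to understand how a search engine actually differs from a MySQL full-text index. Here I use Docker + Elasticsearch + Spring Boot to get a first look at Elasticsearch.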
Because `docker pull elasticsearch` reported that there is no latest tag, I picked 6.8.5 from Docker Hub for testing; this version is fairly stable and also fairly recent.
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 578

{
  "tokens" : [
    { "token" : "我", "start_offset" : 0, "end_offset" : 1, "type" : "<IDEOGRAPHIC>", "position" : 0 },
    { "token" : "愛", "start_offset" : 1, "end_offset" : 2, "type" : "<IDEOGRAPHIC>", "position" : 1 },
    { "token" : "中", "start_offset" : 2, "end_offset" : 3, "type" : "<IDEOGRAPHIC>", "position" : 2 },
    { "token" : "國", "start_offset" : 3, "end_offset" : 4, "type" : "<IDEOGRAPHIC>", "position" : 3 }
  ]
}
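The segmentation is poor: every character becomes its own token, about what a foreigner who cannot read Chinese would produce.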
Enter the container and install the IK analyzer:
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 424

{
  "tokens" : [
    { "token" : "我", "start_offset" : 0, "end_offset" : 1, "type" : "CN_CHAR", "position" : 0 },
    { "token" : "愛", "start_offset" : 1, "end_offset" : 2, "type" : "CN_CHAR", "position" : 1 },
    { "token" : "中國", "start_offset" : 2, "end_offset" : 4, "type" : "CN_WORD", "position" : 2 }
  ]
}
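Responses of this shape come from Elasticsearch's `_analyze` API. The original requests are not shown, so the following is only a minimal sketch of how such a call could be made from plain Java 8. It assumes a node reachable at `http://localhost:9200`, and it assumes the analyzer names `standard` (its tokenizer produces the `<IDEOGRAPHIC>` token type) and `ik_smart` (one of the analyzers the IK plugin registers). The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class AnalyzeDemo {

    // Escape characters that must not appear raw inside a JSON string.
    static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Build the JSON body for POST /_analyze.
    static String buildAnalyzeBody(String analyzer, String text) {
        return "{\"analyzer\":\"" + jsonEscape(analyzer)
                + "\",\"text\":\"" + jsonEscape(text) + "\"}";
    }

    // Send the body to the node and return the raw JSON response.
    static String analyze(String baseUrl, String analyzer, String text) throws Exception {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(baseUrl + "/_analyze").openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json; charset=UTF-8");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(buildAnalyzeBody(analyzer, text).getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return buf.toString("UTF-8");
        }
    }

    public static void main(String[] args) throws Exception {
        String base = "http://localhost:9200"; // assumed address of the test node
        System.out.println(analyze(base, "standard", "我愛中國"));
        System.out.println(analyze(base, "ik_smart", "我愛中國")); // needs the IK plugin
    }
}
```

Running `main` requires the container to be up; the second call fails with an error response if the IK plugin has not been installed and the node restarted.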
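There are plenty of guides online for the actual integration, so I will mention only one point: in this setup, using the IK analyzer meant not relying on annotations such as `@Field` and instead writing the mapping myself in a JSON file: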
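import lombok.Getter;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Mapping;

// The field mapping is loaded from the JSON file on the classpath.
@Getter
@Mapping(mappingPath = "es_article_mapping.json")
@Document(indexName = "article", type = "article")
public class ArticleEsEntity {

    @Id
    private String id;
    private String title;
    private String content;
    private long createTime;

    public ArticleEsEntity(String title, String content) {
        // Simple unique-enough id for a local test.
        this.id = System.nanoTime() + "";
        this.title = title;
        this.content = content;
        this.createTime = System.currentTimeMillis();
    }
}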
{
  "article": {
    "properties": {
      "id": { "type": "text" },
      "create_time": { "type": "long" },
      "content": {
        "type": "text",
        "analyzer": "ik_smart",
        "search_analyzer": "ik_smart",
        "fields": {
          "keyword": { "type": "keyword", "ignore_above": 10000 }
        }
      },
      "title": {
        "type": "text",
        "analyzer": "ik_smart",
        "search_analyzer": "ik_smart",
        "fields": {
          "keyword": { "type": "keyword", "ignore_above": 256 }
        }
      }
    }
  }
}
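To make the effect of this mapping concrete, here is a small sketch of the two kinds of query bodies it enables against the `article` index. A `match` query on `title` is analyzed with the field's `search_analyzer` (`ik_smart`), while a `term` query on the `title.keyword` sub-field compares the whole string without analysis. This is not the Spring Data code used in the project; it only prints raw request bodies, and the class and method names are hypothetical.

```java
public class ArticleQueryDemo {

    // Escape characters that must not appear raw inside a JSON string.
    static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Full-text query: the text is tokenized by the field's search_analyzer.
    static String matchQuery(String field, String text) {
        return "{\"query\":{\"match\":{\"" + jsonEscape(field)
                + "\":\"" + jsonEscape(text) + "\"}}}";
    }

    // Exact-value query against the un-analyzed keyword sub-field.
    static String termQuery(String field, String value) {
        return "{\"query\":{\"term\":{\"" + jsonEscape(field)
                + ".keyword\":\"" + jsonEscape(value) + "\"}}}";
    }

    public static void main(String[] args) {
        // Both bodies would be sent as: POST /article/_search
        System.out.println(matchQuery("title", "中國"));
        System.out.println(termQuery("title", "我愛中國"));
    }
}
```

The `keyword` sub-fields are what make exact matching, sorting, and aggregation possible on `title` and `content`, since the analyzed `text` fields only store the IK tokens.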
There are 120,000+ records in total, in both MySQL and Elasticsearch.
One more thing: MySQL's full-text index does not handle Chinese word segmentation well.