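Our applications often need search functionality, and the open-source Elasticsearch is currently the leading choice for a full-text search engine. It can store, search, and analyze massive amounts of data quickly. Spring Boot provides very convenient search support by integrating Spring Data Elasticsearch.
Elasticsearch is a distributed search service that exposes a RESTful API and is built on Lucene. It splits data across multiple shards, with replica shards protecting against data loss, and automatically rebalances shards across nodes. Large sites such as GitHub also use Elasticsearch as their search service.
Elasticsearch official site | Elasticsearch: The Definitive Guide | Elasticsearch: The Definitive Guide, offline copy (extraction code: v7th)
For installation, see "Installing Elasticsearch with Docker".
The following passage is quoted from the Definitive Guide:
Objects in an application are seldom just a simple list of keys and values. More often they have complex data structures that contain dates, geo locations, other objects, or arrays.
Sooner or later you will want to store these objects in a database. Saving them into the rows and columns of a relational database is like taking a rich, expressive object apart and squeezing it into a very large spreadsheet: you have to flatten the object to fit the table schema (usually one field per column) and then rebuild it every time you query it.
Elasticsearch is document oriented, which means it can store entire objects or documents. It does not only store them, though; it also indexes the content of each document to make it searchable. In Elasticsearch, you index, search, sort, and filter documents rather than rows of columnar data. This is a fundamentally different way of thinking about data, and it is one of the reasons Elasticsearch can perform complex full-text search.
The following concepts are involved:
We can operate the Elasticsearch service using nothing but HTTP requests.
Taking the indexing of an employee object (a document) as an example, we only need to send Elasticsearch a RESTful PUT request like the following: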
PUT /megacorp/employee/1 { "first_name" : "John", "last_name" : "Smith", "age" : 25, "about" : "I love to go rock climbing", "interests": [ "sports", "music" ] }
{ "_index": "megacorp", "_type": "employee", "_id": "1", "_version": 1, "result": "created", "_shards": { "total": 2, "successful": 1, "failed": 0 }, "created": true }
Name | Description |
---|---|
megacorp | Index name |
employee | Type name |
1 | Employee ID |
We can go on to save the employees with IDs 2 and 3:
PUT /megacorp/employee/2 { "first_name" : "Jane", "last_name" : "Smith", "age" : 32, "about" : "I like to collect rock albums", "interests": [ "music" ] } PUT /megacorp/employee/3 { "first_name" : "Douglas", "last_name" : "Fir", "age" : 35, "about": "I like to build cabinets", "interests": [ "forestry" ] }
To update an existing document, send a PUT request in the same way.
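The indexing request above could also be built from Java. The following is a minimal sketch using the JDK's `java.net.http` API (Java 11+). The base URL `http://localhost:9200` and the class and method names are assumptions made for illustration and do not come from the original text. The sketch only builds the request and does not send it.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class IndexRequestSketch {
    // Assumption: Elasticsearch listens on this address (not stated in the article).
    static final String BASE_URL = "http://localhost:9200";

    // Builds (but does not send) a PUT request for /{index}/{type}/{id} with a JSON body.
    static HttpRequest indexRequest(String index, String type, String id, String json) {
        return HttpRequest.newBuilder()
                .uri(URI.create(BASE_URL + "/" + index + "/" + type + "/" + id))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    public static void main(String[] args) {
        String json = "{\"first_name\":\"John\",\"last_name\":\"Smith\",\"age\":25}";
        HttpRequest request = indexRequest("megacorp", "employee", "1", json);
        // Prints the HTTP method and the target URI of the built request.
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To actually send it, the request could be passed to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`.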
To fetch the employee with ID 1 from the employee type of the megacorp index, we only need to send a RESTful GET request like the following:
GET /megacorp/employee/1
{ "_index": "megacorp", "_type": "employee", "_id": "1", "_version": 1, "found": true, "_source": { "first_name": "John", "last_name": "Smith", "age": 25, "about": "I love to go rock climbing", "interests": [ "sports", "music" ] } }
To check whether the employee with ID 1 exists, we only need to send a RESTful HEAD request:
HEAD /megacorp/employee/1
This request has no response body; the result is conveyed by the response status code: 200 if the employee exists, 404 otherwise.
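The status-code convention described above can be sketched in Java as follows. The class and method names and the base URL passed in are hypothetical; the sketch builds a HEAD request with the JDK's `java.net.http` API and maps a status code to a boolean, without contacting any server.

```java
import java.net.URI;
import java.net.http.HttpRequest;

class ExistsCheckSketch {
    // Builds (but does not send) a HEAD request for /{index}/{type}/{id}.
    static HttpRequest headRequest(String baseUrl, String index, String type, String id) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/" + index + "/" + type + "/" + id))
                .method("HEAD", HttpRequest.BodyPublishers.noBody())
                .build();
    }

    // Interprets the status code of the HEAD response: 200 means found, 404 means missing.
    static boolean exists(int statusCode) {
        if (statusCode == 200) {
            return true;
        }
        if (statusCode == 404) {
            return false;
        }
        // Any other status is treated as an error in this sketch.
        throw new IllegalStateException("Unexpected status: " + statusCode);
    }

    public static void main(String[] args) {
        HttpRequest request = headRequest("http://localhost:9200", "megacorp", "employee", "1");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(exists(200));
        System.out.println(exists(404));
    }
}
```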
To delete the employee with ID 1, we only need to send a RESTful DELETE request:
DELETE /megacorp/employee/1
{ "found": true, "_index": "megacorp", "_type": "employee", "_id": "1", "_version": 2, "result": "deleted", "_shards": { "total": 2, "successful": 1, "failed": 0 } }
We already know how to fetch a document by its ID. We can also search for all documents under a given index and type as follows:
GET /megacorp/employee/_search
{ "took": 2, "timed_out": false, "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 }, "hits": { "total": 3, "max_score": 1, "hits": [ { "_index": "megacorp", "_type": "employee", "_id": "2", "_score": 1, "_source": { "first_name": "Jane", "last_name": "Smith", "age": 32, "about": "I like to collect rock albums", "interests": [ "music" ] } }, { "_index": "megacorp", "_type": "employee", "_id": "1", "_score": 1, "_source": { "first_name": "John", "last_name": "Smith", "age": 25, "about": "I love to go rock climbing", "interests": [ "sports", "music" ] } }, { "_index": "megacorp", "_type": "employee", "_id": "3", "_score": 1, "_source": { "first_name": "Douglas", "last_name": "Fir", "age": 35, "about": "I like to build cabinets", "interests": [ "forestry" ] } } ] } }
Documents can be searched by a field value through a URL parameter. For example, to find employees whose last name contains "Smith":
GET /megacorp/employee/_search?q=last_name:Smith
{ "took": 18, "timed_out": false, "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 }, "hits": { "total": 2, "max_score": 0.2876821, "hits": [ { "_index": "megacorp", "_type": "employee", "_id": "2", "_score": 0.2876821, "_source": { "first_name": "Jane", "last_name": "Smith", "age": 32, "about": "I like to collect rock albums", "interests": [ "music" ] } }, { "_index": "megacorp", "_type": "employee", "_id": "1", "_score": 0.2876821, "_source": { "first_name": "John", "last_name": "Smith", "age": 25, "about": "I love to go rock climbing", "interests": [ "sports", "music" ] } } ] } }
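When the search value comes from user input, the `q` parameter should be URL-encoded before it is appended to the path. A small sketch, assuming a hypothetical helper `searchUri` and a base URL of `http://localhost:9200` (neither appears in the original):

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class UriSearchSketch {
    // Builds the URI for GET /{index}/{type}/_search?q={field}:{value},
    // percent-encoding the query-string value.
    static URI searchUri(String baseUrl, String index, String type, String field, String value) {
        String q = URLEncoder.encode(field + ":" + value, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/" + index + "/" + type + "/_search?q=" + q);
    }

    public static void main(String[] args) {
        URI uri = searchUri("http://localhost:9200", "megacorp", "employee", "last_name", "Smith");
        // getQuery() returns the decoded query component.
        System.out.println(uri.getPath() + "?" + uri.getQuery());
    }
}
```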
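Elasticsearch provides a rich, flexible query language called the query DSL (domain-specific language), which can build more complex and powerful queries. A DSL query is expressed as a JSON request body.
The earlier search for "Smith" can be expressed like this: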
GET /megacorp/employee/_search { "query" : { "match" : { "last_name" : "Smith" } } }
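Let's make the search slightly more complex. We still want employees whose last name is "Smith", but only those older than 30. The query adds a filter, which lets us run a structured search efficiently:
GET /megacorp/employee/_search { "query": { "bool": { "filter": { "range": { "age": { "gt": 30 } } }, "must": { "match": { "last_name": "smith" } } } } }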
{ "took": 42, "timed_out": false, "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 }, "hits": { "total": 1, "max_score": 0.2876821, "hits": [ { "_index": "megacorp", "_type": "employee", "_id": "2", "_score": 0.2876821, "_source": { "first_name": "Jane", "last_name": "Smith", "age": 32, "about": "I like to collect rock albums", "interests": [ "music" ] } } ] } }
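The request body for such a bool query can be assembled in Java before it is sent. The sketch below uses plain string concatenation to stay dependency-free; the class and method names are hypothetical, and the values are not JSON-escaped, so a real application would more likely use a JSON library or a query builder.

```java
class BoolQuerySketch {
    // Builds a bool query: "must" match on last_name, "filter" on age > minAgeExclusive.
    // Note: lastName is inserted as-is (no JSON escaping) to keep the sketch short.
    static String lastNameOlderThan(String lastName, int minAgeExclusive) {
        return "{ \"query\": { \"bool\": {"
                + " \"must\": { \"match\": { \"last_name\": \"" + lastName + "\" } },"
                + " \"filter\": { \"range\": { \"age\": { \"gt\": " + minAgeExclusive + " } } }"
                + " } } }";
    }

    public static void main(String[] args) {
        System.out.println(lastNameOlderThan("smith", 30));
    }
}
```

The `must` clause contributes to the relevance score, while the `filter` clause only includes or excludes documents.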
Let's try a more advanced search: full-text search, something traditional databases struggle to do. We will search for all employees who like "rock climbing":
GET /megacorp/employee/_search
{ "query": { "match": { "about": "rock climbing" } } }
{ "took": 12, "timed_out": false, "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 }, "hits": { "total": 2, "max_score": 0.53484553, "hits": [ { "_index": "megacorp", "_type": "employee", "_id": "1", "_score": 0.53484553, "_source": { "first_name": "John", "last_name": "Smith", "age": 25, "about": "I love to go rock climbing", "interests": [ "sports", "music" ] } }, { "_index": "megacorp", "_type": "employee", "_id": "2", "_score": 0.26742277, "_source": { "first_name": "Jane", "last_name": "Smith", "age": 32, "about": "I like to collect rock albums", "interests": [ "music" ] } } ] } }
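By default, Elasticsearch sorts the result set by relevance score, that is, by how well each document matches the query. The top-ranked John Smith clearly qualifies: his about field explicitly says "rock climbing".
But why does Jane Smith also appear in the results? Because "rock" is mentioned in her about field. Since only "rock" is mentioned and "climbing" is not, her _score is lower than John's.
This example nicely shows how Elasticsearch performs full-text search across text fields and returns the most relevant results. The concept of relevance is very important in Elasticsearch, and it is foreign to traditional relational databases, where a record either matches a query or does not.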
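The ranking described above can be imitated with a toy scorer that counts how many query terms occur in a document. This is not how Elasticsearch computes `_score` (its real scoring also weighs term frequency and field length); it is a simplified sketch with hypothetical names that only shows why a partial match ranks lower than a full match.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class TermOverlapSketch {
    // Lowercases the text and splits it on whitespace into a set of terms.
    static Set<String> tokens(String text) {
        return new HashSet<>(Arrays.asList(text.toLowerCase(Locale.ROOT).split("\\s+")));
    }

    // Toy relevance: the number of distinct query terms present in the document.
    static int score(String query, String document) {
        Set<String> docTokens = tokens(document);
        int matches = 0;
        for (String term : tokens(query)) {
            if (docTokens.contains(term)) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        String query = "rock climbing";
        System.out.println(score(query, "I love to go rock climbing"));    // John
        System.out.println(score(query, "I like to collect rock albums")); // Jane
        System.out.println(score(query, "I like to build cabinets"));      // Douglas
    }
}
```

John's text contains both terms, Jane's contains one, and Douglas's contains none, which mirrors the order (and the absence of Douglas) in the response above.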
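The full-text search above matches on the individual terms produced by analysis. To find only the employee records whose about field contains the exact phrase "rock climbing" (all terms present, adjacent, and in order), change the "match" query to "match_phrase":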
GET /megacorp/employee/_search { "query": { "match_phrase": { "about": "rock climbing" } } }
{ "took": 24, "timed_out": false, "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 }, "hits": { "total": 1, "max_score": 0.53484553, "hits": [ { "_index": "megacorp", "_type": "employee", "_id": "1", "_score": 0.53484553, "_source": { "first_name": "John", "last_name": "Smith", "age": 25, "about": "I love to go rock climbing", "interests": [ "sports", "music" ] } } ] } }
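The difference from a plain term match can be sketched as a check that the query terms appear as a contiguous, ordered run in the document. This is a toy illustration with hypothetical names and whitespace tokenization, not Elasticsearch's implementation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class PhraseMatchSketch {
    // Lowercases the text and splits it on whitespace, preserving term order.
    static List<String> tokens(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).split("\\s+"));
    }

    // True if the phrase's terms occur adjacent and in order within the document.
    static boolean containsPhrase(String document, String phrase) {
        return Collections.indexOfSubList(tokens(document), tokens(phrase)) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(containsPhrase("I love to go rock climbing", "rock climbing"));
        System.out.println(containsPhrase("I like to collect rock albums", "rock climbing"));
    }
}
```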
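Highlighting the matched keywords in each search result lets users see why a document matched the query. Highlighting fragments is very easy in Elasticsearch. Let's add a highlight parameter to the previous query: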
GET /megacorp/employee/_search
{ "query": { "match_phrase": { "about": "rock climbing" } }, "highlight": { "fields": { "about": {} } } }
{ "took": 1305, "timed_out": false, "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 }, "hits": { "total": 1, "max_score": 0.53484553, "hits": [ { "_index": "megacorp", "_type": "employee", "_id": "1", "_score": 0.53484553, "_source": { "first_name": "John", "last_name": "Smith", "age": 25, "about": "I love to go rock climbing", "interests": [ "sports", "music" ] }, "highlight": { "about": [ "I love to go <em>rock</em> <em>climbing</em>" ] } } ] } }
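When we run this query, it hits the same result as before, but the response contains a new section called highlight, which holds the text from the about field with the matched words wrapped in <em></em>.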
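The shape of the highlight output can be reproduced with a toy highlighter that wraps every word matching a query term in `<em>` tags. The names are hypothetical and the tokenization is a plain split on spaces; Elasticsearch's highlighters are considerably more capable.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

class HighlightSketch {
    // Wraps each word of the text that equals a query term (case-insensitively) in <em> tags.
    static String highlight(String text, String query) {
        Set<String> terms = new HashSet<>(
                Arrays.asList(query.toLowerCase(Locale.ROOT).split("\\s+")));
        return Arrays.stream(text.split(" "))
                .map(word -> terms.contains(word.toLowerCase(Locale.ROOT))
                        ? "<em>" + word + "</em>"
                        : word)
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(highlight("I love to go rock climbing", "rock climbing"));
    }
}
```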
By default, Spring Boot supports two ways of working with Elasticsearch: Jest and Spring Data Elasticsearch.
Create a test bean:
package zze.springboot.elasticsearch.bean; public class Product { private Integer id; private String name; private String remark; private Double price; public Integer getId() { return id; } public void setId(Integer id) { this.id = id; } public String getName() { return name; } public void setName(String name) { this.name = name; } public String getRemark() { return remark; } public void setRemark(String remark) { this.remark = remark; } public Double getPrice() { return price; } public void setPrice(Double price) { this.price = price; } @Override public String toString() { return "Product{" + "id=" + id + ", name='" + name + '\'' + '}'; } }
1. Create a Spring Boot project with Maven, add the Web starter, and import the Jest dependency:
<dependency> <groupId>io.searchbox</groupId> <artifactId>jest</artifactId> <version>5.3.4</version> </dependency>
2. Configure the Elasticsearch host address, using port 9200:
spring.elasticsearch.jest.uris=http://192.168.202.136:9200
3. Modify the document bean, marking the primary key with an annotation:
package zze.springboot.elasticsearch.bean; import io.searchbox.annotations.JestId; public class Product { @JestId private Integer id; private String name; private String remark; private Double price; public Integer getId() { return id; } public void setId(Integer id) { this.id = id; } public String getName() { return name; } public void setName(String name) { this.name = name; } public String getRemark() { return remark; } public void setRemark(String remark) { this.remark = remark; } public Double getPrice() { return price; } public void setPrice(Double price) { this.price = price; } @Override public String toString() { return "Product{" + "id=" + id + ", name='" + name + '\'' + '}'; } }
4. Test:
package zze.springboot.elasticsearch;

import io.searchbox.client.JestClient;
import io.searchbox.core.Index;
import io.searchbox.core.Search;
import io.searchbox.core.SearchResult;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.junit4.SpringRunner;
import zze.springboot.elasticsearch.bean.Product;

import java.io.IOException;
import java.util.List;

@RunWith(SpringRunner.class)
@SpringBootTest
public class JestTests {

    @Autowired
    private JestClient jestClient;

    // Index a document
    @Test
    public void testIndex() {
        // Create a product to use as the document
        Product product = new Product();
        product.setId(1);
        product.setName("iphone 8 plus");
        product.setPrice(5300D);
        product.setRemark("刺激戰場首選");
        // Build an index action that stores the document in the product index under the phone type
        Index index = new Index.Builder(product).index("product").type("phone").build();
        try {
            jestClient.execute(index);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Search
    @Test
    public void testSearch() {
        // Query expression; an empty string matches all documents
        String json = "{\n" +
                "    \"query\": {\n" +
                "        \"match\": {\n" +
                "            \"name\": \"iphone 8 plus\"\n" +
                "        }\n" +
                "    }\n" +
                "}";
        // Build a search on the phone type of the product index using the query held in json
        Search search = new Search.Builder(json).addIndex("product").addType("phone").build();
        try {
            SearchResult searchResult = jestClient.execute(search);
            List<SearchResult.Hit<Product, Void>> hits = searchResult.getHits(Product.class);
            for (SearchResult.Hit<Product, Void> hit : hits) {
                System.out.println(hit.source);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        /*
        Product{id=1, name='iphone 8 plus'}
        */
    }
}
By default, Spring Boot uses Spring Data to work with Elasticsearch.
Spring Data Elasticsearch official documentation | Spring Data Elasticsearch on GitHub
1. Create a Spring Boot project with Maven and add the Web and Elasticsearch starters.
2. Configure the Elasticsearch host address, using port 9300:
spring.data.elasticsearch.cluster-name=elasticsearch
spring.data.elasticsearch.cluster-nodes=192.168.202.136:9300
3. Modify the document bean, using an annotation to specify the index and type in which the document is stored:
package zze.springboot.elasticsearch.bean;

import org.springframework.data.elasticsearch.annotations.Document;

// Marks instances of this class as documents stored in the product index under the phone type
@Document(indexName = "product", type = "phone")
public class Product {
    private Integer id;
    private String name;
    private String remark;
    private Double price;

    public Integer getId() { return id; }
    public void setId(Integer id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public String getRemark() { return remark; }
    public void setRemark(String remark) { this.remark = remark; }
    public Double getPrice() { return price; }
    public void setPrice(Double price) { this.price = price; }

    @Override
    public String toString() {
        return "Product{" + "id=" + id + ", name='" + name + '\'' + '}';
    }
}
1. Create a Repository interface:
package zze.springboot.elasticsearch.repository;

import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;
import zze.springboot.elasticsearch.bean.Product;

import java.util.List;

public interface ProductRepository extends ElasticsearchRepository<Product, Integer> {
    // Custom method added on top of ElasticsearchRepository; see the official documentation and the GitHub docs for usage
    List<Product> findProductByNameLike(String name);
}
2. Test:
package zze.springboot.elasticsearch;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.junit4.SpringRunner;
import zze.springboot.elasticsearch.bean.Product;
import zze.springboot.elasticsearch.repository.ProductRepository;

import java.util.List;

@RunWith(SpringRunner.class)
@SpringBootTest
public class ElasticsearchRepositoryTests {

    @Autowired
    private ProductRepository productRepository;

    // Index a document
    @Test
    public void testIndex() {
        // Create a product to use as the document
        Product product = new Product();
        product.setId(1);
        product.setName("iphone 8 plus");
        product.setPrice(5300D);
        product.setRemark("刺激戰場首選");
        productRepository.index(product);
    }

    // Search
    @Test
    public void testSearch() {
        Iterable<Product> products = productRepository.findAll();
        products.forEach(p -> System.out.println(p));
        /*
        Product{id=1, name='iphone 8 plus'}
        */
    }

    // Query by name
    @Test
    public void testFindByName() {
        // A Like query value cannot contain a literal space; the original code substitutes \b for it
        // (note that in a Java string literal, \b is the backspace character)
        List<Product> products = productRepository.findProductByNameLike("iphone\b8");
        products.forEach(p -> System.out.println(p));
        /*
        Product{id=1, name='iphone 8 plus'}
        */
    }
}
For extending the interface with query methods, see the following examples:
Keyword | Sample | Corresponding query expression |
---|---|---|
@Query("{\"bool\" : {\"must\" : {\"field\" : {\"name\" : \"?0\"}}}}") Page<Product> findByName(String name, Pageable pageable);
For more usage details, see section 2.2 of the official documentation.
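How a positional placeholder such as `?0` in an `@Query` string relates to the method arguments can be illustrated with a plain string substitution. This is only a toy sketch with hypothetical names; Spring Data performs its own parameter binding internally, and the code below does not reproduce that implementation (for instance, it does no escaping).

```java
class QueryPlaceholderSketch {
    // The same query template as in the @Query example, with one positional placeholder.
    static final String TEMPLATE =
            "{\"bool\" : {\"must\" : {\"field\" : {\"name\" : \"?0\"}}}}";

    // Replaces ?0, ?1, ... with the corresponding argument.
    // Iterates from the highest index down so that "?1" never clobbers part of "?10".
    static String bind(String template, String... args) {
        String result = template;
        for (int i = args.length - 1; i >= 0; i--) {
            result = result.replace("?" + i, args[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(bind(TEMPLATE, "iphone"));
    }
}
```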
Test:
package zze.springboot.elasticsearch;

import org.elasticsearch.action.search.SearchType;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.data.elasticsearch.core.ElasticsearchTemplate;
import org.springframework.data.elasticsearch.core.query.IndexQuery;
import org.springframework.data.elasticsearch.core.query.IndexQueryBuilder;
import org.springframework.data.elasticsearch.core.query.NativeSearchQueryBuilder;
import org.springframework.data.elasticsearch.core.query.SearchQuery;
import org.springframework.test.context.junit4.SpringRunner;
import zze.springboot.elasticsearch.bean.Product;

import java.util.List;

@RunWith(SpringRunner.class)
@SpringBootTest
public class ElasticsearchTemplateTests {

    @Autowired
    private ElasticsearchTemplate elasticsearchTemplate;

    // Index a document
    @Test
    public void testIndex() {
        // Create a product to use as the document
        Product product = new Product();
        product.setId(3);
        product.setName("iphone 9 plus");
        product.setPrice(5300D);
        product.setRemark("刺激戰場首選");
        IndexQuery indexQuery = new IndexQueryBuilder()
                .withIndexName("product")
                .withType("phone")
                .withId(product.getId().toString())
                .withObject(product)
                .build();
        elasticsearchTemplate.index(indexQuery);
    }

    // Search
    @Test
    public void testSearch() {
        // Build the query builder
        BoolQueryBuilder bqb = QueryBuilders.boolQuery();
        bqb.must(QueryBuilders.boolQuery()
                .should(QueryBuilders.matchQuery("id", "3")));
        // Build a search query
        SearchQuery searchQuery = new NativeSearchQueryBuilder()
                .withQuery(bqb)
                .withIndices("product")
                .withTypes("phone")
                .withSearchType(SearchType.DEFAULT)
                .build();
        List<Product> products = elasticsearchTemplate.queryForList(searchQuery, Product.class);
        for (Product product : products) {
            System.out.println(product);
        }
        /*
        Product{id=3, name='iphone 9 plus'}
        */
    }
}
Note: the version of the Elasticsearch dependency pulled in by Spring Data must match the version of the Elasticsearch server. The matching rules are documented on GitHub:
Spring Data Elasticsearch | Elasticsearch |
---|---|
3.2.x | 6.5.0 |
3.1.x | 6.2.2 |
3.0.x | 5.5.0 |
2.1.x | 2.4.0 |
2.0.x | 2.2.0 |
1.3.x | 1.5.2 |
org.elasticsearch.transport.ConnectTransportException: [][192.168.202.136:9300] connect_timeout[30s] ... Caused by: java.net.ConnectException: Connection refused: no further information: /192.168.202.136:9300 ...
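The compatibility table above can be turned into a small lookup that a build or startup check might use. The sketch below hard-codes exactly the rows of that table; the class and method names are hypothetical, and version strings are assumed to start with `major.minor`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class VersionMatrixSketch {
    // Spring Data Elasticsearch release line -> Elasticsearch version, copied from the table.
    static final Map<String, String> ES_BY_SPRING_DATA = new LinkedHashMap<>();

    static {
        ES_BY_SPRING_DATA.put("3.2.x", "6.5.0");
        ES_BY_SPRING_DATA.put("3.1.x", "6.2.2");
        ES_BY_SPRING_DATA.put("3.0.x", "5.5.0");
        ES_BY_SPRING_DATA.put("2.1.x", "2.4.0");
        ES_BY_SPRING_DATA.put("2.0.x", "2.2.0");
        ES_BY_SPRING_DATA.put("1.3.x", "1.5.2");
    }

    // Maps a version such as "3.1.4.RELEASE" to its release line "3.1.x" and looks it up.
    static String elasticsearchFor(String springDataVersion) {
        String[] parts = springDataVersion.split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Expected major.minor: " + springDataVersion);
        }
        String line = parts[0] + "." + parts[1] + ".x";
        String es = ES_BY_SPRING_DATA.get(line);
        if (es == null) {
            throw new IllegalArgumentException("No entry for release line " + line);
        }
        return es;
    }

    public static void main(String[] args) {
        System.out.println(elasticsearchFor("3.1.4.RELEASE"));
    }
}
```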