Getting Started with ElasticSearch: Search Made Simple

Date: 2019-11-22


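I don't know search engines very well, but I do know databases fairly well. You can think of search as a replacement for a database's LIKE operator, because LIKE has the following shortcomings:

First, LIKE is slow. A LIKE query generally cannot use an index unless it is a prefix match, and most real requirements are not prefix matches.

Second, LIKE cannot do truly fuzzy matching. For example, LIKE '%化痰沖劑%' will not find "化痰止咳沖劑", yet that is exactly what ordinary users expect.

Third, LIKE cannot sort by relevance. Thousands of rows may match a keyword, but the user will only look at 100 of them, so the database often returns records the user does not care about.

These shortcomings are why search engines came into being.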
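To make the second point concrete, here is a minimal, self-contained Java sketch (plain Java, not ES code) that contrasts a LIKE-style substring test with per-character term matching. The class name `LikeVsTerms` and its helper methods are hypothetical, and treating every character as one term is only a rough stand-in for what a real analyzer does.

```java
import java.util.HashSet;
import java.util.Set;

public class LikeVsTerms {

    // Rough stand-in for an analyzer: treat every character as one term.
    static Set<Character> terms(String text) {
        Set<Character> result = new HashSet<>();
        for (char c : text.toCharArray()) {
            result.add(c);
        }
        return result;
    }

    // LIKE '%keyword%' behaves like a plain substring test.
    static boolean likeMatch(String column, String keyword) {
        return column.contains(keyword);
    }

    // Term matching: every term of the keyword must occur somewhere in the column.
    static boolean containsAllTerms(String column, String keyword) {
        return terms(column).containsAll(terms(keyword));
    }

    public static void main(String[] args) {
        String column = "化痰止咳沖劑";
        String keyword = "化痰沖劑";
        // The keyword is not a contiguous substring, so the LIKE-style test fails.
        System.out.println("LIKE-style match: " + likeMatch(column, keyword));
        // All four characters of the keyword occur in the column, so the term test succeeds.
        System.out.println("Term match: " + containsAllTerms(column, keyword));
    }
}
```

The substring test fails because "止咳" sits in the middle of the stored value, while the term-based test succeeds because it ignores position and adjacency.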

To demonstrate the ES search API and its search features, we first need to create some data.

import org.elasticsearch.action.bulk.BulkRequestBuilder;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.index.IndexRequestBuilder;
import org.elasticsearch.client.Client;

import com.donlianli.es.ESUtils;
import com.donlianli.es.model.LogModel;

public class BulkIndexTest {
    public static void main(String[] args) {
        String[] desc = new String[]{
            "玉屏風口服液",
            "清咽丸",
            "四消丸",
            "感冒清膠囊",
            "人蔘歸脾丸",
            "人蔘健脾丸",
            "明目地黃丸",
            "小兒咳喘靈顆粒",
            "小兒化痰止咳沖劑",
            "雙黃連",
            "六味地黃丸"
        };
        Client client = ESUtils.getClient();
        BulkRequestBuilder bulkRequest = client.prepareBulk();
        // Only the first 10 descriptions are indexed, under IDs 1000..1009
        int j = 0;
        for (int i = 1000; i < 1010; i++) {
            LogModel l = new LogModel();
            l.setDesc(desc[j]);
            j++;
            String json = ESUtils.toJson(l);
            IndexRequestBuilder indexRequest = client.prepareIndex("twitter", "tweet")
                // Give each document a unique ID
                .setSource(json).setId(String.valueOf(i));
            // Add the index request to the bulk builder
            bulkRequest.add(indexRequest);
        }
        BulkResponse bulkResponse = bulkRequest.execute().actionGet();
        if (bulkResponse.hasFailures()) {
            // Print a summary of the failures; iterate over the
            // bulk response items if you need to handle each one
            System.out.println(bulkResponse.buildFailureMessage());
        }
        client.close();
    }
}

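LogModel is defined in the earlier post in this series, "Getting Started with ElasticSearch: CRUD".

We inserted 10 records into ES. The actual IDs don't matter, as long as they are unique.

Next, we search the desc field of LogModel. We start with the simplest possible query, the single character "丸" (pill), and we expect to get back every record that contains it.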

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;

import com.donlianli.es.ESUtils;

public class QuerySearchTest {
    public static void main(String[] args) {
        Client client = ESUtils.getClient();
        // Match the text "丸" against the desc field
        QueryBuilder query = QueryBuilders.fieldQuery("desc", "丸");
        SearchResponse response = client.prepareSearch("twitter")
            .setTypes("tweet")
            // Set the query
            .setQuery(query)
            // Return at most 60 hits, starting from the first
            .setFrom(0).setSize(60)
            .execute()
            .actionGet();
        SearchHits shs = response.getHits();
        // Print the relevance score and the desc field of each hit
        for (SearchHit hit : shs) {
            System.out.println("分數(score):" + hit.getScore() + ", 業務描述(desc):" +
                hit.getSource().get("desc"));
        }
        client.close();
    }
}

Output:

分數(score):2.97438, 業務描述(desc):四消丸

分數(score):2.7716475, 業務描述(desc):清咽丸

分數(score):2.6025825, 業務描述(desc):人蔘歸脾丸

分數(score):2.6025825, 業務描述(desc):人蔘健脾丸

分數(score):2.4251914, 業務描述(desc):明目地黃丸

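As you can see, the search engine returned every indexed record that contains "丸", and the shortest descriptions were automatically ranked first. Pretty smart, isn't it? We got working search without configuring anything in ES.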
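One plausible reason the shorter descriptions come first is field-length normalization: in Lucene's classic TF-IDF similarity, a match in a short field counts for more than the same match in a long field. The sketch below is a simplification, not ES's actual scoring code. The class and method names are hypothetical, it computes only the 1/sqrt(number of terms) factor, it assumes one term per character, and it will not reproduce the exact scores printed above.

```java
public class LengthNormSketch {

    // Simplified length normalization: 1 / sqrt(number of terms in the field).
    // Assumption: each character of the description is indexed as one term.
    static double lengthNorm(String field) {
        int numTerms = field.codePointCount(0, field.length());
        return 1.0 / Math.sqrt(numTerms);
    }

    public static void main(String[] args) {
        String[] docs = {"四消丸", "明目地黃丸"};
        for (String doc : docs) {
            // The 3-character description gets a larger factor than the 5-character one.
            System.out.println(doc + " -> " + lengthNorm(doc));
        }
    }
}
```

Since both descriptions contain "丸" exactly once, the length factor is the only thing that differs between them in this sketch, which is consistent with "四消丸" outranking "明目地黃丸".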

Next, let's try searching for "小兒顆粒". Can you guess whether it will find any records? Output:

分數(score):4.46157, 業務描述(desc):小兒咳喘靈顆粒

分數(score):0.87699485, 業務描述(desc):小兒化痰止咳沖劑

Not bad: although no record matches the query exactly, the related records were all returned.
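The following self-contained sketch illustrates why a query with no exact match can still return ranked results. It is plain Java rather than ES code, and it rests on two assumptions: the query and the documents are split into single-character terms, and a document qualifies as soon as it contains at least one query term. The class `PartialMatchSketch` and its methods are hypothetical, and ranking by the number of matched terms is a crude substitute for a real relevance score.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PartialMatchSketch {

    // Count how many distinct characters of the query occur in the document.
    static long matchedTerms(String doc, String query) {
        return query.chars().distinct().filter(c -> doc.indexOf(c) >= 0).count();
    }

    // Keep documents that match at least one term, best match first.
    static List<String> search(List<String> docs, String query) {
        List<String> hits = new ArrayList<>();
        for (String doc : docs) {
            if (matchedTerms(doc, query) > 0) {
                hits.add(doc);
            }
        }
        hits.sort((a, b) -> Long.compare(matchedTerms(b, query), matchedTerms(a, query)));
        return hits;
    }

    public static void main(String[] args) {
        // The ten descriptions that were indexed earlier
        List<String> docs = Arrays.asList(
            "玉屏風口服液", "清咽丸", "四消丸", "感冒清膠囊", "人蔘歸脾丸",
            "人蔘健脾丸", "明目地黃丸", "小兒咳喘靈顆粒", "小兒化痰止咳沖劑", "雙黃連");
        for (String hit : search(docs, "小兒顆粒")) {
            System.out.println(matchedTerms(hit, "小兒顆粒") + " terms matched: " + hit);
        }
    }
}
```

In this sketch "小兒咳喘靈顆粒" matches all four query characters and "小兒化痰止咳沖劑" matches only two, so they come back in the same order as in the ES output.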

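At this point, we have basically replaced the database's LIKE with ES. There is much more to search, and I am still exploring it.

PS: ESUtils.getClient() is just a static method that creates an ES client.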

public static Client getClient() {
    Settings settings = ImmutableSettings.settingsBuilder()
        // Name of the cluster to connect to
        .put("cluster.name", "elasticsearch")
        // Sniff the cluster to discover the state of its nodes
        .put("client.transport.sniff", true).build();
    Client client = new TransportClient(settings)
        .addTransportAddress(new InetSocketTransportAddress("192.168.1.106", 9300));
    return client;
}
