Full-Text Search with ElasticSearch

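Chapter 1  Full-Text Search

1.1 Overview

In enterprise development today, the two most commonly used search servers are Solr and ElasticSearch.

ElasticSearch wraps Lucene behind a RESTful API, which simplifies how clients use Lucene.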

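1.1.1 Comparison with a relational database

Index: comparable to a database; the logical location where data is stored.

Type: comparable to a table.

Document: comparable to a row; documents are the data that a term search returns.

Field (a document attribute): comparable to a column.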

 

 

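The default service port is 9300 (the transport port that Java clients connect to).

The port for the control page is 9200 (the HTTP port used by browsers and REST calls).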
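
To make the analogy concrete: over the HTTP port, ES 2.x addresses a single document as /index/type/id, much as a row is located by database, table and primary key. The sketch below only builds such a URL. The class and method names are hypothetical, and "bos", "waybill" and "1" are sample values.

```java
final class DocumentPathSketch {

    // Builds the REST address of one document: http://host:port/index/type/id
    static String documentUrl(String host, int httpPort,
                              String index, String type, String id) {
        return "http://" + host + ":" + httpPort
                + "/" + index + "/" + type + "/" + id;
    }

    public static void main(String[] args) {
        // prints http://127.0.0.1:9200/bos/waybill/1
        System.out.println(documentUrl("127.0.0.1", 9200, "bos", "waybill", "1"));
    }
}
```

Note that the URL uses 9200, not 9300: port 9300 speaks the binary transport protocol and is not reachable from a browser.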

1.2 Installing the ElasticSearch search server

Official site: https://www.elastic.co/products/elasticsearch

Package layout:

bin: ElasticSearch run scripts

config: configuration files

lib: jar files that ElasticSearch depends on at run time

modules: ElasticSearch modules

plugins: plugins

Environment setup:

Set the JAVA_HOME environment variable.

Run elasticSearch/bin/elasticsearch.bat.

Visit http://127.0.0.1:9200 to check that the server responds.
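
The same check can be scripted instead of done in a browser. The following is a minimal sketch using only the JDK; the class name EsPing, the method isUp and the two-second timeouts are assumptions made for illustration.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

final class EsPing {

    // Returns true if a GET on baseUrl answers with HTTP 200, false on any
    // other status or on a connection error (for example, server not started).
    static boolean isUp(String baseUrl) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) new URL(baseUrl).openConnection();
            conn.setConnectTimeout(2000); // assumed timeout, in milliseconds
            conn.setReadTimeout(2000);
            conn.setRequestMethod("GET");
            int status = conn.getResponseCode();
            conn.disconnect();
            return status == 200;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("ElasticSearch reachable: " + isUp("http://127.0.0.1:9200"));
    }
}
```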

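1.3 Installing the ElasticSearch head plugin (graphical interface)

1.3.1 Online installation

       1. Change to the bin directory.

       2. Run plugin.bat install mobz/elasticsearch-head

       3. The head plugin is installed into the plugins directory.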

      

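1.3.2 If the online installation fails

       1. Download head to the local machine.

       2. Place the head plugin in the plugins directory.

       Visit: http://localhost:9200/_plugin/head/

The installation succeeded if the head management page is displayed.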

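1.4 Integrating the IK analyzer

1.4.1 Download the source package

  https://github.com/medcl/elasticsearch-analysis-ik/tree/2.x

1.4.2 Build the IK analyzer

Run: mvn clean package

1.4.3 Go to the target/release directory of the IK project

Copy the files from that directory into plugins/analysis-ik under the ElasticSearch folder.

1.4.4 Go to the target/release/config directory of the IK project

Copy all of the configuration files into the config directory under the ElasticSearch folder.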

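1.4.5 Edit ES/config/elasticsearch.yml

Append the following at the very end:

index.analysis.analyzer.ik.type: "ik"

Restart ES and confirm that the ik analyzer is loaded.

1.4.6 Verify by visiting:

http://localhost:9200/_analyze?analyzer=ik&pretty=true&text=我是中國人

If the response lists the tokens for the sentence, the integration succeeded.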
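
A browser encodes the Chinese text in that URL automatically; a program has to do it explicitly. The sketch below builds the same _analyze URL with the text percent-encoded as UTF-8. The class and method names are hypothetical, and the sample sentence is written with Unicode escapes so the source compiles under any file encoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class AnalyzeUrlSketch {

    // Builds baseUrl/_analyze?analyzer=...&pretty=true&text=... with
    // the analyzer name and the text percent-encoded as UTF-8.
    static String analyzeUrl(String baseUrl, String analyzer, String text) {
        try {
            return baseUrl + "/_analyze?analyzer="
                    + URLEncoder.encode(analyzer, "UTF-8")
                    + "&pretty=true&text="
                    + URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is unexpected
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The escapes spell the sample sentence used in the URL above
        String sentence = "\u6211\u662F\u4E2D\u570B\u4EBA";
        System.out.println(analyzeUrl("http://localhost:9200", "ik", sentence));
    }
}
```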

1.5 Using Spring Data ElasticSearch

1.5.1 Add the Maven dependencies:

<!-- elasticsearch -->
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>2.4.0</version>
</dependency>
<dependency>
    <groupId>org.springframework.data</groupId>
    <artifactId>spring-data-elasticsearch</artifactId>
    <version>2.0.4.RELEASE</version>
</dependency>

 

1.5.2 Configuration files

applicationContext-elasticsearch.xml:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:elasticsearch="http://www.springframework.org/schema/data/elasticsearch"
    xsi:schemaLocation="
       http://www.springframework.org/schema/beans
       http://www.springframework.org/schema/beans/spring-beans.xsd
       http://www.springframework.org/schema/data/elasticsearch
       http://www.springframework.org/schema/data/elasticsearch/spring-elasticsearch-1.0.xsd">

    <!-- Scan for search DAOs (repositories) -->
    <elasticsearch:repositories base-package="cn.peihua.bos.index" />

    <!-- Configure the client (connects to the transport port, 9300) -->
    <elasticsearch:transport-client id="client" cluster-nodes="127.0.0.1:9300"/>

    <!-- Configure the search template -->
    <bean id="elasticsearchTemplate"
       class="org.springframework.data.elasticsearch.core.ElasticsearchTemplate">
       <constructor-arg name="client" ref="client"/>
    </bean>
</beans>

Import it in applicationContext.xml:

<!-- Import the ElasticSearch configuration -->
<import resource="applicationContext-elasticsearch.xml"/>

 

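1.5.3 Writing the domain class

Note: this example uses Hibernate annotations for data access.

Entity mapping annotations:

@Document: marks a document object (index name, document type)

@Id: the document's primary key, its unique identifier

Note:

       Besides the JPA @Id, the id field of the entity class also needs the @org.springframework.data.annotation.Id annotation.

@Field: settings for each field of the document (type, whether it is analyzed, whether it is stored, analyzer)

Field attributes:

index = FieldIndex.not_analyzed: the field is not analyzed

index = FieldIndex.analyzed: the field is analyzed

store = true: the field value is stored in ES

analyzer = "ik": ik is the analyzer used when indexing

searchAnalyzer = "ik": ik is the analyzer used when querying

Note:

       Numeric fields (Integer/Double) cannot be analyzed.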

 

package cn.peihua.bos.domain.take_delivery;

import java.io.Serializable;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.JoinColumn;
import javax.persistence.OneToOne;
import javax.persistence.Table;

import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldIndex;
import org.springframework.data.elasticsearch.annotations.FieldType;

import cn.peihua.bos.domain.base.Area;

/**
 * @description: waybill entity class
 */
@Entity
@Table(name = "T_WAY_BILL")
@Document(indexName = "bos", type = "waybill")
public class WayBill implements Serializable {

    @Id
    @GeneratedValue
    @Column(name = "C_ID")
    @org.springframework.data.annotation.Id
    @Field(index = FieldIndex.not_analyzed, store = true, type = FieldType.Integer)
    private Integer id;

    @Column(name = "C_WAY_BILL_NUM", unique = true)
    @Field(index = FieldIndex.not_analyzed, store = true, type = FieldType.String)
    private String wayBillNum; // waybill number

    @OneToOne
    @JoinColumn(name = "C_ORDER_ID")
    private Order order; // order information

    @Column(name = "C_SEND_NAME")
    @Field(index = FieldIndex.analyzed, analyzer = "ik", searchAnalyzer = "ik", store = true, type = FieldType.String)
    private String sendName; // sender name

    @Column(name = "C_SEND_MOBILE")
    @Field(index = FieldIndex.analyzed, analyzer = "ik", searchAnalyzer = "ik", store = true, type = FieldType.String)
    private String sendMobile; // sender phone number

    @Column(name = "C_SEND_COMPANY")
    @Field(index = FieldIndex.analyzed, analyzer = "ik", searchAnalyzer = "ik", store = true, type = FieldType.String)
    private String sendCompany; // sender company

 

  ……………………

 

 

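1.5.4 Writing the index repository (DAO)

Note: do not put this index DAO in the same package as the existing database DAOs, so that the ElasticSearch repository scan (base-package cn.peihua.bos.index) does not overlap with the database one.

package cn.peihua.bos.index;

import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

import cn.peihua.bos.domain.take_delivery.WayBill;

public interface WayBillIndexRepository extends
        ElasticsearchRepository<WayBill, Integer> {

}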

 

 

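1.5.5 Writing the service

Inject WayBillIndexRepository into the service.

Whenever a waybill is saved to the database, save it to the index as well.

    // Inject the index DAO
    @Autowired
    private WayBillIndexRepository wayBillIndexRepository;

Then call both repositories in the save method:

    wayBillRepository.save(wayBill);      // database
    wayBillIndexRepository.save(wayBill); // index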
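
The order of the two calls matters because the entity id is generated on the database side (@GeneratedValue), and the index document should carry that same id. The following is a plain-Java sketch of that ordering, with in-memory maps standing in for the two repositories. Every class here is a hypothetical stand-in, not the Spring Data API, and rejecting a missing id is a rule of this sketch rather than ElasticSearch behaviour.

```java
import java.util.HashMap;
import java.util.Map;

final class DualWriteSketch {

    static final class WayBill {
        Integer id;
        final String wayBillNum;
        WayBill(String wayBillNum) { this.wayBillNum = wayBillNum; }
    }

    // Stand-in for the database repository: assigns an id on first save
    static final class DbStore {
        private final Map<Integer, WayBill> rows = new HashMap<>();
        private int nextId = 1;
        WayBill save(WayBill w) {
            if (w.id == null) {
                w.id = nextId++;
            }
            rows.put(w.id, w);
            return w;
        }
    }

    // Stand-in for the index repository: keyed by the same id as the database
    static final class IndexStore {
        final Map<Integer, WayBill> docs = new HashMap<>();
        void save(WayBill w) {
            if (w.id == null) {
                // Sketch-only rule, so both stores always share one key
                throw new IllegalArgumentException("save to the database first");
            }
            docs.put(w.id, w);
        }
    }

    // Database first (id gets assigned), then the index
    static WayBill saveBoth(DbStore db, IndexStore index, WayBill w) {
        WayBill saved = db.save(w);
        index.save(saved);
        return saved;
    }

    public static void main(String[] args) {
        WayBill saved = saveBoth(new DbStore(), new IndexStore(), new WayBill("WB-001"));
        System.out.println("saved with id " + saved.id);
    }
}
```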

 

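1.6 Querying ElasticSearch

1.6.1 Query types

Tip:

Check a query against the analyzer output; it helps explain why a query does or does not match.

http://localhost:9200/_analyze?analyzer=ik&pretty=true&text=我是中國人

TermQuery: exact match on a single term; the search text is not analyzed

WildcardQuery: wildcard (fuzzy) match

BooleanQuery: combines several query conditions

       //must: the condition must hold, like AND

       //must not: the condition must not hold, like NOT

       //should: the condition may hold, like OR

QueryBuilders.queryStringQuery(text): analyzed query; it searches all fields by default, and a specific field can also be named

       With the default analyzer: Chinese text is matched character by character

       With the ik analyzer: the search text is analyzed first, then its terms are matched exactly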
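
The must / must not / should rules can be tried out without a running server. The sketch below models them with java.util.function.Predicate. It is an illustration of the matching logic only, not the ElasticSearch API: the class BoolSketch is hypothetical, scoring is ignored, and it assumes the ES 2.x default that at least one should clause has to match when there is no must clause.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class BoolSketch<T> {
    private final List<Predicate<T>> must = new ArrayList<>();
    private final List<Predicate<T>> mustNot = new ArrayList<>();
    private final List<Predicate<T>> should = new ArrayList<>();

    BoolSketch<T> must(Predicate<T> p) { must.add(p); return this; }
    BoolSketch<T> mustNot(Predicate<T> p) { mustNot.add(p); return this; }
    BoolSketch<T> should(Predicate<T> p) { should.add(p); return this; }

    boolean matches(T doc) {
        for (Predicate<T> p : must) {
            if (!p.test(doc)) return false;   // AND: every must clause holds
        }
        for (Predicate<T> p : mustNot) {
            if (p.test(doc)) return false;    // NOT: no must-not clause holds
        }
        if (must.isEmpty() && !should.isEmpty()) {
            for (Predicate<T> p : should) {
                if (p.test(doc)) return true; // OR: one should clause is enough
            }
            return false;
        }
        return true; // with must clauses present, should is optional here
    }

    public static void main(String[] args) {
        // Inner OR group wrapped in an outer AND, as in a nested bool query
        BoolSketch<String> either = new BoolSketch<String>()
                .should(s -> s.contains("Beijing"))
                .should(s -> s.contains("Haidian"));
        BoolSketch<String> query = new BoolSketch<String>()
                .must(either::matches)
                .mustNot(String::isEmpty);
        System.out.println(query.matches("Haidian District")); // prints true
        System.out.println(query.matches("Shanghai"));         // prints false
    }
}
```

Nesting a should-only group inside must is the usual way to express "A and (B or C)".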

 

 

    // Paged waybill query with optional search conditions
    public Page<WayBill> findPageData(WayBill wayBill, Pageable pageable) {
        // Decide whether this is a conditional query, i.e. whether any condition is set on wayBill
        if (StringUtils.isBlank(wayBill.getWayBillNum())
                && StringUtils.isBlank(wayBill.getSendAddress())
                && StringUtils.isBlank(wayBill.getRecAddress())
                && StringUtils.isBlank(wayBill.getSendProNum())
                && (wayBill.getSignStatus() == null || wayBill.getSignStatus() == 0)) {
            // No conditions: page directly through the database
            return wayBillRepository.findAll(pageable);
        } else {

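            // Query conditions:
            // must: the condition must hold, like AND
            // must not: the condition must not hold, like NOT
            // should: the condition may hold, like OR

            // Create the combined (bool) query object
            BoolQueryBuilder query = new BoolQueryBuilder();

            // Build each condition and add it to the combined query:

            // 1. Exact match on the waybill number
            if (StringUtils.isNotBlank(wayBill.getWayBillNum())) {
                // Term query: the input is matched as-is, without analysis
                QueryBuilder termQuery = new TermQueryBuilder("wayBillNum",
                        wayBill.getWayBillNum());
                query.must(termQuery);
            }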

            // 2. Sender address
            if (StringUtils.isNotBlank(wayBill.getSendAddress())) {
                // Case 1: the input is only part of a single term, so use a wildcard query
                QueryBuilder wildcardQueryBuilder = new WildcardQueryBuilder(
                        "sendAddress", "*" + wayBill.getSendAddress() + "*");

                // Case 2: the input (e.g. 北京市海淀區) combines several terms, so analyze it
                // and require every resulting term to match; .field() names the field to search
                QueryBuilder queryStringQueryBuilder = new QueryStringQueryBuilder(
                        wayBill.getSendAddress())
                        .field("sendAddress")
                        .defaultOperator(Operator.AND);

                // Either case may match: combine the two with should (OR)
                BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder();
                boolQueryBuilder.should(wildcardQueryBuilder);
                boolQueryBuilder.should(queryStringQueryBuilder);

                // Add the pair to the overall query
                query.must(boolQueryBuilder);
            }

            // 3. Recipient address
            if (StringUtils.isNotBlank(wayBill.getRecAddress())) {
                // Case 1: the input is only part of a single term, so use a wildcard query
                QueryBuilder wildcardQueryBuilder = new WildcardQueryBuilder(
                        "recAddress", "*" + wayBill.getRecAddress() + "*");

                // Case 2: the input (e.g. 北京市海淀區) combines several terms, so analyze it
                // and require every resulting term to match; .field() names the field to search
                QueryBuilder queryStringQueryBuilder = new QueryStringQueryBuilder(
                        wayBill.getRecAddress())
                        .field("recAddress")
                        .defaultOperator(Operator.AND);

                // Either case may match: combine the two with should (OR)
                BoolQueryBuilder boolQueryBuilder = new BoolQueryBuilder();
                boolQueryBuilder.should(wildcardQueryBuilder);
                boolQueryBuilder.should(queryStringQueryBuilder);

                // Add the pair to the overall query
                query.must(boolQueryBuilder);
            }

            // 4. Exact match on the product type number
            if (StringUtils.isNotBlank(wayBill.getSendProNum())) {
                // Term query on the express product type
                QueryBuilder termQuery = new TermQueryBuilder("sendProNum",
                        wayBill.getSendProNum());
                query.must(termQuery);
            }

            // 5. Exact match on the sign-off status
            if (wayBill.getSignStatus() != null && wayBill.getSignStatus() != 0) {
                // A status is set: match it as-is with a term query
                QueryBuilder signStatusQuery = new TermQueryBuilder("signStatus",
                        wayBill.getSignStatus());
                query.must(signStatusQuery);
            }

            SearchQuery searchQuery = new NativeSearchQuery(query);
            searchQuery.setPageable(pageable); // apply paging
            // Conditional query: search the index
            return wayBillIndexRepository.search(searchQuery);
        }
    }
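
Why the address conditions pair a wildcard query with a query-string query: a wildcard pattern is compared against each indexed term on its own, so an input that spans two terms cannot match any single one of them. The sketch below imitates this with a regular expression. The class name is hypothetical, and the terms "beijing" and "haidian" are made-up stand-ins for analyzer output, not real ik results.

```java
import java.util.regex.Pattern;

final class WildcardSketch {

    // Translates an ES-style wildcard (* = any run of characters,
    // ? = exactly one character) into an equivalent Java regex.
    static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                sb.append(".*");
            } else if (c == '?') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(sb.toString(), Pattern.DOTALL);
    }

    // True if the pattern matches at least one term in full
    static boolean matchesAnyTerm(String wildcard, String... terms) {
        Pattern p = toRegex(wildcard);
        for (String term : terms) {
            if (p.matcher(term).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Part of one term: the wildcard finds it
        System.out.println(matchesAnyTerm("*haid*", "beijing", "haidian"));           // prints true
        // Input spanning two terms: no single term contains it
        System.out.println(matchesAnyTerm("*beijinghaidian*", "beijing", "haidian")); // prints false
    }
}
```

The second case is the one the query-string branch with Operator.AND covers, since it analyzes the input into terms before matching.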
