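Before we start, a quick introduction: Elasticsearch is a distributed, RESTful search and data analytics engine.
1. Installing Elasticsearch
This section covers installation on Windows; Linux should be much the same. First download the package from the official site and unzip it. I used version 6.5; you can pick a version at https://www.elastic.co/cn/downloads/past-releases#elasticsearch. Then edit elasticsearch.yml in the config directory: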
# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
# Before you set out to tweak and tune the configuration, make sure you
# understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
# Cluster name
cluster.name: my-cluster
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
# Node name
node.name: node-1
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
#path.data: /path/to/data
#
# Path to log files:
#
#path.logs: /path/to/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
# These two settings allow elasticsearch-head to connect to Elasticsearch
http.cors.enabled: true
http.cors.allow-origin: "*"
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 0.0.0.0
#
# Set a custom port for HTTP:
# HTTP port number
http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
#discovery.zen.ping.unicast.hosts: ["host1", "host2"]
#
# Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
#
#discovery.zen.minimum_master_nodes:
#
# For more information, consult the zen discovery module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
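After unzipping, go into the bin directory and run elasticsearch.bat to start the server. Then visit http://localhost:9200/. If the node's JSON information page appears, Elasticsearch is installed correctly.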
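The same check can be done from code instead of the browser. Below is a minimal sketch, assuming Java 11 or later and a node listening on localhost:9200 as configured above. The class name EsPing and the helper rootUri are my own names for illustration, not part of any Elasticsearch library.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper: sends a GET to the root endpoint of a node.
class EsPing {
    // Build the root URI of a node, for example http://localhost:9200/
    static URI rootUri(String host, int port) {
        return URI.create("http://" + host + ":" + port + "/");
    }

    public static void main(String[] args) {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))
                .build();
        HttpRequest request = HttpRequest.newBuilder(rootUri("localhost", 9200))
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
        try {
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            // A running node answers with a JSON document describing itself.
            System.out.println("HTTP " + response.statusCode());
            System.out.println(response.body());
        } catch (Exception e) {
            // Connection refused or timeout: the node is probably not started yet.
            System.out.println("Elasticsearch is not reachable: " + e);
        }
    }
}
```

If the call fails, check that elasticsearch.bat is still running and that http.port in elasticsearch.yml matches the port used here.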
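Of course, working with Elasticsearch on its own is not very convenient. A visual web UI, elasticsearch-head-master (a community project hosted on GitHub), is available, and its installation is described below.
2. Installing elasticsearch-head-master
head runs on Node.js, so Node.js must be installed first. head itself can be downloaded from https://github.com/mobz/elasticsearch-head.git. Note that the head version should match your Elasticsearch version, otherwise you may get errors.
After unzipping, edit Gruntfile.js as follows: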
connect: {
    server: {
        options: {
            port: 9100,
            hostname: '*',
            base: '.',
            keepalive: true
        }
    }
}
Then edit app.js in the _site directory as follows:
init: function(parent) {
    this._super();
    this.prefs = services.Preferences.instance();
    this.base_uri = this.config.base_uri || this.prefs.get("app-base_uri") || "http://localhost:9200";
    if( this.base_uri.charAt( this.base_uri.length - 1 ) !== "/" ) {
        // XHR request fails if the URL is not ending with a "/"
        this.base_uri += "/";
    }
    if( this.config.auth_user ) {
        var credentials = window.btoa( this.config.auth_user + ":" + this.config.auth_password );
        $.ajaxSetup({
            headers: {
                "Authorization": "Basic " + credentials
            }
        });
    }
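Then run the following commands in the elasticsearch-head directory. The first time, install the dependencies with npm install.
Then start the service with npm run start and open localhost:9100 (the port set in Gruntfile.js) to see the following page:
I won't go into how to use this UI here; explore it yourself. Elasticsearch queries are fast, and the advantage shows most clearly on large data sets. But how do we get a large amount of data into it? Below I introduce Logstash, an officially recommended tool for syncing data from a database.
3. Syncing a MySQL database with Logstash
First download Logstash from the official site: https://www.elastic.co/guide/en/logstash/7.3/logstash-7-3-1.html. Note that the Logstash version needs to match your Elasticsearch version, otherwise you will get errors.
After downloading, go into the bin directory. Because we need to read data from MySQL and write it to Elasticsearch, the logstash-input-jdbc and logstash-output-elasticsearch plugins are required. Run logstash-plugin install logstash-input-jdbc and then logstash-plugin install logstash-output-elasticsearch.
The following error may appear during installation:
Once the plugins are installed, create a jdbc.conf file in Logstash's config directory (the file name is my own choice). The configuration is as follows: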
input {
  jdbc {
    # MySQL JDBC connection settings
    jdbc_connection_string => "jdbc:mysql://localhost:3306/cinema?useUnicode=true&characterEncoding=utf-8&useSSL=false"
    jdbc_user => "root"
    jdbc_password => "123456"
    # Path to the MySQL JDBC driver jar; it can be downloaded from https://dev.mysql.com/downloads/connector/j/
    jdbc_driver_library => "E:/mavenware/mysql/mysql-connector-java/5.1.39/mysql-connector-java-5.1.39.jar"
    # the name of the driver class for mysql
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_paging_enabled => true
    jdbc_page_size => "50000"
    jdbc_default_timezone => "Asia/Shanghai"
    # Either point to a SQL file, or write the SQL statement inline here, for example:
    #where update_time >= :sql_last_value
    statement => "select * from movies"
    #statement_filepath => "./config/jdbc.sql"
    # Cron-like schedule; this one runs the sync every minute
    # (fields: minute, hour, day of month, month, day of week)
    schedule => "* * * * *"
    #type => "jdbc"
    # Whether to record the last run. If true, the last value of tracking_column
    # is saved to the file given by last_run_metadata_path
    #record_last_run => true
    # Whether to track the value of a specific column. Set this to true to name
    # the column yourself; otherwise the timestamp of the last run is tracked
    use_column_value => true
    # Required when use_column_value is true: the name of the column to track.
    # The column must be increasing; usually it is the MySQL primary key
    tracking_column => "update_time"
    # The type must be specified here, otherwise errors may occur
    tracking_column_type => "timestamp"
    # File that stores the value from the last run
    last_run_metadata_path => "./logstash_capital_bill_last_id.txt"
    # Whether to clear the record in last_run_metadata_path. If true, every run
    # starts from scratch and reads all database records again
    clean_run => false
    # Whether to convert column names to lowercase
    lowercase_column_names => false
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "movies"
    # This field is the unique ID (primary key) of each document in the index.
    # Change the field name to match your own table; values must not repeat
    document_id => "%{movie_id}"
    template_overwrite => true
  }
  # Debug output; can be commented out in production
  stdout {
    codec => json_lines
  }
}
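The document_id value "%{movie_id}" is a field reference: for every row, Logstash substitutes the row's movie_id into the string. The sketch below imitates that substitution in plain Java so the effect is easy to see. It is not Logstash's code; the class FieldReference and its resolve method are hypothetical names, and the behavior for a missing field (the reference stays as literal text) is my understanding of how Logstash behaves.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical imitation of "%{field}" substitution, for illustration only.
class FieldReference {
    private static final Pattern REF = Pattern.compile("%\\{([^}]+)\\}");

    // Replace each %{name} with the event's value; unknown names stay literal.
    static String resolve(String template, Map<String, ?> event) {
        Matcher m = REF.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            Object value = event.get(m.group(1));
            String replacement = (value == null) ? m.group() : String.valueOf(value);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> row = Map.of("movie_id", 42, "movie_name", "Example");
        // The row's primary key becomes the document ID.
        System.out.println(resolve("%{movie_id}", row));
        // A wrong column name is not resolved and would be used as-is.
        System.out.println(resolve("%{id}", row));
    }
}
```

This is why the field name has to match a real column: if it does not, every row would end up with the same literal ID and overwrite the previous document. Because the ID is stable per row, re-running the sync updates existing documents instead of duplicating them.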
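In the bin directory, run logstash -f ../config/jdbc.conf. I ran this in a Windows command prompt; on Linux use ./logstash -f ../config/jdbc.conf. Logstash has to be able to find jdbc.conf, so you can use an absolute path or a relative path as I did. After it runs, you will see that Elasticsearch now contains a lot of data, as shown below:
Note that as long as Logstash keeps running, it periodically updates the index according to the conditions you gave it. That is why tables usually get an update_time column: it makes near-real-time updates possible.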
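To make the role of update_time concrete, here is a small simulation of the incremental idea behind the commented-out clause where update_time >= :sql_last_value. It is a sketch under my own assumptions, not Logstash's implementation: the class IncrementalSync, the Row type, and the rule "remember the newest update_time seen" are all illustrative simplifications.

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of tracking-column based incremental sync.
class IncrementalSync {
    // One table row: primary key plus the tracking column.
    static final class Row {
        final long movieId;
        final LocalDateTime updateTime;

        Row(long movieId, LocalDateTime updateTime) {
            this.movieId = movieId;
            this.updateTime = updateTime;
        }
    }

    // Plays the role of :sql_last_value.
    private LocalDateTime lastValue;

    IncrementalSync(LocalDateTime initialValue) {
        this.lastValue = initialValue;
    }

    // One scheduled run: pick rows with update_time >= lastValue,
    // then remember the newest update_time that was seen.
    List<Row> runOnce(List<Row> table) {
        List<Row> selected = new ArrayList<>();
        LocalDateTime newest = lastValue;
        for (Row row : table) {
            if (!row.updateTime.isBefore(lastValue)) {
                selected.add(row);
                if (row.updateTime.isAfter(newest)) {
                    newest = row.updateTime;
                }
            }
        }
        lastValue = newest;
        return selected;
    }

    LocalDateTime lastValue() {
        return lastValue;
    }

    public static void main(String[] args) {
        IncrementalSync sync = new IncrementalSync(LocalDateTime.of(2019, 9, 1, 0, 0));
        List<Row> table = new ArrayList<>();
        table.add(new Row(1L, LocalDateTime.of(2019, 9, 2, 10, 0)));
        table.add(new Row(2L, LocalDateTime.of(2019, 9, 3, 10, 0)));
        System.out.println("first run: " + sync.runOnce(table).size() + " rows");
        table.add(new Row(3L, LocalDateTime.of(2019, 9, 4, 10, 0)));
        System.out.println("second run: " + sync.runOnce(table).size() + " rows");
    }
}
```

With >= the row whose update_time equals the stored value is read again on the next run. That is harmless here because document_id makes the write an overwrite of the same document.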
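Now that the index has been created, we want to use it in a project.
4. Using Elasticsearch in Spring Boot
I use Maven as the build tool, so the first step is to add the following dependency: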
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
    <version>2.0.2.RELEASE</version>
</dependency>
Below is my yml configuration file:
spring:
  data:
    elasticsearch:
      cluster-name: my-cluster
      # Node address. Note that in API mode the port is 9300; do not write 9200
      cluster-nodes: localhost:9300
      repositories:
        enabled: true
  mvc:
    view:
      prefix: /WEB-INF/jsp
      suffix: .jsp
server:
  port: 8088
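Be careful not to get the port wrong: it is 9300 (the transport port), not 9200 (the HTTP port) or 9100 (the head port). I configured the JSP view resolver because I use JSP; if you use Thymeleaf templates you don't need it at all.
Below is my entity class: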
package com.lwc.pojo;

import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

/**
 * Attributes used on @Field below:
 * index: whether the field is indexed (searchable)
 * analyzer: analyzer used when storing
 * searchAnalyzer: analyzer used when searching
 * store: whether the field is stored
 * type: data type
 */
@Document(indexName = "movies", type = "doc", shards = 1, replicas = 0, refreshInterval = "-1")
public class Movies {
    // Long, to match the ID type declared on the repository (ElasticsearchRepository<Movies, Long>)
    @Id
    private Long movie_id;
    @Field(store = true, index = true, type = FieldType.Keyword, searchAnalyzer = "ik_smart")
    private String movie_name;
    private String movie_time;
    private String movie_date;
    @Field(store = true, index = true, type = FieldType.Keyword, searchAnalyzer = "ik_smart")
    private String movie_area;
    private String movie_lang;
    @Field(store = true, index = true, type = FieldType.Keyword, searchAnalyzer = "ik_smart")
    private String movie_director;
    @Field(store = true, index = true, type = FieldType.Keyword, searchAnalyzer = "ik_smart")
    private String movie_writer;
    @Field(store = true, index = true, type = FieldType.Keyword, searchAnalyzer = "ik_smart")
    private String movie_actor;
    @Field(store = true, index = true, type = FieldType.Keyword, searchAnalyzer = "ik_smart")
    private String movie_type;
    @Field(store = true, index = true, type = FieldType.Float, searchAnalyzer = "ik_smart")
    private Float movie_mark;
    private String key_word;
    private Integer movie_size;
    private Integer movie_classify;

    // getters and setters omitted
}
I have left out the getters and setters to save space. The field names here must match the field names in the index.
Below is the DAO layer code:
package com.lwc.dao;

import com.lwc.pojo.Movies;
import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;
import org.springframework.stereotype.Repository;

@Repository
public interface MovieDao extends ElasticsearchRepository<Movies, Long> {
}
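Spring Data already implements the basic Elasticsearch operations for us. Just as with JPA on Hibernate, we only need to define an interface that extends ElasticsearchRepository<Movies,Long>. The interface takes two type parameters: the first is the entity class, the second is the type of the index's primary key.
My primary key is movie_id, so I declared the key type as Long.
Below is my service layer: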
package com.lwc.service;

import com.lwc.dao.MovieDao;
import com.lwc.pojo.Movies;
import org.elasticsearch.index.query.QueryStringQueryBuilder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

@Service
public class MovieService {
    @Autowired
    private MovieDao movieDao;

    public List<Movies> findMovies(String keyWord) {
        QueryStringQueryBuilder qsq = new QueryStringQueryBuilder(keyWord);
        // Add the query fields, i.e. the fields searched for the entered keyword
        qsq.field("movie_area").field("movie_director").field("movie_writer")
           .field("movie_actor").field("movie_type").field("movie_name");
        Iterator<Movies> iterator = movieDao.search(qsq).iterator();
        List<Movies> list = new ArrayList<Movies>();
        while (iterator.hasNext()) {
            list.add(iterator.next());
        }
        return list;
    }
}
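It can help to see what this query looks like as a raw request. The sketch below builds, in plain Java with no Elasticsearch dependency, the JSON body of a query_string query over several fields, which is the kind of body one could POST to /movies/_search. The class QueryStringBody and its methods are names I made up for illustration, and the exact JSON the builder above sends may contain additional default options.

```java
import java.util.List;

// Hypothetical helper that builds a query_string request body by hand.
class QueryStringBody {
    // Escape characters that are special inside a JSON string literal.
    static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Build {"query":{"query_string":{"query":...,"fields":[...]}}}
    static String build(String keyword, List<String> fields) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"query\":{\"query_string\":{\"query\":\"")
          .append(jsonEscape(keyword))
          .append("\",\"fields\":[");
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append('"').append(jsonEscape(fields.get(i))).append('"');
        }
        sb.append("]}}}");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Same fields as in the service layer.
        System.out.println(build("keyword", List.of(
                "movie_area", "movie_director", "movie_writer",
                "movie_actor", "movie_type", "movie_name")));
    }
}
```

Pasting the printed body into head's query tab is a quick way to check whether a surprising result comes from the query itself or from the Spring layer.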
Below is the controller code:
package com.lwc.controller;

import com.lwc.pojo.Movies;
import com.lwc.service.MovieService;
import com.lwc.vo.SearchVo;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.ResponseBody;
import org.springframework.web.bind.annotation.RestController;
import util.ChineseUtil;

import java.util.List;

@RestController
@RequestMapping("/movies")
public class MoviesController {
    @Autowired
    private MovieService movieService;

    // SearchVo is a mapping class whose main purpose is to receive the data sent by the front end
    @ResponseBody
    @RequestMapping("/getMovies")
    public List<Movies> getMovies(@RequestBody SearchVo searchVo) {
        String keyWord = searchVo.getKeyWord();
        List<Movies> list = null;
        // Check whether the keyword contains Chinese. The page uses the keyup event,
        // so the pinyin letters typed while composing are also sent as input.
        // This check filters them out, but it is not very friendly to pure English input
        if (ChineseUtil.hasChineseByReg(keyWord)) {
            list = movieService.findMovies(keyWord);
        }
        if (list != null) {
            for (Movies movies : list) {
                System.out.println(movies.getMovie_name());
            }
        }
        return list;
    }
}
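The ChineseUtil class used by the controller is not shown in this post. One possible way to write such a check is sketched below. This is a hypothetical implementation, not the original one: it assumes "Chinese" means a character in the basic CJK Unified Ideographs range U+4E00 to U+9FA5, which leaves out rarer extension blocks.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for the util.ChineseUtil class used in the controller.
class ChineseUtil {
    // Matches one character from the basic CJK Unified Ideographs range.
    private static final Pattern CHINESE = Pattern.compile("[\\u4e00-\\u9fa5]");

    // True if the text contains at least one Chinese character.
    static boolean hasChineseByReg(String text) {
        return text != null && CHINESE.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(hasChineseByReg("liulang"));            // pinyin only
        System.out.println(hasChineseByReg("\u6d41\u6d6a\u5730\u7403")); // Chinese title
    }
}
```

With a check like this, pinyin typed during composition is ignored and a search only fires once the input method commits Chinese characters, which also explains why pure English keywords never reach the service.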
Below is the page code:
<%--
  Created by IntelliJ IDEA.
  User: Administrator
  Date: 2019/9/6
  Time: 16:39
  To change this template use File | Settings | File Templates.
--%>
<%@ page contentType="text/html;charset=UTF-8" language="java" %>
<html>
<head>
    <title>Title</title>
    <script type="text/javascript" src="/jquery.js"></script>
</head>
<body>
<script type="text/javascript">
    $(function(){
        $("#name").keyup(function(){
            var name = {keyWord: $("#name").val()};
            var json = JSON.stringify(name);
            $.ajax({
                url: "/movies/getMovies",
                type: "POST",
                dataType: "json",
                data: json,
                contentType: "application/json;charset=UTF-8",
                success: function(data){
                    alert(data);
                    // $.each needs the collection as its first argument
                    $.each(data, function(i, n){
                    })
                }
            })
        })
    })
</script>
Movie name: <input type="text" id="name">
<div class="movies">
    <ul class="movieList">
    </ul>
</div>
</body>
</html>
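The ul inside the div is meant to hold the movies returned by the query, but I am lazy and will fill that in some other time. Elasticsearch has another major feature: smart word segmentation.
5. Configuring the smart tokenizer
First download the plugin from https://github.com/medcl/elasticsearch-analysis-ik, then place it in the following directory:
analysis-ik is a directory I created myself (plugins live under the plugins directory of the Elasticsearch installation). After placing the plugin there, restart Elasticsearch.
If you see what is shown in the figure, the IK tokenizer is configured. The next step is to test it, but first we need to understand what its two modes mean.
The IK tokenizer has two segmentation modes: ik_max_word and ik_smart.
(1) ik_max_word
Splits the text at the finest granularity. For example, 「中華人民共和國人民大會堂」 is split into 「中華人民共和國、中華人民、中華、華人、人民共和國、人民、共和國、大會堂、大會、會堂」 and other terms.
(2) ik_smart
Splits the text at the coarsest granularity. For example, 「中華人民共和國人民大會堂」 is split into 「中華人民共和國、人民大會堂」.
Below we test the IK tokenizer; the query can be run directly from head:
As we can see, the tokenizer segments Chinese text intelligently, the search then runs on the resulting terms, and the matching result set is returned.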
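The two modes can also be compared outside head by calling the _analyze endpoint. The sketch below assumes Java 11 or later, a node at localhost:9200, and the IK plugin installed as described above. The class AnalyzeRequest and its body helper are my own illustrative names; only the analyzer and text fields of the request are used.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper that asks Elasticsearch how an analyzer splits a text.
class AnalyzeRequest {
    // Build the JSON body for the _analyze endpoint.
    static String body(String analyzer, String text) {
        return "{\"analyzer\":\"" + escape(analyzer) + "\",\"text\":\"" + escape(text) + "\"}";
    }

    // Minimal escaping for backslashes and double quotes.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        HttpClient client = HttpClient.newHttpClient();
        for (String analyzer : new String[] {"ik_max_word", "ik_smart"}) {
            HttpRequest request = HttpRequest
                    .newBuilder(URI.create("http://localhost:9200/_analyze"))
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString(
                            body(analyzer, "中華人民共和國人民大會堂")))
                    .build();
            try {
                HttpResponse<String> response =
                        client.send(request, HttpResponse.BodyHandlers.ofString());
                // The response lists the tokens produced by the chosen analyzer.
                System.out.println(analyzer + ": " + response.body());
            } catch (Exception e) {
                System.out.println("Request failed: " + e);
            }
        }
    }
}
```

If the response is an error saying the analyzer cannot be found, the plugin was most likely not loaded, so check the plugin directory and restart Elasticsearch.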