A First Look at Elasticsearch

Date: 2019-12-06

Tags: elasticsearch (category: log analysis)

Overview

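Elasticsearch is an open-source search engine built on Apache Lucene(TM). In both the open-source and proprietary worlds, Lucene can be regarded as the most advanced, best-performing, and most full-featured search engine library to date. Elasticsearch builds on it to provide:

A distributed, real-time document store in which every field is indexed and searchable
A distributed, real-time analytics search engine
The ability to scale to hundreds of servers and handle petabytes of structured or unstructured data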

The following shows how storage concepts in a relational database map to those in Elasticsearch:

Relational DB -> Databases -> Tables -> Rows -> Columns
Elasticsearch -> Indices -> Types -> Documents -> Fields

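An index here is not the same thing as an index in a traditional relational database: in Elasticsearch, an index plays roughly the role that a database plays in a relational system.

Both Elasticsearch and Lucene speed up retrieval with a data structure called an inverted index, which maps each term to the documents that contain it.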
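To make the idea of an inverted index concrete, here is a minimal sketch in Python. It is only an illustration, not how Lucene is implemented: the function names, the whitespace tokenizer, and the English sample documents are all assumptions made for this example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive whitespace tokenizer (an assumption); real analyzers do far more.
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search(index, query):
    """Return ids of documents that contain every term in the query."""
    postings = [set(index.get(term, [])) for term in query.lower().split()]
    if not postings:
        return []
    return sorted(set.intersection(*postings))


# Hypothetical sample documents, loosely modeled on the ones indexed later.
docs = {
    1: "the sun is overhead",
    2: "the winding path and the sun overhead",
    3: "the most beautiful sun",
}
index = build_inverted_index(docs)
print(search(index, "sun overhead"))  # [1, 2]
```

A lookup only touches the posting lists of the query terms instead of scanning every document, which is where the speed-up comes from.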

Installation and Usage

Because I have only a limited number of machines, I installed Elasticsearch in a virtual machine.

The test environment is as follows:

VMware virtual machine
CentOS 6.5
JDK 1.8
Elasticsearch 5.2.2

Download and Installation

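The latest version of Elasticsearch can be downloaded from the official website.

Download: https://www.elastic.co/downloads/elasticsearch

On Windows, download the zip package; on Linux, download the tar.gz package.

A JDK must be installed first. Elasticsearch 5.x requires Java 8 (JDK 1.8) or later.

On Windows the process is simple: unzip the package, switch to the bin directory in a command prompt, and run elasticsearch. Then open localhost:9200 to check whether it started successfully.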

Installing and Running on Linux

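On Linux, create a directory named elasticsearch and use a tool such as Xftp or WinSCP to transfer the downloaded package into it.

Extract the archive with tar -zxvf elasticsearch-5.2.2.tar.gz.

Change into the extracted directory and run ./bin/elasticsearch to start Elasticsearch.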

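Problems you may run into:

Startup may fail because you are running as root; Elasticsearch refuses to start as the root user, so switch to a regular user.
Startup may also fail because some Linux system parameters need to be raised, such as the maximum virtual memory map count (vm.max_map_count) and the maximum number of threads a user may create. The startup log names the checks that failed; change the settings as it indicates, and look up where each one is configured on your distribution.
Port 9200 must be open in the Linux firewall, otherwise the host machine cannot reach Elasticsearch inside the virtual machine. Use the ifconfig command to find the Linux machine's IP address.
If it is still unreachable, go to the config directory under the Elasticsearch directory, open the configuration file with vim elasticsearch.yml, set network.host to 0.0.0.0, then save and restart Elasticsearch. That should resolve the problem.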

Open http://<ip>:9200. If a JSON response describing the node is returned, Elasticsearch has started successfully.
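The same check can be scripted instead of done in a browser. The sketch below uses only the Python standard library; the function name fetch_node_info and the timeout value are assumptions made for illustration, and the address in the usage comment is the virtual machine address used later in this post.

```python
import json
import urllib.error
import urllib.request


def fetch_node_info(base_url, timeout=3.0):
    """Return the parsed JSON from the root endpoint, or None if unreachable."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, firewall block, or a non-JSON reply.
        return None


# Example usage (needs a reachable node, so it is left as a comment):
# info = fetch_node_info("http://192.168.91.129:9200")
# print(info["version"]["number"] if info else "not reachable")
```

A None result usually points back to the list above: the firewall port or the network.host setting.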

Installing elasticsearch-head

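elasticsearch-head is a handy management tool that visualizes the state of the cluster.

Download: https://github.com/mobz/elasticsearch-head

Since Elasticsearch 5.0 it can no longer be installed as a plugin. Instead, clone the repository with git and install it with npm.

Open http://<ip>:9100, and remember to open that port in the firewall.

In Chrome or Firefox, a cross-origin (CORS) restriction may prevent head from connecting. One workaround is to connect using IE; enabling CORS in elasticsearch.yml (the http.cors.enabled and http.cors.allow-origin settings) is the more usual fix.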

Basic Usage

The official website documents client APIs for many languages. Here I use the simplest option, the HTTP API with JSON data.

Inserting Data

On Linux, run:
curl -XPUT 'http://192.168.91.129:9200/website/blog/1' -d '{"title":"liuyang", "text":"太陽在頭頂之上", "date":"2017-4-16"}'

This inserts one document. After inserting several documents, enter the following in a browser:
http://192.168.91.129:9200/_search?pretty

The result contains multiple records. The pretty parameter formats the JSON output for readability. The data can also be fetched directly with curl -XGET.
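For readers who prefer a script to curl, the insert request above could be built as follows. This is a sketch using only the Python standard library; build_index_request is a hypothetical helper name, and the explicit Content-Type header is an addition that the original curl command does not send.

```python
import json
import urllib.request


def build_index_request(base_url, index, doc_type, doc_id, document):
    """Build (but do not send) a PUT request that indexes one document."""
    url = "{}/{}/{}/{}".format(base_url.rstrip("/"), index, doc_type, doc_id)
    body = json.dumps(document, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        # Not present in the curl command above; added here as an assumption.
        headers={"Content-Type": "application/json"},
    )


req = build_index_request(
    "http://192.168.91.129:9200", "website", "blog", 1,
    {"title": "liuyang", "text": "太陽在頭頂之上", "date": "2017-4-16"},
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

The URL follows the index/type/id layout from the overview: website is the index, blog the type, and 1 the document id.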

Search

Query String

Enter the following in a browser:
http://192.168.91.129:9200/_search?q=太陽

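Put the string you want to find after q= and it will be matched against the indexed documents.

The _score field is the degree of match; results with a higher score appear earlier.

Formatted result: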

{
    "took": 292, 
    "timed_out": false, 
    "_shards": {
        "total": 5, 
        "successful": 5, 
        "failed": 0
    }, 
    "hits": {
        "total": 3, 
        "max_score": 0.56008905, 
        "hits": [
            {
                "_index": "website", 
                "_type": "blog", 
                "_id": "2", 
                "_score": 0.56008905, 
                "_source": {
                    "title": "title2", 
                    "text": "小路彎彎,太陽正在頭頂上", 
                    "date": "2017-4-16"
                }
            }, 
            {
                "_index": "website", 
                "_type": "blog", 
                "_id": "3", 
                "_score": 0.5446649, 
                "_source": {
                    "title": "title3", 
                    "text": "最美的太陽", 
                    "date": "2017-4-16"
                }
            }, 
            {
                "_index": "website", 
                "_type": "blog", 
                "_id": "1", 
                "_score": 0.48515025, 
                "_source": {
                    "title": "liuyang", 
                    "text": "太陽在頭頂之上", 
                    "date": "2017-4-16"
                }
            }
        ]
    }
}
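A browser percent-encodes the non-ASCII characters in the query-string URL automatically, but a script has to do so itself. The following sketch shows one way with the Python standard library; build_search_url is a hypothetical helper name introduced for this example.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query, pretty=True):
    """Build a query-string search URL with the query percent-encoded."""
    params = {"q": query}
    if pretty:
        params["pretty"] = "true"
    return "{}/_search?{}".format(base_url.rstrip("/"), urlencode(params))


url = build_search_url("http://192.168.91.129:9200", "太陽")
print(url)
# http://192.168.91.129:9200/_search?q=%E5%A4%AA%E9%99%BD&pretty=true
```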

Full-Text Search

On Linux, run:
curl -XGET 'http://192.168.91.129:9200/_search' -d '{"query":{"match":{"text":"太陽"}}}'

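The JSON body names the specific field to search. This gives you full-text search, which is hard to achieve in a relational database.

As the output shows, every document carries a _score field, which is the relevance score of that document for the query.

Formatted output: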

{
    "took": 44, 
    "timed_out": false, 
    "_shards": {
        "total": 5, 
        "successful": 5, 
        "failed": 0
    }, 
    "hits": {
        "total": 3, 
        "max_score": 0.5716521, 
        "hits": [
            {
                "_index": "website", 
                "_type": "blog", 
                "_id": "1", 
                "_score": 0.5716521, 
                "_source": {
                    "title": "liuyang", 
                    "text": "太陽在頭頂之上", 
                    "date": "2017-4-16"
                }
            }, 
            {
                "_index": "website", 
                "_type": "blog", 
                "_id": "3", 
                "_score": 0.5649868, 
                "_source": {
                    "title": "title3", 
                    "text": "最美的太陽", 
                    "date": "2017-4-16"
                }
            }, 
            {
                "_index": "website", 
                "_type": "blog", 
                "_id": "2", 
                "_score": 0.48515025, 
                "_source": {
                    "title": "title2", 
                    "text": "小路彎彎,太陽正在頭頂上", 
                    "date": "2017-4-16"
                }
            }
        ]
    }
}
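When working with such responses in a script, it helps to reduce them to the ranked ids and scores. The sketch below builds the same match query body as the curl command and summarizes an abridged copy of the response above; the helper names build_match_query and summarize_hits are assumptions made for illustration.

```python
import json


def build_match_query(field, text):
    """Request body for a match query on a single field."""
    return {"query": {"match": {field: text}}}


def summarize_hits(response):
    """Return (id, score, text) tuples in the order they were ranked."""
    return [
        (hit["_id"], hit["_score"], hit["_source"].get("text"))
        for hit in response["hits"]["hits"]
    ]


# Abridged copy of the response shown above.
sample_response = json.loads("""
{
  "hits": {
    "total": 3,
    "max_score": 0.5716521,
    "hits": [
      {"_id": "1", "_score": 0.5716521, "_source": {"text": "太陽在頭頂之上"}},
      {"_id": "3", "_score": 0.5649868, "_source": {"text": "最美的太陽"}},
      {"_id": "2", "_score": 0.48515025, "_source": {"text": "小路彎彎,太陽正在頭頂上"}}
    ]
  }
}
""")

print(json.dumps(build_match_query("text", "太陽"), ensure_ascii=False))
for doc_id, score, text in summarize_hits(sample_response):
    print(doc_id, score, text)
```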

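Comparing the two result sets, the ranking from the full-text match query on the text field (documents 1, 3, 2) fits my expectations better than the ranking from the query-string search (documents 2, 3, 1).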
