ElasticSearch Quick-Start Tutorial

Original article: http://tabalt.net/blog/elasti...html

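As one of the most popular and most actively developed full-text search engines, ElasticSearch is hard to resist: it can be integrated into a project quickly and easily to store, search, and analyze massive amounts of data. In this article we start from scratch and get hands-on with ElasticSearch.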

Download, Install, and Start ElasticSearch

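Open the download page on the ElasticSearch website, https://www.elastic.co/downlo..., to find the download URL for the version you need. Then download, install, and start ElasticSearch with the following commands: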

cd ~/soft/
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.6.3.zip
unzip elasticsearch-5.6.3.zip

cd elasticsearch-5.6.3
./bin/elasticsearch     # add -d to run in the background as a daemon

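Note that ElasticSearch 5.6.3, the version downloaded above, requires Java 8 or later. If Java is not installed on your machine, or the installed version is too old, update it before running ./bin/elasticsearch. ElasticSearch is also fairly demanding on machine resources.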

To check that it started successfully, run curl 'http://localhost:9200/?pretty' on the command line. Normal output looks like this:

{
  "name" : "8Low6xs",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "CAMAT2P2QS-UnI32tB53_A",
  "version" : {
    "number" : "5.6.3",
    "build_hash" : "1a2f265",
    "build_date" : "2017-10-06T20:33:39.012Z",
    "build_snapshot" : false,
    "lucene_version" : "6.6.1"
  },
  "tagline" : "You Know, for Search"
}

ElasticSearch RESTful API

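ElasticSearch exposes an HTTP-based RESTful API that speaks JSON. You can call it directly with curl, and it is just as easy to use from any programming language. The official clients for common languages are listed at https://www.elastic.co/guide/...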

API Requests

Request format:

curl -X <VERB> '<PROTOCOL>://<HOST>:<PORT>/<PATH>?<QUERY_STRING>' -d '<BODY>'
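
| Parameter | Description |
| --- | --- |
| VERB | HTTP method: GET, POST, PUT, HEAD, or DELETE |
| PROTOCOL | http or https |
| HOST | Hostname of any node in the cluster |
| PORT | Port number; the default is 9200 |
| PATH | API endpoint path |
| QUERY_STRING | Any optional query string parameters |
| BODY | Request body in JSON format (if needed) |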

Request example:

curl -X GET 'http://localhost:9200/_count?pretty' -d '
{
    "query": {
        "match_all": {}
    }
}
'
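
The same request template can be assembled from a program. The following is a minimal sketch using only Python's standard library. The helper name build_request and its defaults are illustrative assumptions, not part of any official client. It builds the request without sending it, and it sets a Content-Type header, which the curl example above omits but newer Elasticsearch releases require.

```python
import json
import urllib.request


def build_request(verb, path, body=None, host="localhost", port=9200, protocol="http"):
    """Build (but do not send) an HTTP request shaped like the curl template."""
    url = f"{protocol}://{host}:{port}/{path.lstrip('/')}"
    data = None
    headers = {}
    if body is not None:
        # BODY is serialized as JSON; the explicit Content-Type header is an
        # addition in this sketch and does not appear in the curl example.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=verb)


request = build_request("GET", "_count?pretty", {"query": {"match_all": {}}})
print(request.get_method(), request.full_url)

# To actually send it (requires a running node):
# with urllib.request.urlopen(request) as response:
#     print(response.status, json.loads(response.read()))
```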

API Responses

Elasticsearch returns an HTTP status code (for example, 200 OK) and a JSON response body (except for HEAD requests). The curl request above returns a JSON body like the following:

{
  "count" : 0,
  "_shards" : {
    "total" : 0,
    "successful" : 0,
    "skipped" : 0,
    "failed" : 0
  }
}

To see the status code as well, pass curl the -i option, which includes the response headers in the output.

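ElasticSearch Storage Structure and Concepts

Document

Elasticsearch is document-oriented: it stores whole objects, using JSON as the serialization format. An example user document looks like this: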

{
    "email":      "john@smith.com",
    "first_name": "John",
    "last_name":  "Smith",
    "info": {
        "bio":         "Eco-warrior and defender of the weak",
        "age":         25,
        "interests": [ "dolphins", "whales" ]
    },
    "join_date": "2014/05/01"
}

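A stored document also carries metadata. Common metadata elements:

| Element | Description |
| --- | --- |
| _index | The index the document is stored in |
| _type | The document's object type |
| _id | The document's unique identifier |
| _version | The version of the data |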

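Note: a type is only a virtual, logical grouping inside an index, and different types in the same index should have similar structures. Version 6.x allows only one type per index, 7.x deprecates types in the APIs, and 8.0 removes them entirely.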

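Index

In ElasticSearch, the word "index" has several meanings:

  • 1. The place where a collection of documents is stored, similar to a database, is called an index (noun).
  • 2. The act of storing data in Elasticsearch is also called indexing (verb).
  • 3. The inverted index, the data structure used to speed up retrieval.

By default, ElasticSearch builds an inverted index (3) for every field of every document in an index (1), so that each field can be searched quickly.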
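
To make the inverted index idea concrete, here is a toy sketch in Python. It assumes naive lowercase whitespace tokenization and three made-up sample texts. Real analyzers and the underlying Lucene structures are far more elaborate, so treat this purely as an illustration of the term-to-documents mapping.

```python
from collections import defaultdict

# Sample texts keyed by document ID (illustrative data only).
documents = {
    1: "I love to go rock climbing",
    2: "I like to collect rock albums",
    3: "I like to build cabinets",
}


def build_inverted_index(docs):
    """Map each lowercased term to the sorted list of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


inverted_index = build_inverted_index(documents)
print(inverted_index["rock"])      # [1, 2]
print(inverted_index["cabinets"])  # [3]
```

Looking up a term is a single dictionary access that yields the matching document IDs directly, instead of a scan over every document.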

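Nodes, Clusters, and Shards

ElasticSearch is a distributed database that lets multiple servers work together, and each server can run multiple instances. A single instance is called a node, and a group of nodes forms a cluster. Shards are the underlying units of work: documents are stored in shards, shards are allocated across the nodes in the cluster, and each shard holds only part of the total data.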
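
The Elasticsearch documentation describes document placement as hash(routing) % number_of_primary_shards, where the routing value defaults to the document ID. The sketch below illustrates only that modulo step. The function name pick_shard is hypothetical, and zlib.crc32 is a stand-in hash chosen for illustration, so the shard numbers it prints will not match those of a real cluster.

```python
import zlib


def pick_shard(routing_value, number_of_primary_shards):
    """Return the shard number a document would be routed to in this sketch."""
    # zlib.crc32 is a stand-in hash for illustration only;
    # it is not the hash function Elasticsearch uses.
    hashed = zlib.crc32(routing_value.encode("utf-8"))
    return hashed % number_of_primary_shards


# Five primary shards is an assumption matching the "total" : 5 seen in
# the search responses later in this article.
for doc_id in ["1", "2", "3"]:
    print(doc_id, "-> shard", pick_shard(doc_id, 5))
```

Because the result depends on the number of primary shards, the same document ID would land elsewhere if that number changed.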

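Basic CRUD Operations in ElasticSearch

We use employee records at a company called wecompany as a running example for learning the basic operations.

Indexing Documents

Add three employee documents of type employee to the index named wecompany: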

curl -X PUT 'http://localhost:9200/wecompany/employee/1?pretty' -d '
{
    "first_name" : "John",
    "last_name" :  "Smith",
    "age" :        25,
    "about" :      "I love to go rock climbing",
    "interests": [ "sports", "music" ]
}
'
curl -X PUT 'http://localhost:9200/wecompany/employee/2?pretty' -d '
{
    "first_name" :  "Jane",
    "last_name" :   "Smith",
    "age" :         32,
    "about" :       "I like to collect rock albums",
    "interests":  [ "music" ]
}
'
curl -X PUT 'http://localhost:9200/wecompany/employee/3?pretty' -d '
{
    "first_name" :  "Douglas",
    "last_name" :   "Fir",
    "age" :         35,
    "about":        "I like to build cabinets",
    "interests":  [ "forestry" ]
}
'

Retrieving Documents

Get the document with ID 1:

curl -X GET 'http://localhost:9200/wecompany/employee/1?pretty'

{
  "_index" : "wecompany",
  "_type" : "employee",
  "_id" : "1",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "first_name" : "John",
    "last_name" : "Smith",
    "age" : 25,
    "about" : "I love to go rock climbing",
    "interests" : [
      "sports",
      "music"
    ]
  }
}

Search for employees whose last name is Smith:

curl -X GET 'http://localhost:9200/wecompany/employee/_search?q=last_name:Smith&pretty'

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "wecompany",
        "_type" : "employee",
        "_id" : "2",
        "_score" : 0.2876821,
        "_source" : {
          "first_name" : "Jane",
          "last_name" : "Smith",
          "age" : 32,
          "about" : "I like to collect rock albums",
          "interests" : [
            "music"
          ]
        }
      },
      {
        "_index" : "wecompany",
        "_type" : "employee",
        "_id" : "1",
        "_score" : 0.2876821,
        "_source" : {
          "first_name" : "John",
          "last_name" : "Smith",
          "age" : 25,
          "about" : "I love to go rock climbing",
          "interests" : [
            "sports",
            "music"
          ]
        }
      }
    ]
  }
}

Search for employees whose last name is Smith using the query DSL:

curl -X GET 'http://localhost:9200/wecompany/employee/_search?pretty' -d '
{
    "query" :  {
        "match" : {
            "last_name" : "Smith"
        }
    }
}
'

# Returns the same result as above

A more complex search: employees whose last name is Smith and whose age is greater than 30:

curl -X GET 'http://localhost:9200/wecompany/employee/_search?pretty' -d '
{
    "query" :  {
        "bool" : {
            "must" : {
                "match" : {
                    "last_name" : "Smith"
                }
            },
            "filter": {
                "range" : {
                    "age" : { "gt" : 30 } 
                }
            }
        }
    }
}
'

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "wecompany",
        "_type" : "employee",
        "_id" : "2",
        "_score" : 0.2876821,
        "_source" : {
          "first_name" : "Jane",
          "last_name" : "Smith",
          "age" : 32,
          "about" : "I like to collect rock albums",
          "interests" : [
            "music"
          ]
        }
      }
    ]
  }
}
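
When queries like this are generated by a program, the body is usually built as a nested data structure and then serialized. The following is a small sketch in Python. The function name build_name_age_query is hypothetical. Note that the "must" clause contributes to the relevance score while the "filter" clause only includes or excludes documents.

```python
import json


def build_name_age_query(last_name, min_age):
    """Build a bool query body: full-text match on last_name, filtered by age."""
    return {
        "query": {
            "bool": {
                # Scored clause: documents must match this last name.
                "must": {"match": {"last_name": last_name}},
                # Non-scored clause: keep only documents with age > min_age.
                "filter": {"range": {"age": {"gt": min_age}}},
            }
        }
    }


body = build_name_age_query("Smith", 30)
print(json.dumps(body, indent=4))
```

The printed JSON can be passed as the -d argument of the curl command above.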

Full-text search for employees who like rock climbing:

curl -X GET 'http://localhost:9200/wecompany/employee/_search?pretty' -d '
{
    "query" :  {
        "match" : {
            "about" : "rock climbing"
        }
    }
}
'

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 0.53484553,
    "hits" : [
      {
        "_index" : "wecompany",
        "_type" : "employee",
        "_id" : "1",
        "_score" : 0.53484553,
        "_source" : {
          "first_name" : "John",
          "last_name" : "Smith",
          "age" : 25,
          "about" : "I love to go rock climbing",
          "interests" : [
            "sports",
            "music"
          ]
        }
      },
      {
        "_index" : "wecompany",
        "_type" : "employee",
        "_id" : "2",
        "_score" : 0.26742277,
        "_source" : {
          "first_name" : "Jane",
          "last_name" : "Smith",
          "age" : 32,
          "about" : "I like to collect rock albums",
          "interests" : [
            "music"
          ]
        }
      }
    ]
  }
}

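In addition, replacing "match" with "match_phrase" in the request above returns only results that match the exact phrase "rock climbing". Adding a "highlight" parameter at the same level as "query" marks the matched keywords in the results with <em></em> tags: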

{
    "query" : { ... },
    "highlight" : {
        "fields" : {
            "about" : {}
        }
    }
}
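
The difference between "match" and "match_phrase" can be imitated in a few lines of Python. This is a simplified sketch that assumes lowercase whitespace tokenization and ignores relevance scoring entirely. The function names are illustrative and only mirror the query names.

```python
# The "about" texts of the three employees indexed earlier.
abouts = {
    1: "I love to go rock climbing",
    2: "I like to collect rock albums",
    3: "I like to build cabinets",
}


def tokenize(text):
    return text.lower().split()


def match(docs, query):
    """IDs of documents containing at least one of the query terms."""
    terms = set(tokenize(query))
    return [doc_id for doc_id, text in docs.items() if terms & set(tokenize(text))]


def match_phrase(docs, query):
    """IDs of documents containing the query terms adjacent and in order."""
    phrase = tokenize(query)
    hits = []
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        windows = (tokens[i:i + len(phrase)] for i in range(len(tokens) - len(phrase) + 1))
        if any(window == phrase for window in windows):
            hits.append(doc_id)
    return hits


print(match(abouts, "rock climbing"))         # [1, 2]
print(match_phrase(abouts, "rock climbing"))  # [1]
```

Document 2 matches the plain "match" query because it contains "rock", but it drops out of the phrase query because "rock" is not followed by "climbing".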

Aggregate the employees' interests:

  • First, enable fielddata on the interests field so that it can be aggregated
curl -X PUT 'http://localhost:9200/wecompany/_mapping/employee?pretty' -d '
{
    "properties": {
        "interests": { 
            "type":     "text",
            "fielddata": true
        }
    }
}
'

{
  "acknowledged" : true
}
  • Query the aggregation results
curl -X GET 'http://localhost:9200/wecompany/employee/_search?pretty' -d '
{
    "aggs": {
        "all_interests": {
            "terms": { "field": "interests" }
        }
    }
}
'

{
  "took" : 33,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [
      ...
    ]
  },
  "aggregations" : {
    "all_interests" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "music",
          "doc_count" : 2
        },
        {
          "key" : "forestry",
          "doc_count" : 1
        },
        {
          "key" : "sports",
          "doc_count" : 1
        }
      ]
    }
  }
}
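
Conceptually, the terms aggregation above counts how many documents contain each distinct value. The sketch below reproduces those buckets in plain Python over the three sample employees. It is an in-memory illustration under that assumption, not how Elasticsearch computes aggregations internally.

```python
from collections import Counter

# Only the fields relevant to the aggregation are included here.
employees = [
    {"first_name": "John", "interests": ["sports", "music"]},
    {"first_name": "Jane", "interests": ["music"]},
    {"first_name": "Douglas", "interests": ["forestry"]},
]


def terms_aggregation(docs, field):
    """Count how many documents contain each distinct value of `field`."""
    counts = Counter()
    for doc in docs:
        # set() so a value repeated inside one document is counted once.
        counts.update(set(doc.get(field, [])))
    # Highest count first; ties are broken alphabetically by key.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [{"key": key, "doc_count": count} for key, count in ordered]


buckets = terms_aggregation(employees, "interests")
print(buckets)
```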

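Updating Documents

To update the document with ID 2, simply PUT the full document again. The new body replaces the stored document as a whole, and its _version is incremented: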

curl -X PUT 'http://localhost:9200/wecompany/employee/2?pretty' -d '
{
    "first_name" :  "Jane",
    "last_name" :   "Smith",
    "age" :         33,
    "about" :       "I like to collect rock albums",
    "interests":  [ "music" ]
}
'
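
The replace-on-PUT behaviour can be pictured with a tiny in-memory model. This is a hedged sketch: the store dictionary and the put_document helper are hypothetical, and the returned fields only loosely mirror the "result" and "_version" values of a real response.

```python
store = {}


def put_document(doc_id, source):
    """Store `source` under `doc_id`, replacing any existing document entirely."""
    previous = store.get(doc_id)
    version = previous["_version"] + 1 if previous else 1
    store[doc_id] = {"_id": doc_id, "_version": version, "_source": source}
    return {"result": "updated" if previous else "created", "_version": version}


first = put_document("2", {"first_name": "Jane", "age": 32, "interests": ["music"]})
second = put_document("2", {"first_name": "Jane", "age": 33})
print(first)
print(second)
print(store["2"]["_source"])
```

Because the second call omits "interests", that field is gone from the stored document afterwards, which is why the curl example above resends every field rather than only the changed age.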

Deleting Documents

curl -X DELETE 'http://localhost:9200/wecompany/employee/1?pretty'

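Conclusion

You now have a basic grasp of how to install and use ElasticSearch and of its core concepts, but please don't stop here. ElasticSearch has real depth and a rich feature set waiting for you to explore. The official documentation is the most up-to-date and complete learning material, and you can find it at the following page: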

https://www.elastic.co/guide/...
