Getting Started with Elasticsearch

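Elasticsearch is built on the open-source library Lucene. You cannot use Lucene directly, however; you have to write your own code to call its interfaces. Elasticsearch wraps Lucene and exposes a REST API, so it works out of the box.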

1. Installation

Environment

#lsb_release -a
LSB Version:    :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
Distributor ID: CentOS
Description:    CentOS release 6.5 (Final)
Release:    6.5
Codename:   Final

#uname -a
Linux CDVM-213010030 2.6.32-431.el6.x86_64 #1 SMP Fri Nov 22 03:15:09 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

Dependencies

Elasticsearch requires Java 8 or later.

1.1 Installing Java

Download the Java JDK from the Java SE Development Kit 8 - Downloads page.

[elsearch@CDVM-213010030 tmp]$ ls /tmp/installed/
jdk-8u121-linux-x64.rpm  jdk-8u121-linux-x64.tar.gz
[elsearch@CDVM-213010030 tmp]$ rpm -ivh jdk-8u121-linux-x64.rpm
[elsearch@CDVM-213010030 tmp]$ java -version
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)

1.2 Installing Elasticsearch

Download the latest version of Elasticsearch: https://www.elastic.co/guide/...

Extract the archive

# tar zxvf elasticsearch-5.2.0.tar.gz -C /opt

Start it by running the elasticsearch script

# cd /opt/elasticsearch-5.2.0/bin
# ./elasticsearch

Running it as root produces the following error:

[root@CDVM-213010030 elasticsearch-5.2.0]# ./bin/elasticsearch
[2017-02-08T14:22:45,125][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupException: java.lang.RuntimeException: can not run elasticsearch as root
    at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:125) ~[elasticsearch-5.2.0.jar:5.2.0]
    at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:112) ~[elasticsearch-5.2.0.jar:5.2.0]
    at 
    ... 6 more

This restriction exists for security reasons. Because Elasticsearch can accept and execute user-supplied scripts, it is recommended to create a dedicated user to run it.

Create the elsearch group and the elsearch user

groupadd elsearch
# useradd -p expects an already-encrypted hash, so set the password with passwd instead
useradd -g elsearch elsearch
passwd elsearch

Change the owner and group of the Elasticsearch directory and everything in it to elsearch:elsearch

chown -R elsearch:elsearch /opt/elasticsearch-5.2.0

Switch to the elsearch user and start again

# su elsearch
$ cd /opt/elasticsearch-5.2.0/bin
$ ./elasticsearch

The startup output looks like this:

[elsearch@CDVM-213010030 elasticsearch-5.2.0]$ ./bin/elasticsearch
[2017-02-08T15:22:52,677][WARN ][o.e.b.JNANatives         ] unable to install syscall filter: 
java.lang.UnsupportedOperationException: seccomp unavailable: requires kernel 3.5+ with CONFIG_SECCOMP and CONFIG_SECCOMP_FILTER compiled in
    at org.elasticsearch.bootstrap.SystemCallFilter.linuxImpl(SystemCallFilter.java:350) ~[elasticsearch-5.2.0.jar:5.2.0]
    at org.elasticsearch.bootstrap.SystemCallFilter.init(SystemCallFilter.java:638) ~[elasticsearch-5.2.0.jar:5.2.0]
    at org.elasticsearch.bootstrap.JNANatives.tryInstallSystemCallFilter(JNANatives.java:215) [elasticsearch-5.2.0.jar:5.2.0]
    at org.elasticsearch.bootstrap.Natives.tryInstallSystemCallFilter(Natives.java:99) [elasticsearch-5.2.0.jar:5.2.0]
    at org.elasticsearch.bootstrap.Bootstrap.initializeNatives(Bootstrap.java:110) [elasticsearch-5.2.0.jar:5.2.0]
    at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:203) [elasticsearch-5.2.0.jar:5.2.0]
    at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:333) [elasticsearch-5.2.0.jar:5.2.0]
    at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:121) [elasticsearch-5.2.0.jar:5.2.0]
    at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:112) [elasticsearch-5.2.0.jar:5.2.0]
    at org.elasticsearch.cli.SettingCommand.execute(SettingCommand.java:54) [elasticsearch-5.2.0.jar:5.2.0]
    at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:122) [elasticsearch-5.2.0.jar:5.2.0]
    at org.elasticsearch.cli.Command.main(Command.java:88) [elasticsearch-5.2.0.jar:5.2.0]
    at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:89) [elasticsearch-5.2.0.jar:5.2.0]
    at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:82) [elasticsearch-5.2.0.jar:5.2.0]
[2017-02-08T15:22:52,768][INFO ][o.e.n.Node               ] [] initializing ...
[2017-02-08T15:22:52,838][INFO ][o.e.e.NodeEnvironment    ] [qm6aUUo] using [1] data paths, mounts [[/ (/dev/vda)]], net usable_space [184gb], net total_space [196.8gb], spins? [possibly], types [ext4]
[2017-02-08T15:22:52,839][INFO ][o.e.e.NodeEnvironment    ] [qm6aUUo] heap size [1.9gb], compressed ordinary object pointers [true]
[2017-02-08T15:22:52,840][INFO ][o.e.n.Node               ] node name [qm6aUUo] derived from node ID [qm6aUUoUScO_S16Sod_7Bw]; set [node.name] to override
[2017-02-08T15:22:52,841][INFO ][o.e.n.Node               ] version[5.2.0], pid[22947], build[24e05b9/2017-01-24T19:52:35.800Z], OS[Linux/2.6.32-431.el6.x86_64/amd64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_121/25.121-b13]
[2017-02-08T15:22:53,573][INFO ][o.e.p.PluginsService     ] [qm6aUUo] loaded module [aggs-matrix-stats]
[2017-02-08T15:22:53,573][INFO ][o.e.p.PluginsService     ] [qm6aUUo] loaded module [ingest-common]
[2017-02-08T15:22:53,573][INFO ][o.e.p.PluginsService     ] [qm6aUUo] loaded module [lang-expression]
[2017-02-08T15:22:53,573][INFO ][o.e.p.PluginsService     ] [qm6aUUo] loaded module [lang-groovy]
[2017-02-08T15:22:53,573][INFO ][o.e.p.PluginsService     ] [qm6aUUo] loaded module [lang-mustache]
[2017-02-08T15:22:53,573][INFO ][o.e.p.PluginsService     ] [qm6aUUo] loaded module [lang-painless]
[2017-02-08T15:22:53,573][INFO ][o.e.p.PluginsService     ] [qm6aUUo] loaded module [percolator]
[2017-02-08T15:22:53,574][INFO ][o.e.p.PluginsService     ] [qm6aUUo] loaded module [reindex]
[2017-02-08T15:22:53,574][INFO ][o.e.p.PluginsService     ] [qm6aUUo] loaded module [transport-netty3]
[2017-02-08T15:22:53,574][INFO ][o.e.p.PluginsService     ] [qm6aUUo] loaded module [transport-netty4]
[2017-02-08T15:22:53,574][INFO ][o.e.p.PluginsService     ] [qm6aUUo] no plugins loaded
[2017-02-08T15:22:55,217][INFO ][o.e.n.Node               ] initialized
[2017-02-08T15:22:55,218][INFO ][o.e.n.Node               ] [qm6aUUo] starting ...
[2017-02-08T15:22:55,288][WARN ][i.n.u.i.MacAddressUtil   ] Failed to find a usable hardware address from the network interfaces; using random bytes: 24:be:3c:15:2d:4a:1d:e3
[2017-02-08T15:22:55,336][INFO ][o.e.t.TransportService   ] [qm6aUUo] publish_address {127.0.0.1:9300}, bound_addresses {[::1]:9300}, {127.0.0.1:9300}
[2017-02-08T15:22:55,342][WARN ][o.e.b.BootstrapChecks    ] [qm6aUUo] max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
[2017-02-08T15:22:55,342][WARN ][o.e.b.BootstrapChecks    ] [qm6aUUo] system call filters failed to install; check the logs and fix your configuration or disable system call filters at your own risk
[2017-02-08T15:22:58,384][INFO ][o.e.c.s.ClusterService   ] [qm6aUUo] new_master {qm6aUUo}{qm6aUUoUScO_S16Sod_7Bw}{qojdwy-UQmODn4nlWEboQA}{127.0.0.1}{127.0.0.1:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2017-02-08T15:22:58,401][INFO ][o.e.h.HttpServer         ] [qm6aUUo] publish_address {127.0.0.1:9200}, bound_addresses {[::1]:9200}, {127.0.0.1:9200}
[2017-02-08T15:22:58,401][INFO ][o.e.n.Node               ] [qm6aUUo] started
[2017-02-08T15:22:58,424][INFO ][o.e.g.GatewayService     ] [qm6aUUo] recovered [0] indices into cluster_state

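Note: the seccomp system call filter used by Elasticsearch 5 requires Linux kernel 3.5 or later. On the 2.6.32 kernel used here it cannot be installed, which is why the warning above appears.

To run Elasticsearch in the background (as a daemon):

./elasticsearch -d

Edit the configuration file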

# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
bootstrap.system_call_filter: false
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
network.host: 10.213.10.30
#network.host: 127.0.0.1
#
# Set a custom port for HTTP:
#
http.port: 9200

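By default, Elasticsearch only accepts connections from the local machine. To allow remote access, edit config/elasticsearch.yml in the installation directory, uncomment network.host, set it to 0.0.0.0, and restart Elasticsearch.

Error 1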

ERROR: bootstrap checks failed
max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]
system call filters failed to install; check the logs and fix your configuration or disable system call filters at your own risk
[2017-02-08T16:13:57,827][INFO ][o.e.n.Node               ] [qm6aUUo] stopping ...
[2017-02-08T16:13:57,925][INFO ][o.e.n.Node               ] [qm6aUUo] stopped
[2017-02-08T16:13:57,925][INFO ][o.e.n.Node               ] [qm6aUUo] closing ...
[2017-02-08T16:13:57,946][INFO ][o.e.n.Node               ] [qm6aUUo] closed

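Solution:
Apply the setting temporarily with
$ sudo sysctl -w vm.max_map_count=262144
or make it permanent by adding "vm.max_map_count = 262144" to /etc/sysctl.conf.
Then reload the file so the setting takes effect immediately, and verify it:
[root@CDVM-213010030 elasticsearch-5.2.0]# sysctl -p
[root@CDVM-213010030 elasticsearch-5.2.0]# sysctl -a | grep vm.max_map_count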
vm.max_map_count = 262144
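
The same check can be scripted. The following is a minimal Python sketch, assuming a Linux host that exposes the current value at /proc/sys/vm/max_map_count. The function names are illustrative, and 262144 is the threshold quoted in the error message above.

```python
from pathlib import Path

REQUIRED_MAX_MAP_COUNT = 262144  # threshold quoted in the bootstrap check message


def parse_max_map_count(text: str) -> int:
    """Parse the contents of /proc/sys/vm/max_map_count (a single integer)."""
    return int(text.strip())


def is_high_enough(value: int, required: int = REQUIRED_MAX_MAP_COUNT) -> bool:
    """Return True when the value satisfies the bootstrap check."""
    return value >= required


if __name__ == "__main__":
    proc_file = Path("/proc/sys/vm/max_map_count")  # present on Linux only
    if proc_file.exists():
        current = parse_max_map_count(proc_file.read_text())
        print(current, "ok" if is_high_enough(current) else "too low")
```

Run it before starting Elasticsearch to confirm that the sysctl change has been applied.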

Error 2

ERROR: bootstrap checks failed
system call filters failed to install; check the logs and fix your configuration or disable system call filters at your own risk
[2017-02-08T16:39:43,123][INFO ][o.e.n.Node               ] [qm6aUUo] stopping ...
[2017-02-08T16:39:43,136][INFO ][o.e.n.Node               ] [qm6aUUo] stopped
[2017-02-08T16:39:43,136][INFO ][o.e.n.Node               ] [qm6aUUo] closing ...
[2017-02-08T16:39:43,242][INFO ][o.e.n.Node               ] [qm6aUUo] closed

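Cause:

CentOS 6 does not support seccomp, while ES 5.2.0 sets bootstrap.system_call_filter to true by default and checks for it. The check fails, and that failure prevents ES from starting.

Solution:

In elasticsearch.yml, set bootstrap.system_call_filter to false. Note that it belongs under the Memory section: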
bootstrap.memory_lock: false
bootstrap.system_call_filter: false

See the related issues for details.

Error 3

max file descriptors [4096] for elasticsearch process likely too low, increase to at least [65536]

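Solution:

Edit /etc/security/limits.conf and add or modify the following lines (each entry needs a leading domain field; * applies the limit to all users):
* hard nofile 65536
* soft nofile 65536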
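
To confirm that the new limit is in effect, the limit of the current process can be read from Python. This is a small sketch using the standard resource module (Unix only). The helper name is illustrative, and 65536 is the threshold from the error message. Run it as the user that starts Elasticsearch, after logging in again, because limits.conf is applied at login.

```python
import resource

REQUIRED_NOFILE = 65536  # threshold quoted in the error message


def nofile_is_sufficient(soft_limit: int, required: int = REQUIRED_NOFILE) -> bool:
    """Return True when the soft open-file limit meets the requirement."""
    # RLIM_INFINITY means "no limit", which always satisfies the check.
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= required


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"soft={soft} hard={hard} sufficient={nofile_is_sufficient(soft)}")
```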

503 error

# curl -X PUT 'localhost:9200/weather'
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}

Solution:

In elasticsearch.yml, set
node.name: node-1
and
cluster.initial_master_nodes: ["node-1"]
then restart Elasticsearch. (cluster.initial_master_nodes was introduced in 7.0.)

After the restart, the request succeeds:

# supervisorctl restart elasticSearch
# curl -X PUT '127.0.0.1:9200/weather'
{"acknowledged":true,"shards_acknowledged":true,"index":"weather"}

The complete configuration:

# ------------------------------------ Node ------------------------------------
node.name: node-1
# ---------------------------------- Network -----------------------------------
network.host: 0.0.0.0
#
# Set a custom port for HTTP:
#
http.port: 9200
# --------------------------------- Discovery ----------------------------------
cluster.initial_master_nodes: ["node-1"]
# ---------------------------------- Gateway -----------------------------------
gateway.recover_after_nodes: 1
# ---------------------------------- Various -----------------------------------
http.cors.enabled: true
http.cors.allow-origin: "*"

1.3 Testing

$ curl http://10.213.10.30:9200/?pretty
{
  "name" : "qm6aUUo",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "UXrjeTP6SmmOIZOZ4j9I4w",
  "version" : {
    "number" : "5.2.0",
    "build_hash" : "24e05b9",
    "build_date" : "2017-01-24T19:52:35.800Z",
    "build_snapshot" : false,
    "lucene_version" : "6.4.0"
  },
  "tagline" : "You Know, for Search"
}
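
The same smoke test can be done from a script. Below is a minimal Python sketch using only the standard library. The function names are illustrative, and fetch_version needs a reachable node, so only the parsing step can be tried offline.

```python
import json
import urllib.request


def parse_version(body: str) -> str:
    """Extract version.number from the JSON returned by the root endpoint."""
    return json.loads(body)["version"]["number"]


def fetch_version(base_url: str) -> str:
    """GET the root endpoint and return the reported version (needs a running node)."""
    with urllib.request.urlopen(base_url, timeout=5) as resp:
        return parse_version(resp.read().decode("utf-8"))
```

For the node above, fetch_version("http://10.213.10.30:9200/") would be the call to make; adjust the address for your own setup.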

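For more, see the official installation documentation.

2. Basic concepts

2.1 Near real time (NRT)

Elasticsearch is a near-real-time search platform: there is a small delay (normally 1 second) between indexing a document and that document becoming searchable.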

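2.2 Cluster

A cluster is one or more nodes organized together. Between them they hold all of your data and jointly provide indexing and search. A cluster is identified by a unique name, which defaults to "elasticsearch". The name matters because a node can only join a cluster by specifying that cluster's name. Setting the name explicitly is good practice in production, while the default is fine for testing and development.

Note that a cluster with a single node is perfectly valid. You can also run several clusters, distinguished by name.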

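2.3 Node

A node is a single server in your cluster. As part of the cluster it stores your data and takes part in the cluster's indexing and search. Like a cluster, a node is identified by a name, which is assigned at startup. In older releases the default was the name of a random Marvel character; in 5.x it is derived from the node ID, as the startup log above shows (node name [qm6aUUo]). The name matters for administration, where you need to work out which servers on your network correspond to which nodes in the Elasticsearch cluster.

A node joins a specific cluster by configuring the cluster name. By default every node joins a cluster named "elasticsearch", so if you start several nodes on your network and they can discover one another, they automatically form and join a cluster named "elasticsearch".

A cluster can contain any number of nodes. If no Elasticsearch node is running on your network yet, starting one creates and joins a single-node cluster named "elasticsearch" by default.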

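2.4 Index

Elasticsearch stores data in one or more indices. Compared with the relational model, an index plays roughly the role of a database. The basic unit that an index stores and retrieves is the document.

An index is a collection of documents with similar characteristics. For example, you might have one index for customer data, another for a product catalog, and another for order data. An index is identified by a name (which must be all lowercase), and that name is used whenever you index, search, update, or delete documents in it. You can create any number of indices in a cluster.

Elasticsearch indexes every field, processing it and writing it to an inverted index. Searches look the data up directly in that index.

The top-level unit of data management in Elasticsearch is therefore called an Index. It is the counterpart of a single database. Every Index (that is, database) name must be lowercase.

The following command lists all Indexes on the current node.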

$ curl -X GET 'http://localhost:9200/_cat/indices?v'

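2.5 Type

Documents can be grouped. In the weather Index, for example, they could be grouped by city (Beijing and Shanghai) or by weather (sunny and rainy). Such a group is called a Type. It is a virtual, logical grouping used to filter Documents.

Within an index you can define one or more types. A type is a logical category or partition of your index whose semantics are entirely up to you. Typically a type is defined for documents that share a set of fields. Suppose you run a blogging platform and store all of its data in one index. In that index you could define one type for user data, another for blog data, and another for comment data.

A type can be thought of as a table in a relational database.

That analogy, however, conflicts with the following statement:

Different Types should have similar structures (schemas). For example, the id field cannot be a string in one group and a number in another. This is one difference from tables in a relational database. Data of a completely different nature (such as products and logs) should be stored in two Indexes rather than as two Types in one Index (although that is possible).

The following command lists the Types contained in each Index.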

$ curl 'localhost:9200/_mapping?pretty=true'

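Elasticsearch 6.x allows only one Type per Index. 7.x deprecates Types (requests use the placeholder _doc, as in section 6), and 8.0 removes them entirely.

2.6 Document

A single record in an Index is called a Document. Many Documents make up an Index.

A document is the basic unit of information that can be indexed. For example, you can have a document for a single customer, a document for a single product, and a document for a single order. An index/type can hold any number of documents.

Note that a document physically resides in an index.

A Document is expressed in JSON. Here is an example.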

{
  "user": "張三",
  "title": "工程師",
  "desc": "數據庫管理"
}

Documents in the same Index are not required to share the same structure (schema), but keeping it the same is best because it improves search efficiency.

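2.7 Shards and replicas

An index can hold more data than the hardware of a single node allows. For example, an index of 1 billion documents occupying 1 TB of disk may not fit on the disk of any one node, or a single node may be too slow to serve search requests against it.

To solve this, Elasticsearch can split an index into multiple pieces called shards. You specify the number of shards when you create the index. Each shard is itself a fully functional, independent "index" that can be placed on any node in the cluster.

Sharding matters for two main reasons:

  • It lets you horizontally split and scale your content volume
  • It lets you distribute and parallelize operations across shards (located on multiple nodes), which increases performance and throughput

How a shard is distributed, and how its documents are aggregated back into a search response, is managed entirely by Elasticsearch and is transparent to you as a user.

In a network or cloud environment, failures can happen at any time. A failover mechanism is very useful and strongly recommended for when a shard or node goes offline or disappears for some reason. To that end, Elasticsearch lets you create one or more copies of each shard, called replica shards, or simply replicas.

Replication matters for two main reasons:

  • It provides high availability when a shard or node fails. For this reason it is very important that a replica shard is never placed on the same node as its original/primary shard.
  • Because searches can run on all replicas in parallel, replication lets you scale your search volume and throughput

To sum up, each index can be split into multiple shards. An index can also be replicated zero times (no replicas) or more. Once replicated, an index has primary shards (the shards that are copied from) and replica shards (copies of the primaries). The number of shards and replicas can be specified when the index is created. After creation you can change the number of replicas dynamically at any time, but you can no longer change the number of shards.

By default (in 5.x; 7.0 changed the default to 1 primary shard), each index in Elasticsearch gets 5 primary shards and 1 replica. If your cluster has at least two nodes, your index therefore has 5 primary shards and 5 replica shards (1 complete copy), for a total of 10 shards per index.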
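
The shard arithmetic above can be written down directly. The following Python sketch is illustrative only: the function names are made up here, while number_of_shards and number_of_replicas are the index settings that Elasticsearch reads from the body of an index-creation request.

```python
def total_shards(primaries: int, replicas: int) -> int:
    """Primaries plus one full copy of every primary per replica."""
    if primaries < 1 or replicas < 0:
        raise ValueError("need at least 1 primary shard and 0 or more replicas")
    return primaries * (1 + replicas)


def index_settings(primaries: int, replicas: int) -> dict:
    """Request body for creating an index with explicit shard counts."""
    return {
        "settings": {
            "number_of_shards": primaries,    # fixed once the index exists
            "number_of_replicas": replicas,   # can be changed later
        }
    }
```

With the 5.x defaults, total_shards(5, 1) gives the 10 shards mentioned above. Sending index_settings(3, 2) as the JSON body of a PUT to the index URL would create an index with 3 primaries and 2 replicas of each.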

3. Creating and deleting an Index

3.1 Creating an Index

$ curl -X PUT 'localhost:9200/weather'

The server returns a JSON object whose acknowledged field indicates that the operation succeeded.

{
  "acknowledged":true,
  "shards_acknowledged":true
}

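3.2 Deleting an Index

$ curl -X DELETE 'localhost:9200/weather'

4. Data operations

4.1 Adding a record

There are two ways to add a record.

The first is a POST request with no Id. For example, to add a person record under /accounts/person: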

$ curl -X POST 'localhost:9200/accounts/person' -d '
{
  "user": "李四",
  "title": "工程師",
  "desc": "系統管理"
}'

The command above sends a POST request to /accounts/person to add a record. In the JSON object the server returns, the _id field is a random string.

{
  "_index":"accounts",
  "_type":"person",
  "_id":"AV3qGfrC6jMbsbXb6k1p",
  "_version":1,
  "result":"created",
  "_shards":{"total":2,"successful":1,"failed":0},
  "created":true
}

The second is a PUT request to the path /accounts/person/1, where the final 1 is the Id of the record. It does not have to be a number; any string (such as abc) works.

$ curl -X PUT 'localhost:9200/accounts/person/1' -d '
{
  "user": "張三",
  "title": "工程師",
  "desc": "數據庫管理"
}'

The JSON object returned by the server includes the Index, Type, Id, Version, and other information.

{
  "_index":"accounts",
  "_type":"person",
  "_id":"1",
  "_version":1,
  "result":"created",
  "_shards":{"total":2,"successful":1,"failed":0},
  "created":true
}
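Note that if the Index (accounts in this example) has not been created first, running the command above does not produce an error; Elasticsearch simply creates the specified Index. So type carefully and avoid misspelling the Index name.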
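
The choice between POST and PUT can be captured in a small helper. This is a Python sketch using only the standard library. The function name is illustrative, the Id is assumed to be URL-safe, and the request is built but not sent. The Content-Type header is included because Elasticsearch 6.0 and later reject request bodies without it; 5.x accepts it as well.

```python
import json
import urllib.request
from typing import Optional


def build_index_request(base_url: str, index: str, doc_type: str,
                        document: dict,
                        doc_id: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) the HTTP request that adds one record."""
    url = f"{base_url}/{index}/{doc_type}"
    method = "POST"              # no Id: the server generates one
    if doc_id is not None:
        url = f"{url}/{doc_id}"
        method = "PUT"           # explicit Id: the record is stored under it
    data = json.dumps(document).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=data, headers=headers, method=method)
```

Passing the returned object to urllib.request.urlopen sends it to a running node.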

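4.2 Viewing a record

Send a GET request to /Index/Type/Id to view a record.

$ curl 'localhost:9200/accounts/person/1?pretty=true'

The command above requests the record /accounts/person/1. The URL parameter pretty=true asks for the response in a readable format.

In the returned data, the found field indicates that the lookup succeeded, and the _source field holds the original record.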

{
  "_index" : "accounts",
  "_type" : "person",
  "_id" : "1",
  "_version" : 1,
  "found" : true,
  "_source" : {
    "user" : "張三",
    "title" : "工程師",
    "desc" : "數據庫管理"
  }
}

If the Id is wrong, no data is found and the found field is false.

$ curl 'localhost:9200/accounts/person/abc?pretty=true'

{
  "_index" : "accounts",
  "_type" : "person",
  "_id" : "abc",
  "found" : false
}
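
Client code usually branches on the found field before touching _source. Here is a minimal Python sketch of that check; the function name is illustrative, and the input is the already-parsed JSON response.

```python
from typing import Optional


def extract_source(response: dict) -> Optional[dict]:
    """Return the stored record, or None when the Id was not found."""
    if not response.get("found", False):
        return None
    return response["_source"]
```

When fetching over HTTP, keep in mind that Elasticsearch answers a missing Id with status 404, which urllib.request.urlopen reports as an HTTPError; the JSON body can still be read from that exception.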

4.3 Deleting a record

To delete a record, send a DELETE request.

$ curl -X DELETE 'localhost:9200/accounts/person/1'

4.4 Updating a record

To update a record, send the data again with a PUT request.

$ curl -X PUT 'localhost:9200/accounts/person/1' -d '
{
    "user" : "張三",
    "title" : "工程師",
    "desc" : "數據庫管理,軟件開發"
}' 

{
  "_index":"accounts",
  "_type":"person",
  "_id":"1",
  "_version":2,
  "result":"updated",
  "_shards":{"total":2,"successful":1,"failed":0},
  "created":false
}

In the command above, the original data is changed from "數據庫管理" to "數據庫管理,軟件開發". Several fields in the response have changed.

"_version" : 2,
"result" : "updated",
"created" : false

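As you can see, the record's Id is unchanged, but the version (_version) went from 1 to 2, the operation type (result) went from created to updated, and the created field became false because this time no new record was created.

5. Querying data

5.1 Returning all records

Sending a GET request directly to /Index/Type/_search returns all records.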

$ curl 'localhost:9200/accounts/person/_search'

{
  "took":2,
  "timed_out":false,
  "_shards":{"total":5,"successful":5,"failed":0},
  "hits":{
    "total":2,
    "max_score":1.0,
    "hits":[
      {
        "_index":"accounts",
        "_type":"person",
        "_id":"AV3qGfrC6jMbsbXb6k1p",
        "_score":1.0,
        "_source": {
          "user": "李四",
          "title": "工程師",
          "desc": "系統管理"
        }
      },
      {
        "_index":"accounts",
        "_type":"person",
        "_id":"1",
        "_score":1.0,
        "_source": {
          "user" : "張三",
          "title" : "工程師",
          "desc" : "數據庫管理,軟件開發"
        }
      }
    ]
  }
}

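In the response above, the took field is the time the operation took (in milliseconds), the timed_out field indicates whether it timed out, and the hits field holds the matching records.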

The response also provides the following information about the search request:

  • took – how long it took Elasticsearch to run the query, in milliseconds
  • timed_out – whether or not the search request timed out
  • _shards – how many shards were searched and a breakdown of how many shards succeeded, failed, or were skipped.
  • max_score – the score of the most relevant document found
  • hits.total.value - how many matching documents were found
  • hits.sort - the document’s sort position (when not sorting by relevance score)
  • hits._score - the document’s relevance score (not applicable when using match_all)
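
The sample response above reports hits.total as a plain number (the 5.x format), while the list mentions hits.total.value (the 7.x format). A client that must handle both could use something like this Python sketch; the function names are illustrative.

```python
def total_hits(response: dict) -> int:
    """Return the number of matching documents from a search response."""
    total = response["hits"]["total"]
    # 7.x wraps the count in an object such as {"value": 2, "relation": "eq"}.
    if isinstance(total, dict):
        return total["value"]
    return total  # 5.x and 6.x: a plain integer


def hit_sources(response: dict) -> list:
    """Return the _source of every hit, in the order the server ranked them."""
    return [hit["_source"] for hit in response["hits"]["hits"]]
```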

5.2 Paginating results

Each search request is self-contained: Elasticsearch does not maintain any state information across requests. To page through the search hits, specify the from and size parameters in your request.

For example, the following request gets hits 10 through 19:

GET /bank/_search
{
  "query": { "match_all": {} },
  "sort": [
    { "account_number": "asc" }
  ],
  "from": 10,
  "size": 10
}

The from field sets the offset, and the size field sets the number of results returned per page.
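
Turning a page number into from and size is simple arithmetic. The Python sketch below assumes zero-based page numbers; the function names are illustrative, and the sort field is the one from the example above.

```python
def page_to_from_size(page: int, size: int) -> dict:
    """Translate a zero-based page number into from/size parameters."""
    if page < 0 or size < 1:
        raise ValueError("page must be >= 0 and size must be >= 1")
    return {"from": page * size, "size": size}


def paged_match_all(page: int, size: int) -> dict:
    """match_all search body for one page, sorted as in the example above."""
    body = {"query": {"match_all": {}}, "sort": [{"account_number": "asc"}]}
    body.update(page_to_from_size(page, size))
    return body
```

Page 1 with a size of 10 yields from 10, which corresponds to hits 10 through 19 in the request above.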

Now that you’ve seen how to submit a basic search request, you can start to construct queries that are a bit more interesting than match_all.

5.3 Matching individual terms

To search for specific terms within a field, you can use a match query. For example, the following request searches the address field to find customers whose addresses contain mill or lane:

GET /bank/_search
{
  "query": { "match": { "address": "mill lane" } }
}

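The query above searches for "mill" or "lane"; the terms are combined with "or".

To require every keyword (an "and" search), use a bool query as shown here. Setting the match query's operator parameter to and is an alternative.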

$ curl 'localhost:9200/bank/_search'  -d '
{
  "query": {
    "bool": {
      "must": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}'

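Note: Elasticsearch uses its own query syntax, and the queries above are sent as a GET request with a body. POST is accepted as well.

5.4 Matching a whole phrase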
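
When the keywords come from user input, the bool query body can be generated. This is a minimal Python sketch; the function name is illustrative.

```python
def all_terms_query(field: str, terms: list) -> dict:
    """Build a bool query whose must clauses require every term."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {field: term}} for term in terms]
            }
        }
    }
```

all_terms_query("address", ["mill", "lane"]) produces the same body as the curl example above; serialize it with json.dumps before sending.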

To perform a phrase search rather than matching individual terms, you use match_phrase instead of match. For example, the following request only matches addresses that contain the phrase mill lane:

GET /bank/_search
{
  "query": { "match_phrase": { "address": "mill lane" } }
}
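
The difference between match and match_phrase can be illustrated with a toy model. The Python sketch below is not how Elasticsearch is implemented: it assumes a simplistic analyzer that lowercases and splits on whitespace, and the function names and sample addresses are made up for illustration.

```python
def tokenize(text: str) -> list:
    """Toy analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def match_any(text: str, query: str) -> bool:
    """Like match with its default operator: any query term may appear."""
    tokens = set(tokenize(text))
    return any(term in tokens for term in tokenize(query))


def match_phrase(text: str, query: str) -> bool:
    """Like match_phrase: the query terms must be adjacent and in order."""
    tokens, phrase = tokenize(text), tokenize(query)
    if not phrase:
        return False
    return any(tokens[i:i + len(phrase)] == phrase
               for i in range(len(tokens) - len(phrase) + 1))
```

In this model an address such as "990 Mill Road" satisfies match_any for the query "mill lane" but not match_phrase, while "198 Mill Lane" satisfies both.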

5.5 Combining multiple criteria: match or not match

To construct more complex queries, you can use a bool query to combine multiple query criteria. You can designate criteria as required (must match), desirable (should match), or undesirable (must not match).

For example, the following request searches the bank index for accounts that belong to customers who are 40 years old, but excludes anyone who lives in Idaho (ID):

GET /bank/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "age": "40" } }
      ],
      "must_not": [
        { "match": { "state": "ID" } }
      ]
    }
  }
}

Each must, should, and must_not element in a Boolean query is referred to as a query clause. How well a document meets the criteria in each must or should clause contributes to the document’s relevance score. The higher the score, the better the document matches your search criteria. By default, Elasticsearch returns documents ranked by these relevance scores.

The criteria in a must_not clause are treated as filters. They affect whether or not a document is included in the results, but do not contribute to how documents are scored. You can also explicitly specify arbitrary filters to include or exclude documents based on structured data.

For example, the following request uses a range filter to limit the results to accounts with a balance between $20,000 and $30,000 (inclusive).

GET /bank/_search
{
  "query": {
    "bool": {
      "must": { "match_all": {} },
      "filter": {
        "range": {
          "balance": {
            "gte": 20000,
            "lte": 30000
          }
        }
      }
    }
  }
}
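
The range filter can be parameterized in the same way as the earlier queries. A short Python sketch follows; the function names are illustrative, and in_range only spells out what the inclusive gte/lte bounds mean for a single value.

```python
def balance_range_query(low: int, high: int) -> dict:
    """match_all restricted by an inclusive range filter on balance."""
    if low > high:
        raise ValueError("low must not exceed high")
    return {
        "query": {
            "bool": {
                "must": {"match_all": {}},
                "filter": {"range": {"balance": {"gte": low, "lte": high}}},
            }
        }
    }


def in_range(balance: int, low: int, high: int) -> bool:
    """What gte/lte mean for one value: both bounds are inclusive."""
    return low <= balance <= high
```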

For more, see Start searching.

6. Deleting documents

6.1 Delete

Request

DELETE /<index>/_doc/<_id>

Description

You use DELETE to remove a document from an index. You must specify the index name and document ID.

Examples

Delete the JSON document 1 from the twitter index:

DELETE /twitter/_doc/1

The API returns the following result:

{
    "_shards" : {
        "total" : 2,
        "failed" : 0,
        "successful" : 2
    },
    "_index" : "twitter",
    "_type" : "_doc",
    "_id" : "1",
    "_version" : 2,
    "_primary_term": 1,
    "_seq_no": 5,
    "result": "deleted"
}

6.2 Delete by query

Request

POST /<index>/_delete_by_query

Deletes documents that match the specified query.

POST /twitter/_delete_by_query
{
  "query": {
    "match": {
      "message": "some message"
    }
  }
}

Response body

The JSON response looks like this:

{
  "took" : 147,
  "timed_out": false,
  "total": 119,
  "deleted": 119,
  "batches": 1,
  "version_conflicts": 0,
  "noops": 0,
  "retries": {
    "bulk": 0,
    "search": 0
  },
  "throttled_millis": 0,
  "requests_per_second": -1.0,
  "throttled_until_millis": 0,
  "failures" : [ ]
}
Documents with a version equal to 0 cannot be deleted using delete by query because internal versioning does not support 0 as a valid version number.
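
A caller will usually want to verify the outcome of a delete by query. The Python sketch below builds the request body and checks the response fields shown above. The function names are illustrative, and treating any timeout, failure, or version conflict as unsuccessful is an assumption about what the caller wants.

```python
def delete_by_query_body(field: str, text: str) -> dict:
    """Body for POST /<index>/_delete_by_query with a match query."""
    return {"query": {"match": {field: text}}}


def delete_fully_succeeded(response: dict) -> bool:
    """True when every matched document was deleted without problems."""
    return (
        not response.get("timed_out", False)
        and not response.get("failures")
        and response.get("version_conflicts", 0) == 0
        and response.get("deleted") == response.get("total")
    )
```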

For more, see REST APIs.

