Elasticsearch: Data In, Data Out

Date: 2019-12-06


Documents

In Elasticsearch, the term document has a specific meaning. It refers to the top-level, or root, object
that is serialized into JSON and stored in Elasticsearch under a unique ID.

{
    "name":         "John Smith",
    "age":          42,
    "confirmed":    true,
    "join_date":    "2014-06-01",
    "home": {
        "lat":      51.5,
        "lon":      0.1
    },
    "accounts": [
        {
            "type": "facebook",
            "id":   "johnsmith"
        },
        {
            "type": "twitter",
            "id":   "johnsmith"
        }
    ]
}
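
To make "serialized into JSON" concrete, here is a minimal Python sketch that turns a root object into the JSON text a client would send. The variable names are illustrative only and are not part of any Elasticsearch API.

```python
import json

# Root object: a plain dict that contains nested objects and arrays.
document = {
    "name": "John Smith",
    "age": 42,
    "confirmed": True,
    "join_date": "2014-06-01",
    "home": {"lat": 51.5, "lon": 0.1},
    "accounts": [
        {"type": "facebook", "id": "johnsmith"},
        {"type": "twitter", "id": "johnsmith"},
    ],
}

# Serialize the whole root object into one JSON string.
body = json.dumps(document)

# Parsing the string again gives back an equal object.
restored = json.loads(body)
```

Everything reachable from the root object, including the nested home object and the accounts array, travels as part of the same document.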

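Document Metadata

The three required metadata elements

_index: where the document lives
_type: the class of object that the document represents
_id: the unique identifier for the document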

_index

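An index should be a collection of documents grouped together because they share common characteristics.
For example, you might store all your products in an index called products,
and all your sales transactions in an index called sales.
Storing unrelated data in the same index is allowed, but it is generally considered an anti-pattern.

Index name: it must be lowercase, must not begin with an underscore, and must not contain commas.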
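
The naming rules above can be sketched as a small check. The function name is_valid_index_name is hypothetical, and it covers only the three rules listed here; a real cluster may enforce additional restrictions.

```python
def is_valid_index_name(name: str) -> bool:
    """Check only the three rules stated above (illustrative, not exhaustive)."""
    if not name:
        return False
    if name != name.lower():      # must be lowercase
        return False
    if name.startswith("_"):      # must not begin with an underscore
        return False
    if "," in name:               # must not contain commas
        return False
    return True
```

For example, is_valid_index_name("products") passes, while "Products", "_products", and "products,sales" are all rejected.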

_type

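Data may be only loosely grouped together within an index,
but it is often useful to define explicit sub-partitions within that data. For example, all your products may go into a single index, but you have many different product categories, such as "electronics", "kitchen", and "lawn-care".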

_id

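The ID is a string that, combined with _index and _type, uniquely identifies a document in Elasticsearch. When you create a new document, you can either provide your own _id or let Elasticsearch generate one for you.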
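
Because the three metadata elements together identify one document, they also map naturally onto a request path. The following is a minimal sketch; document_path is a hypothetical helper, and percent-encoding the ID is an assumption made so that IDs containing characters such as "/" stay within a single path segment.

```python
from urllib.parse import quote

def document_path(index: str, doc_type: str, doc_id: str) -> str:
    """Build the /{index}/{type}/{id} path that addresses a single document."""
    # safe="" makes quote() escape "/" as well, so the ID cannot add path segments.
    return f"/{index}/{doc_type}/{quote(str(doc_id), safe='')}"
```

With the values used later in this post, document_path("website", "blog", "123") yields /website/blog/123.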

Indexing a Document

For example, if our index is called website, our type is called blog, and we choose 123 as the ID, the index request looks like this:

PUT /website/blog/123
{
  "title": "My first blog entry",
  "text":  "Just trying this out...",
  "date":  "2014/01/01"
}
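
The same request can be prepared from Python using only the standard library. This is a sketch under assumptions: the node address http://localhost:9200 and the helper name build_index_request are not from the original post, and the request is only built here, not sent.

```python
import json
import urllib.request

def build_index_request(base_url, index, doc_type, doc_id, document):
    """Build (but do not send) a PUT request that indexes a document under a chosen ID."""
    url = f"{base_url}/{index}/{doc_type}/{doc_id}"
    data = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )

request = build_index_request(
    "http://localhost:9200",  # assumed address of a local node
    "website",
    "blog",
    "123",
    {
        "title": "My first blog entry",
        "text": "Just trying this out...",
        "date": "2014/01/01",
    },
)
# Passing `request` to urllib.request.urlopen() would send it to a running cluster.
```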

Autogenerating IDs

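Autogenerated IDs are URL-safe, Base64-encoded GUID strings that are 20 characters long.
These GUIDs are generated by a modified FlakeID scheme, which allows multiple nodes to generate unique IDs in parallel with essentially zero chance of collision.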

POST /website/blog/
{
  "title": "My second blog entry",
  "text":  "Still trying this out...",
  "date":  "2014/01/01"
}
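
To get a feel for what a 20-character, URL-safe, Base64-encoded ID looks like, here is a small sketch. It only imitates the shape of such an ID using random bytes; it is not the FlakeID-based scheme Elasticsearch uses, and random_url_safe_id is a hypothetical name.

```python
import base64
import os

def random_url_safe_id() -> str:
    """Return a 20-character URL-safe Base64 string built from 15 random bytes."""
    # 15 bytes = 120 bits, which encodes to exactly 20 Base64 characters with no padding.
    raw = os.urandom(15)
    return base64.urlsafe_b64encode(raw).decode("ascii")
```

The URL-safe alphabet uses "-" and "_" instead of "+" and "/", so the result can appear in a request path without escaping.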

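Retrieving a Document

The response body includes the by-now-familiar metadata elements, plus the _source field, which contains the original JSON document that we sent to Elasticsearch when we indexed it: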

GET /website/blog/123?pretty

{
  "_index" :   "website",
  "_type" :    "blog",
  "_id" :      "123",
  "_version" : 1,
  "found" :    true,
  "_source" :  {
      "title": "My first blog entry",
      "text":  "Just trying this out...",
      "date":  "2014/01/01"
  }
}
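
On the client side, a response like the one above is usually parsed and then unwrapped. A minimal sketch, where extract_source is a hypothetical helper and the response text is simply copied from the example:

```python
import json

response_text = """
{
  "_index": "website",
  "_type": "blog",
  "_id": "123",
  "_version": 1,
  "found": true,
  "_source": {
    "title": "My first blog entry",
    "text": "Just trying this out...",
    "date": "2014/01/01"
  }
}
"""

def extract_source(response: dict):
    """Return the original document, or None when the response says it was not found."""
    if not response.get("found"):
        return None
    return response["_source"]

source = extract_source(json.loads(response_text))
```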

Retrieving Part of a Document

To retrieve only certain fields from _source:

GET /website/blog/123?_source=title,text

{
  "_index" :   "website",
  "_type" :    "blog",
  "_id" :      "123",
  "_version" : 1,
  "found" :   true,
  "_source" : {
      "title": "My first blog entry" ,
      "text":  "Just trying this out..."
  }
}
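
The _source parameter is an ordinary query-string parameter holding a comma-separated field list, so it is easy to build programmatically. In this sketch, get_url is a hypothetical helper; the commas are deliberately left unescaped to match the request shown above.

```python
from urllib.parse import urlencode

def get_url(index, doc_type, doc_id, fields=None):
    """Build the GET path, optionally restricting _source to the given fields."""
    path = f"/{index}/{doc_type}/{doc_id}"
    if not fields:
        return path
    # safe="," keeps the separator readable instead of encoding it as %2C.
    query = urlencode({"_source": ",".join(fields)}, safe=",")
    return f"{path}?{query}"
```

Calling get_url("website", "blog", "123", ["title", "text"]) reproduces the path used in the example request.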

To retrieve only the _source document:

GET /website/blog/123/_source

{
   "title": "My first blog entry",
   "text":  "Just trying this out...",
   "date":  "2014/01/01"
}

Updating a Whole Document

Documents in Elasticsearch are immutable; they cannot be modified in place. Instead, to update an existing document, we reindex or replace it:

PUT /website/blog/123
{
  "title": "My first blog entry",
  "text":  "I am starting to get the hang of this...",
  "date":  "2014/01/02"
}

In the response, we can see that Elasticsearch has incremented the _version number:

{
  "_index" :   "website",
  "_type" :    "blog",
  "_id" :      "123",
  "_version" : 2,
  "created":   false 
}

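Process:
Internally, Elasticsearch carries out the following steps, exactly as described above:

Retrieve the JSON from the old document
Change that JSON
Delete the old document
Index a new document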
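
The four steps above can be modeled with a toy in-memory store. This is only a sketch of the idea, not how Elasticsearch is implemented: the store dict, index_document, and replace_via_steps are hypothetical names, and overwriting the dict entry stands in for deleting the old document and indexing a new one.

```python
import copy

# Toy store: maps (index, type, id) to {"_version": int, "_source": dict}.
store = {}

def index_document(key, source):
    """Index a document, bumping the version if the key already exists."""
    old = store.get(key)
    version = old["_version"] + 1 if old else 1
    store[key] = {"_version": version, "_source": copy.deepcopy(source)}
    return {"_version": version, "created": old is None}

def replace_via_steps(key, change):
    """Retrieve the old JSON, change a copy, then index the result as a new document."""
    current = copy.deepcopy(store[key]["_source"])  # step 1: retrieve the old JSON
    change(current)                                 # step 2: change that JSON
    return index_document(key, current)             # steps 3 and 4: old entry replaced by new

key = ("website", "blog", "123")
first = index_document(key, {"title": "My first blog entry", "text": "Just trying this out..."})
second = replace_via_steps(
    key, lambda doc: doc.update(text="I am starting to get the hang of this...")
)
```

The stored entry is never edited in place; each change produces a whole new document with a higher version.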

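Creating a New Document

When we index a document, how can we be sure that we are creating an entirely new document rather than overwriting an existing one?

The combination of _index, _type, and _id uniquely identifies a document.

Method 1 (the op_type query-string parameter):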

PUT /website/blog/123?op_type=create
{ ... }

Method 2 (appending /_create to the URL):

PUT /website/blog/123/_create
{ ... }
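
Both methods express the same "create only if absent" semantics. A toy sketch of that behavior follows; the exception class and create_document are hypothetical, and the exception stands in for the conflict error (HTTP 409, to the best of my knowledge) that a create request receives when the ID is already taken.

```python
class DocumentAlreadyExists(Exception):
    """Stands in for the conflict error returned when the ID is already in use."""

def create_document(store, key, source):
    """Store the document only if nothing exists under this key yet."""
    if key in store:
        raise DocumentAlreadyExists(key)
    store[key] = dict(source)
    return {"created": True}
```

A plain index request would silently overwrite the existing entry, whereas this variant refuses to do so.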

Deleting a Document

DELETE /website/blog/123

Partial Updates to Documents

POST /website/blog/123/_update
{
   "doc" : {
      "tags" : [ "testing" ],
      "views": 0
   }
}

Result, as seen when the document is retrieved afterward:
{
  "_index": "website",
  "_type": "blog",
  "_id": "123",
  "_version": 3,
  "found": true,
  "_source": {
    "title": "My first blog entry",
    "text": "I am starting to get the hang of this...",
    "date": "2014/01/02",
    "views": 0,
    "tags": [
      "testing"
    ]
  }
}
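
The effect of the doc parameter can be sketched as a merge into the existing _source: new fields are added, existing scalar fields are overwritten, and nested objects are merged. The function merge_partial is a hypothetical illustration of that behavior, not Elasticsearch code.

```python
def merge_partial(source: dict, partial: dict) -> dict:
    """Return a new dict with `partial` merged into `source`, recursing into nested objects."""
    merged = dict(source)
    for field, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(field), dict):
            merged[field] = merge_partial(merged[field], value)  # merge nested objects
        else:
            merged[field] = value  # add a new field or overwrite scalars and arrays
    return merged

updated = merge_partial(
    {
        "title": "My first blog entry",
        "text": "I am starting to get the hang of this...",
        "date": "2014/01/02",
    },
    {"tags": ["testing"], "views": 0},
)
```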

Retrieving Multiple Documents

Requests within the same index:

GET /website/blog/_mget
{
   "docs" : [
      { "_id" : 2 },
      { "_type" : "pageviews", "_id" :   1 }
   ]
}

If all the documents have the same _index and _type, you can pass just an array of ids instead of the full docs array:

GET /website/blog/_mget
{
   "ids" : [ "2", "1" ]
}
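
Choosing between the two body shapes can be automated. In this sketch, build_mget_body is a hypothetical helper that takes the type from the URL plus a list of (type, id) pairs, and falls back to the short ids form whenever every pair uses the URL's type.

```python
def build_mget_body(default_type, wanted):
    """Build an _mget body; `wanted` is a list of (doc_type, doc_id) pairs."""
    if all(doc_type == default_type for doc_type, _ in wanted):
        return {"ids": [doc_id for _, doc_id in wanted]}
    docs = []
    for doc_type, doc_id in wanted:
        entry = {"_id": doc_id}
        if doc_type != default_type:
            entry["_type"] = doc_type  # override the type given in the URL
        docs.append(entry)
    return {"docs": docs}
```

Called with "blog" and the pairs ("blog", 2) and ("pageviews", 1), it produces the docs form shown earlier; with two "blog" pairs it produces the ids form.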
