Elasticsearch: Indexing Inner Object Structures

Date: 2019-11-18


Inner objects are often used to embed one entity or object inside another. For example, instead of having user_name and user_id fields in the tweet document, we could write it like this:

{
    "tweet":            "Elasticsearch is very flexible",
    "user": {
        "id":           "@johnsmith",
        "gender":       "male",
        "age":          26,
        "name": {
            "full":     "John Smith",
            "first":    "John",
            "last":     "Smith"
        }
    }
}

Elasticsearch detects new object fields dynamically and maps them as type object, with each inner field listed under properties:

{
  "gb": {
    "tweet": { 
      "properties": {
        "tweet":            { "type": "string" },
        "user": { 
          "type":             "object",
          "properties": {
            "id":           { "type": "string" },
            "gender":       { "type": "string" },
            "age":          { "type": "long"   },
            "name":   { 
              "type":         "object",
              "properties": {
                "full":     { "type": "string" },
                "first":    { "type": "string" },
                "last":     { "type": "string" }
              }
            }
          }
        }
      }
    }
  }
}
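
To make the shape of that mapping concrete, here is a minimal Python sketch that derives a similar structure from a sample document. It is only a toy model of dynamic mapping, not Elasticsearch's actual implementation: the function name infer_mapping is hypothetical, the type rules cover only the cases that appear in this example, and the "string" type follows the mapping shown above.

```python
def infer_mapping(value):
    """Toy model of dynamic mapping (hypothetical helper, not Elasticsearch code)."""
    if isinstance(value, dict):
        # A nested JSON object becomes an "object" mapping whose inner
        # fields are listed under "properties".
        return {
            "type": "object",
            "properties": {k: infer_mapping(v) for k, v in value.items()},
        }
    if isinstance(value, bool):  # must be checked before int: bool is an int subclass
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "long"}
    if isinstance(value, float):
        return {"type": "double"}
    return {"type": "string"}  # assumption: everything else is treated as a string


doc = {
    "tweet": "Elasticsearch is very flexible",
    "user": {
        "id": "@johnsmith",
        "gender": "male",
        "age": 26,
        "name": {"full": "John Smith", "first": "John", "last": "Smith"},
    },
}

mapping = infer_mapping(doc)
user_mapping = mapping["properties"]["user"]
```

In this sketch the document itself is handled by the same function as user and name, so all three levels come out with the same "properties" structure.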

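The mappings for the user and name fields have the same structure as the mapping for the tweet type itself. In fact, the type mapping is just a special kind of object mapping, which we call the root object. It is the same as any other object, except that it has some special top-level fields for document metadata, such as _source and _all.

Lucene doesn't understand inner objects. A Lucene document consists of a flat list of key-value pairs. To index inner objects effectively, Elasticsearch converts our document into something like this: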

{
    "tweet":            [elasticsearch, flexible, very],
    "user.id":          [@johnsmith],
    "user.gender":      [male],
    "user.age":         [26],
    "user.name.full":   [john, smith],
    "user.name.first":  [john],
    "user.name.last":   [smith]
}
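
The flattening step can be sketched in a few lines of Python. This is an illustrative sketch only: the function name flatten is hypothetical, and it keeps the raw field values rather than running them through an analyzer, so the values are not lowercased or split into terms the way the listing above shows.

```python
def flatten(doc, prefix=""):
    """Flatten nested dicts into dotted-path keys, each mapped to a list of values.

    Hypothetical helper for illustration; no text analysis is applied.
    """
    flat = {}
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into the inner object, extending the dotted path.
            flat.update(flatten(value, path))
        else:
            # Leaf value: store it as a list, as in a multi-valued field.
            flat[path] = value if isinstance(value, list) else [value]
    return flat


tweet_doc = {
    "tweet": "Elasticsearch is very flexible",
    "user": {
        "id": "@johnsmith",
        "gender": "male",
        "age": 26,
        "name": {"full": "John Smith", "first": "John", "last": "Smith"},
    },
}

flat = flatten(tweet_doc)
for field, values in flat.items():
    print(field, values)
```

The resulting keys (tweet, user.id, user.gender, user.age, user.name.full, user.name.first, user.name.last) match the field names in the flattened document above.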

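Inner fields can be referred to by name (for example, first). To distinguish between two fields that have the same name, we can use the full path (for example, user.name.first) or the type name plus the path (tweet.user.name.first).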