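A mapping is similar to a table schema definition in a database. Its main purposes are as follows:
Note that defining too many fields in an index can cause the index to bloat, leading to out-of-memory errors and situations that are hard to recover from. Several settings help limit this: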
The default date format is strict_date_optional_time||epoch_millis.
# Create an index with range fields
PUT range_index
{
"mappings": {
"_doc": {
"properties": {
"expected_attendees": {
"type": "integer_range"
},
"time_frame": {
"type": "date_range",
"format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
}
}
}
}
}
# Index a document
PUT range_index/_doc/1
{
"expected_attendees" : {
"gte" : 10,
"lte" : 20
},
"time_frame" : {
"gte" : "2015-10-31 12:00:00",
"lte" : "2015-11-05"
}
}
# 12 falls within the range 10 to 20, so document 1 is returned
GET range_index/_search
{
"query" : {
"term" : {
"expected_attendees" : {
"value": 12
}
}
}
}
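# Range query with a relation. WITHIN matches only documents whose range lies entirely inside the query range,
# so with the dates below document 1 (2015-10-31 to 2015-11-05) matches CONTAINS and INTERSECTS but not WITHIN.
# Change the dates to compare how CONTAINS, WITHIN, and INTERSECTS behave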
GET range_index/_search
{
"query" : {
"range" : {
"time_frame" : {
"gte" : "2015-11-02",
"lte" : "2015-11-03",
"relation" : "within"
}
}
}
}
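The three relations can be modelled as plain interval comparisons. The following is a minimal Python sketch, not Elasticsearch code: `parse` and `matches` are hypothetical helpers, both bounds are treated as inclusive, and Elasticsearch's date rounding for date-only bounds is ignored.

```python
from datetime import datetime


def parse(value: str) -> datetime:
    """Parse 'yyyy-MM-dd HH:mm:ss' or 'yyyy-MM-dd' (hypothetical helper)."""
    for fmt in ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d"):
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"unsupported date: {value}")


def matches(doc, query, relation="intersects"):
    """Return True if the stored range `doc` matches `query` under `relation`.

    Both ranges are (gte, lte) tuples with inclusive bounds.
    """
    d_lo, d_hi = doc
    q_lo, q_hi = query
    if relation == "within":      # stored range lies entirely inside the query range
        return q_lo <= d_lo and d_hi <= q_hi
    if relation == "contains":    # stored range entirely covers the query range
        return d_lo <= q_lo and q_hi <= d_hi
    if relation == "intersects":  # the two ranges share at least one point
        return d_lo <= q_hi and q_lo <= d_hi
    raise ValueError(f"unknown relation: {relation}")


# Same values as the time_frame document and query above
doc = (parse("2015-10-31 12:00:00"), parse("2015-11-05"))
query = (parse("2015-11-02"), parse("2015-11-03"))
results = {rel: matches(doc, query, rel) for rel in ("within", "contains", "intersects")}
print(results)
```

Widening the query so that it covers the whole stored range (for example 2015-10-31 to 2015-11-06) makes `within` succeed and `contains` fail.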
# tags is an array of strings; lists is an array of objects
PUT my_index/_doc/1
{
"message": "some arrays in this document...",
"tags": [ "elasticsearch", "wow" ],
"lists": [
{
"name": "prog_list",
"description": "programming list"
},
{
"name": "cool_list",
"description": "cool stuff list"
}
]
}
An example illustrates this:
DELETE my_index
PUT my_index/_doc/1
{
"group" : "fans",
"user" : [
{
"first" : "John",
"last" : "Smith"
},
{
"first" : "Alice",
"last" : "White"
}
]
}
GET my_index/_search
{
"query": {
"bool": {
"must": [
{ "match": { "user.first": "Alice" }},
{ "match": { "user.last": "Smith" }}
]
}
}
}
Internally, the document above is flattened into a form like this:
{
"group" : "fans",
"user.first" : [ "alice", "john" ],
"user.last" : [ "smith", "white" ]
}
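user.first and user.last are flattened into multi-value fields, so the association between alice and white is lost. As a result, this document incorrectly matches a query for alice and smith.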
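The effect of flattening can be reproduced in a few lines of Python. This is only a sketch under simplifying assumptions: `flatten` and `matches_all` are hypothetical helpers, only one level of nesting is handled, and lowercasing stands in for text analysis.

```python
def flatten(doc: dict) -> dict:
    """Flatten one level of object arrays into multi-value fields."""
    flat = {}
    for key, value in doc.items():
        items = value if isinstance(value, list) else [value]
        if items and all(isinstance(item, dict) for item in items):
            for item in items:
                for sub_key, sub_value in item.items():
                    # Values from every object end up in one shared list per field
                    flat.setdefault(f"{key}.{sub_key}", []).append(str(sub_value).lower())
        else:
            flat[key] = value
    return flat


def matches_all(flat: dict, conditions: dict) -> bool:
    """Mimic a bool/must of match clauses against the flattened fields."""
    return all(term.lower() in flat.get(field, []) for field, term in conditions.items())


doc = {
    "group": "fans",
    "user": [
        {"first": "John", "last": "Smith"},
        {"first": "Alice", "last": "White"},
    ],
}
flat = flatten(doc)
print(flat)
# Alice and Smith belong to different objects, yet the flattened document matches
print(matches_all(flat, {"user.first": "Alice", "user.last": "Smith"}))
```

Because each field keeps only a bag of values, nothing records which first name belonged to which last name.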
What if user had been mapped as a nested object from the start?
DELETE my_index
PUT my_index
{
"mappings": {
"_doc": {
"properties": {
"user": {
"type": "nested"
}
}
}
}
}
PUT my_index/_doc/1
{
"group": "fans",
"user": [
{
"first": "John",
"last": "Smith"
},
{
"first": "Alice",
"last": "White"
}
]
}
GET my_index/_search
{
"query": {
"nested": {
"path": "user",
"query": {
"bool": {
"must": [
{ "match": { "user.first": "Alice" }},
{ "match": { "user.last": "Smith" }}
]
}
}
}
}
}
GET my_index/_search
{
"query": {
"nested": {
"path": "user",
"query": {
"bool": {
"must": [
{ "match": { "user.first": "Alice" }},
{ "match": { "user.last": "White" }}
]
}
},
"inner_hits": {
"highlight": {
"fields": {
"user.first": {}
}
}
}
}
}
}
The nested type indexes each object in the array as a separate hidden document, which means each nested object can be searched independently.
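A rough Python model of that behaviour, assuming exact case-insensitive comparison in place of text analysis: `nested_match` is a hypothetical helper that requires all conditions to hold within a single object.

```python
def nested_match(doc: dict, path: str, conditions: dict) -> bool:
    """True if at least one object under `path` satisfies every condition by itself."""
    return any(
        all(str(obj.get(field, "")).lower() == term.lower()
            for field, term in conditions.items())
        for obj in doc.get(path, [])
    )


doc = {
    "group": "fans",
    "user": [
        {"first": "John", "last": "Smith"},
        {"first": "Alice", "last": "White"},
    ],
}

# No single user object is both Alice and Smith, so this does not match
print(nested_match(doc, "user", {"first": "Alice", "last": "Smith"}))
# One user object is Alice White, so this matches
print(nested_match(doc, "user", {"first": "Alice", "last": "White"}))
```

This mirrors the two nested queries above: the first returns no hits, while the second returns document 1.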
Note the following:
# ip type: stores IP addresses
PUT my_index
{
"mappings": {
"_doc": {
"properties": {
"ip_addr": {
"type": "ip"
}
}
}
}
}
PUT my_index/_doc/1
{
"ip_addr": "192.168.1.1"
}
GET my_index/_search
{
"query": {
"term": {
"ip_addr": "192.168.0.0/16"
}
}
}
GET my_index/_mapping
# Result
{
"my_index": {
"mappings": {
"doc": {
"properties": {
"age": {
"type": "integer"
},
"created": {
"type": "date"
},
"name": {
"type": "text"
},
"title": {
"type": "text"
}
}
}
}
}
}
PUT my_index
{
"mappings": {
"_doc": {
"dynamic": false,
"properties": {
"user": {
"properties": {
"name": {
"type": "text"
},
"social_networks": {
"dynamic": true,
"properties": {}
}
}
}
}
}
}
}
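With this mapping, new fields are not added automatically at the top level of my_index, but new sub-fields can still be added automatically under user.social_networks.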
DELETE my_index
PUT my_index
{
"mappings": {
"doc": {
"properties": {
"first_name": {
"type": "text",
"copy_to": "full_name"
},
"last_name": {
"type": "text",
"copy_to": "full_name"
},
"full_name": {
"type": "text"
}
}
}
}
}
PUT my_index/doc/1
{
"first_name": "John",
"last_name": "Smith"
}
GET my_index/_search
{
"query": {
"match": {
"full_name": {
"query": "John Smith",
"operator": "and"
}
}
}
}
# Define the mapping
PUT my_index
{
"mappings": {
"_doc": {
"properties": {
"city": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
}
}
}
}
}
# Index two documents
PUT my_index/_doc/1
{
"city": "New York"
}
PUT my_index/_doc/2
{
"city": "York"
}
# Query: city is used for full-text search (match); city.raw is used for sorting and aggregations
GET my_index/_search
{
"query": {
"match": {
"city": "york"
}
},
"sort": {
"city.raw": "asc"
},
"aggs": {
"Cities": {
"terms": {
"field": "city.raw"
}
}
}
}
The default date format is strict_date_optional_time||epoch_millis. Built-in formats include:
Name | Format |
---|---|
epoch_millis | Timestamp in milliseconds |
epoch_second | Timestamp in seconds |
date_optional_time | |
basic_date | yyyyMMdd |
basic_date_time | yyyyMMdd'T'HHmmss.SSSZ |
basic_date_time_no_millis | yyyyMMdd'T'HHmmssZ |
basic_ordinal_date | yyyyDDD |
basic_ordinal_date_time | yyyyDDD'T'HHmmss.SSSZ |
basic_ordinal_date_time_no_millis | yyyyDDD'T'HHmmssZ |
basic_time | HHmmss.SSSZ |
basic_time_no_millis | HHmmssZ |
basic_t_time | 'T'HHmmss.SSSZ |
basic_t_time_no_millis | 'T'HHmmssZ |
A strict_ prefix on a format name denotes the strict version of that format.
PUT my_index
{
"mappings": {
"_doc": {
"properties": {
"manager": {
"properties": {
"age": { "type": "integer" },
"name": { "type": "text" }
}
},
"employees": {
"type": "nested",
"properties": {
"age": { "type": "integer" },
"name": { "type": "text" }
}
}
}
}
}
}
PUT my_index/_doc/1
{
"region": "US",
"manager": {
"name": "Alice White",
"age": 30
},
"employees": [
{
"name": "John Smith",
"age": 34
},
{
"name": "Peter Brown",
"age": 26
}
]
}
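A normalizer is similar to an analyzer, except that an analyzer applies to text fields and splits the value into multiple tokens, whereas a normalizer applies to keyword fields and produces a single token (the entire field value becomes one token instead of being split into several).
The following defines a custom normalizer that uses the uppercase and asciifolding filters: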
PUT test_index_4
{
"settings": {
"analysis": {
"normalizer": {
"my_normalizer": {
"type": "custom",
"char_filter": [],
"filter": ["uppercase", "asciifolding"]
}
}
}
},
"mappings": {
"_doc": {
"properties": {
"foo": {
"type": "keyword",
"normalizer": "my_normalizer"
}
}
}
}
}
# Index some documents
POST test_index_4/_doc/1
{
"foo": "hello world"
}
POST test_index_4/_doc/2
{
"foo": "Hello World"
}
POST test_index_4/_doc/3
{
"foo": "hello elasticsearch"
}
# Searching for hello returns no results rather than 3, because each value is indexed as a single token
GET test_index_4/_search
{
"query": {
"match": {
"foo": "hello"
}
}
}
# Searching for hello world returns 2 results: documents 1 and 2
GET test_index_4/_search
{
"query": {
"match": {
"foo": "hello world"
}
}
}
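The search results above follow from how many tokens each approach produces. Here is a simplified Python sketch, assuming `analyze` as a stand-in for a text analyzer (lowercase plus whitespace split) and `normalize` as a stand-in for my_normalizer (uppercase only; asciifolding is omitted). Both are hypothetical helpers, not Elasticsearch APIs.

```python
def analyze(text: str) -> list:
    """Stand-in for a text analyzer: lowercase, then split into several tokens."""
    return text.lower().split()


def normalize(text: str) -> list:
    """Stand-in for the custom normalizer: the whole value becomes one token."""
    return [text.upper()]


docs = {1: "hello world", 2: "Hello World", 3: "hello elasticsearch"}

# A keyword field with a normalizer stores exactly one token per value
index = {doc_id: normalize(value) for doc_id, value in docs.items()}


def search(query: str) -> list:
    """The query string is normalized too, then compared with the stored token."""
    token = normalize(query)[0]
    return sorted(doc_id for doc_id, tokens in index.items() if token in tokens)


print(analyze("Hello World"))    # several tokens
print(normalize("Hello World"))  # a single token
print(search("hello"))
print(search("hello world"))
```

Since "HELLO" is never stored as a token by itself, the first search finds nothing, while "HELLO WORLD" equals the stored token of documents 1 and 2.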
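ES detects field types automatically based on the JSON type of each field in a document. The supported conversions are:
JSON type | ES type |
---|---|
null | Ignored |
boolean | boolean |
floating-point number | float |
integer | long |
object | object |
array | Determined by the type of the first non-null value |
string | date if it matches a date format (enabled by default); float or long if it matches a number (disabled by default); otherwise text with a keyword sub-field |
For example: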
POST my_index/doc
{
"username":"whirly",
"age":22,
"birthday":"1995-01-01"
}
GET my_index/_mapping
# Result
{
"my_index": {
"mappings": {
"doc": {
"properties": {
"age": {
"type": "long"
},
"birthday": {
"type": "date"
},
"username": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
# Customize the date formats used for detection
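The detection rules in the table can be sketched in Python as follows. This is an illustration only: `detect_type` is a hypothetical function, and the date check is reduced to a single yyyy-MM-dd pattern, which is an assumption and much narrower than the formats Elasticsearch accepts.

```python
import re

# Simplification: only plain yyyy-MM-dd strings count as dates in this sketch
DATE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def detect_type(value, date_detection=True, numeric_detection=False):
    """Return the field type the table above suggests for a parsed JSON value."""
    if value is None:
        return None                      # null values are ignored
    if isinstance(value, bool):          # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):          # decided by the first non-null element
        first = next((item for item in value if item is not None), None)
        return detect_type(first, date_detection, numeric_detection)
    if isinstance(value, str):
        if date_detection and DATE_PATTERN.match(value):
            return "date"
        if numeric_detection:
            for caster, es_type in ((int, "long"), (float, "float")):
                try:
                    caster(value)
                    return es_type
                except ValueError:
                    continue
        return "text"                    # mapped with a keyword sub-field
    raise TypeError(f"unsupported value: {value!r}")


doc = {"username": "whirly", "age": 22, "birthday": "1995-01-01"}
detected = {field: detect_type(value) for field, value in doc.items()}
print(detected)
```

Passing `date_detection=False` makes the birthday string fall through to text, which corresponds to turning off automatic date detection.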
PUT my_index
{
"mappings": {
"_doc": {
"dynamic_date_formats": ["MM/dd/yyyy"]
}
}
}
# Disable automatic date detection
PUT my_index
{
"mappings": {
"_doc": {
"date_detection": false
}
}
}
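Dynamic templates let you set field types dynamically based on the data type ES detects, the field name, and so on. They can achieve effects such as the following: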
"dynamic_templates": [
{
"my_template_name": {
... match conditions ...
"mapping": { ... }
}
},
...
]
Match rules generally use the following parameters:
# Map fields detected as double to float to save space
PUT my_index
{
"mappings": {
"_doc": {
"dynamic_templates": [
{
"integers": {
"match_mapping_type": "double",
"mapping": {
"type": "float"
}
}
}
]
}
}
}
# Create an index template that matches indices whose names start with test-index-map
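How a template list is applied to a newly seen field can be sketched in Python. The assumptions here: `apply_templates` is a hypothetical function, templates are tried in order with the first match winning, and `fnmatchcase` only approximates Elasticsearch's wildcard matching. The `match` parameter (a field-name pattern) is a real Elasticsearch option that the example above does not use.

```python
from fnmatch import fnmatchcase


def apply_templates(templates: list, field_name: str, detected_type: str) -> dict:
    """Return the mapping of the first template that matches the new field."""
    for template in templates:
        rule = next(iter(template.values()))  # each entry is {template_name: rule}
        wanted = rule.get("match_mapping_type")
        if wanted not in (None, "*", detected_type):
            continue
        if "match" in rule and not fnmatchcase(field_name, rule["match"]):
            continue
        return rule["mapping"]
    # No template matched: keep the type that dynamic detection produced
    return {"type": detected_type}


# Same template as in the request above
templates = [
    {"integers": {"match_mapping_type": "double", "mapping": {"type": "float"}}},
]

print(apply_templates(templates, "price", "double"))
print(apply_templates(templates, "age", "long"))
```

A field detected as double picks up the float mapping, while any other field keeps its detected type.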
PUT _template/template_1
{
"index_patterns": ["test-index-map*"],
"order": 2,
"settings": {
"number_of_shards": 1
},
"mappings": {
"doc": {
"_source": {
"enabled": false
},
"properties": {
"name": {
"type": "keyword"
},
"created_at": {
"type": "date",
"format": "YYYY/MM/dd HH:mm:ss"
}
}
}
}
}
# Index a document
POST test-index-map_1/doc
{
"name" : "小旋鋒",
"created_at": "2018/08/16 20:11:11"
}
# Get the index info; its settings and mappings match those defined in the index template
GET test-index-map_1
# Delete the template
DELETE /_template/template_1
# Retrieve the template
GET /_template/template_1
For more content, visit my personal website: laijianfeng.org