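Since Elasticsearch 5.x, the string field type is deprecated and replaced by text and keyword. A mapping that still uses string is accepted but triggers a deprecation warning: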
PUT my_index { "mappings": { "my_type": { "properties": { "title": { "type": "string" } } } } }
#! Deprecation: The [string] field is deprecated, please use [text] or [keyword] instead on [title] { "acknowledged": true, "shards_acknowledged": true }
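text replaces string for full-text content. When a field is meant to be searched as full text, such as the body of an email or a product description, use the text type. A text field is analyzed: before the inverted index is built, an analyzer splits the string into individual terms. text fields are not used for sorting and are seldom used for aggregations (the significant terms aggregation is a notable exception).
The mapping that sets the full_name field to the text type looks like this: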
PUT my_index { "mappings": { "my_type": { "properties": { "full_name": { "type": "text" } } } } }
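The keyword type suits structured fields such as email addresses, hostnames, status codes, and tags. Use it when a field needs filtering (for example, finding the blog posts whose status is published), sorting, or aggregation. A keyword field can only be found by searching for its exact value.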
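The difference between an analyzed text field and an exact-value keyword field can be sketched with a toy inverted index. This is a simplified illustration, not how Elasticsearch is implemented: the `analyze` function is a hypothetical stand-in for a real analyzer (it only lowercases and splits on non-alphanumeric characters), and the sample documents are made up.

```python
import re
from collections import defaultdict

def analyze(text):
    """Hypothetical stand-in for an analyzer: lowercase, split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]

def build_index(docs, analyzed):
    """Build a term -> set of doc ids inverted index."""
    index = defaultdict(set)
    for doc_id, value in docs.items():
        # text-like fields index each term; keyword-like fields index the whole value
        terms = analyze(value) if analyzed else [value]
        for term in terms:
            index[term].add(doc_id)
    return index

def lookup(index, term):
    """Return the sorted doc ids that contain the exact term."""
    return sorted(index.get(term, set()))

docs = {1: "Quick Brown Fox", 2: "quick start guide"}
text_index = build_index(docs, analyzed=True)
keyword_index = build_index(docs, analyzed=False)

print(lookup(text_index, "quick"))               # [1, 2]
print(lookup(keyword_index, "quick"))            # []
print(lookup(keyword_index, "Quick Brown Fox"))  # [1]
```

In the keyword-like index only the complete original value is a term, which is why exact-value lookups succeed and partial ones do not.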
Elasticsearch supports the following numeric types:
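Type | Range |
---|---|
long | -2^63 to 2^63-1 |
integer | -2^31 to 2^31-1 |
short | -32,768 to 32,767 |
byte | -128 to 127 |
double | 64-bit double-precision IEEE 754 floating point |
float | 32-bit single-precision IEEE 754 floating point |
half_float | 16-bit half-precision IEEE 754 floating point |
scaled_float | Floating point stored with a scaling factor (for example, a price only needs cent precision: with a scaling factor of 100, a price of 57.34 is stored as 5734) |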
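The scaled_float arithmetic from the table can be sketched as follows. The helper names `encode_scaled_float` and `decode_scaled_float` are hypothetical and exist only for this illustration; the sketch assumes the value is multiplied by the scaling factor and rounded to the nearest integer.

```python
def encode_scaled_float(value, scaling_factor):
    """Multiply by the scaling factor and round to the nearest integer."""
    return round(value * scaling_factor)

def decode_scaled_float(stored, scaling_factor):
    """Recover the (possibly rounded) floating-point value."""
    return stored / scaling_factor

stored = encode_scaled_float(57.34, 100)
print(stored)                            # 5734
print(decode_scaled_float(stored, 100))  # 57.34

# Anything finer than the scaling factor allows is lost to rounding.
print(encode_scaled_float(57.349, 100))  # 5735
```

A larger scaling factor keeps more decimal places at the cost of larger stored integers.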
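For the double, float, and half_float types, -0.0 and +0.0 are different values. A term query for -0.0 does not match +0.0, a range query whose upper bound is -0.0 does not match +0.0, and a range query whose lower bound is +0.0 does not match -0.0.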
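Why the two zeros can be told apart is visible at the IEEE 754 bit level. The sketch below only illustrates the encoding. It makes no claim about how Elasticsearch indexes floats internally, and `float_bits` is a hypothetical helper.

```python
import struct

def float_bits(value):
    """Return the 32-bit IEEE 754 single-precision bit pattern as an int."""
    return struct.unpack(">I", struct.pack(">f", value))[0]

# Ordinary numeric comparison treats the two zeros as equal...
print(-0.0 == 0.0)            # True

# ...but their encodings differ in the sign bit.
print(hex(float_bits(0.0)))   # 0x0
print(hex(float_bits(-0.0)))  # 0x80000000
```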
Points to consider when choosing among the numeric types above:
Example:
PUT my_index { "mappings": { "my_type": { "properties": { "number_of_bytes": { "type": "integer" }, "time_in_seconds": { "type": "float" }, "price": { "type": "scaled_float", "scaling_factor": 100 } } } } }
JSON is hierarchical by nature, so a document may contain nested objects:
POST my_index/my_type/1 { "region": "US", "manager": { "age": 30, "name": { "first": "John", "last": "Smith" } } }
The document above is a JSON object that contains a manager object, which in turn contains a name object. Internally, the document is indexed as a flat list of key-value pairs:
{ "region": "US", "manager.age": 30, "manager.name.first": "John", "manager.name.last": "Smith" }
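The flattening step can be sketched in a few lines. The `flatten` function is a hypothetical helper written for this illustration, not part of any Elasticsearch client; it joins nested keys with dots the same way the flattened document above does.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts into a single dict with dotted key paths."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into inner objects, extending the dotted path
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat

doc = {
    "region": "US",
    "manager": {"age": 30, "name": {"first": "John", "last": "Smith"}},
}
print(flatten(doc))
```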
The mapping for the document structure above is:
PUT my_index { "mappings": { "my_type": { "properties": { "region": { "type": "keyword" }, "manager": { "properties": { "age": { "type": "integer" }, "name": { "properties": { "first": { "type": "text" }, "last": { "type": "text" } } } } } } } } }
JSON has no date type, so in Elasticsearch a date can take one of the following forms:
The date format can be customized; if no format is specified, the default is:
"strict_date_optional_time||epoch_millis"
PUT my_index { "mappings": { "my_type": { "properties": { "date": { "type": "date" } } } } }
POST my_index/my_type/1 { "date": "2015-01-01" }
POST my_index/my_type/2 { "date": "2015-01-01T12:10:30Z" }
POST my_index/my_type/3 { "date": 1420070400001 }
POST my_index/_search { "sort": { "date": "asc"} }
The three documents, each with a different date representation:
{ "took": 0, "timed_out": false, "_shards": { "total": 5, "successful": 5, "failed": 0 }, "hits": { "total": 3, "max_score": 1, "hits": [ { "_index": "my_index", "_type": "my_type", "_id": "2", "_score": 1, "_source": { "date": "2015-01-01T12:10:30Z" } }, { "_index": "my_index", "_type": "my_type", "_id": "1", "_score": 1, "_source": { "date": "2015-01-01" } }, { "_index": "my_index", "_type": "my_type", "_id": "3", "_score": 1, "_source": { "date": 1420070400001 } } ] } }
Sorted result:
{ "took": 2, "timed_out": false, "_shards": { "total": 5, "successful": 5, "failed": 0 }, "hits": { "total": 3, "max_score": null, "hits": [ { "_index": "my_index", "_type": "my_type", "_id": "1", "_score": null, "_source": { "date": "2015-01-01" }, "sort": [ 1420070400000 ] }, { "_index": "my_index", "_type": "my_type", "_id": "3", "_score": null, "_source": { "date": 1420070400001 }, "sort": [ 1420070400001 ] }, { "_index": "my_index", "_type": "my_type", "_id": "2", "_score": null, "_source": { "date": "2015-01-01T12:10:30Z" }, "sort": [ 1420114230000 ] } ] } }
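The sort values in the response are milliseconds since the epoch in UTC. The sketch below reproduces them for the three sample documents. It is a minimal illustration that handles only the two string layouts used above, and `to_epoch_millis` is a hypothetical helper, not an Elasticsearch API.

```python
import calendar
from datetime import datetime

# Only the two string layouts that appear in the sample documents
FORMATS = ("%Y-%m-%dT%H:%M:%SZ", "%Y-%m-%d")

def to_epoch_millis(value):
    """Convert a sample date value to milliseconds since the epoch (UTC)."""
    if isinstance(value, int):
        return value  # already epoch milliseconds
    for fmt in FORMATS:
        try:
            parsed = datetime.strptime(value, fmt)
        except ValueError:
            continue
        return calendar.timegm(parsed.timetuple()) * 1000
    raise ValueError(f"unsupported date value: {value!r}")

docs = {"1": "2015-01-01", "2": "2015-01-01T12:10:30Z", "3": 1420070400001}
for doc_id in sorted(docs, key=lambda i: to_epoch_millis(docs[i])):
    print(doc_id, to_epoch_millis(docs[doc_id]))
# 1 1420070400000
# 3 1420070400001
# 2 1420114230000
```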
Elasticsearch has no dedicated array type. By default, any field can hold one or more values, but all values in an array must be of the same type. For example:
Notes:
The binary type accepts a Base64-encoded string. By default the field is not stored and is not searchable.
PUT my_index { "mappings": { "my_type": { "properties": { "name": { "type": "text" }, "blob": { "type": "binary" } } } } }
POST my_index/my_type/1 { "name": "Some binary blob", "blob": "U29tZSBiaW5hcnkgYmxvYg==" }
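The blob value in the sample document is the Base64 encoding of the name string. A short sketch with Python's standard base64 module shows how such a value could be produced and decoded on the client side:

```python
import base64

# Encode raw bytes to the Base64 text a binary field expects
encoded = base64.b64encode(b"Some binary blob").decode("ascii")
print(encoded)                    # U29tZSBiaW5hcnkgYmxvYg==

# Decode the stored value back to the original bytes
print(base64.b64decode(encoded))  # b'Some binary blob'
```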
Searching the blob field:
POST my_index/_search { "query": { "match": { "blob": "test" } } }
Response:
{ "error": { "root_cause": [ { "type": "query_shard_exception", "reason": "Binary fields do not support searching", "index_uuid": "fgA7UM5XSS-56JO4F4fYug", "index": "my_index" } ], "type": "search_phase_execution_exception", "reason": "all shards failed", "phase": "query", "grouped": true, "failed_shards": [ { "shard": 0, "index": "my_index", "node": "3dQd1RRVTMiKdTckM68nPQ", "reason": { "type": "query_shard_exception", "reason": "Binary fields do not support searching", "index_uuid": "fgA7UM5XSS-56JO4F4fYug", "index": "my_index" } } ] }, "status": 400 }
Base64 encoding and decoding tool: http://www1.tc711.com/tool/BASE64.htm
The ip type stores IPv4 or IPv6 addresses.
PUT my_index { "mappings": { "my_type": { "properties": { "ip_addr": { "type": "ip" } } } } }
POST my_index/my_type/1 { "ip_addr": "192.168.1.1" }
POST my_index/_search { "query": { "term": { "ip_addr": "192.168.0.0/16" } } }
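ip fields are commonly queried with CIDR notation, which matches every address in a network rather than a single exact address. The sketch below uses Python's standard ipaddress module to show the difference for the sample address. It illustrates the matching logic only and does not call Elasticsearch.

```python
import ipaddress

addr = ipaddress.ip_address("192.168.1.1")

# CIDR match: is the address inside the 192.168.0.0/16 network?
print(addr in ipaddress.ip_network("192.168.0.0/16"))  # True

# Exact match against a single address fails for this document
print(addr == ipaddress.ip_address("192.168.0.0"))     # False
```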
The following range types are supported:
Type | Range |
---|---|
integer_range | -2^31 to 2^31-1 |
float_range | 32-bit IEEE 754 |
long_range | -2^63 to 2^63-1 |
double_range | 64-bit IEEE 754 |
date_range | 64-bit integer, in milliseconds |
Typical use cases for range types include a time-range picker or an age-range selector in a front-end form.
Example:
PUT range_index { "mappings": { "my_type": { "properties": { "expected_attendees": { "type": "integer_range" }, "time_frame": { "type": "date_range", "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis" } } } } }
POST range_index/my_type/1 { "expected_attendees" : { "gte" : 10, "lte" : 20 }, "time_frame" : { "gte" : "2015-10-31 12:00:00", "lte" : "2015-11-01" } }
The code above creates the range_index index and indexes a document whose expected_attendees range is 10 to 20 and whose time_frame runs from 2015-10-31 12:00:00 to 2015-11-01.
Query:
POST range_index/_search { "query" : { "range" : { "time_frame" : { "gte" : "2015-08-01", "lte" : "2015-12-01", "relation" : "within" } } } }
Query result:
{ "took": 2, "timed_out": false, "_shards": { "total": 5, "successful": 5, "failed": 0 }, "hits": { "total": 1, "max_score": 1, "hits": [ { "_index": "range_index", "_type": "my_type", "_id": "1", "_score": 1, "_source": { "expected_attendees": { "gte": 10, "lte": 20 }, "time_frame": { "gte": "2015-10-31 12:00:00", "lte": "2015-11-01" } } } ] } }
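The relation parameter decides how the stored range is compared with the query range. The sketch below models the three relations (within, contains, intersects) on closed intervals. `relation_matches` is a hypothetical helper, and the sketch ignores details such as how Elasticsearch rounds date bounds.

```python
from datetime import datetime

def relation_matches(field, query, relation="intersects"):
    """Compare a stored (low, high) range with a query (low, high) range."""
    f_lo, f_hi = field
    q_lo, q_hi = query
    if relation == "within":      # stored range lies entirely inside the query range
        return q_lo <= f_lo and f_hi <= q_hi
    if relation == "contains":    # stored range entirely covers the query range
        return f_lo <= q_lo and q_hi <= f_hi
    if relation == "intersects":  # the two ranges overlap somewhere
        return f_lo <= q_hi and q_lo <= f_hi
    raise ValueError(f"unknown relation: {relation}")

time_frame = (datetime(2015, 10, 31, 12, 0, 0), datetime(2015, 11, 1))
query = (datetime(2015, 8, 1), datetime(2015, 12, 1))

print(relation_matches(time_frame, query, "within"))      # True
print(relation_matches(time_frame, query, "contains"))    # False
print(relation_matches(time_frame, query, "intersects"))  # True
```

The stored time_frame falls inside the August to December window, which is why the within query returns the document.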
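The nested type is a specialized version of the object type that allows arrays of objects to be indexed and queried independently of each other. Plain object fields can cause problems here. Consider the document my_index/my_type/1: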
POST my_index/my_type/1 { "group" : "fans", "user" : [ { "first" : "John", "last" : "Smith" }, { "first" : "Alice", "last" : "White" } ] }
The user field is dynamically added as an object type.
Internally, the document is converted to the following flattened form:
{ "group" : "fans", "user.first" : [ "alice", "john" ], "user.last" : [ "smith", "white" ] }
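user.first and user.last are flattened into multi-value fields, so the association between Alice and White is lost. The document therefore incorrectly matches the following query, even though no user named Alice Smith exists: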
POST my_index/_search { "query": { "bool": { "must": [ { "match": { "user.first": "Alice" }}, { "match": { "user.last": "Smith" }} ] } } }
The nested field type addresses this shortcoming of the object type:
PUT my_index { "mappings": { "my_type": { "properties": { "user": { "type": "nested" } } } } }
POST my_index/my_type/1 { "group" : "fans", "user" : [ { "first" : "John", "last" : "Smith" }, { "first" : "Alice", "last" : "White" } ] }
POST my_index/_search { "query": { "nested": { "path": "user", "query": { "bool": { "must": [ { "match": { "user.first": "Alice" }}, { "match": { "user.last": "Smith" }} ] } } } } }
POST my_index/_search { "query": { "nested": { "path": "user", "query": { "bool": { "must": [ { "match": { "user.first": "Alice" }}, { "match": { "user.last": "White" }} ] } }, "inner_hits": { "highlight": { "fields": { "user.first": {} } } } } } }
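The difference between the two behaviors can be modeled in plain Python. This is a simplified sketch of the matching logic only: `match_flattened` and `match_nested` are hypothetical helpers, and lowercasing stands in for text analysis.

```python
users = [
    {"first": "John", "last": "Smith"},
    {"first": "Alice", "last": "White"},
]

def match_flattened(users, first, last):
    """Object-style matching: each field is one bag of values per document."""
    firsts = {u["first"].lower() for u in users}
    lasts = {u["last"].lower() for u in users}
    return first.lower() in firsts and last.lower() in lasts

def match_nested(users, first, last):
    """Nested-style matching: both conditions must hold within the same object."""
    return any(
        u["first"].lower() == first.lower() and u["last"].lower() == last.lower()
        for u in users
    )

print(match_flattened(users, "Alice", "Smith"))  # True  (false positive)
print(match_nested(users, "Alice", "Smith"))     # False
print(match_nested(users, "Alice", "White"))     # True
```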
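The token_count type indexes the number of tokens that analysis produces for a string, so documents can be queried by token count: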
PUT my_index { "mappings": { "my_type": { "properties": { "name": { "type": "text", "fields": { "length": { "type": "token_count", "analyzer": "standard" } } } } } } }
POST my_index/my_type/1 { "name": "John Smith" }
POST my_index/my_type/2 { "name": "Rachel Alice Williams" }
POST my_index/_search { "query": { "term": { "name.length": 3 } } }
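What the name.length sub-field holds for the two sample documents can be approximated as follows. The regular expression is a rough stand-in for the standard analyzer and is an assumption of this sketch, and `token_count` here is a hypothetical helper.

```python
import re

def token_count(text):
    """Count word-like tokens; a rough stand-in for analyzer output."""
    return len(re.findall(r"\w+", text))

names = {1: "John Smith", 2: "Rachel Alice Williams"}
for doc_id, name in names.items():
    print(doc_id, token_count(name))
# 1 2
# 2 3

# Equivalent of the term query on name.length with value 3
matches = [doc_id for doc_id, name in names.items() if token_count(name) == 3]
print(matches)  # [2]
```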
The geo_point type stores the latitude and longitude of a location:
PUT my_index { "mappings": { "my_type": { "properties": { "location": { "type": "geo_point" } } } } }
POST my_index/my_type/1 { "text": "Geo-point as an object", "location": { "lat": 41.12, "lon": -71.34 } }
POST my_index/my_type/2 { "text": "Geo-point as a string", "location": "41.12,-71.34" }
POST my_index/my_type/3 { "text": "Geo-point as a geohash", "location": "drm3btev3e86" }
POST my_index/my_type/4 { "text": "Geo-point as an array", "location": [ -71.34, 41.12 ] }
POST my_index/_search { "query": { "geo_bounding_box": { "location": { "top_left": { "lat": 42, "lon": -72 }, "bottom_right": { "lat": 40, "lon": -74 } } } } }
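The sample documents express the same point in several input formats, and the ordering differs between them: the string form is "lat,lon" while the array form is [lon, lat]. The sketch below normalizes three of the formats to a (lat, lon) tuple. `parse_geo_point` is a hypothetical helper, and geohash decoding is deliberately left out to keep the example short.

```python
def parse_geo_point(value):
    """Normalize an object, "lat,lon" string, or [lon, lat] array to (lat, lon)."""
    if isinstance(value, dict):
        return float(value["lat"]), float(value["lon"])
    if isinstance(value, str) and "," in value:
        lat, lon = value.split(",")
        return float(lat), float(lon)
    if isinstance(value, (list, tuple)) and len(value) == 2:
        lon, lat = value  # array form puts longitude first
        return float(lat), float(lon)
    raise ValueError("unsupported format (geohash decoding is not covered here)")

print(parse_geo_point({"lat": 41.12, "lon": -71.34}))  # (41.12, -71.34)
print(parse_geo_point("41.12,-71.34"))                 # (41.12, -71.34)
print(parse_geo_point([-71.34, 41.12]))                # (41.12, -71.34)
```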