tmdb is the name of the search template; tmdb1 is the index the template is run against.
## Create or update the template
POST _scripts/tmdb
{
  "script": {
    "lang": "mustache",
    "source": {
      "_source": ["title", "overview"],
      "size": 20,
      "query": {
        "multi_match": { "query": "{{q}}", "fields": ["title", "overview"] }
      }
    }
  }
}

## Run a search through the template
POST tmdb1/_search/template
{
  "id": "tmdb",
  "params": { "q": "basketball with cartoon aliens" }
}
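An alias is an alternative name for one or more Elasticsearch indices. It can group several indices into a single view, and it can carry a filter so that only matching documents are exposed. Querying the alias returns data from every index behind it.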
#### Add an alias
POST _aliases
{
  "actions": [
    { "add": { "index": "news", "alias": "new1" } },
    { "add": { "index": "blogs", "alias": "new1" } }
  ]
}

## Searching the alias returns data from both news and blogs
POST new1/_search
{ "query": { "match_all": {} } }

### Remove the alias
POST _aliases
{
  "actions": [
    { "remove": { "index": "blogs", "alias": "new1" } },
    { "remove": { "index": "news", "alias": "new1" } }
  ]
}
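new score = old score * votes
With a modifier: new score = old score * log(1 + votes)
With a factor: new score = old score * log(1 + factor * votes)
The formulas above assume boost_mode is multiply. boost_mode defines how the old score is combined with the function value; it can also be sum and others, see the official documentation.
max_boost caps the value that the function can contribute to the score.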
DELETE blogs PUT blogs/_doc/1 { "title": "About popularity", "content": "In this post we wil talk about...", "votes": 0 } PUT blogs/_doc/2 { "title": "About popularity", "content": "In this post we wil talk about...", "votes": 100 } PUT blogs/_doc/3 { "title": "About popularity", "content": "In this post we wil talk about...", "votes": 1000000 } POST blogs/_search { "query": { "function_score": { "query": { "multi_match": { "query": "popularity", "fields": ["title", "content"] } }, "field_value_factor": { "field": "votes", "modifier": "log1p", "factor": 0.1 }, "boost_mode": "sum", "max_boost": 3 } } }
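The scoring formulas can be checked by hand with a small Python sketch. It is not Elasticsearch code: the function names and the query score of 1.0 are assumptions made for illustration. It relies on the log1p modifier using the base-10 logarithm and on max_boost capping the function value before boost_mode combines it with the query score.

```python
import math


def field_value_factor(votes, factor=1.0, modifier="none"):
    """Value contributed by the votes field before it is combined with the query score."""
    value = factor * votes
    if modifier == "log1p":
        return math.log10(1 + value)  # log1p: add 1, then take the common logarithm
    if modifier == "none":
        return value
    raise ValueError(f"unsupported modifier in this sketch: {modifier}")


def function_score(query_score, votes, factor, modifier, boost_mode, max_boost):
    """Combine the query score with the capped function value."""
    boost = min(field_value_factor(votes, factor, modifier), max_boost)
    if boost_mode == "multiply":
        return query_score * boost
    if boost_mode == "sum":
        return query_score + boost
    raise ValueError(f"unsupported boost_mode in this sketch: {boost_mode}")


if __name__ == "__main__":
    # Same parameters as the request above; 1.0 is a made-up query score.
    for votes in (0, 100, 1_000_000):
        print(votes, function_score(1.0, votes, 0.1, "log1p", "sum", 3))
```

With one million votes the function value would be about 5, so max_boost limits it to 3. With boost_mode multiply, a document with 0 votes would end up with a score of 0, which is one reason to prefer sum here.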
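suggest_mode controls when suggestions are returned. missing: no suggestion is given for a term that already exists in the index, for example solid in "lucen solid". popular: only terms that occur in more documents than the input term are suggested; in the sample data below, two documents contain rocks. always: suggestions are returned whether or not the term exists in the index.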
POST articles/_bulk {"index": {}} {"body": "lucene is very cool"} {"index": {}} {"body": "Elasticsearch builds on top of lucene"} {"index": {}} {"body": "Elasticsearch rocks"} {"index": {}} {"body": "elastic is the company behind ELK stack"} {"index":{}} {"body": "Elk stack rocks"} {"index":{}} {"body": "elasticsearch is rock solid"} POST articles/_search { "suggest": { "test1": { "text": "lucen solid", "term": { "field": "body", "suggest_mode": "missing" } } } }
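To make the three suggest_mode values concrete, here is a toy term suggester in Python. It is only a sketch of the behaviour described above, not how Elasticsearch implements it: whitespace tokenization, the plain Levenshtein distance, the same-first-letter rule and the ranking are simplifying assumptions.

```python
from collections import Counter

DOCS = [
    "lucene is very cool",
    "Elasticsearch builds on top of lucene",
    "Elasticsearch rocks",
    "elastic is the company behind ELK stack",
    "Elk stack rocks",
    "elasticsearch is rock solid",
]

# Number of documents containing each lowercased term (stand-in for the index).
DOC_FREQ = Counter(term for doc in DOCS for term in set(doc.lower().split()))


def edit_distance(a, b):
    """Plain Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def suggest(term, mode, max_edits=2):
    """Return suggestions for one non-empty input term under the given suggest_mode."""
    own_freq = DOC_FREQ.get(term, 0)
    if mode == "missing" and own_freq > 0:
        return []  # the term is in the index, so nothing is suggested
    candidates = [
        t for t in DOC_FREQ
        if t != term and t[0] == term[0] and edit_distance(term, t) <= max_edits
    ]
    if mode == "popular":
        # Keep only terms that occur in more documents than the input term.
        candidates = [t for t in candidates if DOC_FREQ[t] > own_freq]
    return sorted(candidates, key=lambda t: (edit_distance(term, t), -DOC_FREQ[t], t))


if __name__ == "__main__":
    for mode in ("missing", "popular", "always"):
        print(mode, {t: suggest(t, mode) for t in ("lucen", "solid", "rock")})
```

In this toy index rock appears in one document and rocks in two, so rock only gets a suggestion in popular and always mode.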
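The completion suggester provides autocomplete. It looks terms up in an in-memory FST, so it is fast, but it is limited to prefix matching: a suggestion can only be matched from its first character.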
DELETE articles GET articles/_mapping PUT articles { "mappings": { "properties": { "title_completion": { "type": "completion" } } } } POST articles/_bulk {"index": {}} {"title_completion": "lucene is very cool"} {"index": {}} {"title_completion": "Elasticsearch builds on top of lucene"} {"index": {}} {"title_completion": "Elasticsearch rocks"} {"index": {}} {"title_completion": "elastic is the company behind ELK stack"} {"index":{}} {"title_completion": "Elk stack rocks"} {"index":{}} {"title_completion": "elasticsearch is rock solid"} POST articles/_search?pretty { "suggest": { "articles_suggester": { "prefix": "e", "completion": { "field": "title_completion" } } } }
The completion suggester can use a context to restrict suggestions to one category of documents. A context of type category accepts any string.
DELETE comments
PUT comments
{
  "mappings": {
    "properties": {
      "comment_autocomplete": {
        "type": "completion",
        "contexts": [
          { "type": "category", "name": "comment_category" }
        ]
      }
    }
  }
}

POST comments/_doc/1
{
  "comment": "I love the star war movies",
  "comment_autocomplete": {
    "input": ["star wars"],
    "contexts": { "comment_category": "movies" }
  }
}

POST comments/_doc/2
{
  "comment": "Where can I find a Starbucks",
  "comment_autocomplete": {
    "input": ["starbucks"],
    "contexts": { "comment_category": "coffee" }
  }
}

POST comments/_search
{
  "suggest": {
    "YOUR_SUGGESTION": {
      "text": "star",
      "completion": {
        "field": "comment_autocomplete",
        "contexts": { "comment_category": "movies" }
      }
    }
  }
}
elasticsearch-7.5.0/bin/elasticsearch -E -E -E -E discovery.type=single-node -E http.port=9201 -E transport.port=9301
elasticsearch-7.5.0/bin/elasticsearch -E -E -E -E discovery.type=single-node -E http.port=9202 -E transport.port=9302
elasticsearch-7.5.0/bin/elasticsearch -E -E -E -E discovery.type=single-node -E http.port=9203 -E transport.port=9303

curl -XPUT "http://localhost:9201/_cluster/settings" -H "Content-Type:application/json" -d '{"persistent":{"cluster":{"remote":{"cluster1":{"seeds":[""], "transport.ping_schedule":"30s"},"cluster2":{"seeds":[""],"transport.ping_schedule":"30s","transport.compress": true, "skip_unavailable":true},"cluster3":{"seeds":[""]}}}}}'
curl -XPUT "http://localhost:9202/_cluster/settings" -H "Content-Type:application/json" -d '{"persistent":{"cluster":{"remote":{"cluster1":{"seeds":[""], "transport.ping_schedule":"30s"},"cluster2":{"seeds":[""],"transport.ping_schedule":"30s","transport.compress": true, "skip_unavailable":true},"cluster3":{"seeds":[""]}}}}}'
curl -XPUT "http://localhost:9203/_cluster/settings" -H "Content-Type:application/json" -d '{"persistent":{"cluster":{"remote":{"cluster1":{"seeds":[""], "transport.ping_schedule":"30s"},"cluster2":{"seeds":[""],"transport.ping_schedule":"30s","transport.compress": true, "skip_unavailable":true},"cluster3":{"seeds":[""]}}}}}'

curl -XPOST "http://localhost:9201/users/_doc" -H "Content-Type:application/json" -d '{"name":"user1", "age": 10}'
curl -XPOST "http://localhost:9202/users/_doc" -H "Content-Type:application/json" -d '{"name":"user2", "age": 20}'
curl -XPOST "http://localhost:9203/users/_doc" -H "Content-Type:application/json" -d '{"name":"user3", "age": 30}'

Search all three clusters in one request:
http://localhost:9201/cluster1:users,cluster2:users,cluster3:users/_search
Setting node.master=false prevents the current node from being elected master.
master not discovered or elected yet, an election requires
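This error means that no master node can be found. One fix is to delete the node's data directory, but that also deletes all the data stored there. A safer order is to start the original master node first and then start the node whose data was deleted, so that the data is synchronized back to it.
Suppose an index has 3 primary shards and 1 replica, spread across several machines. If one machine goes down, the shard copies it held are rebuilt on the other machines. If the failed machine held a primary shard, one of that shard's replicas is promoted to primary.
The number of primary shards is fixed when the index is created and cannot be changed afterwards, short of deleting the index and indexing the data again.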
shard = hash(_routing) % number_of_primary_shards
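The hash spreads documents evenly across the shards. By default _routing is the document id.
_routing can also be set explicitly. Because the shard is computed modulo the number of primary shards, this formula is the reason the primary shard count cannot be changed.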
PUT posts/_doc/100?routing=bigdata
{
  "title": "Master Elasticsearch",
  "body": "Let's Rock"
}
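The routing formula can be tried out in Python. This is a sketch under assumptions: zlib.crc32 stands in for the hash function (Elasticsearch uses its own), and the document ids are made up. It shows why changing number_of_primary_shards would send existing documents to different shards.

```python
import zlib


def shard_for(routing: str, number_of_primary_shards: int) -> int:
    """shard = hash(_routing) % number_of_primary_shards"""
    # zlib.crc32 is only a stand-in hash for this illustration.
    return zlib.crc32(routing.encode("utf-8")) % number_of_primary_shards


def moved_documents(doc_ids, old_shards, new_shards):
    """Ids whose shard would change if the primary shard count changed."""
    return [d for d in doc_ids if shard_for(d, old_shards) != shard_for(d, new_shards)]


if __name__ == "__main__":
    ids = [f"doc-{i}" for i in range(1000)]  # hypothetical document ids
    moved = moved_documents(ids, old_shards=3, new_shards=4)
    print(f"{len(moved)} of {len(ids)} documents would be looked up on a different shard")
    print("routing=bigdata goes to shard", shard_for("bigdata", 3))
```

A lookup by id applies the same formula, so after a change in shard count most documents would be searched for on a shard that does not hold them.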
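A single inverted index is a segment, and segments are immutable. A set of segments forms a Lucene index, which corresponds to a shard in Elasticsearch.
When documents are written, new segments are created. A query searches all segments and merges the results. Deleted documents are recorded in a .del file.
Writing the index buffer into a segment is called refresh.
Refresh runs once per second by default. Once a refresh succeeds, the documents become searchable.
A system with heavy write traffic therefore produces many segments.
A refresh is also triggered when the index buffer fills up. Its default size is 10% of the JVM heap.
Writing a segment to disk is slow, so the segment is first written to the cache and opened for search.
To prevent data loss, every write is also appended to the transaction log, which is persisted to disk. Each shard has its own transaction log.
After a power failure, the node replays the transaction log at startup, which keeps the data complete.
Flush runs every 30 minutes by default. It first calls refresh to empty the index buffer;
then calls fsync to write the cached segments to disk, so that everything recorded in the transaction log is safely persisted;
and finally clears the transaction log.
A flush is also triggered when the transaction log is full. Its default size is 512 MB.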
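The interplay of index buffer, refresh, transaction log and flush described above can be modelled with a toy class. This is a teaching sketch, not Elasticsearch's implementation; ToyShard and its method names are invented for the example.

```python
class ToyShard:
    """Minimal model of the write path of one shard."""

    def __init__(self):
        self.index_buffer = []     # written, but not yet searchable
        self.segments = []         # searchable; each segment is an immutable tuple
        self.translog = []         # operations recorded since the last flush
        self.fsynced_segments = 0  # number of segments safely on disk

    def index(self, doc):
        self.index_buffer.append(doc)
        self.translog.append(doc)

    def refresh(self):
        """Turn the index buffer into a new searchable segment."""
        if self.index_buffer:
            self.segments.append(tuple(self.index_buffer))
            self.index_buffer = []

    def flush(self):
        """Refresh, persist all segments, then clear the transaction log."""
        self.refresh()
        self.fsynced_segments = len(self.segments)
        self.translog = []

    def search(self):
        return [doc for segment in self.segments for doc in segment]

    def recover_after_crash(self):
        """Keep only persisted segments and replay the transaction log."""
        self.segments = self.segments[: self.fsynced_segments]
        self.index_buffer = []
        replay, self.translog = self.translog, []
        for doc in replay:
            self.index(doc)
        self.refresh()


if __name__ == "__main__":
    shard = ToyShard()
    shard.index("a")
    print("before refresh:", shard.search())
    shard.refresh()
    print("after refresh:", shard.search())
```

A document indexed after the last flush survives a simulated crash only because it is still in the transaction log.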
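Segments accumulate over time, so they are merged periodically. Merging reduces the number of segments and purges deleted documents.
A merge can be forced with POST my_index/_forcemerge.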
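Sorting on a text field requires setting fielddata to true on that field. Sorting relies on doc_values by default, which text fields do not have, so fielddata has to be enabled instead.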
PUT /kibana_sample_data_ecommerce/_mapping
{
  "properties": {
    "customer_full_name": {
      "type": "text",
      "fielddata": true,
      "fields": {
        "keyword": { "type": "keyword", "ignore_above": 256 }
      }
    }
  }
}
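Elasticsearch stores data in multiple shards on multiple machines. For a query with from 990 and size 10, every shard has to return its top 1000 documents. The coordinating node gathers all of them, sorts them, keeps the top 1000 and returns the last 10. The deeper the page, the more memory is used. By default Elasticsearch limits from + size to 10000 documents; the limit can be changed with index.max_result_window.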
POST /kibana_sample_data_ecommerce/_search
{
  "from": 1,
  "size": 2,
  "query": { "match_all": {} }
}
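The cost of deep paging can be illustrated with a short Python simulation. It is a sketch, not Elasticsearch code: shards are plain lists of sort values and search_page is a hypothetical helper. Each shard returns its own top from + size entries, and the coordinating node merges them before cutting out the requested page.

```python
import heapq


def search_page(shards, from_, size):
    """Return (page, entries_held_by_coordinator) for one from/size request.

    `shards` is a list of lists of sortable values, one list per shard.
    """
    window = from_ + size
    # Every shard has to return its own top `window` entries.
    per_shard = [sorted(shard)[:window] for shard in shards]
    held = sum(len(top) for top in per_shard)
    # The coordinating node merges them and keeps only the requested slice.
    merged = list(heapq.merge(*per_shard))
    return merged[from_:window], held


if __name__ == "__main__":
    # Three shards with 10000 values each.
    shards = [list(range(start, 30000, 3)) for start in range(3)]
    page, held = search_page(shards, from_=990, size=10)
    print("page:", page)
    print("entries handled by the coordinating node:", held)
```

For a page of 10 hits the coordinating node has to handle 3 * 1000 entries, and that number grows with both the page depth and the shard count.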
search_after takes the sort values of the last hit in the previous response and uses them to fetch the next page. This avoids the deep-paging problem.
// First request
POST /kibana_sample_data_ecommerce/_search
{
  "size": 2,
  "query": { "match_all": {} },
  "sort": [
    { "order_date": { "order": "desc" } },
    { "_id": { "order": "desc" } }
  ]
}

// The last hit in the response carries its sort values:
... "sort" : [ 1581808954000, "gTTfym8BtdKew7ex1Zsk" ]

// Second request: pass those sort values as search_after (from must not be used to skip hits here)
POST /kibana_sample_data_ecommerce/_search
{
  "size": 2,
  "query": { "match_all": {} },
  "search_after": [ 1581808954000, "gTTfym8BtdKew7ex1Zsk" ],
  "sort": [
    { "order_date": { "order": "desc" } },
    { "_id": { "order": "desc" } }
  ]
}
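The paging logic behind search_after can be sketched in Python over an in-memory list. The helper names and the sample tuples are assumptions for illustration; each document is reduced to its sort values (order_date, _id), sorted descending on both as in the requests above.

```python
def search_after_page(docs, size, after=None):
    """Return the next `size` documents that sort strictly after `after`."""
    ordered = sorted(docs, reverse=True)  # descending on (order_date, _id)
    if after is not None:
        ordered = [d for d in ordered if d < after]
    return ordered[:size]


def iterate_all(docs, size):
    """Walk through all pages, feeding the last hit's sort values back in."""
    after = None
    while True:
        page = search_after_page(docs, size, after)
        if not page:
            return
        yield page
        after = page[-1]  # sort values of the last hit on this page


if __name__ == "__main__":
    docs = [(3, "a"), (3, "b"), (2, "c"), (1, "d"), (1, "e")]
    for page in iterate_all(docs, size=2):
        print(page)
```

The _id tie-breaker matters: two documents share order_date 3 and two share order_date 1, and without a unique second sort key a page boundary between them could skip or repeat a hit.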
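// Open a scroll and keep its context alive for 5 minutes
POST /kibana_sample_data_ecommerce/_search?scroll=5m
{
  "size": 1,
  "query": { "match_all": {} }
}

// Take the _scroll_id from the previous response and request the next batch, keeping the context alive for 1 more minute
POST _search/scroll
{
  "scroll": "1m",
  "scroll_id": "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAADZUWSThxQkp4VUZTZy1ZZzE0OGI1OW02Zw=="
}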
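from/size | search_after | scroll |
Not suitable for deep paging | Supports deep paging, but can only page forward from the start | Returns results in arbitrary order with high efficiency |
Suited to querying the first few pages | Suited to deep paging | Suited to fetching and exporting the full data set |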
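Elasticsearch handles concurrent writes with optimistic locking.
When a document is updated with if_seq_no and if_primary_term, the document's current values must equal the values passed in; otherwise the update is rejected.
Alternatively, version together with version_type can be used as the lock.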
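DELETE products
GET products/_search

// The response to this request contains _seq_no and _primary_term; those are the values to pass in the conditional update below
PUT products/_doc/1
{ "title": "iphone", "count": 100 }

PUT products/_doc/1?if_seq_no=1&if_primary_term=1
{ "title": "iphone1", "count": 100 }

// With version_type=external the version must be greater than the current version of document 1, otherwise the request fails with a conflict; this is Elasticsearch's optimistic locking
PUT products/_doc/1?version=6&version_type=external
{ "title": "iphone2", "count": 1 }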
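The if_seq_no / if_primary_term check can be modelled in a few lines of Python. ToyIndex and VersionConflict are invented names, and the model keeps a single counter per index; it only illustrates the compare-then-write idea, not Elasticsearch internals.

```python
class VersionConflict(Exception):
    """Raised when the caller's expected seq_no/primary_term is stale."""


class ToyIndex:
    def __init__(self):
        self.docs = {}          # doc_id -> (source, seq_no, primary_term)
        self.next_seq_no = 0
        self.primary_term = 1

    def put(self, doc_id, source, if_seq_no=None, if_primary_term=None):
        current = self.docs.get(doc_id)
        if if_seq_no is not None or if_primary_term is not None:
            expected = (if_seq_no, if_primary_term)
            if current is None or (current[1], current[2]) != expected:
                raise VersionConflict(f"document {doc_id} changed since it was read")
        seq_no = self.next_seq_no
        self.next_seq_no += 1
        self.docs[doc_id] = (source, seq_no, self.primary_term)
        return {"_seq_no": seq_no, "_primary_term": self.primary_term}


if __name__ == "__main__":
    index = ToyIndex()
    first = index.put("1", {"title": "iphone", "count": 100})
    # Two clients read the same version; only the first conditional write wins.
    index.put("1", {"title": "iphone", "count": 99},
              if_seq_no=first["_seq_no"], if_primary_term=first["_primary_term"])
    try:
        index.put("1", {"title": "iphone", "count": 98},
                  if_seq_no=first["_seq_no"], if_primary_term=first["_primary_term"])
    except VersionConflict as err:
        print("second writer must re-read and retry:", err)
```

The losing writer is expected to read the document again and retry with the fresh values.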
sql | es |
select count(brand) from table | metric |
group by | bucket |
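Aggregations cannot operate on text fields.
A terms aggregation cannot bucket a text field as-is; the field has to be changed to a type that supports bucketing (see the difference between doc_values and fielddata). keyword fields support bucketing by default.
Available aggregations include min, max, avg, stats, terms, range and histogram.
To group first and then fetch the top documents within each group, use top_hits.
To get the total number of distinct groups, use cardinality.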
DELETE employees
GET employees/_mapping
PUT employees
{
  "mappings": {
    "properties": {
      "age": { "type": "integer" },
      "gender": { "type": "keyword" },
      "name": { "type": "keyword" },
      "salary": { "type": "integer" },
      "job": {
        "type": "text",
        "fields": { "keyword": { "type": "keyword", "ignore_above": 22 } }
      }
    }
  }
}

POST employees/_bulk
{"index":{"_id": "1"}}
{"name":"Emma","age":"32","job":"Product Manager", "gender": "female","salary": "35000"}
{"index":{"_id": "2"}}
{"name":"Underwood","age":"41","job":"Dev Manager", "gender": "male","salary": "50000"}
{"index":{"_id": "3"}}
{"name":"Tran","age":"25","job":"Web Designer", "gender": "male","salary": "18000"}
{"index":{"_id": "4"}}
{"name":"Rivera","age":"26","job":"Web Designer", "gender": "female","salary": "22000"}
{"index":{"_id": "5"}}
{"name":"Rose","age":"25","job":"QA", "gender": "female","salary": "18000"}
{"index":{"_id": "6"}}
{"name":"Lucy","age":"31","job":"QA", "gender": "female","salary": "25000"}
{"index":{"_id": "7"}}
{"name":"Byrd","age":"27","job":"QA", "gender": "male","salary": "20000"}
{"index":{"_id": "8"}}
{"name":"Foster","age":"27","job":"Java Programmer", "gender": "male","salary": "20000"}
{"index":{"_id": "9"}}
{"name":"Gregory","age":"32","job":"Java Programmer", "gender": "male","salary": "22000"}
{"index":{"_id": "10"}}
{"name":"Bryant","age":"20","job":"Java Programmer", "gender": "male","salary": "9000"}
{"index":{"_id": "11"}}
{"name":"Jenny","age":"36","job":"Java Programmer", "gender": "female","salary": "38000"}
{"index":{"_id": "12"}}
{"name":"Mcdonald","age":"31","job":"Java Programmer", "gender": "male","salary": "32000"}
{"index":{"_id": "13"}}
{"name":"Jonthna","age":"30","job":"Java Programmer", "gender": "female","salary": "30000"}
{"index":{"_id": "14"}}
{"name":"Marsha","age":"32","job":"Javascript Programmer", "gender": "male","salary": "25000"}
{"index":{"_id": "15"}}
{"name":"King","age":"33","job":"Java Programmer", "gender": "male","salary": "28000"}
{"index":{"_id": "16"}}
{"name":"Mccarthy","age":"21","job":"Javascript Programmer", "gender": "male","salary": "16000"}
{"index":{"_id": "17"}}
{"name":"Goodwid","age":"25","job":"Javascript Programmer", "gender": "male","salary": "16000"}
{"index":{"_id": "18"}}
{"name":"Catherine","age":"29","job":"Javascript Programmer", "gender": "female","salary": "20000"}
{"index":{"_id": "19"}}
{"name":"Boone","age":"30","job":"DBA", "gender": "male","salary": "30000"}
{"index":{"_id": "20"}}
{"name":"Kathy","age":"29","job":"DBA", "gender": "female","salary": "20000"}

POST employees/_search
{ "size": 0, "aggs": { "min_salary": { "min": { "field": "salary" } } } }

POST employees/_search
{ "size": 0, "aggs": { "max_salary": { "max": { "field": "salary" } } } }

POST employees/_search
{
  "size": 0,
  "aggs": {
    "min_salary": { "min": { "field": "salary" } },
    "max_salary": { "max": { "field": "salary" } },
    "avg_salary": { "avg": { "field": "salary" } }
  }
}

POST employees/_search
{ "size": 20, "aggs": { "stats_salary": { "stats": { "field": "salary" } } } }

POST employees/_search
{ "size": 0, "aggs": { "jobs": { "terms": { "field": "job.keyword" } } } }

// Total number of distinct buckets (job categories)
POST employees/_search
{ "size": 0, "aggs": { "cardinate": { "cardinality": { "field": "job.keyword" } } } }

POST employees/_search
{ "size": 0, "aggs": { "ages": { "terms": { "field": "age", "size": 20 } } } }

### For each job, the 3 oldest employees
POST employees/_search
{
  "size": 0,
  "aggs": {
    "result": {
      "terms": { "field": "job.keyword" },
      "aggs": {
        "old_employees": {
          "top_hits": { "size": 3, "sort": [ { "age": { "order": "desc" } } ] }
        }
      }
    }
  }
}

#### range buckets with custom keys
POST employees/_search
{
  "size": 0,
  "aggs": {
    "range_result": {
      "range": {
        "field": "salary",
        "ranges": [
          { "from": 0, "to": 10000 },
          { "key": "1w-2w", "from": 10000, "to": 20000 },
          { "key": ">2w", "from": 20000 }
        ]
      }
    }
  }
}

#### histogram buckets: one bucket per 5000 of salary
POST employees/_search
{
  "size": 0,
  "aggs": {
    "result1": {
      "histogram": {
        "field": "salary",
        "interval": 5000,
        "extended_bounds": { "min": 0, "max": 100000 }
      }
    }
  }
}
A pipeline aggregation runs a second aggregation over the results of another aggregation.
#### Find the smallest of the per-bucket average salaries under term_job
#### term_job is the outer aggregation; avg_salary is its sub-aggregation
POST employees/_search
{
  "size": 0,
  "aggs": {
    "term_job": {
      "terms": { "field": "job.keyword" },
      "aggs": {
        "avg_salary": { "avg": { "field": "salary" } }
      }
    },
    "result": {
      "min_bucket": { "buckets_path": "term_job>avg_salary" }
    }
  }
}
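What the min_bucket request computes can be reproduced in plain Python. This is only a sketch of the two-stage computation; the (job, salary) pairs are copied from the employees bulk data in this post and the function names are made up.

```python
from collections import defaultdict

# (job, salary) for the 20 employees indexed above.
EMPLOYEES = [
    ("Product Manager", 35000), ("Dev Manager", 50000),
    ("Web Designer", 18000), ("Web Designer", 22000),
    ("QA", 18000), ("QA", 25000), ("QA", 20000),
    ("Java Programmer", 20000), ("Java Programmer", 22000),
    ("Java Programmer", 9000), ("Java Programmer", 38000),
    ("Java Programmer", 32000), ("Java Programmer", 30000),
    ("Javascript Programmer", 25000), ("Java Programmer", 28000),
    ("Javascript Programmer", 16000), ("Javascript Programmer", 16000),
    ("Javascript Programmer", 20000), ("DBA", 30000), ("DBA", 20000),
]


def avg_salary_per_job(employees):
    """Stage 1: terms bucket on job with an avg sub-aggregation on salary."""
    salaries = defaultdict(list)
    for job, salary in employees:
        salaries[job].append(salary)
    return {job: sum(values) / len(values) for job, values in salaries.items()}


def min_bucket(bucket_values):
    """Stage 2: the pipeline step, picking the bucket with the smallest value."""
    job = min(bucket_values, key=bucket_values.get)
    return job, bucket_values[job]


if __name__ == "__main__":
    averages = avg_salary_per_job(EMPLOYEES)
    print(min_bucket(averages))  # ('Javascript Programmer', 19250.0)
```

The second stage never looks at documents, only at the bucket values produced by the first stage, which is what the buckets_path term_job>avg_salary expresses.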
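1. When a request contains a query, the aggregations run over the results of that query. (eg1)
2. A filter aggregation inside aggs narrows the documents for its own sub-aggregations only; an aggregation placed at the same level as the filter still operates on all the data. (eg2)
3. post_filter filters the returned hits after the aggregations have been computed, so it shows the documents matching a bucket without changing the aggregation results. (eg3)
4. global is in effect a combination of 1 and 2: a global aggregation ignores the query, so the query has no influence on its statistics. (eg4)
5. Buckets can be ordered by _key and by _count. (eg5)
6. Buckets can also be ordered by the result of another aggregation. (eg6)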
#### eg1
POST employees/_search
{
  "size": 0,
  "query": { "range": { "age": { "gte": 20 } } },
  "aggs": {
    "result": { "terms": { "field": "job.keyword" } }
  }
}
#### eg2: two kinds of statistics in one request, one over the filtered documents and one over everything; a query alone can only give statistics over the filtered documents
POST employees/_search
{
  "size": 0,
  "aggs": {
    "old_persion": {
      "filter": { "range": { "age": { "gte": 40 } } },
      "aggs": {
        "jobs": { "terms": { "field": "job.keyword" } }
      }
    },
    "all_jobs": { "terms": { "field": "job.keyword" } }
  }
}
#### eg3
POST employees/_search
{
  "aggs": {
    "jobs": { "terms": { "field": "job.keyword" } }
  },
  "post_filter": {
    "match": { "job.keyword": "Web Designer" }
  }
}
##### eg4: global plays the role of the "statistics over everything" part above, but uses a global aggregation instead of a filter; it ignores the conditions of the query
POST employees/_search
{
  "size": 0,
  "query": { "range": { "age": { "gte": 40 } } },
  "aggs": {
    "jobs": { "terms": { "field": "job.keyword" } },
    "all": {
      "global": {},
      "aggs": {
        "all_result": { "terms": { "field": "job.keyword" } }
      }
    }
  }
}
#### eg5: ordering. _key orders by bucket key, _count orders by document count. As observed here, the field written later takes precedence and the field written earlier is the tie-breaker, so this request orders by _count first and then by _key
POST employees/_search
{
  "size": 0,
  "query": { "range": { "age": { "gte": 20 } } },
  "aggs": {
    "NAME": {
      "terms": {
        "field": "job.keyword",
        "order": { "_key": "desc", "_count": "asc" }
      }
    }
  }
}
#### eg6: order buckets by the result of a sub-aggregation
POST employees/_search
{
  "size": 0,
  "aggs": {
    "jobs": {
      "terms": {
        "field": "job.keyword",
        "order": { "test1": "asc" }
      },
      "aggs": {
        "test1": { "avg": { "field": "salary" } }
      }
    }
  }
}
TODO: verify this again
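The nested type is for documents that contain objects which need to be queried object by object.
In the first example below (plain object mapping), searching for first_name Keanu together with last_name Hopper hits the document. As covered earlier, a document matches as long as it contains the queried values. In the second example the same search does not hit, because actors is mapped as nested: each object is indexed as a separate document, here two of them, Keanu Reeves and Dennis Hopper, and neither satisfies both conditions.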
DELETE my_movie PUT my_movie { "mappings" : { "properties" : { "actors" : { "properties" : { "first_name" : { "type" : "keyword" }, "last_name" : { "type" : "text" } } }, "title" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } } } } } } POST my_movie/_doc/1 { "title": "speed", "actors": [ { "first_name": "Keanu", "last_name": "Reeves" }, { "first_name": "Dennis", "last_name": "Hopper" } ] } POST my_movie/_search { "query": { "bool": { "must": [ {"match": { "actors.first_name": "Keanu" }}, {"match": { "actors.last_name": "Hopper" }} ] } } }
DELETE my_movie PUT my_movie { "mappings" : { "properties" : { "actors" : { "type": "nested", "properties" : { "first_name" : { "type" : "keyword" }, "last_name" : { "type" : "text" } } }, "title" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } } } } } } POST my_movie/_doc/1 { "title": "speed", "actors": [ { "first_name": "Keanu", "last_name": "Reeves" }, { "first_name": "Dennis", "last_name": "Hopper" } ] } POST my_movie/_search { "query": { "bool": { "must": [ { "nested": { "path": "actors", "query": { "bool": { "must": [ {"match": { "actors.first_name": "Keanu" }}, {"match": { "actors.last_name": "Hopper" }} ] } } } } ] } } }
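The difference between the two mappings can be sketched in Python. This is a simplified model under assumptions: values are compared exactly (no text analysis), and the function names are invented. A plain object array is flattened into one multi-valued field per property, while nested keeps each object separate.

```python
ACTORS = [
    {"first_name": "Keanu", "last_name": "Reeves"},
    {"first_name": "Dennis", "last_name": "Hopper"},
]


def flatten(actors):
    """Plain object mapping: one multi-valued field per property."""
    return {
        "actors.first_name": {a["first_name"] for a in actors},
        "actors.last_name": {a["last_name"] for a in actors},
    }


def match_object(actors, first_name, last_name):
    """Both conditions are checked against the flattened fields independently."""
    flat = flatten(actors)
    return first_name in flat["actors.first_name"] and last_name in flat["actors.last_name"]


def match_nested(actors, first_name, last_name):
    """Both conditions must hold inside the same object."""
    return any(
        a["first_name"] == first_name and a["last_name"] == last_name for a in actors
    )


if __name__ == "__main__":
    print("object mapping:", match_object(ACTORS, "Keanu", "Hopper"))
    print("nested mapping:", match_nested(ACTORS, "Keanu", "Hopper"))
```

After flattening, the link between Keanu and Reeves is lost, which is why the object mapping reports a match for an actor that does not exist.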
DELETE my_blogs
PUT my_blogs
{
  "settings": { "number_of_shards": 2 },
  "mappings": {
    "properties": {
      "blog_comments_relation": {
        "type": "join",
        "relations": { "blog": "comment" }
      },
      "content": { "type": "text" },
      "title": { "type": "keyword" }
    }
  }
}

PUT my_blogs/_doc/blog1
{
  "title": "Learning Elasticsearch",
  "content": "learning ELK @ geektime",
  "blog_comments_relation": { "name": "blog" }
}

PUT my_blogs/_doc/blog2
{
  "title": "Learning Hadoop",
  "content": "learning Hadoop",
  "blog_comments_relation": { "name": "blog" }
}

PUT my_blogs/_doc/comment1?routing=blog1
{
  "comment": "I am learning ELK",
  "username": "Jack",
  "blog_comments_relation": { "name": "comment", "parent": "blog1" }
}

PUT my_blogs/_doc/comment2?routing=blog2
{
  "comment": "I like Hadoop!!!!!",
  "username": "Jack",
  "blog_comments_relation": { "name": "comment", "parent": "blog2" }
}

PUT my_blogs/_doc/comment3?routing=blog2
{
  "comment": "Hello Hadoop",
  "username": "Bob",
  "blog_comments_relation": { "name": "comment", "parent": "blog2" }
}

POST my_blogs/_search
{ }

GET my_blogs/_doc/blog2

POST my_blogs/_search
{
  "query": {
    "parent_id": { "type": "comment", "id": "blog2" }
  }
}

#### has_parent returns the child documents
POST my_blogs/_search
{
  "query": {
    "has_parent": {
      "parent_type": "blog",
      "query": { "match": { "content": "Learning hadoop" } }
    }
  }
}

#### has_child returns the parent documents
POST my_blogs/_search
{
  "query": {
    "has_child": {
      "type": "comment",
      "query": { "match": { "username": "Bob" } }
    }
  }
}
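If the number of primary shards of an index has to change, the index must be rebuilt.
update by query: rebuild in place, on the existing index.
reindex: rebuild into another index.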
DELETE blogs
PUT blogs/_doc/1
{ "content": "Hadoop is cool", "keyword": "hadoop" }

GET blogs/_mapping

PUT blogs/_mapping
{
  "properties": {
    "content": {
      "type": "text",
      "fields": {
        "english": { "type": "text", "analyzer": "english" }
      }
    }
  }
}

PUT blogs/_doc/2
{ "content": "Elasticsearch rocks", "keyword": "elasticsearch" }

POST blogs/_search
{
  "query": { "match": { "content.english": "hadoop" } }
}

##### After adding a sub-field to the mapping, run _update_by_query so that existing documents are indexed with it
POST blogs/_update_by_query
{}

PUT blogs/_mapping
{
  "properties": {
    "keyword": { "type": "keyword" }
  }
}

#### Changing the type of an existing field is rejected, so the index has to be rebuilt into a new one
DELETE blog_fix
PUT blog_fix
{
  "mappings": {
    "properties": {
      "content": {
        "type": "text",
        "fields": {
          "english": { "type": "text", "analyzer": "english" },
          "keyword": { "type": "keyword", "ignore_above": 256 }
        }
      },
      "keyword": { "type": "keyword" }
    }
  }
}

GET blog_fix/_mapping

###### Reindex: copy the documents of the old index into the new index
POST _reindex
{
  "source": {
    #### source index name
    "index": "blogs",
    #### only documents matching this query are copied
    "query": { "match": { "content": "elasticsearch" } },
    "size": 1
  },
  "dest": {
    #### destination index
    "index": "blog_fix",
    #### with op_type create, documents that already exist in the destination raise a conflict and only missing documents are added
    #### without it, existing documents are overwritten; documents that exist only in the destination are kept
    "op_type": "create"
  }
}

GET blog_fix/_doc/1

PUT blog_fix/_doc/3
{ "content": "Elasticsearch rocks copy1", "keyword": "elasticsearch copy1" }

DELETE blog_fix/_doc/1

POST blog_fix/_search
{
  "size": 0,
  "aggs": {
    "blog_keyword": { "terms": { "field": "keyword", "size": 10 } }
  }
}

POST blog_fix/_search
{}
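An ingest pipeline acts as a pipe that processes documents while they are being indexed. For example, a field holding es,hadoop can be split by a pipeline with a split processor: specify the pipeline when indexing and the incoming data is converted automatically. Existing data can also be rebuilt by running it through a pipeline.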
### Using a pipeline
DELETE tech_blogs
PUT tech_blogs/_doc/1
{
  "title": "Introducing big data...",
  "tags": "hadoop,elasticsearch,spark",
  "content": "You know, for big data"
}

GET tech_blogs/_doc/1

#### Simulate the pipeline to see its effect on the fields
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "to split blog tags",
    "processors": [
      {
        // split the field
        "split": { "field": "tags", "separator": "," }
      },
      {
        // add a field
        "set": { "field": "view", "value": "0" }
      }
    ]
  },
  "docs": [
    { "_source": { "tags": "hadoop,elasticsearch,spark" } }
  ]
}

// Define a stored pipeline; each processor is its own object in the array
PUT _ingest/pipeline/blog_pipleline
{
  "processors": [
    { "split": { "field": "tags", "separator": "," } },
    { "set": { "field": "view", "value": "0" } }
  ]
}

GET _ingest/pipeline/blog_pipleline

// Simulate the stored blog_pipleline against sample documents
POST _ingest/pipeline/blog_pipleline/_simulate
{
  "docs": [
    { "_source": { "tags": "hadoop,elasticsearch,spark" } }
  ]
}

POST tech_blogs/_doc/2?pipeline=blog_pipleline
{
  "title": "Introducing cloud computering",
  "tags": "openstacks, k8s",
  "content": "You know, for cloud"
}

POST tech_blogs/_doc/3
{
  "title": "Introducing cloud computering",
  "tags": "openstacks, k8s",
  "content": "You know, for cloud"
}

POST tech_blogs/_search
{}

// Running this reports errors for documents that already went through blog_pipleline, although the other documents are still updated
POST tech_blogs/_update_by_query?pipeline=blog_pipleline
{}

// Restricting the update to documents that do not have the view field yet updates everything cleanly
POST tech_blogs/_update_by_query?pipeline=blog_pipleline
{
  "query": {
    "bool": {
      "must_not": [
        { "exists": { "field": "view" } }
      ]
    }
  }
}
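The effect of the split and set processors can be imitated in Python. This is a sketch with invented helper names, not the ingest API; it also shows why running the pipeline a second time on an already processed document fails: tags is then a list, not a string.

```python
def split_processor(field, separator):
    """Build a processor that splits a string field into a list."""
    def run(doc):
        doc[field] = doc[field].split(separator)
        return doc
    return run


def set_processor(field, value):
    """Build a processor that sets a field to a fixed value."""
    def run(doc):
        doc[field] = value
        return doc
    return run


def run_pipeline(processors, source):
    """Apply the processors in order to a copy of the source document."""
    doc = dict(source)
    for processor in processors:
        doc = processor(doc)
    return doc


BLOG_PIPELINE = [split_processor("tags", ","), set_processor("view", "0")]

if __name__ == "__main__":
    once = run_pipeline(BLOG_PIPELINE, {"tags": "hadoop,elasticsearch,spark"})
    print(once)
    try:
        run_pipeline(BLOG_PIPELINE, once)  # tags is already a list
    except AttributeError as err:
        print("already processed document fails:", err)
```

This mirrors the reason for the must_not exists filter in the last request: only documents that have not been through the pipeline should be sent through it.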