Document metadata
Query the cluster name
⇒ curl -XGET 'http://localhost:9200'
Query the cluster health
⇒ curl -XGET 'http://localhost:9200/_cluster/health?format=yaml'
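Meaning of the status field:
- green: everything is fine
- yellow: replica shards are not allocated (possibly because there is only a single node); the cluster still works
- red: some data cannot be retrieved
format=yaml requests YAML output, which is easier to read
List all indices in the cluster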
⇒ curl -XGET 'http://localhost:9200/_cat/indices'
Fields of an index
⇒ curl -XGET 'http://localhost:9200/mytest/_mapping?format=yaml'
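Result
mytest:
  mappings:
    external:
      properties:
        addre:
          type: "string"
        name:
          type: "string"
A mapping is similar to a database schema: it describes the fields or properties a document may have and the data type of each field. For non-string fields you usually only need to set type. String fields have two important attributes: index and analyzer.
index
1. analyzed: index the field as full text; the string is analyzed first and then indexed
2. not_analyzed: index the exact value without analyzing it
3. no: the field cannot be searched
analyzer
Splits the text into individual terms suitable for an inverted index, then normalizes the terms to improve searchability
Dynamic mapping: when a document contains a field that has never been seen before, the data type is determined dynamically and the new field is added to the type mapping automatically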
Create an index
⇒ curl -XPUT 'localhost:9200/mytest'
Delete an index
⇒ curl -XDELETE 'localhost:9200/mytest?format=yaml'
Insert a single document
⇒ curl -XPUT 'localhost:9200/mytest/external/1?format=yaml' -d '{ "name":"paxi"}'
Get a single document
⇒ curl -XGET 'localhost:9200/mytest/external/1?format=yaml'
Delete a single document
curl -XDELETE 'localhost:9200/mytest/external/3?format=yaml'
curl -XGET 'localhost:9200/_analyze?format=yaml' -d ' {"papa xixi write"}'
Result:
tokens:
- token: "papa"
  start_offset: 3
  end_offset: 7
  type: "<ALPHANUM>"
  position: 1
- token: "xixi"
  start_offset: 8
  end_offset: 12
  type: "<ALPHANUM>"
  position: 2
- token: "write"
  start_offset: 13
  end_offset: 18
  type: "<ALPHANUM>"
  position: 3
token is the term that is actually stored, and position is the term's position in the original text. As the output shows, the full text is split up and stored as separate terms.
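The token output above can be reproduced with a very small analyzer. The following is a minimal sketch, assuming the text is split on every character that is not a letter or digit and that terms are lowercased; it is not the Elasticsearch standard analyzer, and the names `SimpleAnalyzer` and `Token` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class SimpleAnalyzer {

    /** One emitted term with its character offsets and 1-based position. */
    static final class Token {
        final String term;
        final int startOffset;
        final int endOffset;
        final int position;

        Token(String term, int startOffset, int endOffset, int position) {
            this.term = term;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
            this.position = position;
        }

        @Override
        public String toString() {
            return term + " [" + startOffset + "," + endOffset + ") position=" + position;
        }
    }

    /** Splits on every non-alphanumeric character and lowercases each term. */
    static List<Token> analyze(String text) {
        List<Token> tokens = new ArrayList<>();
        int position = 0;
        int i = 0;
        while (i < text.length()) {
            if (!Character.isLetterOrDigit(text.charAt(i))) {
                i++; // skip separators such as spaces, braces and quotes
                continue;
            }
            int start = i;
            while (i < text.length() && Character.isLetterOrDigit(text.charAt(i))) {
                i++;
            }
            position++;
            tokens.add(new Token(text.substring(start, i).toLowerCase(Locale.ROOT), start, i, position));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Same raw body as the _analyze request, including the leading space, brace and quote.
        for (Token t : analyze(" {\"papa xixi write\"}")) {
            System.out.println(t);
        }
    }
}
```

Because the leading space, brace and quote are part of the analyzed body, the first term starts at offset 3, which agrees with the start_offset reported for "papa".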
curl -XGET 'localhost:9200/mytest/_search?filter_path=hits.hits._source&format=yaml' -d ' { "query":{"match":{"name":"papa xixi write"}}}'
This has no effect on older versions
curl -XGET 'localhost:9200/mytest/_search?format=yaml' -d ' { "query":{"match":{"name":"papa xixi write"}},"_source":["name"]}'
This is not effective on older versions; wildcards can be used
curl -XGET 'localhost:9200/mytest/_search?format=yaml' -d ' { "query":{"match":{"name":"papa xixi write"}}}'
The matching results are as follows
hits:
- _index: "mytest"
  _type: "external"
  _id: "11"
  _score: 0.6532502
  _source:
    name: "papa xixi write"
- _index: "mytest"
  _type: "external"
  _id: "4"
  _score: 0.22545706
  _source:
    name: "papa xixi"
- _index: "mytest"
  _type: "external"
  _id: "2"
  _score: 0.12845722
  _source:
    name: "papa"
- _index: "mytest"
  _type: "external"
  _id: "10"
  _score: 0.021688733
  _source:
    name: "xixi"
The query returned every document that contains papa, xixi, or write. In effect the query string is split into terms and the terms are combined with OR. To require all terms to match, use the AND operator
curl -XGET 'localhost:9200/mytest/_search?format=yaml' -d ' { "query":{"match":{"name":{"query":"papa xixi write","operator":"and"}}}}'
---
hits:
  total: 1
  max_score: 0.6532502
  hits:
  - _index: "mytest"
    _type: "external"
    _id: "11"
    _score: 0.6532502
    _source:
      name: "papa xixi write"
If you only want to improve precision
curl -XGET 'localhost:9200/mytest/_search?format=yaml' -d ' { "query":{"match":{"name":{"query":"papa xixi write","minimum_should_match":"75%"}}}}'
---
hits:
  total: 2
  max_score: 0.6532502
  hits:
  - _index: "mytest"
    _type: "external"
    _id: "11"
    _score: 0.6532502
    _source:
      name: "papa xixi write"
  - _index: "mytest"
    _type: "external"
    _id: "4"
    _score: 0.22545706
    _source:
      name: "papa xixi"
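The three variants of the match query (default OR, operator and, minimum_should_match) differ only in how many query terms a document has to contain. The following is a minimal sketch of that rule, assuming whitespace tokenization and a percentage that is rounded down; the class and method names are hypothetical, and scoring is ignored.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

final class MatchSemantics {

    /** Lowercases the text and splits it on whitespace into distinct terms. */
    static Set<String> terms(String text) {
        return new LinkedHashSet<>(Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+")));
    }

    /** True if the document contains at least `required` of the query terms. */
    static boolean matches(String docText, String query, int required) {
        Set<String> docTerms = terms(docText);
        int hits = 0;
        for (String term : terms(query)) {
            if (docTerms.contains(term)) {
                hits++;
            }
        }
        return hits >= required;
    }

    /** Converts a percentage into a term count, rounding down (75% of 3 terms is 2). */
    static int requiredFor(String query, int percent) {
        return Math.max(1, terms(query).size() * percent / 100);
    }

    public static void main(String[] args) {
        String query = "papa xixi write";
        String[] docs = {"papa xixi write", "papa xixi", "papa", "xixi"};
        int all = terms(query).size();
        for (String doc : docs) {
            System.out.println(doc
                    + " | or=" + matches(doc, query, 1)
                    + " and=" + matches(doc, query, all)
                    + " 75%=" + matches(doc, query, requiredFor(query, 75)));
        }
    }
}
```

With the four sample documents, OR accepts all of them, AND accepts only "papa xixi write", and 75% accepts the two documents that contain at least two of the three terms, in line with the totals shown in the results.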
curl -XGET 'localhost:9200/mytest/_search?format=yaml' -d ' { "query":{"term":{"name":"papa xixi write"}}}'
It finds nothing
total: 0
max_score: null
hits: []
With a different query
curl -XGET 'localhost:9200/mytest/_search?format=yaml' -d ' { "query":{"term":{"name":"papa"}}}'
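Result:
hits:
- _index: "mytest"
  _type: "external"
  _id: "2"
  _score: 1.0
  _source:
    name: "papa"
- _index: "mytest"
  _type: "external"
  _id: "4"
  _score: 0.37158427
  _source:
    name: "papa xixi"
- _index: "mytest"
  _type: "external"
  _id: "11"
  _score: 0.2972674
  _source:
    name: "papa xixi write"
match: when it queries a full-text field, the query string is analyzed with the correct analyzer; when it queries an exact-value field, it matches exactly. term: an exact match that does not analyze the query text; a document matches as long as it contains the given term (not_analyzed text is matched exactly; terms takes several values and matches if any one of them matches).
As the analysis of the stored text "papa xixi write" showed, the text itself is split into separate terms, so a term query for "papa xixi write" returns nothing, whereas match does find it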
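The difference between term and match becomes visible once the index is modeled as a map from term to document ids. The following is a minimal sketch, assuming whitespace tokenization with lowercasing; `InvertedIndexSketch` and its methods are hypothetical names, not Elasticsearch APIs.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class InvertedIndexSketch {
    // term -> ids of the documents that contain it
    private final Map<String, Set<String>> postings = new HashMap<>();

    /** Stand-in for an analyzer: lowercase, then split on whitespace. */
    static List<String> analyze(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));
    }

    /** Analyzes the text and records the document id under every resulting term. */
    void index(String id, String text) {
        for (String t : analyze(text)) {
            postings.computeIfAbsent(t, k -> new TreeSet<>()).add(id);
        }
    }

    /** term: looks the value up as-is, without analyzing it. */
    Set<String> term(String value) {
        return postings.getOrDefault(value, Collections.emptySet());
    }

    /** match (OR): analyzes the query and unions the postings of every term. */
    Set<String> match(String query) {
        Set<String> ids = new TreeSet<>();
        for (String t : analyze(query)) {
            ids.addAll(term(t));
        }
        return ids;
    }

    public static void main(String[] args) {
        InvertedIndexSketch idx = new InvertedIndexSketch();
        idx.index("2", "papa");
        idx.index("4", "papa xixi");
        idx.index("10", "xixi");
        idx.index("11", "papa xixi write");
        System.out.println("term(papa xixi write)  = " + idx.term("papa xixi write"));
        System.out.println("term(papa)             = " + idx.term("papa"));
        System.out.println("match(papa xixi write) = " + idx.match("papa xixi write"));
    }
}
```

No key in the map equals the whole string "papa xixi write", so the unanalyzed term lookup comes back empty, while the analyzed match lookup finds every document that shares a term with the query.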
curl -XGET 'localhost:9200/mytest/_search?format=yaml' -d ' { "query":{"filtered":{"filter":{"range":{"name":{"gt":"w"}}}}}}'
or
curl -XGET 'localhost:9200/mytest/_search?format=yaml' -d ' { "query":{"constant_score":{"filter":{"range":{"name":{"gt":"w"}}}}}}'
⇒ curl -XGET 'localhost:9200/_validate/query?explain&format=yaml' -d '{ "query":{{"filter":{"range":{"name":{"gt":"w"}}}}}'
---
valid: false
// reason omitted
Using a term query
curl -XGET 'localhost:9200/mytest/_search?format=yaml' -d ' { "query":{"term":{"addre":"beijing"}}}'
Result:
hits:
- _index: "mytest"
  _type: "external"
  _id: "5"
  _score: 0.30685282
  _source:
    addre: "beijing"
- _index: "mytest"
  _type: "external"
  _id: "6"
  _score: 0.30685282
  _source:
    addre: "beijing"
    name: "px"
Rewritten as a bool query, the result is the same
curl -XGET 'localhost:9200/mytest/_search?format=yaml' -d ' { "query":{"bool":{"must":{"match":{"addre":"beijing"}}}}}'
To get only the last document
curl -XGET 'localhost:9200/mytest/_search?format=yaml' -d ' { "query":{"bool":{"must":[{"match":{"addre":"beijing"}},{"match":{"name":"px"}}]}}}'
To get only the first document
curl -XGET 'localhost:9200/mytest/_search?format=yaml' -d ' { "query":{"bool":{"must":{"match":{"addre":"beijing"}},"must_not":{"match":{"name":"px"}}}}}'
To get both
curl -XGET 'localhost:9200/mytest/_search?format=yaml' -d ' { "query":{"bool":{"must":{"match":{"addre":"beijing"}},"should":{"match":{"name":"px"}}}}}'
must means the clause has to match, must_not means it must not match, and should means the document may or may not match it (documents that do match score higher)
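The way the three clause types combine can be written down as a plain predicate over a document. The following is a minimal sketch under simplifying assumptions: documents are flat string maps, a match is an exact string comparison, and should clauses only become mandatory when there is no must clause. `BoolQuerySketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

final class BoolQuerySketch {
    private final List<Predicate<Map<String, String>>> mustClauses = new ArrayList<>();
    private final List<Predicate<Map<String, String>>> mustNotClauses = new ArrayList<>();
    private final List<Predicate<Map<String, String>>> shouldClauses = new ArrayList<>();

    /** Simplified match: the field must exist and equal the value exactly. */
    static Predicate<Map<String, String>> match(String field, String value) {
        return doc -> value.equals(doc.get(field));
    }

    /** Builds a document from alternating field names and values. */
    static Map<String, String> doc(String... keyValues) {
        Map<String, String> d = new HashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            d.put(keyValues[i], keyValues[i + 1]);
        }
        return d;
    }

    BoolQuerySketch must(Predicate<Map<String, String>> p) { mustClauses.add(p); return this; }
    BoolQuerySketch mustNot(Predicate<Map<String, String>> p) { mustNotClauses.add(p); return this; }
    BoolQuerySketch should(Predicate<Map<String, String>> p) { shouldClauses.add(p); return this; }

    /** Number of should clauses the document satisfies; a stand-in for the score boost. */
    int shouldHits(Map<String, String> doc) {
        int hits = 0;
        for (Predicate<Map<String, String>> p : shouldClauses) {
            if (p.test(doc)) {
                hits++;
            }
        }
        return hits;
    }

    boolean matches(Map<String, String> doc) {
        for (Predicate<Map<String, String>> p : mustClauses) {
            if (!p.test(doc)) {
                return false; // every must clause has to match
            }
        }
        for (Predicate<Map<String, String>> p : mustNotClauses) {
            if (p.test(doc)) {
                return false; // no must_not clause may match
            }
        }
        // Assumption: without any must clause, at least one should clause has to match.
        if (mustClauses.isEmpty() && !shouldClauses.isEmpty()) {
            return shouldHits(doc) > 0;
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> doc5 = doc("addre", "beijing");
        Map<String, String> doc6 = doc("addre", "beijing", "name", "px");
        BoolQuerySketch both = new BoolQuerySketch()
                .must(match("addre", "beijing")).should(match("name", "px"));
        System.out.println(both.matches(doc5) + " " + both.matches(doc6));
    }
}
```

Applied to documents 5 and 6 from the result above, must plus must returns only document 6, must plus must_not returns only document 5, and must plus should returns both.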
/**
 * @param startTime   start time
 * @param endTime     end time
 * @param termAggName name of the terms aggregation
 * @param fieldName   field to count
 * @param top         number of buckets to return
 */
RangeQueryBuilder actionPeriod = QueryBuilders.rangeQuery("myTimeField")
        .gte(startTime).lte(endTime).format("epoch_second");
TermsBuilder termsBuilder = AggregationBuilders.terms(termAggName)
        .field(fieldName).size(top).order(Terms.Order.count(false));
return client.prepareSearch(INDICE)
        .setQuery(actionPeriod)
        .addAggregation(termsBuilder)
        .setSize(0)
        .execute().actionGet();
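order(Terms.Order.count(false)): sort the buckets by count in descending order
size(top): top is the number of sorted buckets to keep
prepareSearch(INDICE): INDICE is the name of the index
setSize(0): return only the aggregation results, no hits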
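What the terms aggregation computes can be mimicked on an in-memory list: group by value, count, sort by count in descending order, and keep the first `top` entries. The following is a minimal sketch using only the Java standard library; `TermsAggSketch` is a hypothetical name, and breaking ties by key is an assumption of this sketch rather than documented Elasticsearch behavior.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class TermsAggSketch {

    /** Counts each distinct value and returns the `top` most frequent ones, highest count first. */
    static List<Map.Entry<String, Long>> topTerms(List<String> values, int top) {
        Map<String, Long> counts = values.stream()
                .collect(Collectors.groupingBy(v -> v, Collectors.counting()));
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Long>comparingByKey()))
                .limit(top)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> actions = Arrays.asList("click", "view", "click", "buy", "view", "click");
        for (Map.Entry<String, Long> bucket : topTerms(actions, 2)) {
            System.out.println(bucket.getKey() + " -> " + bucket.getValue());
        }
    }
}
```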
To exclude certain values of a field
client is the ES client that was built beforehand
BoolQueryBuilder actionPeriodMustNot = QueryBuilders.boolQuery()
        .must(QueryBuilders.rangeQuery("myTimeField").gte(startTime).lte(endTime).format("epoch_second"))
        .mustNot(QueryBuilders.termQuery(field, value));
To restrict a single field to several specific values
// values is a List
BoolQueryBuilder actionPeriodMust = QueryBuilders.boolQuery()
        .must(QueryBuilders.rangeQuery("myTimeField").gte(startTime).lte(endTime).format("epoch_second"))
        .must(QueryBuilders.termsQuery(field, values));
Using the result
Terms clickCount = sr.getAggregations().get(termAggName);
for (Terms.Bucket term : clickCount.getBuckets()) {
    int key = term.getKeyAsNumber().intValue(); // value of the aggregated field
    long docCount = term.getDocCount();         // number of documents
}
BoolQueryBuilder actionPeriodMust = QueryBuilders.boolQuery()
        .must(QueryBuilders.rangeQuery("myTimeField").gte(startTime).lte(endTime).format("epoch_second"));
DateHistogramBuilder actionTimeInterval = AggregationBuilders.dateHistogram(dateNickName)
        .field("myTimeField").timeZone("Asia/Shanghai");
// Choose the bucket width and label format from the requested interval
if (timeInterval < MINUTE) {
    actionTimeInterval.interval(DateHistogramInterval.seconds(timeInterval)).format("HH:mm:ss");
} else if (timeInterval < HOUR) {
    actionTimeInterval.interval(DateHistogramInterval.minutes(timeInterval / MINUTE)).format("dd HH:mm");
} else if (timeInterval < DAY) {
    actionTimeInterval.interval(DateHistogramInterval.hours(timeInterval / HOUR)).format("HH:mm");
} else if (timeInterval < THIRTY_DAY) {
    actionTimeInterval.interval(DateHistogramInterval.days(timeInterval / DAY));
} else {
    actionTimeInterval.interval(DateHistogramInterval.MONTH);
}
actionTimeInterval.format("yyyy-MM-dd HH:mm:ssZ");
return client.prepareSearch(INDICE)
        .setQuery(actionPeriodMust)
        .addAggregation(actionTimeInterval)
        .setSize(0)
        .execute().actionGet();
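ES handles timestamps in UTC by default, so in China you need to set timeZone("Asia/Shanghai"). Java's SimpleDateFormat uses the time zone of the JVM by default, so when storing times it is best to store a time-zone-independent timestamp and localize it only for display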
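The advice to store a zone-independent timestamp and localize only for display can be illustrated with java.time. The following is a minimal sketch; it reuses the pattern yyyy-MM-dd HH:mm:ssZ from the aggregation code, and `TimeZoneSketch` is a hypothetical name.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

final class TimeZoneSketch {
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ssZ");

    /** Renders the same instant (seconds since the epoch) in the given time zone. */
    static String format(long epochSecond, String zoneId) {
        return Instant.ofEpochSecond(epochSecond).atZone(ZoneId.of(zoneId)).format(FORMAT);
    }

    public static void main(String[] args) {
        long stored = 1500000000L; // zone-independent value, as it would be stored
        System.out.println(format(stored, "UTC"));           // 2017-07-14 02:40:00+0000
        System.out.println(format(stored, "Asia/Shanghai")); // 2017-07-14 10:40:00+0800
    }
}
```

The stored value never changes; only the rendering differs by eight hours between the two zones.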
Using the result
Histogram histogram = sr.getAggregations().get(dateNickName);
for (Histogram.Bucket entry : histogram.getBuckets()) {
    String key = entry.getKeyAsString(); // time interval
    long count = entry.getDocCount();    // number of documents
}
This is equivalent to combining the two scenarios above
BoolQueryBuilder query = QueryBuilders.boolQuery()
        .must(QueryBuilders.rangeQuery("myTimeField").gte(startTime).lte(endTime).format("epoch_second"))
        .must(QueryBuilders.termsQuery("action", orderValue));
DateHistogramBuilder actionTimeInterval = AggregationBuilders.dateHistogram(dateNickName)
        .field("myTimeField").timeZone("Asia/Shanghai");
actionTimeInterval.subAggregation(AggregationBuilders.terms(termNickName).field("action").size(size));
return client.prepareSearch(INDICE)
        .setQuery(query)
        .addAggregation(actionTimeInterval)
        .setSize(0)
        .execute().actionGet();
Using the result
// The names must be the ones the aggregations were registered under above
Histogram histogram = sr.getAggregations().get(dateNickName);
for (Histogram.Bucket date : histogram.getBuckets()) {
    String intervalName = date.getKeyAsString();
    long timeIntervalCount = date.getDocCount();
    if (timeIntervalCount != 0) {
        Terms terms = date.getAggregations().get(termNickName);
        for (Terms.Bucket entry : terms.getBuckets()) {
            int key = entry.getKeyAsNumber().intValue();
            long childCount = entry.getDocCount();
        }
    }
}
BoolQueryBuilder actionPeriodMust = QueryBuilders.boolQuery()
        .must(QueryBuilders.termQuery(key, value))
        .must(QueryBuilders.rangeQuery("myTimeField").gte(startTime).lte(endTime).format("epoch_second"));
return client.prepareSearch(INDICE)
        .setQuery(actionPeriodMust)
        .addSort(SortBuilders.fieldSort("myTimeField").order(SortOrder.ASC))
        .setFrom(from)
        .setSize(size)
        .execute().actionGet();
Usage
Iterator<SearchHit> iterator = sr.getHits().iterator();
while (iterator.hasNext()) {
    SearchHit next = iterator.next();
    JSONObject jo = JSONObject.parseObject(next.getSourceAsString());
}
BoolQueryBuilder query = QueryBuilders.boolQuery()
        .must(QueryBuilders.rangeQuery("myTimeField").gte(startTimeInSec * 1000).lte(endTimeInSec * 1000).format("epoch_millis"));
// field is the field whose distinct values are counted
CardinalityBuilder fieldCardinality = AggregationBuilders.cardinality(cardinalityAggName).field(field);
return client.prepareSearch(INDICE)
        .setQuery(query)
        .addAggregation(fieldCardinality)
        .execute().actionGet();
Using the result
Cardinality cardinality = sr.getAggregations().get(cardinalityAggName);
long value = cardinality.getValue();
For example, to find documents whose addr is beijing and that also satisfy this condition: name is paxi, or phoneNumber is 1234567890
BoolQueryBuilder searchIdQuery = QueryBuilders.boolQuery();
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
// OR together one term query per field/value pair
while (kvs.hasNext()) {
    Map.Entry<String, String> fieldValue = kvs.next();
    String field = fieldValue.getKey();
    String value = fieldValue.getValue();
    searchIdQuery.should(QueryBuilders.termQuery(field, value));
}
// The OR group and the terms filter must both match
boolQueryBuilder.must(searchIdQuery);
boolQueryBuilder.must(QueryBuilders.termsQuery(key, values));
return client.prepareSearch(INDICE)
        .setQuery(boolQueryBuilder)
        .execute().actionGet();
Startup command:
./bin/logstash -f conf/test.conf
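Queries in the Kibana search box can use either Lucene query syntax or ES query statements
A query targets the field you name; otherwise it uses the default field
For example, an index has two fields, title and text, and text is the default field
title:"hello world" AND text:to is equivalent to title:"hello world" AND to
title:hello world, in contrast, searches for hello in title and for world in text
te?t matches text and test; ? stands for any single character
test* matches test, tests and tester; * stands for zero or more characters
? and * cannot be used as the first character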
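The wildcard rules can be checked by translating a pattern into a regular expression. The following is a minimal sketch that treats ? as exactly one character and * as zero or more, and rejects a leading wildcard as stated above; it only illustrates the matching rules and is not how Lucene evaluates wildcard queries internally. `WildcardSketch` is a hypothetical name.

```java
import java.util.regex.Pattern;

final class WildcardSketch {

    /** Translates ? and * into regex syntax and quotes every other character. */
    static Pattern toRegex(String wildcard) {
        if (wildcard.startsWith("?") || wildcard.startsWith("*")) {
            throw new IllegalArgumentException("a wildcard cannot be the first character");
        }
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '?') {
                regex.append('.');   // exactly one character
            } else if (c == '*') {
                regex.append(".*");  // zero or more characters
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    /** True if the whole term matches the wildcard pattern. */
    static boolean matches(String wildcard, String term) {
        return toRegex(wildcard).matcher(term).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("te?t", "text"));    // true
        System.out.println(matches("te?t", "tet"));     // false
        System.out.println(matches("test*", "tester")); // true
    }
}
```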
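roam~ matches foam and roams. Fuzzy matching is based on Levenshtein distance, and the tilde is appended to the end of the term. Starting with Lucene 1.9 a number can be appended to set the required similarity; the closer it is to 1, the more similar the terms must be, for example roam~0.8. The default is 0.5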
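The Levenshtein distance behind fuzzy matching is the minimum number of single-character insertions, deletions and substitutions needed to turn one string into another. The following is a minimal sketch of the standard dynamic-programming computation; how Lucene converts this distance into the similarity value after the tilde is not shown here. `LevenshteinSketch` is a hypothetical name.

```java
final class LevenshteinSketch {

    /** Classic two-row dynamic programming over the edit-distance table. */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int substitution = prev[j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1);
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                curr[j] = Math.min(substitution, Math.min(insertion, deletion));
            }
            int[] swap = prev;
            prev = curr;
            curr = swap;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("roam", "foam"));  // 1 (substitute r with f)
        System.out.println(distance("roam", "roams")); // 1 (insert s)
    }
}
```

Both foam and roams are one edit away from roam, which is why roam~ finds them.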
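"jakarta apache"~10 matches jakarta and apache when they occur within 10 words of each other
mode_date:[20020101 TO 20030101] matches dates between 20020101 and 20030101, including 20020101 and 20030101
title:{Aida TO Carmen} matches titles between Aida and Carmen, excluding Aida and Carmen
"[" means inclusive; "{" means exclusive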
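The difference between the two bracket styles comes down to whether the bounds themselves count as matches. The following is a minimal sketch, assuming plain lexicographic string comparison of the values; `RangeSketch` is a hypothetical name.

```java
final class RangeSketch {

    /**
     * inclusive = true models [lower TO upper]; inclusive = false models {lower TO upper}.
     * Values are compared lexicographically as strings.
     */
    static boolean inRange(String value, String lower, String upper, boolean inclusive) {
        int vsLower = value.compareTo(lower);
        int vsUpper = value.compareTo(upper);
        return inclusive
                ? vsLower >= 0 && vsUpper <= 0
                : vsLower > 0 && vsUpper < 0;
    }

    public static void main(String[] args) {
        System.out.println(inRange("20020101", "20020101", "20030101", true)); // true: bound included
        System.out.println(inRange("Aida", "Aida", "Carmen", false));          // false: bound excluded
        System.out.println(inRange("Bizet", "Aida", "Carmen", false));         // true: strictly between
    }
}
```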
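Operator keywords (AND, OR, TO) must be written in uppercase
(jakarta OR apache) AND website is a grouped query: it matches documents that contain website and either jakarta or apache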
\(1\+1\)\:2
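The line above appears to be the escaped form of (1+1):2, where every character that has a special meaning in the query syntax is preceded by a backslash. The following is a minimal hand-rolled sketch of such escaping, assuming the special characters are + - & | ! ( ) { } [ ] ^ " ~ * ? : \ and /; `QueryEscapeSketch` is a hypothetical name, and real code would more likely rely on the escaping helper that ships with the query parser.

```java
final class QueryEscapeSketch {
    // Characters assumed to be special in Lucene query syntax.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    /** Prefixes every special character with a backslash and leaves the rest untouched. */
    static String escape(String input) {
        StringBuilder escaped = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                escaped.append('\\');
            }
            escaped.append(c);
        }
        return escaped.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("(1+1):2")); // \(1\+1\)\:2
    }
}
```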
To use an ES query statement, enter the argument that follows -d in the ES command. For example, if the curl query is
curl -XGET 'localhost:9200/_search?format=yaml' -d ' { "query":{"term":{"addre":"beijing"}}}'
then the text to enter is: { "query":{"term":{"addre":"beijing"}}}