A Roundup of Elasticsearch Queries

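A record of the ES queries I have used recently. I think I have run into most of the common ones; I'll find time this afternoon to update this.

I put it off, and kept putting it off, but I finally have some free time, so let's start writing...

ES is a document-based, non-relational database. Its full name is elasticsearch, and with "search" right there in the name you can guess that its retrieval capabilities are excellent. Today I want to write down the ES query and analysis features I have been using lately: basic queries first, then fuzzy queries, and finally aggregation queries.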

I. Simple queries

1. First, the Java API

SearchRequestBuilder searchRequestBuilder = client.prepareSearch(index);
RangeQueryBuilder rangequerybuilder = QueryBuilders
        .rangeQuery("date")
        .from(startTime).to(endTime);
TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery(key, value);
TermQueryBuilder termQueryBuilderNot = QueryBuilders.termQuery("status", "0");
searchRequestBuilder = searchRequestBuilder.setQuery(QueryBuilders.boolQuery().must(rangequerybuilder).must(termQueryBuilder).mustNot(termQueryBuilderNot))
        .setFrom(0).setSize(10);
SearchResponse searchResponse = searchRequestBuilder.execute().actionGet();

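The above is the most basic API for querying ES from Java.

client.prepareSearch(index); specifies the index. Note: in ES 6 each index can have only one type, so specifying a type at query time has little point, and ES supports querying without a type.

RangeQueryBuilder rangequerybuilder restricts a field to a range; here the date field must lie between startTime and endTime (both ends are included by default).

TermQueryBuilder termQueryBuilder matches documents whose field equals a given value exactly. When assembling the searchRequestBuilder, combine it with .must() or .mustNot().

searchRequestBuilder assembles the query conditions above into the final request.

.setFrom(0).setSize(10) means: start from hit number 0 of the query result and return 10 hits.

.execute().actionGet(); is the statement that actually sends the request to the ES cluster.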
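To make the semantics of the query above concrete, here is a minimal plain-Java sketch that models must / mustNot and from / size on an in-memory list. It does not use the Elasticsearch client at all; the class BoolQuerySketch, the helper doc(), and the document shape (a map with date, language and status keys) are assumptions made up for this illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Plain-Java model of bool(must range, must term, mustNot status=0) plus from/size.
// This is NOT the Elasticsearch API; all names here are hypothetical.
class BoolQuerySketch {

    // Hypothetical document: field name -> value.
    static Map<String, Object> doc(long date, String language, String status) {
        Map<String, Object> d = new HashMap<>();
        d.put("date", date);
        d.put("language", language);
        d.put("status", status);
        return d;
    }

    static List<Map<String, Object>> search(List<Map<String, Object>> docs,
                                            long startTime, long endTime,
                                            String key, Object value,
                                            int from, int size) {
        // rangeQuery("date").from(startTime).to(endTime): both ends inclusive
        Predicate<Map<String, Object>> range = d -> {
            Object date = d.get("date");
            return date instanceof Long && (Long) date >= startTime && (Long) date <= endTime;
        };
        // termQuery(key, value): exact match
        Predicate<Map<String, Object>> term = d -> value.equals(d.get(key));
        // termQuery("status", "0"), used under mustNot
        Predicate<Map<String, Object>> statusIsZero = d -> "0".equals(d.get("status"));

        return docs.stream()
                .filter(range.and(term).and(statusIsZero.negate())) // must, must, mustNot
                .skip(from)   // setFrom(from)
                .limit(size)  // setSize(size)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = Arrays.asList(
                doc(10L, "java", "1"),
                doc(20L, "java", "0"),   // excluded by mustNot
                doc(30L, "c", "1"),      // excluded by the term clause
                doc(40L, "java", "1"));
        System.out.println(search(docs, 0L, 100L, "language", "java", 0, 10));
    }
}
```

A real cluster also scores and sorts the hits; the sketch only shows which documents pass the filter and how from / size cut a window out of them.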

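Now that the basic query is clear, let's talk about fuzzy queries.

II. Fuzzy queries

What I used is wildcard matching. For example, the actual value is https://my.oschina.net/u/3796880/blog/write/3042734, but you want to find it by searching for https://my.oschina.net/u/3796880/blog/write, with whatever follows matched loosely. OK, let's start.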

WildcardQueryBuilder wildcardQueryBuilder = QueryBuilders.wildcardQuery(key, value);

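key is the field name in ES and value is "https://my.oschina.net/u/3796880/blog/write*"; you get the idea. The value "*oschina.net/u/3796880/blog*" finds it as well. In short it is pattern matching, though with wildcards (* for any run of characters, ? for a single character) rather than full regular expressions.

Prefix queries work the same way:

PrefixQueryBuilder prefixQueryBuilder = QueryBuilders.prefixQuery(key, value);

I haven't used it yet; anyone who wants to can look into it.

Speaking of which, I'm reminded of a pitfall I ran into myself.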
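The difference between a wildcard pattern and a regular expression is easy to trip over, so here is a small stand-alone sketch of how such a pattern can be interpreted. It is a plain-Java model, not Elasticsearch code: WildcardSketch, toRegex and matches are names invented for this example, and escape handling with backslashes is deliberately left out.

```java
import java.util.regex.Pattern;

// Plain-Java model of wildcard matching: '*' = any run of characters,
// '?' = exactly one character, everything else is taken literally.
// Hypothetical helper, not part of the Elasticsearch client.
class WildcardSketch {

    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*' || c == '?') {
                if (literal.length() > 0) {
                    // Quote literal text so that '.', '/', '+' etc. lose any regex meaning
                    regex.append(Pattern.quote(literal.toString()));
                    literal.setLength(0);
                }
                regex.append(c == '*' ? ".*" : ".");
            } else {
                literal.append(c);
            }
        }
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    // The pattern has to cover the whole term, not just a part of it.
    static boolean matches(String wildcard, String term) {
        return toRegex(wildcard).matcher(term).matches();
    }

    public static void main(String[] args) {
        String url = "https://my.oschina.net/u/3796880/blog/write/3042734";
        System.out.println(matches("https://my.oschina.net/u/3796880/blog/write*", url));
        System.out.println(matches("*oschina.net/u/3796880/blog*", url));
        System.out.println(matches("https://my.oschina.net/u/3796880/blog/write", url));
    }
}
```

Note that the pattern is matched against whole terms. On a keyword field the entire value is one term, which is what the URL example relies on; on an analyzed text field the value is split into tokens first, so a pattern spanning the whole URL would likely not match.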

TermsQueryBuilder termsQueryBuilder = QueryBuilders.termsQuery(key, values);
TermsQueryBuilder supports matching several values of one field. For example, to query the documents whose language is java, c, or python, you can write:

String[] values = {"java", "c", "python"};
TermsQueryBuilder termsQueryBuilder = QueryBuilders.termsQuery("language", values);

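Convenient, isn't it? But one day the requirement changed: the language field was no longer to be matched exactly but fuzzily. My first reaction was "so easy, just change wildcardQuery to wildcardsQuery", but... it turned out ES provides no such method, so in the end I had to add the patterns one by one.

But how to add them? They are OR-ed together, so naturally I wrote this: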

WildcardQueryBuilder wildcardQueryBuilder1 = QueryBuilders.wildcardQuery(key, value1);
WildcardQueryBuilder wildcardQueryBuilder2 = QueryBuilders.wildcardQuery(key, value2);
WildcardQueryBuilder wildcardQueryBuilder3 = QueryBuilders.wildcardQuery(key, value3);

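And finally (otherQueryBuilder stands for whatever other query builders you already have):

QueryBuilders.boolQuery().must(otherQueryBuilder).should(wildcardQueryBuilder1).should(wildcardQueryBuilder2).should(wildcardQueryBuilder3);

But this did not return the data I expected. The reason: once a bool query contains a must clause, its should clauses become optional by default (minimum_should_match is 0) and only influence scoring; they no longer filter anything. The fix is to put the should clauses into a bool query of their own and wrap that in a must:

BoolQueryBuilder queryBuilder = QueryBuilders.boolQuery().should(wildcardQueryBuilder1).should(wildcardQueryBuilder2).should(wildcardQueryBuilder3);
QueryBuilders.boolQuery().must(otherQueryBuilder).must(queryBuilder);

Problem solved!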
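The pitfall is easier to see in a toy model. The sketch below re-implements only the matching rule of a bool query in plain Java: should clauses are required (at least one) when there is no must clause, and optional when there is one. ShouldSemanticsSketch and its nested Bool class are invented for this illustration; scoring, filter clauses and an explicit minimum_should_match setting are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Toy model of how must and should interact in a bool query.
// Hypothetical classes; this is not the Elasticsearch API.
class ShouldSemanticsSketch {

    static final class Bool implements Predicate<String> {
        private final List<Predicate<String>> mustClauses = new ArrayList<>();
        private final List<Predicate<String>> shouldClauses = new ArrayList<>();

        Bool must(Predicate<String> clause) { mustClauses.add(clause); return this; }
        Bool should(Predicate<String> clause) { shouldClauses.add(clause); return this; }

        @Override
        public boolean test(String doc) {
            for (Predicate<String> clause : mustClauses) {
                if (!clause.test(doc)) return false;
            }
            // With a must clause present, should clauses are optional (scoring only).
            if (shouldClauses.isEmpty() || !mustClauses.isEmpty()) return true;
            // Without any must clause, at least one should clause has to match.
            for (Predicate<String> clause : shouldClauses) {
                if (clause.test(doc)) return true;
            }
            return false;
        }
    }

    public static void main(String[] args) {
        Predicate<String> other = doc -> !doc.isEmpty();         // stand-in for the other conditions
        Predicate<String> wildcard1 = doc -> doc.startsWith("ja"); // stand-in for "ja*"
        Predicate<String> wildcard2 = doc -> doc.startsWith("py"); // stand-in for "py*"

        Bool flat = new Bool().must(other).should(wildcard1).should(wildcard2);
        Bool nested = new Bool().must(other)
                .must(new Bool().should(wildcard1).should(wildcard2));

        System.out.println("flat matches golang:   " + flat.test("golang"));
        System.out.println("nested matches golang: " + nested.test("golang"));
        System.out.println("nested matches java:   " + nested.test("java"));
    }
}
```

In the flat form "golang" slips through because the should clauses do not filter; in the nested form the inner bool has no must clause, so one of its should clauses has to match.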

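III. Aggregation queries

1. Aggregating on a field

The most common requirement: count the documents in each group of a field, for example how many are male and how many are female, similar to GROUP BY in SQL.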

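AggregationBuilder aggregationBuilder = AggregationBuilders.terms("sex").field("sex").size(1000);

The first "sex" is the name you give to the aggregated result.

The second "sex", i.e. .field("sex"), is the name of the field that actually exists in ES and that you want to group by.

1000 is the maximum number of buckets this query will return. In this example there should be at most 3: male, female, or empty. So changing the 1000 to 3 here would make no difference. There is one bucket per group.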
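For readers who think in SQL or Java streams, a terms aggregation behaves roughly like a group-by-and-count that keeps only the largest size groups. The following is a plain-Java sketch of that idea on an in-memory list, not Elasticsearch code; TermsAggSketch and countBy are names made up for the example, and null entries stand in for documents that lack the field (the sketch skips them).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Plain-Java analogue of terms("sex").field("sex").size(n): group, count, keep the top n.
// Hypothetical helper, not the Elasticsearch client.
class TermsAggSketch {

    static Map<String, Long> countBy(List<String> values, int size) {
        // One bucket per distinct value; entries without a value are skipped in this sketch
        Map<String, Long> counts = values.stream()
                .filter(Objects::nonNull)
                .collect(Collectors.groupingBy(v -> v, TreeMap::new, Collectors.counting()));

        // Keep at most 'size' buckets, largest count first
        Map<String, Long> top = new LinkedHashMap<>();
        counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                .limit(size)
                .forEach(e -> top.put(e.getKey(), e.getValue()));
        return top;
    }

    public static void main(String[] args) {
        List<String> sexes = Arrays.asList("male", "female", "male", null, "male", "female");
        System.out.println(countBy(sexes, 1000));
        System.out.println(countBy(sexes, 1));
    }
}
```

With only two distinct values, any size of 2 or more gives the same buckets, which is the point made above about 1000 versus 3.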

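2. Simple calculations

Computing the average (aggName is a name you choose yourself, fieldName is the actual field name):
AggregationBuilders.avg(aggName).field(fieldName)
Computing the 50th, 60th, ... 95th and 99th percentiles of a field:
AggregationBuilders.percentiles(aggName).field(fieldName).percentiles(50.0, 60.0, 70.0, 80.0, 90.0, 95.0, 99.0);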
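In case "the 95th percentile" is unfamiliar, here is what the two calculations mean on a plain array. This is only a sketch of the definitions: it uses the exact nearest-rank rule, whereas the ES percentiles aggregation computes approximate values, so results on real data can differ slightly. MetricsSketch, average and percentile are names invented for the example.

```java
import java.util.Arrays;

// Exact average and nearest-rank percentile on an in-memory array.
// Illustrates the definitions only; ES itself uses an approximate algorithm.
class MetricsSketch {

    static double average(double[] values) {
        return Arrays.stream(values).average().orElse(Double.NaN);
    }

    // Nearest-rank rule: the smallest value such that at least p percent
    // of all values are less than or equal to it.
    static double percentile(double[] values, double p) {
        if (values.length == 0) return Double.NaN;
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        double[] durations = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        System.out.println("avg = " + average(durations));
        for (double p : new double[] {50.0, 90.0, 95.0, 99.0}) {
            System.out.println("p" + p + " = " + percentile(durations, p));
        }
    }
}
```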
Grouping a field's values by size

For example, to count how many wages fall into 0-100, 100-150, 150-200, and above 200:

RangeAggregationBuilder rangeQueryBuilder1 = AggregationBuilders.range("least").addRange(0, 100).field("wages");
RangeAggregationBuilder rangeQueryBuilder2 = AggregationBuilders.range("less").addRange(100, 150).field("wages");
RangeAggregationBuilder rangeQueryBuilder3 = AggregationBuilders.range("many").addRange(150, 200).field("wages");
RangeAggregationBuilder rangeQueryBuilder4 = AggregationBuilders.range("much").addRange(200, 10000000000d).field("wages");

Then add the aggregations to the searchRequestBuilder:

SearchResponse searchResponse = searchRequestBuilder.addAggregation(rangeQueryBuilder1)
                                                    .addAggregation(rangeQueryBuilder2)
                                                    .addAggregation(rangeQueryBuilder3)
                                                    .addAggregation(rangeQueryBuilder4)
                                                    .execute().actionGet();

There is one inconvenience here: "above 200" is hard to bound. Writing it as [200, 10000000000] will most likely give the right result, but if there really is someone whose wage is super high, then you have, well, ha ha, left them out. What to do?

200 and above: RangeAggregationBuilder rangeQueryBuilder4 = AggregationBuilders.range("much").addUnboundedFrom(200).field("wages");
Below 100: RangeAggregationBuilder rangeQueryBuilder1 = AggregationBuilders.range("least").addUnboundedTo(100).field("wages");
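The bucket boundaries are worth spelling out: in a range aggregation each bucket includes its from value and excludes its to value. The sketch below counts wages into such half-open buckets in plain Java and shows why an artificial upper bound can lose a value while an unbounded bucket cannot. It is a model of the bucketing rule only; RangeAggSketch and count are hypothetical names, and infinity is used here to stand in for a missing bound.

```java
import java.util.Arrays;

// Half-open range buckets [from, to): 'from' is included, 'to' is excluded.
// Plain-Java model of the bucketing rule, not the Elasticsearch client.
class RangeAggSketch {

    static long count(double[] values, double from, double to) {
        return Arrays.stream(values).filter(v -> v >= from && v < to).count();
    }

    public static void main(String[] args) {
        double[] wages = {50, 100, 120, 150, 199, 200, 1e11};

        System.out.println("least [0, 100):   " + count(wages, 0, 100));
        System.out.println("less  [100, 150): " + count(wages, 100, 150));
        System.out.println("many  [150, 200): " + count(wages, 150, 200));

        // An artificial upper bound misses the very high wage of 1e11 ...
        System.out.println("much  [200, 1e10): " + count(wages, 200, 1e10));
        // ... while an unbounded bucket (the addUnboundedFrom idea) catches it.
        System.out.println("much  [200, inf):  " + count(wages, 200, Double.POSITIVE_INFINITY));
        // The addUnboundedTo idea: everything below 100.
        System.out.println("least (-inf, 100): " + count(wages, Double.NEGATIVE_INFINITY, 100));
    }
}
```

Because the upper end is exclusive, a wage of exactly 100 lands in the 100-150 bucket and not in 0-100, so adjacent ranges never count a value twice.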
Grouping by time
Method 1:
AggregationBuilders.dateHistogram(aggName).field(fieldName).dateHistogramInterval(DateHistogramInterval.SECOND); groups by second.
There are also DateHistogramInterval.MINUTE, DateHistogramInterval.HOUR, and DateHistogramInterval.DAY.
Method 2:
AggregationBuilders.dateHistogram(aggName).field(fieldName).interval(intervalMillis); where the interval is given in milliseconds.
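Grouping by a fixed interval in milliseconds boils down to rounding each timestamp down to a multiple of the interval and counting per rounded key. Here is a plain-Java sketch of that rule, under the assumption of epoch-millisecond timestamps and no time zone or offset handling; DateHistogramSketch and histogram are names made up for the example, and the sketch only lists buckets that contain at least one value.

```java
import java.util.Map;
import java.util.TreeMap;

// Fixed-interval time bucketing: key = timestamp rounded down to a multiple of the interval.
// Plain-Java model only; time zones and offsets are ignored.
class DateHistogramSketch {

    static Map<Long, Long> histogram(long[] timestampsMillis, long intervalMillis) {
        Map<Long, Long> buckets = new TreeMap<>();
        for (long t : timestampsMillis) {
            // floorDiv keeps the rounding correct for timestamps before the epoch too
            long key = Math.floorDiv(t, intervalMillis) * intervalMillis;
            buckets.merge(key, 1L, Long::sum);
        }
        return buckets;
    }

    public static void main(String[] args) {
        long[] timestamps = {0L, 999L, 1000L, 1500L, 61000L};
        System.out.println("per second: " + histogram(timestamps, 1000L));
        System.out.println("per minute: " + histogram(timestamps, 60_000L));
    }
}
```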

3. Nested aggregations

AggregationBuilder firstAggregationBuilder = AggregationBuilders.terms(agg1).field(agg1).size(1000);
AvgAggregationBuilder secondAggregationBuilder = AggregationBuilders.avg(agg2).field(agg2);
firstAggregationBuilder.subAggregation(secondAggregationBuilder);
Finally, call searchRequestBuilder.addAggregation(firstAggregationBuilder) and you're done.
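A terms aggregation with an avg sub-aggregation computes, for every group of the first field, the average of the second field. In plain Java streams the same shape looks like the sketch below. It runs on an in-memory list rather than on ES; SubAggSketch, Row and averageByGroup are hypothetical names, with group playing the role of agg1 and value the role of agg2.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Plain-Java analogue of terms(agg1).subAggregation(avg(agg2)):
// one bucket per group, and inside each bucket the average of a second field.
class SubAggSketch {

    // Hypothetical row: 'group' stands for the agg1 field, 'value' for the agg2 field.
    static final class Row {
        final String group;
        final double value;

        Row(String group, double value) {
            this.group = group;
            this.value = value;
        }
    }

    static Map<String, Double> averageByGroup(List<Row> rows) {
        return rows.stream().collect(
                Collectors.groupingBy(r -> r.group, TreeMap::new,
                        Collectors.averagingDouble(r -> r.value)));
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("male", 100),
                new Row("male", 200),
                new Row("female", 300));
        System.out.println(averageByGroup(rows));
    }
}
```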

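IV. Ha ha ha

1. If anything here is wrong or unclear, send me a private message or leave a comment; either works.

2. If you have a query requirement you can't implement, post it and we'll all think it over together. Done!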
