Understanding facet and facet.pivot in Solr (two articles merged and kept for reference)

Facet ['fæsɪt] is hard to translate and is best understood through examples. Solr's author, Yonik Seeley, also offers more direct names for it: Guided Navigation and Parametric Search.

[Image: faceted search example]

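Above is a fairly direct Faceted Search example: brand (品牌), product feature (產品特性), and seller (賣家) are all Facets. Brands such as Apple and Lenovo are Facet values, also called Constraints, and the count attached to each Facet value is the Facet count, or Constraint count.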

2. Using facets

q = 超級本 // "ultrabook"
facet = true 
facet.field = 產品特性 
facet.field = 品牌 
facet.field = 賣家

http://…/select?q=超級本&facet=true&wt=json

&facet.field=品牌&facet.field=產品特性&facet.field=賣家
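The request above is just a query string in which facet.field is repeated once per facet. Here is a minimal sketch of assembling it by hand, using only the JDK. The class name FacetUrlSketch, the method names, and the ASCII field names (brand, feature, seller, standing in for the article's field names) are all hypothetical; none of this is Solr or SolrJ API.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Arrays;
import java.util.List;

// Sketch only: builds the query string of a faceted /select request.
// Class, method, and field names here are hypothetical.
class FacetUrlSketch {

    // URL-encodes a single parameter value as UTF-8.
    static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Emits q, facet=true, wt=json, then one facet.field per requested field.
    static String build(String q, List<String> facetFields) {
        StringBuilder sb = new StringBuilder();
        sb.append("q=").append(enc(q)).append("&facet=true&wt=json");
        for (String field : facetFields) {
            sb.append("&facet.field=").append(enc(field));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("ultrabook", Arrays.asList("brand", "feature", "seller")));
    }
}
```

Encoding each value separately matters once the query or field names contain non-ASCII characters or spaces, as they do in this article.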

You can also submit a filter condition by setting fq (filter query), here on price (價格).

q = 超級本 
facet = true 
fq = 價格:[8000 TO *] 
facet.mincount = 1 // once fq has filtered documents out, some facet values would otherwise be listed with a count of 0 
facet.field = 產品特性 
facet.field = 品牌 
facet.field = 賣家

http://…/select?q=超級本&facet=true&wt=json

&fq=價格:[8000 TO *]&facet.mincount=1

&facet.field=品牌&facet.field=產品特性&facet.field=賣家

"facet_counts": {
"facet_fields": {
  "品牌": [
    "Apple", 4,
    "Lenovo", 39
      …]
  "產品特性": [
    "顯卡", 42,
    "酷睿", 38
      …]
 
  …}}
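In the JSON response each facet field arrives as a flat list that alternates value and count. A small sketch of turning that list into an ordered map follows. It assumes the list has already been parsed out of the JSON by whatever parser you use; FacetFieldParser and toCounts are hypothetical names.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: converts [value1, count1, value2, count2, ...] into a map.
class FacetFieldParser {

    // Keeps the order in which Solr returned the values.
    static Map<String, Integer> toCounts(List<?> flat) {
        if (flat.size() % 2 != 0) {
            throw new IllegalArgumentException("expected value/count pairs");
        }
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int i = 0; i < flat.size(); i += 2) {
            String value = String.valueOf(flat.get(i));
            int count = ((Number) flat.get(i + 1)).intValue();
            counts.put(value, count);
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints {Apple=4, Lenovo=39}
        System.out.println(toCounts(Arrays.asList("Apple", 4, "Lenovo", 39)));
    }
}
```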

If the user picks the Apple category, the query needs one more fq condition, and the facet.field that Apple belongs to (品牌) is removed.

http://…/select?q=超級本&facet=true&wt=json

&fq=價格:[8000 TO *]&fq=品牌:Apple&facet.mincount=1

&facet.field=產品特性&facet.field=賣家
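The drill-down step can be modelled as a tiny piece of state: selecting a value adds an fq and stops faceting on that field. A rough sketch, with hypothetical names (DrillDownSketch, select, toParams) and ASCII field names standing in for the article's:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: tracks which filters are active and which fields are still faceted.
class DrillDownSketch {
    final List<String> filterQueries = new ArrayList<>();
    final List<String> facetFields = new ArrayList<>();

    DrillDownSketch(List<String> facetFields) {
        this.facetFields.addAll(facetFields);
    }

    // The user clicked `value` under `field`: filter on it, stop faceting on it.
    void select(String field, String value) {
        filterQueries.add(field + ":" + value);
        facetFields.remove(field);
    }

    // Renders the parameters unencoded for readability; a real request
    // would URL-encode each value and quote values containing spaces.
    String toParams() {
        StringBuilder sb = new StringBuilder("facet=true&facet.mincount=1");
        for (String fq : filterQueries) {
            sb.append("&fq=").append(fq);
        }
        for (String field : facetFields) {
            sb.append("&facet.field=").append(field);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        DrillDownSketch state = new DrillDownSketch(Arrays.asList("brand", "feature", "seller"));
        state.select("brand", "Apple");
        System.out.println(state.toParams());
    }
}
```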

3. Facet parameters

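facet.prefix  –   only return constraints that start with the given prefix

facet.mincount=0 –  minimum count a constraint must have to be returned; defaults to 0

facet.sort=count –  sort order, either by count or by index

facet.offset=0  –   offset into the constraint list under the current sort order; can be used for paging

facet.limit=100 –  number of constraints to return

facet.missing=false –  whether to also return a count for documents that have no value in the field

facet.date –  Deprecated, use facet.range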
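To make the meaning of these parameters concrete, here is a local simulation that applies prefix, mincount, sort, offset, and limit to an in-memory map of counts. It only illustrates the semantics and is not how Solr implements them. The class FacetParamsSketch is hypothetical, the tie-break by term for equal counts is my own assumption, and limit is assumed to be non-negative.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: simulates facet.prefix / mincount / sort / offset / limit locally.
class FacetParamsSketch {

    static List<Map.Entry<String, Integer>> apply(Map<String, Integer> counts,
            String prefix, int minCount, boolean sortByCount, int offset, int limit) {
        List<Map.Entry<String, Integer>> rows = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            // facet.prefix and facet.mincount act as filters
            if (e.getKey().startsWith(prefix) && e.getValue() >= minCount) {
                rows.add(e);
            }
        }
        Comparator<Map.Entry<String, Integer>> byIndex = Map.Entry.comparingByKey();
        if (sortByCount) {
            // highest count first; ties broken by term (an assumption of this sketch)
            Comparator<Map.Entry<String, Integer>> byCountDesc =
                    (a, b) -> Integer.compare(b.getValue(), a.getValue());
            rows.sort(byCountDesc.thenComparing(byIndex));
        } else {
            rows.sort(byIndex);
        }
        // facet.offset and facet.limit select one page of the sorted list
        int from = Math.min(offset, rows.size());
        int to = Math.min(from + limit, rows.size());
        return new ArrayList<>(rows.subList(from, to));
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("Apple", 4);
        counts.put("Lenovo", 39);
        counts.put("Acer", 0);
        counts.put("Asus", 7);
        // prefix "A", mincount 1, sorted by count: Asus first, then Apple
        System.out.println(apply(counts, "A", 1, true, 0, 100));
    }
}
```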

facet.query

Specifies a query string to be used as a Facet Constraint

facet.query = rank:[* TO 20]

facet.query = rank:[21 TO *]

"facet_counts": {
"facet_fields": {
  "品牌": [
    "Apple", 4,
    "Lenovo", 10
      …]
  "產品特性": [
    "顯卡", 11,
    "酷睿", 20
      …]
 
  …}}
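Conceptually, each facet.query is an arbitrary condition whose matching documents are counted separately. The sketch below mimics the two rank queries above with plain Java predicates over a list of made-up rank values. FacetQuerySketch is a hypothetical name and the data is invented purely for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntPredicate;

// Sketch only: counts how many values satisfy each named condition.
class FacetQuerySketch {

    static Map<String, Integer> count(List<Integer> ranks, Map<String, IntPredicate> queries) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, IntPredicate> q : queries.entrySet()) {
            int n = 0;
            for (int rank : ranks) {
                if (q.getValue().test(rank)) {
                    n++;
                }
            }
            result.put(q.getKey(), n);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, IntPredicate> queries = new LinkedHashMap<>();
        queries.put("rank:[* TO 20]", r -> r <= 20); // square brackets are inclusive
        queries.put("rank:[21 TO *]", r -> r >= 21);
        // prints {rank:[* TO 20]=2, rank:[21 TO *]=3}
        System.out.println(count(Arrays.asList(5, 20, 21, 40, 99), queries));
    }
}
```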

facet.range

http://…/select?&facet=true

&facet.range=price

&facet.range.start=5000

&facet.range.end=8000

&facet.range.gap=1000

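The counts for the two facet.query parameters shown earlier come back under facet_queries (XML response format):

<result numFound="27" ... />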
 ...
 <lst name="facet_counts">
 <lst name="facet_queries">
   <int name="rank:[* TO 20]">2</int>
   <int name="rank:[21 TO *]">15</int>
 </lst>
...

WARNING:  each range is left-closed and right-open: [start, end)
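The half-open bucketing is easy to get wrong at the boundaries, so here is a small local sketch of it with start=5000, end=8000, gap=1000. It assumes (end - start) is an exact multiple of gap and ignores Solr's other range options. FacetRangeSketch and the sample prices are made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: counts values per bucket [lower, lower + gap).
class FacetRangeSketch {

    // Assumes (end - start) is a multiple of gap.
    static Map<Integer, Integer> bucket(List<Integer> values, int start, int end, int gap) {
        if (gap <= 0) {
            throw new IllegalArgumentException("gap must be positive");
        }
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        for (int lower = start; lower < end; lower += gap) {
            counts.put(lower, 0); // empty buckets are still listed
        }
        for (int v : values) {
            if (v < start || v >= end) {
                continue; // end itself is excluded
            }
            int lower = start + ((v - start) / gap) * gap;
            counts.put(lower, counts.get(lower) + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Integer> prices = Arrays.asList(4999, 5000, 5999, 6000, 7999, 8000);
        // prints {5000=2, 6000=1, 7000=1}
        System.out.println(bucket(prices, 5000, 8000, 1000));
    }
}
```

Note how 6000 lands in the 6000 bucket, not the 5000 one, and 8000 is not counted at all.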

facet.pivot

This is a new feature in Solr 4.0. Pivot is just as hard to explain as facet, so once again an example works best.

Syntax:  facet.pivot=field1,field2,field3...

e.g.  facet.pivot=comment_user,grade

                 #docs   #docs grade:好 (good)   #docs grade:中 (medium)   #docs grade:差 (bad)
comment_user:1   10      8                       1                         1
comment_user:2   20      18                      2                         0
comment_user:3   15      12                      2                         1
comment_user:4   18      15                      2                         1

"facet_counts":{
"facet_pivot":{
 "comment_user,grade":[{
   "field":"comment_user",
   "value":"1",
   "count":10,
   "pivot":[{
     "field":"grade",
     "value":"好",
     "count":8}, {
     "field":"grade",
     "value":"中",
     "count":1}, {
     "field":"grade",
     "value":"差",
     "count":1}]
   }, {
     "field":"comment_user",
     "value":"2",
     "count":20,
     "pivot":[{
      …

Without the pivot mechanism, getting the same result would probably take several queries:

http://...q=comment&fq=grade:好&facet=true&facet.field=comment_user

http://...q=comment&fq=grade:中&facet=true&facet.field=comment_user

http://...q=comment&fq=grade:差&facet=true&facet.field=comment_user

Facet.pivot -  Computes a Matrix of Constraint Counts across multiple Facet Fields. by Yonik Seeley.

That explanation is excellent, though it is easier to understand than to translate.
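The "matrix of constraint counts" idea can be reproduced locally in a few lines: group by the first field, then count by the second. A minimal sketch over in-memory documents follows. PivotSketch is a hypothetical name, the documents are invented, and English grade labels are used in place of the article's values.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Sketch only: builds a comment_user -> grade -> count matrix.
class PivotSketch {

    // Each document is a {comment_user, grade} pair.
    static Map<String, Map<String, Integer>> pivot(List<String[]> docs) {
        Map<String, Map<String, Integer>> matrix = new TreeMap<>();
        for (String[] doc : docs) {
            matrix.computeIfAbsent(doc[0], k -> new TreeMap<>())
                  .merge(doc[1], 1, Integer::sum);
        }
        return matrix;
    }

    public static void main(String[] args) {
        List<String[]> docs = Arrays.asList(
                new String[] {"1", "good"},
                new String[] {"1", "good"},
                new String[] {"1", "bad"},
                new String[] {"2", "good"},
                new String[] {"2", "medium"});
        // prints {1={bad=1, good=2}, 2={good=1, medium=1}}
        System.out.println(pivot(docs));
    }
}
```

Solr does this grouping in a single request, which is the whole point of facet.pivot compared with issuing one filtered query per grade.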

 

 

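My own understanding of facet.pivot is that it is a grouped query over several dimensions. Below is code from my own project that counts documents along two dimensions, newsType and property: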

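// Imports assumed for SolrJ 4.x (the original listing omitted them);
// the enclosing class is left out, as in the original.
import java.util.ArrayList;
import java.util.List;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.PivotField;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.util.NamedList;

public List<ReportNewsTypeDTO> queryNewsType(
            ReportQuery reportQuery) {
        // SolrServer here is the project's own holder class, not SolrJ's SolrServer.
        HttpSolrServer solrServer = SolrServer.getInstance().getServer();
        SolrQuery sQuery = new SolrQuery();
        List<ReportNewsTypeDTO> list = new ArrayList<ReportNewsTypeDTO>();
        try {
            String para = this.initReportQueryPara(reportQuery, 0);
            sQuery.setFacet(true);
            sQuery.add("facet.pivot", "newsType,property"); // group by these two dimensions
            sQuery.setQuery(para);
            QueryResponse response = solrServer.query(sQuery, SolrRequest.METHOD.POST);
            NamedList<List<PivotField>> namedList = response.getFacetPivot();
            // Print this value to see why the null checks below are needed.
            System.out.println(namedList);
            if (namedList != null) {
                for (int i = 0; i < namedList.size(); i++) {
                    List<PivotField> pivotList = namedList.getVal(i);
                    if (pivotList != null) {
                        for (PivotField pivot : pivotList) {
                            // First level: one entry per newsType value.
                            ReportNewsTypeDTO dto = new ReportNewsTypeDTO();
                            dto.setNewsTypeId((Integer) pivot.getValue());
                            dto.setNewsTypeName(News.newsTypeMap.get((Integer) pivot.getValue()));
                            int pos = 0;
                            int neg = 0;
                            // Second level: counts per property value under this newsType.
                            List<PivotField> fieldList = pivot.getPivot();
                            if (fieldList != null) {
                                for (PivotField field : fieldList) {
                                    int proValue = (Integer) field.getValue();
                                    int count = field.getCount();
                                    if (proValue == 1) {
                                        pos = count;
                                    } else {
                                        neg = count;
                                    }
                                }
                            }
                            dto.setPositiveCount(pos);
                            dto.setNegativeCount(neg);
                            list.add(dto);
                        }
                    }
                }
            }
        } catch (SolrServerException e) {
            log.error("Solr query failed", e);
        } finally {
            // Kept from the original; if getInstance() hands out a shared server,
            // shutting it down here would break later calls.
            solrServer.shutdown();
            solrServer = null;
        }
        return list;
    }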
Printed value of namedList:
{newsType,property=
[
newsType:8 [4260] [property:1 [3698] null, property:0 [562] null], 
newsType:1 [1507] [property:1 [1389] null, property:0 [118] null], 
newsType:2 [1054] [property:1 [909] null, property:0 [145] null], 
newsType:6 [715] [property:1 [581] null, property:0 [134] null], 
newsType:4 [675] [property:1 [466] null, property:0 [209] null], 
newsType:3 [486] [property:1 [397] null, property:0 [89] null], 
newsType:7 [458] [property:1 [395] null, property:0 [63] null], 
newsType:5 [289] [property:1 [263] null, property:0 [26] null], 
newsType:9 [143] [property:1 [138] null, property:0 [5] null]
]
}
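That should make it clear. While writing this, it occurred to me that any grouped count, whether over one dimension or two, can be computed with facet.pivot. A handy feature.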