Elasticsearch from Basics to Advanced (9) Search Engine: Query DSL, filter vs. query, and hands-on query searches

Basic syntax of the search API

Syntax overview:

GET /_search
{}
GET /index1,index2/type1,type2/_search
{}
GET /_search
{
  "from": 0,
  "size": 10
}

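Can a GET request carry a request body in HTTP?

HTTP does not normally expect a GET request to carry a request body, but because GET is the better fit for describing a data-query operation, Elasticsearch uses it this way anyway.

Many HTTP clients and servers also support the GET + request body pattern.

If you run into an environment that does not support it, you can use POST /_search instead.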

GET /_search?from=0&size=10

POST /_search
{
  "from":0,
  "size":10
}
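
The two request forms above carry the same paging parameters. As a minimal sketch using only the Python standard library, the same from/size pair can be rendered either as a query string or as a JSON body. The helper names as_query_string and as_json_body are hypothetical and exist only for this illustration; no request is actually sent.

```python
import json
from urllib.parse import urlencode


def as_query_string(path, params):
    """Render the parameters in the URL, for clients that cannot send a GET body."""
    return f"{path}?{urlencode(params)}"


def as_json_body(params):
    """Render the same parameters as a JSON request body (GET or POST /_search)."""
    return json.dumps(params)


paging = {"from": 0, "size": 10}
url = as_query_string("/_search", paging)
body = as_json_body(paging)

print("GET", url)
print("POST /_search", body)
```

Either form asks for the first ten hits; which one to use depends only on what the client in front of Elasticsearch is able to send.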

query DSL

An example that shows what Query DSL is

GET /_search
{
    "query": {
        "match_all": {}
    }
}

Basic syntax of Query DSL

GET /{index}/{type}/_search
{
    "query conditions"
}

Example:

GET /test_index/test_type/_search 
{
  "query": {
    "match": {
      "test_field": "test"
    }
  }
}


{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 0.843298,
    "hits": [
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "6",
        "_score": 0.843298,
        "_source": {
          "test_field": "test test"
        }
      },
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "8",
        "_score": 0.43445712,
        "_source": {
          "test_field": "test client 2"
        }
      },
      {
        "_index": "test_index",
        "_type": "test_type",
        "_id": "7",
        "_score": 0.25316024,
        "_source": {
          "test_field": "test client 1"
        }
      }
    ]
  }
}

Combining multiple search conditions

Search requirement: title must contain elasticsearch, content may or may not contain elasticsearch, and author_id must not be 111

Test data:

PUT /website/article/1
{
  "title":"my elasticsearch article",
  "content":"es is very bad",
  "author_id":110
}

PUT /website/article/2
{
  "title":"my hadoop article",
  "content":"hadoop is very bad",
  "author_id":111
}

PUT /website/article/3
{
  "title":"my hadoop article",
  "content":"hadoop is very good",
  "author_id":111
}

Combined query:

GET /website/article/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "title": "elasticsearch"
          }
        }
      ],
      "should": [
        {
          "match": {
            "content": "elasticsearch"
          }
        }
      ],
      "must_not": [
        {
          "match": {
            "author_id": 111
          }
        }
      ]
    }
  }
}

Query result:

{
  "took": 7,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.25316024,
    "hits": [
      {
        "_index": "website",
        "_type": "article",
        "_id": "1",
        "_score": 0.25316024,
        "_source": {
          "title": "my elasticsearch article",
          "content": "es is very bad",
          "author_id": 110
        }
      }
    ]
  }
}
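
The bool request above can also be assembled programmatically. The following is a rough sketch, not an official client API: the helpers match and bool_query are hypothetical names, and the code only builds the JSON body shown earlier without contacting a cluster.

```python
import json


def match(field, text):
    """Build a single match clause."""
    return {"match": {field: text}}


def bool_query(must=(), should=(), must_not=(), filters=()):
    """Assemble a bool query body, leaving out clause types that are empty."""
    clauses = {
        "must": must,
        "should": should,
        "must_not": must_not,
        "filter": filters,
    }
    bool_body = {name: list(items) for name, items in clauses.items() if items}
    return {"query": {"bool": bool_body}}


request_body = bool_query(
    must=[match("title", "elasticsearch")],
    should=[match("content", "elasticsearch")],
    must_not=[match("author_id", 111)],
)

print(json.dumps(request_body, indent=2))
```

Keeping each clause type as a separate argument mirrors the requirement as stated: one list for what must match, one for what may match, one for what must not.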

filter vs. query

Initial data:

PUT /company/employee/2
{
  "address": {
    "country": "china",
    "province": "jiangsu",
    "city": "nanjing"
  },
  "name": "tom",
  "age": 30,
  "join_date": "2016-01-01"
}

PUT /company/employee/3
{
  "address": {
    "country": "china",
    "province": "shanxi",
    "city": "xian"
  },
  "name": "marry",
  "age": 35,
  "join_date": "2015-01-01"
}

Search request: age must be greater than or equal to 30, and join_date must be 2016-01-01

GET /company/employee/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "join_date": "2016-01-01"
          }
        }
      ],
      "filter": {
        "range": {
          "age": {
            "gte": 30
          }
        }
      }
    }
  }
}

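filter vs. query: a comparison

  • filter: only filters out the data that satisfies the search conditions; it computes no relevance score and has no effect on relevance
  • query: computes the relevance of every document with respect to the search conditions and sorts by that relevance

In general, if you are running a search and want the data that best matches the conditions returned first, use query; if you only want to select a subset of the data by some conditions and do not care about its order, use filter.

Put a condition in query only when you want documents that match it more closely to rank higher in the results; if you do not want a condition to influence the ordering of documents, put it in filter.

filter vs. query performance

  • filter: no relevance score has to be computed and no sorting by score is needed; in addition, a built-in mechanism automatically caches the results of the most frequently used filters
  • query: by contrast, the relevance score must be computed and the results sorted by it, and the results cannot be cached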
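
The difference between the two contexts can be pictured with a toy in-memory model. This is only a sketch under strong assumptions: the three documents are the test_field values from the earlier response, toy_score is a made-up formula (not Elasticsearch's real scoring), and filter_cache is a plain dictionary standing in for the built-in filter cache.

```python
docs = {
    "6": "test test",
    "7": "test client 1",
    "8": "test client 2",
}

# term -> set of matching ids; stands in for a cached filter result
filter_cache = {}


def filter_ids(term):
    """Filter context: a yes/no membership test, cached after the first evaluation."""
    if term not in filter_cache:
        filter_cache[term] = {
            doc_id for doc_id, text in docs.items() if term in text.split()
        }
    return filter_cache[term]


def toy_score(term, text):
    """Query context: a made-up relevance score (share of tokens equal to the term)."""
    tokens = text.split()
    return tokens.count(term) / len(tokens)


def query_ranked(term):
    """Score every document, drop non-matches, and order by descending score."""
    scored = [(toy_score(term, text), doc_id) for doc_id, text in docs.items()]
    ranked_pairs = sorted(scored, key=lambda pair: -pair[0])
    return [doc_id for score, doc_id in ranked_pairs if score > 0]


matching = filter_ids("client")   # unordered set of ids, no scores involved
ranked = query_ranked("test")     # ids ordered by the toy score

print(sorted(matching))
print(ranked)
```

The filter result is just a set of ids that can be reused as-is the next time the same condition appears, whereas the ranked list depends on a per-document score that has to be computed for each query.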

Elasticsearch in practice: the various query searches

Syntax of the various queries

  • match_all

    GET /_search
    {
        "query": {
            "match_all": {}
        }
    }
  • match
    GET /{index}/_search
    {
      "query": {
        "match": {
          "FIELD": "TEXT"
        }
      }
    }
  • multi match

    GET /{index}/_search
    {
      "query": {
        "multi_match": {
          "query": "",
          "fields": []
        }
      }
    }

    Example

    GET /test_index/test_type/_search
    {
      "query": {
        "multi_match": {
          "query": "test",
          "fields": ["test_field", "test_field1"]
        }
      }
    }
  • range query
    GET /{index}/_search
    {
      "query": {
        "range": {
          "FIELD": {
            "gte": 10,
            "lte": 20
          }
        }
      }
    }

    Example

    GET /company/employee/_search 
    {
      "query": {
        "range": {
          "age": {
            "gte": 30
          }
        }
      }
    }
  • term query (unlike match, the search text is not analyzed)
    GET /{index}/_search
    {
      "query": {
        "term": {
          "FIELD": {
            "value": "VALUE"
          }
        }
      }
    }

    Example

    GET /test_index/test_type/_search 
    {
      "query": {
        "term": {
          "test_field": "test hello"
        }
      }
    }
  • terms query

    GET /{index}/_search
    {
      "query": {
        "terms": {
          "FIELD": [
            "VALUE1",
            "VALUE2"
          ]
        }
      }
    }

    Example

    GET /_search
    {
        "query": { "terms": { "tag": [ "search", "full_text", "nosql" ] }}
    }
  • exists query
    GET /{index}/_search
    {
      "query": {
        "exists": {
           "field": ""
        }
      }
    }
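
The note on the term query above (the search text is not analyzed) is easiest to see with a toy inverted index. The sketch below assumes a simplified analyzer that lowercases and splits on whitespace, which only approximates what a real analyzer does; the two documents and all function names are hypothetical.

```python
def analyze(text):
    """Toy analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each token to the set of document ids whose text contains it."""
    index = {}
    for doc_id, text in docs.items():
        for token in analyze(text):
            index.setdefault(token, set()).add(doc_id)
    return index


def term_lookup(index, value):
    """term-style lookup: the value is used as-is, as one exact token."""
    return index.get(value, set())


def match_lookup(index, text):
    """match-style lookup: the text is analyzed first; any token may match."""
    hits = set()
    for token in analyze(text):
        hits |= index.get(token, set())
    return hits


docs = {"1": "test hello", "2": "test client"}
index = build_inverted_index(docs)

term_hits = term_lookup(index, "test hello")
match_hits = match_lookup(index, "test hello")

print(term_hits)
print(sorted(match_hits))
```

In this model the index holds the tokens test, hello and client, so the unanalyzed value "test hello" finds nothing, while the analyzed text matches every document containing either token.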

Combining multiple search conditions in one query

  • bool: must, must_not, should, filter

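    Each sub-query computes a relevance score for a document against its own condition, and bool then combines all of those scores into a single score. filter, of course, does not compute a score.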

    GET /company/employee/_search
    {
      "query": {
        "constant_score": {
          "filter": {
            "range": {
              "age": {
                "gte": 30
              }
            }
          }
        }
      }
    }

     

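Locating invalid searches

This is mainly useful for very large, complex searches. If you have written a search that runs to hundreds of lines, for example, you can first use the validate API to check whether it is valid. In the request below, match is deliberately misspelled as math: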

GET /test_index/test_type/_validate/query?explain
{
  "query": {
    "math": {
      "test_field": "test"
    }
  }
}

{
  "valid": false,
  "error": "org.elasticsearch.common.ParsingException: no [query] registered for [math]"
}

A valid query:

GET /test_index/test_type/_validate/query?explain
{
  "query":{
    "match":{
      "test_field":"test"
    }
  }
}


{
  "valid": true,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "explanations": [
    {
      "index": "test_index",
      "valid": true,
      "explanation": "+test_field:test #(#_type:test_type)"
    }
  ]
}
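
When the validate API is called from a script, its response has to be interpreted. The sketch below is a hedged illustration that works on the two response bodies shown above, pasted in as Python dictionaries; summarize_validation is a hypothetical helper, and it assumes only the fields visible in those responses (valid, error, explanations).

```python
def summarize_validation(response):
    """Return readable lines describing a _validate/query response."""
    if response.get("valid"):
        return [
            f'{item["index"]}: {item["explanation"]}'
            for item in response.get("explanations", [])
        ]
    # Fall back to a generic message if no error text was returned.
    return [response.get("error", "invalid query (no error message returned)")]


invalid_response = {
    "valid": False,
    "error": "org.elasticsearch.common.ParsingException: "
             "no [query] registered for [math]",
}

valid_response = {
    "valid": True,
    "_shards": {"total": 1, "successful": 1, "failed": 0},
    "explanations": [
        {
            "index": "test_index",
            "valid": True,
            "explanation": "+test_field:test #(#_type:test_type)",
        }
    ],
}

invalid_summary = summarize_validation(invalid_response)
valid_summary = summarize_validation(valid_response)

print(invalid_summary)
print(valid_summary)
```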

Customizing the sort order of search results

By default, the returned documents are sorted by _score in descending order. To define your own sort order, use sort.

Syntax:

# Main syntax
"sort": [
    {
      "FIELD": {
        "order": "desc"
      }
    }
  ]
# Where it sits in the full request
GET /{index}/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "exists": {
          "field": ""
        }
      },
      "boost": 1.2
    }
  },
  "sort": [
    {
      "FIELD": {
        "order": "desc"
      }
    }
  ]
}

Example:

GET /company/employee/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "range": {
          "age": {
            "gte": 30
          }
        }
      }
    }
  },
  "sort": [
    {
      "join_date": {
        "order": "asc"
      }
    }
  ]
}
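
What the request above asks for can be mimicked in memory: keep the employees aged 30 or more, then order them by join_date ascending instead of by score. This is only a sketch over the two sample employees indexed earlier; search_sorted is a hypothetical function and no cluster is involved.

```python
from datetime import date

employees = [
    {"name": "tom", "age": 30, "join_date": "2016-01-01"},
    {"name": "marry", "age": 35, "join_date": "2015-01-01"},
]


def search_sorted(docs, min_age, descending=False):
    """Filter by age >= min_age, then sort the hits by join_date."""
    hits = [doc for doc in docs if doc["age"] >= min_age]
    return sorted(
        hits,
        key=lambda doc: date.fromisoformat(doc["join_date"]),
        reverse=descending,
    )


names_asc = [doc["name"] for doc in search_sorted(employees, 30)]
names_desc = [doc["name"] for doc in search_sorted(employees, 30, descending=True)]

print(names_asc)
print(names_desc)
```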

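Indexing a field twice to solve the string sorting problem

If a field is of type text, its content is analyzed (split into terms) for every document when the document is indexed. Because ES does not allow the type of an existing field to be changed, the field stays analyzed, so a later sort on it works on the individual terms rather than on the whole string and does not give the expected order. The example below shows the details.

# Delete the existing index
DELETE /website

# Create the index manually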
PUT /website
{
  "mappings": {
    "article": {
      "properties": {
        "title":{
          "type": "text",
          "fields": {
            "raw":{
              "type": "keyword"
            }
          },
          "fielddata": true
        },
        "content":{
          "type": "text"
        },
        "post_date":{
          "type": "date"
        },
        "author_id":{
          "type": "long"
        }
      }
    }
  }
}

Insert sample data

PUT /website/article/1
{
  "title": "second article",
  "content": "this is my second article",
  "post_date": "2017-01-01",
  "author_id": 110
}

PUT /website/article/2
{
  "title": "first article",
  "content": "this is my first article",
  "post_date": "2017-02-01",
  "author_id": 110
}

PUT /website/article/3
{
  "title": "third article",
  "content": "this is my third article",
  "post_date": "2017-03-01",
  "author_id": 110
}

Sort by the non-analyzed sub-field

GET /website/article/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "title.raw": {
        "order": "desc"
      }
    }
  ]
}
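
Why the raw sub-field matters can be sketched with the three sample titles. The model below is a simplification: it assumes the analyzed field is reduced to whitespace-separated lowercase tokens and that an ascending sort on such a multi-valued field compares the smallest token of each document, while the raw sub-field is compared as one whole string. The function names are hypothetical.

```python
articles = {
    "1": "second article",
    "2": "first article",
    "3": "third article",
}


def analyzed_sort_key(title):
    """Ascending sort on the analyzed field, modeled as the smallest token."""
    return min(title.lower().split())


def raw_sort_key(title):
    """Sort on the raw sub-field: the whole, unanalyzed string."""
    return title


analyzed_keys = {doc_id: analyzed_sort_key(t) for doc_id, t in articles.items()}
by_raw = sorted(articles, key=lambda doc_id: raw_sort_key(articles[doc_id]))

print(analyzed_keys)
print(by_raw)
```

In this model every title contributes the same smallest token, article, so the analyzed field cannot tell the documents apart, whereas the raw value orders them as first, second, third.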