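Elasticsearch search methods

Creating the data to search

The most attractive thing about ElasticSearch is the fast, convenient search it provides. Let us first create some test documents with the following commands: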

curl -XPUT "http://localhost:9200/movies/movie/1" -d'
{
    "title": "The Godfather",
    "director": "Francis Ford Coppola",
    "year": 1972,
    "genres": ["Crime", "Drama"]
}'

curl -XPUT "http://localhost:9200/movies/movie/2" -d'
{
    "title": "Lawrence of Arabia",
    "director": "David Lean",
    "year": 1962,
    "genres": ["Adventure", "Biography", "Drama"]
}'

curl -XPUT "http://localhost:9200/movies/movie/3" -d'
{
    "title": "To Kill a Mockingbird",
    "director": "Robert Mulligan",
    "year": 1962,
    "genres": ["Crime", "Drama", "Mystery"]
}'

curl -XPUT "http://localhost:9200/movies/movie/4" -d'
{
    "title": "Apocalypse Now",
    "director": "Francis Ford Coppola",
    "year": 1979,
    "genres": ["Drama", "War"]
}'

curl -XPUT "http://localhost:9200/movies/movie/5" -d'
{
    "title": "Kill Bill: Vol. 1",
    "director": "Quentin Tarantino",
    "year": 2003,
    "genres": ["Action", "Crime", "Thriller"]
}'

curl -XPUT "http://localhost:9200/movies/movie/6" -d'
{
    "title": "The Assassination of Jesse James by the Coward Robert Ford",
    "director": "Andrew Dominik",
    "year": 2007,
    "genres": ["Biography", "Crime", "Drama"]
}'

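Note that ElasticSearch provides a general-purpose _bulk endpoint for creating multiple documents in a single request; for simplicity, the documents here are created with separate requests.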
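As a rough illustration of the _bulk alternative mentioned above, the following Python sketch only builds the newline-delimited request body; it does not send anything. The helper name build_bulk_body is hypothetical, and the sketch assumes the same movies index and movie type used in this article.

```python
import json


def build_bulk_body(index, doc_type, docs):
    """Build the newline-delimited body for a _bulk request.

    docs maps a document id to its source. Each document becomes two lines:
    an "index" action line followed by the document itself.
    (Hypothetical helper, not part of any Elasticsearch client.)
    """
    lines = []
    for doc_id, source in docs.items():
        action = {"index": {"_index": index, "_type": doc_type, "_id": str(doc_id)}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The bulk body must be terminated by a final newline.
    return "\n".join(lines) + "\n"


movies = {
    1: {"title": "The Godfather", "director": "Francis Ford Coppola",
        "year": 1972, "genres": ["Crime", "Drama"]},
    2: {"title": "Lawrence of Arabia", "director": "David Lean",
        "year": 1962, "genres": ["Adventure", "Biography", "Drama"]},
}

body = build_bulk_body("movies", "movie", movies)
print(body)
```

The resulting text could then be sent as the body of a POST to http://localhost:9200/_bulk, for example with curl and --data-binary so that the newlines are preserved.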

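Searching in ElasticSearch is done mainly through the _search endpoint. The standard request format is <index>/<type>/_search, where both index and type are optional.
In other words, a request can take any of the following forms:

http://localhost:9200/_search (search all indices and all types)
http://localhost:9200/movies/_search (search all types in the movies index)
http://localhost:9200/movies/movie/_search (search only the movie type in the movies index)

The response contains the document metadata, and the original document data is stored in the _source field.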

Retrieving a single document
We can also retrieve just the _source field of a document directly, as follows:

curl -XGET 'http://localhost:9200/movies/movie/1/_source'

Result:

{
    "title": "The Godfather",
    "director": "Francis Ford Coppola",
    "year": 1972,
    "genres": ["Crime", "Drama"]
}

Retrieving all documents
We can use the _search API to retrieve all documents:

curl -XGET 'http://localhost:9200/movies/movie/_search'

Result:

{
    "took": 5,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 6,
        "max_score": 1,
        "hits": [
            {
                "_index": "movies",
                "_type": "movie",
                "_id": "5",
                "_score": 1,
                "_source": {
                    "title": "Kill Bill: Vol. 1",
                    "director": "Quentin Tarantino",
                    "year": 2003,
                    "genres": [
                        "Action",
                        "Crime",
                        "Thriller"
                    ]
                }
            },
            {
                "_index": "movies",
                "_type": "movie",
                "_id": "2",
                "_score": 1,
                "_source": {
                    "title": "Lawrence of Arabia",
                    "director": "David Lean",
                    "year": 1962,
                    "genres": [
                        "Adventure",
                        "Biography",
                        "Drama"
                    ]
                }
            },
            {
                "_index": "movies",
                "_type": "movie",
                "_id": "4",
                "_score": 1,
                "_source": {
                    "title": "Apocalypse Now",
                    "director": "Francis Ford Coppola",
                    "year": 1979,
                    "genres": [
                        "Drama",
                        "War"
                    ]
                }
            },
            {
                "_index": "movies",
                "_type": "movie",
                "_id": "6",
                "_score": 1,
                "_source": {
                    "title": "The Assassination of Jesse James by the Coward Robert Ford",
                    "director": "Andrew Dominik",
                    "year": 2007,
                    "genres": [
                        "Biography",
                        "Crime",
                        "Drama"
                    ]
                }
            },
            {
                "_index": "movies",
                "_type": "movie",
                "_id": "1",
                "_score": 1,
                "_source": {
                    "title": "The Godfather",
                    "director": "Francis Ford Coppola",
                    "year": 1972,
                    "genres": [
                        "Crime",
                        "Drama"
                    ]
                }
            },
            {
                "_index": "movies",
                "_type": "movie",
                "_id": "3",
                "_score": 1,
                "_source": {
                    "title": "To Kill a Mockingbird",
                    "director": "Robert Mulligan",
                    "year": 1962,
                    "genres": [
                        "Crime",
                        "Drama",
                        "Mystery"
                    ]
                }
            }
        ]
    }
}

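As you can see, the hits object contains fields such as total and the hits array. The hits array holds the matching documents (all six here), and total gives the number of matches. By default only the first 10 results are returned. We can also set the from and size parameters to fetch a specific range of documents, for example: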

curl -XGET 'http://localhost:9200/movies/movie/_search?from=1&size=2'

Result:

{
    "took": 6,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 6,
        "max_score": 1,
        "hits": [
            {
                "_index": "movies",
                "_type": "movie",
                "_id": "2",
                "_score": 1,
                "_source": {
                    "title": "Lawrence of Arabia",
                    "director": "David Lean",
                    "year": 1962,
                    "genres": [
                        "Adventure",
                        "Biography",
                        "Drama"
                    ]
                }
            },
            {
                "_index": "movies",
                "_type": "movie",
                "_id": "4",
                "_score": 1,
                "_source": {
                    "title": "Apocalypse Now",
                    "director": "Francis Ford Coppola",
                    "year": 1979,
                    "genres": [
                        "Drama",
                        "War"
                    ]
                }
            }
        ]
    }
}
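The from and size parameters map naturally onto page numbers. The sketch below shows one way to derive them, assuming 1-based page numbers; page_params and search_url are hypothetical helper names, and nothing is sent to the server.

```python
from urllib.parse import urlencode


def page_params(page, page_size=10):
    """Translate a 1-based page number into from/size parameters."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    # "from" is the number of hits to skip; "size" is how many to return.
    return {"from": (page - 1) * page_size, "size": page_size}


def search_url(base_url, index, doc_type, page, page_size=10):
    """Build a _search URL for the requested page (hypothetical helper)."""
    query = urlencode(page_params(page, page_size))
    return f"{base_url}/{index}/{doc_type}/_search?{query}"


print(search_url("http://localhost:9200", "movies", "movie", page=2, page_size=2))
```

Note that from counts hits to skip rather than pages, so from=1&size=2 in the request above skips only the first hit and returns the second and third.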

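Retrieving selected fields

Sometimes we only need a few fields of a document. In that case we can use the _source parameter, separating multiple fields with commas, as shown below: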

curl -XGET 'http://localhost:9200/movies/movie/1?_source=title,director'

Result:

{
    "_index": "movies",
    "_type": "movie",
    "_id": "1",
    "_version": 1,
    "found": true,
    "_source": {
        "director": "Francis Ford Coppola",
        "title": "The Godfather"
    }
}
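To make the effect of the _source parameter concrete, here is a small sketch that builds such a URL and mimics the filtering locally on a plain dictionary. Both function names are hypothetical, and the local filter only approximates what the server does for simple top-level field names.

```python
from urllib.parse import quote


def source_filter_url(base_url, index, doc_type, doc_id, fields):
    """Build a GET URL that asks only for the listed _source fields."""
    # Keep commas unescaped so the URL reads like the curl example above.
    field_list = quote(",".join(fields), safe=",")
    return f"{base_url}/{index}/{doc_type}/{doc_id}?_source={field_list}"


def filter_source(source, fields):
    """Local approximation of the filtering for top-level fields only."""
    return {name: value for name, value in source.items() if name in fields}


godfather = {
    "title": "The Godfather",
    "director": "Francis Ford Coppola",
    "year": 1972,
    "genres": ["Crime", "Drama"],
}

print(source_filter_url("http://localhost:9200", "movies", "movie", 1, ["title", "director"]))
print(filter_source(godfather, ["title", "director"]))
```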

Query string search
A query string search takes the form q=field:value. For example, to find movies whose title field contains godfather:

curl -XGET 'http://localhost:9200/movies/movie/_search?q=title:godfather'

Result:

{
    "took": 6,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 1,
        "max_score": 0.25811607,
        "hits": [
            {
                "_index": "movies",
                "_type": "movie",
                "_id": "1",
                "_score": 0.25811607,
                "_source": {
                    "title": "The Godfather",
                    "director": "Francis Ford Coppola",
                    "year": 1972,
                    "genres": [
                        "Crime",
                        "Drama"
                    ]
                }
            }
        ]
    }
}
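When the search value contains spaces or other special characters, the q parameter has to be URL-encoded. The following sketch, with the hypothetical helper query_string_url, shows one way to do that with the Python standard library; wrapping the value in double quotes to request a phrase match is an assumption about what the caller wants.

```python
from urllib.parse import urlencode


def query_string_url(base_url, index, doc_type, field, value, phrase=False):
    """Build a _search URL using the q=field:value form."""
    if phrase:
        # Double quotes ask the query-string syntax for a phrase match.
        value = f'"{value}"'
    query = urlencode({"q": f"{field}:{value}"})
    return f"{base_url}/{index}/{doc_type}/_search?{query}"


print(query_string_url("http://localhost:9200", "movies", "movie", "title", "godfather"))
print(query_string_url("http://localhost:9200", "movies", "movie", "title", "kill bill", phrase=True))
```

The colon is encoded as %3A and the space as +; the server decodes both before parsing the query.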

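DSL search
The query string search above is lightweight and only suited to simple cases. Elasticsearch also provides a more powerful query DSL (Domain Specific Language) for complex search scenarios such as full-text search. The query string search above can be rewritten as a DSL search as follows: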

GET /movies/movie/_search
{
    "query" : {
        "match" : {
            "title" : "godfather"
        }
    }
}

Using curl:

curl -X GET "127.0.0.1:9200/movies/movie/_search" -d '{"query": {"match": {"title": "godfather"}}}'
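Hand-written JSON inside a shell string is easy to get wrong, so it can help to generate the request body from a data structure instead. This is a minimal sketch; match_query is a hypothetical helper, not a function of any Elasticsearch client.

```python
import json


def match_query(field, text):
    """Return the DSL body for a match query on a single field."""
    return {"query": {"match": {field: text}}}


body = json.dumps(match_query("title", "godfather"))
print(body)
```

The printed string is the same body passed to -d in the curl command above.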

The simplest query is a full-text search. For example, to search for the keyword "godfather" without naming a field:

curl -XPOST "http://localhost:9200/_search" -d'
{
    "query": {
        "query_string": {
            "query": "godfather"
        }
    }
}'

To search for "godfather" only in the title field:

curl -XPOST "http://localhost:9200/_search" -d'
{
    "query": {
        "query_string": {
            "query": "godfather",
            "fields": ["title"]
        }
    }
}'

Result:

{
    "took": 24,
    "timed_out": false,
    "_shards": {
        "total": 25,
        "successful": 25,
        "failed": 0
    },
    "hits": {
        "total": 1,
        "max_score": 0.25811607,
        "hits": [
            {
                "_index": "movies",
                "_type": "movie",
                "_id": "1",
                "_score": 0.25811607,
                "_source": {
                    "title": "The Godfather",
                    "director": "Francis Ford Coppola",
                    "year": 1972,
                    "genres": [
                        "Crime",
                        "Drama"
                    ]
                }
            }
        ]
    }
}
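The two query_string requests above differ only in the optional fields list. A small sketch of building either body from Python follows; the helper name query_string_query is hypothetical.

```python
import json


def query_string_query(query, fields=None):
    """Return a query_string DSL body, restricted to fields if given."""
    inner = {"query": query}
    if fields:
        # Leave the key out entirely when no fields are requested.
        inner["fields"] = list(fields)
    return {"query": {"query_string": inner}}


all_fields_body = json.dumps(query_string_query("godfather"), indent=4)
title_only_body = json.dumps(query_string_query("godfather", ["title"]), indent=4)
print(all_fields_body)
print(title_only_body)
```

Because json.dumps always emits valid JSON, this avoids mistakes such as a stray trailing comma, which the server would reject.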

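Checking whether a document exists
If all you want to do is check whether a document exists, and you are not interested in its content at all, use the HEAD method instead of GET. A HEAD request returns no response body, only the HTTP headers: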

curl -i -XHEAD "http://localhost:9200/movies/movie/3"

Elasticsearch returns a 200 OK status if the document exists:

HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 255

If it does not exist, it returns 404 Not Found:

curl -i -XHEAD "http://localhost:9200/movies/movie/36"

HTTP/1.1 404 Not Found
content-type: application/json; charset=UTF-8
content-length: 60

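Of course, this only tells you that the document did not exist at the moment of the query; it says nothing about whether it will still be missing a few milliseconds later. Another process may create the document in the meantime.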
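The same existence check can be sketched in Python with the standard library. The function names are hypothetical, and the code assumes a node listening on http://localhost:9200 as in the curl examples; document_exists is defined but not called here, since it needs a running server.

```python
import urllib.error
import urllib.request


def build_exists_request(base_url, index, doc_type, doc_id):
    """Create a HEAD request for a single document."""
    url = f"{base_url}/{index}/{doc_type}/{doc_id}"
    return urllib.request.Request(url, method="HEAD")


def document_exists(base_url, index, doc_type, doc_id, timeout=5.0):
    """Return True on 200 and False on 404; re-raise any other HTTP error."""
    request = build_exists_request(base_url, index, doc_type, doc_id)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        # urllib reports a 404 as an exception rather than a response.
        if err.code == 404:
            return False
        raise


# Example call (requires a running node):
# document_exists("http://localhost:9200", "movies", "movie", 3)
request = build_exists_request("http://localhost:9200", "movies", "movie", 3)
print(request.get_method(), request.full_url)
```

As with the curl version, the answer is only a snapshot: a False result does not prevent another process from creating the document right afterwards.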

References:
ElasticSearch 2.x 入門與快速實踐
Elasticsearch 入門使用
