Elasticsearch reference material

_source and store

http://stackoverflow.com/questions/18833899/in-elasticsearch-what-happens-if-i-set-store-to-yes-on-a-few-fields-but-sou

http://stackoverflow.com/questions/17103047/why-do-i-need-storeyes-in-elasticsearch

You usually send a field to Elasticsearch because you want to either search on it or retrieve it. But it's true that if you don't store the field explicitly and you don't disable the source, you can still retrieve the field using the _source. This means that in some cases it might actually make sense to have a field that is neither indexed nor stored.

When you store a field, that's done in the underlying Lucene. Lucene is an inverted index that allows for fast full-text search and gives back document ids given text queries. Beyond the inverted index, Lucene has its own storage where field values can be kept in order to be retrieved given a document id. You usually store in Lucene the fields that you want to return as search results. Elasticsearch doesn't require you to store every field that you want to return, because by default it always stores every document that you send to it, so it's always able to return everything you sent to it as a search result.

In just a few cases it might be useful to store fields explicitly in Lucene: when the _source field is disabled, or when we want to avoid parsing it, even though the parsing is done automatically by Elasticsearch. Keep in mind, though, that retrieving many stored fields from Lucene might require one disk seek per field, while retrieving only the _source and parsing it to extract the needed fields takes a single disk seek and is faster in most cases.

If a field's store attribute is set to no, you can still fetch the document through _source and then parse out that field's content, provided that _source is set to "enabled": true.
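A minimal sketch of the two retrieval paths described above, using plain Python dictionaries rather than a live cluster. The type name `article`, the field names, and the helper `field_from_hit` are hypothetical. The hit layout is an assumption: stored fields are taken to arrive under `fields` as lists, and the original document under `_source`.

```python
import json

# Hypothetical mapping: `title` is stored explicitly in Lucene, `body` is not.
# `_source` stays enabled, so `body` can still be recovered by parsing it.
mapping = {
    "article": {
        "_source": {"enabled": True},
        "properties": {
            "title": {"type": "string", "store": True},
            "body": {"type": "string", "store": False},
        },
    }
}


def field_from_hit(hit, name):
    """Return one field value from a search hit (assumed layout, see lead-in).

    An explicitly stored field is preferred; otherwise the value is read
    from the parsed _source document.
    """
    stored = hit.get("fields", {})
    if name in stored:
        values = stored[name]
        return values[0] if len(values) == 1 else values
    source = hit.get("_source")
    if source is None:
        raise KeyError(f"{name!r} is not stored and _source is unavailable")
    return source[name]


# Hand-written example hit, not real Elasticsearch output.
hit = {
    "_id": "1",
    "fields": {"title": ["Stored title"]},
    "_source": {"title": "Stored title", "body": "Only in _source"},
}

print(json.dumps(mapping, indent=2))
print(field_from_hit(hit, "title"))
print(field_from_hit(hit, "body"))
```

With `_source` disabled and `store` off, the last lookup would have nothing to read from, which is the case the note above warns about.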

 

 

 

Aggregation

http://chrissimpson.co.uk/elasticsearch-aggregations-overview.html

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations.html

http://stackoverflow.com/questions/21018493/how-to-access-aggregations-result-with-elasticsearch-java-api-in-searchresponse

 

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#search-aggregations-bucket-terms-aggregation-order

 

Top Hit Aggregation

https://www.elastic.co/guide/en/elasticsearch/reference/1.6/search-aggregations-metrics-top-hits-aggregation.html

Shards and replicas

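A shard is actually a Lucene index.

Index (write) requests go to the primary shard first and are then copied to its replicas; a replica does not accept index requests directly. Search requests can be served by either the primary or a replica.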
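The division of work between a primary and its replicas can be pictured with a toy model. This is only an illustration, not Elasticsearch code: the class `ShardGroup` is hypothetical, each shard copy is a plain dictionary, and the round-robin choice of which copy answers a read is an assumption of the sketch.

```python
import itertools


class ShardGroup:
    """Toy model of one primary shard plus its replicas."""

    def __init__(self, replica_count):
        # copies[0] plays the primary; the remaining entries play replicas.
        self.copies = [dict() for _ in range(replica_count + 1)]
        self._next_copy = itertools.cycle(range(len(self.copies)))

    def index(self, doc_id, doc):
        """Write to the primary first, then copy the document to each replica."""
        primary, *replicas = self.copies
        primary[doc_id] = doc
        for replica in replicas:
            replica[doc_id] = dict(doc)

    def get(self, doc_id):
        """Read from any copy; returns (copy_number, document)."""
        copy_number = next(self._next_copy)
        return copy_number, self.copies[copy_number].get(doc_id)


group = ShardGroup(replica_count=2)
group.index("1", {"user": "kim"})
reads = [group.get("1") for _ in range(3)]
print(reads)
```

Every copy returns the same document, so adding replicas raises read capacity without changing where writes enter.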

http://blog.trifork.com/2014/01/07/elasticsearch-how-many-shards/

 

Aggregation

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations.html

Aggregation results are not exact

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_document_counts_are_approximate
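A small simulation of why the document counts of a terms aggregation can be approximate: each shard reports only its own top terms, and the coordinating node merges those partial lists. The shard contents and the function names below are made up for illustration, and each shard is assumed to return exactly `size` terms.

```python
from collections import Counter


def approximate_terms(shards, size):
    """Merge only the top `size` terms reported by each shard."""
    merged = Counter()
    for shard in shards:
        for term, count in Counter(shard).most_common(size):
            merged[term] += count
    return merged.most_common(size)


def exact_terms(shards, size):
    """Reference result: add up every term on every shard before ranking."""
    total = Counter()
    for shard in shards:
        total.update(shard)
    return total.most_common(size)


# Made-up per-shard term counts.
shard_a = {"a": 10, "b": 9, "c": 7}
shard_b = {"a": 9, "c": 8, "b": 3}

print(approximate_terms([shard_a, shard_b], size=2))  # [('a', 19), ('b', 9)]
print(exact_terms([shard_a, shard_b], size=2))        # [('a', 19), ('c', 15)]
```

Term `c` misses the top two on the first shard and `b` misses it on the second, so the merged list undercounts `b` (9 instead of 12) and drops `c` even though it is the true runner-up.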

 

 

Mapping

http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/mapping-intro.html

 

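Every document in an index has a type, and every type has its own mapping, or schema definition. A mapping defines the fields within a type, the data type of each field, and how Elasticsearch handles each field. A mapping is also used to configure metadata associated with the type.

Elasticsearch supports the following simple field data types: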

  • String: string
  • Whole number: byte, short, integer, long
  • Floating-point: float, double
  • Boolean: boolean
  • Date: date

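When you index a document that contains a new field, Elasticsearch guesses the field's data type from the basic JSON data type of the value. The correspondence is as follows: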

JSON type                          Field type
Boolean: true or false             boolean
Whole number: 123                  long
Floating point: 123.45             double
String, valid date: 2014-09-15     date
String: foo bar                    string

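Note:
  This means that if you index a number in quotes ("123"), the field is mapped as string, not long. However, if the field already exists and is mapped as long, Elasticsearch tries to convert the string to a long and throws an exception if it cannot (for example, if the value contains letters).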
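The guessing rules and the note above can be sketched as a small Python function. This is an illustration of the rules as listed here, not Elasticsearch's actual implementation; the function names are hypothetical, and date detection is assumed to accept only the `yyyy-MM-dd` form used in the example.

```python
from datetime import datetime


def guess_field_type(value):
    """Guess a field type for a new field from a parsed JSON value."""
    # bool must be tested before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        try:
            # Assumption: only yyyy-MM-dd strings count as dates in this sketch.
            datetime.strptime(value, "%Y-%m-%d")
            return "date"
        except ValueError:
            return "string"
    raise TypeError(f"unsupported JSON value: {value!r}")


def coerce_to_long(value):
    """Mimic indexing a value into a field that is already mapped as long."""
    try:
        return int(value)
    except (TypeError, ValueError):
        raise ValueError(f"cannot parse {value!r} as long") from None


for sample in (True, 123, 123.45, "2014-09-15", "foo bar", "123"):
    print(repr(sample), "->", guess_field_type(sample))

print(coerce_to_long("123"))
```

The quoted `"123"` is guessed as string on first sight, yet converts cleanly once the field is already a long; a value such as `"12a"` makes `coerce_to_long` raise instead.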
 
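Customizing field mappings
The most important attribute of a field is type. For fields other than string, you will rarely need to specify anything besides type.
String fields are full text by default: the value is passed through an analyzer before it is indexed, and a full-text query on the field passes the query string through an analyzer before searching.
The two most important attributes of a string field are index and analyzer.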
 
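The index attribute has three possible values:
analyzed
    Analyze the string first, then index it.
not_analyzed
    Index the field so that it is searchable, but index the exact value as a whole, without analyzing it.
no
    Do not index the field at all, so it is not searchable.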

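For string fields the default value of index is analyzed, so to index the raw value you need to set it to not_analyzed.

Other types (for example long, double, date) also accept the index attribute, but the only valid values are no and not_analyzed, because their values are never analyzed.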
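A short sketch that writes the index rules above as a check over a mapping's properties. The field names and the helper `check_index_setting` are hypothetical, and the properties use the string-type syntax these notes describe.

```python
import json

# Values of `index` that the notes above allow for each kind of field.
STRING_INDEX_VALUES = {"analyzed", "not_analyzed", "no"}
OTHER_INDEX_VALUES = {"not_analyzed", "no"}


def check_index_setting(field_type, index_value):
    """Return True if `index_value` is allowed for a field of `field_type`."""
    allowed = STRING_INDEX_VALUES if field_type == "string" else OTHER_INDEX_VALUES
    return index_value in allowed


# Hypothetical field definitions covering the three index values.
properties = {
    "title": {"type": "string", "index": "analyzed", "analyzer": "english"},
    "tag": {"type": "string", "index": "not_analyzed"},
    "views": {"type": "long", "index": "no"},
}

invalid = [
    name
    for name, spec in properties.items()
    if not check_index_setting(spec["type"], spec["index"])
]

print(json.dumps({"properties": properties}, indent=2))
print("invalid fields:", invalid)
```

Here `title` is searchable as full text, `tag` only by its exact value, and `views` not at all; setting `"index": "analyzed"` on `views` would be rejected by the check.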
