DocValues and FieldCache

Date: 2019-12-01

Tags: docvalues, fieldcache


FieldCache:

docID -> document -> field value

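Whether you are clustering, sorting, or joining, the first step is always to get the value of some field in a document. Without a cache, that means using the docID to fetch the whole document, extracting the field from it, and converting the term into its final typed value. FieldCache avoids this by loading, up front, the values of one specific field for every document into memory (this works for all numeric types and for non-tokenized StringField), so that the field's values can be accessed randomly by docID.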
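The idea of caching one field's values for every document can be sketched in a few lines. This is only a toy model, not Lucene's implementation: the inverted index contents, the `uninvert` helper, and the `price_cache` name are all made up for illustration.

```python
from array import array

# Toy inverted index for one numeric field: term (stored as a string) -> docIDs.
# The terms and docIDs are invented for this example.
inverted_index = {
    "10": [0, 3],
    "25": [1],
    "40": [2, 4],
}

def uninvert(index, num_docs, missing=0):
    """Build a docID -> value array by walking every term's postings list."""
    values = array("q", [missing] * num_docs)
    for term, doc_ids in index.items():
        value = int(term)  # term -> typed value conversion
        for doc_id in doc_ids:
            values[doc_id] = value
    return values

price_cache = uninvert(inverted_index, num_docs=5)
print(price_cache[3])  # random access by docID, no document load needed
```

Once the array exists, sorting or grouping only needs an index lookup per document; the cost is paid when the array is built and for as long as it stays in memory.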

How FieldCache is implemented:

http://moshalanye.iteye.com/blog/281379

Disadvantages:

1. It stays resident in memory; its size is the total number of documents times the size of the field's type.

2. The initial load is slow, because it has to traverse the inverted index and convert types.
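To make the first point concrete, here is a rough back-of-the-envelope estimate. It is a sketch under simplifying assumptions: fixed-width numeric values only, no per-object or JVM overhead, and a hypothetical index of 10 million documents. The `field_cache_bytes` helper is invented for this example.

```python
import struct

def field_cache_bytes(num_docs, value_format):
    """Approximate resident size: one fixed-width value per document."""
    return num_docs * struct.calcsize(value_format)

# "<q" is a 64-bit integer (8 bytes), "<i" is a 32-bit integer (4 bytes).
long_field = field_cache_bytes(10_000_000, "<q")
int_field = field_cache_bytes(10_000_000, "<i")
print(long_field, int_field)
```

Under these assumptions a single 64-bit field costs about 80 MB of heap and a 32-bit field about 40 MB, and that cost is paid per cached field.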

DocValues:

docID -> field value

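At index time, a column-oriented forward index from document to field value is built. A field value can then be located directly from a known docID, with no need to load the document, convert terms, or iterate over terms looking for matching docs.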
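A minimal sketch of such a column-oriented layout follows. It assumes a fixed-width 64-bit value per document stored in docID order; Lucene's real doc values formats are more elaborate (they compress values, for example), and the file name and helper functions here are hypothetical.

```python
import os
import struct
import tempfile

RECORD = struct.Struct("<q")  # one fixed-width 64-bit value per document

def write_column(path, values):
    """Index time: write the field's values in docID order."""
    with open(path, "wb") as f:
        for value in values:
            f.write(RECORD.pack(value))

def read_value(path, doc_id):
    """Query time: seek straight to the docID's slot and decode one value."""
    with open(path, "rb") as f:
        f.seek(doc_id * RECORD.size)
        return RECORD.unpack(f.read(RECORD.size))[0]

column_path = os.path.join(tempfile.mkdtemp(), "price.col")
write_column(column_path, [10, 25, 40, 10, 40])
print(read_value(column_path, 2))
```

Because the slot offset is just `doc_id * RECORD.size`, the lookup never touches the stored document or the terms dictionary, and the data lives on disk (served through the operating system's file cache) rather than on the heap.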

Advantage: it saves roughly one third of the memory.

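Disadvantage: values are read from disk rather than held in memory. For large-volume workloads the benefit is clear and it is faster, but for small workloads it is not as fast as memory. Overall it is about 15-20% slower.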

20 February 2015 - Apache Lucene 5.0.0 and Apache Solr 5.0.0 Available

http://lucene.apache.org/

FieldCache is gone (moved to a dedicated UninvertingReader in the misc module). This means when you intend to sort on a field, you should index that field using doc values, which is much faster and less heap consuming than FieldCache.

LUCENE-5666: Change uninverted access (sorting, faceting, grouping, etc) to use the DocValues API instead of FieldCache

In Elasticsearch:

https://www.elastic.co/guide/en/elasticsearch/guide/current/doc-values.html

In Solr:

http://wiki.apache.org/solr/DocValues?cm_mc_uid=56088888487714180880058&cm_mc_sid_50200000=1448507379

https://cwiki.apache.org/confluence/display/solr/DocValues
