A Survey of Algorithms for Inverted Indexes

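The paper below studies intersection algorithms specifically for the posting lists (sorted sets) used in search engines. Its approach resembles quicksort, and it is well worth a read.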

www.cs.ucr.edu/~stelo/cpm/cpm04/25_Baeza-yates.pdf
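
The quicksort-like idea can be sketched roughly as follows. This is a minimal Python sketch, not code from the paper: take the median of the shorter list, binary-search it in the longer list, then recurse on the left and right parts. The names `intersect_sorted` and `_intersect` are made up for illustration, and the inputs are assumed to be sorted lists of distinct integers.

```python
from bisect import bisect_left


def intersect_sorted(a, b):
    """Intersect two sorted lists of distinct integers (illustrative sketch)."""
    out = []
    _intersect(a, 0, len(a), b, 0, len(b), out)
    return out


def _intersect(a, alo, ahi, b, blo, bhi, out):
    # An empty range on either side cannot contribute any common element.
    if alo >= ahi or blo >= bhi:
        return
    # Always pick the pivot from the shorter range and search the longer one.
    if ahi - alo > bhi - blo:
        a, alo, ahi, b, blo, bhi = b, blo, bhi, a, alo, ahi
    mid = (alo + ahi) // 2
    pivot = a[mid]
    # Binary search: first position in b[blo:bhi] whose value is >= pivot.
    pos = bisect_left(b, pivot, blo, bhi)
    # Elements smaller than the pivot can only match to the left of pos.
    _intersect(a, alo, mid, b, blo, pos, out)
    found = pos < bhi and b[pos] == pivot
    if found:
        out.append(pivot)
    # Elements larger than the pivot can only match to the right of pos.
    _intersect(a, mid + 1, ahi, b, pos + 1 if found else pos, bhi, out)


print(intersect_sorted([1, 3, 5, 7, 9], [3, 4, 5, 6, 7, 8]))  # [3, 5, 7]
```

Because the left part is processed before the pivot and the right part after it, the result comes out in sorted order without a final sort.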
 

An algorithm for merging sorted sequences:

https://github.com/rklaehn/rklaehn.github.io/blob/master/_posts/2016-01-05-binarymerge.md
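
As a rough illustration of merging by binary search rather than element-by-element comparison, here is a minimal Python sketch. It is not taken from the linked post; the names `binary_merge` and `_merge` are hypothetical, and the only assumption is that both inputs are already sorted. The middle element of one list is located in the other by binary search, and the two halves are merged recursively.

```python
from bisect import bisect_left


def binary_merge(a, b):
    """Merge two sorted lists into one sorted list (illustrative sketch)."""
    out = []
    _merge(a, 0, len(a), b, 0, len(b), out)
    return out


def _merge(a, alo, ahi, b, blo, bhi, out):
    # If one side is exhausted, the rest of the other side is copied as-is,
    # with no further comparisons.
    if alo >= ahi:
        out.extend(b[blo:bhi])
        return
    if blo >= bhi:
        out.extend(a[alo:ahi])
        return
    mid = (alo + ahi) // 2
    pivot = a[mid]
    # First position in b[blo:bhi] whose value is >= pivot.
    pos = bisect_left(b, pivot, blo, bhi)
    # Everything in b before pos is smaller than the pivot.
    _merge(a, alo, mid, b, blo, pos, out)
    out.append(pivot)
    # Everything in b from pos onward is >= the pivot.
    _merge(a, mid + 1, ahi, b, pos, bhi, out)


print(binary_merge([1, 4, 4, 9], [2, 4, 10]))  # [1, 2, 4, 4, 4, 9, 10]
```

When one list is much shorter than the other, long runs of the longer list are copied in bulk, which is where the savings in comparisons come from.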

 

Summary material: https://github.com/TechConf/CodeMash2016/blob/master/Great%20Galloping%20Cuckoos-%20Algorithms%20Faster%20than%20log(n)/index.html

Key information:

  ## Comparisons of Set Intersections
   
  <small>Excerpted from [Faster Adaptive Set Intersections for Text Searching](http://www.cs.toronto.edu/~tl/papers/wea06.pdf)</small>
   
  Algorithm | # of comparisons
  -----------|----------------:
  Sequential | 119479075
  Adaptive | 83326341
  Small Adaptive | 68706234
  Interpolation Sequential | 55275738
  Interpolation Adaptive | 58558408
  Interpolation Small Adaptive | 44525318
  Extrapolation Small Adaptive | 50018852
  Extrapolate Many Small Adaptive | 44087712
  Extrapolate Ahead Small Adaptive | 43930174
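
To give a feel for why comparison counts differ between strategies, here is a small Python sketch that counts comparisons for a plain sequential (zipper) intersection and for a galloping (exponential plus binary search) intersection. These are not the algorithms measured in the table. The function names and the counting convention (one count per three-way or two-way element comparison) are assumptions made for illustration only.

```python
def linear_intersect(a, b):
    """Zipper-style intersection. Returns (result, comparison_count)."""
    i = j = comps = 0
    out = []
    while i < len(a) and j < len(b):
        comps += 1  # one three-way comparison per step
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out, comps


def _gallop(large, x, start, counter):
    """Return the first index i >= start with large[i] >= x (or len(large))."""
    n = len(large)
    lo = hi = start  # invariant: every element before lo is < x
    step = 1
    # Exponential phase: jump ahead in doubling steps until we pass x.
    while hi < n:
        counter[0] += 1
        if large[hi] >= x:
            break
        lo = hi + 1
        hi += step
        step *= 2
    hi = min(hi, n)
    # Binary phase: the answer now lies in [lo, hi].
    while lo < hi:
        mid = (lo + hi) // 2
        counter[0] += 1
        if large[mid] < x:
            lo = mid + 1
        else:
            hi = mid
    return lo


def gallop_intersect(small, large):
    """Galloping intersection. Returns (result, comparison_count)."""
    counter = [0]
    out = []
    pos = 0
    for x in small:
        pos = _gallop(large, x, pos, counter)
        if pos == len(large):
            break
        counter[0] += 1  # equality check at the landing position
        if large[pos] == x:
            out.append(x)
            pos += 1
    return out, counter[0]


small = [10, 5000, 9999]
large = list(range(10000))
print(linear_intersect(small, large)[1], gallop_intersect(small, large)[1])
```

On skewed inputs like this one, where one list is far shorter than the other, the galloping variant needs far fewer comparisons than the sequential scan.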

 

## Resources: Sets
  - [A Fast Set Intersection Algorithm for Sorted Sequences](http://www.cs.ucr.edu/~stelo/cpm/cpm04/25_Baeza-yates.pdf)
  - [Experimental Analysis of a Fast Intersection Algorithm for Sorted Sequences](https://cs.uwaterloo.ca/~ajsaling/papers/paper-spire.pdf)
  - [Experimental Comparison of Set Intersection Algorithms for Inverted Indexing](http://ceur-ws.org/Vol-1003/58.pdf)
  - [Fast Set Intersection in Memory](http://research.microsoft.com/pubs/142850/p255-dingkoenig.pdf)
  - [Faster Adaptive Set Intersections for Text Searching](http://www.cs.toronto.edu/~tl/papers/wea06.pdf)
  - [Faster Set Intersection with SIMD instructions by Reducing Branch Mispredictions](http://www.vldb.org/pvldb/vol8/p293-inoue.pdf)
  - [SIMD Compression and the Intersection of Sorted Integers](http://arxiv.org/abs/1401.6399)

 

https://github.com/lemire/SIMDCompressionAndIntersection

A C++ library to compress and intersect sorted lists of integers using SIMD instructions
Some of the material it mentions:

Documentation

This work has also inspired other work such as...

 
Mentioned most often:
https://github.com/Randl/CS/tree/master/Hwang-Lin