Consistent Hashing Algorithm (consistent hash)

Notes on the consistent hash algorithm

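Consistent hashing is mainly used in distributed data storage systems. It distributes data as evenly as possible across all storage nodes according to a fixed strategy, which gives the system good load balancing and scalability.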

For details, see this article:

Implementation:

The consistent hash algorithm in groupcache

Source URL:
https://github.com/golang/groupcache/blob/master/consistenthash/consistenthash.go

type Map struct {
    hash     Hash // hash function
    replicas int
    keys     []int // Sorted
    hashMap  map[int]string
}

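The Map struct is the core data structure. hash is the hash function applied to keys. The keys field holds the hashes of all nodes (including virtual nodes) and is kept sorted. hashMap maps each virtual node to its real node.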

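When there are too few service nodes, consistent hashing is prone to data skew because the nodes are unevenly distributed on the ring. To address this, the algorithm introduces virtual nodes: several hashes are computed for each service node, and the node is placed at every resulting position. Each of these placements is called a virtual node. replicas is the number of virtual nodes created for each real node.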
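The effect of virtual nodes on data skew can be observed with a small experiment. The following is a minimal sketch, not the groupcache code itself: the ring type, the distribution helper, the node names (node-a and so on), and the synthetic "key-N" keys are all assumptions made for illustration. It counts how many keys land on each node for different replicas values; the exact counts depend on CRC32, so no expected output is given.

```go
package main

import (
	"fmt"
	"hash/crc32"
	"sort"
	"strconv"
)

// ring is a simplified hash ring used only for this illustration.
type ring struct {
	keys  []int          // sorted positions of all virtual nodes
	owner map[int]string // virtual node position -> real node name
}

// newRing places `replicas` virtual nodes on the ring for every real node.
// replicas must be at least 1.
func newRing(replicas int, nodes ...string) *ring {
	r := &ring{owner: make(map[int]string)}
	for _, n := range nodes {
		for i := 0; i < replicas; i++ {
			h := int(crc32.ChecksumIEEE([]byte(strconv.Itoa(i) + n)))
			r.keys = append(r.keys, h)
			r.owner[h] = n
		}
	}
	sort.Ints(r.keys)
	return r
}

// get returns the real node that owns the first position >= hash(key),
// wrapping around to the first position when none is found.
func (r *ring) get(key string) string {
	h := int(crc32.ChecksumIEEE([]byte(key)))
	idx := sort.SearchInts(r.keys, h)
	if idx == len(r.keys) {
		idx = 0
	}
	return r.owner[r.keys[idx]]
}

// distribution counts how many of n synthetic keys land on each real node.
func distribution(replicas, n int, nodes ...string) map[string]int {
	r := newRing(replicas, nodes...)
	counts := make(map[string]int)
	for i := 0; i < n; i++ {
		counts[r.get("key-"+strconv.Itoa(i))]++
	}
	return counts
}

func main() {
	nodes := []string{"node-a", "node-b", "node-c"}
	for _, replicas := range []int{1, 10, 200} {
		fmt.Println("replicas:", replicas, distribution(replicas, 10000, nodes...))
	}
}
```

With a larger replicas value the per-node counts would generally be expected to move closer together, at the cost of a longer keys slice to sort and search.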

// Adds some keys to the hash.
func (m *Map) Add(keys ...string) {
    for _, key := range keys {
        for i := 0; i < m.replicas; i++ {
            hash := int(m.hash([]byte(strconv.Itoa(i) + key)))
            m.keys = append(m.keys, hash)
            m.hashMap[hash] = key
        }
    }
    sort.Ints(m.keys)
}

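The Add method of Map adds nodes to the ring. It takes one or more strings. Each key is hashed replicas times (with the replica index prepended), which determines that machine's positions on the hash ring. The number of virtual nodes per real node is given by the replicas field, and the mapping from each virtual node to its real node is stored in the hashMap field.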

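For example, in the test case (which uses a hash function that parses the key as a decimal integer), after hash.Add("6", "4", "2") the full set of nodes is 2, 4, 6, 12, 14, 16, 22, 24, 26. hash.Get("11") lands on node 12, and the real node behind 12 is 2.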
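This example can be reproduced with a short program. The sketch below uses a local copy of the Map type that mirrors the code in these notes, so it runs without importing groupcache. The atoiHash helper is a name chosen here for illustration; it assumes every key is a decimal number and panics otherwise.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

type Hash func(data []byte) uint32

// Map is a local copy of the structure described in these notes.
type Map struct {
	hash     Hash
	replicas int
	keys     []int // Sorted
	hashMap  map[int]string
}

func New(replicas int, fn Hash) *Map {
	return &Map{replicas: replicas, hash: fn, hashMap: make(map[int]string)}
}

// Add places replicas virtual nodes on the ring for every key.
func (m *Map) Add(keys ...string) {
	for _, key := range keys {
		for i := 0; i < m.replicas; i++ {
			hash := int(m.hash([]byte(strconv.Itoa(i) + key)))
			m.keys = append(m.keys, hash)
			m.hashMap[hash] = key
		}
	}
	sort.Ints(m.keys)
}

// Get walks the sorted ring and returns the real node for the first
// virtual node whose hash is >= the key's hash, wrapping to the start.
func (m *Map) Get(key string) string {
	if len(m.keys) == 0 {
		return ""
	}
	hash := int(m.hash([]byte(key)))
	for _, v := range m.keys {
		if v >= hash {
			return m.hashMap[v]
		}
	}
	return m.hashMap[m.keys[0]]
}

// atoiHash (hypothetical helper) uses the decimal value of the key as its hash,
// so "0"+"6" hashes to 6, "1"+"6" to 16, and "2"+"6" to 26.
func atoiHash(key []byte) uint32 {
	i, err := strconv.Atoi(string(key))
	if err != nil {
		panic(err)
	}
	return uint32(i)
}

func main() {
	m := New(3, atoiHash)
	m.Add("6", "4", "2")
	fmt.Println("keys:", m.keys) // keys: [2 4 6 12 14 16 22 24 26]
	for _, k := range []string{"2", "11", "23", "27"} {
		fmt.Printf("Get(%q) = %s\n", k, m.Get(k))
	}
	// Get("2") = 2, Get("11") = 2, Get("23") = 4, Get("27") = 2 (wraps around)
}
```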

The data values after hash.Add("6", "4", "2"):

2014/02/20 15:45:16 replicas: 3
2014/02/20 15:45:16 keys: [2 4 6 12 14 16 22 24 26]
2014/02/20 15:45:16 hashmap map[16:6 26:6 4:4 24:4 2:2 6:6 14:4 12:2 22:2]

// Gets the closest item in the hash to the provided key.
func (m *Map) Get(key string) string {
    if m.IsEmpty() {
        return ""
    }

    hash := int(m.hash([]byte(key)))

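    // Walk clockwise and find the first node whose hash is >= the key's hash.
    for _, v := range m.keys {
        if v >= hash {
            return m.hashMap[v] // return the real node
        }
    }

    // The hash is larger than the largest node hash: wrap around to the first node.
    return m.hashMap[m.keys[0]]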
}

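The Get method maps a key to the server node responsible for it. The algorithm: compute the key's hash h with the same hash function H, use h to locate the key's position on the ring, then walk clockwise from that position. The first server encountered is the one the key maps to.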
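The clockwise walk above is a linear scan over the sorted keys. Because the slice is sorted, the same lookup can also be done with a binary search. The sketch below is an alternative to the code shown in these notes, not a copy of it; the lookup function name is an assumption, and the ring data is taken from the test example with nodes 2, 4 and 6.

```go
package main

import (
	"fmt"
	"sort"
)

// lookup returns the index of the first ring position >= h in the sorted
// slice keys, wrapping to index 0 when h is larger than every position.
// keys must be non-empty and sorted in ascending order.
func lookup(keys []int, h int) int {
	idx := sort.Search(len(keys), func(i int) bool { return keys[i] >= h })
	if idx == len(keys) {
		idx = 0 // walked past the end of the ring: wrap to the first node
	}
	return idx
}

func main() {
	// Ring positions and owners from the hash.Add("6", "4", "2") example.
	keys := []int{2, 4, 6, 12, 14, 16, 22, 24, 26}
	owner := map[int]string{
		2: "2", 4: "4", 6: "6",
		12: "2", 14: "4", 16: "6",
		22: "2", 24: "4", 26: "6",
	}
	for _, h := range []int{11, 23, 27} {
		pos := keys[lookup(keys, h)]
		fmt.Printf("hash %d -> position %d -> node %s\n", h, pos, owner[pos])
	}
	// hash 11 -> position 12 -> node 2
	// hash 23 -> position 24 -> node 4
	// hash 27 -> position 2 -> node 2
}
```

The linear scan costs O(n) per lookup in the number of virtual nodes, while the binary search costs O(log n), which matters more as replicas grows.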

Appendix: source code of the groupcache consistent hash algorithm:

/*
Copyright 2013 Google Inc.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

// Package consistenthash provides an implementation of a ring hash.
package consistenthash

import (
    "hash/crc32"
    "sort"
    "strconv"
)

type Hash func(data []byte) uint32

type Map struct {
    hash     Hash
    replicas int
    keys     []int // Sorted
    hashMap  map[int]string
}

func New(replicas int, fn Hash) *Map {
    m := &Map{
        replicas: replicas,
        hash:     fn,
        hashMap:  make(map[int]string),
    }
    if m.hash == nil {
        m.hash = crc32.ChecksumIEEE
    }
    return m
}

// Returns true if there are no items available.
func (m *Map) IsEmpty() bool {
    return len(m.keys) == 0
}

// Adds some keys to the hash.
func (m *Map) Add(keys ...string) {
    for _, key := range keys {
        for i := 0; i < m.replicas; i++ {
            hash := int(m.hash([]byte(strconv.Itoa(i) + key)))
            m.keys = append(m.keys, hash)
            m.hashMap[hash] = key
        }
    }
    sort.Ints(m.keys)
}

// Gets the closest item in the hash to the provided key.
func (m *Map) Get(key string) string {
    if m.IsEmpty() {
        return ""
    }

    hash := int(m.hash([]byte(key)))

    // Linear search for appropriate replica.
    for _, v := range m.keys {
        if v >= hash {
            return m.hashMap[v]
        }
    }

    // Means we have cycled back to the first replica.
    return m.hashMap[m.keys[0]]
}