A Bug in Memcached Distributed Caching

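       Memcached's distribution strategy is not implemented on the server side; the servers are unaware of one another. Distribution is implemented by the client code (Memcached.ClientLibrary) through a key-to-server mapping. The basic idea is to compute a hash of the cache key, take that hash modulo the number of servers, and assign the key to the server whose index equals the result.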
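
       The mapping step can be sketched in a few lines. This is only an illustration of the modulo arithmetic, not the library's code; the class and method names (BucketSketch, BucketFor) are hypothetical. Because the remainder operator in C# can return a negative value for a negative hash, the sketch shifts negative results back into range.

```csharp
using System;

public static class BucketSketch
{
    // Maps a hash value to a server index in the range [0, serverCount).
    // Hypothetical helper for illustration only.
    public static int BucketFor(int hv, int serverCount)
    {
        if (serverCount <= 0)
            throw new ArgumentOutOfRangeException("serverCount");

        int bucket = hv % serverCount;   // may be negative when hv is negative
        if (bucket < 0)
            bucket += serverCount;       // shift back into [0, serverCount)
        return bucket;
    }
}
```

       For example, with three servers a hash of 7 maps to index 1 and a hash of -7 maps to index 2. Every client must compute the same hash for a given key, otherwise the same key maps to different servers.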

       The core code that Memcached.ClientLibrary uses to compute the hash code of a cache key and pick a server is as follows:

  1 /// <summary>
  2 /// Returns appropriate SockIO object given
  3 /// string cache key and optional hashcode.
  4 /// 
  5 /// Trys to get SockIO from pool.  Fails over
  6 /// to additional pools in event of server failure.
  7 /// </summary>
  8 /// <param name="key">hashcode for cache key</param>
  9 /// <param name="hashCode">if not null, then the int hashcode to use</param>
 10 /// <returns>SockIO obj connected to server</returns>
 11 public SockIO GetSock(string key, object hashCode)
 12 {
 13     string hashCodeString = "<null>";
 14     if(hashCode != null)
 15         hashCodeString = hashCode.ToString();
 16 
 17     if(Log.IsDebugEnabled)
 18     {
 19         Log.Debug(GetLocalizedString("cache socket pick").Replace("$$Key$$", key).Replace("$$HashCode$$", hashCodeString));
 20     }
 21 
 22     if (key == null || key.Length == 0)
 23     {
 24         if(Log.IsDebugEnabled)
 25         {
 26             Log.Debug(GetLocalizedString("null key"));
 27         }
 28         return null;
 29     }
 30 
 31     if(!_initialized)
 32     {
 33         if(Log.IsErrorEnabled)
 34         {
 35             Log.Error(GetLocalizedString("get socket from uninitialized pool"));
 36         }
 37         return null;
 38     }
 39 
 40     // if no servers return null
 41     if(_buckets.Count == 0)
 42         return null;
 43 
 44     // if only one server, return it
 45     if(_buckets.Count == 1)
 46         return GetConnection((string)_buckets[0]);
 47 
 48     int tries = 0;
 49 
 50     // generate hashcode
 51     int hv;
 52     if(hashCode != null)
 53     {
 54         hv = (int)hashCode;
 55     }
 56     else
 57     {
 58 
 59         // NATIVE_HASH = 0
 60         // OLD_COMPAT_HASH = 1
 61         // NEW_COMPAT_HASH = 2
 62         switch(_hashingAlgorithm)
 63         {
 64             case HashingAlgorithm.Native:
 65                 hv = key.GetHashCode();
 66                 break;
 67 
 68             case HashingAlgorithm.OldCompatibleHash:
 69                 hv = OriginalHashingAlgorithm(key);
 70                 break;
 71 
 72             case HashingAlgorithm.NewCompatibleHash:
 73                 hv = NewHashingAlgorithm(key);
 74                 break;
 75 
 76             default:
 77                 // use the native hash as a default
 78                 hv = key.GetHashCode();
 79                 _hashingAlgorithm = HashingAlgorithm.Native;
 80                 break;
 81         }
 82     }
 83 
 84     // keep trying different servers until we find one
 85     while(tries++ <= _buckets.Count)
 86     {
 87         // get bucket using hashcode 
 88         // get one from factory
 89         int bucket = hv % _buckets.Count;
 90         if(bucket < 0)
 91             bucket += _buckets.Count;
 92 
 93         SockIO sock = GetConnection((string)_buckets[bucket]);
 94 
 95         if(Log.IsDebugEnabled)
 96         {
 97             Log.Debug(GetLocalizedString("cache choose").Replace("$$Bucket$$", _buckets[bucket].ToString()).Replace("$$Key$$", key));
 98         }
 99 
100         if(sock != null)
101             return sock;
102 
103         // if we do not want to failover, then bail here
104         if(!_failover)
105             return null;
106 
107         // if we failed to get a socket from this server
108         // then we try again by adding an incrementer to the
109         // current key and then rehashing 
110         switch(_hashingAlgorithm)
111         {
112             case HashingAlgorithm.Native:
113                 hv += ((string)("" + tries + key)).GetHashCode();
114                 break;
115 
116             case HashingAlgorithm.OldCompatibleHash:
117                 hv += OriginalHashingAlgorithm("" + tries + key);
118                 break;
119 
120             case HashingAlgorithm.NewCompatibleHash:
121                 hv += NewHashingAlgorithm("" + tries + key);
122                 break;
123 
124             default:
125                 // use the native hash as a default
126                 hv += ((string)("" + tries + key)).GetHashCode();
127                 _hashingAlgorithm = HashingAlgorithm.Native;
128                 break;
129         }
130     }
131 
132     return null;
133 }
Core code that selects a server for a given cache key

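       The source (lines 62-82) shows three algorithms for computing the hash code:

      (1) HashingAlgorithm.Native: uses .NET's own hash algorithm. It is fast, but may be incompatible with other clients, for example when the cache has to be shared with Java or Ruby clients;

      (2) HashingAlgorithm.OldCompatibleHash: compatible with other clients, but slow;

      (3) HashingAlgorithm.NewCompatibleHash: compatible with other clients, and reportedly fast.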

       Further analysis shows that Memcached.ClientLibrary hashes cache keys with HashingAlgorithm.Native by default, and that HashingAlgorithm.Native computes the hash code as "hv = key.GetHashCode()", that is, with the GetHashCode() method built into the .NET string type.

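       This is where the bug surfaces. According to Microsoft's documentation of GetHashCode (http://msdn.microsoft.com/zh-cn/library/system.object.gethashcode.aspx): "the .NET Framework does not guarantee the default implementation of the GetHashCode method, and the value this method returns may differ between .NET Framework versions and platforms, such as 32-bit and 64-bit platforms." In other words, string.GetHashCode() is not guaranteed to return the same value for the same string on different platforms. As a result, client machines that differ in .NET version or bitness may compute different hash codes for the same cache key, so the data for one key may end up cached on different Memcached servers. Data consistency can no longer be guaranteed, and code that clears the cache may silently fail because it deletes the key from the wrong server.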

// 64-bit, .NET 4.0
[__DynamicallyInvokable, ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail), SecuritySafeCritical]
public unsafe override int GetHashCode()
{
    if (HashHelpers.s_UseRandomizedStringHashing)
    {
        return string.InternalMarvin32HashString(this, this.Length, 0L);
    }
    IntPtr arg_25_0;
    IntPtr expr_1C = arg_25_0 = this;
    if (expr_1C != 0)
    {
        arg_25_0 = (IntPtr)((int)expr_1C + RuntimeHelpers.OffsetToStringData);
    }
    char* ptr = arg_25_0;
    int num = 5381;
    int num2 = num;
    char* ptr2 = ptr;
    int num3;
    while ((num3 = (int)(*(ushort*)ptr2)) != 0)
    {
        num = ((num << 5) + num ^ num3);
        num3 = (int)(*(ushort*)(ptr2 + (IntPtr)2 / 2));
        if (num3 == 0)
        {
            break;
        }
        num2 = ((num2 << 5) + num2 ^ num3);
        ptr2 += (IntPtr)4 / 2;
    }
    return num + num2 * 1566083941;
}


// 64-bit, .NET 2.0
// string
[ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail)]
public unsafe override int GetHashCode()
{
    IntPtr arg_0F_0;
    IntPtr expr_06 = arg_0F_0 = this;
    if (expr_06 != 0)
    {
        arg_0F_0 = (IntPtr)((int)expr_06 + RuntimeHelpers.OffsetToStringData);
    }
    char* ptr = arg_0F_0;
    int num = 5381;
    int num2 = num;
    char* ptr2 = ptr;
    int num3;
    while ((num3 = (int)(*(ushort*)ptr2)) != 0)
    {
        num = ((num << 5) + num ^ num3);
        num3 = (int)(*(ushort*)(ptr2 + (IntPtr)2 / 2));
        if (num3 == 0)
        {
            break;
        }
        num2 = ((num2 << 5) + num2 ^ num3);
        ptr2 += (IntPtr)4 / 2;
    }
    return num + num2 * 1566083941;
}

// 32-bit, .NET 4.0
[__DynamicallyInvokable, ReliabilityContract(Consistency.WillNotCorruptState, Cer.MayFail), SecuritySafeCritical]
public unsafe override int GetHashCode()
{
    if (HashHelpers.s_UseRandomizedStringHashing)
    {
        return string.InternalMarvin32HashString(this, this.Length, 0L);
    }
    IntPtr arg_25_0;
    IntPtr expr_1C = arg_25_0 = this;
    if (expr_1C != 0)
    {
        arg_25_0 = (IntPtr)((int)expr_1C + RuntimeHelpers.OffsetToStringData);
    }
    char* ptr = arg_25_0;
    int num = 352654597;
    int num2 = num;
    int* ptr2 = (int*)ptr;
    int i;
    for (i = this.Length; i > 2; i -= 4)
    {
        num = ((num << 5) + num + (num >> 27) ^ *ptr2);
        num2 = ((num2 << 5) + num2 + (num2 >> 27) ^ ptr2[(IntPtr)4 / 4]);
        ptr2 += (IntPtr)8 / 4;
    }
    if (i > 0)
    {
        num = ((num << 5) + num + (num >> 27) ^ *ptr2);
    }
    return num + num2 * 1566083941;
}
GetHashCode implementations in several versions
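
       To make the divergence concrete, the two non-randomized code paths above can be ported to safe C# and run side by side on any machine. This is a sketch under stated assumptions: the names LegacyStringHash, Hash64, Hash32, CharAt and Pair are hypothetical, the randomized (Marvin32) branch is ignored, and the port reads a 0 in place of the string's null terminator, where the decompiled code reads raw memory.

```csharp
using System;

public static class LegacyStringHash
{
    // UTF-16 code unit at index i, or 0 at/after the end of the string
    // (stands in for the null terminator the unsafe code reads).
    private static int CharAt(string s, int i)
    {
        return i < s.Length ? s[i] : 0;
    }

    // Two adjacent code units packed into one int, low half first,
    // as the 32-bit listing sees them through an int* on little-endian x86.
    private static int Pair(string s, int i)
    {
        return CharAt(s, i) | (CharAt(s, i + 1) << 16);
    }

    // Port of the 64-bit listing (non-randomized path).
    public static int Hash64(string s)
    {
        unchecked
        {
            int num = 5381;
            int num2 = num;
            for (int i = 0; CharAt(s, i) != 0; i += 2)
            {
                num = ((num << 5) + num) ^ CharAt(s, i);
                int next = CharAt(s, i + 1);
                if (next == 0)
                    break;
                num2 = ((num2 << 5) + num2) ^ next;
            }
            return num + num2 * 1566083941;
        }
    }

    // Port of the 32-bit listing (non-randomized path).
    public static int Hash32(string s)
    {
        unchecked
        {
            int num = 352654597;
            int num2 = num;
            int p = 0; // position in code units; advances 4 per iteration
            int i;
            for (i = s.Length; i > 2; i -= 4)
            {
                num = ((num << 5) + num + (num >> 27)) ^ Pair(s, p);
                num2 = ((num2 << 5) + num2 + (num2 >> 27)) ^ Pair(s, p + 2);
                p += 4;
            }
            if (i > 0)
                num = ((num << 5) + num + (num >> 27)) ^ Pair(s, p);
            return num + num2 * 1566083941;
        }
    }
}
```

       For the key "a", Hash64 yields 372029373 and Hash32 yields -842352705. With four servers and the bucket arithmetic from GetSock, those values select index 1 and index 3 respectively, so a 64-bit client and a 32-bit client would read and write the same key on different servers.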

      The fix is to stop using the client library's default hash algorithm. There are two ways to do that:

     (1) When initializing the Memcached server pool, select one of the other hash algorithms that the client library ships with, for example "this.pool.HashingAlgorithm = HashingAlgorithm.OldCompatibleHash;".

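     (2) Implement a custom hash algorithm and pass the hash value explicitly when calling set(), get(), delete() and similar methods; these methods have overloads that accept a hash code.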
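
       For option (2), any hash that depends only on the bytes of the key will do. The sketch below uses 32-bit FNV-1a over the key's UTF-8 bytes. That choice is an assumption made for illustration, not something the client library prescribes, and the names StableKeyHash and Fnv1a32 are hypothetical. The returned int would then be passed to the hash-code overloads mentioned above.

```csharp
using System;
using System.Text;

public static class StableKeyHash
{
    // 32-bit FNV-1a over the UTF-8 bytes of the key.
    // The result depends only on the key's bytes, not on the
    // .NET version or on whether the process is 32-bit or 64-bit.
    public static int Fnv1a32(string key)
    {
        if (key == null)
            throw new ArgumentNullException("key");

        unchecked
        {
            uint hash = 2166136261;          // FNV offset basis
            foreach (byte b in Encoding.UTF8.GetBytes(key))
            {
                hash ^= b;
                hash *= 16777619;            // FNV prime
            }
            return (int)hash;                // may be negative; GetSock handles that
        }
    }
}
```

       Every client that shares the cache has to use the same function, and switching hash functions on a live cluster remaps existing keys, so the cache effectively starts cold after the change.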

 

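       References:
       - Analysis of how the Memcached client distributes cached data across multiple servers (repost)
       - memcached client - memcacheddotnet (Memcached.ClientLibrary) 1.1.5
       - Distributed implementation of memcache
       - Object.GetHashCode method
       - On the feasibility of using a HashCode as a key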
