Python Dictionaries and Hashing

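    Anyone who has read about Python dictionaries has seen the rule that keys must be immutable: numbers, strings, and tuples can serve as keys, but lists cannot.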

    So why can't mutable types be used as keys?

    And can user-defined types be used as keys?

    First, let's see what happens when a list is used as a key:

lsta = [1,2,3]
dicta = {lsta:1}

>>>TypeError: unhashable type: 'list'

    Now look at the dictionary source code in dictobject.c:

static int
insertdict(register PyDictObject *mp, PyObject *key, long hash, PyObject *value)

    Note the parameter long hash. This value is computed by calling the key's own hash function, and every type defines its own.
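    The same per-type dispatch can be observed from Python through the built-in hash(). The following is a minimal sketch; the sample values and the names samples and results are chosen here purely for illustration.

```python
# Ask several built-in objects for their hash; only the list refuses.
samples = [42, "spam", (1, 2, 3), [1, 2, 3]]
results = {}
for sample in samples:
    type_name = type(sample).__name__
    try:
        hash(sample)                      # dispatches to the type's own hash function
        results[type_name] = "hashable"
    except TypeError as exc:
        results[type_name] = "TypeError: %s" % exc
    print("%-5s -> %s" % (type_name, results[type_name]))
```

    The list is rejected at the hash() call itself, before any dictionary is involved.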

    The int type, intobject.c:

PyTypeObject PyInt_Type = {
    ...
        (hashfunc)int_hash,            /* tp_hash */
    ...
}

static long
int_hash(PyIntObject *v)
{
    /* XXX If this is changed, you also need to change the way
       Python's long, float and complex types are hashed. */
    long x = v -> ob_ival;
    if (x == -1)
        x = -2;
    return x;
}
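    The behaviour of int_hash is easy to check from the interpreter. This sketch assumes CPython, where a small integer hashes to itself and -1 is remapped to -2 because -1 is reserved as the error return of a hash function; the variable names are illustrative only.

```python
# Small integers hash to their own value in CPython ...
small_values = [0, 1, 7, 1000, -2]
int_hashes = dict((n, hash(n)) for n in small_values)

# ... except -1, which int_hash replaces with -2.
minus_one_hash = hash(-1)

print(int_hashes)
print(minus_one_hash)
```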

    The string type, stringobject.c:

 
 
PyTypeObject PyString_Type = {
    ...
        (hashfunc)string_hash,  /* tp_hash */
    ...
}

static long
string_hash(PyStringObject *a)
{
    register Py_ssize_t len;
    register unsigned char *p;
    register long x;

    if (a->ob_shash != -1)
        return a->ob_shash;
    len = Py_SIZE(a);
    p = (unsigned char *) a->ob_sval;
    x = *p << 7;
    while (--len >= 0)
        x = (1000003*x) ^ *p++;
    x ^= Py_SIZE(a);
    if (x == -1)
        x = -2;
    a->ob_shash = x;
    return x;
}
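    To make the loop in string_hash easier to follow, here is a rough Python re-trace of it. The function name py2_string_hash and the bits parameter (the assumed width of a C long) are inventions for this sketch, and the ob_shash cache is left out. Modern Python versions hash strings with a different, randomized algorithm, so the built-in hash() is not expected to agree with it.

```python
def py2_string_hash(data, bits=64):
    """Re-trace string_hash for a byte string, emulating a C long of `bits` bits."""
    mask = (1 << bits) - 1
    octets = bytearray(data)              # integers 0..255 on Python 2 and 3
    # An empty C string holds only its terminating NUL, so *p is 0.
    x = ((octets[0] if octets else 0) << 7) & mask
    for octet in octets:
        x = ((1000003 * x) ^ octet) & mask    # wrap around like C long arithmetic
    x ^= len(octets)
    if x >= 1 << (bits - 1):              # reinterpret the bit pattern as signed
        x -= 1 << bits
    if x == -1:                           # -1 is reserved for signalling an error
        x = -2
    return x

empty_hash = py2_string_hash(b"")
a_hash = py2_string_hash(b"a")
print(empty_hash)
print(a_hash)
```

    With a 64-bit long, the one-character string "a" works out to 12416037344, and the empty string hashes to 0.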

    The tuple type, tupleobject.c:


PyTypeObject PyTuple_Type = {
    ...
    (hashfunc)tuplehash,    /* tp_hash */
    ...
}

static long
tuplehash(PyTupleObject *v)
{
    register long x, y;
    register Py_ssize_t len = Py_SIZE(v);
    register PyObject **p;
    long mult = 1000003L;
    x = 0x345678L;
    p = v->ob_item;
    while (--len >= 0) {
        y = PyObject_Hash(*p++);
        if (y == -1)
            return -1;
        x = (x ^ y) * mult;
        /* the cast might truncate len; that doesn't change hash stability */
        mult += (long)(82520L + len + len);
    }
    x += 97531L;
    if (x == -1)
        x = -2;
    return x;
}
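    The important detail in tuplehash is that every item is hashed in turn, so one unhashable item makes the whole tuple unhashable. The sketch below re-traces the function in Python under the same assumptions as before: py2_tuple_hash and bits are hypothetical names, item hashes come from the running interpreter's hash(), and recent CPython versions use a different tuple algorithm, so only the structure is meant to carry over.

```python
def py2_tuple_hash(items, bits=64):
    """Re-trace tuplehash, emulating a C long of `bits` bits."""
    mask = (1 << bits) - 1
    mult = 1000003
    x = 0x345678
    remaining = len(items)
    for item in items:
        remaining -= 1                    # mirrors `while (--len >= 0)`
        y = hash(item)                    # raises TypeError for an unhashable item
        x = ((x ^ y) * mult) & mask
        mult = (mult + 82520 + remaining + remaining) & mask
    x = (x + 97531) & mask
    if x >= 1 << (bits - 1):              # reinterpret the bit pattern as signed
        x -= 1 << bits
    if x == -1:
        x = -2
    return x

empty_tuple_hash = py2_tuple_hash(())
try:
    py2_tuple_hash(([1, 2, 3],))
    nested_list_outcome = "hashed"
except TypeError:
    nested_list_outcome = "TypeError"

print(empty_tuple_hash)
print(nested_list_outcome)
```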

 

    The list type, listobject.c:

PyTypeObject PyList_Type = {
    ...
    (hashfunc)PyObject_HashNotImplemented,    /* tp_hash */
    ...
}

    PyObject_HashNotImplemented is defined for the base object type, in object.c:

long
PyObject_HashNotImplemented(PyObject *self)
{
    PyErr_Format(PyExc_TypeError, "unhashable type: '%.200s'",
             self->ob_type->tp_name);
    return -1;
}

long
PyObject_Hash(PyObject *v)
{
    PyTypeObject *tp = v->ob_type;
    if (tp->tp_hash != NULL)
        return (*tp->tp_hash)(v);
    if (tp->tp_compare == NULL && RICHCOMPARE(tp) == NULL) {
        return _Py_HashPointer(v); /* Use address as hash value */
    }
    /* If there's a cmp but no hash defined, the object can't be hashed */
    return PyObject_HashNotImplemented(v);
}

    This shows why a list cannot be used as a key: the hash function of the list type is PyObject_HashNotImplemented, and calling it raises a TypeError.
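    At the Python level, the missing hash function shows up as list.__hash__ being None, while tuple.__hash__ is a real callable. A small check, with variable names chosen only for this sketch:

```python
# The list type advertises that it has no hash function.
list_hash_slot = list.__hash__
tuple_hash_slot = tuple.__hash__

try:
    hash([1, 2, 3])
    list_outcome = "hashed"
except TypeError:
    list_outcome = "TypeError"

print(list_hash_slot)            # None
print(callable(tuple_hash_slot))
print(list_outcome)
```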

    The tuplehash function above also shows that, although tuples can be keys, a tuple is still unusable as a key if any item it contains cannot be hashed. For example, put a list inside a tuple:

lista = [1,2,3]
tuplea = (lista,)
dicta = {tuplea:1}

>>>TypeError: unhashable type: 'list'
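    One way to test a candidate key up front is simply to try hashing it. The helper is_hashable below is a hypothetical name, not a standard function; the sketch also shows that freezing the inner list into a tuple makes the outer tuple usable as a key.

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

lista = [1, 2, 3]
bad_key = (lista,)               # contains a list, so hashing fails
good_key = (tuple(lista),)       # contains only hashable items

print(is_hashable(bad_key))
print(is_hashable(good_key))

dicta = {good_key: 1}
print(dicta[((1, 2, 3),)])
```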

     

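    For a user-defined type, if __hash__ is not overridden, the hash function of the base class object is used, which by default returns the object's address in the version examined here. If __hash__ is overridden, the new hash function is used instead. The three cases look like this: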

#1) Override __hash__ so that it raises an exception: instances cannot be used as keys
class my_class(object):
    def __hash__(self):
        raise TypeError("unhashable type: my_class")

obj = my_class()
dicta = {obj:1}

>>>TypeError: unhashable type: my_class
 
 
#2) Override __hash__ without raising an exception
class my_class2(object):
    def __init__(self, value):
        self.value = value
    def __hash__(self):
        return self.value
        
obj = my_class2(1)
obj2 = my_class2(2)
obj3 = my_class2(1)
dicta = {obj:1, obj2:2, obj3:3}
print(hash(obj))
print(hash(obj2))
print(hash(obj3))
print(dicta)

>>>1
   2
   1
   {<__main__.my_class2 object at 0x01A94490>: 1, <__main__.my_class2 object at 0x01A944B0>: 2, <__main__.my_class2 object at 0x01A944D0>: 3}
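    Note that obj and obj3 hash to the same value yet remain separate keys, because without an __eq__ method two instances compare equal only when they are the same object. The following sketch, using a hypothetical class ValueKey, adds __eq__ so that equal values collapse into a single key.

```python
class ValueKey(object):
    """Key whose hash and equality are both based on `value`."""
    def __init__(self, value):
        self.value = value
    def __hash__(self):
        return hash(self.value)
    def __eq__(self, other):
        return isinstance(other, ValueKey) and self.value == other.value
    def __ne__(self, other):              # needed explicitly on Python 2
        return not self == other

first = ValueKey(1)
second = ValueKey(2)
third = ValueKey(1)                       # equal to `first`
dicta = {first: 1, second: 2, third: 3}

print(len(dicta))                         # two entries, not three
print(dicta[ValueKey(1)])                 # the later value replaced the earlier one
```

    The hash only selects where to look; the equality comparison decides whether two keys are the same.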
 
 
#3) Do not override __hash__
class my_class3(object):
    pass
        
obj = my_class3()
obj2 = my_class3()
dicta = {obj:1, obj2:2}
print(hash(obj))
print(hash(obj2))
print(dicta)

>>>29050192
   29050224
   {<__main__.my_class3 object at 0x01BB4550>: 1, <__main__.my_class3 object at 0x01BB4570>: 2}
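    This also suggests an answer to the opening question. If a key's hash depends on contents that can change, then changing the key after insertion leaves the entry filed under its old hash. The sketch below illustrates this with a hypothetical MutableKey class; the lookup results described in the comments are what CPython produces for this small dictionary.

```python
class MutableKey(object):
    """Key whose hash follows a value that can be changed later."""
    def __init__(self, value):
        self.value = value
    def __hash__(self):
        return hash(self.value)
    def __eq__(self, other):
        return isinstance(other, MutableKey) and self.value == other.value

key = MutableKey(1)
table = {key: "stored"}
found_before = key in table      # True: looked up under hash 1

key.value = 2                    # mutate the key while it sits in the dict
found_after = key in table       # False: now looked up under hash 2

print(found_before)
print(found_after)
print(len(table))                # the entry is still there, just unreachable
```

    Built-in mutable containers such as list avoid this trap by not providing a hash at all.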