[Translation] Implementing a Simple Hash Table in C (3)

Date: 2019-12-05

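In the previous chapter we covered the hash table's data structure and implemented basic initialization and deletion of the hash table. In this chapter we explain the hash function and its algorithm, and then implement a hash function by hand.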

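Hash function

The hash function we implement in this tutorial should do the following:

Take a string as input and return a number between 0 and m - 1, where m is the size of the hash table (the number of buckets).
Return an even distribution of bucket indexes for an average set of inputs. If the hash function is not evenly distributed, many records end up in the same bucket, which raises the rate of collisions, and collisions reduce the efficiency of our hash table.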
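
To make the second requirement concrete, here is a minimal sketch of how an uneven hash function piles keys into a few buckets. Both `bucket_histogram` and `first_char_hash` are hypothetical names introduced only for this illustration; they are not part of the tutorial's hash table.

```c
#include <stddef.h>

/* Hypothetical helper: counts how many keys land in each of the m buckets. */
static void bucket_histogram(const char *const keys[], size_t n_keys,
                             int (*hash_fn)(const char *, int),
                             int counts[], int m) {
    for (int b = 0; b < m; b++) {
        counts[b] = 0;
    }
    for (size_t k = 0; k < n_keys; k++) {
        counts[hash_fn(keys[k], m)]++;
    }
}

/* A deliberately poor hash (assumption for illustration): it looks only at
 * the first character, so every key starting with the same letter collides. */
static int first_char_hash(const char *s, int m) {
    return (unsigned char)s[0] % m;
}
```

With the keys "cat", "car", "cab" and "dog" and m = 53, the three keys starting with 'c' all fall into the same bucket, while a hash that takes every character into account would be expected to spread them out.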

Hash algorithm

We will use a generic string hash function, expressed in pseudocode as follows:

function hash(string, a, num_buckets):
    hash = 0
    string_len = length(string)
    for i = 0, 1, ..., string_len - 1:
        hash += (a ** (string_len - (i+1))) * char_code(string[i])
    hash = hash % num_buckets
    return hash

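This hash function has two steps:

Convert the string to a large integer
Reduce the size of that integer to a fixed range by taking its remainder mod m

The variable a should be a prime number larger than the size of the alphabet. We are hashing ASCII strings, which have an alphabet size of 128, so we should choose a prime larger than that.

char_code is a function that returns the integer representing a character. Here we use ASCII character codes.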
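
The two steps can be sketched separately in C. This is only an illustration: `string_to_integer` and `reduce_to_range` are hypothetical names that do not appear in the tutorial's code, and the first step is only meaningful for short strings, because the value quickly outgrows a 64-bit integer. In C, a character's code is simply its integer value, so no separate char_code function is needed.

```c
#include <stddef.h>
#include <string.h>

/* Step 1 (hypothetical helper): treat the string as digits in base a.
 * Only exact for short strings; longer ones wrap around modulo 2^64. */
static unsigned long long string_to_integer(const char *s,
                                            unsigned long long a) {
    size_t len = strlen(s);
    unsigned long long value = 0;
    for (size_t i = 0; i < len; i++) {
        /* Compute a ** (len - (i + 1)) with integer multiplication. */
        unsigned long long power = 1;
        for (size_t j = 0; j < len - (i + 1); j++) {
            power *= a;
        }
        /* The cast yields the character code, e.g. 'c' is 99 in ASCII. */
        value += power * (unsigned long long)(unsigned char)s[i];
    }
    return value;
}

/* Step 2 (hypothetical helper): reduce the integer to the range [0, m). */
static int reduce_to_range(unsigned long long value, int m) {
    return (int)(value % (unsigned long long)m);
}
```

For the string "cat" with a = 151, the first step gives 2272062 and the second step, with m = 53, gives 5.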

Let's try the hash function out:

hash("cat", 151, 53)

// Step-by-step expansion
hash = (151**2 * 99 + 151**1 * 97 + 151**0 * 116) % 53
hash = (2257299 + 14647 + 116) % 53
hash = (2272062) % 53
hash = 5

Changing the value of a gives us a different result:

hash("cat", 163, 53) = 21

Implementation

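// hash_table.c
#include <math.h>    // pow
#include <string.h>  // strlen

// Hashes the string s into the range [0, m), using the prime a as the base.
static int ht_hash(const char* s, const int a, const int m) {
    long hash = 0;
    const int len_s = strlen(s);
    for (int i = 0; i < len_s; i++) {
        // Add this character's term, then reduce mod m to keep hash small.
        hash += (long)pow(a, len_s - (i+1)) * s[i];
        hash = hash % m;
    }
    return (int)hash;
}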
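
One thing to be aware of in the code above: pow works in floating point, and for strings longer than a handful of characters the value of a raised to the string length no longer fits in a long, so the cast is not safe. As an alternative sketch (not the tutorial's method), the same polynomial can be evaluated with Horner's rule, reducing mod m at every step so that only small integers are ever involved. The name `ht_hash_horner` is hypothetical.

```c
#include <stddef.h>

/* Alternative sketch: evaluates the same polynomial hash with Horner's rule.
 * Because hash stays below m after each step, hash * a + character code
 * always fits comfortably in a long long for int-sized a and m. */
static int ht_hash_horner(const char *s, int a, int m) {
    long long hash = 0;
    for (size_t i = 0; s[i] != '\0'; i++) {
        /* Shift the running value by one "digit" in base a, add the next
         * character code, and reduce immediately. */
        hash = (hash * a + (unsigned char)s[i]) % m;
    }
    return (int)hash;
}
```

For "cat" this gives the same values as the worked example: 5 with a = 151 and m = 53, and 21 with a = 163 and m = 53. It also needs neither math.h nor a separate strlen call.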