Experiences Using Valgrind memcheck, a Memory Leak Detection Tool for C/C++

Date: 2019-11-17

Tags: c++, memory leak detection tool, valgrind, memcheck, usage experience. Category: C&C++

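Valgrind on Linux is a really sharp tool (if you have not heard of Valgrind, please see references (1) and (2)). It helped me find many memory management errors in my C++ code. A while ago I was puzzled that a program which ran fine under VS 2013 crashed when compiled with g++ and run on Linux, leaving me with a pile of assembly output I could not read. After being stuck for a long time, I concluded it had to be a memory problem. VS generally does not check for this kind of thing: even if your program is riddled with memory leaks and memory management errors, as long as they do not disturb execution and nothing reads memory it should not, VS will not tell you (I assumed at the time that VS simply had no built-in memory checking). So a program written with VS is not necessarily correct or robust.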

------------------------------------------------------------------------------------------------------------------------------

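Update: thanks to @shines77 on cnblogs for the kind recommendation. VS has a memory leak detection plug-in, VLD (Visual Leak Detector). It has to be downloaded and installed (see the official instructions), and it is very simple to use: add #include <vld.h> to the first entry-point source file, and the report appears in the Output window. I installed and tried it, but, whether because of a faulty installation or something else, the output was always "No memory leaks detected." regardless of whether the program leaked.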

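Below are the results of my first Valgrind run and the results after some incremental fixes (I am not done yet, so quite a few leaks remain...):

First run: a painful sight, because the program is fairly large.

After fixing things step by step following the hints, a few errors and leaks remain and I am still working on them, but at least the program now runs successfully...

I am truly grateful to Valgrind for finding a pile of memory problems. Along the way I felt embarrassed by my own basic mistakes, so I am writing them down so that I remember them.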

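1. The most frequent and most basic mistake: mismatched use of malloc/new/new[] and free/delete/delete[]

These errors came mainly from my not understanding C++'s new/new[] and delete/delete[] mechanisms. I released everything allocated with new or new[] using plain delete, and some variables allocated with malloc were released with delete, so allocation and release were mismatched in many places and a lot of memory was never freed. For easier maintenance I later switched to new/new[] and delete/delete[] throughout and dropped C's malloc and free.

If we divide the types a user allocates with new into built-in types and user-defined types, the following operations should be familiar to everyone and are entirely correct.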

(1) Built-in types

One-dimensional pointer:

// allocate
int *d = new int[5];

// release
delete[] d;

Two-dimensional pointer:

// allocate
int **d = new int*[5];
for (int i = 0; i < 5; i++)
    d[i] = new int[10];

// release
for (int i = 0; i < 5; i++)
    delete[] d[i];
delete[] d;

(2) User-defined types

For example, a type like this:

#include <stdio.h>

class DFA {
    bool is_mark;
    char *s;

public:
    ~DFA() { printf("delete it.\n"); }
};

One-dimensional pointer:

DFA *d = new DFA();
delete d;

Two-dimensional pointer:

// allocate
DFA **d = new DFA*[5];
for (int i = 0; i < 5; i++)
    d[i] = new DFA();

// release
for (int i = 0; i < 5; i++)
    delete d[i];
delete[] d;

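None of this causes any problem, because new/delete and new[]/delete[] are used in matching pairs, and all of it passes cleanly under Valgrind. But why do they have to be paired? What is the underlying principle?

Digging into this may seem to have little practical value, but it is worth studying for anyone who wants a deeper understanding of C++ internals, or for a novice programmer like me whose incorrect releases keep causing large memory leaks. Once you know the reason, you are less likely to repeat the same basic mistakes.

Reference (3) describes it as follows: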

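Normally, new hands back a block of the size the user requested, but a larger block is actually allocated, so that delete can later release exactly that block.

The extra space sits just before the pointer the user receives and just after the end of the user's region.

In fact: blockSize = sizeof(_CrtMemBlockHeader) + nSize + nNoMansLandSize; where blockSize is the size actually allocated by the system, and _CrtMemBlockHeader is the header of the new'd block, holding the requested size and some other information (these names come from the debug heap of Microsoft's C runtime). nNoMansLandSize is the size of the trailing overrun guard, normally 4 bytes filled with 0xFD ("FDFDFDFD"); if the user writes past the end into this region, the check triggers an assert. nSize is the memory that is actually available to us.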

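There are two cases when the user calls new:
A. the type is a built-in type, or a struct without a user-defined destructor
B. the type is a struct or class with a user-defined destructor

The difference is that when there is a user-defined destructor, delete must call it. So how does the compiler know, at delete[] time, how many objects need their destructor called? The count is recorded at new[] time: in case B the compiler reserves 4 extra bytes (on a 32-bit system), after the allocation header and just before the pointer the user receives, to store the array length, so that delete[] can read the count and call the right number of destructors.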

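This description may be a little hard to follow; reference (4) gives a more detailed explanation that makes it click. The explanation also implies a corollary: if the type is a built-in type or a struct without a user-defined destructor, the compiler does not reserve the extra 4 bytes before the returned pointer, because no destructor has to be called on delete and the array length is therefore not needed (the length matters only for deciding how many destructors to call). The start address of the array is passed straight to the deallocator and the block is released, so on such an implementation delete and delete[] end up doing the same thing. The C++ standard nevertheless classifies releasing new[] memory with plain delete as undefined behavior.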

Under that reasoning, the following release also works in practice:

// allocate
int *d = new int[5];

// release
delete d;

Running it under Valgrind gives the following result:

==2955== Memcheck, a memory error detector
==2955== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==2955== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==2955== Command: ./test_int
==2955== 
==2955== Mismatched free() / delete / delete []
==2955==    at 0x402ACFC: operator delete(void*) (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==2955==    by 0x8048530: main (in /home/hadoop/test/test_int)
==2955==  Address 0x434a028 is 0 bytes inside a block of size 20 alloc'd
==2955==    at 0x402B774: operator new[](unsigned int) (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==2955==    by 0x8048520: main (in /home/hadoop/test/test_int)
==2955== 
==2955== 
==2955== HEAP SUMMARY:
==2955==     in use at exit: 0 bytes in 0 blocks
==2955==   total heap usage: 1 allocs, 1 frees, 20 bytes allocated
==2955== 
==2955== All heap blocks were freed -- no leaks are possible
==2955== 
==2955== For counts of detected and suppressed errors, rerun with: -v
==2955== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

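First, "All heap blocks were freed -- no leaks are possible" shows that the release above really did free the whole block, contrary to the belief some people hold that delete d; frees only the first element of d[] and leaves the rest allocated. However, "Mismatched free() / delete / delete []" shows that Valgrind does not accept this: there is no leak, but new[] and delete do not match, and this style makes basic mistakes easy to slip in, so Valgrind reports an error. My guess is that Valgrind's implementation does not reason about individual cases: it simply checks that new is paired with delete and new[] with delete[], regardless of whether a particular new[]/delete pairing happens to be harmless.

In summary, my takeaway is: in some cases releasing new[] memory with delete happens not to fail, but in most cases it causes serious memory problems, so always make a habit of pairing new with delete and new[] with delete[].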

2. The most baffling error: a pile of incomprehensible Invalid read/write errors (update: solved)

Take a program like this:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

struct accept_pair {

    bool is_accept_state;

    bool is_strict_end;

    char app_name[0];
};

int main() {
    
    char *s = "Alexia";
    accept_pair *ap = (accept_pair*)malloc(sizeof(accept_pair) + sizeof(s));
    strcpy(ap->app_name, s);

    printf("app name: %s\n", ap->app_name);

    free(ap);

    return 0;
}

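First, a brief explanation of the program:

The struct contains a zero-length array because of my requirements: elsewhere I use a very large array of accept_pair, in which only a few elements have a meaningful app_name (it depends on certain conditions: app_name is assigned only when they are true, and is empty and meaningless otherwise). With char app_name[20], most accept_pair elements would waste those 20 bytes, so here I reserve no bytes up front and allocate only for the elements that need it, following the old idea of "allocate on demand". Some may point out that char *app_name would also give on-demand allocation. It would, at the cost of 4 more bytes per element for the pointer, so I treat it as a fallback.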

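Compiled with g++, the zero-length-array program runs correctly with no visible problem, but Valgrind reports several errors. They are not leaks but invalid memory reads and writes: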

==3511== Memcheck, a memory error detector
==3511== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==3511== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==3511== Command: ./zero
==3511== 
==3511== Invalid write of size 1
==3511==    at 0x402CD8B: strcpy (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==3511==    by 0x80484E3: main (in /home/hadoop/test/zero)
==3511==  Address 0x420002e is 0 bytes after a block of size 6 alloc'd
==3511==    at 0x402C418: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==3511==    by 0x80484C8: main (in /home/hadoop/test/zero)
==3511== 
==3511== Invalid write of size 1
==3511==    at 0x402CDA5: strcpy (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==3511==    by 0x80484E3: main (in /home/hadoop/test/zero)
==3511==  Address 0x4200030 is 2 bytes after a block of size 6 alloc'd
==3511==    at 0x402C418: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==3511==    by 0x80484C8: main (in /home/hadoop/test/zero)
==3511== 
==3511== Invalid read of size 1
==3511==    at 0x40936A5: vfprintf (vfprintf.c:1655)
==3511==    by 0x409881E: printf (printf.c:34)
==3511==    by 0x4063934: (below main) (libc-start.c:260)
==3511==  Address 0x420002e is 0 bytes after a block of size 6 alloc'd
==3511==    at 0x402C418: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==3511==    by 0x80484C8: main (in /home/hadoop/test/zero)
==3511== 
==3511== Invalid read of size 1
==3511==    at 0x40BC3C0: _IO_file_xsputn@@GLIBC_2.1 (fileops.c:1311)
==3511==    by 0x4092184: vfprintf (vfprintf.c:1655)
==3511==    by 0x409881E: printf (printf.c:34)
==3511==    by 0x4063934: (below main) (libc-start.c:260)
==3511==  Address 0x420002f is 1 bytes after a block of size 6 alloc'd
==3511==    at 0x402C418: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==3511==    by 0x80484C8: main (in /home/hadoop/test/zero)
==3511== 
==3511== Invalid read of size 1
==3511==    at 0x40BC3D7: _IO_file_xsputn@@GLIBC_2.1 (fileops.c:1311)
==3511==    by 0x4092184: vfprintf (vfprintf.c:1655)
==3511==    by 0x409881E: printf (printf.c:34)
==3511==    by 0x4063934: (below main) (libc-start.c:260)
==3511==  Address 0x420002e is 0 bytes after a block of size 6 alloc'd
==3511==    at 0x402C418: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==3511==    by 0x80484C8: main (in /home/hadoop/test/zero)
==3511== 
==3511== Invalid read of size 4
==3511==    at 0x40C999C: __GI_mempcpy (mempcpy.S:59)
==3511==    by 0x40BC310: _IO_file_xsputn@@GLIBC_2.1 (fileops.c:1329)
==3511==    by 0x4092184: vfprintf (vfprintf.c:1655)
==3511==    by 0x409881E: printf (printf.c:34)
==3511==    by 0x4063934: (below main) (libc-start.c:260)
==3511==  Address 0x420002c is 4 bytes inside a block of size 6 alloc'd
==3511==    at 0x402C418: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==3511==    by 0x80484C8: main (in /home/hadoop/test/zero)
==3511== 
app name: Alexia
==3511== 
==3511== HEAP SUMMARY:
==3511==     in use at exit: 0 bytes in 0 blocks
==3511==   total heap usage: 1 allocs, 1 frees, 6 bytes allocated
==3511== 
==3511== All heap blocks were freed -- no leaks are possible
==3511== 
==3511== For counts of detected and suppressed errors, rerun with: -v
==3511== ERROR SUMMARY: 9 errors from 6 contexts (suppressed: 0 from 0)

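The report shows the following:

strcpy(ap->app_name, s); is an invalid write, and printf("app name: %s\n", ap->app_name); is an invalid read. Both indicate that Valgrind considers the memory behind ap->app_name invalid. But I clearly allocated memory for it; I just did not mark that region as belonging to it. Can a zero-length array char app_name[0] in a struct not be written to? Or am I using zero-length arrays incorrectly? I still have no answer and would welcome an explanation...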

------------------------------------------------------------------------------------------------------------------------------

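Update: thanks to @shines77 on cnblogs for the correction. I made an extremely basic mistake here: I forgot that s in main is a char*, so sizeof(s) is 4 (or 8 on a 64-bit machine). As a result accept_pair *ap = (accept_pair*)malloc(sizeof(accept_pair) + sizeof(s)); does not allocate enough space for app_name ("Alexia" needs 7 bytes including the terminating '\0'), and Invalid read/write errors are the natural consequence. What a basic mistake... Thinking back, I had copied that line straight from my project, where s is not a char*, and forgot to change it to accept_pair *ap = (accept_pair*)malloc(sizeof(accept_pair) + strlen(s) + 1);. I will be more careful in the future; this wasted my own time and everyone else's.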

3. The most puzzling memory leak: definitely lost/indirectly lost (update: solved)

Consider the following program:

#include <stdio.h>
#include <string.h>

class accept_pair {
public:

    bool is_accept_state;

    bool is_strict_end;

    char *app_name;

public:

    accept_pair(bool is_accept = false, bool is_end = false);

    ~accept_pair();
};

class DFA {

public:

    unsigned int _size;

    accept_pair **accept_states;

public:

    DFA(int size);

    ~DFA();

    void add_state(int index, char *s);
    void add_size(int size);
};

int main() {
    char *s = "Alexia";
    
    DFA *dfa = new DFA(3);
    dfa->add_state(0, s);
    dfa->add_state(1, s);
    dfa->add_state(2, s);

    dfa->add_size(2);
    dfa->add_state(3, s);
    dfa->add_state(4, s);

    printf("\napp_name: %s\n", dfa->accept_states[4]->app_name);
    printf("size: %d\n\n", dfa->_size);

    delete dfa;

    return 0;
}

accept_pair::accept_pair(bool is_accept, bool is_end) {
    is_accept_state = is_accept;
    is_strict_end = is_end;
    app_name = NULL;
}

accept_pair::~accept_pair() { 
    if (app_name) {
        printf("delete accept_pair.\n");
        delete[] app_name;
    }
}

DFA::DFA(int size) {
    _size = size;

    accept_states = new accept_pair*[_size];
    for (int s = 0; s < _size; s++) {
        accept_states[s] = NULL;
    }
}

DFA::~DFA() {
    for (int i = 0; i < _size; i++) {
        if (accept_states[i]) {
            printf("delete dfa.\n");
            delete accept_states[i];
            accept_states[i] = NULL;
        }
    }
    delete[] accept_states;
}

void DFA::add_state(int index, char *s) {
    accept_states[index] = new accept_pair(true, true);
    accept_states[index]->app_name = new char[strlen(s) + 1];
    memcpy(accept_states[index]->app_name, s, strlen(s) + 1);
}

void DFA::add_size(int size) {
    // reallocate memory for accept_states.
    accept_pair **tmp_states = new accept_pair*[size + _size];
    for (int s = 0; s < size + _size; s++)
        tmp_states[s] = new accept_pair(false, false);

    for (int s = 0; s < _size; s++) {
        tmp_states[s]->is_accept_state = accept_states[s]->is_accept_state;
        tmp_states[s]->is_strict_end = accept_states[s]->is_strict_end;
        if (accept_states[s]->app_name != NULL) {
            tmp_states[s]->app_name = new char[strlen(accept_states[s]->app_name) + 1];
            memcpy(tmp_states[s]->app_name, accept_states[s]->app_name, strlen(accept_states[s]->app_name) + 1);
        }
    }

    // free old memory.
    for (int s = 0; s < _size; s++) {
        if (accept_states[s] != NULL) {
            delete accept_states[s];
            accept_states[s] = NULL;
        }
    }
    _size += size;
    delete []accept_states;

    accept_states = tmp_states;
}

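It is a little long, but the logic is simple. add_size() first allocates a larger accept_pair array, copies all existing data into it, then frees the space held by the old accept_pair array, and finally points the old array pointer at the newly allocated memory. This is a demo program, and as far as I could see it had no leaks, because all allocated memory is eventually released in the DFA destructor. Yet Valgrind's report shows one leak (the lines without the ==3093== prefix are the program's own output):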

==3093== Memcheck, a memory error detector
==3093== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==3093== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==3093== Command: ./test
==3093== 
delete accept_pair.
delete accept_pair.
delete accept_pair.

app_name: Alexia
size: 5

delete dfa.
delete accept_pair.
delete dfa.
delete accept_pair.
delete dfa.
delete accept_pair.
delete dfa.
delete accept_pair.
delete dfa.
delete accept_pair.
==3093== 
==3093== HEAP SUMMARY:
==3093==     in use at exit: 16 bytes in 2 blocks
==3093==   total heap usage: 21 allocs, 19 frees, 176 bytes allocated
==3093== 
==3093== 16 bytes in 2 blocks are definitely lost in loss record 1 of 1
==3093==    at 0x402BE94: operator new(unsigned int) (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==3093==    by 0x8048A71: DFA::add_size(int) (in /home/hadoop/test/test)
==3093==    by 0x8048798: main (in /home/hadoop/test/test)
==3093== 
==3093== LEAK SUMMARY:
==3093==    definitely lost: 16 bytes in 2 blocks
==3093==    indirectly lost: 0 bytes in 0 blocks
==3093==      possibly lost: 0 bytes in 0 blocks
==3093==    still reachable: 0 bytes in 0 blocks
==3093==         suppressed: 0 bytes in 0 blocks
==3093== 
==3093== For counts of detected and suppressed errors, rerun with: -v
==3093== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

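This says that memory allocated with new inside add_size() is never released, which I found very confusing. At first I thought it was because tmp_states was not released after being assigned to accept_states, so I added delete tmp_states; at the end, which only produced more errors. I trust this is not a false positive from Valgrind, which means I still do not really understand C++'s new and delete. I could not work out this leak on my own, and I hope someone can tell me where the problem is.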

------------------------------------------------------------------------------------------------------------------------------

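Update: thanks to @NewClear on cnblogs for clearing this up. There is indeed a leak here; his answer follows:

On the third problem, there are two leaks.
DFA::add_state directly executes
accept_states[index] = new accept_pair(true, true);
so if the previous accept_states[index] is not NULL, that object is leaked.
And in DFA::add_size,
for (int s = 0; s < size + _size; s++)
tmp_states[s] = new accept_pair(false, false);
allocates a new accept_pair for every element of the newly allocated tmp_states.
So after dfa->add_size(2); in main there are 5 elements in total, and none of the 5 is NULL.
After that,
dfa->add_state(3, s);
dfa->add_state(4, s);
leak the objects previously stored at index 3 and 4.
Your system is 32-bit, so one accept_pair is 8 bytes, and two objects are 16 bytes.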

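The fix is simple too: change add_size so that, when reallocating, it allocates accept_pair objects only for the existing data and initializes the remaining slots to NULL. Space is then allocated in add_state only when it is needed. The modified add_size is: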

void DFA::add_size(int size) {
    // reallocate memory for accept_states.
    accept_pair **tmp_states = new accept_pair*[size + _size];
    for (int s = 0; s < size + _size; s++)
        tmp_states[s] = NULL;

    for (int s = 0; s < _size; s++) {
        // skip slots that were never filled, so a NULL entry is not dereferenced
        if (accept_states[s] == NULL)
            continue;
        tmp_states[s] = new accept_pair(false, false);
        tmp_states[s]->is_accept_state = accept_states[s]->is_accept_state;
        tmp_states[s]->is_strict_end = accept_states[s]->is_strict_end;
        if (accept_states[s]->app_name != NULL) {
            tmp_states[s]->app_name = new char[strlen(accept_states[s]->app_name) + 1];
            memcpy(tmp_states[s]->app_name, accept_states[s]->app_name, strlen(accept_states[s]->app_name) + 1);
        }
    }

    // free old memory.
    for (int s = 0; s < _size; s++) {
        if (accept_states[s] != NULL) {
            delete accept_states[s];
            accept_states[s] = NULL;
        }
    }
    _size += size;
    delete[]accept_states;

    accept_states = tmp_states;
}