The stack trace at the abort is as follows:
#0 0x00007f338dd60b55 in raise () from /lib64/libc.so.6
#1 0x00007f338dd620c5 in abort () from /lib64/libc.so.6
#2 0x00007f338dd9ee0f in __libc_message () from /lib64/libc.so.6
#3 0x00007f338dda4628 in malloc_printerr () from /lib64/libc.so.6
#4 0x000000000046abfe in OSMemory::Delete (inMemory=0x7f333e7fcf20) at OSMemory.cpp:278
#5 0x000000000046ac2f in operator delete (mem=0x7f333e7fcf20) at OSMemory.cpp:202
#6 0x000000000040e8a7 in __gnu_cxx::new_allocator<std::_List_node<CZMBuff*> >::deallocate (this=0x7f32a4a155a0, __p=0x7f333e7fcf20) at /usr/include/c++/4.3/ext/new_allocator.h:98
#7 0x000000000040e8cf in std::_List_base<CZMBuff*, std::allocator<CZMBuff*> >::_M_put_node (this=0x7f32a4a155a0, __p=0x7f333e7fcf20) at /usr/include/c++/4.3/bits/stl_list.h:318
#8 0x000000000040e9ef in std::_List_base<CZMBuff*, std::allocator<CZMBuff*> >::_M_clear (this=0x7f32a4a155a0) at /usr/include/c++/4.3/bits/list.tcc:79
#9 0x000000000049d579 in std::list<CZMBuff*, std::allocator<CZMBuff*> >::clear (this=0x7f32a4a155a0) at /usr/include/c++/4.3/bits/stl_list.h:1066
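Because this stack is in the middle of destroying an object, the error must come from free. In our own memory pool design, every malloc is prefixed with a user-space bookkeeping structure that records the length of the object. The structure is: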
typedef struct
{
    size_t ID;
    size_t size;
} mem_hdr;
Both fields are 8 bytes long, and the actual data follows them. So when I call my_malloc with 24 bytes, 40 bytes (24 + 16) are ultimately requested from glibc's malloc.
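The data around the pointer that made free fail is as follows:
x /40xg 0x7f333e7fcf20 -64
0x7f333e7fcf20 is the inMemory value from the stack above. Before this value is actually handed to glibc, 16 is subtracted from it, giving 0x7f333e7fcf20 - 16 = 0x7f333e7fcf10.
0x7f333e7fcee0: 0x0000000000000000 0x0000000000000028
0x7f333e7fcef0: 0xffffffffffffffff 0xffffffffffffffff--------------------------------------- these two rows are clearly abnormal; they should hold pointers
0x7f333e7fcf00: 0xffffffffffffffff 0x00000000ffffffff--------------------------------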
0x7f333e7fcf10: 0x0000000000000000 0x0000000000000028
0x7f333e7fcf20: 0x00007f32a57976e0 0x00007f333f7c08e0
0x7f333e7fcf30: 0x00007f32c25b2618 0x0000000000000035------------- in binary this is 110101; the low three bits are flags (#define PREV_INUSE 0x1), and the remaining 110000 is 48, the chunk length
0x7f333e7fcf40: 0x0000000000000000 0x0000000000000028
0x7f333e7fcf50: 0x00007f330047a640 0x00007f333dbebfd0
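0x7f333e7fcf60: 0x00007f32b04b81b8
Each 0x0000000000000000 0x0000000000000028 pair is the application's mem_hdr structure (ID and size); 40 in hexadecimal is 0x28. The 24 bytes (three pointers) that follow 0x28 should be user data. In this case they are _List_node_base* _M_next; _List_node_base* _M_prev; _Tp _M_data; (the data field), that is, the STL's list-node management structure.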
A normal example looks like this:
0x7f333e7fcf10: 0x0000000000000000 0x0000000000000028
0x7f333e7fcf20: 0x00007f32a57976e0 0x00007f333f7c08e0
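0x7f333e7fcf30: 0x00007f32c25b2618 0x0000000000000035-------------- the key point: in the damaged block this 0x0000000000000035 value was overwritten with 0x00000000ffffffff. Had only 24 bytes been overwritten instead of 32, glibc would not have reported an error.
0x7f333e7fcf40: 0x0000000000000000 0x0000000000000028-------------- the next structure starts here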
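Split the dump into two segments: the lower one is a normal allocation and the upper one is an abnormal allocation. It is obvious that the 32 bytes starting at address 0x1497650 above are wrong.
Let's recall the structure of malloc's allocation management unit (the chunk):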
struct malloc_chunk {
    INTERNAL_SIZE_T prev_size;         /* Size of previous chunk (if free). */
    INTERNAL_SIZE_T size;              /* Size in bytes, including overhead. */

    struct malloc_chunk* fd;           /* double links -- used only if free. */
    struct malloc_chunk* bk;

    /* Only used for large blocks: pointer to next larger size. */
    struct malloc_chunk* fd_nextsize;  /* double links -- used only if free. */
    struct malloc_chunk* bk_nextsize;
};
prev_size: If the previous chunk is free, this field contains the size of the previous chunk. Otherwise, if the previous chunk is allocated, this field contains the previous chunk’s user data.
size: This field contains the size of this allocated chunk. The last 3 bits of this field contain flag information.
Bins: Bins are the freelist datastructures. They are used to hold free chunks. Based on chunk sizes, different bins are available:
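Mapped onto memory, the layout is as shown in the figure below:
As you can see, the pointer that malloc returns is not the start of the memory block: two size_t-sized fields precede it. For a chunk that is in use, the size field is the most important one. It stores the size of the whole chunk, and because allocations are aligned, the low bits of size are never needed for the length and are used as flags instead.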
Across multiple core files the pattern is consistent: 32 bytes are overwritten every time, and the overwriting data is always identical:
0x1497650: 0xffffffffffffffff 0xffffffffffffffff
0x1497660: 0xffffffffffffffff 0x00000000ffffffff
Translated into actual code, there are two possibilities: either something is assigned -1, or a memcpy is copying data that is 0xffffffffffffffff.
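Switching to the corresponding frame and inspecting the registers with info register shows that the CZMBuff obtained there is fine. The free happens while a node is being removed from the STL's circular doubly linked list: after the node is removed, free is called to release the list's node management structure, and that is where things go wrong.
The node structure of the STL list is as follows: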
// _List_node_base definition
struct _List_node_base
{
    _List_node_base* _M_next;
    _List_node_base* _M_prev;
};

// _List_node definition
template <class _Tp>
struct _List_node : public _List_node_base
{
    _Tp _M_data; // data field
};
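Our data field is in effect a second-level pointer to CZMBuff. Because printing the contents of the list directly with p is awkward, we need the help of a script:
Create a script file containing the following (it can be downloaded online:)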
define plist
    if $argc == 0
        help plist
    else
        set $head = &$arg0._M_impl._M_node
        set $current = $arg0._M_impl._M_node._M_next
        set $size = 0
        while $current != $head
            if $argc == 2
                printf "elem[%u]: ", $size
                p *($arg1*)($current + 1)
            end
            if $argc == 3
                if $size == $arg2
                    printf "elem[%u]: ", $size
                    p *($arg1*)($current + 1)
                end
            end
            set $current = $current._M_next
            set $size++
        end
        printf "List size = %u \n", $size
        if $argc == 1
            printf "List "
            whatis $arg0
            printf "Use plist <variable_name> <element_type> to see the elements in the list.\n"
        end
    end
end

document plist
    Prints std::list<T> information.
    Syntax: plist <list> <T> <idx>: Prints list size, if T defined all elements or just element at idx
    Examples:
    plist l - prints list size and definition
    plist l int - prints all elements and list size
    plist l int 2 - prints the third element in the list (if exists) and list size
end

define plist_member
    if $argc == 0
        help plist_member
    else
        set $head = &$arg0._M_impl._M_node
        set $current = $arg0._M_impl._M_node._M_next
        set $size = 0
        while $current != $head
            if $argc == 3
                printf "elem[%u]: ", $size
                p (*($arg1*)($current + 1)).$arg2
            end
            if $argc == 4
                if $size == $arg3
                    printf "elem[%u]: ", $size
                    p (*($arg1*)($current + 1)).$arg2
                end
            end
            set $current = $current._M_next
            set $size++
        end
        printf "List size = %u \n", $size
        if $argc == 1
            printf "List "
            whatis $arg0
            printf "Use plist_member <variable_name> <element_type> <member> to see the elements in the list.\n"
        end
    end
end

document plist_member
    Prints std::list<T> information.
    Syntax: plist <list> <T> <idx>: Prints list size, if T defined all elements or just element at idx
    Examples:
    plist_member l int member - prints all elements and list size
    plist_member l int member 2 - prints the third element in the list (if exists) and list size
end
Then use plist and plist_member to read the members' values:
plist this->m_listBuff
List size = 16595
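Here the reference count is the counter member:
counter = 1: 204 objects
counter = 0: 16596 objects
Together these add up to 16800, yet the list holds only 16595 elements. Where did the missing element go? The only way it could have failed to enter the list is that, in the list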