Problem description
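While writing code today, my program crashed at run time with a segmentation fault, and the fault was only reported after the program had already returned from main(). The only information available was: Segmentation fault (core dumped)
I was completely baffled.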
Inspecting the core file
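By default the system does not generate a core file. Running ulimit -c unlimited in the shell removes the size limit on core files, so that one is written when the program crashes.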
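The same limit can also be raised from inside the process. The following is a minimal sketch, assuming a Linux/POSIX system; raise_core_limit is a hypothetical helper name that does not appear in the original code, and unlike ulimit -c unlimited it only raises the soft limit as far as the existing hard limit.

```cpp
#include <sys/resource.h>

// Hypothetical helper (not part of the original project): raise the soft
// core-file size limit of the current process up to its hard limit.
// Returns true on success.
bool raise_core_limit() {
    struct rlimit lim{};
    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        return false;  // could not read the current limit
    }
    lim.rlim_cur = lim.rlim_max;  // soft limit := hard limit
    return setrlimit(RLIMIT_CORE, &lim) == 0;
}
```

Calling this early in main() would have roughly the same effect as the shell command for that one process, as long as the hard limit itself is not 0.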
Open the core file with gdb:
gdb ./example/sudoku_batch_test core
gdb prints the following:
    Program terminated with signal SIGSEGV, Segmentation fault.
    #0  __GI___libc_free (mem=0x313030303030300a) at malloc.c:2951
    2951    malloc.c: No such file or directory.
    (gdb)
This confirms that the crash happened inside malloc.c (in free()), but gdb reports that it cannot find the source for malloc.c.
First, install the glibc debug symbols: sudo apt-get install libc6-dbg
Next, download the glibc source: sudo apt-get source libc6-dev
When the download finishes, a glibc-2.23 folder appears in the current directory. It contains the glibc source code.
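With the source in place, continue the session from above and enter this at the gdb prompt: directory glibc-2.23/malloc/
This adds glibc-2.23/malloc/ to gdb's source search path. The result looks like this: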
    warning: core file may not match specified executable file.
    [New LWP 24491]
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
    Core was generated by `./example/sudoku_batch_test ../example/test1000 127.0.0.1 1'.
    Program terminated with signal SIGSEGV, Segmentation fault.
    #0  __GI___libc_free (mem=0x313030303030300a) at malloc.c:2951
    2951    malloc.c: No such file or directory.
    (gdb) directory glibc-2.23/malloc/
    Source directories searched: /root/work/melon/build/glibc-2.23/malloc:$cdir:$cwd
    (gdb)
Now we can view the source at the crash site from within gdb by running list:
    (gdb) l
    warning: Source file is more recent than executable.
    2946      if (mem == 0)                              /* free(0) has no effect */
    2947        return;
    2948
    2949      p = mem2chunk (mem);
    2950
    2951      if (chunk_is_mmapped (p))                  /* release mmapped memory. */
    2952        {
    2953          /* see if the dynamic brk/mmap threshold needs adjusting */
    2954          if (!mp_.no_dyn_threshold
    2955              && p->size > mp_.mmap_threshold
    (gdb)
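We now know the crash is at line 2951, but that alone does not tell us much. It then occurred to me that the call stack might reveal more, so the next step is to run backtrace (or bt):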
    (gdb) bt
    #0  __GI___libc_free (mem=0x313030303030300a) at malloc.c:2951
    #1  0x000000000048bc9d in melon::Coroutine::~Coroutine (this=0x1fc9120, __in_chrg=<optimized out>) at /root/work/melon/src/Coroutine.cpp:56
    #2  0x000000000048d099 in std::_Sp_counted_ptr<melon::Coroutine*, (__gnu_cxx::_Lock_policy)2>::_M_dispose (this=0x1fc8190) at /usr/include/c++/5/bits/shared_ptr_base.h:374
    #3  0x00000000004630f1 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release (this=0x1fc8190) at /usr/include/c++/5/bits/shared_ptr_base.h:150
    #4  0x0000000000461f32 in std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count (this=0x7f07f4ff1770, __in_chrg=<optimized out>) at /usr/include/c++/5/bits/shared_ptr_base.h:659
    #5  0x00000000004749ed in std::__shared_ptr<melon::Coroutine, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr (this=0x7f07f4ff1768, __in_chrg=<optimized out>) at /usr/include/c++/5/bits/shared_ptr_base.h:925
    #6  0x0000000000474a39 in std::shared_ptr<melon::Coroutine>::~shared_ptr (this=0x7f07f4ff1768, __in_chrg=<optimized out>) at /usr/include/c++/5/bits/shared_ptr.h:93
    #7  0x00007f07f40915ff in __GI___call_tls_dtors () at cxa_thread_atexit_impl.c:155
    #8  0x00007f07f4090f27 in __run_exit_handlers (status=0, listp=0x7f07f441b5f8 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:40
    #9  0x00007f07f4091045 in __GI_exit (status=<optimized out>) at exit.c:104
    #10 0x00007f07f4077837 in __libc_start_main (main=0x45f1c4 <main(int, char**)>, argc=4, argv=0x7ffcfb2ab218, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffcfb2ab208) at ../csu/libc-start.c:325
    #11 0x000000000045ec89 in _start ()
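One detail in frame #0 is worth a closer look: the pointer handed to free(), mem=0x313030303030300a, consists almost entirely of bytes in the printable ASCII range. The sketch below decodes such a value in the byte order it would have in memory on a little-endian machine such as x86_64; bytes_as_text is a hypothetical helper written for this illustration, not something from the original project.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: render the 8 bytes of `value` in little-endian memory
// order (lowest byte first), replacing non-printable bytes with '.'.
std::string bytes_as_text(std::uint64_t value) {
    std::string out;
    for (int i = 0; i < 8; ++i) {
        // Shifting instead of memcpy keeps the result independent of host endianness.
        unsigned char b = static_cast<unsigned char>((value >> (8 * i)) & 0xff);
        out += (b >= 0x20 && b < 0x7f) ? static_cast<char>(b) : '.';
    }
    return out;
}
```

For the value in the backtrace this yields ".0000001", where the leading '.' stands for a newline byte (0x0a). A "pointer" that reads like text is a strong hint that the field holds leftover data rather than an address returned by malloc(); this is my reading of the dump, not something gdb reports directly.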
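That pinpoints the problem. First, when a thread ends or the program exits, __GI___call_tls_dtors is called to destroy thread-local storage, and I had indeed declared a Coroutine::Ptr variable with the thread_local keyword. That is why the crash only shows up after main() has returned.
Frame #1 (0x000000000048bc9d in melon::Coroutine::~Coroutine) shows that the destructor of melon::Coroutine called free(), and that call is what crashed. At this point the cause is basically clear: the Coroutine destructor frees the stack_ pointer: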
    Coroutine::~Coroutine() {
        LOG_DEBUG << "destroy coroutine:" << name_;
        if (stack_) {
            free(stack_);  // Coroutine.cpp:56, frame #1 in the backtrace
        }
    }
The class has two constructors. One of them looks like this:
    Coroutine::Coroutine()
        : c_id_(++t_coroutine_id),
          name_("Main-" + std::to_string(c_id_)),
          cb_(nullptr),
          state_(CoroutineState::INIT) {
        // Note: stack_ does not appear in the initializer list above.
        if (getcontext(&context_)) {
            LOG_ERROR << "getcontext: errno=" << errno
                      << " error string:" << strerror(errno);
        }
    }
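Through carelessness I had made a very basic mistake: this constructor never initialized the stack_ pointer, so the destructor passed an indeterminate value to free(). After initializing stack_ to nullptr, the problem went away.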
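One way to rule out this class of bug is a default member initializer, so that every constructor starts from nullptr even when its initializer list forgets the field. The following is a minimal sketch, not the actual melon code: MiniCoroutine, owns_stack and current_coroutine are hypothetical names, and only the stack_ handling and the thread_local smart pointer from the article are modelled.

```cpp
#include <cstdlib>
#include <memory>

// Hypothetical stand-in for melon::Coroutine; everything except stack_ is omitted.
class MiniCoroutine {
public:
    // "Main" coroutine: runs on the thread's own stack and owns no buffer.
    MiniCoroutine() = default;

    // Worker coroutine: owns a heap-allocated stack of the given size.
    explicit MiniCoroutine(std::size_t stack_size)
        : stack_(std::malloc(stack_size)) {}

    // The raw buffer has a single owner, so copying is disabled.
    MiniCoroutine(const MiniCoroutine&) = delete;
    MiniCoroutine& operator=(const MiniCoroutine&) = delete;

    ~MiniCoroutine() {
        std::free(stack_);  // free(nullptr) is a no-op, so no null check is needed
    }

    bool owns_stack() const { return stack_ != nullptr; }

private:
    void* stack_ = nullptr;  // default member initializer applies to every constructor
};

// Mirrors the thread_local Coroutine::Ptr from the article: the pointed-to
// object is destroyed at thread or process exit, after main() has returned.
std::shared_ptr<MiniCoroutine>& current_coroutine() {
    thread_local std::shared_ptr<MiniCoroutine> t_current =
        std::make_shared<MiniCoroutine>();
    return t_current;
}
```

With this layout the destructor that runs from the thread-local cleanup path always sees either nullptr or a pointer obtained from malloc(). Replacing the raw pointer with a std::unique_ptr and a custom deleter would be another option, at the cost of a slightly larger change.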
Update (2019-10-31): It does not actually take this much effort. gdb has a where command (an alias of backtrace) that prints the call stack directly.
Summary
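When you run into this kind of problem, opening the core file in gdb will usually locate where the crash happened. If the crash site is not where the fault was actually introduced, look at the call stack; it will usually lead you to the root cause.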
Original source: https://www.cnblogs.com/gatsby123/p/11755320.html