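When debugging a binary, we often rely on GDB to trace the program's execution flow and inspect the values of variables at run time in order to locate a problem. GDB can provide this information because the compiler recorded it when the program was built. On Linux, executables and shared libraries are normally in ELF (Executable and Linkable Format), and their debug information is stored in the DWARF format.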
Create a file named main.c with the following content:
#include <stdio.h>
#include <stdlib.h>

int add(int a, int b)
{
    int c;
    c = a + b;
    return c;
}

int main(void)
{
    int a = 3;
    int b = 4;
    int c = add(a, b);
    printf("a + b = %d.\n", c);
    return 0;
}
Compile it with gcc; the -g option adds debug information:
$ gcc -O0 -g main.c -o main
Use the readelf command to view the section headers of the ELF file:
$ readelf -S main
There are 34 section headers, starting at offset 0x2240:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .interp           PROGBITS         0000000000000238  00000238
       000000000000001c  0000000000000000   A       0     0     1
  [ 2] .note.ABI-tag     NOTE             0000000000000254  00000254
       0000000000000020  0000000000000000   A       0     0     4
  [ 3] .note.gnu.build-i NOTE             0000000000000274  00000274
       0000000000000024  0000000000000000   A       0     0     4
  [ 4] .gnu.hash         GNU_HASH         0000000000000298  00000298
       000000000000001c  0000000000000000   A       5     0     8
  [ 5] .dynsym           DYNSYM           00000000000002b8  000002b8
       00000000000000a8  0000000000000018   A       6     1     8
  [ 6] .dynstr           STRTAB           0000000000000360  00000360
       0000000000000084  0000000000000000   A       0     0     1
  [ 7] .gnu.version      VERSYM           00000000000003e4  000003e4
       000000000000000e  0000000000000002   A       5     0     2
  [ 8] .gnu.version_r    VERNEED          00000000000003f8  000003f8
       0000000000000020  0000000000000000   A       6     1     8
  [ 9] .rela.dyn         RELA             0000000000000418  00000418
       00000000000000c0  0000000000000018   A       5     0     8
  [10] .rela.plt         RELA             00000000000004d8  000004d8
       0000000000000018  0000000000000018  AI       5    22     8
  [11] .init             PROGBITS         00000000000004f0  000004f0
       0000000000000017  0000000000000000  AX       0     0     4
  [12] .plt              PROGBITS         0000000000000510  00000510
       0000000000000020  0000000000000010  AX       0     0     16
  [13] .plt.got          PROGBITS         0000000000000530  00000530
       0000000000000008  0000000000000008  AX       0     0     8
  [14] .text             PROGBITS         0000000000000540  00000540
       00000000000001e2  0000000000000000  AX       0     0     16
  [15] .fini             PROGBITS         0000000000000724  00000724
       0000000000000009  0000000000000000  AX       0     0     4
  [16] .rodata           PROGBITS         0000000000000730  00000730
       0000000000000011  0000000000000000   A       0     0     4
  [17] .eh_frame_hdr     PROGBITS         0000000000000744  00000744
       0000000000000044  0000000000000000   A       0     0     4
  [18] .eh_frame         PROGBITS         0000000000000788  00000788
       0000000000000128  0000000000000000   A       0     0     8
  [19] .init_array       INIT_ARRAY       0000000000200db8  00000db8
       0000000000000008  0000000000000008  WA       0     0     8
  [20] .fini_array       FINI_ARRAY       0000000000200dc0  00000dc0
       0000000000000008  0000000000000008  WA       0     0     8
  [21] .dynamic          DYNAMIC          0000000000200dc8  00000dc8
       00000000000001f0  0000000000000010  WA       6     0     8
  [22] .got              PROGBITS         0000000000200fb8  00000fb8
       0000000000000048  0000000000000008  WA       0     0     8
  [23] .data             PROGBITS         0000000000201000  00001000
       0000000000000010  0000000000000000  WA       0     0     8
  [24] .bss              NOBITS           0000000000201010  00001010
       0000000000000008  0000000000000000  WA       0     0     1
  [25] .comment          PROGBITS         0000000000000000  00001010
       0000000000000024  0000000000000001  MS       0     0     1
  [26] .debug_aranges    PROGBITS         0000000000000000  00001034
       0000000000000030  0000000000000000           0     0     1
  [27] .debug_info       PROGBITS         0000000000000000  00001064
       0000000000000396  0000000000000000           0     0     1
  [28] .debug_abbrev     PROGBITS         0000000000000000  000013fa
       000000000000011c  0000000000000000           0     0     1
  [29] .debug_line       PROGBITS         0000000000000000  00001516
       00000000000000da  0000000000000000           0     0     1
  [30] .debug_str        PROGBITS         0000000000000000  000015f0
       0000000000000289  0000000000000001  MS       0     0     1
  [31] .symtab           SYMTAB           0000000000000000  00001880
       0000000000000678  0000000000000018          32    48     8
  [32] .strtab           STRTAB           0000000000000000  00001ef8
       0000000000000208  0000000000000000           0     0     1
  [33] .shstrtab         STRTAB           0000000000000000  00002100
       000000000000013e  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  l (large), p (processor specific)
The compiled executable contains 34 sections. Among them, the .debug_* sections hold the program's debug information in DWARF format.
The debug_line section records the source location that corresponds to each instruction address in the binary. Its raw contents can be viewed with readelf -wl main:
$ readelf -wl main
Raw dump of debug contents of section .debug_line:

  Offset:                      0x0
  Length:                      214
  DWARF Version:               2
  Prologue Length:             179
  Minimum Instruction Length:  1
  Initial value of 'is_stmt':  1
  Line Base:                   -5
  Line Range:                  14
  Opcode Base:                 13

 Opcodes:
  Opcode 1 has 0 args
  Opcode 2 has 1 arg
  Opcode 3 has 1 arg
  Opcode 4 has 1 arg
  Opcode 5 has 1 arg
  Opcode 6 has 0 args
  Opcode 7 has 0 args
  Opcode 8 has 0 args
  Opcode 9 has 1 arg
  Opcode 10 has 0 args
  Opcode 11 has 0 args
  Opcode 12 has 1 arg

 The Directory Table (offset 0x1b):
  1     /usr/lib/gcc/x86_64-linux-gnu/7/include
  2     /usr/include/x86_64-linux-gnu/bits
  3     /usr/include

 The File Name Table (offset 0x74):
  Entry Dir     Time    Size    Name
  1     0       0       0       main.c
  2     1       0       0       stddef.h
  3     2       0       0       types.h
  4     2       0       0       libio.h
  5     3       0       0       stdio.h
  6     2       0       0       sys_errlist.h

 Line Number Statements:
  [0x000000bd]  Extended opcode 2: set Address to 0x64a
  [0x000000c8]  Special opcode 8: advance Address by 0 to 0x64a and Line by 3 to 4
  [0x000000c9]  Special opcode 146: advance Address by 10 to 0x654 and Line by 1 to 5
  [0x000000ca]  Special opcode 160: advance Address by 11 to 0x65f and Line by 1 to 6
  [0x000000cb]  Special opcode 48: advance Address by 3 to 0x662 and Line by 1 to 7
  [0x000000cc]  Special opcode 35: advance Address by 2 to 0x664 and Line by 2 to 9
  [0x000000cd]  Special opcode 118: advance Address by 8 to 0x66c and Line by 1 to 10
  [0x000000ce]  Special opcode 104: advance Address by 7 to 0x673 and Line by 1 to 11
  [0x000000cf]  Special opcode 104: advance Address by 7 to 0x67a and Line by 1 to 12
  [0x000000d0]  Advance PC by constant 17 to 0x68b
  [0x000000d1]  Special opcode 20: advance Address by 1 to 0x68c and Line by 1 to 13
  [0x000000d2]  Advance PC by constant 17 to 0x69d
  [0x000000d3]  Special opcode 76: advance Address by 5 to 0x6a2 and Line by 1 to 14
  [0x000000d4]  Special opcode 76: advance Address by 5 to 0x6a7 and Line by 1 to 15
  [0x000000d5]  Advance PC by 2 to 0x6a9
  [0x000000d7]  Extended opcode 1: End of Sequence
Note the Line Number Statements part: each row gives the address of an instruction and the position in the source file that corresponds to it.
[0x000000c8] Special opcode 8: advance Address by 0 to 0x64a and Line by 3 to 4
This row says that the instruction at address 0x64a corresponds to line 4 of the source file.
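The "advance Address by 0 ... and Line by 3" part of that row follows from the header values printed earlier (Line Base -5, Line Range 14, Minimum Instruction Length 1). The sketch below shows the arithmetic under one assumption worth stating: the number readelf prints after "Special opcode" is already the adjusted opcode (raw opcode byte minus Opcode Base). The struct and function names are illustrative only and are not part of any library.

```c
#include <stdint.h>

/* Header fields of a line-number program, as printed by readelf -wl. */
struct line_header {
    int8_t  line_base;        /* "Line Base" */
    uint8_t line_range;       /* "Line Range" */
    uint8_t min_inst_length;  /* "Minimum Instruction Length" */
};

/* Effect of one special opcode on the address and line registers. */
struct line_advance {
    uint64_t addr_advance;
    int      line_advance;
};

/* `adjusted` is the number readelf shows after "Special opcode",
 * i.e. the raw opcode byte minus Opcode Base (assumption stated above). */
struct line_advance decode_special_opcode(const struct line_header *h,
                                          unsigned adjusted)
{
    struct line_advance r;
    r.addr_advance = (uint64_t)(adjusted / h->line_range) * h->min_inst_length;
    r.line_advance = h->line_base + (int)(adjusted % h->line_range);
    return r;
}
```

With the header values from this file, special opcode 8 gives an address advance of 8 / 14 = 0 and a line advance of -5 + 8 % 14 = 3, which takes the line register from its initial value 1 to 4.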
Processing the debug_line section yields a table that maps instruction addresses to source locations. The processed result can be viewed with readelf -wL main:
$ readelf -wL main
Contents of the .debug_line section:

CU: ./main.c:
File name    Line number    Starting address    View
main.c                 4               0x64a
main.c                 5               0x654
main.c                 6               0x65f
main.c                 7               0x662
main.c                 9               0x664
main.c                10               0x66c
main.c                11               0x673
main.c                12               0x67a
main.c                13               0x68c
main.c                14               0x6a2
main.c                15               0x6a7
main.c                15               0x6a9
Each row gives the file name, the line number, and the instruction address.
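A debugger uses such a table in the opposite direction: given a program counter, it finds the row with the largest starting address that does not exceed it. The following is a minimal sketch of that lookup with the rows copied from the readelf -wL output; the type and function names are made up for illustration, and a real debugger's data structures will differ.

```c
#include <stddef.h>
#include <stdint.h>

struct line_row {
    uint64_t addr;  /* starting address of the row */
    int      line;  /* source line in main.c */
};

/* Rows from the readelf -wL output. The final row (0x6a9) only marks
 * the end of the sequence; no instruction at that address is covered. */
const struct line_row main_rows[] = {
    {0x64a, 4},  {0x654, 5},  {0x65f, 6},  {0x662, 7},
    {0x664, 9},  {0x66c, 10}, {0x673, 11}, {0x67a, 12},
    {0x68c, 13}, {0x6a2, 14}, {0x6a7, 15}, {0x6a9, 15},
};
const size_t main_row_count = sizeof main_rows / sizeof main_rows[0];

/* Return the source line for pc, or -1 if pc lies outside the table.
 * Rows must be sorted by ascending address. */
int line_for_pc(const struct line_row *rows, size_t n, uint64_t pc)
{
    if (n < 2 || pc < rows[0].addr || pc >= rows[n - 1].addr)
        return -1;
    size_t lo = 0, hi = n - 1;  /* rows[lo].addr <= pc < rows[hi].addr */
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (rows[mid].addr <= pc)
            lo = mid;
        else
            hi = mid;
    }
    return rows[lo].line;
}
```

For instance, line_for_pc(main_rows, main_row_count, 0x660) falls between the rows starting at 0x65f and 0x662 and therefore resolves to line 6.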
debug_info section
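The contents of the debug_info section can be read with readelf -wi main. debug_info is the core of DWARF; other sections such as debug_str exist mainly to speed up lookups or to save space (debug_str, for example, holds the strings that debug_info refers to by offset).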
$ readelf -wi main
Contents of the .debug_info section:

  Compilation Unit @ offset 0x0:
   Length:        0x392 (32-bit)
   Version:       4
   Abbrev Offset: 0x0
   Pointer Size:  8
 <0><b>: Abbrev Number: 1 (DW_TAG_compile_unit)
    <c>   DW_AT_producer    : (indirect string, offset: 0x21a): GNU C11 7.3.0 -mtune=generic -march=x86-64 -g -O0 -fstack-protector-strong
    <10>   DW_AT_language    : 12 (ANSI C99)
    <11>   DW_AT_name        : (indirect string, offset: 0x118): main.c
    <15>   DW_AT_comp_dir    : (indirect string, offset: 0xdc): /home/kenan
    <19>   DW_AT_low_pc      : 0x64a
    <21>   DW_AT_high_pc     : 0x5f
    <29>   DW_AT_stmt_list   : 0x0
 ...................
 ...................
 <1><353>: Abbrev Number: 20 (DW_TAG_subprogram)
    <354>   DW_AT_external    : 1
    <354>   DW_AT_name        : add
    <358>   DW_AT_decl_file   : 1
    <359>   DW_AT_decl_line   : 4
    <35a>   DW_AT_prototyped  : 1
    <35a>   DW_AT_type        : <0x62>
    <35e>   DW_AT_low_pc      : 0x64a
    <366>   DW_AT_high_pc     : 0x1a
    <36e>   DW_AT_frame_base  : 1 byte block: 9c (DW_OP_call_frame_cfa)
    <370>   DW_AT_GNU_all_call_sites: 1
 <2><370>: Abbrev Number: 21 (DW_TAG_formal_parameter)
    <371>   DW_AT_name        : a
    <373>   DW_AT_decl_file   : 1
    <374>   DW_AT_decl_line   : 4
    <375>   DW_AT_type        : <0x62>
    <379>   DW_AT_location    : 2 byte block: 91 5c (DW_OP_fbreg: -36)
 <2><37c>: Abbrev Number: 21 (DW_TAG_formal_parameter)
    <37d>   DW_AT_name        : b
    <37f>   DW_AT_decl_file   : 1
    <380>   DW_AT_decl_line   : 4
    <381>   DW_AT_type        : <0x62>
    <385>   DW_AT_location    : 2 byte block: 91 58 (DW_OP_fbreg: -40)
 <2><388>: Abbrev Number: 19 (DW_TAG_variable)
    <389>   DW_AT_name        : c
    <38b>   DW_AT_decl_file   : 1
    <38c>   DW_AT_decl_line   : 5
    <38d>   DW_AT_type        : <0x62>
    <391>   DW_AT_location    : 2 byte block: 91 6c (DW_OP_fbreg: -20)
 <2><394>: Abbrev Number: 0
****
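The section is made up of units called DIEs (Debugging Information Entries). The TAG gives the type of a DIE; DW_TAG_compile_unit, for example, carries the compiler options, the source file, the directory, and so on. Here we look only at DW_TAG_subprogram, which describes a function.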
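DW_AT_name: add  (the function is named add)
DW_AT_low_pc: 0x64a  (the PC address at which the function starts)
DW_AT_high_pc: 0x1a  (the function ends at PC address 0x64a + 0x1a = 0x664, the first address past its last instruction)
DW_AT_frame_base  (an expression for the function's frame base; the storage locations of parameters and local variables are given as offsets from it)

The DIEs that follow DW_TAG_subprogram describe the variables of the function. DW_TAG_formal_parameter means that the DIE represents a parameter of the function:

DW_AT_name: a  (the parameter is named a)
DW_AT_type: <0x62>  (the parameter's type, as a reference to the DIE at offset 0x62)
DW_AT_location: 2 byte block: 91 5c (DW_OP_fbreg: -36)  (the parameter is stored at offset -36 from the function's frame base)

With this information GDB can take the address of the instruction currently being executed and determine the corresponding source file location, the name of the current function, and the parameters, local variables, and global variables visible in that function.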
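Two small pieces of arithmetic sit behind the attributes above. The operand of DW_OP_fbreg is a signed LEB128 number, so the byte 0x5c in "91 5c" decodes to -36; and in this DWARF 4 output DW_AT_high_pc is an offset from DW_AT_low_pc, so add covers the addresses from 0x64a up to, but not including, 0x664. The sketch below illustrates both; the function names are hypothetical and the code is not taken from GDB or binutils.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one signed LEB128 value from buf (at most len bytes).
 * The number of bytes consumed is stored in *used. */
int64_t decode_sleb128(const uint8_t *buf, size_t len, size_t *used)
{
    uint64_t result = 0;
    unsigned shift = 0;
    uint8_t byte = 0;
    size_t i = 0;

    while (i < len) {
        byte = buf[i++];
        if (shift < 64)
            result |= (uint64_t)(byte & 0x7f) << shift;
        shift += 7;
        if ((byte & 0x80) == 0)  /* high bit clear: last byte */
            break;
    }
    /* Bit 6 of the last byte is the sign bit: extend it upwards. */
    if (shift < 64 && (byte & 0x40))
        result |= ~(uint64_t)0 << shift;

    *used = i;
    return (int64_t)result;
}

/* high_pc_offset is DW_AT_high_pc given as an offset from low_pc,
 * so the function occupies [low_pc, low_pc + high_pc_offset). */
int pc_in_function(uint64_t low_pc, uint64_t high_pc_offset, uint64_t pc)
{
    return pc >= low_pc && pc < low_pc + high_pc_offset;
}
```

Decoding the single bytes 0x5c, 0x58 and 0x6c this way gives -36, -40 and -20, matching the offsets readelf prints for a, b and c, and pc_in_function(0x64a, 0x1a, pc) is true for 0x663 but false for 0x664, where main begins.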
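GDB allows debug information to be kept in a separate file. Debug information can take up a lot of space, sometimes far more than the program itself, so many systems strip it into a separate file that is installed only when debugging is needed, which saves storage.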
objcopy can extract the debug information of an ELF file into a separate file:
$ objcopy --only-keep-debug main main.debug
strip then removes the debug information from the original file:
$ strip -g main
GDB supports the following two ways of locating the file that holds the debug information.
Locating by build-id

$ readelf -n main

Displaying notes found in: .note.ABI-tag
  Owner                 Data size       Description
  GNU                  0x00000010       NT_GNU_ABI_TAG (ABI version tag)
    OS: Linux, ABI: 3.2.0

Displaying notes found in: .note.gnu.build-id
  Owner                 Data size       Description
  GNU                  0x00000014       NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 536cc8d42fa3ed672abc427d4a683313fb902b6b
The .note.gnu.build-id section records the build-id 536cc8d42fa3ed672abc427d4a683313fb902b6b, and GDB will try to read the debug information from /usr/lib/debug/.build-id/53/6cc8d42fa3ed672abc427d4a683313fb902b6b.debug.
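That is, under /usr/lib/debug/.build-id the first two hex digits of the build-id form the subdirectory name, and the remaining digits followed by .debug form the file name.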
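The path rule can be written down in a few lines. This is only a sketch of the naming scheme described here, with /usr/lib/debug passed in as the debug directory; the function name is hypothetical and the code is not GDB's own.

```c
#include <stdio.h>
#include <string.h>

/* Build "<debug_dir>/.build-id/<first 2 hex digits>/<remaining digits>.debug".
 * Returns 0 on success, -1 if the build-id is too short or out is too small. */
int build_id_debug_path(const char *debug_dir, const char *build_id_hex,
                        char *out, size_t out_size)
{
    if (strlen(build_id_hex) < 3)
        return -1;
    int n = snprintf(out, out_size, "%s/.build-id/%.2s/%s.debug",
                     debug_dir, build_id_hex, build_id_hex + 2);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

Called with "/usr/lib/debug" and the build-id 536cc8d42fa3ed672abc427d4a683313fb902b6b, it produces the same path as the one quoted for this binary.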
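Locating by debug_link

A debug link specifies only a file name. If it is set to main.debug, GDB looks for the debug file in three places: the directory that contains the executable, a subdirectory of that directory named .debug, and the global debug directory (/usr/lib/debug by default) followed by the executable's absolute directory path.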
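As an illustration, the search order documented in the GDB manual for debug links (the executable's directory, a .debug subdirectory of it, then the global debug directory with the executable's directory appended) can be spelled out as below. The function name and the fixed buffer size are assumptions made for this sketch, and exe_dir is expected to be an absolute path without a trailing slash.

```c
#include <stdio.h>

#define DEBUG_PATH_MAX 512

/* Fill paths[0..2] with the candidate locations of link_name, in the
 * order they are tried. Returns 0 on success, -1 if a path does not fit. */
int debuglink_candidates(const char *exe_dir, const char *link_name,
                         const char *global_debug_dir,
                         char paths[3][DEBUG_PATH_MAX])
{
    int n[3];

    /* 1. next to the executable */
    n[0] = snprintf(paths[0], DEBUG_PATH_MAX, "%s/%s", exe_dir, link_name);
    /* 2. in a .debug subdirectory next to the executable */
    n[1] = snprintf(paths[1], DEBUG_PATH_MAX, "%s/.debug/%s", exe_dir, link_name);
    /* 3. under the global debug directory, mirroring the executable's path */
    n[2] = snprintf(paths[2], DEBUG_PATH_MAX, "%s%s/%s",
                    global_debug_dir, exe_dir, link_name);

    for (int i = 0; i < 3; i++)
        if (n[i] < 0 || n[i] >= DEBUG_PATH_MAX)
            return -1;
    return 0;
}
```

Assuming, purely for the example, that main lives in /home/kenan, the three candidates are /home/kenan/main.debug, /home/kenan/.debug/main.debug and /usr/lib/debug/home/kenan/main.debug.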
The objcopy command can add a debug link to the program:
$ objcopy --add-gnu-debuglink=main.debug main
The ELF file now contains a .gnu_debuglink section:
$ readelf -S main
.........
  [26] .gnu_debuglink    PROGBITS         0000000000000000  00001034
       0000000000000010  0000000000000000           0     0     4
........
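The section size of 0x10 bytes is consistent with the layout the GDB manual describes for .gnu_debuglink: the file name with a terminating zero byte, zero to three padding bytes up to the next four-byte boundary, and a four-byte CRC32 of the debug file. A minimal sketch of that size calculation follows; the function name is made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Expected size of a .gnu_debuglink section for the given link name:
 * name + NUL, padded to a multiple of 4, plus a 4-byte CRC32. */
size_t gnu_debuglink_section_size(const char *link_name)
{
    size_t name_len = strlen(link_name) + 1;          /* include the NUL */
    size_t padded = (name_len + 3) & ~(size_t)3;      /* round up to 4 */
    return padded + 4;                                /* CRC32 field */
}
```

For "main.debug" this is 11 bytes of name, rounded up to 12, plus 4 for the checksum, that is 16 bytes (0x10), matching the Size column in the readelf output.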