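Abstract: This article explains how name mangling works in C++ and how GCC implements it, and verifies the rules with sample programs and with tools such as nm and c++filt. It should help readers understand the linking process in more detail.
Overview of Name Mangling
C++ has a much richer feature set than C, and function overloading is the most direct example of why name mangling is needed. Overloaded functions cannot be told apart by name alone, because they share one name and differ only in their parameter lists (the number, types, and order of the parameters), so that information has to be encoded into the symbol name.
Of course, many other C++ features also rely on name mangling, such as namespaces, classes, and templates.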
- /*
- * simple_test.c
- * a demo to show that different name mangling technology in C++ and C
- * Author: Chaos Lee
- */
- #include<stdio.h>
- int rect_area(int x1,int x2,int y1,int y2)
- {
- return (x2-x1) * (y2-y1);
- }
- int elipse_area(int a,int b)
- {
- return (int)(3.14 * a * b);
- }
- int main(int argc,char *argv[])
- {
- int x1 = 10, x2 = 20, y1 = 30, y2 = 40;
- int a = 3,b=4;
- int result1 = rect_area(x1,x2,y1,y2);
- int result2 = elipse_area(a,b);
- return 0;
- }
- [lichao@sg01 name_mangling]$ gcc -c simple_test.c
- [lichao@sg01 name_mangling]$ nm simple_test.o
- 0000000000000027 T elipse_area
- 0000000000000051 T main
- 0000000000000000 T rect_area
- [lichao@sg01 name_mangling]$ g++ -c simple_test.c
- [lichao@sg01 name_mangling]$ nm simple_test.o
- 0000000000000028 T _Z11elipse_areaii
- 0000000000000000 T _Z9rect_areaiiii
- U __gxx_personality_v0
- 0000000000000052 T main
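The C++ language specifies that identifiers beginning with an underscore followed by an uppercase letter, or beginning with two underscores, are reserved. _Z9rect_areaiiii is therefore a reserved identifier, and the symbols in object files compiled by g++ start with _Z so that they cannot collide with user-defined names (C99 reserves identifiers of this form as well).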
- /*
- * simple_test.c
- * a demo to show that different name mangling technology in C++ and C
- * Author: Chaos Lee
- */
- #include<stdio.h>
- #ifdef __cplusplus
- extern "C" {
- #endif
- int rect_area(int x1,int x2,int y1,int y2)
- {
- return (x2-x1) * (y2-y1);
- }
- int elipse_area(int a,int b)
- {
- return (int)(3.14 * a * b);
- }
- #ifdef __cplusplus
- }
- #endif
- int main(int argc,char *argv[])
- {
- int x1 = 10, x2 = 20, y1 = 30, y2 = 40;
- int a = 3,b=4;
- int result1 = rect_area(x1,x2,y1,y2);
- int result2 = elipse_area(a,b);
- return 0;
- }
- [lichao@sg01 name_mangling]$ gcc -c simple_test.c
- [lichao@sg01 name_mangling]$ nm simple_test.o
- 0000000000000027 T elipse_area
- 0000000000000051 T main
- 0000000000000000 T rect_area
- [lichao@sg01 name_mangling]$ g++ -c simple_test.c
- [lichao@sg01 name_mangling]$ nm simple_test.o
- U __gxx_personality_v0
- 0000000000000028 T elipse_area
- 0000000000000052 T main
- 0000000000000000 T rect_area
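In fact, the C standard library headers use extern "C" extensively. The C standard library can also be compiled with a C++ compiler, but the result must still expose a C interface rather than a C++ one (it is the C standard library, after all), so the declarations are given extern "C" linkage.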
- /*
- * libc_test.c
- * a demo program to show that how the standard C
- * library are compiled when encountering a C++ compiler
- */
- #include<stdio.h>
- int main(int argc,char * argv[])
- {
- puts("hello world.\n");
- return 0;
- }
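If we search the preprocessed output for puts, the declaration itself carries no extern "C". Strange? Not really: the header wraps its declarations in an enclosing extern "C" { block instead of marking each one, which a second search confirms.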
- [lichao@sg01 name_mangling]$ g++ -E libc_test.c | grep 'puts'
- extern int fputs (__const char *__restrict __s, FILE *__restrict __stream);
- extern int puts (__const char *__s);
- extern int fputs_unlocked (__const char *__restrict __s,
- puts("hello world.\n");
- [lichao@sg01 name_mangling]$ g++ -E libc_test.c | grep 'extern "C"'
- extern "C" {
- extern "C" {
Different compilers mangle names in different ways. You may ask why C++ name mangling is not standardized, so that compilers could interoperate. The C++ FAQ answers this question:
"Compilers differ as to how objects are laid out, how multiple inheritance is implemented, how virtual function calls are handled, and so on, so if the name mangling were made the same, your programs would link against libraries provided from other compilers but then crash when run. For this reason, the ARM (Annotated C++ Reference Manual) encourages compiler writers to make their name mangling different from that of other compilers for the same platform. Incompatible libraries are then detected at link time, rather than at run time."
GCC uses the IA-64 name mangling scheme, which is defined in the Intel IA-64 standard ABI. The g++ FAQ contains the following passage:
"GNU C++ does not do name mangling in the same way as other C++ compilers.
This means that object files compiled with one compiler cannot be used with another."
In other words, GNU C++ mangles names differently from other C++ compilers, so object files produced by one compiler cannot be linked with object files produced by another.
Builtin types encoding
- <builtin-type> ::= v # void
- ::= w # wchar_t
- ::= b # bool
- ::= c # char
- ::= a # signed char
- ::= h # unsigned char
- ::= s # short
- ::= t # unsigned short
- ::= i # int
- ::= j # unsigned int
- ::= l # long
- ::= m # unsigned long
- ::= x # long long, __int64
- ::= y # unsigned long long, unsigned __int64
- ::= n # __int128
- ::= o # unsigned __int128
- ::= f # float
- ::= d # double
- ::= e # long double, __float80
- ::= g # __float128
- ::= z # ellipsis
- ::= u <source-name> # vendor extended type
Operator encoding
- <operator-name> ::= nw # new
- ::= na # new[]
- ::= dl # delete
- ::= da # delete[]
- ::= ps # + (unary)
- ::= ng # - (unary)
- ::= ad # & (unary)
- ::= de # * (unary)
- ::= co # ~
- ::= pl # +
- ::= mi # -
- ::= ml # *
- ::= dv # /
- ::= rm # %
- ::= an # &
- ::= or # |
- ::= eo # ^
- ::= aS # =
- ::= pL # +=
- ::= mI # -=
- ::= mL # *=
- ::= dV # /=
- ::= rM # %=
- ::= aN # &=
- ::= oR # |=
- ::= eO # ^=
- ::= ls # <<
- ::= rs # >>
- ::= lS # <<=
- ::= rS # >>=
- ::= eq # ==
- ::= ne # !=
- ::= lt # <
- ::= gt # >
- ::= le # <=
- ::= ge # >=
- ::= nt # !
- ::= aa # &&
- ::= oo # ||
- ::= pp # ++
- ::= mm # --
- ::= cm # ,
- ::= pm # ->*
- ::= pt # ->
- ::= cl # ()
- ::= ix # []
- ::= qu # ?
- ::= st # sizeof (a type)
- ::= sz # sizeof (an expression)
- ::= cv <type> # (cast)
- ::= v <digit> <source-name> # vendor extended operator
Type encoding
- <type> ::= <CV-qualifiers> <type>
- ::= P <type> # pointer-to
- ::= R <type> # reference-to
- ::= O <type> # rvalue reference-to (C++0x)
- ::= C <type> # complex pair (C 2000)
- ::= G <type> # imaginary (C 2000)
- ::= U <source-name> <type> # vendor extended type qualifier
- /*
- * Author: Chaos Lee
- * Description: A simple demo to show how the rules used to mangle functions' names work
- * Date:2012/05/06
- */
- #include<iostream>
- #include<string>
- using namespace std;
- int test_func(int & tmpInt,const char * ptr,double dou,string str,float f)
- {
- return 0;
- }
- int main(int argc,char * argv[])
- {
- const char * test="test";
- int intNum = 10;
- double dou = 10.012;
- string str="str";
- float f = 1.2;
- test_func(intNum,test,dou,str,f);
- return 0;
- }
- [lichao@sg01 name_mangling]$ g++ -c func.cpp
- [lichao@sg01 name_mangling]$ nm func.cpp
- nm: func.cpp: File format not recognized
- [lichao@sg01 name_mangling]$ nm func.o
- 0000000000000060 t _GLOBAL__I__Z9test_funcRiPKcdSsf
- U _Unwind_Resume
- 0000000000000022 t _Z41__static_initialization_and_destruction_0ii
- 0000000000000000 T _Z9test_funcRiPKcdSsf
- U _ZNSaIcEC1Ev
- U _ZNSaIcED1Ev
- U _ZNSsC1EPKcRKSaIcE
- U _ZNSsC1ERKSs
- U _ZNSsD1Ev
- U _ZNSt8ios_base4InitC1Ev
- U _ZNSt8ios_base4InitD1Ev
- 0000000000000000 b _ZSt8__ioinit
- U __cxa_atexit
- U __dso_handle
- U __gxx_personality_v0
- 0000000000000076 t __tcf_0
- 000000000000008e T main
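The line containing _Z9test_funcRiPKcdSsf (marked T) is the result of name mangling the function test_func, where: _Z is the prefix of every mangled name; 9test_func is the function name preceded by its length; Ri is a reference to int (int&); PKc is a pointer to const char (const char*); d is double; Ss is the abbreviation for std::string; and f is float. The return type is not encoded.
Name mangling usually leaves function names unrecognizable. In many cases, though, we do not want to see the mangled form when inspecting symbols; we only want to know whether a certain function is defined or referenced, which is very helpful when debugging.
We therefore need a way to turn a mangled symbol back into its original form, a process called name demangling. Many tools provide this. The most commonly used is the c++filt command, which takes a mangled symbol as input and prints the demangled symbol; nm can also demangle directly with its -C option. For example: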
- [lichao@sg01 name_mangling]$ c++filt _Z9test_funcRiPKcdSsf
- test_func(int&, char const*, double, std::basic_string<char, std::char_traits<char>, std::allocator<char> >, float)
- [lichao@sg01 name_mangling]$ nm func.o | c++filt
- 0000000000000060 t global constructors keyed to _Z9test_funcRiPKcdSsf
- U _Unwind_Resume
- 0000000000000022 t __static_initialization_and_destruction_0(int, int)
- 0000000000000000 T test_func(int&, char const*, double, std::basic_string<char, std::char_traits<char>, std::allocator<char> >, float)
- U std::allocator<char>::allocator()
- U std::allocator<char>::~allocator()
- U std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&)
- U std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
- U std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()
- U std::ios_base::Init::Init()
- U std::ios_base::Init::~Init()
- 0000000000000000 b std::__ioinit
- U __cxa_atexit
- U __dso_handle
- U __gxx_personality_v0
- 0000000000000076 t __tcf_0
- 000000000000008e T main
- [lichao@sg01 name_mangling]$ nm -C func.o
- 0000000000000060 t global constructors keyed to _Z9test_funcRiPKcdSsf
- U _Unwind_Resume
- 0000000000000022 t __static_initialization_and_destruction_0(int, int)
- 0000000000000000 T test_func(int&, char const*, double, std::string, float)
- U std::allocator<char>::allocator()
- U std::allocator<char>::~allocator()
- U std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&)
- U std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&)
- U std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()
- U std::ios_base::Init::Init()
- U std::ios_base::Init::~Init()
- 0000000000000000 b std::__ioinit
- U __cxa_atexit
- U __dso_handle
- U __gxx_personality_v0
- 0000000000000076 t __tcf_0
- 000000000000008e T main
Last but not least, there is one particularly important interface function, __cxa_demangle(), whose prototype is:
- namespace abi {
- extern "C" char* __cxa_demangle (const char* mangled_name,
- char* buf,
- size_t* n,
- int* status);
- }
- /*
- * Author: Chaos Lee
- * Description: Employ __cxa_demangle to demangle a mangling function name.
- * Date:2012/05/06
- *
- */
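- #include<iostream>
- #include<cstdlib>
- #include<cxxabi.h>
- using namespace std;
- using namespace abi;
- int main(int argc,char *argv[])
- {
- const char * mangled_string = "_Z9test_funcRiPKcdSsf";
- int status = 0;
- // Pass NULL as the buffer: __cxa_demangle may realloc() a caller-supplied
- // buffer, so it must come from malloc() and a stack array is not safe.
- char * demangled = __cxa_demangle(mangled_string,NULL,NULL,&status);
- if(demangled != NULL)
- cout<<demangled<<endl;
- cout<<status<<endl;
- // The returned string is allocated with malloc() and must be freed by the caller.
- free(demangled);
- return 0;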
- }
- [lichao@sg01 name_mangling]$ g++ cxa_demangle.cpp -o cxa_demangle
- [lichao@sg01 name_mangling]$ ./cxa_demangle
- test_func(int&, char const*, double, std::string, float)
- 0
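By writing an interface function whose name is a mangled name and turning on the build switch that allows duplicate symbols, it is possible to change which function the original code links to, and thereby change the program's behavior.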