How dump_stack Is Implemented in the Kernel (2): symbol

Environment

Linux-4.14
Aarch64
 

Main text

In the previous analysis, calling print_symbol("PC is at %s\n", instruction_pointer(regs)) to print the current PC address produced this output: PC is at demo_init+0xc/0x1000 [demo]
Let us now analyze the print_symbol function.
 1 static __printf(1, 2)
 2 void __check_printsym_format(const char *fmt, ...)
 3 {
 4 }
 5 
 6 static inline void print_symbol(const char *fmt, unsigned long addr)
 7 {
 8     __check_printsym_format(fmt, "");
 9     __print_symbol(fmt, (unsigned long)
10                __builtin_extract_return_addr((void *)addr));
11 }
 
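Line 8: format check. __check_printsym_format has an empty body; because it is annotated with __printf(1, 2), the compiler checks the format string at compile time.
Lines 9-10: __builtin_extract_return_addr((void *)addr) returns the actual address; here the result is still addr. This builtin is described in the GCC documentation.
Next, __print_symbol: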
1 /* Look up a kernel symbol and print it to the kernel messages. */
2 void __print_symbol(const char *fmt, unsigned long address)
3 {
4     char buffer[KSYM_SYMBOL_LEN];
5 
6     sprint_symbol(buffer, address);
7 
8     printk(fmt, buffer);
9 }
 
Line 6 is the core: sprint_symbol converts address into the corresponding kernel symbol string and stores that string in buffer.
 
Next, sprint_symbol:
 1 /**
 2  * sprint_symbol - Look up a kernel symbol and return it in a text buffer
 3  * @buffer: buffer to be stored
 4  * @address: address to lookup
 5  *
 6  * This function looks up a kernel symbol with @address and stores its name,
 7  * offset, size and module name to @buffer if possible. If no symbol was found,
 8  * just saves its @address as is.
 9  *
10  * This function returns the number of bytes stored in @buffer.
11  */
12 int sprint_symbol(char *buffer, unsigned long address)
13 {
14     return __sprint_symbol(buffer, address, 0, 1);
15 }
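According to the comment, this function looks up the kernel symbol that contains address and stores the symbol name, offset, size and module name in buffer. If no symbol is found, it only stores address in buffer, formatted as a string.
Taking demo_init+0xc/0x1000 [demo] as an example:
Symbol name: demo_init
Offset: 0xc
Size: 0x1000
Module name: demo
In other words, the address passed in lies inside the function demo_init, at offset 0xc from the start of demo_init; demo_init occupies 0x1000 bytes of code space; and the function belongs to the kernel module demo.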
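The layout of that string can be reproduced in userspace with a short sketch. format_symbol below is a hypothetical helper written for illustration, not a kernel function; it only mirrors the "name+offset/size [module]" format described above, using snprintf rather than the kernel's sprintf.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Userspace sketch of the "name+offset/size [module]" layout.
 * format_symbol() is a hypothetical helper, not a kernel API.
 * Returns the number of characters written, excluding the NUL.
 */
static int format_symbol(char *buf, size_t buflen, const char *name,
                         unsigned long offset, unsigned long size,
                         const char *modname)
{
    int len;

    assert(buf && name && buflen > 0);

    /* "%#lx" prints the value in hex with a leading 0x. */
    len = snprintf(buf, buflen, "%s+%#lx/%#lx", name, offset, size);
    if (len < 0 || (size_t)len >= buflen)
        return len;

    /* The module suffix is only added for symbols that live in a module. */
    if (modname)
        len += snprintf(buf + len, buflen - (size_t)len, " [%s]", modname);

    return len;
}
```

Calling format_symbol(buf, sizeof(buf), "demo_init", 0xc, 0x1000, "demo") fills buf with the same text as the example line; passing NULL for modname leaves out the bracketed suffix, which is what happens for symbols built into the kernel image.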
 
Next, __sprint_symbol:
 1 /* Look up a kernel symbol and return it in a text buffer. */
 2 static int __sprint_symbol(char *buffer, unsigned long address,
 3                int symbol_offset, int add_offset)
 4 {
 5     char *modname;
 6     const char *name;
 7     unsigned long offset, size;
 8     int len;
 9 
10     address += symbol_offset;
11     name = kallsyms_lookup(address, &size, &offset, &modname, buffer);
12     if (!name)
13         return sprintf(buffer, "0x%lx", address - symbol_offset);
14 
15     if (name != buffer)
16         strcpy(buffer, name);
17     len = strlen(buffer);
18     offset -= symbol_offset;
19 
20     if (add_offset)
21         len += sprintf(buffer + len, "+%#lx/%#lx", offset, size);
22 
23     if (modname)
24         len += sprintf(buffer + len, " [%s]", modname);
25 
26     return len;
27 }

kallsyms_lookup on line 11 above takes address and returns the symbol name, filling in size, offset and modname along the way.

 
kallsyms_lookup
 1 /*
 2  * Lookup an address
 3  * - modname is set to NULL if it's in the kernel.
 4  * - We guarantee that the returned name is valid until we reschedule even if.
 5  *   It resides in a module.
 6  * - We also guarantee that modname will be valid until rescheduled.
 7  */
 8 const char *kallsyms_lookup(unsigned long addr,
 9                 unsigned long *symbolsize,
10                 unsigned long *offset,
11                 char **modname, char *namebuf)
12 {
13     const char *ret;
14 
15     namebuf[KSYM_NAME_LEN - 1] = 0;
16     namebuf[0] = 0;
17 
18     if (is_ksym_addr(addr)) {
19         unsigned long pos;
20 
21         pos = get_symbol_pos(addr, symbolsize, offset);
22         /* Grab name */
23         kallsyms_expand_symbol(get_symbol_offset(pos),
24                        namebuf, KSYM_NAME_LEN);
25         if (modname)
26             *modname = NULL;
27 
28         ret = namebuf;
29         goto found;
30     }
31 
32     /* See if it's in a module or a BPF JITed image. */
33     ret = module_address_lookup(addr, symbolsize, offset,
34                     modname, namebuf);
35     if (!ret)
36         ret = bpf_address_lookup(addr, symbolsize,
37                      offset, modname, namebuf);
38 
39 found:
40     cleanup_symbol_name(namebuf);
41     return ret;
42 }
The lookup covers three places: first the kernel itself; if the symbol is not found there, the kernel modules; and if it is still not found, BPF.
 
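Below we analyze lines 18-30, the lookup in the kernel itself; the other paths are left for a later article.
Line 18: checks whether addr lies in the kernel's code section.
Line 21: analyzing get_symbol_pos requires .tmp_kallsyms2.S, which is generated when the kernel is built and holds the symbol information.
A brief description of this file:
It is generated at build time by the tool scripts/kallsyms.c. The variables in .tmp_kallsyms2.S serve the following purposes: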
 
 
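The kallsyms_offsets array stores the offset of each symbol from the address of _text. Comparing it with System.map shows that subtracting the address of _text from a symbol's address in System.map gives the corresponding value in kallsyms_offsets.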
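A minimal sketch of that relation, assuming the simple case in which every entry is an unsigned offset from one base address (taken here to be the address of _text). The demo_* names and all values are made up for illustration, and any other encodings the kernel may use for these offsets are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical miniature of the relative-offset layout: each entry is a
 * symbol address minus the base address. All values are made up.
 */
static const uint64_t demo_relative_base = 0xffff000008080000ULL;
static const uint32_t demo_offsets[] = { 0x0, 0x0, 0x800, 0x1000 };
#define DEMO_NUM_SYMS (sizeof(demo_offsets) / sizeof(demo_offsets[0]))

/* Rebuild the absolute address of symbol idx as base + offset. */
static uint64_t demo_sym_address(size_t idx)
{
    assert(idx < DEMO_NUM_SYMS);
    return demo_relative_base + demo_offsets[idx];
}

/* Inverse direction: the offset stored for a System.map style address. */
static uint32_t demo_offset_of(uint64_t addr)
{
    assert(addr >= demo_relative_base);
    return (uint32_t)(addr - demo_relative_base);
}
```

Storing 32-bit offsets instead of full 64-bit addresses halves the size of the table on a 64-bit kernel, which is the main appeal of this layout.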
 
 
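kallsyms_relative_base holds the base address of the symbols; adding an offset from the kallsyms_offsets array to this value gives the symbol's actual address.
kallsyms_num_syms holds the number of kernel symbols.
kallsyms_names holds the name of every symbol, one line per symbol. To compress the strings, the first column of each line is the number of bytes that follow, and every column from the second onwards is an index into the kallsyms_token_index array. The entries of kallsyms_token_index are themselves indexes: they point into kallsyms_token_table.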
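The two-level indexing can be tried on a toy table. Everything named toy_* below is invented for illustration; only the scheme (a length byte, then token indexes, then NUL-terminated tokens in a table) follows the description above. The first decoded character is the symbol type letter (T here), which the decoder drops, as the kernel's kallsyms_expand_symbol does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy token table: NUL-terminated tokens stored back to back (made up). */
static const char toy_token_table[] = "T\0demo\0_init\0_exit";
/* Byte offset of each token inside toy_token_table. */
static const unsigned short toy_token_index[] = { 0, 2, 7, 13 };
/* Compressed names: [len][len token indexes], two symbols. */
static const unsigned char toy_names[] = {
    3, 0, 1, 2,   /* "T" "demo" "_init" */
    3, 0, 1, 3,   /* "T" "demo" "_exit" */
};

/*
 * Expand the symbol starting at byte offset off into result, dropping
 * the leading type character. Returns the offset of the next symbol.
 */
static size_t toy_expand_symbol(size_t off, char *result, size_t maxlen)
{
    const unsigned char *data = &toy_names[off];
    unsigned int len = *data++;
    size_t next = off + len + 1;
    int skipped_first = 0;

    assert(result && maxlen > 0);

    while (len--) {
        const char *tptr = &toy_token_table[toy_token_index[*data++]];

        for (; *tptr; tptr++) {
            if (!skipped_first) {
                skipped_first = 1;   /* skip the symbol type letter */
                continue;
            }
            if (maxlen <= 1)
                goto tail;           /* keep room for the NUL */
            *result++ = *tptr;
            maxlen--;
        }
    }
tail:
    *result = '\0';
    return next;
}
```

Expanding offset 0 yields demo_init and returns 4, the offset of the second entry, which expands to demo_exit. Both names share the tokens T and demo, which is where the compression comes from.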
 
 
kallsyms_token_index:
 
 
kallsyms_token_table:
 
 
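To speed up walking kallsyms_names, the kallsyms_markers array is also introduced. Each of its entries records where every 256th line of kallsyms_names begins. Once the index of the kernel symbol for address is known, dividing that index by 256 selects the marker, and only the lines inside that group of 256 need to be scanned to reach the right one, which is much faster.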
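The marker idea can be sketched on a small scale. The stream, the markers and the stride of 2 below are made up so that the example stays readable; the stride in the text is 256. Only the length bytes matter for this lookup, so the payload bytes are arbitrary.

```c
#include <assert.h>
#include <stddef.h>

/* Toy compressed stream: [len][len bytes] per symbol, five symbols. */
static const unsigned char toy_stream[] = {
    2, 0, 1,      /* symbol 0 at offset 0  */
    1, 2,         /* symbol 1 at offset 3  */
    3, 0, 1, 2,   /* symbol 2 at offset 5  */
    1, 0,         /* symbol 3 at offset 9  */
    2, 1, 2,      /* symbol 4 at offset 11 */
};
#define TOY_STREAM_SYMS 5

/* One marker per 2 symbols (shift of 1); 256 symbols would be a shift of 8. */
#define TOY_MARKER_SHIFT 1
static const unsigned int toy_markers[] = { 0, 5, 11 };

/* Return the byte offset of symbol pos inside toy_stream. */
static size_t toy_symbol_offset(size_t pos)
{
    const unsigned char *name;
    size_t i, rest;

    assert(pos < TOY_STREAM_SYMS);

    /* Jump to the closest marker at or before pos. */
    name = &toy_stream[toy_markers[pos >> TOY_MARKER_SHIFT]];

    /* Skip the remaining symbols one by one using their length bytes. */
    rest = pos & ((1u << TOY_MARKER_SHIFT) - 1);
    for (i = 0; i < rest; i++)
        name += *name + 1;

    return (size_t)(name - toy_stream);
}
```

With a stride of 256 the sequential scan never skips more than 255 entries, regardless of how many symbols the stream holds.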
 
Next, get_symbol_pos:
 1 static unsigned long get_symbol_pos(unsigned long addr,
 2                     unsigned long *symbolsize,
 3                     unsigned long *offset)
 4 {
 5     unsigned long symbol_start = 0, symbol_end = 0;
 6     unsigned long i, low, high, mid;
 7 
 8     /* This kernel should never had been booted. */
 9     if (!IS_ENABLED(CONFIG_KALLSYMS_BASE_RELATIVE))
10         BUG_ON(!kallsyms_addresses);
11     else
12         BUG_ON(!kallsyms_offsets);
13 
14     /* Do a binary search on the sorted kallsyms_addresses array. */
15     low = 0;
16     high = kallsyms_num_syms;
17 
18     while (high - low > 1) {
19         mid = low + (high - low) / 2;
20         if (kallsyms_sym_address(mid) <= addr)
21             low = mid;
22         else
23             high = mid;
24     }
25 
26     /*
27      * Search for the first aliased symbol. Aliased
28      * symbols are symbols with the same address.
29      */
30     while (low && kallsyms_sym_address(low-1) == kallsyms_sym_address(low))
31         --low;
32 
33     symbol_start = kallsyms_sym_address(low);
34 
35     /* Search for next non-aliased symbol. */
36     for (i = low + 1; i < kallsyms_num_syms; i++) {
37         if (kallsyms_sym_address(i) > symbol_start) {
38             symbol_end = kallsyms_sym_address(i);
39             break;
40         }
41     }
42 
43     /* If we found no next symbol, we use the end of the section. */
44     if (!symbol_end) {
45         if (is_kernel_inittext(addr))
46             symbol_end = (unsigned long)_einittext;
47         else if (IS_ENABLED(CONFIG_KALLSYMS_ALL))
48             symbol_end = (unsigned long)_end;
49         else
50             symbol_end = (unsigned long)_etext;
51     }
52 
53     if (symbolsize)
54         *symbolsize = symbol_end - symbol_start;
55     if (offset)
56         *offset = addr - symbol_start;
57 
58     return low;
59 }
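Lines 18-24: search kallsyms_offsets with addr to find which two symbols addr falls between. A binary search is used; when it finishes, addr lies between the symbols with indexes low and high, which means it is inside the function with index low.
Line 30: kallsyms_offsets contains many symbols with the same address. This line finds the index of the first symbol among those sharing that address and stores it in low.
Line 33: gets the address of the symbol with index low, symbol_start.
Lines 36-41: get the next symbol address that is greater than symbol_start, symbol_end.
Line 54: computes the size of the space occupied by the kernel function that starts at symbol_start.
Line 56: computes the offset of address from symbol_start.
Line 58: returns the index of the kernel function that contains address.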
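The same steps can be exercised on a toy table. The addresses, the section end and the toy_* names below are invented for illustration; the control flow follows the listing above, with a plain sorted array standing in for kallsyms_sym_address.

```c
#include <assert.h>
#include <stddef.h>

/* Sorted made-up symbol start addresses; entries 1 and 2 are aliases. */
static const unsigned long toy_addrs[] = {
    0x1000, 0x1040, 0x1040, 0x10c0, 0x1200
};
#define TOY_NUM (sizeof(toy_addrs) / sizeof(toy_addrs[0]))
#define TOY_SECTION_END 0x1400UL   /* stand-in for the end of the section */

/* Find the symbol containing addr; report its size and addr's offset. */
static size_t toy_symbol_pos(unsigned long addr, unsigned long *size,
                             unsigned long *offset)
{
    size_t low = 0, high = TOY_NUM, mid, i;
    unsigned long start, end = 0;

    assert(addr >= toy_addrs[0] && addr < TOY_SECTION_END);

    /* Binary search: keep toy_addrs[low] <= addr < toy_addrs[high]. */
    while (high - low > 1) {
        mid = low + (high - low) / 2;
        if (toy_addrs[mid] <= addr)
            low = mid;
        else
            high = mid;
    }

    /* Step back to the first of any symbols sharing this address. */
    while (low && toy_addrs[low - 1] == toy_addrs[low])
        --low;
    start = toy_addrs[low];

    /* The next strictly greater address marks the end of the symbol. */
    for (i = low + 1; i < TOY_NUM; i++) {
        if (toy_addrs[i] > start) {
            end = toy_addrs[i];
            break;
        }
    }
    if (!end)
        end = TOY_SECTION_END;   /* last symbol: use the section end */

    if (size)
        *size = end - start;
    if (offset)
        *offset = addr - start;
    return low;
}
```

For addr 0x104c the search stops at index 2, steps back to the first alias at index 1, and reports a size of 0x80 with an offset of 0xc. For the last symbol no larger address exists, so the size is measured against the section end instead.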
 
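Returning to kallsyms_lookup:
Line 21: gets the index of the kernel function that contains address.
Line 23: get_symbol_offset returns the offset, relative to kallsyms_names, of the kernel symbol string that corresponds to pos. This is easier to follow together with the earlier analysis of .tmp_kallsyms2.S.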
 1 /*
 2  * Find the offset on the compressed stream given and index in the
 3  * kallsyms array.
 4  */
 5 static unsigned int get_symbol_offset(unsigned long pos)
 6 {
 7     const u8 *name;
 8     int i;
 9 
10     /*
11      * Use the closest marker we have. We have markers every 256 positions,
12      * so that should be close enough.
13      */
14     name = &kallsyms_names[kallsyms_markers[pos >> 8]];
15 
16     /*
17      * Sequentially scan all the symbols up to the point we're searching
18      * for. Every symbol is stored in a [<len>][<len> bytes of data] format,
19      * so we just need to add the len to the current pointer for every
20      * symbol we wish to skip.
21      */
22     for (i = 0; i < (pos & 0xFF); i++)
23         name = name + (*name) + 1;
24 
25     return name - kallsyms_names;
26 }
kallsyms_expand_symbol:
 1 /*
 2  * Expand a compressed symbol data into the resulting uncompressed string,
 3  * if uncompressed string is too long (>= maxlen), it will be truncated,
 4  * given the offset to where the symbol is in the compressed stream.
 5  */
 6 static unsigned int kallsyms_expand_symbol(unsigned int off,
 7                        char *result, size_t maxlen)
 8 {
 9     int len, skipped_first = 0;
10     const u8 *tptr, *data;
11 
12     /* Get the compressed symbol length from the first symbol byte. */
13     data = &kallsyms_names[off];
14     len = *data;
15     data++;
16 
17     /*
18      * Update the offset to return the offset for the next symbol on
19      * the compressed stream.
20      */
21     off += len + 1;
22 
23     /*
24      * For every byte on the compressed symbol data, copy the table
25      * entry for that byte.
26      */
27     while (len) {
28         tptr = &kallsyms_token_table[kallsyms_token_index[*data]];
29         data++;
30         len--;
31 
32         while (*tptr) {
33             if (skipped_first) {
34                 if (maxlen <= 1)
35                     goto tail;
36                 *result = *tptr;
37                 result++;
38                 maxlen--;
39             } else
40                 skipped_first = 1;
41             tptr++;
42         }
43     }
44 
45 tail:
46     if (maxlen)
47         *result = '\0';
48 
49     /* Return to offset to the next symbol. */
50     return off;
51 }
 
 
Finally, the decoded kernel symbol name string is stored in namebuf.
 
End.