How to Implement Variadic Arguments in C

Date: 2021-07-23

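Variadic arguments

A variadic function is one whose arguments are fixed neither in number nor in type.

printf is such a function. Its prototype is: int printf(const char *format, ...).

The following code shows how printf is used.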

// code-A
#include <stdio.h>
int main(int argc, char **argv)
{
  	printf("a is %d, str is %s, c is %c\n", 23, "Hello, World;", 'A');
  	printf("T is %d\n", 78);
  	return 0;
}

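In code-A, the first printf call takes 4 arguments and the second takes 2, so printf clearly accepts a variable number of arguments.

Implementation

Code

code-A

Consider two programs, code-A and code-B. From here on, code-A refers to the stack-demo.c listing below, not to the printf example above.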

// file stack-demo.c

#include <stdio.h>

// int f(char *fmt, int a, char *str);
int f(char *fmt, ...);
int f2(char *fmt, void *next_arg);
int main(int argc, char **argv)
{
        char fmt[20] = "hello, world!";
        int a = 10;
        char str[10] = "hi";
        f(fmt, a, str);
        return 0;
}

// int f(char *fmt, int a, char *str)
int f(char *fmt, ...)
{
        char c = *fmt;
        // Assumes a 32-bit build (-m32): the next argument sits 4 bytes above fmt on the stack.
        void *next_arg = (void *)((char *)&fmt + 4);
        f2(fmt, next_arg);
        return 0;
}


int f2(char *fmt, void *next_arg)
{
        printf(fmt);
        printf("a is %d\n", *((int *)next_arg));
        // next_arg + 4 relies on GCC's byte-wise arithmetic on void * (a GNU extension).
        printf("str is %s\n", *((char **)(next_arg + 4)));

        return 0;
}

Compile and run. The output is:

# Compile
[root@localhost c]# gcc -o stack-demo stack-demo.c -g -m32
# Disassemble and write the assembly to dis-stack.asm
[root@localhost c]# objdump -d stack-demo>dis-stack.asm
[root@localhost c]# ./stack-demo
hello, world!a is 10
str is hi

code-B

// file stack-demo.c

#include <stdio.h>

// int f(char *fmt, int a, char *str);
int f(char *fmt, ...);
int f2(char *fmt, void *next_arg);
int main(int argc, char **argv)
{
        char fmt[20] = "hello, world!";
        int a = 10;
        char str[10] = "hi";
        char str2[10] = "hello";
        f(fmt, a, str, str2);
        return 0;
}

// int f(char *fmt, int a, char *str)
int f(char *fmt, ...)
{
        char c = *fmt;
        // Assumes a 32-bit build (-m32): the next argument sits 4 bytes above fmt on the stack.
        void *next_arg = (void *)((char *)&fmt + 4);
        f2(fmt, next_arg);
        return 0;
}


int f2(char *fmt, void *next_arg)
{
        printf(fmt);
        printf("a is %d\n", *((int *)next_arg));
        // next_arg + 4 and next_arg + 8 rely on GCC's byte-wise arithmetic on void * (a GNU extension).
        printf("str is %s\n", *((char **)(next_arg + 4)));
        printf("str2 is %s\n", *((char **)(next_arg + 8)));

        return 0;
}

Compile and run. The output is:

# Compile
[root@localhost c]# gcc -o stack-demo stack-demo.c -g -m32
# Disassemble and write the assembly to dis-stack.asm
[root@localhost c]# objdump -d stack-demo>dis-stack.asm
[root@localhost c]# ./stack-demo
hello, world!a is 10
str is hi
str2 is hello

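Analysis

In code-A, f is called as f(fmt, a, str); in code-B, it is called as f(fmt, a, str, str2).

Clearly, int f(char *fmt, ...); is a function that takes a variable number of arguments.

Key statements

The statements that make the variadic behaviour work are: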

char c = *fmt;
void *next_arg = (void *)((char *)&fmt + 4);
printf("a is %d\n", *((int *)next_arg));
printf("str is %s\n", *((char **)(next_arg + 4)));
printf("str2 is %s\n", *((char **)(next_arg + 8)));

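&fmt is the memory address of the first argument.
next_arg is the memory address of the second argument.
next_arg + 4 and next_arg + 8 are the memory addresses of the third and fourth arguments. This holds because, in this 32-bit build, each of these arguments occupies one 4-byte stack slot, and GCC treats arithmetic on a void * as byte-wise.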

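Why it works

How the memory addresses are computed

Start with some pseudocode that corresponds to the assembly generated for f. Assume f is called with three arguments. It could just as well be four or two; the three-argument case is used here to examine f.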

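f:
	; push ebp onto the stack
	; set ebp to esp
	
	; ebp + 0 holds the old ebp, pushed just above
	; ebp + 4 holds the return address (eip), pushed by call f
	; the first argument is at ebp + 8
	; the second argument is at ebp + 12
	; the third argument is at ebp + 16
	
	; body of f
	
	; pop ebp. ebp is restored to the old value it had before entering the function
	; ret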

The pseudocode for the call to f is:

; push the third argument
; push the second argument
; push the first argument
; call f, which pushes eip

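In the assembly, the address of the first argument is easy to determine, and so are the addresses of the second, the third, and the Nth argument: each is simply ebp plus a fixed offset.

In C, however, the only thing we know for certain is that the first argument exists. We cannot know whether a second, third, or Nth argument exists. It makes no sense to take an argument that may not exist as the reference point and then compute the addresses of the other arguments from it.

The first argument always exists, so it serves as the reference point for locating the other arguments in memory.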

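Memory addresses

In the C code of f, &fmt is the address of the slot in f's stack frame that holds the first argument. In other words, it is the address of a local variable.

The address of a local variable must not be used as a function's return value, but it can be used until the function returns, including inside other functions that it calls. This is why f2 can still use the address computed from fmt.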

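Difficult points

When the argument is an int, its value is read with *(int *)(next_arg).

When the argument is a char array such as char str[10], its value is read with *(char **)(next_arg + 4).

Why not use next_arg and (next_arg + 4) directly?

First, consider *(int *)(next_arg).

On a 32-bit system, every memory address looks like a 32-bit unsigned integer. The type of that value, however, is not unsigned int but a pointer type such as int *.

This can be confirmed with ptype in gdb. For example, given the snippet int x = 5; int *a = &x;, running ptype a prints type = int *.

next_arg is a void *: it carries an address but no information about the type of the data stored there, so we have to supply that type ourselves. That is what the cast (int *) does.

The leading * in *(int *)(next_arg) is the easy part: it reads the value in the memory the pointer points to.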

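The same reasoning applies to *(char **)(next_arg + 4).

1. It is the third argument, hence next_arg + 4.
2. The third argument is declared as a char array (char str[10]). An array passed to a function decays to a pointer to its first element, so what actually gets pushed is a char *.
3. next_arg + 4 is only the address of a stack slot, and that slot holds a pointer. So next_arg + 4 is a pointer to a pointer. Because the inner pointer points to a string, the type of next_arg + 4 is char **. There is no need to agonize over this: write a simple pointer to a char * and inspect it with gdb's ptype to verify it.
4. The leading * reads the data the pointer points to.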

Here is a snippet that verifies point 3.

char str[20] = "hello";
char *ptr = str;
// In gdb, run: ptype &ptr

The gdb session looks like this:

Breakpoint 1, main (argc=1, argv=0xffffd3f4) at point.c:13
13		char str7[20] = "hello";
(gdb) s
14		char *ptr = str7;
(gdb) s
19		int b = 7;
(gdb) p &str
$1 = (char **) 0xffffd2fc