System calls: from write to vfs_write

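1. A few important concepts

- System calls versus library functions

- Synchronous interrupts (exceptions) and asynchronous interrupts (interrupts)

- Interrupt vectors (0~255), interrupt numbers, interrupt handlers, and the Interrupt Descriptor Table (IDT)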
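To make the first bullet concrete, here is a minimal sketch (Linux only) that writes to a file descriptor in two ways: through the C library wrapper write(), and by invoking the system call by number with syscall(). The function names write_via_libc, write_via_syscall and write_paths_demo are made up for this illustration. Both paths end up in the same kernel entry point; the wrapper just hides the trap.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Path 1: the C library wrapper write(2), an ordinary function call
 * that issues the system call on our behalf. */
long write_via_libc(int fd, const char *msg)
{
    return (long)write(fd, msg, strlen(msg));
}

/* Path 2: request the system call by number, skipping the wrapper. */
long write_via_syscall(int fd, const char *msg)
{
    return syscall(SYS_write, fd, msg, strlen(msg));
}

/* Usage example: both paths deliver their bytes into the same pipe. */
void write_paths_demo(void)
{
    int p[2];
    char buf[4];

    assert(pipe(p) == 0);
    assert(write_via_libc(p[1], "ab") == 2);
    assert(write_via_syscall(p[1], "cd") == 2);
    assert(read(p[0], buf, sizeof(buf)) == 4);
    assert(memcmp(buf, "abcd", 4) == 0);
    close(p[0]);
    close(p[1]);
}
```

Call write_paths_demo() from main() to try it. The asserts carry the side effects, so do not build this sketch with NDEBUG defined.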

2. trap_init()

- called-by: start_kernel() in init/main.c

- loc: arch/x86/kernel/traps.c:824

System call entry:

    ...
    #ifdef CONFIG_X86_32
        set_system_trap_gate(IA32_SYSCALL_VECTOR, entry_INT80_32);
        set_bit(IA32_SYSCALL_VECTOR, used_vectors);
    #endif
    ...

Here, IA32_SYSCALL_VECTOR is defined in arch/x86/include/asm/irq_vectors.h:

 #define IA32_SYSCALL_VECTOR     0x80
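What trap_init() does with vector 0x80 can be pictured with a toy model in user space. This is only a sketch: the real IDT holds gate descriptors that the CPU consults in hardware, not C function pointers, and every toy_* name below is hypothetical. It shows the idea of "register a handler for a vector, then dispatch on that vector".

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_VECTORS     256   /* vectors 0..255 */
#define TOY_SYSCALL_VECTOR 0x80  /* same value as IA32_SYSCALL_VECTOR */

typedef int (*toy_handler_t)(int arg);

static toy_handler_t toy_idt[TOY_NR_VECTORS];           /* stands in for the IDT */
static unsigned char toy_used_vectors[TOY_NR_VECTORS];  /* stands in for used_vectors */

/* Hypothetical counterpart of set_system_trap_gate() followed by set_bit(). */
void toy_set_gate(unsigned int vector, toy_handler_t handler)
{
    assert(vector < TOY_NR_VECTORS);
    toy_idt[vector] = handler;
    toy_used_vectors[vector] = 1;
}

/* Conceptually what "int $vector" does: look the handler up and call it.
 * Returns -1 when the vector is out of range or has no handler. */
int toy_raise(unsigned int vector, int arg)
{
    if (vector >= TOY_NR_VECTORS || toy_idt[vector] == NULL)
        return -1;
    return toy_idt[vector](arg);
}

/* Hypothetical handler playing the role of entry_INT80_32. */
int toy_entry_int80(int arg)
{
    return arg + 1;
}
```

After toy_set_gate(TOY_SYSCALL_VECTOR, toy_entry_int80), a call to toy_raise(0x80, x) reaches the handler, while any unregistered vector falls through to the error return.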

entry_INT80_32 is the system call entry point. It is written in assembly, which I can't fully follow yet. From arch/x86/entry/entry_32.S:

 360 ENTRY(entry_INT80_32)
 361         ASM_CLAC
 362         pushl   %eax                            # save orig_eax
 363         SAVE_ALL
 364         GET_THREAD_INFO(%ebp)
 365                                                 # system call tracing in operation / emulation
 366         testl   $_TIF_WORK_SYSCALL_ENTRY, TI_flags(%ebp)
 367         jnz     syscall_trace_entry
 368         cmpl    $(NR_syscalls), %eax
 369         jae     syscall_badsys
 370 syscall_call:
 371         call    *sys_call_table(, %eax, 4)
 ...
 451 ENDPROC(entry_INT80_32)

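Still, line 371 is the key: it calls through sys_call_table, the system call table, indexed by the syscall number in %eax (4 bytes per entry). The table is no longer initialized as directly as it used to be. Nowadays, to change the system call table it seems you only need to edit a template file, which a script processes at build time. The 64-bit template is arch/x86/entry/syscalls/syscall_64.tbl (the 32-bit counterpart is syscall_32.tbl in the same directory):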

    #
    # 64-bit system call numbers and entry vectors
    #
    # The format is:
    # <number> <abi> <name> <entry point>
    #
    # The abi is "common", "64" or "x32" for this file.
    #
    0   common  read            sys_read
    1   common  write           sys_write
    ...
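The idea of "one template, expanded at build time into both the syscall numbers and the table" can be sketched in plain C with an X-macro. This is not the kernel's actual mechanism (the kernel runs a script over the .tbl file). The toy_* names are hypothetical, and the abi column is left out. The dispatcher repeats the bounds check seen in the assembly above (cmpl $(NR_syscalls) / jae syscall_badsys) before the indexed call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef long (*toy_sys_call_ptr_t)(long a0, long a1, long a2);

/* Toy entry points; they just report the requested byte count. */
long toy_sys_read(long fd, long buf, long count)  { (void)fd; (void)buf; return count; }
long toy_sys_write(long fd, long buf, long count) { (void)fd; (void)buf; return count; }

/* The "template": <number> <name> <entry point>, one row per syscall. */
#define TOY_SYSCALL_TABLE(X)   \
    X(0, read,  toy_sys_read)  \
    X(1, write, toy_sys_write)

/* Expansion 1: the syscall numbers, plus a trailing count. */
#define X(nr, name, entry) TOY_NR_##name = nr,
enum { TOY_SYSCALL_TABLE(X) TOY_NR_syscalls };
#undef X

/* Expansion 2: the table itself, via designated initializers. */
#define X(nr, name, entry) [nr] = entry,
static const toy_sys_call_ptr_t toy_sys_call_table[TOY_NR_syscalls] = {
    TOY_SYSCALL_TABLE(X)
};
#undef X

/* Bounds check first, then the indexed indirect call. */
long toy_do_syscall(unsigned long nr, long a0, long a1, long a2)
{
    if (nr >= TOY_NR_syscalls || toy_sys_call_table[nr] == NULL)
        return -ENOSYS;
    return toy_sys_call_table[nr](a0, a1, a2);
}

/* Usage example: one valid number and one out-of-range number. */
void toy_dispatch_demo(void)
{
    assert(toy_do_syscall(TOY_NR_write, 1, 0, 5) == 5);
    assert(toy_do_syscall(999, 0, 0, 0) == -ENOSYS);
}
```

Adding a row to TOY_SYSCALL_TABLE updates the enum and the table together, which is the property the template file gives the kernel.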

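Entries for sys_write and similar functions also appear in include/uapi/asm-generic/unistd.h. Strictly speaking these are not C declarations but the generic number-to-function table (note that write is 64 here, not 1 as in syscall_64.tbl). The prototypes themselves live in include/linux/syscalls.h. From include/uapi/asm-generic/unistd.h: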

    /* fs/read_write.c */
    #define __NR3264_lseek 62
    __SC_3264(__NR3264_lseek, sys_llseek, sys_lseek)
    #define __NR_read 63
    __SYSCALL(__NR_read, sys_read)
    #define __NR_write 64
    __SYSCALL(__NR_write, sys_write)

Their definitions are in fs/read_write.c:

    SYSCALL_DEFINE3(write, unsigned int, fd, const char __user *, buf,
            size_t, count)
    {
        struct fd f = fdget_pos(fd);
        ssize_t ret = -EBADF;

        if (f.file) {
            loff_t pos = file_pos_read(f.file);
            ret = vfs_write(f.file, buf, count, &pos);
            if (ret >= 0)
                file_pos_write(f.file, pos);
            fdput_pos(f);
        }

        return ret;
    }
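The file_pos_read / vfs_write / file_pos_write sequence is what makes consecutive write() calls append to each other: the position is read, advanced by vfs_write, and stored back on success. The effect can be observed from user space. This is a small sketch; offset_after_two_writes and write_offset_demo are names invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Write 5 bytes and then 7 bytes to an anonymous temporary file and
 * return the resulting file offset, or -1 if anything fails. */
long offset_after_two_writes(void)
{
    FILE *tmp = tmpfile();
    long result = -1;
    int fd;

    if (tmp == NULL)
        return -1;
    fd = fileno(tmp);

    /* Each successful write() leaves the offset advanced by its count. */
    if (write(fd, "hello", 5) == 5 && write(fd, ", world", 7) == 7)
        result = (long)lseek(fd, 0, SEEK_CUR);

    fclose(tmp);
    return result;
}

/* Usage example: the offset ends up at 5 + 7 bytes. */
void write_offset_demo(void)
{
    assert(offset_after_two_writes() == 12);
}
```

The raw file descriptor is used on purpose, so that stdio buffering does not get between the program and the system call.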

SYSCALL_DEFINE3 is of course yet another macro, defined in include/linux/syscalls.h:

    #define SYSCALL_DEFINE3(name, ...) SYSCALL_DEFINEx(3, _##name, __VA_ARGS__)

    #define SYSCALL_DEFINEx(x, sname, ...)              \
        SYSCALL_METADATA(sname, x, __VA_ARGS__)         \
        __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)

    #define __SYSCALL_DEFINEx(x, name, ...)                 \
        asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))   \
            __attribute__((alias(__stringify(SyS##name))));     \
        static inline long SYSC##name(__MAP(x,__SC_DECL,__VA_ARGS__));  \
        asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__));  \
        asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__))   \
        {                               \
            long ret = SYSC##name(__MAP(x,__SC_CAST,__VA_ARGS__));  \
            __MAP(x,__SC_TEST,__VA_ARGS__);             \
            __PROTECT(x, ret,__MAP(x,__SC_ARGS,__VA_ARGS__));   \
            return ret;                     \
        }                               \

Damn, by this point my head is spinning. Presumably macros are pushed this far in order to support multiple architectures.
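Stripped of the helper macros, the pattern in __SYSCALL_DEFINEx is: a public entry point that takes every argument as long, casts each one back to its declared type, and forwards to a static inline function whose body is the block written after the macro. The sketch below is a heavily simplified stand-in under my own assumptions: fixed at three arguments, no metadata, no alias attribute, no __MAP, and all toy_* names are hypothetical.

```c
#include <assert.h>

/*
 * TOY_SYSCALL_DEFINE3(name, t1, a1, t2, a2, t3, a3) emits:
 *   1. a forward declaration of the typed body toy_SYSC_<name>,
 *   2. the wrapper toy_sys_<name>, which takes longs and casts them
 *      (the role played by SyS##name and __SC_CAST above),
 *   3. the header of toy_SYSC_<name>; the braces that follow the
 *      macro invocation become its body.
 */
#define TOY_SYSCALL_DEFINE3(name, t1, a1, t2, a2, t3, a3)       \
    static inline long toy_SYSC_##name(t1 a1, t2 a2, t3 a3);    \
    long toy_sys_##name(long a1, long a2, long a3)              \
    {                                                           \
        return toy_SYSC_##name((t1)a1, (t2)a2, (t3)a3);         \
    }                                                           \
    static inline long toy_SYSC_##name(t1 a1, t2 a2, t3 a3)

/* Reads like a function definition, just as SYSCALL_DEFINE3(write, ...) does. */
TOY_SYSCALL_DEFINE3(sum3, int, a, unsigned int, b, short, c)
{
    return (long)a + (long)b + (long)c;
}

/* Usage example: callers only ever see the all-long entry point. */
void toy_define_demo(void)
{
    assert(toy_sys_sum3(1, 2, 3) == 6);
}
```

In this sketch the wrapper is the only symbol with external linkage, and the typed body stays private to the file.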

3. system_call_fastpath

    #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
    #              | |       |   ||||       |         |
                bash-1977  [000] .... 17284.993652: sys_close <-system_call_fastpath

You have seen this function before, right? Its origin is unclear for now: grep finds no definition, only these references:

Documentation/trace/ftrace.txt:574:            bash-1977  [000] .... 17284.993652: sys_close <-system_call_fastpath
Documentation/trace/ftrace.txt:583:            sshd-1974  [003] .... 17284.993658: sys_select <-system_call_fastpath
Documentation/trace/ftrace.txt:597:called this function "system_call_fastpath". The timestamp is the time
Documentation/trace/ftrace.txt:640: => system_call_fastpath
Documentation/trace/ftrace.txt:1053: => system_call_fastpath
Documentation/trace/ftrace.txt:1276: => system_call_fastpath
Documentation/trace/ftrace.txt:1376: => system_call_fastpath
Documentation/trace/ftrace.txt:2194:          usleep-2665  [001] ....  4186.475355: sys_nanosleep <-system_call_fastpath
Documentation/trace/ftrace.txt:2575:            bash-1994  [000] ....  5281.568967: sys_dup2 <-system_call_fastpath
Documentation/trace/ftrace.txt:2888: 17)      128     128   system_call_fastpath+0x16/0x1b
Documentation/kasan.txt:63: system_call_fastpath+0x12/0x17
Documentation/kasan.txt:104: [<ffffffff81cd3129>] system_call_fastpath+0x12/0x17

However, in the older 3.0 kernel, grep does turn up something:

arch/x86/kernel/entry_64.S:492:system_call_fastpath:
arch/x86/kernel/entry_64.S:573:    jmp system_call_fastpath

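I don't understand it yet, so I'll leave it here for now. Judging from these hits, system_call_fastpath is presumably an assembly label on the 64-bit entry path of older kernels rather than a C function, which would explain why the newer tree only mentions it in documentation that quotes old traces.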
