This article gives a brief analysis of the bash source code (version 4.2.46(1)-release).
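bash is written in C, and its source relies on only a handful of data structures: arrays, trees, singly linked lists, doubly linked lists, and hash tables. Almost every bash construct is built from these basic structures.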
The most important structures are defined in the header file command.h in the root directory of the source tree.
The data structure bash uses to pass information between its stages, and to represent the unit of data being processed, is WORD_DESC:
typedef struct word_desc {
  char *word;   /* Zero terminated string. */
  int flags;    /* Flags associated with this word. */
} WORD_DESC;
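WORD_DESC represents a single word. The character pointer word points to a string terminated by \0, and the integer member flags records the properties of the word as a set of flag bits.
The current source defines more than twenty such flags. For example, W_HASDOLLAR marks a word that contains the expansion character $, W_ASSIGNMENT marks a word that is an assignment statement, and W_GLOBEXP marks a word that is the result of pathname (glob) expansion.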
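To make the role of flags concrete, here is a minimal sketch of how such bits can be set and tested. The helper demo_describe_word() is hypothetical and the numeric flag values are illustrative assumptions; the authoritative definitions are the ones in command.h, and the real parser sets flags while scanning input, covering many more cases.

```c
#include <string.h>

/* Illustrative flag bits only; the authoritative values are in command.h. */
#define W_HASDOLLAR   0x01   /* word contains a '$' expansion */
#define W_ASSIGNMENT  0x04   /* word has the form name=value */

typedef struct word_desc {
  char *word;   /* Zero terminated string. */
  int flags;    /* Flags associated with this word. */
} WORD_DESC;

/* Hypothetical helper: fill in a WORD_DESC and derive two flags from its text. */
WORD_DESC
demo_describe_word (char *text)
{
  WORD_DESC w;
  char *eq = strchr (text, '=');

  w.word = text;
  w.flags = 0;
  if (strchr (text, '$') != NULL)
    w.flags |= W_HASDOLLAR;
  if (eq != NULL && eq != text)      /* at least one character before '=' */
    w.flags |= W_ASSIGNMENT;
  return w;
}
```

Because each property is a separate bit, a word can carry several at once, and a later stage tests for one with a bitwise AND, for example (w.flags & W_ASSIGNMENT).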
Words are combined into a simple linked list, WORD_LIST:
typedef struct word_list {
  struct word_list *next;
  WORD_DESC *word;
} WORD_LIST;
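WORD_LIST is everywhere in the shell. A simple command is a word list, the result of an expansion is also a word list, and the arguments to a builtin command are again a word list.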
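As a rough illustration of how a command line such as ls -l /tmp could be held in this structure, the sketch below builds a three-word list and walks it. The helpers demo_push_word(), demo_list_length() and demo_free_list() are hypothetical names made up for this example; bash has its own construction and disposal routines.

```c
#include <stdlib.h>
#include <string.h>

typedef struct word_desc { char *word; int flags; } WORD_DESC;
typedef struct word_list { struct word_list *next; WORD_DESC *word; } WORD_LIST;

/* Hypothetical helper: put a copy of TEXT in a new node in front of REST.
   Returns NULL if memory runs out. */
WORD_LIST *
demo_push_word (const char *text, WORD_LIST *rest)
{
  WORD_LIST *node = malloc (sizeof (WORD_LIST));
  WORD_DESC *w = malloc (sizeof (WORD_DESC));
  char *copy = malloc (strlen (text) + 1);

  if (node == NULL || w == NULL || copy == NULL)
    {
      free (node);
      free (w);
      free (copy);
      return NULL;
    }
  strcpy (copy, text);
  w->word = copy;
  w->flags = 0;
  node->word = w;
  node->next = rest;
  return node;
}

/* Walk the list by following next until NULL. */
int
demo_list_length (const WORD_LIST *list)
{
  int n = 0;
  for (; list != NULL; list = list->next)
    n++;
  return n;
}

/* Release every node together with its word. */
void
demo_free_list (WORD_LIST *list)
{
  while (list != NULL)
    {
      WORD_LIST *next = list->next;
      free (list->word->word);
      free (list->word);
      free (list);
      list = next;
    }
}
```

Building the list back to front, as in demo_push_word ("ls", demo_push_word ("-l", demo_push_word ("/tmp", NULL))), leaves the command name at the head, which is where a consumer of a simple command would look for it.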
The REDIRECT structure describes the list of redirections attached to a command; its next pointer links to the next REDIRECT object:
typedef struct redirect {
  struct redirect *next;            /* Next element, or NULL. */
  REDIRECTEE redirector;            /* Descriptor or varname to be redirected. */
  int rflags;                       /* Private flags for this redirection */
  int flags;                        /* Flag value for `open'. */
  enum r_instruction instruction;   /* What to do with the information. */
  REDIRECTEE redirectee;            /* File descriptor or filename */
  char *here_doc_eof;               /* The word that appeared in <<foo. */
} REDIRECT;
The integer member flags specifies how the target file is opened.
The redirection descriptor redirector is of type REDIRECTEE, a union:
typedef union {
  int dest;              /* Place to redirect REDIRECTOR to, or ... */
  WORD_DESC *filename;   /* filename to redirect to. */
} REDIRECTEE;
instruction is a variable of the enumeration type r_instruction, which defines the kind of redirection:
enum r_instruction {
  r_output_direction, r_input_direction, r_inputa_direction,
  r_appending_to, r_reading_until, r_reading_string,
  r_duplicating_input, r_duplicating_output, r_deblank_reading_until,
  r_close_this, r_err_and_out, r_input_output, r_output_force,
  r_duplicating_input_word, r_duplicating_output_word,
  r_move_input, r_move_output, r_move_input_word, r_move_output_word,
  r_append_err_and_out
};
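In REDIRECTEE, if the redirection type is r_duplicating_input or r_duplicating_output, the integer member dest is used (a negative value indicates an invalid redirection); otherwise the pointer member filename is used.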
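The rule above, where the instruction decides which union member is meaningful, can be sketched as follows. This is only an illustration: the enum is cut down to three members (so its numeric values differ from the real r_instruction), and demo_describe_target() is a hypothetical helper, not a function from redir.c.

```c
#include <stdio.h>

/* Reduced copy of the enum for the example; the full list is in command.h. */
enum r_instruction { r_output_direction, r_duplicating_input, r_duplicating_output };

typedef struct word_desc { char *word; int flags; } WORD_DESC;

typedef union {
  int dest;              /* file descriptor, for the duplicating forms */
  WORD_DESC *filename;   /* filename word, for the other forms */
} REDIRECTEE;

/* Hypothetical helper: write a description of TARGET into BUF.
   Returns 0 on success, -1 for an invalid (negative) descriptor. */
int
demo_describe_target (enum r_instruction ri, REDIRECTEE target,
                      char *buf, size_t size)
{
  switch (ri)
    {
    case r_duplicating_input:
    case r_duplicating_output:
      if (target.dest < 0)
        return -1;
      snprintf (buf, size, "fd %d", target.dest);
      return 0;
    default:
      /* Every other instruction in this sketch reads the filename member. */
      snprintf (buf, size, "file %s", target.filename->word);
      return 0;
    }
}
```

Reading the wrong member of a union gives meaningless data, which is why the instruction has to be checked before either dest or filename is touched.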
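The character pointer member here_doc_eof of the REDIRECT structure is used when the redirection is a here document: it holds the delimiter word that follows <<.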
The COMMAND structure describes a bash command; a compound command may in turn contain other commands:
typedef struct command {
  enum command_type type;   /* FOR CASE WHILE IF CONNECTION or SIMPLE. */
  int flags;                /* Flags controlling execution environment. */
  int line;                 /* line number the command starts on */
  REDIRECT *redirects;      /* Special redirects for FOR CASE, etc. */
  union {
    struct for_com *For;
    struct case_com *Case;
    struct while_com *While;
    struct if_com *If;
    struct connection *Connection;
    struct simple_com *Simple;
    struct function_def *Function_def;
    struct group_com *Group;
#if defined (SELECT_COMMAND)
    struct select_com *Select;
#endif
#if defined (DPAREN_ARITHMETIC)
    struct arith_com *Arith;
#endif
#if defined (COND_COMMAND)
    struct cond_com *Cond;
#endif
#if defined (ARITH_FOR_COMMAND)
    struct arith_for_com *ArithFor;
#endif
    struct subshell_com *Subshell;
    struct coproc_com *Coproc;
  } value;
} COMMAND;
The enumeration member type defines the command type:
/* Command Types: */
enum command_type {
  cm_for, cm_case, cm_while, cm_if, cm_simple, cm_select,
  cm_connection, cm_function_def, cm_until, cm_group,
  cm_arith, cm_cond, cm_arith_for, cm_subshell, cm_coproc
};
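The integer member flags describes the execution environment of the command, for example whether it runs in a subshell or in the background.
The union member value holds a pointer to the structure for the specific command; each command type has its own structure.
For example, the structure for the if command: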
/* IF command. */
typedef struct if_com {
  int flags;             /* See description of CMD flags. */
  COMMAND *test;         /* Thing to test. */
  COMMAND *true_case;    /* What to do if the test returned non-zero. */
  COMMAND *false_case;   /* What to do if the test returned zero. */
} IF_COM;
The structure for a simple command:
typedef struct simple_com {
  int flags;             /* See description of CMD flags. */
  int line;              /* line number the command starts on */
  WORD_LIST *words;      /* The program name, the arguments, variable assignments, etc. */
  REDIRECT *redirects;   /* Redirections to perform. */
} SIMPLE_COM;
The structure for the while command:
/* WHILE command. */
typedef struct while_com {
  int flags;         /* See description of CMD flags. */
  COMMAND *test;     /* Thing to test. */
  COMMAND *action;   /* Thing to do while test is non-zero. */
} WHILE_COM;
And so on for the other command types.
Unless otherwise noted, the files mentioned below are in the root directory of the bash source.
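Executing a line of bash input involves two major steps: parsing and execution (note that these differ from the parsing and execution described in the previous article). Parsing produces the command structure to be executed, COMMAND *global_command. Execution runs that command according to its type and handles the result.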
The entry point of bash, main(), is in shell.c:
int
main (argc, argv, env)
     int argc;
     char **argv, **env;
{
  ....
  shell_initialize ();
  ....
  run_startup_files ();
  ....
  shell_initialized = 1;

  /* Read commands until exit condition. */
  reader_loop ();
  exit_shell (last_command_exit_value);
}
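main() sets up a number of state variables used during startup and operation and initializes the shell according to its arguments: shell_initialize () initializes shell variables and parameters, and run_startup_files () executes the required startup files (/etc/profile, ~/.bashrc, and so on).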
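Once initialization is complete, control enters the read-and-execute loop reader_loop() in eval.c, which keeps reading and executing commands until it reaches EOF.
At this point the call chain is main()-->reader_loop().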
/* Read and execute commands until EOF is reached. This assumes that
   the input source has already been initialized. */
int
reader_loop ()
{
  ....
  if (read_command () == 0)
    {
      ....
      else if (current_command = global_command)
        {
          ....
          execute_command (current_command);
        }
    }
  ....
  return (last_command_exit_value);
}
reader_loop() calls read_command() to obtain the command structure global_command, assigns it to current_command, and passes it to execute_command () for execution.
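The hand-off between the two globals can be sketched with a toy loop. Everything here is a simplification under stated assumptions: a command is reduced to its exit status, the "parser" just pops entries from an array, and demo_read_command(), demo_execute_command() and demo_reader_loop() are hypothetical stand-ins for the real functions.

```c
#include <stddef.h>

typedef struct command { int status; } COMMAND;   /* toy: a command is just its exit status */

COMMAND *global_command;        /* filled in by the "parser" */
int last_command_exit_value;
int EOF_Reached;

static COMMAND *demo_queue;     /* hypothetical input source */
static size_t demo_len, demo_pos;

/* Stand-in for read_command(): returns 0 and leaves the parsed command in
   global_command, or sets EOF_Reached when the input is exhausted. */
int
demo_read_command (void)
{
  if (demo_pos >= demo_len)
    {
      global_command = NULL;
      EOF_Reached = 1;
    }
  else
    global_command = &demo_queue[demo_pos++];
  return 0;
}

/* Stand-in for execute_command(): "runs" the command and records its status. */
int
demo_execute_command (COMMAND *command)
{
  last_command_exit_value = command->status;
  return last_command_exit_value;
}

int
demo_reader_loop (COMMAND *commands, size_t count)
{
  COMMAND *current_command;

  demo_queue = commands;
  demo_len = count;
  demo_pos = 0;
  EOF_Reached = 0;
  last_command_exit_value = 0;

  while (EOF_Reached == 0)
    {
      if (demo_read_command () == 0
          && (current_command = global_command) != NULL)
        {
          global_command = NULL;    /* ownership moves to current_command */
          demo_execute_command (current_command);
        }
    }
  return last_command_exit_value;
}
```

The loop returns the status of the last command it ran, which mirrors how main() passes last_command_exit_value to exit_shell().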
read_command () calls parse_command (); the call chain is now main()-->reader_loop()-->read_command()-->parse_command():
/* Read and parse a command, returning the status of the parse. The command
   is left in the globval variable GLOBAL_COMMAND for use by reader_loop.
   This is where the shell timeout code is executed. */
int
read_command ()
{
  ....
  result = parse_command ();
  ....
  return (result);
}
....
/* Call the YACC-generated parser and return the status of the parse.
   Input is read from the current input stream (bash_input). yyparse
   leaves the parsed command in the global variable GLOBAL_COMMAND.
   This is where PROMPT_COMMAND is executed. */
int
parse_command ()
{
  ....
  r = yyparse ();

  if (need_here_doc)
    gather_here_documents ();

  return (r);
}
parse_command() calls yyparse () in y.tab.c and uses gather_here_documents () to process input redirections of the here document type.
yyparse () is generated by YACC from parse.y. The function uses a large number of goto statements, so the generated file is hard to read:
int
yyparse ()
{
  ....
  yychar = YYLEX;
  ....
  yytoken = YYTRANSLATE (yychar);
  ....
  yyn += yytoken;
  ....
  switch (yyn)
    {
    case 2:
      {
        global_command = (yyvsp[(1) - (2)].command);
        ....
      }
      break;
    case 3:
      {
        global_command = (COMMAND *)NULL;
        ....
      }
      break;
    ....
    case 6:
      {
        (yyval.word_list) = make_word_list ((yyvsp[(1) - (1)].word), (WORD_LIST *)NULL);
      }
      break;
    ....
    case 8:
      {
        ....
        redir.filename = (yyvsp[(2) - (2)].word);
        (yyval.redirect) = make_redirection (source, r_output_direction, redir, 0);
      }
    ....
    case 57:
      {
        (yyval.command) = make_simple_command ((yyvsp[(1) - (1)].element), (COMMAND *)NULL);
      }
      break;
    ....
    case 107:
      {
        (yyval.command) = make_if_command ((yyvsp[(2) - (7)].command), (yyvsp[(4) - (7)].command), (yyvsp[(6) - (7)].command));
      }
      break;
    ....
    default: break;
    }
  ....
  return YYID (yyresult);
}
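Inside the function, yylex() is called (through the macro #define YYLEX yylex ()) to obtain a token, from which the integer variable yyn is computed; the value of yyn then selects which command structure is built.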
Inside yylex(), read_token() is called to obtain tokens of the various types, and it in turn calls read_token_word() to build the WORD_DESC structure for each word.
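The split between operator tokens and word tokens can be illustrated with a deliberately tiny scanner. This is a sketch under heavy assumptions: demo_read_token() is a hypothetical function, it knows only single-character operators, and it ignores quoting, escapes, reserved words and everything else the real read_token() and read_token_word() handle.

```c
#include <stddef.h>
#include <string.h>

enum demo_token { DEMO_EOF, DEMO_WORD, DEMO_OPERATOR };

/* Hypothetical scanner. Reads one token starting at *PP, stores its text in
   BUF (SIZE must be at least 2), and advances *PP past it. */
enum demo_token
demo_read_token (const char **pp, char *buf, size_t size)
{
  static const char operators[] = "|&;<>()";
  const char *p = *pp;
  size_t n = 0;

  while (*p == ' ' || *p == '\t')       /* skip blanks between tokens */
    p++;

  if (*p == '\0')
    {
      buf[0] = '\0';
      *pp = p;
      return DEMO_EOF;
    }

  if (strchr (operators, *p) != NULL)   /* single-character operator */
    {
      buf[0] = *p;
      buf[1] = '\0';
      *pp = p + 1;
      return DEMO_OPERATOR;
    }

  /* Anything else is a word: collect characters up to a blank or operator.
     Characters that do not fit in BUF are dropped. */
  while (*p != '\0' && *p != ' ' && *p != '\t'
         && strchr (operators, *p) == NULL)
    {
      if (n + 1 < size)
        buf[n++] = *p;
      p++;
    }
  buf[n] = '\0';
  *pp = p;
  return DEMO_WORD;
}
```

Called repeatedly on the input ls -l|wc, the scanner yields the words ls and -l, the operator |, the word wc, and then DEMO_EOF; a parser would turn each word into a WORD_DESC.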
yyparse() then calls functions in make_cmd.c to assemble the tokens and words obtained through yylex() into a concrete command.
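Among them, make_word_list() builds a WORD_LIST; make_redirection() builds a REDIRECT list; command_connect() links the commands on one line according to their logical order; make_simple_command() builds a simple command; and a series of other functions build the remaining command types.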
The call chain at this point is:
main()-->reader_loop()-->read_command()-->parse_command()-->yyparse()-->yylex()-->read_token()-->read_token_word()
               |                                                |                      |                 |
        current_command <------------------------------- global_command <------------token-------------word
Back in reader_loop(), once read_command() has produced current_command, execute_command() in execute_cmd.c is called to run it:
int
execute_command (command)
     COMMAND *command;
{
  ....
  result = execute_command_internal (command, 0, NO_PIPE, NO_PIPE, bitmap);
  ....
  return (result);
}
execute_command() calls execute_command_internal():
int
execute_command_internal (command, asynchronous, pipe_in, pipe_out, fds_to_close)
     ....
{
  ....
  switch (command->type)
    {
    case cm_simple:
      {
        ....
        exec_result = execute_simple_command (command->value.Simple, pipe_in, pipe_out, asynchronous, fds_to_close);
        ....
      }
      break;
    case cm_for:
      ....
      exec_result = execute_for_command (command->value.For);
      break;
    ....
    case cm_cond:
      ....
      exec_result = execute_cond_command (command->value.Cond);
      ....
      break;
    ....
    default:
      command_error ("execute_command", CMDERR_BADTYPE, command->type, 0);
    }
  ....
  last_command_exit_value = exec_result;
  ....
  return (last_command_exit_value);
}
execute_command_internal() examines command->type, calls the execution function that matches the command type, and returns the command's exit status.
The call chain is now main()-->reader_loop()-->execute_command()-->execute_command_internal()-->execute_xxxx_command().
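The dispatch on command->type, and the way a compound command ends up running its inner commands through the same entry point, can be sketched with a two-type toy version of COMMAND. This is an assumption-laden model: only if and simple commands exist, a simple command is reduced to a canned exit status, and demo_execute() is a hypothetical stand-in for execute_command_internal().

```c
#include <stddef.h>

enum command_type { cm_if, cm_simple };   /* reduced to two types for the sketch */

struct command;

typedef struct if_com {
  struct command *test;
  struct command *true_case;
  struct command *false_case;   /* may be NULL when there is no else part */
} IF_COM;

typedef struct simple_com {
  int status;                   /* toy: the exit status this command "produces" */
} SIMPLE_COM;

typedef struct command {
  enum command_type type;
  union {
    IF_COM *If;
    SIMPLE_COM *Simple;
  } value;
} COMMAND;

/* Hypothetical stand-in for execute_command_internal(): pick the handler
   from the type tag and recurse into the sub-commands of an if. */
int
demo_execute (const COMMAND *command)
{
  if (command == NULL)
    return 0;                   /* nothing to run counts as success */

  switch (command->type)
    {
    case cm_simple:
      return command->value.Simple->status;
    case cm_if:
      /* Exit status 0 means the test succeeded. */
      if (demo_execute (command->value.If->test) == 0)
        return demo_execute (command->value.If->true_case);
      return demo_execute (command->value.If->false_case);
    }
  return 1;                     /* unknown type */
}
```

However deeply commands are nested, the recursion bottoms out in the cm_simple case, which is the toy counterpart of execute_simple_command().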
With the exception of execute_arith_command() and execute_cond_command(), these functions call execute_command_internal() recursively and eventually reach execute_simple_command():
static int
execute_simple_command (simple_command, pipe_in, pipe_out, async, fds_to_close)
     ....
{
  ....
  if (dofork)
    {
      ....
      if (make_child (savestring (the_printed_command_except_trap), async) == 0)
        {
          ....
        }
      else
        {
          ....
          return (result);
        }
    }
  ....
  words = expand_words (simple_command->words);
  ....
  builtin = find_special_builtin (words->word->word);
  ....
  func = find_function (words->word->word);
  ....
run_builtin:
  ....
  if (func == 0 && builtin == 0)
    builtin = find_shell_builtin (this_command_name);
  ....
  if (builtin || func)
    {
      ....
      result = execute_builtin_or_function (words, builtin, func, simple_command->redirects, fds_to_close, simple_command->flags);
      ....
      goto return_result;
    }
  ....
  result = execute_disk_command (words, simple_command->redirects, command_line, pipe_in, pipe_out, async, fds_to_close, simple_command->flags);

return_result:
  ....
  return (result);
}
First, for a command that must run in a subshell (such as a command in a pipeline), make_child() in jobs.c is called, which in turn leads to the fork() and execve() system calls.
If no subshell is needed, the words of the simple command are expanded by functions in subst.c, including expand_words() and expand_word_list_internal().
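Command lookup follows, calling these functions in order: find_special_builtin() searches the special builtins (in this version of bash they are: break continue : eval exec exit return set unset export readonly shift source . times trap), find_function() searches shell functions, and find_shell_builtin() searches the remaining builtins.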
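The lookup order described above can be modeled as a chain of table searches. This is only a sketch: the two tables are small illustrative subsets, the function table is passed in by the caller, and demo_classify() is a hypothetical helper rather than anything in execute_cmd.c.

```c
#include <stddef.h>
#include <string.h>

enum demo_kind {
  DEMO_SPECIAL_BUILTIN, DEMO_FUNCTION, DEMO_BUILTIN, DEMO_DISK_COMMAND
};

/* Small illustrative subsets, not the complete tables. */
static const char *const demo_special_builtins[] = { "break", ":", "eval", "exit", NULL };
static const char *const demo_builtins[] = { "cd", "echo", "pwd", NULL };

/* Return 1 if NAME appears in the NULL-terminated array NAMES. */
static int
demo_contains (const char *const *names, const char *name)
{
  size_t i;

  for (i = 0; names[i] != NULL; i++)
    if (strcmp (names[i], name) == 0)
      return 1;
  return 0;
}

/* Hypothetical helper: decide how NAME would be run, trying the tables in
   the order given in the text. FUNCTIONS lists the defined shell functions. */
enum demo_kind
demo_classify (const char *name, const char *const *functions)
{
  if (demo_contains (demo_special_builtins, name))
    return DEMO_SPECIAL_BUILTIN;
  if (demo_contains (functions, name))
    return DEMO_FUNCTION;
  if (demo_contains (demo_builtins, name))
    return DEMO_BUILTIN;
  return DEMO_DISK_COMMAND;     /* fall through to a search on disk */
}
```

One consequence of this order is that a shell function named echo shadows the echo builtin, while a name that matches none of the tables falls through to the disk search.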
If the lookup succeeds, execute_builtin_or_function() is run; otherwise execute_disk_command() is run:
static int
execute_disk_command (words, redirects, command_line, pipe_in, pipe_out, async, fds_to_close, cmdflags)
     ....
{
  ....
  result = EXECUTION_SUCCESS;
  ....
  command = search_for_command (pathname);
  ....
  pid = make_child (savestring (command_line), async);
  if (pid == 0)
    {
      ....
      if (command == 0)
        {
          ....
          internal_error (_("%s: command not found"), pathname);
          exit (EX_NOTFOUND);
          ....
        }
      ....
      exit (shell_execve (command, args, export_env));
    }
  else
    {
parent_return:
      close_pipes (pipe_in, pipe_out);
      ....
      FREE (command);
      return (result);
    }
}
execute_disk_command() first calls search_for_command() in findcmd.c to locate the command (not to be confused with the lookup performed in execute_simple_command()):
char *
search_for_command (pathname)
     const char *pathname;
{
  ....
  hashed_file = phash_search (pathname);
  ....
  if (hashed_file)
    command = hashed_file;
  else if (absolute_program (pathname))
    command = savestring (pathname);
  else
    {
      ....
      command = find_user_command (pathname);
      ....
    }
  return (command);
}
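The search starts with the hash cache. If the command name contains a slash /, it is neither searched for in PATH nor cached in the hash table; the name is returned as is.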
If the name is not found in the hash cache and contains no slash, find_user_command(), find_user_command_internal(), and related functions continue the search in PATH.
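The three-step search (hash cache, names containing a slash, then PATH) can be sketched without touching the real filesystem. Everything here is an assumption made for the example: demo_search_for_command() is a hypothetical function, the hash cache is reduced to a single optional entry passed by the caller, the set of existing files is a plain array, and empty PATH components are simply skipped.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper. HASHED is the cached full path for NAME or NULL,
   PATH is a colon-separated directory list, FILES is a NULL-terminated
   array standing in for the files that exist. On success the result is
   written to OUT (SIZE must be at least 1) and 1 is returned. */
int
demo_search_for_command (const char *name, const char *hashed, const char *path,
                         const char *const *files, char *out, size_t size)
{
  size_t i;

  if (hashed != NULL)                   /* 1. hit in the hash cache */
    {
      snprintf (out, size, "%s", hashed);
      return 1;
    }
  if (strchr (name, '/') != NULL)       /* 2. name with a slash: use as is */
    {
      snprintf (out, size, "%s", name);
      return 1;
    }

  while (*path != '\0')                 /* 3. try each PATH directory in turn */
    {
      size_t len = strcspn (path, ":");

      if (len > 0)
        {
          snprintf (out, size, "%.*s/%s", (int) len, path, name);
          for (i = 0; files[i] != NULL; i++)
            if (strcmp (files[i], out) == 0)
              return 1;
        }
      path += len;
      if (*path == ':')
        path++;
    }

  out[0] = '\0';                        /* not found anywhere */
  return 0;
}
```

With PATH set to /usr/local/bin:/usr/bin:/bin and only /bin/ls present, a search for ls tries the first two directories, fails, and succeeds on the third; a search for ./run.sh never looks at PATH at all.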
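Next, execute_disk_command() calls make_child() in jobs.c, which performs the fork() system call and returns pid. In the child process, execute_disk_command() checks the command returned by the search: if no command was found, it reports an error and exits; if one was found, it calls shell_execve(), which in turn performs the execve() system call: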
int
shell_execve (command, args, env)
     ....
{
  ....
  execve (command, args, env);
  ....
  i = errno;    /* error from execve() */
  ....
  if (i != ENOEXEC)
    {
      if (file_isdir (command))
        ....
      else if (executable_file (command) == 0)
        ....
      else
        ....
    }
  ....
  return (execute_shell_script (sample, sample_len, command, args, env));
  ....
}
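If execve() fails, shell_execve() looks at errno. For any error other than ENOEXEC it checks whether the file is a directory or lacks execute permission and handles the failure accordingly. If the error is ENOEXEC, meaning the file is not in an executable format the kernel recognizes, the file is treated as a script and run through execute_shell_script().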
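The fork-then-exec pattern that make_child() and shell_execve() wrap can be shown with plain POSIX calls. This is a minimal sketch, assuming a POSIX system: demo_run() is a hypothetical helper, it always waits for the child, and it has none of bash's job control, pipe handling, or script fallback. The value 127 is used for an exec failure to mirror the conventional "command not found" status.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run PATH with ARGV in a child process and return its
   exit status, or -1 if the child could not be created or did not exit
   normally. */
int
demo_run (const char *path, char *const argv[])
{
  int status;
  pid_t pid = fork ();

  if (pid < 0)
    return -1;                  /* fork failed */

  if (pid == 0)
    {
      /* Child: replace this process image with the command. */
      execve (path, argv, environ);
      _exit (127);              /* reached only if execve() failed */
    }

  /* Parent: wait for the child and collect its status. */
  if (waitpid (pid, &status, 0) < 0)
    return -1;
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

As in execute_disk_command(), the same function continues in two processes after fork(): the branch with pid == 0 is the child, and the other branch is the parent, which is the only one that returns to the caller.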
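At this point the child process exits. The parent process closes the pipes, frees the command pathname, and returns to execute_command_internal(), which assigns the result to the global variable last_command_exit_value and returns it.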
The call relationships for the whole flow are:
main()
   |
reader_loop()          parse
   |--------------------------->read_command()-->parse_command()-->yyparse()-->yylex()-->read_token()-->read_token_word()
   |                                  |                               |                       |                |
execute_command() <------------ current_command <-------------- global_command <-----------token------------word
   |
execute_command_internal()
   |
execute_xxxx_command()
   |
execute_simple_command()
   |
   |--->expand_words()-->expand_word_list_internal()
   |                                                      child process
   |------------------------------>execute_disk_command()------------->shell_execve()-->execve()
   |       disk command                 |                   |                              |
   |functions and builtins         make_child()             |                              |FAILED
   |                                    |                   |                              |
execute_builtin_or_function()        fork()----------->pid                      ->execute_shell_script()
                                        |
                                        --------->return(result)   parent process