[Translation] Create Your Own Shell in Python: Part I

Date: 2019-12-06


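Original article and notes

https://hackercollider.com/articles/2016/07/05/create-your-own-shell-in-python-part-1/
The source article for this translation was selected by Linux中國 (Linux China), which holds the copyright to the translation.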

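I was curious about how a shell (such as bash or csh) works internally. To satisfy that curiosity, I implemented a shell named yosh (Your Own Shell) in Python. The concepts introduced in this article can be applied to other programming languages as well.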

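(Note: the source code used in this post is available here and is released under the MIT license. I tested it with Python 2.7.10 and 3.4.3 on Mac OS X 10.11.5. It should also run in other Unix-like environments, such as Linux and Cygwin on Windows.)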

Let's get started.

Step 0: Project structure

For this project, I used the following project structure.

yosh_project
|-- yosh
   |-- __init__.py
   |-- shell.py

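yosh_project is the project root directory (you could also simply name it yosh).

yosh is the package directory, and __init__.py makes it a package with the same name as the directory. (If you don't write Python, you can ignore it.)

shell.py is our main script file.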

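Step 1: Shell loop

When a shell starts, it displays a command prompt and waits for your command input. After it receives a command and executes it (explained in detail later), the shell returns to the loop and waits for the next command.

In shell.py, we start with a simple main function that calls the shell_loop() function, as follows: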

def shell_loop():
    # Start the loop here
    pass


def main():
    shell_loop()


if __name__ == "__main__":
    main()

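Next, in shell_loop(), we use a status flag to indicate whether the loop should continue or stop. At the start of each iteration, our shell displays a command prompt and waits to read the command input.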

import sys

SHELL_STATUS_RUN = 1
SHELL_STATUS_STOP = 0


def shell_loop():
    status = SHELL_STATUS_RUN

    while status == SHELL_STATUS_RUN:
        # Display a command prompt
        sys.stdout.write('> ')
        sys.stdout.flush()

        # Read command input
        cmd = sys.stdin.readline()

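After that, we tokenize the command input and execute it (we will implement the tokenize and execute functions shortly).

Our shell_loop() therefore looks like this: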

import sys

SHELL_STATUS_RUN = 1
SHELL_STATUS_STOP = 0


def shell_loop():
    status = SHELL_STATUS_RUN

    while status == SHELL_STATUS_RUN:
        # Display a command prompt
        sys.stdout.write('> ')
        sys.stdout.flush()

        # Read command input
        cmd = sys.stdin.readline()

        # Tokenize the command input
        cmd_tokens = tokenize(cmd)

        # Execute the command and retrieve new status
        status = execute(cmd_tokens)

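That is our entire shell loop. If we start the shell with python shell.py, it displays the command prompt. However, if we type a command and press Enter, it throws an error, because we have not yet defined the tokenize function.

To exit the shell, try pressing ctrl-c. Later I will explain how to exit the shell gracefully.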
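To look at the loop's control flow in isolation, here is a minimal sketch that reads from any file-like object instead of the terminal. The `handle` callback is a hypothetical stand-in for the tokenize and execute steps, and stopping when `readline()` returns an empty string (end of input) is an assumption that is not part of the original code. It targets Python 3.

```python
import io

SHELL_STATUS_RUN = 1
SHELL_STATUS_STOP = 0


def shell_loop(stdin, stdout, handle):
    """Prompt, read one line, pass it to `handle`, repeat until told to stop."""
    status = SHELL_STATUS_RUN
    while status == SHELL_STATUS_RUN:
        # Display a command prompt
        stdout.write('> ')
        stdout.flush()

        # Read command input
        cmd = stdin.readline()
        if cmd == '':
            # Assumption: readline() returns '' only at end of input
            break

        # `handle` (hypothetical) decides whether the loop keeps running
        status = handle(cmd)


# Usage: feed two lines from memory and record what the loop saw
seen = []


def record(cmd):
    seen.append(cmd.strip())
    return SHELL_STATUS_RUN


out = io.StringIO()
shell_loop(io.StringIO("ls\npwd\n"), out, record)
```

Passing the streams in as parameters is only a convenience for experimenting; the article's version reads from sys.stdin directly.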

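Step 2: Tokenization

When a user types a command into our shell and presses Enter, the command arrives as one long string containing the command name and its arguments. We therefore have to tokenize it (split the string into multiple tokens).

At first glance this seems simple. We could use cmd.split() to split the input on whitespace. That works for commands like ls -a my_folder, because it splits the command into the list ['ls', '-a', 'my_folder'], which we can then handle easily.

However, there are cases such as echo "Hello World" or echo 'Hello World', where an argument is wrapped in single or double quotes. If we use cmd.split(), we get a list of 3 tokens, ['echo', '"Hello', 'World"'], instead of a list of 2 tokens, ['echo', 'Hello World'].

Fortunately, Python provides a library named shlex that splits commands for us like a charm. (Note: we could also use a regular expression, but that is not the focus of this article.)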

import sys
import shlex

...

def tokenize(string):
    return shlex.split(string)

...
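The difference between the two splitting approaches can be checked directly. This is a small sketch using only the standard library; the variable names are illustrative.

```python
import shlex

cmd = 'echo "Hello World"\n'  # readline() keeps the trailing newline

# Naive whitespace split: the quoted argument is broken in two
naive_tokens = cmd.split()

# shlex follows shell quoting rules and drops the quotes
shell_tokens = shlex.split(cmd)

# Single quotes are handled the same way
single_quoted = shlex.split("echo 'Hello World'")

print(naive_tokens)   # ['echo', '"Hello', 'World"']
print(shell_tokens)   # ['echo', 'Hello World']
print(single_quoted)  # ['echo', 'Hello World']
```

One thing to keep in mind is that shlex.split raises ValueError when a quote is never closed, so a more complete shell would probably want to catch that.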

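We then pass these tokens on to the execution step.

Step 3: Execution

This is the core and most interesting part of a shell. What actually happens when a shell executes mkdir test_dir? (Note: mkdir is an executable program that takes test_dir as an argument and creates a directory named test_dir.)

execvp is the first function involved in this step. Before explaining what execvp does, let's see it in action.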

import os
...

def execute(cmd_tokens):
    # Execute command
    os.execvp(cmd_tokens[0], cmd_tokens)

    # Return status indicating to wait for next command in shell_loop
    return SHELL_STATUS_RUN

...

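Run our shell again, type mkdir test_dir, and press Enter.

The problem is that after we press Enter, our shell exits immediately instead of waiting for the next command. The directory, however, is created correctly.

So what does execvp actually do?

execvp is a variant of the exec system call. The first argument is the program name. The v indicates that the second argument is a list of program arguments (of variable length). The p indicates that the PATH environment variable is used to search for the given program name. In our previous attempt, the mkdir program was located by searching our PATH.

(There are other exec variants, such as execv, execvpe, execl, execlp, and execlpe; you can google them for more information.)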
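To get a feel for what the p part does, the PATH lookup can be approximated with shutil.which from the standard library (Python 3.3+). This is only an illustration: execvp performs its own search internally, and the helper name `resolve_program` is made up for this sketch.

```python
import os
import shutil


def resolve_program(name):
    """Return the path a PATH search finds for `name`, or None if nothing matches.

    Approximation only: os.execvp does its own lookup and does not call this.
    """
    return shutil.which(name)


sh_path = resolve_program("sh")
missing_path = resolve_program("no-such-program-for-yosh-demo")

print(sh_path)
print(missing_path)
```

With execv (no p), the program would have to be given as a path such as the one returned here, because no PATH search takes place.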

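exec replaces the current memory of the calling process with the new program to be run. In our case, the memory of our shell process is replaced by the mkdir program. mkdir then becomes the main process and creates the test_dir directory. Finally, the process exits.

The key point is that our shell process has been replaced by the mkdir process. That is why our shell disappears and does not wait for the next command.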
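The replacement can be observed safely by running exec in a throwaway interpreter. In this sketch, subprocess is used only as a test harness so that the current process is not replaced; it assumes an echo program is available on PATH.

```python
import subprocess
import sys

# This script runs in a separate Python process. Once execvp succeeds,
# that process becomes `echo`, so the print() line is never reached.
script = (
    "import os\n"
    "os.execvp('echo', ['echo', 'replaced'])\n"
    "print('never printed')\n"
)

output = subprocess.check_output(
    [sys.executable, "-c", script],
    universal_newlines=True,
)
print(repr(output))
```

If echo is found, the captured output should contain only the text written by echo, which is the same effect that made our shell vanish after mkdir.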

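We therefore need another system call to solve the problem: fork.

fork allocates new memory and copies the current process into a new process. We call the new process the child process and the caller the parent process. The child process's memory is then replaced by the program to be executed. As a result, our shell, the parent process, is safe from being replaced.

Let's look at the modified code.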

...

def execute(cmd_tokens):
    # Fork a child shell process
    # If the current process is a child process, its `pid` is set to `0`
    # else the current process is a parent process and the value of `pid`
    # is the process id of its child process.
    pid = os.fork()

    if pid == 0:
        # Child process
        # Replace the child shell process with the program called with exec
        os.execvp(cmd_tokens[0], cmd_tokens)
    elif pid > 0:
        # Parent process
        while True:
            # Wait for the status of the child process (identified by pid)
            wpid, status = os.waitpid(pid, 0)

            # Finish waiting if its child process exits normally
            # or is terminated by a signal
            if os.WIFEXITED(status) or os.WIFSIGNALED(status):
                break

    # Return status indicating to wait for next command in shell_loop
    return SHELL_STATUS_RUN

...

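When the parent process calls os.fork(), you can imagine that all of the source code is copied into a new child process. From that moment, the parent and the child see the same code and run in parallel.

If the running code belongs to the child process, pid is 0. Otherwise, the running code belongs to the parent process and pid is the process id of the child.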
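The two return values of fork can be seen in a few lines. This is a minimal sketch for Unix-like systems; the function name `fork_and_report` and the exit code 7 are arbitrary choices for the demonstration.

```python
import os


def fork_and_report():
    """Fork once; return (pid seen by parent, pid reaped, child's exit code)."""
    pid = os.fork()

    if pid == 0:
        # Child process: fork() returned 0 here.
        # Exit immediately with a recognizable code (7 is arbitrary).
        os._exit(7)

    # Parent process: fork() returned the child's process id
    wpid, status = os.waitpid(pid, 0)
    return pid, wpid, os.WEXITSTATUS(status)


pid, wpid, exit_code = fork_and_report()
print(pid > 0, wpid == pid, exit_code)
```

os._exit is used in the child so that it leaves right away instead of continuing to run the rest of the parent's script.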

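When os.execvp is called in the child process, you can imagine that all of the child's source code is replaced by the code of the program being called. The parent process's code, however, is not changed.

Once the parent process finishes waiting for the child to exit or be terminated, it returns a status telling the shell loop to continue.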
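One case the code above does not cover is a command that cannot be executed: os.execvp then raises OSError inside the child, which would carry on running the shell's own code. The following sketch shows one possible way to handle that. The helper name `run_command`, the error message format, and the exit code 127 (the value bash uses for "command not found") are assumptions for illustration, not part of the original article.

```python
import os
import sys


def run_command(cmd_tokens):
    """Fork, exec cmd_tokens in the child, and return the child's exit code."""
    pid = os.fork()

    if pid == 0:
        # Child process: try to become the requested program
        try:
            os.execvp(cmd_tokens[0], cmd_tokens)
        except OSError as err:
            # exec failed (for example, program not found): report it and
            # leave the child without running any more shell code.
            message = "yosh: %s: %s\n" % (cmd_tokens[0], err.strerror)
            os.write(2, message.encode())
            os._exit(127)  # assumed convention, borrowed from bash

    # Parent process: wait until the child exits or is killed by a signal
    while True:
        _, status = os.waitpid(pid, 0)
        if os.WIFEXITED(status):
            return os.WEXITSTATUS(status)
        if os.WIFSIGNALED(status):
            # Negative signal number, mirroring subprocess's convention
            return -os.WTERMSIG(status)


ok_code = run_command([sys.executable, "-c", "import sys; sys.exit(3)"])
missing_code = run_command(["no-such-command-for-yosh-demo"])
print(ok_code, missing_code)
```

Returning the exit code rather than SHELL_STATUS_RUN is a design choice made only for this sketch; inside the article's execute() the function would still return the loop status.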