Seeing the Essence Through the Source Code: Some Thoughts on How Selenium WebDriver Works

Date: 2019-12-08


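As an engineer who has used Selenium for UI automation for many years, I was never quite clear on how Selenium WebDriver actually works. How does a script end up controlling a browser and performing all kinds of operations? I suspect many Selenium users have the same question. I recently read a number of articles and books on the subject, added some thinking and organizing of my own, and want to share the result so that we can learn together. If anything in this article is inaccurate, corrections are welcome.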

Structure

Automated testing with Selenium needs three main things:

Test code
WebDriver
Browser

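Test code

Test code is the code that programmers write in a language of their choice against the corresponding Selenium API library. This article uses Python for its examples.

WebDriver

A WebDriver is developed for a specific browser, so different browsers have different drivers. Chrome, for example, uses chromedriver.

Browser

Each browser is paired with its corresponding WebDriver.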

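Let us first look at how these three parts relate to one another.
The relationship can be compared to a familiar example from everyday life.

When you take a taxi, the passenger talks to the taxi driver and says where they want to go, the driver drives the car there, and so the passenger reaches the destination by taxi.
WebDriver works in a similar way. The test code contains the operations it wants performed on the browser UI, such as a click. It sends commands to the WebDriver to say what it wants done, and the WebDriver carries those operations out in the browser. That is how the test code ends up operating on the browser UI.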
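
The taxi analogy can be sketched as a toy model. This is only an illustration of the three roles, not Selenium code: the class names Browser and Driver, the function passenger_script, and the "#submit" target are all hypothetical.

```python
class Browser:
    """The car: it only does what the driver makes it do."""

    def __init__(self):
        self.actions = []  # operations carried out on the page so far

    def perform(self, action, target):
        self.actions.append((action, target))
        return "%s performed on %s" % (action, target)


class Driver:
    """The taxi driver: turns a requested operation into browser control."""

    def __init__(self, browser):
        self.browser = browser

    def handle(self, command):
        return self.browser.perform(command["action"], command["target"])


def passenger_script(driver):
    """The passenger (test code): states what it wants, never touches the car."""
    return driver.handle({"action": "click", "target": "#submit"})


browser = Browser()
result = passenger_script(Driver(browser))
print(result)
```

The point of the sketch is that passenger_script never holds a reference to the Browser object; everything goes through the Driver.
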
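With the relationship between the three main components of Selenium test automation sorted out, we can now analyze the most important of these relationships in detail.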

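Interaction between test code and WebDriver

I will use a basic operation, getting a page element, as the example for analyzing how the two interact.
In test code, the first step is to create an object of a webdriver class: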

from selenium import webdriver
driver = webdriver.Chrome()

The driver object created here is an instance of webdriver.Chrome, and webdriver.Chrome is really just the name introduced by this import:

from .chrome.webdriver import WebDriver as Chrome

In other words, it is the WebDriver class from the chrome package. This .chrome.webdriver.WebDriver in turn inherits from selenium.webdriver.remote.webdriver.WebDriver:

from selenium.webdriver.remote.webdriver import WebDriver as RemoteWebDriver
...
class WebDriver(RemoteWebDriver):
    """
    Controls the ChromeDriver and allows you to drive the browser.

    You will need to download the ChromeDriver executable from
    http://chromedriver.storage.googleapis.com/index.html
    """

    def __init__(self, executable_path="chromedriver", port=0,
                 chrome_options=None, service_args=None,
                 desired_capabilities=None, service_log_path=None):
...
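
The inheritance relationship can be sketched without Selenium installed. The two classes below are hypothetical stand-ins for the remote and Chrome WebDriver classes. The idea that the Chrome subclass mainly prepares a server URL and hands it to the parent is an assumption made for illustration, and the port value 9515 is a placeholder.

```python
class RemoteWebDriver:
    """Stand-in for the remote WebDriver base class (not the real one)."""

    def __init__(self, command_executor):
        self.command_executor = command_executor

    def find_element_by_id(self, id_):
        # The real method sends a command; here we only describe the call.
        return "lookup %s via %s" % (id_, self.command_executor)


class ChromeWebDriver(RemoteWebDriver):
    """Stand-in for the Chrome-specific subclass."""

    def __init__(self, port=9515):
        # Assumption: the subclass only decides where the driver server lives
        # and leaves all browser operations to the parent class.
        super().__init__(command_executor="http://127.0.0.1:%d" % port)


chrome = ChromeWebDriver()
print(isinstance(chrome, RemoteWebDriver))
print("find_element_by_id" in vars(ChromeWebDriver))
print(chrome.find_element_by_id("kw"))
```

The second print shows that the subclass does not define find_element_by_id itself; the method that runs is the one inherited from the base class.
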

Taking Python as the example, this is how the selenium library gets a page element by ID:

from selenium import webdriver
driver = webdriver.Chrome()
driver.find_element_by_id("element-id")  # "element-id" is a placeholder for the target element's id attribute

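find_element_by_id is an instance method of the selenium.webdriver.remote.webdriver.WebDriver class. In test code we do not use selenium.webdriver.remote.webdriver.WebDriver directly; we use the browser-specific webdriver class, such as webdriver.Chrome, which inherits the method.
So the methods that test code calls to perform browser operations are in fact instance methods of selenium.webdriver.remote.webdriver.WebDriver.
Next, let us look inside selenium.webdriver.remote.webdriver.WebDriver to see how an instance method such as find_element_by_id() is implemented.
The source code shows that the find_element_by_* methods all go through find_element: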

def find_element(self, by=By.ID, value=None):
        """
        'Private' method used by the find_element_by_* methods.

        :Usage:
            Use the corresponding find_element_by_* instead of this.

        :rtype: WebElement
        """
        if self.w3c:
      ...
        return self.execute(Command.FIND_ELEMENT, {
            'using': by,
            'value': value})['value']
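
The delegation chain from find_element_by_id through find_element to execute can be sketched as follows. MiniDriver is a hypothetical stand-in, its execute only records what it was asked to do, and the strings "findElement" and "id" are used here as placeholders for Command.FIND_ELEMENT and By.ID.

```python
class MiniDriver:
    """Hypothetical stand-in for the remote WebDriver class; not Selenium code."""

    def __init__(self):
        self.sent = []  # every (command, params) pair passed to execute()

    def execute(self, driver_command, params=None):
        # The real method forwards the command to a server; this one records it
        # and returns a canned response shaped like {'value': ...}.
        self.sent.append((driver_command, params))
        return {"value": "element-for-%s" % params["value"]}

    def find_element(self, by="id", value=None):
        # Every locator ends up here with a strategy name and a value.
        return self.execute("findElement", {"using": by, "value": value})["value"]

    def find_element_by_id(self, id_):
        # The convenience method only fixes the locator strategy.
        return self.find_element(by="id", value=id_)


driver = MiniDriver()
element = driver.find_element_by_id("kw")
print(element)
print(driver.sent)
```
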

find_element ends by calling the execute method, which is defined as follows:

def execute(self, driver_command, params=None):
        """
        Sends a command to be executed by a command.CommandExecutor.

        :Args:
         - driver_command: The name of the command to execute as a string.
         - params: A dictionary of named parameters to send with the command.

        :Returns:
          The command's JSON response loaded into a dictionary object.
        """
        if self.session_id is not None:
            if not params:
                params = {'sessionId': self.session_id}
            elif 'sessionId' not in params:
                params['sessionId'] = self.session_id

        params = self._wrap_value(params)
        response = self.command_executor.execute(driver_command, params)
        if response:
            self.error_handler.check_response(response)
            response['value'] = self._unwrap_value(
                response.get('value', None))
            return response
        # If the server doesn't send a response, assume the command was
        # a success
        return {'success': 0, 'value': None, 'sessionId': self.session_id}
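
The sessionId handling at the top of execute is easy to miss, so here it is isolated as a small function. add_session_id is a hypothetical helper name; the branching mirrors the excerpt above.

```python
def add_session_id(params, session_id):
    """Attach the session id to the command parameters, as execute() does."""
    if session_id is not None:
        if not params:
            # No parameters at all: the session id becomes the only one.
            params = {"sessionId": session_id}
        elif "sessionId" not in params:
            # Parameters present but without a session id: add it in place.
            params["sessionId"] = session_id
    return params


print(add_session_id(None, "abc123"))
print(add_session_id({"using": "id", "value": "kw"}, "abc123"))
print(add_session_id({"sessionId": "other"}, "abc123"))
print(add_session_id({"using": "id"}, None))
```

An explicitly supplied sessionId is never overwritten, and before a session exists (session_id is None) the parameters pass through unchanged.
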

As the docstring of execute says, the key line is:

response = self.command_executor.execute(driver_command, params)

An object named command_executor runs its own execute method.
command_executor is an instance of the RemoteConnection class, and it is assigned when the selenium.webdriver.remote.webdriver.WebDriver object is created: self.command_executor = RemoteConnection(command_executor, keep_alive=keep_alive).
Now consider this together with the class docstring of selenium.webdriver.remote.webdriver.WebDriver:

class WebDriver(object):
    """
    Controls a browser by sending commands to a remote server.
    This server is expected to be running the WebDriver wire protocol
    as defined at
    https://github.com/SeleniumHQ/selenium/wiki/JsonWireProtocol

    :Attributes:
     - session_id - String ID of the browser session started and controlled by this WebDriver.
     - capabilities - Dictionaty of effective capabilities of this browser session as returned
         by the remote server. See https://github.com/SeleniumHQ/selenium/wiki/DesiredCapabilities
     - command_executor - remote_connection.RemoteConnection object used to execute commands.
     - error_handler - errorhandler.ErrorHandler object used to handle errors.
    """

    _web_element_cls = WebElement

    def __init__(self, command_executor='http://127.0.0.1:4444/wd/hub',
                 desired_capabilities=None, browser_profile=None, proxy=None,
                 keep_alive=False, file_detector=None):

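The WebDriver class controls a browser by sending commands to a remote server, and that remote server is one that runs the WebDriver wire protocol. RemoteConnection is the class responsible for the connection to the remote WebDriver server.
Note the command_executor parameter of the WebDriver constructor, whose default value is 'http://127.0.0.1:4444/wd/hub'. This value is the URL of the remote server. It is passed on to the RemoteConnection constructor, because connecting to the remote server requires its URL.
Now let us look at the execute instance method of RemoteConnection.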

def execute(self, command, params):
        """
        Send a command to the remote server.

        Any path subtitutions required for the URL mapped to the command should be
        included in the command parameters.

        :Args:
         - command - A string specifying the command to execute.
         - params - A dictionary of named parameters to send with the command as
           its JSON payload.
        """
        command_info = self._commands[command]
        assert command_info is not None, 'Unrecognised command %s' % command
        data = utils.dump_json(params)
        path = string.Template(command_info[1]).substitute(params)
        url = '%s%s' % (self._url, path)
        return self._request(command_info[0], url, body=data)
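
The way this method turns a command name plus parameters into an HTTP method, a URL, and a JSON body can be reproduced with the standard library alone. In this sketch build_request and the one-entry COMMANDS table are hypothetical, json.dumps stands in for utils.dump_json, and the session id "abc123" is made up.

```python
import json
import string

# One entry copied from the shape of self._commands: (HTTP method, path template).
COMMANDS = {"findElement": ("POST", "/session/$sessionId/element")}


def build_request(base_url, command, params):
    """Return (method, url, body) for a command without sending anything."""
    method, path_template = COMMANDS[command]
    # $sessionId in the template is filled in from the params dict;
    # keys that the template does not mention are simply ignored.
    path = string.Template(path_template).substitute(params)
    url = "%s%s" % (base_url, path)
    body = json.dumps(params)
    return method, url, body


method, url, body = build_request(
    "http://127.0.0.1:4444/wd/hub",
    "findElement",
    {"sessionId": "abc123", "using": "id", "value": "kw"},
)
print(method, url)
print(body)
```

Note that the same params dict serves two purposes: it supplies the values substituted into the URL path, and it is serialized whole as the request body.
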

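RemoteConnection.execute takes two parameters:

command
params

command is the name of the command to execute. Looking at the self._commands dict, we can see that it maps the command constants defined in the selenium.webdriver.remote.command.Command class to the commands defined in the WebDriver wire protocol: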

self._commands = {
            Command.STATUS: ('GET', '/status'),
            Command.NEW_SESSION: ('POST', '/session'),
            Command.GET_ALL_SESSIONS: ('GET', '/sessions'),
            Command.QUIT: ('DELETE', '/session/$sessionId'),
...
            Command.FIND_ELEMENT: ('POST', '/session/$sessionId/element'),

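Taking FIND_ELEMENT as the example, each entry is made up of several parts:

HTTP request method. The commands defined in the WebDriver wire protocol follow REST conventions, so different request methods correspond to different operations.
sessionId. The protocol defines a session as follows:

The server should maintain one browser per session. Commands sent to a session will be directed to the corresponding browser.

In other words, sessionId identifies one session between the remote server and a browser, and a command sent through that session becomes an operation on that browser.
element. This part identifies the specific command.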
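
The three parts listed above can be pulled out of a table entry programmatically. The sketch below is hypothetical: the command names are plain strings rather than Command constants, describe is an invented helper, and the two url entries do not appear in the excerpt; they are included, as an assumption based on the WebDriver protocol, to show one path serving two operations that differ only in HTTP method.

```python
import string

COMMANDS = {
    "get": ("POST", "/session/$sessionId/url"),           # navigate to a URL
    "getCurrentUrl": ("GET", "/session/$sessionId/url"),  # read the current URL
    "findElement": ("POST", "/session/$sessionId/element"),
    "quit": ("DELETE", "/session/$sessionId"),
}


def describe(command, session_id):
    """Split a command entry into HTTP method, session id, and resource name."""
    method, template = COMMANDS[command]
    path = string.Template(template).substitute(sessionId=session_id)
    segments = path.strip("/").split("/")  # e.g. ['session', 'abc123', 'element']
    resource = segments[2] if len(segments) > 2 else None
    return {"method": method, "path": path, "session": segments[1], "resource": resource}


print(describe("findElement", "abc123"))
print(describe("get", "abc123"))
print(describe("getCurrentUrl", "abc123"))
print(describe("quit", "abc123"))
```
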

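The command constants in selenium.webdriver.remote.command.Command are in turn passed as the argument to execute inside concrete instance methods such as find_elements. This gives a one-to-one correspondence between the instance methods of selenium.webdriver.remote.webdriver.WebDriver that implement the various operations and the commands defined in the WebDriver wire protocol.
The operations on a WebElement in selenium.webdriver.remote.webelement.WebElement are implemented on the same principle.

The other parameter of execute, params, holds the arguments of the command. It is converted to JSON and sent to the remote server as the body of the HTTP request.
After the remote server has performed the operation on the browser, the resulting data is returned to the test code as the body of the HTTP response, and the test code parses it to get the data it wants.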
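
The whole round trip can be tried locally with a fake server in place of chromedriver. Everything here is a sketch under assumptions: FakeDriverHandler, send_command, and demo are hypothetical names, the response shape is only loosely modeled on the wire protocol, and no browser is involved. The fake server answers every POST with a canned element.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class FakeDriverHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a WebDriver server such as chromedriver."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        params = json.loads(self.rfile.read(length) or b"{}")
        # A real server would ask the browser to locate the element here.
        payload = {
            "sessionId": params.get("sessionId"),
            "status": 0,
            "value": {"ELEMENT": "fake-element-1", "path": self.path},
        }
        body = json.dumps(payload).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def send_command(base_url, method, path, params):
    """Send params as a JSON body and parse the JSON response body."""
    request = urllib.request.Request(
        base_url + path,
        data=json.dumps(params).encode("utf-8"),
        method=method,
        headers={"Content-Type": "application/json"},
    )
    # An empty ProxyHandler makes sure the local request bypasses any proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return json.loads(response.read().decode("utf-8"))


def demo():
    server = HTTPServer(("127.0.0.1", 0), FakeDriverHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        base_url = "http://127.0.0.1:%d" % server.server_address[1]
        params = {"sessionId": "abc123", "using": "id", "value": "kw"}
        return send_command(base_url, "POST", "/session/abc123/element", params)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

The value field of the parsed response is what a method like find_element would hand back to the test code after unwrapping.
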