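Selenium is an automated testing tool, and it works just as well for writing crawlers. It drives mainstream GUI browsers such as Chrome, Safari, and Firefox, and it has bindings for several languages, including Java, C#, Ruby, and Python. PhantomJS is a headless, scriptable WebKit browser engine. Some sites render their pages entirely with JavaScript. Working out the backend requests one by one is no trouble for a web expert, but for a beginner like me who barely understands JS, it is a nightmare. This is where PhantomJS helps: it renders JS-driven pages for us just as a browser would.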
The code is simple (it targets Python 2), and the key parts are commented:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time     : 2018/1/5 16:55
# @Author   : Eivll0m
# @Site     : https://github.com/Eivll0m
# @File     : YD_dict.py
# @Software : PyCharm

from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
import sys
reload(sys)
sys.setdefaultencoding('utf8')

class YoudaoDict:
    def __init__(self):
        self.url = 'http://fanyi.youdao.com'
        self.agent = 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36'
        self.dcap = dict(DesiredCapabilities.PHANTOMJS)
        self.dcap["phantomjs.page.settings.userAgent"] = self.agent
        self.service_args = []
        self.service_args.append('--load-images=no')          # disable image loading
        self.service_args.append('--disk-cache=yes')          # enable the disk cache
        self.service_args.append('--ignore-ssl-errors=true')  # ignore HTTPS errors
        # Pass dcap as well, otherwise the custom User-Agent is never applied.
        self.browser = webdriver.PhantomJS('D:\\Program Files\\phantomjs-2.1.1-windows\\bin\\phantomjs.exe',
                                           desired_capabilities=self.dcap,
                                           service_args=self.service_args)

    def transTarget(self):
        browser = self.browser
        browser.get(self.url)
        browser.implicitly_wait(3)
        text = browser.find_element_by_id('inputOriginal')
        text.clear()
        # Keep prompting until the user enters something; 'quit' exits.
        while 1:
            key = str(raw_input('Enter the text to translate: '))
            if key == 'quit':
                browser.quit()
                sys.exit()
            if key:
                break
        text.send_keys(key.decode('utf-8'))
        # Retry until the translated text has been rendered into the page.
        while 1:
            try:
                bro = browser.find_element_by_css_selector('#transTarget > p > span')
                break
            except Exception:
                print 'Element not located yet!'
        return bro.text

if __name__ == '__main__':
    D = YoudaoDict()
    while 1:
        print D.transTarget()
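The last loop in `transTarget` retries the element lookup forever, so if the page layout changes the script never returns. A minimal sketch of a bounded polling helper is shown below. The names `wait_for` and `WaitTimeout` and the default `timeout` and `interval` values are assumptions made for illustration. They are not part of the original script or of Selenium. The helper is plain Python with no Selenium dependency.

```python
import time


class WaitTimeout(Exception):
    """Raised when the polled lookup never succeeds within the time limit."""


def wait_for(lookup, timeout=10.0, interval=0.2, ignored=(Exception,)):
    """Call lookup() repeatedly until it returns without raising.

    lookup   -- zero-argument callable (hypothetical example: a lambda
                wrapping browser.find_element_by_css_selector(...))
    timeout  -- maximum number of seconds to keep trying (assumed default)
    interval -- pause between attempts, in seconds (assumed default)
    ignored  -- exception types that mean "not ready yet"
    """
    deadline = time.time() + timeout
    while True:
        try:
            return lookup()
        except ignored:
            # Give up once the deadline has passed, otherwise pause and retry.
            if time.time() >= deadline:
                raise WaitTimeout('gave up after %.1f seconds' % timeout)
            time.sleep(interval)
```

In the script above, the loop could then shrink to something like `bro = wait_for(lambda: browser.find_element_by_css_selector('#transTarget > p > span'))`. Selenium also ships its own `WebDriverWait` class in `selenium.webdriver.support.ui`, which is probably the better choice in real code; the helper here only illustrates the idea.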
Output when run: