Selenium web scraping

Date: 2019-12-19


Prerequisite: CSS

"""
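Selenium is a tool for testing web applications. It provides functions for locating specific tags on a page and acting on them.
These locator API functions are called from Python here. Underneath, WebDriver drives a real browser through a browser-specific driver
(the older Selenium RC did this by injecting JavaScript into the page), so it fully simulates user actions.

Why use Selenium for scraping:
Some sites load their data dynamically, so a plain HTTP request does not return it. With Selenium you can load and operate the page, wait for the data to finish loading, and then parse it.

You can also simulate a user login and then access the data directly. There is no need to extract cookies by hand; the browser automatically sends the data each request needs.
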
"""
# Import webdriver from selenium
from selenium import webdriver
# Create a Firefox browser object; this opens a browser window automatically
firefox = webdriver.Firefox()
# chrome = webdriver.Chrome()
# Open the target URL
firefox.get("http://www.baidu.com")
# chrome.get("http://www.baidu.com")

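# # Selenium 3 locator helpers (removed in Selenium 4.3; there, use find_element(By.<STRATEGY>, value))
# # Find by class attribute value
# firefox.find_element_by_class_name()
# # Find by id attribute value
# firefox.find_element_by_id()
# # Find by the text of a hyperlink
# firefox.find_element_by_link_text()
# # Find by CSS selector
# firefox.find_element_by_css_selector()
# # Find by name attribute value
# firefox.find_element_by_name()
# # Find by tag name
# firefox.find_element_by_tag_name()
# # Find by XPath
# firefox.find_element_by_xpath()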

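# find_element(By.ID, ...) works in Selenium 3 and 4; find_element_by_id was removed in Selenium 4.3
from selenium.webdriver.common.by import By
ele = firefox.find_element(By.ID, "kw")
# ele = firefox.find_element(By.CLASS_NAME, "s_ipt")
# ele = chrome.find_element(By.CLASS_NAME, "s_ipt")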

# Type text into the input box
# ele.send_keys("selenium")
# # Find the "百度一下" (search) button
# btn = firefox.find_element_by_id("su")
# # Click it
# btn.click()

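# get_attribute returns the value of an attribute on the tag
res = ele.get_attribute("class")
print(res)
# Get the tag's text content
res = ele.text
print(res)
# Get the tag name
res = ele.tag_name
print(res)
# Check whether the element is selected
res = ele.is_selected()
print(res)
# Check whether the element is enabled
res = ele.is_enabled()
print(res)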
# Type some text into the text box
ele.send_keys("selenium")
# Click
ele.click()
ele.submit()  # Submit the form
import time
time.sleep(1)
ele.clear()  # Clear the input box (clear() returns None, so there is nothing to assign)

# Take a screenshot of the element
ele.screenshot('test.png')
# Quit the browser
firefox.quit()
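
The introduction mentions waiting for dynamically loaded data before parsing it, while the script itself only uses a fixed time.sleep(1). The sketch below illustrates the polling idea behind an explicit wait. The names wait_until and wait_for_element are hypothetical helpers, not part of the original script; wait_for_element wraps Selenium's own WebDriverWait and assumes the element is looked up by id.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


def wait_for_element(driver, element_id, timeout=10):
    """Block until an element with the given id is present in the DOM."""
    # Imported lazily so wait_until stays usable without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    locator = (By.ID, element_id)
    return WebDriverWait(driver, timeout).until(
        EC.presence_of_element_located(locator)
    )
```

With the script above, wait_for_element(firefox, "kw") could stand in for a fixed sleep before reading the search box; whether 10 seconds is a sensible timeout depends on the site.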

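
The commented list of find_element_by_* helpers in the script maps one-to-one onto the locator strategies accepted by find_element. As a rough sketch, the table below pairs each helper with its strategy string (the values of the constants on selenium.webdriver.common.by.By). LEGACY_TO_STRATEGY and find_by_legacy_name are hypothetical names introduced only for illustration.

```python
# Strategy strings match the constants on selenium.webdriver.common.by.By,
# e.g. By.ID == "id" and By.CSS_SELECTOR == "css selector".
LEGACY_TO_STRATEGY = {
    "find_element_by_class_name": "class name",
    "find_element_by_id": "id",
    "find_element_by_link_text": "link text",
    "find_element_by_css_selector": "css selector",
    "find_element_by_name": "name",
    "find_element_by_tag_name": "tag name",
    "find_element_by_xpath": "xpath",
}


def find_by_legacy_name(driver, legacy_name, value):
    """Look up an element using the strategy behind a legacy helper name.

    `driver` is any object with a find_element(by, value) method,
    such as a Selenium WebDriver instance.
    """
    strategy = LEGACY_TO_STRATEGY[legacy_name]
    return driver.find_element(strategy, value)
```

For example, find_by_legacy_name(firefox, "find_element_by_id", "kw") performs the same lookup as the search-box line in the script, without relying on the removed helper method.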