Popular Python Crawler Projects: Slider CAPTCHA (Using Bilibili as an Example)

Date: 2019-11-10


1. Website used for the simulated login

Bilibili video site: https://passport.bilibili.com/login

2. Development environment

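This project uses the following libraries:

io
time
random
selenium
PIL (installed as Pillow)

Install the two third-party libraries below. The others are part of the standard library and need no installation.

pip install pillow
pip install selenium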
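Before running the project, it can help to confirm that the third-party packages are importable. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is made up for illustration and is not part of the original project. Note that Pillow is imported under the name `PIL`.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means both third-party packages are installed.
print(missing_packages(["PIL", "selenium"]))
```

If the printed list is not empty, install the named packages with pip as shown above.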

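3. Project workflow

Initialize
Request the Bilibili login page and simulate entering the account and password
Capture the CAPTCHA image with the shadowed gap and the complete CAPTCHA image
Compare the two CAPTCHA images to get the offset of the slider gap
Use the offset to compute the movement track
Operate the slider button, simulating a drag to pass verification and log in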

5. Bilibili simulated login: initialization and entering the account and password

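import random
import time
from io import BytesIO

from PIL import Image
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By


class Bilibili(object):
    def __init__(self):
        # Create the browser object
        self.driver = webdriver.Chrome()
        # Implicit wait, in seconds
        self.driver.implicitly_wait(3)
        self.url = 'https://passport.bilibili.com/login'
        # Username
        self.user = ''
        # Password
        self.pwd = ''

    def close(self):
        '''Close the browser.'''
        self.driver.quit()

    def input_user_pwd(self):
        '''Enter the username and password.'''
        # Open the login page
        self.driver.get(self.url)
        # Type the username into its text box
        tb_user = self.driver.find_element(By.ID, 'login-username')
        tb_user.send_keys(self.user)
        # Type the password into its text box
        tb_pwd = self.driver.find_element(By.ID, 'login-passwd')
        tb_pwd.send_keys(self.pwd)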

6. Capturing the CAPTCHA image with the shadowed gap and the complete CAPTCHA image

    # Needs: from selenium.webdriver.common.by import By

    def get_screenshot(self):
        '''Take a screenshot of the page and return it as a PIL image.'''
        screenshot = self.driver.get_screenshot_as_png()
        return Image.open(BytesIO(screenshot))

    def update_style(self):
        '''Change the canvas style so that the complete (gap-free) image is shown.'''
        js = 'document.querySelectorAll("canvas")[3].style="display:block"'
        self.driver.execute_script(js)
        time.sleep(2)

    def get_position(self):
        '''Return the four edges used to crop the CAPTCHA out of the screenshot.'''
        # Locate the login button
        bt_login = self.driver.find_element(By.XPATH, '//a[@class="btn btn-login"]')
        # Simulate a click, which brings up the CAPTCHA
        bt_login.click()
        time.sleep(2)
        # Get the CAPTCHA image element
        code_img = self.driver.find_element(
            By.XPATH, '//canvas[@class="geetest_canvas_slice geetest_absolute"]')
        time.sleep(2)
        location = code_img.location
        size = code_img.size
        # Compute the crop region (left, top, right, bottom)
        left, top = location['x'], location['y']
        right, bottom = left + size['width'], top + size['height']
        return left, top, right, bottom

    def get_image(self):
        '''Crop both CAPTCHA images out of page screenshots.'''
        # Get the CAPTCHA position
        position = self.get_position()
        # Crop the CAPTCHA image that has the gap
        captcha1 = self.get_screenshot().crop(position)
        # Change the style so the complete image is displayed
        self.update_style()
        # Crop the complete (gap-free) CAPTCHA image
        captcha2 = self.get_screenshot().crop(position)
        captcha1.save('captcha1.png')
        captcha2.save('captcha2.png')
        return captcha1, captcha2
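The crop region is plain arithmetic on the element's `location` and `size` dictionaries. The following is a minimal sketch of that calculation on its own; the function name `crop_box`, the `scale` parameter, and the sample coordinates are all made up for illustration. The `scale` parameter reflects an assumption that is not covered in the original: on displays whose device pixel ratio is not 1, a screenshot can contain more pixels than the page has CSS pixels, so the coordinates may need to be multiplied before cropping.

```python
def crop_box(location, size, scale=1.0):
    """Return (left, top, right, bottom) in screenshot pixels.

    location and size are dicts shaped like Selenium's element.location
    and element.size. scale is a hypothetical device pixel ratio.
    """
    left = location["x"] * scale
    top = location["y"] * scale
    right = (location["x"] + size["width"]) * scale
    bottom = (location["y"] + size["height"]) * scale
    return tuple(round(v) for v in (left, top, right, bottom))


# Sample values chosen only for the demonstration.
print(crop_box({"x": 830, "y": 260}, {"width": 260, "height": 160}))
# (830, 260, 1090, 420)
print(crop_box({"x": 830, "y": 260}, {"width": 260, "height": 160}, scale=2))
# (1660, 520, 2180, 840)
```

If the saved captcha1.png does not show the CAPTCHA, a mismatch between CSS pixels and screenshot pixels is one possible cause worth checking.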

7. Comparing the two CAPTCHA images to get the offset of the slider gap

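    def is_pixel_equal(self, img1, img2, x, y):
        '''Check whether the RGB values of the same pixel in two images are close enough.'''
        pixel1, pixel2 = img1.getpixel((x, y)), img2.getpixel((x, y))
        # Per-channel tolerance used as the comparison threshold
        sub_index = 60
        # The pixels match only if every RGB channel differs by less than the threshold
        return all(abs(pixel1[c] - pixel2[c]) < sub_index for c in range(3))

    def get_gap_offset(self, img1, img2):
        '''Return the x offset of the gap.'''
        # Start scanning to the right of the slider piece's starting area
        x = int(img1.size[0] / 4.2)
        for i in range(x, img1.size[0]):
            for j in range(img1.size[1]):
                # Compare pixel (i, j) in both images; the first column where
                # the RGB difference is too large is taken as the offset
                if not self.is_pixel_equal(img1, img2, i, j):
                    return i
        # No differing pixel found: fall back to the starting column
        return x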
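The column scan can be tried without a browser or image files. The following is a minimal sketch, assuming each image is a nested list of RGB tuples indexed as `image[y][x]` instead of a PIL image; the names `pixels_match`, `find_gap_offset`, and `make_image` are made up for illustration. Unlike the method above, this version returns `None` when no differing pixel is found.

```python
THRESHOLD = 60  # per-channel tolerance, the same value the article uses


def pixels_match(p1, p2, threshold=THRESHOLD):
    """Return True if every RGB channel differs by less than threshold."""
    return all(abs(a - b) < threshold for a, b in zip(p1[:3], p2[:3]))


def find_gap_offset(with_gap, complete, start_x=0):
    """Scan columns left to right and return the first one with a mismatch."""
    height = len(with_gap)
    width = len(with_gap[0])
    for x in range(start_x, width):
        for y in range(height):
            if not pixels_match(with_gap[y][x], complete[y][x]):
                return x
    return None  # the two images agree everywhere


def make_image(width, height, color):
    """Build a solid-color image as a list of rows."""
    return [[color for _ in range(width)] for _ in range(height)]


complete = make_image(20, 6, (200, 200, 200))
with_gap = make_image(20, 6, (200, 200, 200))
# Darken a 4x3 block starting at column 12 to imitate the shadowed gap.
for y in range(2, 5):
    for x in range(12, 16):
        with_gap[y][x] = (90, 90, 90)

print(find_gap_offset(with_gap, complete, start_x=4))  # 12
```

The threshold matters because screenshots of the same region are rarely identical pixel for pixel; a tolerance of 60 per channel ignores small rendering differences while still catching the much darker gap.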

8. Using the offset to compute the movement (the track)

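    def get_track(self, offset):
        '''Build a list of per-step x movements that imitates a human drag.'''
        track = []
        # Starting x coordinate of the slider
        current = 5
        # Point at which the motion switches from speeding up to slowing down
        border_point = int(offset * 3 / 5)
        # Time interval of each step
        t = 0.2
        # Aim 4 px past the measured offset (the drag step later moves back a few pixels)
        offset += 4
        # Initial velocity
        v = 0
        # Loop until the slider has travelled to the target offset
        while current < offset:
            # Accelerate before the border point, decelerate after it
            if current < border_point:
                a = 1
            else:
                a = -0.5
            v0 = v
            v = v0 + a * t
            # Distance covered in this step: s = v0*t + 0.5*a*t^2
            move = v0 * t + 0.5 * a * t * t
            current += move
            # Each step is rounded on its own, so the rounded steps may not
            # sum exactly to the distance travelled
            track.append(round(move))
        return track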
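The track is built from the constant-acceleration formulas v = v0 + a*t and s = v0*t + 0.5*a*t^2, applied once per time step. The following is a standalone sketch of the same loop that also returns the unrounded distance, so the effect of rounding each step can be inspected. The function name `build_track` and its keyword parameters are made up for illustration; the default values are the constants used in the method above.

```python
def build_track(offset, start=5, overshoot=4, dt=0.2):
    """Return (steps, exact_distance) for a slide toward offset + overshoot."""
    target = offset + overshoot
    border = int(offset * 3 / 5)  # switch from speeding up to slowing down
    current, v = start, 0.0
    steps = []
    while current < target:
        a = 1 if current < border else -0.5
        v0 = v
        v = v0 + a * dt
        move = v0 * dt + 0.5 * a * dt * dt
        current += move
        steps.append(round(move))  # rounded per step, as in the article
    return steps, current - start


steps, exact = build_track(100)
# Number of steps, total of the rounded steps, and the unrounded distance.
print(len(steps), sum(steps), round(exact, 2))
```

The first steps round to 0 because the slider starts from rest and covers only 0.02 px in the first interval. Since every step is rounded independently, the sum of the rounded steps can differ from the unrounded distance by up to half a pixel per step.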

9. Operating the slider button: simulating the drag to pass verification and log in

    # Needs: from selenium.webdriver.common.by import By

    def shake_mouse(self):
        """
        Imitate the small hand jitter just before the mouse is released.
        :return: None
        """
        ActionChains(self.driver).move_by_offset(xoffset=-2, yoffset=0).perform()
        ActionChains(self.driver).move_by_offset(xoffset=2, yoffset=0).perform()

    def operate_slider(self, track):
        '''Drag the slider.'''
        # Small backward moves applied after the forward track
        back_tracks = [-1, -1, -2, -1]
        # Get the slider button
        slider_bt = self.driver.find_element(By.XPATH, '//div[@class="geetest_slider_button"]')
        # Press the slider button and hold it
        ActionChains(self.driver).click_and_hold(slider_bt).perform()
        # Move along the forward track
        for i in track:
            ActionChains(self.driver).move_by_offset(xoffset=i, yoffset=0).perform()
            # Accelerating and then decelerating did not work very well on its own.
            # Pausing for a random 0 to 0.01 s after each move got past Geetest,
            # with a high pass rate.
            time.sleep(random.random() / 100)
        time.sleep(random.random())
        # Move along the backward track
        for i in back_tracks:
            time.sleep(random.random() / 100)
            ActionChains(self.driver).move_by_offset(xoffset=i, yoffset=0).perform()
        # Imitate hand jitter
        self.shake_mouse()
        time.sleep(random.random())
        # Release the slider button
        ActionChains(self.driver).release().perform()

    def do_captcha(self):
        '''Handle the CAPTCHA.'''
        # Image with the gap, and the complete image
        img1, img2 = self.get_image()
        # Compare the two images to get the offset of the slider gap
        offset = self.get_gap_offset(img1, img2)
        print(offset)
        # Use the offset to compute the movement track
        track = self.get_track(offset)
        # Operate the slider button, simulating the drag
        self.operate_slider(track)

    def login(self):
        '''Main login logic.'''
        # Open the login page and enter the account and password
        self.input_user_pwd()
        # Handle the CAPTCHA
        self.do_captcha()
        # Close the browser
        self.close()

    def run(self):
        self.login()


if __name__ == '__main__':
    bili = Bilibili()
    bili.run()