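Original article and author: Advanced Crawler Tutorial: Cracking the GEETEST CAPTCHA | Jack Cui
What is one of a crawler's biggest enemies? That's right: CAPTCHAs! Geetest, a specialist CAPTCHA provider, holds a fairly large share of the market. So how do you get past a slider CAPTCHA served by Geetest?
One approach is to analyze its JavaScript encryption: capture and study a large amount of traffic to work out the parameters it returns, then generate the required parameters directly. This takes more engineering effort, and every time the official JS script is upgraded the analysis has to be redone, which costs time and energy.
The approach introduced today is to simulate a user dragging the slider with Selenium. Its advantage is that it is simple and easy to update. Its drawbacks are just as obvious: it is slow, and it cannot be packaged as an API.
It is better to teach someone to fish than to hand them a fish, so here is the tutorial proper. Before reading on, make sure you have a grasp of web crawler basics. If you do not, study my CSDN column first (linked from the original post) and then come back to this article.
The left side shows the automatic recognition in progress; the right side shows some printed output.
We take the National Enterprise Credit Information Publicity System as the example. It is a website for looking up company information, and every query requires solving a CAPTCHA. The CAPTCHA it uses is GEETEST. (The URL is linked from the original post.)
The website looks like this:
How many steps does it take to put an elephant in a refrigerator?
Now consider this question: how many steps does it take to simulate a user dragging the slider with Selenium? Pause here and think for five minutes before reading on!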
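Here is a rough answer first: open the page with Selenium and trigger the CAPTCHA, obtain the CAPTCHA images, locate the gap, compute a movement track, and drag the slider to the gap.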
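In fact, implementing each step separately, one piece at a time, is not hard. On to the main content.
This part is simple. I will not go over Selenium basics again; if anything is unclear, see the Selenium posts in my column.
The code is as follows: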
# -*- coding: utf-8 -*-
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium import webdriver


class Crack():
    def __init__(self, keyword):
        self.url = 'http://bj.gsxt.gov.cn/sydq/loginSydqAction!sydq.dhtml'
        # Selenium 3 style: path to the local chromedriver binary
        self.browser = webdriver.Chrome('D:\\chromedriver.exe')
        self.wait = WebDriverWait(self.browser, 100)
        self.keyword = keyword

    def open(self):
        """
        Open the browser and enter the search keyword
        """
        self.browser.get(self.url)
        keyword = self.wait.until(EC.presence_of_element_located((By.ID, 'keyword_qycx')))
        button = self.wait.until(EC.presence_of_element_located((By.CLASS_NAME, 'btn')))
        keyword.send_keys(self.keyword)
        button.click()

    def crack(self):
        # Open the browser
        self.open()


if __name__ == '__main__':
    print('Starting verification')
    crack = Crack(u'中國移動')
    crack.crack()
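The result looks like this:
Inspecting the element to find the image address gives the following:
As you can see, the picture is assembled from many small slices, so simply saving the image behind each address is not enough. The slices are composed with background-position, and the image file itself is scrambled. What to do? Simple: grab the image URL from the slices, download the image, and reassemble the gap-free picture from it according to the slice positions. The picture with the gap is obtained the same way. In both cases we do the assembly ourselves.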
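To make the background-position idea concrete, here is a minimal sketch that pulls the image URL and the two offsets out of one slice's inline style and turns them into a crop box. The sample style string, the example.com URL, and the helper names `parse_slice_style` and `source_box` are assumptions made for illustration; the 10 x 58 slice size is the one the tutorial's code uses.

```python
import re

# Hypothetical inline style, shaped like the slice <div> styles described above.
SAMPLE_STYLE = (
    'background-image: url("https://example.com/pictures/bg.webp"); '
    'background-position: -157px -58px;'
)

URL_RE = re.compile(r'url\("?(.*?)"?\)')
POS_RE = re.compile(r'background-position:\s*(-?\d+)px\s+(-?\d+)px')


def parse_slice_style(style):
    """Return (image_url, x_offset, y_offset) parsed from one slice's style."""
    url_match = URL_RE.search(style)
    pos_match = POS_RE.search(style)
    if url_match is None or pos_match is None:
        raise ValueError('style lacks a url(...) or a background-position')
    x, y = (int(value) for value in pos_match.groups())
    return url_match.group(1), x, y


def source_box(x, y, width=10, height=58):
    """Convert CSS offsets into a (left, upper, right, lower) crop box.

    The CSS offsets are zero or negative, so their absolute values give the
    top-left corner of the slice inside the scrambled source image.
    """
    left, upper = abs(x), abs(y)
    return (left, upper, left + width, upper + height)


url, x, y = parse_slice_style(SAMPLE_STYLE)
print(url, x, y, source_box(x, y))
```

Each slice yields one crop box; pasting the cropped slices side by side, in the order the slices appear in the page, rebuilds the picture.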
The full code is as follows:
# -*- coding: utf-8 -*-
import random
import re
import time
from io import BytesIO
from urllib.request import urlretrieve

import PIL.Image as image
from PIL import Image
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait


class Crack():
    def __init__(self, keyword):
        self.url = 'http://bj.gsxt.gov.cn/sydq/loginSydqAction!sydq.dhtml'
        # Selenium 3 style: path to the local chromedriver binary
        self.browser = webdriver.Chrome('D:\\chromedriver.exe')
        self.wait = WebDriverWait(self.browser, 100)
        self.keyword = keyword
        self.BORDER = 6

    def __del__(self):
        time.sleep(2)
        self.browser.close()

    def get_screenshot(self):
        """
        Take a screenshot of the page
        :return: screenshot as a PIL image
        """
        screenshot = self.browser.get_screenshot_as_png()
        screenshot = Image.open(BytesIO(screenshot))
        return screenshot

    def get_position(self):
        """
        Get the position of the CAPTCHA
        :return: tuple (top, bottom, left, right)
        """
        img = self.browser.find_element(By.CLASS_NAME, 'gt_box')
        time.sleep(2)
        location = img.location
        size = img.size
        top, bottom = location['y'], location['y'] + size['height']
        left, right = location['x'], location['x'] + size['width']
        return (top, bottom, left, right)

    def get_image(self, name='captcha.png'):
        """
        Crop the CAPTCHA out of a screenshot
        :return: image object
        """
        top, bottom, left, right = self.get_position()
        print('CAPTCHA position', top, bottom, left, right)
        screenshot = self.get_screenshot()
        captcha = screenshot.crop((left, top, right, bottom))
        captcha.save(name)
        return captcha

    def get_images(self, bg_filename='bg.jpg', fullbg_filename='fullbg.jpg'):
        """
        Download the two scrambled CAPTCHA images
        :return: slice positions for the gap image and the full image
        """
        bg = []
        fullgb = []
        # Re-parse the page until both sets of slices are present
        while not bg or not fullgb:
            bf = BeautifulSoup(self.browser.page_source, 'lxml')
            bg = bf.find_all('div', class_='gt_cut_bg_slice')
            fullgb = bf.find_all('div', class_='gt_cut_fullbg_slice')
        bg_url = re.findall(r'url\("(.*)"\);', bg[0].get('style'))[0].replace('webp', 'jpg')
        fullgb_url = re.findall(r'url\("(.*)"\);', fullgb[0].get('style'))[0].replace('webp', 'jpg')
        bg_location_list = []
        fullbg_location_list = []
        for each_bg in bg:
            x, y = re.findall(r'background-position: (.*)px (.*)px;', each_bg.get('style'))[0]
            bg_location_list.append({'x': int(x), 'y': int(y)})
        for each_fullgb in fullgb:
            x, y = re.findall(r'background-position: (.*)px (.*)px;', each_fullgb.get('style'))[0]
            fullbg_location_list.append({'x': int(x), 'y': int(y)})
        urlretrieve(url=bg_url, filename=bg_filename)
        print('Gap image downloaded')
        urlretrieve(url=fullgb_url, filename=fullbg_filename)
        print('Background image downloaded')
        return bg_location_list, fullbg_location_list

    def get_merge_image(self, filename, location_list):
        """
        Reassemble a scrambled image from its slice positions
        :param filename: image file
        :param location_list: slice positions
        """
        im = image.open(filename)
        im_list_upper = []
        im_list_down = []
        for location in location_list:
            # y == -58 marks a slice of the upper row, y == 0 the lower row
            if location['y'] == -58:
                im_list_upper.append(im.crop((abs(location['x']), 58, abs(location['x']) + 10, 116)))
            if location['y'] == 0:
                im_list_down.append(im.crop((abs(location['x']), 0, abs(location['x']) + 10, 58)))
        new_im = image.new('RGB', (260, 116))
        x_offset = 0
        for piece in im_list_upper:
            new_im.paste(piece, (x_offset, 0))
            x_offset += piece.size[0]
        x_offset = 0
        for piece in im_list_down:
            new_im.paste(piece, (x_offset, 58))
            x_offset += piece.size[0]
        new_im.save(filename)
        return new_im

    def open(self):
        """
        Open the browser and enter the search keyword
        """
        self.browser.get(self.url)
        keyword = self.wait.until(EC.presence_of_element_located((By.ID, 'keyword_qycx')))
        button = self.wait.until(EC.presence_of_element_located((By.CLASS_NAME, 'btn')))
        keyword.send_keys(self.keyword)
        button.click()

    def get_slider(self):
        """
        Get the slider knob
        :return: slider element
        """
        while True:
            try:
                slider = self.browser.find_element(By.XPATH, "//div[@class='gt_slider_knob gt_show']")
                break
            except NoSuchElementException:
                time.sleep(0.5)
        return slider

    def get_gap(self, img1, img2):
        """
        Get the horizontal offset of the gap
        :param img1: image without the gap
        :param img2: image with the gap
        :return: x coordinate of the first differing column
        """
        # Scanning starts at x = 43, the value used by the original code
        left = 43
        for i in range(left, img1.size[0]):
            for j in range(img1.size[1]):
                if not self.is_pixel_equal(img1, img2, i, j):
                    return i
        return left

    def is_pixel_equal(self, img1, img2, x, y):
        """
        Check whether two pixels are (nearly) the same
        :param img1: image 1
        :param img2: image 2
        :param x: x position
        :param y: y position
        :return: True if every channel differs by less than the threshold
        """
        pix1 = img1.load()[x, y]
        pix2 = img2.load()[x, y]
        threshold = 60
        # abs() must wrap the difference only, not the comparison
        return (abs(pix1[0] - pix2[0]) < threshold
                and abs(pix1[1] - pix2[1]) < threshold
                and abs(pix1[2] - pix2[2]) < threshold)

    def get_track(self, distance):
        """
        Build a movement track from the offset
        :param distance: offset
        :return: list of per-step moves
        """
        track = []
        # Current displacement
        current = 0
        # Point at which deceleration starts
        mid = distance * 4 / 5
        # Time step
        t = 0.2
        # Initial velocity
        v = 0
        while current < distance:
            if current < mid:
                # Accelerate: a = 2
                a = 2
            else:
                # Decelerate: a = -3
                a = -3
            v0 = v
            # Current velocity: v = v0 + a * t
            v = v0 + a * t
            # Distance moved: x = v0 * t + 1/2 * a * t^2
            move = v0 * t + 1 / 2 * a * t * t
            current += move
            track.append(round(move))
        return track

    def move_to_gap(self, slider, track):
        """
        Drag the slider to the gap
        :param slider: slider element
        :param track: movement track
        """
        ActionChains(self.browser).click_and_hold(slider).perform()
        while track:
            x = random.choice(track)
            ActionChains(self.browser).move_by_offset(xoffset=x, yoffset=0).perform()
            track.remove(x)
        time.sleep(0.5)
        ActionChains(self.browser).release().perform()

    def crack(self):
        # Open the browser
        self.open()
        # File names for the saved images
        bg_filename = 'bg.jpg'
        fullbg_filename = 'fullbg.jpg'
        # Download the images
        bg_location_list, fullbg_location_list = self.get_images(bg_filename, fullbg_filename)
        # Reassemble the images from the slice positions
        bg_img = self.get_merge_image(bg_filename, bg_location_list)
        fullbg_img = self.get_merge_image(fullbg_filename, fullbg_location_list)
        # Get the slider knob
        slider = self.get_slider()
        # Locate the gap
        gap = self.get_gap(fullbg_img, bg_img)
        print('Gap position', gap)
        track = self.get_track(gap - self.BORDER)
        print('Dragging the slider')
        print(track)
        self.move_to_gap(slider, track)


if __name__ == '__main__':
    print('Starting verification')
    crack = Crack(u'中國移動')
    crack.crack()
    print('Verification finished')
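The result looks like this:
As you can see, after running the code we have generated two images: one with the gap and one without.
With the gap image and the gap-free image in hand, we compare their pixel values column by column; the first column where they differ gives the position of the gap.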
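The comparison can be sketched without any image library. The snippet below is an illustration under stated assumptions: the images are plain nested lists of RGB tuples rather than PIL images, the synthetic 8 x 4 picture is made up, and `find_gap_left` is a hypothetical name. The threshold of 60 is the value the tutorial's code uses. Note that `abs()` wraps only the channel difference; wrapping the whole comparison would make every pair of pixels look equal.

```python
THRESHOLD = 60  # per-channel tolerance, the same value the tutorial's code uses


def is_pixel_equal(p1, p2, threshold=THRESHOLD):
    """True when every RGB channel differs by less than the threshold."""
    return all(abs(a - b) < threshold for a, b in zip(p1, p2))


def find_gap_left(full, cut, start_x=0):
    """Return the first column where the two images differ, or None.

    Both images are lists of rows; each row is a list of (r, g, b) tuples.
    """
    height, width = len(full), len(full[0])
    for x in range(start_x, width):
        for y in range(height):
            if not is_pixel_equal(full[y][x], cut[y][x]):
                return x
    return None


# Synthetic 8 x 4 grey picture; the "gap" is darkened in columns 5 and 6.
WIDTH, HEIGHT = 8, 4
full = [[(200, 200, 200)] * WIDTH for _ in range(HEIGHT)]
cut = [row[:] for row in full]
for y in range(1, 3):
    for x in range(5, 7):
        cut[y][x] = (40, 40, 40)

print(find_gap_left(full, cut))  # prints 5
```

Scanning column by column from the left means the returned value is the left edge of the gap, which is the quantity the slider has to travel to.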
The code for this step is as follows:
# -*- coding: utf-8 -*-
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from urllib.request import urlretrieve
from selenium import webdriver
from bs4 import BeautifulSoup
import PIL.Image as image
import re


class Crack():
    def __init__(self, keyword):
        self.url = 'http://bj.gsxt.gov.cn/sydq/loginSydqAction!sydq.dhtml'
        # Selenium 3 style: path to the local chromedriver binary
        self.browser = webdriver.Chrome('D:\\chromedriver.exe')
        self.wait = WebDriverWait(self.browser, 100)
        self.keyword = keyword

    def open(self):
        """
        Open the browser and enter the search keyword
        """
        self.browser.get(self.url)
        keyword = self.wait.until(EC.presence_of_element_located((By.ID, 'keyword_qycx')))
        button = self.wait.until(EC.presence_of_element_located((By.CLASS_NAME, 'btn')))
        keyword.send_keys(self.keyword)
        button.click()

    def get_images(self, bg_filename='bg.jpg', fullbg_filename='fullbg.jpg'):
        """
        Download the two scrambled CAPTCHA images
        :return: slice positions for the gap image and the full image
        """
        bg = []
        fullgb = []
        # Re-parse the page until both sets of slices are present
        while not bg or not fullgb:
            bf = BeautifulSoup(self.browser.page_source, 'lxml')
            bg = bf.find_all('div', class_='gt_cut_bg_slice')
            fullgb = bf.find_all('div', class_='gt_cut_fullbg_slice')
        bg_url = re.findall(r'url\("(.*)"\);', bg[0].get('style'))[0].replace('webp', 'jpg')
        fullgb_url = re.findall(r'url\("(.*)"\);', fullgb[0].get('style'))[0].replace('webp', 'jpg')
        bg_location_list = []
        fullbg_location_list = []
        for each_bg in bg:
            x, y = re.findall(r'background-position: (.*)px (.*)px;', each_bg.get('style'))[0]
            bg_location_list.append({'x': int(x), 'y': int(y)})
        for each_fullgb in fullgb:
            x, y = re.findall(r'background-position: (.*)px (.*)px;', each_fullgb.get('style'))[0]
            fullbg_location_list.append({'x': int(x), 'y': int(y)})
        urlretrieve(url=bg_url, filename=bg_filename)
        print('Gap image downloaded')
        urlretrieve(url=fullgb_url, filename=fullbg_filename)
        print('Background image downloaded')
        return bg_location_list, fullbg_location_list

    def get_merge_image(self, filename, location_list):
        """
        Reassemble a scrambled image from its slice positions
        :param filename: image file
        :param location_list: slice positions
        """
        im = image.open(filename)
        im_list_upper = []
        im_list_down = []
        for location in location_list:
            # y == -58 marks a slice of the upper row, y == 0 the lower row
            if location['y'] == -58:
                im_list_upper.append(im.crop((abs(location['x']), 58, abs(location['x']) + 10, 116)))
            if location['y'] == 0:
                im_list_down.append(im.crop((abs(location['x']), 0, abs(location['x']) + 10, 58)))
        new_im = image.new('RGB', (260, 116))
        x_offset = 0
        for piece in im_list_upper:
            new_im.paste(piece, (x_offset, 0))
            x_offset += piece.size[0]
        x_offset = 0
        for piece in im_list_down:
            new_im.paste(piece, (x_offset, 58))
            x_offset += piece.size[0]
        new_im.save(filename)
        return new_im

    def is_pixel_equal(self, img1, img2, x, y):
        """
        Check whether two pixels are (nearly) the same
        :return: True if every channel differs by less than the threshold
        """
        pix1 = img1.load()[x, y]
        pix2 = img2.load()[x, y]
        threshold = 60
        # abs() must wrap the difference only, not the comparison
        return (abs(pix1[0] - pix2[0]) < threshold
                and abs(pix1[1] - pix2[1]) < threshold
                and abs(pix1[2] - pix2[2]) < threshold)

    def get_gap(self, img1, img2):
        """
        Get the horizontal offset of the gap
        :param img1: image without the gap
        :param img2: image with the gap
        :return: x coordinate of the first differing column
        """
        # Scanning starts at x = 43, the value used by the original code
        left = 43
        for i in range(left, img1.size[0]):
            for j in range(img1.size[1]):
                if not self.is_pixel_equal(img1, img2, i, j):
                    return i
        return left

    def crack(self):
        # Open the browser
        self.open()
        # File names for the saved images
        bg_filename = 'bg.jpg'
        fullbg_filename = 'fullbg.jpg'
        # Download the images
        bg_location_list, fullbg_location_list = self.get_images(bg_filename, fullbg_filename)
        # Reassemble the images from the slice positions
        bg_img = self.get_merge_image(bg_filename, bg_location_list)
        fullbg_img = self.get_merge_image(fullbg_filename, fullbg_location_list)
        # Locate the gap
        gap = self.get_gap(fullbg_img, bg_img)
        print('Gap position', gap)


if __name__ == '__main__':
    print('Starting verification')
    crack = Crack(u'中國移動')
    crack.crack()
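The output is as follows:
We have now computed the gap position. The next step is to slide the slider to that position.
We could teleport, moving straight to the target within 1 s. The result: the puzzle piece gets "eaten", meaning the verification is rejected.
Uniform linear motion, then; surely that is the way! Sure enough, it gets "eaten" as well. Keep trying.
Next, imitate someone with a tremor: shaky, hesitant, treading on thin ice. The geetest server probably thinks my grandmother is at the controls.
Although this method occasionally succeeds, its success rate is extremely low. So what is the best method?
Simulate human movement! Think about it: when a person drags the slider, they move fast at first, then slow down as they approach the gap, because they have to line the piece up with it. How do we implement this? With the physics we learned in junior high school:
The formula for the current velocity is: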
v = v0 + a * t
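Here v is the current velocity, v0 the initial velocity, a the acceleration, and t the time. We start with a positive acceleration and switch to a negative one later in the drag: in the code below the slider accelerates with a = 2 until it has covered four-fifths of the distance, then decelerates with a = -3. The distance covered in each time step is x = v0 * t + 1/2 * a * t^2. We use this motion profile to move the slider to the gap.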
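As a standalone illustration of the accelerate-then-decelerate idea, the sketch below generates a track for a given distance using the two formulas above. The parameter names and the default values (time step 0.2, accelerations 2 and -3, switch point at four-fifths of the distance) mirror the tutorial's code, but the function is written here as a free function for demonstration and the distance of 100 is an arbitrary example.

```python
def get_track(distance, t=0.2, accel=2.0, decel=-3.0, switch_ratio=0.8):
    """Return a list of per-step pixel moves covering roughly `distance`.

    The slider speeds up until `switch_ratio` of the distance is covered,
    then slows down, which loosely imitates a human drag.
    """
    track = []
    current = 0.0  # displacement so far (unrounded)
    v = 0.0        # current velocity
    switch_at = distance * switch_ratio
    while current < distance:
        a = accel if current < switch_at else decel
        v0 = v
        v = v0 + a * t                    # v = v0 + a * t
        move = v0 * t + 0.5 * a * t * t   # x = v0 * t + 1/2 * a * t^2
        current += move
        track.append(round(move))         # the browser moves whole pixels
    return track


track = get_track(100)
print(len(track), 'steps; total movement', sum(track))
print(track)
```

Because every step is rounded to whole pixels and the loop stops only after the target is passed, the sum of the track is close to the requested distance but not guaranteed to equal it. Printing the total, as above, is a cheap way to see how far off a given track is.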
The code for this step is as follows:
# -*- coding: utf-8 -*-
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from urllib.request import urlretrieve
from selenium import webdriver
from bs4 import BeautifulSoup
import PIL.Image as image
import re


class Crack():
    def __init__(self, keyword):
        self.url = 'http://bj.gsxt.gov.cn/sydq/loginSydqAction!sydq.dhtml'
        # Selenium 3 style: path to the local chromedriver binary
        self.browser = webdriver.Chrome('D:\\chromedriver.exe')
        self.wait = WebDriverWait(self.browser, 100)
        self.keyword = keyword
        self.BORDER = 6

    def open(self):
        """
        Open the browser and enter the search keyword
        """
        self.browser.get(self.url)
        keyword = self.wait.until(EC.presence_of_element_located((By.ID, 'keyword_qycx')))
        button = self.wait.until(EC.presence_of_element_located((By.CLASS_NAME, 'btn')))
        keyword.send_keys(self.keyword)
        button.click()

    def get_images(self, bg_filename='bg.jpg', fullbg_filename='fullbg.jpg'):
        """
        Download the two scrambled CAPTCHA images
        :return: slice positions for the gap image and the full image
        """
        bg = []
        fullgb = []
        # Re-parse the page until both sets of slices are present
        while not bg or not fullgb:
            bf = BeautifulSoup(self.browser.page_source, 'lxml')
            bg = bf.find_all('div', class_='gt_cut_bg_slice')
            fullgb = bf.find_all('div', class_='gt_cut_fullbg_slice')
        bg_url = re.findall(r'url\("(.*)"\);', bg[0].get('style'))[0].replace('webp', 'jpg')
        fullgb_url = re.findall(r'url\("(.*)"\);', fullgb[0].get('style'))[0].replace('webp', 'jpg')
        bg_location_list = []
        fullbg_location_list = []
        for each_bg in bg:
            x, y = re.findall(r'background-position: (.*)px (.*)px;', each_bg.get('style'))[0]
            bg_location_list.append({'x': int(x), 'y': int(y)})
        for each_fullgb in fullgb:
            x, y = re.findall(r'background-position: (.*)px (.*)px;', each_fullgb.get('style'))[0]
            fullbg_location_list.append({'x': int(x), 'y': int(y)})
        urlretrieve(url=bg_url, filename=bg_filename)
        print('Gap image downloaded')
        urlretrieve(url=fullgb_url, filename=fullbg_filename)
        print('Background image downloaded')
        return bg_location_list, fullbg_location_list

    def get_merge_image(self, filename, location_list):
        """
        Reassemble a scrambled image from its slice positions
        :param filename: image file
        :param location_list: slice positions
        """
        im = image.open(filename)
        im_list_upper = []
        im_list_down = []
        for location in location_list:
            # y == -58 marks a slice of the upper row, y == 0 the lower row
            if location['y'] == -58:
                im_list_upper.append(im.crop((abs(location['x']), 58, abs(location['x']) + 10, 116)))
            if location['y'] == 0:
                im_list_down.append(im.crop((abs(location['x']), 0, abs(location['x']) + 10, 58)))
        new_im = image.new('RGB', (260, 116))
        x_offset = 0
        for piece in im_list_upper:
            new_im.paste(piece, (x_offset, 0))
            x_offset += piece.size[0]
        x_offset = 0
        for piece in im_list_down:
            new_im.paste(piece, (x_offset, 58))
            x_offset += piece.size[0]
        new_im.save(filename)
        return new_im

    def is_pixel_equal(self, img1, img2, x, y):
        """
        Check whether two pixels are (nearly) the same
        :param img1: image 1
        :param img2: image 2
        :param x: x position
        :param y: y position
        :return: True if every channel differs by less than the threshold
        """
        pix1 = img1.load()[x, y]
        pix2 = img2.load()[x, y]
        threshold = 60
        # abs() must wrap the difference only, not the comparison
        return (abs(pix1[0] - pix2[0]) < threshold
                and abs(pix1[1] - pix2[1]) < threshold
                and abs(pix1[2] - pix2[2]) < threshold)

    def get_gap(self, img1, img2):
        """
        Get the horizontal offset of the gap
        :param img1: image without the gap
        :param img2: image with the gap
        :return: x coordinate of the first differing column
        """
        # Scanning starts at x = 43, the value used by the original code
        left = 43
        for i in range(left, img1.size[0]):
            for j in range(img1.size[1]):
                if not self.is_pixel_equal(img1, img2, i, j):
                    return i
        return left

    def get_track(self, distance):
        """
        Build a movement track from the offset
        :param distance: offset
        :return: list of per-step moves
        """
        track = []
        # Current displacement
        current = 0
        # Point at which deceleration starts
        mid = distance * 4 / 5
        # Time step
        t = 0.2
        # Initial velocity
        v = 0
        while current < distance:
            if current < mid:
                # Accelerate: a = 2
                a = 2
            else:
                # Decelerate: a = -3
                a = -3
            v0 = v
            # Current velocity: v = v0 + a * t
            v = v0 + a * t
            # Distance moved: x = v0 * t + 1/2 * a * t^2
            move = v0 * t + 1 / 2 * a * t * t
            current += move
            track.append(round(move))
        return track

    def crack(self):
        # Open the browser
        self.open()
        # File names for the saved images
        bg_filename = 'bg.jpg'
        fullbg_filename = 'fullbg.jpg'
        # Download the images
        bg_location_list, fullbg_location_list = self.get_images(bg_filename, fullbg_filename)
        # Reassemble the images from the slice positions
        bg_img = self.get_merge_image(bg_filename, bg_location_list)
        fullbg_img = self.get_merge_image(fullbg_filename, fullbg_location_list)
        # Locate the gap
        gap = self.get_gap(fullbg_img, bg_img)
        print('Gap position', gap)
        track = self.get_track(gap - self.BORDER)
        print('Dragging the slider')
        print(track)


if __name__ == '__main__':
    print('Starting verification')
    crack = Crack(u'中國移動')
    crack.crack()
The result looks like this:
Using the returned list of per-step distances, we move the slider to the gap.
The code is as follows:
# -*- coding: utf-8 -*-
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver import ActionChains
from selenium.common.exceptions import NoSuchElementException
from urllib.request import urlretrieve
from selenium import webdriver
from bs4 import BeautifulSoup
import PIL.Image as image
import random
import time
import re


class Crack():
    def __init__(self, keyword):
        self.url = 'http://bj.gsxt.gov.cn/sydq/loginSydqAction!sydq.dhtml'
        # Selenium 3 style: path to the local chromedriver binary
        self.browser = webdriver.Chrome('D:\\chromedriver.exe')
        self.wait = WebDriverWait(self.browser, 100)
        self.keyword = keyword
        self.BORDER = 6

    def open(self):
        """
        Open the browser and enter the search keyword
        """
        self.browser.get(self.url)
        keyword = self.wait.until(EC.presence_of_element_located((By.ID, 'keyword_qycx')))
        button = self.wait.until(EC.presence_of_element_located((By.CLASS_NAME, 'btn')))
        keyword.send_keys(self.keyword)
        button.click()

    def get_images(self, bg_filename='bg.jpg', fullbg_filename='fullbg.jpg'):
        """
        Download the two scrambled CAPTCHA images
        :return: slice positions for the gap image and the full image
        """
        bg = []
        fullgb = []
        # Re-parse the page until both sets of slices are present
        while not bg or not fullgb:
            bf = BeautifulSoup(self.browser.page_source, 'lxml')
            bg = bf.find_all('div', class_='gt_cut_bg_slice')
            fullgb = bf.find_all('div', class_='gt_cut_fullbg_slice')
        bg_url = re.findall(r'url\("(.*)"\);', bg[0].get('style'))[0].replace('webp', 'jpg')
        fullgb_url = re.findall(r'url\("(.*)"\);', fullgb[0].get('style'))[0].replace('webp', 'jpg')
        bg_location_list = []
        fullbg_location_list = []
        for each_bg in bg:
            x, y = re.findall(r'background-position: (.*)px (.*)px;', each_bg.get('style'))[0]
            bg_location_list.append({'x': int(x), 'y': int(y)})
        for each_fullgb in fullgb:
            x, y = re.findall(r'background-position: (.*)px (.*)px;', each_fullgb.get('style'))[0]
            fullbg_location_list.append({'x': int(x), 'y': int(y)})
        urlretrieve(url=bg_url, filename=bg_filename)
        print('Gap image downloaded')
        urlretrieve(url=fullgb_url, filename=fullbg_filename)
        print('Background image downloaded')
        return bg_location_list, fullbg_location_list

    def get_merge_image(self, filename, location_list):
        """
        Reassemble a scrambled image from its slice positions
        :param filename: image file
        :param location_list: slice positions
        """
        im = image.open(filename)
        im_list_upper = []
        im_list_down = []
        for location in location_list:
            # y == -58 marks a slice of the upper row, y == 0 the lower row
            if location['y'] == -58:
                im_list_upper.append(im.crop((abs(location['x']), 58, abs(location['x']) + 10, 116)))
            if location['y'] == 0:
                im_list_down.append(im.crop((abs(location['x']), 0, abs(location['x']) + 10, 58)))
        new_im = image.new('RGB', (260, 116))
        x_offset = 0
        for piece in im_list_upper:
            new_im.paste(piece, (x_offset, 0))
            x_offset += piece.size[0]
        x_offset = 0
        for piece in im_list_down:
            new_im.paste(piece, (x_offset, 58))
            x_offset += piece.size[0]
        new_im.save(filename)
        return new_im

    def is_pixel_equal(self, img1, img2, x, y):
        """
        Check whether two pixels are (nearly) the same
        :param img1: image 1
        :param img2: image 2
        :param x: x position
        :param y: y position
        :return: True if every channel differs by less than the threshold
        """
        pix1 = img1.load()[x, y]
        pix2 = img2.load()[x, y]
        threshold = 60
        # abs() must wrap the difference only, not the comparison
        return (abs(pix1[0] - pix2[0]) < threshold
                and abs(pix1[1] - pix2[1]) < threshold
                and abs(pix1[2] - pix2[2]) < threshold)

    def get_gap(self, img1, img2):
        """
        Get the horizontal offset of the gap
        :param img1: image without the gap
        :param img2: image with the gap
        :return: x coordinate of the first differing column
        """
        # Scanning starts at x = 43, the value used by the original code
        left = 43
        for i in range(left, img1.size[0]):
            for j in range(img1.size[1]):
                if not self.is_pixel_equal(img1, img2, i, j):
                    return i
        return left

    def get_track(self, distance):
        """
        Build a movement track from the offset
        :param distance: offset
        :return: list of per-step moves
        """
        track = []
        # Current displacement
        current = 0
        # Point at which deceleration starts
        mid = distance * 4 / 5
        # Time step
        t = 0.2
        # Initial velocity
        v = 0
        while current < distance:
            if current < mid:
                # Accelerate: a = 2
                a = 2
            else:
                # Decelerate: a = -3
                a = -3
            v0 = v
            # Current velocity: v = v0 + a * t
            v = v0 + a * t
            # Distance moved: x = v0 * t + 1/2 * a * t^2
            move = v0 * t + 1 / 2 * a * t * t
            current += move
            track.append(round(move))
        return track

    def get_slider(self):
        """
        Get the slider knob
        :return: slider element
        """
        while True:
            try:
                slider = self.browser.find_element(By.XPATH, "//div[@class='gt_slider_knob gt_show']")
                break
            except NoSuchElementException:
                time.sleep(0.5)
        return slider

    def move_to_gap(self, slider, track):
        """
        Drag the slider to the gap
        :param slider: slider element
        :param track: movement track
        """
        ActionChains(self.browser).click_and_hold(slider).perform()
        while track:
            x = random.choice(track)
            ActionChains(self.browser).move_by_offset(xoffset=x, yoffset=0).perform()
            track.remove(x)
        time.sleep(0.5)
        ActionChains(self.browser).release().perform()

    def crack(self):
        # Open the browser
        self.open()
        # File names for the saved images
        bg_filename = 'bg.jpg'
        fullbg_filename = 'fullbg.jpg'
        # Download the images
        bg_location_list, fullbg_location_list = self.get_images(bg_filename, fullbg_filename)
        # Reassemble the images from the slice positions
        bg_img = self.get_merge_image(bg_filename, bg_location_list)
        fullbg_img = self.get_merge_image(fullbg_filename, fullbg_location_list)
        # Locate the gap
        gap = self.get_gap(fullbg_img, bg_img)
        print('Gap position', gap)
        track = self.get_track(gap - self.BORDER)
        print('Dragging the slider')
        print(track)
        # Get the slider knob
        slider = self.get_slider()
        # Drag the slider to the gap
        self.move_to_gap(slider, track)


if __name__ == '__main__':
    print('Starting verification')
    crack = Crack(u'中國移動')
    crack.crack()
    print('Verification finished')
Run the code above and the slider CAPTCHA is cracked. Let's take one more look at that nice moment.