Python: TX Slider CAPTCHA Recognition, Approach 1

Date: 2019-11-05

I. Introduction:

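This material is for technical exchange only; please do not use it commercially. The algorithm described here is validated against a locally hosted test server.
The slider CAPTCHA is shown in the figure below: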

II. Setting up the local test environment

An existing CSDN article describes how to set up the local test environment in detail; see the [blog post](https://blog.csdn.net/mouday/article/details/83384633)
[demo repository](https://github.com/mouday/TencentCaptcha)

III. Recognition approach

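**Slider verification poses three problems: computing the slide distance, simulating the slide trajectory, and performing the simulated slide.**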

1. Computing the slide distance

Packet capture shows that this CAPTCHA returns the following two images:

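Given the images we can obtain, one way to compute the slide distance is to use the OpenCV library with image-processing algorithms. However, refreshing the CAPTCHA repeatedly shows that there are only 10 distinct gapped images. If we can find the 10 corresponding gap-free full backgrounds, then for each gapped image we look up its gap-free counterpart among them, subtract the two pixel by pixel, and take the x coordinate of the first pixel whose difference exceeds a threshold. That x coordinate is the distance from the left side of the gap to the edge of the image.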
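The pixel-subtraction idea above can be sketched with NumPy. This is only an illustration on synthetic arrays, not the article's implementation: the function name `first_diff_column` and the default threshold of 60 are assumptions, and real images would first be loaded and converted to grayscale arrays.

```python
import numpy as np

def first_diff_column(full_gray, gap_gray, threshold=60, start_x=0):
    """Return the x of the first column (from start_x) where any pixel
    differs by more than threshold, or None if no column does."""
    # int16 avoids uint8 wrap-around when subtracting
    full = np.asarray(full_gray, dtype=np.int16)
    gap = np.asarray(gap_gray, dtype=np.int16)
    differs = np.abs(full - gap) > threshold        # shape (height, width)
    cols = np.flatnonzero(differs[:, start_x:].any(axis=0))
    return int(cols[0]) + start_x if cols.size else None

# Synthetic stand-ins: a flat background and a copy with a dark "gap"
full = np.full((100, 300), 200, dtype=np.uint8)
gap = full.copy()
gap[30:70, 180:220] = 40                            # gap spans columns 180..219
print(first_diff_column(full, gap))                 # 180
```

Note that NumPy arrays are indexed as `[row, column]`, so the x coordinate is the second axis.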

2. Obtaining the gap-free full backgrounds:

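Other articles already describe two ways to obtain the full backgrounds: reassembling one from slices of many gapped images, and taking a screenshot after the verification completes. The full background can also be fetched directly through an interface, but because this touches the platform's interests, that method is not described here. The 10 backgrounds involved are provided, however: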

3. Given a gapped image, finding its gap-free counterpart among the 10 backgrounds:

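Algorithm 1: subtract the gapped image from each of the 10 backgrounds and count the pixels whose difference exceeds the threshold, which is set to 60. The pixel-count limit is set to roughly the size of the gap, about 6000 pixels: if more than 6000 pixels differ by more than 60, the background is not the matching one. Loop over the 10 backgrounds and return the path of the one that matches.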

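from PIL import Image

def get_full_pic(bg_image):
    """Find the full background that matches a gapped image.

    :param bg_image: gapped image (PIL Image)
    :return: (str) path of the matching background, or None if none matches
    """
    # Convert to grayscale and load the pixel data once
    img1 = bg_image.convert('L')
    pix1 = img1.load()
    distance = 68     # the gap always lies in the later part of the image, so skip the left strip
    threshold = 60    # grayscale difference above which two pixels count as different
    width, height = img1.size
    for k in range(1, 11):
        path = "../background/" + str(k) + ".jpg"
        pix2 = Image.open(path).convert('L').load()
        diff = 0
        for i in range(distance, width):
            # Walk down the column
            for j in range(height):
                if abs(pix1[i, j] - pix2[i, j]) > threshold:
                    diff = diff + 1
            if diff > 6000:
                # Too many differing pixels: treat as no match.
                # Possible later optimisation: use the initial position data returned
                # with the CAPTCHA to exclude part of the image from the comparison.
                break
        else:
            # The loop finished without exceeding the limit
            print("Find the target")
            return path
    return None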
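For comparison, the same counting rule can be written in vectorized form, which avoids the per-pixel Python loop. This is a sketch under assumptions: the helper names are hypothetical, the 680x390 image size is arbitrary, and the flat synthetic arrays stand in for real images (which could be converted with `np.asarray(Image.open(path).convert("L"))`).

```python
import numpy as np

def count_diff_pixels(a, b, threshold=60, start_x=68):
    """Count pixels right of start_x whose grayscale values differ by more than threshold."""
    a = np.asarray(a, dtype=np.int16)[:, start_x:]
    b = np.asarray(b, dtype=np.int16)[:, start_x:]
    return int((np.abs(a - b) > threshold).sum())

def find_background(gap_gray, backgrounds, max_diff=6000):
    """Return the index of the first background within max_diff differing pixels, else None."""
    for k, bg in enumerate(backgrounds):
        if count_diff_pixels(gap_gray, bg) <= max_diff:
            return k
    return None

# Synthetic stand-ins for the real images: three flat "backgrounds"
backgrounds = [np.full((390, 680), v, dtype=np.uint8) for v in (50, 120, 200)]
gapped = backgrounds[1].copy()
gapped[100:170, 400:470] = 0          # a 70x70 "gap" (4900 pixels)
print(find_background(gapped, backgrounds))   # 1
```

Only the gap region differs from the matching background, so its count stays under the 6000-pixel limit, while a wrong background differs almost everywhere.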

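Algorithm 2: Algorithm 1 is computationally expensive; in testing, finding the target took about 1 s. The comparison was therefore reduced to just four points on the image, chosen to be as spread out as possible (and in areas where neighboring pixel values are similar).

The code is below. The pixels at (50, 50), (50, 250), (250, 50), and (250, 250) are used as comparison points. This improved algorithm saves about 1 s compared with Algorithm 1.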

from PIL import Image

# Find the matching background image
def get_full_pic_new(bg_image):
    # Grayscale is enough; there is no need to compare three channels
    pix1 = bg_image.convert("L").load()
    points = [(50, 50), (50, 250), (250, 50), (250, 250)]
    threshold = 60
    for k in range(1, 11):
        path = "../background/" + str(k) + ".jpg"   # paths of the 10 backgrounds
        pix2 = Image.open(path).convert('L').load()
        # A background matches only if all four sample points are within the threshold
        if all(abs(pix1[p] - pix2[p]) < threshold for p in points):
            print("Find the target:", path)
            return path
    print("Not found")
    return None
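The four-point test only works if no two backgrounds look alike at those four points. A quick way to check that assumption is sketched below on synthetic arrays; the helper names `signature` and `ambiguous_pairs` are hypothetical and not part of the original code.

```python
import numpy as np
from itertools import combinations

# (x, y) sample points, as used in the article
SAMPLE_POINTS = [(50, 50), (50, 250), (250, 50), (250, 250)]

def signature(gray, points=SAMPLE_POINTS):
    """Grayscale values at the sample points."""
    arr = np.asarray(gray, dtype=np.int16)
    return [int(arr[y, x]) for x, y in points]    # NumPy indexes as [row, col]

def ambiguous_pairs(backgrounds, threshold=60):
    """Index pairs of backgrounds that the four-point test cannot tell apart."""
    sigs = [signature(b) for b in backgrounds]
    return [(i, j)
            for (i, si), (j, sj) in combinations(enumerate(sigs), 2)
            if all(abs(p - q) <= threshold for p, q in zip(si, sj))]

# Synthetic stand-ins: the second and third images are too similar
images = [np.full((300, 300), v, dtype=np.uint8) for v in (10, 100, 130)]
print(ambiguous_pairs(images))    # [(1, 2)]
```

An empty result means every background has a distinct four-point signature at the chosen threshold. The check does not cover the case where the gap itself lands on a sample point.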

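Once the matching background is found, the distance is computed the same way as for the Geetest CAPTCHA, so it is not described in detail here. The complete distance module follows: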

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time : 2019/3/22 13:25
# @File : get_distance.py
from PIL import Image
def is_pixel_equal(img1, img2, x, y):
    """Check whether two pixels are the same.

    :param img1: image 1 (RGB)
    :param img2: image 2 (RGB)
    :param x: x position
    :param y: y position
    :return: True if the pixels are the same
    """
    # Read the pixel from each image
    pix1 = img1.getpixel((x, y))
    pix2 = img2.getpixel((x, y))
    threshold = 68
    # Same only if every RGB channel differs by less than the threshold
    return all(abs(pix1[c] - pix2[c]) < threshold for c in range(3))
def get_gap(img1, img2):
    """Get the gap offset.

    :param img1: image without the gap
    :param img2: image with the gap
    :return: x of the first differing column (68 if none differs)
    """
    left = 68
    for i in range(left, img1.size[0]):
        for j in range(img1.size[1]):
            if not is_pixel_equal(img1, img2, i, j):
                left = i
                print(i)
                return left
    return left
def get_full_pic_new(bg_image):
    # Grayscale is enough; there is no need to compare three channels
    pix1 = bg_image.convert("L").load()
    points = [(50, 50), (50, 250), (250, 50), (250, 250)]
    threshold = 60
    for k in range(1, 11):
        path = "../background/" + str(k) + ".jpg"
        pix2 = Image.open(path).convert('L').load()
        # A background matches only if all four sample points are within the threshold
        if all(abs(pix1[p] - pix2[p]) < threshold for p in points):
            print("Find the target:", path)
            return path
    print("Not found")
    return None
def get_full_pic(bg_image):
    """Find the full background that matches a gapped image.

    :param bg_image: gapped image (PIL Image)
    :return: (str) path of the matching background, or None if none matches
    """
    # Convert to grayscale and load the pixel data once
    img1 = bg_image.convert('L')
    pix1 = img1.load()
    distance = 68
    threshold = 60
    width, height = img1.size
    for k in range(1, 11):
        path = "../background/" + str(k) + ".jpg"
        pix2 = Image.open(path).convert('L').load()
        diff = 0
        for i in range(distance, width):
            # Walk down the column
            for j in range(height):
                if abs(pix1[i, j] - pix2[i, j]) > threshold:
                    diff = diff + 1
            if diff > 6000:
                # Too many differing pixels: treat as no match.
                # Possible later optimisation: use the initial position data returned
                # with the CAPTCHA to exclude part of the image from the comparison.
                break
        else:
            # The loop finished without exceeding the limit
            print("Find the target")
            return path
    return None
def get_distanct(bg_image):
    # get_gap compares RGB channels, so make sure both images are RGB
    bg_img = Image.open(bg_image).convert("RGB")
    full_dir = get_full_pic_new(bg_img)
    if not full_dir:
        raise ValueError("no matching background found")
    full_img = Image.open(full_dir).convert("RGB")
    return get_gap(full_img, bg_img)
if __name__=="__main__":
    import time
    time_start = time.time()
    print("--"*20+"run"+"--"*20)
    path = "../gap_pic/8.jpg"
    distanct = get_distanct(path)
    time_end = time.time()
    print('totally cost', time_end - time_start)
    print(distanct)

IV. Complete slider verification demo

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time : 2019/4/1 11:12
# @File : tx_test.py

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains

from lxml import etree
# The distance module above is saved as get_distance.py
from get_distance import get_distanct
import time
import requests
import random

class tx_test(object):
    def __init__(self):

        self.driver = webdriver.Chrome()
        self.driver.maximize_window()
        # Set up an explicit wait
        self.wait = WebDriverWait(self.driver, 5)
        self.url = "http://127.0.0.1:8080/"
    def get_track(self, distance):
        """Build a movement track for the given offset.

        :param distance: offset to cover
        :return: movement track (list of per-step movements)
        """
        # Movement track
        track = []
        # Current displacement
        current = 0
        # Deceleration threshold
        mid = distance * 4 / 5
        # Time step
        t = 0.2
        # Initial velocity
        v = 0.1
        r = [1.1, 1.2, 1.3, 1.4, 1.5]
        p = [2, 2.5, 2.8, 3, 3.5, 3.6]
        q = 5.0
        i = 0
        while current < distance:
            if current < mid:
                # Accelerate: a = +2
                a = 2
                q = q * 0.9
            else:
                # Decelerate: a = -3
                q = 1.0
                a = -3
            # Velocity at the start of this step
            v0 = v
            # Velocity at the end of this step: v = v0 + a*t
            v = v0 + a * t
            # Step size, based on x = v0*t + 1/2*a*t^2 with randomised coefficients
            r1 = random.choice(r)
            p1 = random.choice(p)
            move = r1 * v0 * t + 1 / p1 * a * t * t * q
            # Steps 2 and 4 instead cover a random fraction of the remaining distance
            if i == 2:
                currentdis = (distance - current) / random.choice([3.5, 4.0, 4.5, 5.0])
                current += currentdis
                track.append(round(currentdis))
            elif i == 4:
                currentdis = (distance - current) / random.choice([4.0, 5.0, 6.0, 7.0])
                current += currentdis
                track.append(round(currentdis))
            else:
                current += move
                track.append(round(move))
            i = i + 1
        return track
    def get_slider(self, browser):
        """Get the slider.

        :return: slider element, or None if it does not appear within the wait
        """
        try:
            return self.wait.until(EC.presence_of_element_located((By.XPATH,'//*[@id="tcaptcha_drag_thumb"]')))
        except Exception:
            return None

    def move_to_gap(self, browser, slider, track):
        """Drag the slider to the gap.

        :param slider: slider element
        :param track: movement track
        """
        ActionChains(browser).click_and_hold(slider).perform()
        time.sleep(0.5)
        while track:
            x = random.choice(track)
            y = random.choice([-2, -1, 0, 1, 2])
            ActionChains(browser).move_by_offset(xoffset=x, yoffset=y).perform()
            track.remove(x)
            t = random.choice([0.002,0.003,0.004,0.005,0.006])
            time.sleep(t)
        time.sleep(1)
        ActionChains(browser).release(on_element=slider).perform()
    def login(self):
        while True:
            self.driver.get(self.url)
            self.driver.delete_all_cookies()
            currhandle = self.driver.current_window_handle
            while True:
                try:
                    self.driver.switch_to.window(currhandle)
                except Exception as e:
                    print(e)
                try:
                    verify_Bt = self.wait.until(EC.element_to_be_clickable((By.XPATH,'//*[@id="TencentCaptcha"]')))   # wait until the button is clickable
                    verify_Bt.click()
                except Exception as e:
                    self.driver.refresh()
                    continue
                try:
                    # if flag is not 0:
                    iframe = self.wait.until(EC.presence_of_element_located((By.XPATH, '//*[@id="tcaptcha_iframe"]')))
                    time.sleep(5)
                    self.driver.switch_to.frame(iframe)     # switch into the CAPTCHA iframe
                    # Check whether a slider CAPTCHA is present; if so, slide it
                    Sliding_Pic = self.wait.until(EC.presence_of_element_located((By.XPATH,'//*[@id="slideBgWrap"]/img')))
                    for i in range(5):
                        page = self.driver.page_source
                        selector = etree.HTML(page)
                        bg_imgSrc = selector.xpath('//*[@id="slideBgWrap"]/img/@src')[0]
                        res = requests.get(bg_imgSrc)
                        with open("./bg_img.jpg","wb") as fp:
                            fp.write(res.content)
                        # Compute the slide distance
                        dist = get_distanct("./bg_img.jpg")
                        print("Slide distance:",dist)
                        dist = int((dist)/2-34)
                        # Build the slide track
                        print(dist)
                        track = self.get_track(dist)
                        print(track)
                        print(sum(track))
                        err = (dist-sum(track))   # distance correction
                        print(err)
                        # Append the correction, then get the slider
                        track.append(err)
                        slide = self.get_slider(self.driver)
                        # Drag the slider
                        self.move_to_gap(self.driver,slide,track)
                        time.sleep(2)
                        slide = self.get_slider(self.driver)
                        if slide:
                            continue
                        else:
                            print("Slider verification passed")
                            break
                except Exception as e:
                    print("Slide error")
                    time.sleep(5)
                    break
if __name__=="__main__":
    print("test\n")
    login = tx_test()
    login.login()
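The demo converts the offset measured in the image into a drag distance with `int(dist / 2 - 34)` and then appends a correction step so that the rounded track sums to the target. The article does not explain the constants. The sketch below assumes that the image is displayed at half its native size and that the slider starts 34 page pixels from the left (which would correspond to the 68-pixel scan start used in the distance module). Both function names are hypothetical.

```python
def to_drag_distance(gap_x_image, scale=0.5, slider_start_page=34):
    """Convert a gap offset in image pixels into a drag distance in page pixels.

    scale and slider_start_page are assumed interpretations of the
    constants 2 and 34 used in the demo.
    """
    return int(gap_x_image * scale - slider_start_page)

def with_correction(track, target):
    """Append one final step so that the track sums exactly to target."""
    return track + [target - sum(track)]

print(to_drag_distance(400))                 # 166
print(with_correction([10, 20, 30], 66))     # [10, 20, 30, 6]
```

If the page renders the image at a different scale, both constants would need to be measured again.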

Summary and notes

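To run the code, you only need tx_test.py, get_distance.py, and a background folder holding the 10 background images, named 1.jpg through 10.jpg. Then start the local slider test environment, set the IP and port to your actual server address, and run tx_test.py to verify the whole slider recognition module. A screenshot taken after a completed slide is shown below. The trajectory algorithm is based on other Geetest trajectory simulation algorithms, with some adjustments; see the code for details.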
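Before running the demo, it can help to confirm that the background folder is complete. The following is a minimal sketch: the function name `missing_backgrounds` is hypothetical, and the default path `../background` is taken from the paths used in the code above.

```python
import tempfile
from pathlib import Path

def missing_backgrounds(directory="../background", count=10):
    """Return the names of the expected k.jpg files that are not present."""
    d = Path(directory)
    return [f"{k}.jpg" for k in range(1, count + 1)
            if not (d / f"{k}.jpg").is_file()]

# Demonstration in a temporary folder that lacks 10.jpg
with tempfile.TemporaryDirectory() as tmp:
    for k in range(1, 10):
        (Path(tmp) / f"{k}.jpg").touch()
    demo_missing = missing_backgrounds(tmp)
print(demo_missing)    # ['10.jpg']
```

An empty list means all 10 files are in place.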