I had zero deep learning experience, and I still trained a CAPTCHA recognition model!

Reposted from the WeChat official account JAVAandPython君

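I haven't posted for a while because I've been writing a crawler project, and in that project I ran into a rather thorny problem: CAPTCHAs. After exhausting every other approach, I finally decided to train my own model. There was one catch, though: I had essentially zero deep learning background. What to do? After thinking it over, my only option was to lean on Baidu and Google.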

1

Getting CAPTCHA samples (preparing the dataset)

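First, the basic idea. The starting point is the target you want to crack: the CAPTCHA itself. Every website's CAPTCHA is different, so we need samples of the specific type of CAPTCHA you want to crack.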

For example:

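The CAPTCHA above is the one I've been wrestling with these past few days. Let's look at its characteristics. Why look at them? Because we need a large dataset of labeled CAPTCHAs. And what does "labeled" mean? Let me just show you a picture ↓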

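As you can see, the first four letters or digits of each image's file name match exactly the letters and digits shown in the CAPTCHA. This is the data we'll train on shortly (I prepared 20,000 images).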
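The naming convention can be sketched as a small helper. This is only an illustration: the file names below are made up, and the `<label>-<suffix>.png` pattern is an assumption based on how process.py later cuts each name at the last `-`.

```python
import os


def label_from_filename(filename):
    """Return the CAPTCHA text encoded at the start of a file name."""
    stem, _ext = os.path.splitext(os.path.basename(filename))
    cut = stem.rfind("-")
    # No separator found: treat the whole stem as the label.
    return stem if cut == -1 else stem[:cut]


# Hypothetical file names, for illustration only.
names = ["a3kd-0001.png", "7xq2-0002.png"]
labels = [label_from_filename(n) for n in names]
print(labels)
```

Keeping the label in the file name means no separate annotation file is needed: listing the directory is enough to rebuild the labels.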

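But here's the problem: how do I get these images? Download them from the website? Then wouldn't I have to rename every one by hand? Of course not! That is one way to do it, but it would be torture. Instead, I recommend two CAPTCHA generation libraries: kaptcha (a Java library) and captcha (a Python library). This time I used kaptcha, because my CAPTCHA needed specific properties set, such as the font color and various effects, and as far as I could tell captcha doesn't allow that. For details on generating CAPTCHAs, just search for the two keywords: kaptcha, captcha.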

2

Processing the dataset

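The processing steps:

1. Color doesn't matter in a CAPTCHA, so we convert the color images to grayscale (3 channels to 1) to reduce noise.

2. Convert each grayscale image and its text label into numeric data.

3. Group the images into batches so the data can be fed to training batch by batch.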
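The three steps above can be tried on a toy example that doesn't depend on my helper modules. This is a sketch under assumptions: the alphabet (digits plus lowercase letters), the 60x160 image size, and the function names `to_gray`, `one_hot`, and `decode` are all made up for illustration; the real values come from demo1.getimg.

```python
import numpy as np

# Assumed alphabet and length; the real values live in demo1.getimg.
CAPTCHA_LIST = list("0123456789abcdefghijklmnopqrstuvwxyz")
CAPTCHA_LEN = 4


def to_gray(img):
    """Average the colour channels of an (H, W, 3) image into (H, W)."""
    return img.mean(axis=-1) if img.ndim > 2 else img


def one_hot(text):
    """Encode text as CAPTCHA_LEN concatenated one-hot blocks."""
    vec = np.zeros(CAPTCHA_LEN * len(CAPTCHA_LIST))
    for pos, ch in enumerate(text):
        vec[pos * len(CAPTCHA_LIST) + CAPTCHA_LIST.index(ch)] = 1
    return vec


def decode(vec):
    """Invert one_hot by taking the argmax of each block."""
    idx = vec.reshape(CAPTCHA_LEN, len(CAPTCHA_LIST)).argmax(axis=1)
    return "".join(CAPTCHA_LIST[i] for i in idx)


rng = np.random.default_rng(0)
fake_img = rng.integers(0, 256, size=(60, 160, 3))  # stand-in for a real image
x = to_gray(fake_img).flatten() / 255  # one row of a batch of inputs
y = one_hot("a3kd")                    # the matching row of labels
print(x.shape, y.shape, decode(y))
```

Each character gets its own block of `len(CAPTCHA_LIST)` slots, so a 4-character label over 36 symbols becomes a 144-element vector with exactly four ones.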

process.py

import random

import numpy as np

from demo1.getimg import file_name
from demo1.getimg import load_allimg
from demo1.getimg import CAPTCHA_HEIGHT, CAPTCHA_WIDTH, CAPTCHA_LEN, CAPTCHA_LIST


# Convert an image to grayscale by averaging its 3 colour channels into 1
def convert2gray(img):
    if len(img.shape) > 2:
        img = np.mean(img, -1)
    return img


# Convert CAPTCHA text to a vector of concatenated one-hot blocks
def text2vec(text, captcha_len=CAPTCHA_LEN, captcha_list=CAPTCHA_LIST):
    text_len = len(text)
    if text_len > captcha_len:
        raise ValueError("CAPTCHA text is longer than %d characters!" % captcha_len)
    vector = np.zeros(captcha_len * len(captcha_list))
    for i in range(text_len):
        vector[captcha_list.index(text[i]) + i * len(captcha_list)] = 1
    return vector


# Convert a list of character indices back to text
def vec2text(vec, captcha_list=CAPTCHA_LIST, size=CAPTCHA_LEN):
    text_list = [captcha_list[v] for v in vec]
    return ''.join(text_list)


# Return the labels and images whose shape matches `shape`
def wrap_gen_captcha_text_and_image(shape=(CAPTCHA_HEIGHT, CAPTCHA_WIDTH, 3)):
    # file_name is my own helper: it returns the names of all CAPTCHA images in a directory
    t = file_name("E://DeskTop//codeimg")
    # load_allimg is also my own helper: it returns all the CAPTCHA image arrays
    im = load_allimg()
    t_list = []
    im_list = []
    # Filter names and images together so that label i always belongs to image i
    # (this assumes both helpers return their entries in the same order)
    for name, img in zip(t, im):
        if img.shape == shape:
            # Keep only the part of the name before the last '-', i.e. the label
            t_list.append(name[:name.rfind('-')])
            im_list.append(img)
    return t_list, im_list


# Build one training batch
def next_batch(batch_count=60, width=CAPTCHA_WIDTH, height=CAPTCHA_HEIGHT):
    batch_x = np.zeros([batch_count, width * height])
    batch_y = np.zeros([batch_count, CAPTCHA_LEN * len(CAPTCHA_LIST)])
    text, image = wrap_gen_captcha_text_and_image()
    for i in range(batch_count):
        # Pick one image at random
        text_a = random.choice(text)
        image_a = image[text.index(text_a)]
        image_a = convert2gray(image_a)
        # Flatten the image into row i of batch_x and store its label vector in row i of batch_y
        batch_x[i, :] = image_a.flatten() / 255
        batch_y[i, :] = text2vec(text_a)
    # Return the batch
    return batch_x, batch_y


if __name__ == '__main__':
    x, y = next_batch(batch_count=1)
    print(x, '\n\n', y)

Running it produces the following result:

3

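Creating and training the model

Creating the model:

The network has 5 layers: the first 3 are convolutional layers, and the 4th and 5th are fully connected layers. Dropout is applied to all 4 hidden layers. The structure is as follows: input -> conv -> pool -> dropout -> conv -> pool -> dropout -> conv -> pool -> dropout -> fully connected layer -> dropout -> fully connected layer -> output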
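To see what reaches the first fully connected layer, you can trace the image size through the three pooling steps. The sketch below assumes a 60x160 input, which is a made-up size (the real one is CAPTCHA_HEIGHT x CAPTCHA_WIDTH from demo1.getimg), and the helper names are hypothetical. It relies on two facts: a stride-1 'SAME' convolution keeps height and width unchanged, and a 2x2 'SAME' max pool with stride 2 rounds the halved size up.

```python
import math


def same_pool_out(size, pool=2):
    """Output length of a stride-`pool` 'SAME' max pool: ceil(size / pool)."""
    return math.ceil(size / pool)


def flattened_size(height, width, n_pools=3, channels=64):
    """Height, width, and flattened length after n_pools pooling steps."""
    for _ in range(n_pools):
        height, width = same_pool_out(height), same_pool_out(width)
    return height, width, height * width * channels


# Assumed input size; 64 is the number of filters in the last conv layer.
print(flattened_size(60, 160))  # (8, 20, 10240)
```

In train.py this number is not hard-coded: the code reads the height and width from the shape of the last dropout output, so it adapts to whatever image size you use.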

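Training

The loss function is cross-entropy. Sigmoid cross-entropy (sigmoid_cross_entropy_with_logits) suits cases where the classes are independent but not mutually exclusive; for example, an image can contain both letters and digits. Each batch uses 64 training samples, and every 100 training steps the recognition accuracy is measured on a batch of 100 samples. The model is saved the first time accuracy exceeds 95%; after each save the threshold rises by one percentage point, and training stops once the threshold passes 99%. I trained on the CPU, and it took roughly 6-7 hours for accuracy to reach 99%.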
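For the curious, here is what the sigmoid cross-entropy computes, written in NumPy. It is a sketch for understanding, not the TensorFlow implementation itself: the function names are mine, and the formula used is the numerically stable form max(x, 0) - x*z + log(1 + exp(-|x|)), where x is the logit and z the 0/1 label.

```python
import numpy as np


def sigmoid_cross_entropy(logits, labels):
    """Element-wise sigmoid cross-entropy in its numerically stable form."""
    x = np.asarray(logits, dtype=float)
    z = np.asarray(labels, dtype=float)
    return np.maximum(x, 0) - x * z + np.log1p(np.exp(-np.abs(x)))


def naive_cross_entropy(logits, labels):
    """Textbook form: -(z*log(p) + (1-z)*log(1-p)) with p = sigmoid(x)."""
    p = 1 / (1 + np.exp(-logits))
    return -(labels * np.log(p) + (1 - labels) * np.log(1 - p))


logits = np.array([[2.0, -1.0, 0.5, 0.0]])
labels = np.array([[1.0, 0.0, 0.0, 1.0]])
stable = sigmoid_cross_entropy(logits, labels)
direct = naive_cross_entropy(logits, labels)
loss = stable.mean()  # the scalar that training minimises
print(stable, loss)
```

Every output slot is treated as its own yes/no question, which is why the loss works on the whole 4-block label vector at once without needing a separate softmax per character.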

train.py

# Note: this file uses the TensorFlow 1.x API (tf.placeholder, tf.Session)
import os
from datetime import datetime

import tensorflow as tf

from demo1.process import next_batch
from demo1.getimg import CAPTCHA_HEIGHT, CAPTCHA_WIDTH, CAPTCHA_LEN, CAPTCHA_LIST


# Randomly initialise weights
def weight_variable(shape, w_alpha=0.01):
    initial = w_alpha * tf.random_normal(shape)
    return tf.Variable(initial)


# Randomly initialise biases
def bias_variable(shape, b_alpha=0.1):
    initial = b_alpha * tf.random_normal(shape)
    return tf.Variable(initial)


# 2-D convolution with stride 1; 'SAME' padding (zero padding) keeps the image size unchanged
def conv2d(x, w):
    return tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME')


# Max pooling: keep the largest value in each 2x2 region, halving the image size
def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')


# Computation graph of the three-conv-layer CNN
def cnn_graph(x, keep_prob, size, captcha_list=CAPTCHA_LIST, captcha_len=CAPTCHA_LEN):
    # Reshape the flat image into a 4-D tensor
    image_height, image_width = size
    x_image = tf.reshape(x, shape=[-1, image_height, image_width, 1])

    # Layer 1
    # 3x3x1 filters, 32 of them, so 32 output feature maps
    w_conv1 = weight_variable([3, 3, 1, 32])
    b_conv1 = bias_variable([32])
    # ReLU activation
    h_conv1 = tf.nn.relu(tf.nn.bias_add(conv2d(x_image, w_conv1), b_conv1))
    # Pooling
    h_pool1 = max_pool_2x2(h_conv1)
    # Dropout to reduce overfitting
    h_drop1 = tf.nn.dropout(h_pool1, keep_prob)

    # Layer 2
    w_conv2 = weight_variable([3, 3, 32, 64])
    b_conv2 = bias_variable([64])
    h_conv2 = tf.nn.relu(tf.nn.bias_add(conv2d(h_drop1, w_conv2), b_conv2))
    h_pool2 = max_pool_2x2(h_conv2)
    h_drop2 = tf.nn.dropout(h_pool2, keep_prob)

    # Layer 3
    w_conv3 = weight_variable([3, 3, 64, 64])
    b_conv3 = bias_variable([64])
    h_conv3 = tf.nn.relu(tf.nn.bias_add(conv2d(h_drop2, w_conv3), b_conv3))
    h_pool3 = max_pool_2x2(h_conv3)
    h_drop3 = tf.nn.dropout(h_pool3, keep_prob)

    # Fully connected layer
    image_height = int(h_drop3.shape[1])
    image_width = int(h_drop3.shape[2])
    w_fc = weight_variable([image_height * image_width * 64, 1024])
    b_fc = bias_variable([1024])
    h_drop3_re = tf.reshape(h_drop3, [-1, image_height * image_width * 64])
    h_fc = tf.nn.relu(tf.add(tf.matmul(h_drop3_re, w_fc), b_fc))
    h_drop_fc = tf.nn.dropout(h_fc, keep_prob)

    # Fully connected layer (output layer)
    w_out = weight_variable([1024, len(captcha_list) * captcha_len])
    b_out = bias_variable([len(captcha_list) * captcha_len])
    y_conv = tf.add(tf.matmul(h_drop_fc, w_out), b_out)
    return y_conv


# Minimise the loss
def optimize_graph(y, y_conv):
    # Cross-entropy loss. Sigmoid cross-entropy suits classes that are independent
    # but not mutually exclusive, e.g. an image can contain both letters and digits
    loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=y_conv, labels=y))
    # Optimiser that minimises the loss
    optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss)
    return optimizer


# Accuracy: fraction of individual characters predicted correctly
def accuracy_graph(y, y_conv, width=len(CAPTCHA_LIST), height=CAPTCHA_LEN):
    # Predictions
    predict = tf.reshape(y_conv, [-1, height, width])
    max_predict_idx = tf.argmax(predict, 2)
    # Labels
    label = tf.reshape(y, [-1, height, width])
    max_label_idx = tf.argmax(label, 2)
    correct_p = tf.equal(max_predict_idx, max_label_idx)
    accuracy = tf.reduce_mean(tf.cast(correct_p, tf.float32))
    return accuracy


# Train the CNN
def train(height=CAPTCHA_HEIGHT, width=CAPTCHA_WIDTH, y_size=len(CAPTCHA_LIST) * CAPTCHA_LEN):
    acc_rate = 0.95
    # Placeholders sized to the image
    x = tf.placeholder(tf.float32, [None, height * width])
    y = tf.placeholder(tf.float32, [None, y_size])
    # Dropout keep probability: below 1 during training, 1.0 during testing
    keep_prob = tf.placeholder(tf.float32)
    # CNN model
    y_conv = cnn_graph(x, keep_prob, (height, width))
    # Optimiser
    optimizer = optimize_graph(y, y_conv)
    # Accuracy
    accuracy = accuracy_graph(y, y_conv)
    # Start the session and begin training
    saver = tf.train.Saver()
    sess = tf.Session()
    sess.run(tf.global_variables_initializer())
    step = 0
    while True:
        # 64 samples per batch
        batch_x, batch_y = next_batch(64)
        sess.run(optimizer, feed_dict={x: batch_x, y: batch_y, keep_prob: 0.75})
        print("step:", step)
        # Test once every 100 training steps
        if step % 100 == 0:
            batch_x_test, batch_y_test = next_batch(100)
            acc = sess.run(accuracy, feed_dict={x: batch_x_test, y: batch_y_test, keep_prob: 1.0})
            print(datetime.now().strftime('%c'), ' step:', step, ' accuracy:', acc)
            # Accuracy is above the current threshold: save the model and raise the threshold
            if acc > acc_rate:
                model_path = os.path.join(os.getcwd(), str(acc_rate) + "captcha.model")
                saver.save(sess, model_path, global_step=step)
                acc_rate += 0.01
                if acc_rate > 0.99:
                    break
        step += 1
    sess.close()


if __name__ == '__main__':
    train()
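One detail worth knowing about the accuracy printed during training: accuracy_graph averages over individual characters, so it is a per-character score, while a CAPTCHA only counts as solved when every character is right. The NumPy sketch below shows the difference on made-up data; the function name `accuracies` and the tiny 3-symbol, 2-character setup are assumptions for illustration.

```python
import numpy as np


def accuracies(pred_logits, labels, captcha_len, n_classes):
    """Return (per-character accuracy, whole-CAPTCHA accuracy)."""
    pred = pred_logits.reshape(-1, captcha_len, n_classes).argmax(axis=2)
    true = labels.reshape(-1, captcha_len, n_classes).argmax(axis=2)
    hits = pred == true
    # Mean over all characters vs. mean over CAPTCHAs with every character right.
    return hits.mean(), hits.all(axis=1).mean()


n_classes, captcha_len = 3, 2
true_idx = np.array([[0, 1], [2, 2]])  # two CAPTCHAs, two characters each
pred_idx = np.array([[0, 1], [2, 0]])  # the second CAPTCHA has one wrong character
labels = np.eye(n_classes)[true_idx].reshape(2, -1)
logits = np.eye(n_classes)[pred_idx].reshape(2, -1)
char_acc, whole_acc = accuracies(logits, labels, captcha_len, n_classes)
print(char_acc, whole_acc)  # 0.75 0.5
```

As a rough guide, if character errors were independent, a per-character accuracy of 99% on 4-character CAPTCHAs would correspond to about 0.99^4, or roughly 96%, of whole CAPTCHAs solved.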

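Since this article is aimed at readers with zero deep learning background, we don't need to understand every detail of the code. Training the model simply means running it.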

4

Testing the model

Once training finishes, the following four files appear:

This is what those hours of training bought us!

Finally, let's test the model:

import random

import numpy as np
import tensorflow as tf

from demo1.train import cnn_graph
from demo1.process import vec2text, convert2gray, wrap_gen_captcha_text_and_image
from demo1.getimg import CAPTCHA_HEIGHT, CAPTCHA_WIDTH, CAPTCHA_LEN, CAPTCHA_LIST


# Convert CAPTCHA images to text
def captcha2text(image_list, height=CAPTCHA_HEIGHT, width=CAPTCHA_WIDTH):
    x = tf.placeholder(tf.float32, [None, height * width])
    keep_prob = tf.placeholder(tf.float32)
    y_conv = cnn_graph(x, keep_prob, (height, width))
    saver = tf.train.Saver()
    with tf.Session() as sess:
        # Restore the most recent checkpoint from the current directory
        saver.restore(sess, tf.train.latest_checkpoint('.'))
        predict = tf.argmax(tf.reshape(y_conv, [-1, CAPTCHA_LEN, len(CAPTCHA_LIST)]), 2)
        vector_list = sess.run(predict, feed_dict={x: image_list, keep_prob: 1})
        vector_list = vector_list.tolist()
        text_list = [vec2text(vector) for vector in vector_list]
        return text_list


if __name__ == '__main__':
    # Pick one CAPTCHA at random from the image directory and test it
    text, image = wrap_gen_captcha_text_and_image()
    text_a = random.choice(text)
    image_a = image[text.index(text_a)]
    img_array = np.array(image_a)
    gray = convert2gray(img_array)
    flat = gray.flatten() / 255
    pre_text = captcha2text([flat])
    if pre_text[0] == text_a:
        print(' Correct CAPTCHA:', text_a, "Recognized:", pre_text, " TRUE")
    else:
        print(' Correct CAPTCHA:', text_a, "Recognized:", pre_text, "FALSE")

As the result shows, the CAPTCHA was recognized correctly!


This article is shared from the WeChat official account 平常學python (daily_learn).