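A while ago I built a small CAPTCHA recognizer with TensorFlow 1.13, and its accuracy was quite high (admittedly, most of the logic was borrowed from blog posts by various experts online and then tested myself).
Ever since the TensorFlow 2.0 alpha was released not long ago, I have wanted to rewrite it the Keras way in 2.0. Several of the deeplearning.ai videos I watched implement their models with Keras, which feels much simpler and clearer than building models with native TensorFlow, and saving and reloading training results is also much simpler.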
====================================================
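CAPTCHA generation and preprocessing
I kept the earlier approach to CAPTCHA generation and still use the captcha library to generate the images.
Each CAPTCHA is drawn from the 10 digits 0~9, the lowercase English letters, and the uppercase English letters, so there are 62 possible characters in total.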
number = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
alphabet = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
ALPHABET = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
CHAR_SET = number + alphabet + ALPHABET
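As a quick check of the count, the same 62-character set can be built from Python's standard `string` module. This is only a sketch; `CHAR_TO_INDEX` is a hypothetical helper name that does not appear in the original code.

```python
import string

# Same ordering as above: digits, then lowercase, then uppercase.
CHAR_SET = list(string.digits + string.ascii_lowercase + string.ascii_uppercase)

# Hypothetical lookup table (character -> index), avoiding repeated list.index() scans.
CHAR_TO_INDEX = {c: i for i, c in enumerate(CHAR_SET)}

print(len(CHAR_SET))        # 62
print(CHAR_TO_INDEX['a'])   # 10
print(CHAR_TO_INDEX['Z'])   # 61
```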
The generated CAPTCHAs all contain interference lines and dots, along with a variety of colors and distortions.
First, convert the image to a numpy array.
captcha_image = Image.open(captcha)
captcha_image = np.array(captcha_image)
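Analysis shows that color has no effect on recognizing this kind of CAPTCHA, so the image is preprocessed by converting it to grayscale (depending on the images, grayscale conversion, binarization, or other preprocessing may be appropriate).
Here I simply take the mean of the channels (the standard approach is a weighted sum of the R, G, and B channels).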
gray = np.mean(img, -1)
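For comparison, here is a minimal sketch of the weighted conversion mentioned above, using the ITU-R BT.601 luma weights (0.299, 0.587, 0.114). The function names `to_gray_mean` and `to_gray_weighted` are made up for illustration; the original code only uses the plain mean.

```python
import numpy as np

def to_gray_mean(img):
    """Average the channels, as the post does."""
    return np.mean(img, -1)

def to_gray_weighted(img):
    """Weighted sum of R, G, B using the BT.601 luma coefficients."""
    weights = np.array([0.299, 0.587, 0.114])
    return img[..., :3] @ weights

# A tiny 1x2 RGB "image": one pure green pixel and one pure blue pixel.
img = np.array([[[0, 255, 0], [0, 0, 255]]], dtype=np.float64)
print(to_gray_mean(img))      # both pixels map to 85.0
print(to_gray_weighted(img))  # green stays bright, blue becomes dark
```

For CAPTCHAs like these, where color carries no information, either version should serve; the weighted form mainly matters when perceived brightness is important.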
====================================================
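Building the convolutional neural network model
1. The input images are 160 * 60. After grayscale preprocessing each image is a single-channel array with 9600 input values in total.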
IMAGE_HEIGHT = 60
IMAGE_WIDTH = 160
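2. The output character set has 62 characters and each image contains 4 characters, so there are 4 * 62 = 248 output values in total (batch_size below is the number of images per training batch).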
batch_y = np.zeros([batch_size, MAX_CAPTCHA, CHAR_SET_LEN])
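To make the label layout concrete, the sketch below one-hot encodes a single CAPTCHA string into a (4, 62) array, mirroring the `text2vec` function from the full listing. `MAX_CAPTCHA` is fixed at 4 here as an assumption, and the sample text "a1Zq" is arbitrary.

```python
import string
import numpy as np

CHAR_SET = list(string.digits + string.ascii_lowercase + string.ascii_uppercase)
CHAR_SET_LEN = len(CHAR_SET)
MAX_CAPTCHA = 4  # assumed fixed length for this sketch

def text2vec(text):
    """One-hot encode a CAPTCHA string into a (MAX_CAPTCHA, CHAR_SET_LEN) array."""
    vector = np.zeros([MAX_CAPTCHA, CHAR_SET_LEN])
    for i, c in enumerate(text):
        vector[i][CHAR_SET.index(c)] = 1.0
    return vector

label = text2vec("a1Zq")
print(label.shape)               # (4, 62)
print(label.sum())               # 4.0, one hot entry per position
print(np.argmax(label, axis=1))  # index of each character in CHAR_SET
```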
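3. The input layer has 9600 values and the output layer has 248 values; using fully connected layers as the hidden layers would require an enormous amount of computation.
So convolution and pooling are applied first to cut the computation as much as possible (readers with some deep learning background will know that computer vision problems like this are usually solved with convolutional neural networks).
The images are low resolution, so the kernels and pooling windows cannot be too large; 3 * 3 and 5 * 5 kernels are the first choice, with a 2 * 2 pooling window.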
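With the network below, the output after convolution and pooling is 128 * 16 * 4 = 8192 values, given Keras's default 'valid' padding (the size is halved if the last layer keeps a depth of 64).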
model.add(tf.keras.layers.Conv2D(32, (3, 3)))
model.add(tf.keras.layers.PReLU())
model.add(tf.keras.layers.MaxPool2D((2, 2), strides=2))
model.add(tf.keras.layers.Conv2D(64, (5, 5)))
model.add(tf.keras.layers.PReLU())
model.add(tf.keras.layers.MaxPool2D((2, 2), strides=2))
model.add(tf.keras.layers.Conv2D(128, (5, 5)))
model.add(tf.keras.layers.PReLU())
model.add(tf.keras.layers.MaxPool2D((2, 2), strides=2))
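A short hand calculation is a useful check on the flattened size. The sketch below assumes the layers keep Keras's default `padding='valid'` and stride-1 convolutions, as in the snippet above; the helper names `conv_valid` and `pool_valid` are made up for illustration. Under that assumption the three stages give 29 x 79, 12 x 37, and 4 x 16 feature maps, so 4 * 16 * 128 = 8192 values enter the Flatten layer.

```python
def conv_valid(size, kernel):
    """Output length of a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1

def pool_valid(size, pool=2, stride=2):
    """Output length of max pooling with 'valid' padding (floor division)."""
    return (size - pool) // stride + 1

height, width = 60, 160
# (kernel size, number of filters) for the three conv + pool stages.
for kernel, depth in [(3, 32), (5, 64), (5, 128)]:
    height, width = conv_valid(height, kernel), conv_valid(width, kernel)
    height, width = pool_valid(height), pool_valid(width)
    print(f"after conv {kernel}x{kernel} + pool: {height} x {width} x {depth}")

flattened = height * width * depth
print(flattened)  # 8192
```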
4. The characters at each position are independent of one another, so the output is still treated as 4 groups and must be reshaped into a (4, 62) array.
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(MAX_CAPTCHA * CHAR_SET_LEN))
model.add(tf.keras.layers.Reshape([MAX_CAPTCHA, CHAR_SET_LEN]))
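6. Recognition works by finding, for each position, the character with the highest probability, so each image is a 4-position multi-class classification problem, and the final output is normalized with softmax.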
model.add(tf.keras.layers.Softmax())
7. In TensorFlow 2.0 the loss function that pairs with softmax is categorical_crossentropy, so the model is compiled accordingly.
model.compile(optimizer='Adam', metrics=['accuracy'], loss='categorical_crossentropy')
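8. Finally, take the index with the highest probability as the index into the character set and look up the actual character (during training there is no need to convert to characters; comparing the indices directly is enough).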
prediction_value = vec2text(np.argmax(prediction_value, axis=2)[0])
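The softmax and argmax steps can be sketched in plain numpy without a trained model. The scores below are fabricated purely for illustration (zeros with a bump at the intended character), and `softmax` is a hand-written stand-in for the Keras layer, which also normalizes over the last axis by default.

```python
import string
import numpy as np

CHAR_SET = list(string.digits + string.ascii_lowercase + string.ascii_uppercase)

def softmax(logits):
    """Softmax over the last axis, shifted by the max for numerical stability."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def vec2text(indices):
    """Map a sequence of character indices back to a string."""
    return "".join(CHAR_SET[i] for i in indices)

# Made-up raw scores for a batch of one image: shape (1, 4, 62).
logits = np.zeros((1, 4, 62))
for pos, ch in enumerate("Ab3z"):
    logits[0, pos, CHAR_SET.index(ch)] += 5.0  # make one character win per position

probs = softmax(logits)
print(probs.sum(axis=-1))                     # each of the 4 positions sums to 1
print(vec2text(np.argmax(probs, axis=2)[0]))  # Ab3z
```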
====================================================
Training the model
Training runs in a loop: each batch contains 512 images and is trained for 4 epochs.
for times in range(500000):
    batch_x, batch_y = get_next_batch(512)
    model.fit(batch_x, batch_y, epochs=4)
    print("predicted=\n", np.argmax(model.predict(batch_x), axis=2))
    print("actual=\n", np.argmax(batch_y, axis=2))
At the start the loss may hover around 4; after a few more batches it drops noticeably and the accuracy climbs sharply.
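A starting loss near 4 is roughly what a model that guesses uniformly over 62 classes would produce. The sketch below works this out, assuming the reported loss is the categorical cross-entropy averaged over the 4 positions (an assumption about how Keras reduces the loss here, not something the post states).

```python
import math

CHAR_SET_LEN = 62

# A model that spreads probability evenly assigns 1/62 to the correct
# character, so the cross-entropy per position is -ln(1/62) = ln(62).
uniform_loss = -math.log(1.0 / CHAR_SET_LEN)
print(round(uniform_loss, 3))  # 4.127
```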
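In the log below, accuracy in the first epoch of a batch is under 60%, while in the fourth epoch it generally exceeds 99%.
However, reaching 99% does not mean the model has finished training; that accuracy only applies to the current batch of 512 images.
After all, a 4-character CAPTCHA has 62 * 62 * 62 * 62 possibilities and the training set cannot cover them all, so newly generated CAPTCHAs are needed to decide whether training can stop.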
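To put numbers on this, the sketch below computes the size of the label space and defines a whole-CAPTCHA accuracy that could be measured on freshly generated samples. `exact_match_rate` is a hypothetical helper, not part of the original code; in practice `y_pred` would come from `model.predict` on a new batch, while here both arrays are toy data.

```python
import numpy as np

total = 62 ** 4
print(total)  # 14776336

def exact_match_rate(y_true, y_pred):
    """Fraction of CAPTCHAs with every position predicted correctly.

    Both arrays have shape (n, 4, 62): one-hot labels and predicted
    probabilities respectively.
    """
    true_idx = np.argmax(y_true, axis=2)
    pred_idx = np.argmax(y_pred, axis=2)
    return float(np.mean(np.all(true_idx == pred_idx, axis=1)))

# Toy data: the first sample is fully correct, the second has one wrong position.
eye = np.eye(62)
y_true = eye[np.array([[1, 2, 3, 4], [5, 6, 7, 8]])]
y_pred = eye[np.array([[1, 2, 3, 4], [5, 6, 7, 9]])]
print(exact_match_rate(y_true, y_pred))  # 0.5
```

Note that this measure is stricter than a per-character accuracy: one wrong character makes the whole CAPTCHA count as a miss.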
Epoch 1/4
 32/512 [>.............................] - ETA: 5s - loss: 2.4209 - accuracy: 0.5703
 64/512 [==>...........................] - ETA: 5s - loss: 2.2339 - accuracy: 0.5703
 96/512 [====>.........................] - ETA: 5s - loss: 2.1561 - accuracy: 0.5911
128/512 [======>.......................] - ETA: 4s - loss: 2.0170 - accuracy: 0.6016
160/512 [========>.....................] - ETA: 4s - loss: 1.9622 - accuracy: 0.6031
192/512 [==========>...................] - ETA: 3s - loss: 1.9425 - accuracy: 0.6029
224/512 [============>.................] - ETA: 3s - loss: 1.9192 - accuracy: 0.6038
256/512 [==============>...............] - ETA: 3s - loss: 1.8921 - accuracy: 0.6113
288/512 [===============>..............] - ETA: 2s - loss: 1.8746 - accuracy: 0.6094
320/512 [=================>............] - ETA: 2s - loss: 1.8479 - accuracy: 0.6031
352/512 [===================>..........] - ETA: 1s - loss: 1.8367 - accuracy: 0.5987
384/512 [=====================>........] - ETA: 1s - loss: 1.8379 - accuracy: 0.5931
416/512 [=======================>......] - ETA: 1s - loss: 1.8287 - accuracy: 0.5913
448/512 [=========================>....] - ETA: 0s - loss: 1.8086 - accuracy: 0.5887
480/512 [===========================>..] - ETA: 0s - loss: 1.7682 - accuracy: 0.5917
512/512 [==============================] - 6s 12ms/sample - loss: 1.7781 - accuracy: 0.5864
......
Epoch 4/4
 32/512 [>.............................] - ETA: 5s - loss: 0.0034 - accuracy: 1.0000
 64/512 [==>...........................] - ETA: 5s - loss: 0.0066 - accuracy: 1.0000
 96/512 [====>.........................] - ETA: 4s - loss: 0.0094 - accuracy: 1.0000
128/512 [======>.......................] - ETA: 4s - loss: 0.0089 - accuracy: 1.0000
160/512 [========>.....................] - ETA: 4s - loss: 0.0097 - accuracy: 0.9984
192/512 [==========>...................] - ETA: 3s - loss: 0.0100 - accuracy: 0.9987
224/512 [============>.................] - ETA: 3s - loss: 0.0095 - accuracy: 0.9989
256/512 [==============>...............] - ETA: 3s - loss: 0.0088 - accuracy: 0.9990
288/512 [===============>..............] - ETA: 2s - loss: 0.0084 - accuracy: 0.9991
320/512 [=================>............] - ETA: 2s - loss: 0.0083 - accuracy: 0.9992
352/512 [===================>..........] - ETA: 1s - loss: 0.0081 - accuracy: 0.9993
384/512 [=====================>........] - ETA: 1s - loss: 0.0080 - accuracy: 0.9993
416/512 [=======================>......] - ETA: 1s - loss: 0.0080 - accuracy: 0.9994
448/512 [=========================>....] - ETA: 0s - loss: 0.0077 - accuracy: 0.9994
480/512 [===========================>..] - ETA: 0s - loss: 0.0075 - accuracy: 0.9995
512/512 [==============================] - 6s 12ms/sample - loss: 0.0074 - accuracy: 0.9995
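After training for a little over 150 batches I tried recognition, and the success rate was roughly 15%~20%. With more training batches the model's recognition rate should become quite high.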
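predicted= XAZj actual= iAzj Prediction failed.
predicted= EbqY actual= EbqY Prediction succeeded.
predicted= WjMl actual= WjMl Prediction succeeded.
predicted= Jppw actual= Jlpw Prediction failed.
predicted= RFQq actual= RFQq Prediction succeeded.
......
predicted= SRC2 actual= SaKZ Prediction failed.
predicted= Kfza actual= KpZa Prediction failed.
predicted= yrct actual= yrtt Prediction failed.
predicted= LpKb actual= Lpwb Prediction failed.
predicted= iWWl actual= iWqL Prediction failed.
predictions: 100 success rate = 0.16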
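The gap between a high training accuracy and a 16% success rate is partly a matter of what is being counted: a CAPTCHA only counts as solved when all 4 characters are right. The back-of-the-envelope sketch below assumes the 4 positions are independent and share one per-character accuracy, which is a simplification and not something measured in the post. The success check in the full listing is also case-insensitive, so the per-character figure here is a case-insensitive one.

```python
whole_rate = 0.16  # whole-CAPTCHA success rate reported above

# Under the independence assumption, whole_rate = p ** 4,
# so the implied per-character accuracy is the fourth root.
per_char = whole_rate ** 0.25
print(round(per_char, 3))    # 0.632

# Conversely, 99% per-character accuracy would give about 96% whole-CAPTCHA accuracy.
print(round(0.99 ** 4, 3))   # 0.961
```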
====================================================
The complete code is below; it runs successfully under Python 3.6.8 and TensorFlow 2.0.0-alpha0.
https://github.com/yukiti2007/sample/blob/master/python/tensorflow/keras_cnn.py
# coding:utf-8
import random

import numpy as np
import tensorflow as tf
from captcha.image import ImageCaptcha
from PIL import Image

number = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
alphabet = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
ALPHABET = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']

SAVE_PATH = "D:/test/tf2/keras_cnn/"
CHAR_SET = number + alphabet + ALPHABET
CHAR_SET_LEN = len(CHAR_SET)
IMAGE_HEIGHT = 60
IMAGE_WIDTH = 160


def random_captcha_text(char_set=None, captcha_size=4):
    # Pick captcha_size random characters from the character set.
    if char_set is None:
        char_set = number + alphabet + ALPHABET
    captcha_text = []
    for i in range(captcha_size):
        c = random.choice(char_set)
        captcha_text.append(c)
    return captcha_text


def gen_captcha_text_and_image(width=160, height=60, char_set=CHAR_SET):
    # Generate a random CAPTCHA and return its text and image as a numpy array.
    image = ImageCaptcha(width=width, height=height)
    captcha_text = random_captcha_text(char_set)
    captcha_text = ''.join(captcha_text)
    captcha = image.generate(captcha_text)
    captcha_image = Image.open(captcha)
    captcha_image = np.array(captcha_image)
    return captcha_text, captcha_image


# Generate one sample to determine the CAPTCHA length.
text, image = gen_captcha_text_and_image(char_set=CHAR_SET)
MAX_CAPTCHA = len(text)
print('CHAR_SET_LEN=', CHAR_SET_LEN, ' MAX_CAPTCHA=', MAX_CAPTCHA)


def convert2gray(img):
    # Convert to grayscale by averaging the color channels.
    if len(img.shape) > 2:
        gray = np.mean(img, -1)
        return gray
    else:
        return img


def text2vec(text):
    # One-hot encode the text into a (MAX_CAPTCHA, CHAR_SET_LEN) array.
    vector = np.zeros([MAX_CAPTCHA, CHAR_SET_LEN])
    for i, c in enumerate(text):
        idx = CHAR_SET.index(c)
        vector[i][idx] = 1.0
    return vector


def vec2text(vec):
    # Map a sequence of character indices back to a string.
    text = []
    for i, c in enumerate(vec):
        text.append(CHAR_SET[c])
    return "".join(text)


def get_next_batch(batch_size=128):
    batch_x = np.zeros([batch_size, IMAGE_HEIGHT, IMAGE_WIDTH, 1])
    batch_y = np.zeros([batch_size, MAX_CAPTCHA, CHAR_SET_LEN])

    def wrap_gen_captcha_text_and_image():
        # Retry until the generated image has the expected shape.
        while True:
            text, image = gen_captcha_text_and_image(char_set=CHAR_SET)
            if image.shape == (60, 160, 3):
                return text, image

    for i in range(batch_size):
        text, image = wrap_gen_captcha_text_and_image()
        image = tf.reshape(convert2gray(image), (IMAGE_HEIGHT, IMAGE_WIDTH, 1))
        batch_x[i, :] = image
        batch_y[i, :] = text2vec(text)
    return batch_x, batch_y


def crack_captcha_cnn():
    model = tf.keras.Sequential()

    # Three convolution + PReLU + max-pooling stages.
    model.add(tf.keras.layers.Conv2D(32, (3, 3)))
    model.add(tf.keras.layers.PReLU())
    model.add(tf.keras.layers.MaxPool2D((2, 2), strides=2))

    model.add(tf.keras.layers.Conv2D(64, (5, 5)))
    model.add(tf.keras.layers.PReLU())
    model.add(tf.keras.layers.MaxPool2D((2, 2), strides=2))

    model.add(tf.keras.layers.Conv2D(128, (5, 5)))
    model.add(tf.keras.layers.PReLU())
    model.add(tf.keras.layers.MaxPool2D((2, 2), strides=2))

    # Flatten, project to 4 * 62 values, and apply softmax per position.
    model.add(tf.keras.layers.Flatten())
    model.add(tf.keras.layers.Dense(MAX_CAPTCHA * CHAR_SET_LEN))
    model.add(tf.keras.layers.Reshape([MAX_CAPTCHA, CHAR_SET_LEN]))
    model.add(tf.keras.layers.Softmax())

    return model


def train():
    try:
        # Resume from a previously saved model if one exists.
        model = tf.keras.models.load_model(SAVE_PATH + 'model')
    except Exception as e:
        print('#######Exception', e)
        model = crack_captcha_cnn()

    model.compile(optimizer='Adam',
                  metrics=['accuracy'],
                  loss='categorical_crossentropy')

    for times in range(500000):
        batch_x, batch_y = get_next_batch(512)
        print('times=', times, ' batch_x.shape=', batch_x.shape, ' batch_y.shape=', batch_y.shape)
        model.fit(batch_x, batch_y, epochs=4)
        print("predicted=\n", np.argmax(model.predict(batch_x), axis=2))
        print("actual=\n", np.argmax(batch_y, axis=2))

        # Save a checkpoint every 10 batches.
        if 0 == times % 10:
            print("save model at times=", times)
            model.save(SAVE_PATH + 'model')


def predict():
    model = tf.keras.models.load_model(SAVE_PATH + 'model')
    success = 0
    count = 100
    for _ in range(count):
        data_x, data_y = get_next_batch(1)
        prediction_value = model.predict(data_x)
        data_y = vec2text(np.argmax(data_y, axis=2)[0])
        prediction_value = vec2text(np.argmax(prediction_value, axis=2)[0])

        # The comparison ignores case.
        if data_y.upper() == prediction_value.upper():
            print("predicted=", prediction_value, "actual=", data_y, "Prediction succeeded.")
            success += 1
        else:
            print("predicted=", prediction_value, "actual=", data_y, "Prediction failed.")

    print("predictions:", count, "success rate =", success / count)


if __name__ == "__main__":
    train()
    predict()