In the previous article we covered part one: how to generate simple license plates. In this article we will use PaddlePaddle to recognize the generated plates.
Reading the data
When generating plates in the previous article, we can produce the training data and the test data separately, as follows (the complete code is in the repo listed under References):
# Write the generated plate images to a folder and the corresponding labels to label.txt
def genBatch(self, batchSize, pos, charRange, outputPath, size):
    if not os.path.exists(outputPath):
        os.mkdir(outputPath)
    outfile = open('label.txt', 'w')
    for i in xrange(batchSize):
        plateStr, plate = G.genPlateString(-1, -1)
        print plateStr, plate
        img = G.generate(plateStr)
        img = cv2.resize(img, size)
        cv2.imwrite(outputPath + "/" + str(i).zfill(2) + ".jpg", img)
        outfile.write(str(plate) + "\n")
After generating the data, we write a reader to read it in (reador.py):
def reader_creator(data, label):
    def reader():
        for i in xrange(len(data)):
            yield data[i, :], int(label[i])
    return reader
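reader_creator expects data as a 2-D array with one flattened image per row and label as a sequence of integer class indices. As a rough loading sketch (the grayscale conversion, 1/255 normalization, and target size below are assumptions, not necessarily the preprocessing used in the original repo), the files written by genBatch could be read back like this:

# Rough loading sketch, assuming the folder layout written by genBatch above.
# Grayscale conversion, 1/255 normalization and the target size are assumptions.
import os
import cv2
import numpy as np

def load_dataset(image_dir, label_file, size=(120, 30)):
    # assumes label.txt holds one integer label per line; adapt to whatever encoding genBatch wrote
    labels = [int(line.strip()) for line in open(label_file)]
    data = []
    for i in xrange(len(labels)):
        img = cv2.imread(os.path.join(image_dir, str(i).zfill(2) + ".jpg"),
                         cv2.IMREAD_GRAYSCALE)
        img = cv2.resize(img, size)
        data.append(img.flatten().astype('float32') / 255.0)
    return np.array(data), np.array(labels)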
When feeding the model, we call the paddle.batch function to shuffle the data and feed it in batches:
# Read the training data
train_reader = paddle.batch(paddle.reader.shuffle(
    reador.reader_creator(X_train, Y_train), buf_size=200),
    batch_size=16)

# Read the validation data
val_reader = paddle.batch(paddle.reader.shuffle(
    reador.reader_creator(X_val, Y_val), buf_size=200),
    batch_size=16)

trainer.train(reader=train_reader, num_passes=20, event_handler=event_handler)
Building the network model
Because we are training end-to-end plate recognition, we first build two convolution-pooling layers, then train seven fully connected layers in parallel on top of them, one for each of the plate's seven characters. Finally we concatenate their outputs and compute a Softmax against the original label to produce the prediction.
def get_network_cnn(self):
    # Data and label layers
    x = paddle.layer.data(name='x', type=paddle.data_type.dense_vector(self.data))
    y = paddle.layer.data(name='y', type=paddle.data_type.integer_value(self.label))
    # Convolution-pooling layer 1
    conv_pool_1 = paddle.networks.simple_img_conv_pool(
        input=x,
        filter_size=12,
        num_filters=50,
        num_channel=1,
        pool_size=2,
        pool_stride=2,
        act=paddle.activation.Relu())
    drop_1 = paddle.layer.dropout(input=conv_pool_1, dropout_rate=0.5)
    # Convolution-pooling layer 2
    conv_pool_2 = paddle.networks.simple_img_conv_pool(
        input=drop_1,
        filter_size=5,
        num_filters=50,
        num_channel=20,
        pool_size=2,
        pool_stride=2,
        act=paddle.activation.Relu())
    drop_2 = paddle.layer.dropout(input=conv_pool_2, dropout_rate=0.5)

    # Shared fully connected layer, followed by seven parallel branches,
    # one per plate character (65 classes each)
    fc = paddle.layer.fc(input=drop_2, size=120)
    fc1_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
    fc1 = paddle.layer.fc(input=fc1_drop, size=65, act=paddle.activation.Linear())

    fc2_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
    fc2 = paddle.layer.fc(input=fc2_drop, size=65, act=paddle.activation.Linear())

    fc3_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
    fc3 = paddle.layer.fc(input=fc3_drop, size=65, act=paddle.activation.Linear())

    fc4_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
    fc4 = paddle.layer.fc(input=fc4_drop, size=65, act=paddle.activation.Linear())

    fc5_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
    fc5 = paddle.layer.fc(input=fc5_drop, size=65, act=paddle.activation.Linear())

    fc6_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
    fc6 = paddle.layer.fc(input=fc6_drop, size=65, act=paddle.activation.Linear())

    fc7_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
    fc7 = paddle.layer.fc(input=fc7_drop, size=65, act=paddle.activation.Linear())

    # Concatenate the seven per-character branches and compute the cost
    fc_concat = paddle.layer.concat(input=[fc1, fc2, fc3, fc4, fc5, fc6, fc7], axis=0)
    predict = paddle.layer.classification_cost(input=fc_concat, label=y,
                                               act=paddle.activation.Softmax())
    return predict
Training the model
Once the network is built, the rest follows the usual steps: initialization, defining the optimization method, defining the training parameters, defining the trainer, and so on. Plug in the data-reading code from step one and the model trains normally.
class NeuralNetwork(object):
    def __init__(self, X_train, Y_train, X_val, Y_val):
        paddle.init(use_gpu=with_gpu, trainer_count=1)

        self.X_train = X_train
        self.Y_train = Y_train
        self.X_val = X_val
        self.Y_val = Y_val

    def get_network_cnn(self):
        x = paddle.layer.data(name='x', type=paddle.data_type.dense_vector(self.data))
        y = paddle.layer.data(name='y', type=paddle.data_type.integer_value(self.label))
        conv_pool_1 = paddle.networks.simple_img_conv_pool(
            input=x,
            filter_size=12,
            num_filters=50,
            num_channel=1,
            pool_size=2,
            pool_stride=2,
            act=paddle.activation.Relu())
        drop_1 = paddle.layer.dropout(input=conv_pool_1, dropout_rate=0.5)
        conv_pool_2 = paddle.networks.simple_img_conv_pool(
            input=drop_1,
            filter_size=5,
            num_filters=50,
            num_channel=20,
            pool_size=2,
            pool_stride=2,
            act=paddle.activation.Relu())
        drop_2 = paddle.layer.dropout(input=conv_pool_2, dropout_rate=0.5)

        fc = paddle.layer.fc(input=drop_2, size=120)
        fc1_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
        fc1 = paddle.layer.fc(input=fc1_drop, size=65, act=paddle.activation.Linear())

        fc2_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
        fc2 = paddle.layer.fc(input=fc2_drop, size=65, act=paddle.activation.Linear())

        fc3_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
        fc3 = paddle.layer.fc(input=fc3_drop, size=65, act=paddle.activation.Linear())

        fc4_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
        fc4 = paddle.layer.fc(input=fc4_drop, size=65, act=paddle.activation.Linear())

        fc5_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
        fc5 = paddle.layer.fc(input=fc5_drop, size=65, act=paddle.activation.Linear())

        fc6_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
        fc6 = paddle.layer.fc(input=fc6_drop, size=65, act=paddle.activation.Linear())

        fc7_drop = paddle.layer.dropout(input=fc, dropout_rate=0.5)
        fc7 = paddle.layer.fc(input=fc7_drop, size=65, act=paddle.activation.Linear())

        fc_concat = paddle.layer.concat(input=[fc1, fc2, fc3, fc4, fc5, fc6, fc7], axis=0)
        predict = paddle.layer.classification_cost(input=fc_concat, label=y,
                                                   act=paddle.activation.Softmax())
        return predict

    # Define the trainer
    def get_trainer(self):

        cost = self.get_network_cnn()

        # Create the parameters (kept on self so the event handler can save them)
        self.parameters = paddle.parameters.create(cost)

        optimizer = paddle.optimizer.Momentum(
            momentum=0.9,
            regularization=paddle.optimizer.L2Regularization(rate=0.0002 * 128),
            learning_rate=0.001,
            learning_rate_schedule="pass_manual")

        # Create the trainer
        trainer = paddle.trainer.SGD(
            cost=cost, parameters=self.parameters, update_equation=optimizer)
        return trainer

    # Start training
    def start_trainer(self, X_train, Y_train, X_val, Y_val):
        trainer = self.get_trainer()

        result_lists = []

        def event_handler(event):
            if isinstance(event, paddle.event.EndIteration):
                if event.batch_id % 10 == 0:
                    print "\nPass %d, Batch %d, Cost %f, %s" % (
                        event.pass_id, event.batch_id, event.cost, event.metrics)
            if isinstance(event, paddle.event.EndPass):
                # Save the trained parameters after each pass
                with open('params_pass_%d.tar' % event.pass_id, 'w') as f:
                    self.parameters.to_tar(f)
                # feeding = ['x', 'y']
                result = trainer.test(reader=val_reader)
                # feeding=feeding)
                print "\nTest with Pass %d, %s" % (event.pass_id, result.metrics)

                result_lists.append((event.pass_id, result.cost,
                                     result.metrics['classification_error_evaluator']))

        # Build the readers and start training
        train_reader = paddle.batch(paddle.reader.shuffle(
            reador.reader_creator(X_train, Y_train), buf_size=200),
            batch_size=16)

        val_reader = paddle.batch(paddle.reader.shuffle(
            reador.reader_creator(X_val, Y_val), buf_size=200),
            batch_size=16)
        # val_reader = paddle.reader(reador.reader_creator(X_val, Y_val), batch_size=16)

        trainer.train(reader=train_reader, num_passes=20, event_handler=event_handler)
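As a minimal sketch of how the class above could be driven (the directory and label-file names are placeholders, the arrays come from the load_dataset sketch in the data-reading section, and with_gpu is assumed to be a module-level flag read by paddle.init):

# Hypothetical driver for the class above.
with_gpu = False                                    # read by paddle.init in __init__

X_train, Y_train = load_dataset('./train_data', 'train_label.txt')
X_val, Y_val = load_dataset('./val_data', 'val_label.txt')

network = NeuralNetwork(X_train, Y_train, X_val, Y_val)
network.start_trainer(X_train, Y_train, X_val, Y_val)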
Output
After training in the previous step, save the trained model and then write a test.py to run prediction. Note that the network structure built at prediction time must be identical to the one used for training.
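The actual test.py is not reproduced here. A minimal sketch of the prediction step under the legacy v2 API used throughout this article might look like the following, where build_network() is a hypothetical helper that recreates the training structure up to, but not including, the classification_cost layer, and the file names, image size, and preprocessing are assumptions:

# Minimal inference sketch (not the author's test.py).
import cv2
import numpy as np
import paddle.v2 as paddle

paddle.init(use_gpu=False, trainer_count=1)

output_layer = build_network()                      # must match the training network exactly

# Load the parameters saved at the end of a pass, e.g. pass 19
with open('params_pass_19.tar', 'r') as f:
    parameters = paddle.parameters.Parameters.from_tar(f)

# One plate image, preprocessed the same way as the training data (size is a placeholder)
img = cv2.resize(cv2.imread('test.jpg', cv2.IMREAD_GRAYSCALE), (120, 30))

probs = paddle.infer(output_layer=output_layer,
                     parameters=parameters,
                     input=[(img.flatten().astype('float32') / 255.0,)])

# How to split the result back into seven characters depends on how the
# seven branches were concatenated during training
print np.argmax(probs, axis=1)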
# Batch-predict the accuracy on the test images
python test.py /Users/shelter/test

## Sample output
output:
Predicted plate number: 津 K 4 2 R M Y
Number of input images: 100
Row accuracy: 0.72
Column accuracy: 0.86
If you predict a single image at a time, the terminal shows the original image together with the predicted value; if you predict in batch, the overall prediction accuracy is printed, including the row and column accuracies.
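One plausible reading of those two figures is that the row accuracy requires all seven characters of a plate to be correct, while the column accuracy counts individual characters. Under that assumption they could be computed roughly like this, where preds and labels are hypothetical equal-length lists of 7-character plate strings:

# Rough accuracy sketch under the assumption described above.
def plate_accuracy(preds, labels):
    total = float(len(labels))
    row = sum(p == t for p, t in zip(preds, labels)) / total      # whole plate correct
    col = sum(pc == tc for p, t in zip(preds, labels)
              for pc, tc in zip(p, t)) / (7 * total)               # per-character correct
    return row, col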
Summary
There are many approaches to license plate recognition, and commercially deployed methods are quite mature. Traditional methods require grayscaling the image, segmenting the characters, and a lot of other preprocessing, whereas the end-to-end approach feeds the raw image straight into training and directly outputs the predicted plate characters. Here, after building a two-layer convolution-pooling structure, we trained seven fully connected layers in parallel to recognize the plate characters, achieving end-to-end recognition. In practice, however, some issues remain: for example, the fully connected layers for the first few characters reach higher accuracy than the last one or two. You can print the training accuracy of each fully connected layer and compare them yourself; this may be because training has not yet converged, or there may be other causes. If you run into problems while reproducing this, or have a better method, feel free to leave a comment.
References:
1. My GitHub: https://github.com/huxiaoman7/mxnet-cnn-plate-recognition