Setting up a python-3.5 + tensorflow-gpu-1.9.0 face recognition project environment with anaconda3

Date: 2019-12-14


Original article: http://www.javashuo.com/article/p-otgcommc-gu.html

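Previously, to set up a tensorflow-gpu environment, I installed CUDA, then cuDNN, then tensorflow-gpu, and so on. It was my first time building this environment, so I followed other people's guides step by step. Later, while setting up the environment for my roommate, I found that there is no need to install CUDA or cuDNN yourself: anaconda3 (a great tool) can handle all of it.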

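Installing Anaconda3

Almost everything below is done through it, so install it first.

Download it straight from the official website. During installation, remember to tick the option that adds it to PATH; otherwise you will have to set the environment variables yourself later.

Then check in PowerShell that it is set up correctly: conda -V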

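Configuring an environment

My machine is win10 x64 + GTX 1050, so I chose tensorflow-gpu 1.9.0. The GPU build is much faster when training the model (roughly 2-3 photos per second); on the CPU it is slower, around 2 s per photo.

Open PowerShell. (Do not switch conda to a mirror channel. Really, do not switch it.)

Create an environment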

conda create -n [name] python=3.5 tensorflow-gpu=1.9.0

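This step can be very slow, but I still recommend not switching to a mirror, because the download may then be incomplete and the GPU build of tensorflow may end up unusable.

After you enter the command, wait a moment and a list of packages to install will appear. Check that it includes python, tensorflow-gpu, cudnn, and cudatoolkit; if they are all there, confirm with y and wait.

The environment name can be anything you like.

Activating the environment

Activation does not work from PowerShell at this point, so switch to cmd by typing cmd, then run: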

activate [name]

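A ([name]) prefix now appears at the start of the prompt, which means the environment was activated.

Then test whether Python can actually use the GPU build of tensorflow. The test method is described in the last part of my previous blog post.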
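
Since the earlier post is not reproduced here, the following is only a minimal sketch of one common way to run such a check, and not necessarily the method that post used. It assumes the `device_lib.list_local_devices()` helper that ships inside TensorFlow; the function name `gpu_device_names` is made up for this illustration.

```python
def gpu_device_names():
    """Return the names of the GPU devices TensorFlow can see.

    Returns None if TensorFlow cannot be imported in the current environment.
    """
    try:
        # device_lib ships with TensorFlow and enumerates CPU and GPU devices.
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return [d.name for d in device_lib.list_local_devices()
            if d.device_type == 'GPU']


if __name__ == '__main__':
    names = gpu_device_names()
    if names is None:
        print('tensorflow is not installed in this environment')
    elif names:
        print('GPU build is usable, devices:', names)
    else:
        print('tensorflow imported, but no GPU device was found')
```

Run it inside the activated environment. An empty list usually means the CPU build was installed or the CUDA libraries were not picked up.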

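Running a simple face recognition example

With the preparation done, we can run a simple example to see how it behaves in this environment.

The Python programs below were given to me by a senior student; I later found that they come from a project written by another blogger. They also include some changes of my own, described below.

Everything below is done inside the environment just created; otherwise you will be working in anaconda3's default base environment.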
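
To double-check which environment a script is really running in, a small sketch like the one below can help. It relies on the `CONDA_DEFAULT_ENV` variable that conda sets on activation; the helper names `active_conda_env` and `describe_env` are hypothetical, and treating "root" as the old name of the base environment is an assumption about older conda releases.

```python
import os
import sys


def active_conda_env():
    """Return the name of the active conda environment, or None."""
    # conda sets CONDA_DEFAULT_ENV when an environment is activated.
    return os.environ.get('CONDA_DEFAULT_ENV')


def describe_env(env_name):
    """Turn an environment name into a short status line."""
    if env_name is None:
        return 'no conda environment is active'
    if env_name in ('base', 'root'):
        return 'still in the default base environment'
    return 'active environment: ' + env_name


if __name__ == '__main__':
    print(describe_env(active_conda_env()))
    # The interpreter path should point inside envs\<name> for a custom environment.
    print('interpreter:', sys.executable)
```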

Installing the required libraries

This face recognition implementation uses opencv, dlib, and others, so install those first.

Installing opencv

conda install opencv

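Installing dlib

This one is a bit of a trap. Installing it directly sometimes fails with a message that cmake is missing (cmake is the build tool dlib needs in order to compile), so install cmake first. I suggest opening the main anaconda3 program (Anaconda Navigator in the Start menu), selecting your environment, then finding cmake in the list of not-installed packages and installing it.

Even then dlib may still fail to install, whether with conda or with pip: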

conda install dlib
pip install dlib

Later I found a workaround: download a dlib****.whl file and install it locally.

Download link

Under DownloadFiles, find this file:
dlib-19.1.0-cp35-cp35m-win_amd64.whl

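Then put it in your current directory and run pip install dlib-19.1.0-cp35-cp35m-win_amd64.whl

That should install dlib. Of course you can install it some other way; there are plenty of solutions online. Plain pip may also just work (it did on my machine, while my roommate's hit the error above and needed this detour).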
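
A wheel only installs when the tags in its filename match the interpreter: cp35 means CPython 3.5 and win_amd64 means 64-bit Windows, which is why this file fits the python=3.5 environment created earlier. As a rough sketch (the helper names `parse_wheel_name` and `matches_interpreter` are made up here, and optional build tags are not handled), the fields can be read like this:

```python
import sys


def parse_wheel_name(filename):
    """Split a wheel filename into its name, version and compatibility tags."""
    if not filename.endswith('.whl'):
        raise ValueError('not a wheel file: ' + filename)
    stem = filename[:-len('.whl')]
    # Layout: name-version-pythontag-abitag-platformtag (no build tag assumed).
    name, version, py_tag, abi_tag, platform_tag = stem.split('-')
    return {'name': name, 'version': version, 'python': py_tag,
            'abi': abi_tag, 'platform': platform_tag}


def matches_interpreter(py_tag, version_info=None):
    """Check a CPython tag such as 'cp35' against an interpreter version."""
    if version_info is None:
        version_info = sys.version_info
    expected = 'cp%d%d' % (version_info[0], version_info[1])
    return py_tag == expected


if __name__ == '__main__':
    info = parse_wheel_name('dlib-19.1.0-cp35-cp35m-win_amd64.whl')
    print(info)
    print('fits this interpreter:', matches_interpreter(info['python']))
```

If the second line prints False, pip will reject the file as unsupported on this platform, and a wheel built for your Python version is needed instead.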

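Installing sklearn

This one is easy; it is used in the training step. The package that provides import sklearn is published on PyPI as scikit-learn.

pip install scikit-learn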

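Running the example

The blogger's project has four parts:

get_my_faces.py: capture faces, detect and crop them, and use the crops as the source data
set_other_faces.py: crop faces from 14000 photos of other people to use as training data
train_faces.py: train the model
is_my_face.py: capture faces in real time and decide whether they match the face recorded in the first step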

get_my_faces

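This step can crop faces with dlib's face detector or with the one built into opencv. My roommate and I tried both: opencv's is faster, but detection is poorer, and for a video of the same size it produces far fewer images (possibly because something in the code was not written well).

The original blogger's program takes one photo, then detects and crops it, one at a time, which is slow. I changed it to record a video and then detect and crop on every frame, which is much faster. (Press q to stop recording; the rest then runs automatically.)

After copying the code, adjust some parameters as needed, for example the path to opencv's haar cascade file.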

import cv2
import os
import dlib
import sys
import random
import shutil

 
 
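def make_video():
    """Record a video with opencv."""
    # Start from a clean output folder (no error if it does not exist yet)
    shutil.rmtree('./my_faces', ignore_errors=True)
    cap = cv2.VideoCapture(0)  # default camera
    # Video codec
    fourcc = cv2.VideoWriter_fourcc(*"DIVX")
    out = cv2.VideoWriter('233.avi', fourcc, 20.0, (640,480))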
    while(cap.isOpened()):
        ret, frame = cap.read()
        if ret:
            out.write(frame)
            # Show the live preview
            cv2.imshow('frame',frame)
            # Press q to stop recording and release the camera
            if cv2.waitKey(1) & 0xFF == ord('q'):
                break
        else:
            break
    cap.release()
    out.release()
    cv2.destroyAllWindows()




# Change the brightness and contrast of an image
def relight(img, light=1, bias=0):
    w = img.shape[1]
    h = img.shape[0]
    #image = []
    for i in range(0,w):
        for j in range(0,h):
            for c in range(3):
                tmp = int(img[j,i,c]*light + bias)
                if tmp > 255:
                    tmp = 255
                elif tmp < 0:
                    tmp = 0
                img[j,i,c] = tmp
    return img

def hhh():
    # dlib-based version
    output_dir = './my_faces'
    size = 64

    if not os.path.exists(output_dir):
        os.makedirs(output_dir)
    # Use dlib's built-in frontal_face_detector as the face detector
    detector = dlib.get_frontal_face_detector()
    # Open the input stream; it can be a camera or a video file
    #camera = cv2.VideoCapture(0)
    camera = cv2.VideoCapture("233.avi")

    index = 1
    while True:
        if (index <= 10000):
            print('Being processed picture %s' % index)
            # Read one frame from the video
            success, img = camera.read()
            # Stop at the end of the video, before touching the empty frame
            if not success:
                break
            # Convert to grayscale
            gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            # Detect faces with the detector
            dets = detector(gray_img, 1)

            for i, d in enumerate(dets):
                # Clamp negative coordinates to 0; rows are x1:y1, columns are x2:y2
                x1 = d.top() if d.top() > 0 else 0
                y1 = d.bottom() if d.bottom() > 0 else 0
                x2 = d.left() if d.left() > 0 else 0
                y2 = d.right() if d.right() > 0 else 0

                face = img[x1:y1,x2:y2]
                # Skip empty crops, which would make cv2.resize fail
                if face.size == 0:
                    continue
                # Randomize contrast and brightness to make the samples more varied
                face = relight(face, random.uniform(0.5, 1.5), random.randint(-50, 50))

                face = cv2.resize(face, (size,size))

                cv2.imshow('image', face)

                cv2.imwrite(output_dir+'/'+str(index)+'.jpg', face)

                index += 1
            key = cv2.waitKey(30) & 0xff
            if key == 27:
                break
        else:
            print('Finished!')
            break



def hhhh():
    # opencv-based version
    output_dir = './my_faces'
    size = 64
    if not os.path.exists(output_dir):
        os.makedirs(output_dir)
    # Load the haar cascade classifier (change this path to match your own environment)
    haar = cv2.CascadeClassifier(r'G:\DIP\Anaconda3\envs\test1\Library\etc\haarcascades\haarcascade_frontalface_default.xml')

    # Open the input stream; it can be a camera or a video file
    camera = cv2.VideoCapture("233.avi")

    n = 1
    while 1:
        if (n <= 10000):
            print('It`s processing %s image.' % n)
            # Read one frame
            success, img = camera.read()
            # Stop at the end of the video, before touching the empty frame
            if not success:
                break

            gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            faces = haar.detectMultiScale(gray_img, 1.3, 5)
            for f_x, f_y, f_w, f_h in faces:
                face = img[f_y:f_y+f_h, f_x:f_x+f_w]
                face = cv2.resize(face, (64,64))
                '''
                if n % 3 == 1:
                    face = relight(face, 1, 50)
                elif n % 3 == 2:
                    face = relight(face, 0.5, 0)
                '''
                face = relight(face, random.uniform(0.5, 1.5), random.randint(-50, 50))
                cv2.imshow('img', face)
                cv2.imwrite(output_dir+'/'+str(n)+'.jpg', face)
                n+=1
            key = cv2.waitKey(30) & 0xff
            if key == 27:
                break
        else:
            break

if __name__ == '__main__':
    make_video()
    hhh()
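
The relight function in the script visits every pixel in a Python loop, which gets slow when it runs on thousands of crops. As a hedged alternative (not part of the original project; the name `relight_fast` is made up here), the same scale, shift and clamp can be done with numpy array operations. Unlike the loop version, this sketch returns a new array instead of modifying its input in place.

```python
import numpy as np


def relight_fast(img, light=1.0, bias=0):
    """Scale pixel values by `light`, add `bias`, and clamp to [0, 255]."""
    # Work in float so intermediate values may leave the 0..255 range.
    adjusted = img.astype(np.float64) * light + bias
    # Clamp first, then truncate back to 8-bit integers.
    return np.clip(adjusted, 0, 255).astype(np.uint8)


if __name__ == '__main__':
    pixel = np.array([[[0, 100, 200]]], dtype=np.uint8)
    print(relight_fast(pixel, 1.5, 10))
```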

set_other_faces

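This step detects and crops the faces in that pile of other people's photos.

First download the photos, unzip them, and rename the folder to input_img. (If you only want to check that the whole project works, you can delete half of the photos; otherwise it may run for about 10 minutes.)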

# -*- coding: utf-8 -*-
import sys
import os
import cv2
import dlib

input_dir = './input_img'
output_dir = './other_faces'
size = 64

if not os.path.exists(output_dir):
    os.makedirs(output_dir)

# Use dlib's built-in frontal_face_detector as the face detector
detector = dlib.get_frontal_face_detector()

index = 1
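for (path, dirnames, filenames) in os.walk(input_dir):
    for filename in filenames:
        if filename.endswith('.jpg'):
            print('Being processed picture %s' % index)
            img_path = path+'/'+filename
            # Read the image from file
            img = cv2.imread(img_path)
            # cv2.imread returns None for unreadable files; skip them
            if img is None:
                continue
            # Convert to grayscale
            gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            # Detect faces; dets holds the detected rectangles
            dets = detector(gray_img, 1)

            # enumerate yields each rectangle together with its index
            # i is the index of the face
            # left: distance from the face's left edge to the image's left border; right: distance from the face's right edge to the image's left border
            # top: distance from the face's top edge to the image's top border; bottom: distance from the face's bottom edge to the image's top border
            for i, d in enumerate(dets):
                x1 = d.top() if d.top() > 0 else 0
                y1 = d.bottom() if d.bottom() > 0 else 0
                x2 = d.left() if d.left() > 0 else 0
                y2 = d.right() if d.right() > 0 else 0
                # img[y:y+h,x:x+w]
                face = img[x1:y1,x2:y2]
                # Skip empty crops, which would make cv2.resize fail
                if face.size == 0:
                    continue
                # Resize the crop
                face = cv2.resize(face, (size,size))
                cv2.imshow('image',face)
                # Save the crop
                cv2.imwrite(output_dir+'/'+str(index)+'.jpg', face)
                index += 1

            key = cv2.waitKey(30) & 0xff
            if key == 27:
                sys.exit(0)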
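
The scripts name the crop bounds x1, y1, x2, y2, although x1:y1 is actually the row range (top to bottom) and x2:y2 the column range (left to right). A small sketch with explicit names may make the clamping easier to follow. The helper `clamp_box` is hypothetical; it also clamps the lower and right edges and reports boxes that fall entirely outside the image, which the scripts themselves do not do.

```python
def clamp_box(top, bottom, left, right, height, width):
    """Clamp a face rectangle to an image of the given height and width.

    Returns (row_start, row_end, col_start, col_end), or None when no part
    of the rectangle lies inside the image.
    """
    row_start = max(top, 0)
    row_end = min(bottom, height)
    col_start = max(left, 0)
    col_end = min(right, width)
    if row_start >= row_end or col_start >= col_end:
        return None
    return row_start, row_end, col_start, col_end


if __name__ == '__main__':
    # A detection that sticks out past the top and right edges of a 640x480 frame.
    print(clamp_box(-5, 40, 10, 700, height=480, width=640))
```

With a dlib rectangle d, the call would be clamp_box(d.top(), d.bottom(), d.left(), d.right(), img.shape[0], img.shape[1]), and the crop is then img[row_start:row_end, col_start:col_end].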

train_faces

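This step trains the model. It stalls for a moment at first and then starts running. Check whether it is running on the GPU: on the CPU it is painfully slow, while on the GPU it finishes in under a minute or so.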

import tensorflow as tf
import cv2
import numpy as np
import os
import random
import sys
from sklearn.model_selection import train_test_split

my_faces_path = './my_faces'
other_faces_path = './other_faces'
size = 64

imgs = []
labs = []

def getPaddingSize(img):
    h, w, _ = img.shape
    top, bottom, left, right = (0,0,0,0)
    longest = max(h, w)

    if w < longest:
        tmp = longest - w
        # // is integer (floor) division
        left = tmp // 2
        right = tmp - left
    elif h < longest:
        tmp = longest - h
        top = tmp // 2
        bottom = tmp - top
    else:
        pass
    return top, bottom, left, right

def readData(path , h=size, w=size):
    for filename in os.listdir(path):
        if filename.endswith('.jpg'):
            filename = path + '/' + filename

            img = cv2.imread(filename)

            top,bottom,left,right = getPaddingSize(img)
            # Pad the image borders so it becomes square before resizing
            img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=[0,0,0])
            img = cv2.resize(img, (h, w))

            imgs.append(img)
            labs.append(path)

readData(my_faces_path)
readData(other_faces_path)
# Convert the image data and labels to arrays
imgs = np.array(imgs)
labs = np.array([[0,1] if lab == my_faces_path else [1,0] for lab in labs])
# Randomly split into training and test sets
train_x,test_x,train_y,test_y = train_test_split(imgs, labs, test_size=0.05, random_state=random.randint(0,100))
# Arguments: number of images, image height, width, channels
train_x = train_x.reshape(train_x.shape[0], size, size, 3)
test_x = test_x.reshape(test_x.shape[0], size, size, 3)
# Scale pixel values into the range [0, 1]
train_x = train_x.astype('float32')/255.0
test_x = test_x.astype('float32')/255.0

print('train size:%s, test size:%s' % (len(train_x), len(test_x)))
# Batch size: 100 images at a time
batch_size = 100
num_batch = len(train_x) // batch_size

x = tf.placeholder(tf.float32, [None, size, size, 3])
y_ = tf.placeholder(tf.float32, [None, 2])

keep_prob_5 = tf.placeholder(tf.float32)
keep_prob_75 = tf.placeholder(tf.float32)

def weightVariable(shape):
    init = tf.random_normal(shape, stddev=0.01)
    return tf.Variable(init)

def biasVariable(shape):
    init = tf.random_normal(shape)
    return tf.Variable(init)

def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1,1,1,1], padding='SAME')

def maxPool(x):
    return tf.nn.max_pool(x, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME')

def dropout(x, keep):
    return tf.nn.dropout(x, keep)

def cnnLayer():
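    # Layer 1
    W1 = weightVariable([3,3,3,32]) # kernel size (3,3), input channels (3), output channels (32)
    b1 = biasVariable([32])
    # Convolution
    conv1 = tf.nn.relu(conv2d(x, W1) + b1)
    # Pooling
    pool1 = maxPool(conv1)
    # Dropout to reduce overfitting: randomly zeroes some activations during training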
    drop1 = dropout(pool1, keep_prob_5)

    # Layer 2
    W2 = weightVariable([3,3,32,64])
    b2 = biasVariable([64])
    conv2 = tf.nn.relu(conv2d(drop1, W2) + b2)
    pool2 = maxPool(conv2)
    drop2 = dropout(pool2, keep_prob_5)

    # Layer 3
    W3 = weightVariable([3,3,64,64])
    b3 = biasVariable([64])
    conv3 = tf.nn.relu(conv2d(drop2, W3) + b3)
    pool3 = maxPool(conv3)
    drop3 = dropout(pool3, keep_prob_5)

    # Fully connected layer
    Wf = weightVariable([8*8*64, 512])
    bf = biasVariable([512])
    drop3_flat = tf.reshape(drop3, [-1, 8*8*64])
    dense = tf.nn.relu(tf.matmul(drop3_flat, Wf) + bf)
    dropf = dropout(dense, keep_prob_75)

    # Output layer
    Wout = weightVariable([512,2])
    bout = biasVariable([2])
    #out = tf.matmul(dropf, Wout) + bout
    out = tf.add(tf.matmul(dropf, Wout), bout)
    return out

def cnnTrain():
    out = cnnLayer()

    cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=out, labels=y_))

    train_step = tf.train.AdamOptimizer(0.01).minimize(cross_entropy)
    # Compare predicted and true labels, then average; tf.cast converts the booleans to floats
    accuracy = tf.reduce_mean(tf.cast(tf.equal(tf.argmax(out, 1), tf.argmax(y_, 1)), tf.float32))
    # Record loss and accuracy for tensorboard
    tf.summary.scalar('loss', cross_entropy)
    tf.summary.scalar('accuracy', accuracy)
    merged_summary_op = tf.summary.merge_all()
    # Create the saver used to write checkpoints
    saver = tf.train.Saver()

    with tf.Session() as sess:

        sess.run(tf.global_variables_initializer())

        summary_writer = tf.summary.FileWriter('./tmp', graph=tf.get_default_graph())

        for n in range(10):
            # Take batch_size (100) images at a time
            for i in range(num_batch):
                batch_x = train_x[i*batch_size : (i+1)*batch_size]
                batch_y = train_y[i*batch_size : (i+1)*batch_size]
                # Run one training step; three ops are evaluated and three values come back
                _,loss,summary = sess.run([train_step, cross_entropy, merged_summary_op],
                                           feed_dict={x:batch_x,y_:batch_y, keep_prob_5:0.5,keep_prob_75:0.75})
                summary_writer.add_summary(summary, n*num_batch+i)
                # Print the loss
                print(n*num_batch+i, loss)

                if (n*num_batch+i) % 100 == 0:
                    # Accuracy on the test data (dropout disabled)
                    acc = accuracy.eval({x:test_x, y_:test_y, keep_prob_5:1.0, keep_prob_75:1.0})
                    print(n*num_batch+i, acc)
                    # Save and exit once accuracy exceeds 0.98 (only after the first three passes)
                    if acc > 0.98 and n > 2:
                        saver.save(sess, './train_faces.model', global_step=n*num_batch+i)
                        sys.exit(0)
        print('accuracy below 0.98, exited!')

cnnTrain()
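
The size of the fully connected layer follows from the input size: the convolutions use stride 1 with 'SAME' padding and keep the spatial size, while each of the three max-pool layers halves it, so a 64x64 image becomes 32, 16, then 8 pixels on a side with 64 channels, giving 8*8*64 = 4096 inputs. The arithmetic can be sketched as follows (the helper names `pooled_size` and `flat_features` are made up for this illustration):

```python
import math


def pooled_size(size, num_pools, stride=2):
    """Spatial size after `num_pools` max-pool layers with 'SAME' padding."""
    for _ in range(num_pools):
        # With 'SAME' padding the output size is ceil(input / stride).
        size = math.ceil(size / stride)
    return size


def flat_features(size, num_pools, channels):
    """Number of values fed into the first fully connected layer."""
    side = pooled_size(size, num_pools)
    return side * side * channels


if __name__ == '__main__':
    print(pooled_size(64, 3))        # side length after three pools
    print(flat_features(64, 3, 64))  # inputs to the fully connected layer
```

If the size constant at the top of the script is changed, the 8*8*64 shape has to be recomputed the same way, otherwise the reshape no longer matches the pooled tensor.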

is_my_face

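Finally, recognition. Running this script opens two windows: a live camera window and a recognition window (with a blue box drawn around the face).

If the detected face belongs to the person recorded earlier, True is printed in cmd, otherwise False. If no face is detected in the frame, the recognition window freezes.

A recording of roughly 2-3 minutes in the first step is enough for high accuracy.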

import tensorflow as tf
import cv2
import dlib
import numpy as np
import os
import random
import sys
import time
from sklearn.model_selection import train_test_split

my_faces_path = './my_faces'
other_faces_path = './other_faces'
size = 64

imgs = []
labs = []

def getPaddingSize(img):
    h, w, _ = img.shape
    top, bottom, left, right = (0,0,0,0)
    longest = max(h, w)

    if w < longest:
        tmp = longest - w
        # // is integer (floor) division
        left = tmp // 2
        right = tmp - left
    elif h < longest:
        tmp = longest - h
        top = tmp // 2
        bottom = tmp - top
    else:
        pass
    return top, bottom, left, right

def readData(path , h=size, w=size):
    for filename in os.listdir(path):
        if filename.endswith('.jpg'):
            filename = path + '/' + filename

            img = cv2.imread(filename)

            top,bottom,left,right = getPaddingSize(img)
            # Pad the image borders so it becomes square before resizing
            img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=[0,0,0])
            img = cv2.resize(img, (h, w))

            imgs.append(img)
            labs.append(path)

readData(my_faces_path)
readData(other_faces_path)
# Convert the image data and labels to arrays
imgs = np.array(imgs)
labs = np.array([[0,1] if lab == my_faces_path else [1,0] for lab in labs])
# Randomly split into training and test sets
train_x,test_x,train_y,test_y = train_test_split(imgs, labs, test_size=0.05, random_state=random.randint(0,100))
# Arguments: number of images, image height, width, channels
train_x = train_x.reshape(train_x.shape[0], size, size, 3)
test_x = test_x.reshape(test_x.shape[0], size, size, 3)
# Scale pixel values into the range [0, 1]
train_x = train_x.astype('float32')/255.0
test_x = test_x.astype('float32')/255.0

print('train size:%s, test size:%s' % (len(train_x), len(test_x)))
# Batch size: 128 images at a time
batch_size = 128
num_batch = len(train_x) // batch_size

x = tf.placeholder(tf.float32, [None, size, size, 3])
y_ = tf.placeholder(tf.float32, [None, 2])

keep_prob_5 = tf.placeholder(tf.float32)
keep_prob_75 = tf.placeholder(tf.float32)

def weightVariable(shape):
    init = tf.random_normal(shape, stddev=0.01)
    return tf.Variable(init)

def biasVariable(shape):
    init = tf.random_normal(shape)
    return tf.Variable(init)

def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1,1,1,1], padding='SAME')

def maxPool(x):
    return tf.nn.max_pool(x, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME')

def dropout(x, keep):
    return tf.nn.dropout(x, keep)

def cnnLayer():
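    # Layer 1
    W1 = weightVariable([3,3,3,32]) # kernel size (3,3), input channels (3), output channels (32)
    b1 = biasVariable([32])
    # Convolution
    conv1 = tf.nn.relu(conv2d(x, W1) + b1)
    # Pooling
    pool1 = maxPool(conv1)
    # Dropout to reduce overfitting: randomly zeroes some activations during training
    drop1 = dropout(pool1, keep_prob_5)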

    # Layer 2
    W2 = weightVariable([3,3,32,64])
    b2 = biasVariable([64])
    conv2 = tf.nn.relu(conv2d(drop1, W2) + b2)
    pool2 = maxPool(conv2)
    drop2 = dropout(pool2, keep_prob_5)

    # Layer 3
    W3 = weightVariable([3,3,64,64])
    b3 = biasVariable([64])
    conv3 = tf.nn.relu(conv2d(drop2, W3) + b3)
    pool3 = maxPool(conv3)
    drop3 = dropout(pool3, keep_prob_5)

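    # Fully connected layer
    # A 64x64 input halved by three pooling layers is 8x8 with 64 channels (same 4096 values as before)
    Wf = weightVariable([8*8*64, 512])
    bf = biasVariable([512])
    drop3_flat = tf.reshape(drop3, [-1, 8*8*64])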
    dense = tf.nn.relu(tf.matmul(drop3_flat, Wf) + bf)
    dropf = dropout(dense, keep_prob_75)

    # Output layer
    Wout = weightVariable([512,2])
    bout = biasVariable([2])
    out = tf.add(tf.matmul(dropf, Wout), bout)
    return out

output = cnnLayer()  
predict = tf.argmax(output, 1)  
   
saver = tf.train.Saver()  
sess = tf.Session()  
saver.restore(sess, tf.train.latest_checkpoint('.'))   
   
def is_my_face(image):  
    res = sess.run(predict, feed_dict={x: [image/255.0], keep_prob_5:1.0, keep_prob_75: 1.0})  
    if res[0] == 1:  
        return True  
    else:  
        return False  

# Use dlib's built-in frontal_face_detector as the face detector
detector = dlib.get_frontal_face_detector()

cam = cv2.VideoCapture(0)  
   
while True:
    time.sleep(0.2)
    ret, img = cam.read()
    # Stop if the camera did not return a frame
    if not ret:
        break
    gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    dets = detector(gray_image, 1)
    if not len(dets):
        # No face found: only refresh the live window
        cv2.imshow('img', img)
        key = cv2.waitKey(30) & 0xff
        if key == 27:
            sys.exit(0)

    for i, d in enumerate(dets):
        # Clamp negative coordinates to 0; rows are x1:y1, columns are x2:y2
        x1 = d.top() if d.top() > 0 else 0
        y1 = d.bottom() if d.bottom() > 0 else 0
        x2 = d.left() if d.left() > 0 else 0
        y2 = d.right() if d.right() > 0 else 0
        face = img[x1:y1,x2:y2]
        # Skip empty crops, which would make cv2.resize fail
        if face.size == 0:
            continue
        # Resize the crop to the network input size
        face = cv2.resize(face, (size,size))
        print('Is this my face? %s' % is_my_face(face))

        # Draw a blue box (colors are in BGR order) around the face
        cv2.rectangle(img, (x2,x1),(y2,y1), (255,0,0),3)
        cv2.imshow('image',img)
        key = cv2.waitKey(30) & 0xff
        if key == 27:
            sys.exit(0)
  
sess.close()
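
The True/False output depends on how the labels were encoded for training: images from ./my_faces get the one-hot label [0, 1] and everything else gets [1, 0], so index 1 of the network output stands for "my face", and is_my_face returns True when the argmax of the output is 1. A plain-Python sketch of that convention, with made-up helper names and example numbers:

```python
def one_hot(is_mine):
    """Label encoding used by the scripts: [0, 1] for my face, [1, 0] otherwise."""
    return [0, 1] if is_mine else [1, 0]


def argmax(values):
    """Index of the largest value (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def decode(logits):
    """Map the two network outputs back to a True/False answer."""
    return argmax(logits) == 1


if __name__ == '__main__':
    print(one_hot(True), decode([0.2, 3.1]))    # my face
    print(one_hot(False), decode([2.0, -1.0]))  # someone else
```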