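Abstract: There are countless deep learning projects out there. For beginners, a suitable and interesting project makes a real difference. This article rounds up several fun computer vision projects for interested readers to try hands-on.
Since AlexNet's resounding success in the 2012 ImageNet challenge, computer vision and deep learning have enjoyed a fresh wave of research interest. Computer vision, taken literally, means giving computers and other machines something like human vision: researchers study how to make machines perform tasks such as image classification and object detection. Over the past decade or so the field's achievements have been startling, and some results have already surpassed human-level performance.
People who want to get started in deep learning, or to work in it, usually begin with computer vision. Plenty of online material covers the theory, but hands-on practice is just as essential. This article collects some practical projects in computer vision and deep learning so that readers can pick the ones that interest them and try them out.
If you are unfamiliar with the terms above, the following articles provide more background:
On to the main content: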
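Project: akshaybahadur21/HandMovementTracking
Video tracking is the process of locating a moving object (or several objects) over time using a camera. It has many uses, such as human-computer interaction, security and surveillance, video communication and compression, augmented reality, traffic control, medical imaging, and video editing.
To perform video tracking, an algorithm analyzes consecutive video frames and outputs the movement of the target between frames. Many algorithms exist for this problem, each with its own strengths and weaknesses, so it is important to consider the actual application when choosing one. A visual tracking system has two main components: target representation and localization, and filtering and data association.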
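To make "outputting the movement of the target between frames" concrete, here is a minimal sketch that works on two synthetic binary masks rather than real camera frames. The helper names `mask_centroid` and `displacement` are hypothetical and are not part of the project; they only illustrate the idea of comparing the target's centroid in consecutive frames.

```python
import numpy as np


def mask_centroid(mask):
    """Return the (x, y) centroid of the non-zero pixels, or None if empty."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    return float(xs.mean()), float(ys.mean())


def displacement(prev_mask, next_mask):
    """Return (dx, dy) of the target between two frames, or None if it is lost."""
    p, q = mask_centroid(prev_mask), mask_centroid(next_mask)
    if p is None or q is None:
        return None
    return q[0] - p[0], q[1] - p[1]


# Two synthetic 10x10 "frames": a 2x2 blob moves 3 px right and 1 px down.
frame_a = np.zeros((10, 10), dtype=np.uint8)
frame_a[2:4, 1:3] = 255
frame_b = np.zeros((10, 10), dtype=np.uint8)
frame_b[3:5, 4:6] = 255

print(displacement(frame_a, frame_b))
```

In a real tracker the masks would come from a segmentation step such as color thresholding, and the per-frame centroids would be kept in a short history so the trail can be drawn.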
The project code is as follows:
import numpy as np
import cv2
from collections import deque

cap = cv2.VideoCapture(0)
pts = deque(maxlen=64)  # most recent tracked centers, newest first

# HSV range of the tracked color. Despite the variable names, hue 110-130
# corresponds to blue on OpenCV's 0-179 hue scale.
Lower_green = np.array([110, 50, 50])
Upper_green = np.array([130, 255, 255])

while True:
    ret, img = cap.read()
    if not ret:  # stop when no frame can be read
        break
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    kernel = np.ones((5, 5), np.uint8)
    # Threshold the color range, then clean up the mask.
    mask = cv2.inRange(hsv, Lower_green, Upper_green)
    mask = cv2.erode(mask, kernel, iterations=2)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    # mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    mask = cv2.dilate(mask, kernel, iterations=1)
    res = cv2.bitwise_and(img, img, mask=mask)
    # [-2:] keeps this working whether findContours returns 2 or 3 values.
    cnts, heir = cv2.findContours(mask.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]
    center = None

    if len(cnts) > 0:
        # Track the largest contour.
        c = max(cnts, key=cv2.contourArea)
        ((x, y), radius) = cv2.minEnclosingCircle(c)
        M = cv2.moments(c)
        if M["m00"] > 0:  # guard against a zero-area contour
            center = (int(M["m10"] / M["m00"]), int(M["m01"] / M["m00"]))
        if radius > 5 and center is not None:
            cv2.circle(img, (int(x), int(y)), int(radius), (0, 255, 255), 2)
            cv2.circle(img, center, 5, (0, 0, 255), -1)

    pts.appendleft(center)
    # Draw the trail; older segments are drawn thinner.
    for i in range(1, len(pts)):
        if pts[i - 1] is None or pts[i] is None:
            continue
        thick = int(np.sqrt(len(pts) / float(i + 1)) * 2.5)
        cv2.line(img, pts[i - 1], pts[i], (0, 0, 225), thick)

    cv2.imshow("Frame", img)
    cv2.imshow("mask", mask)
    cv2.imshow("res", res)

    k = cv2.waitKey(30) & 0xFF
    if k == 32:  # space bar quits
        break

# cleanup the camera and close any open windows
cap.release()
cv2.destroyAllWindows()
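The original script is only 54 lines long, which is pretty simple. The project depends on the OpenCV package, so OpenCV must be installed on your computer before you can run it. Installation instructions for Mac, Ubuntu, and Windows:
Mac:
Ubuntu:
Windows: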
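Project: akshaybahadur21/Drowsiness_Detection
Drowsiness detection matters in settings such as long-distance driving and factory work. Driving for long periods easily leads to drowsiness, which can in turn cause accidents. This code watches the user's eyes and raises an alarm when the user becomes drowsy.
Dependencies
cv2 imutils dlib scipy
Algorithm
Each eye is represented by 6 (x, y) coordinates, starting at the left corner of the eye and then proceeding clockwise around it:
Condition
The code checks 20 consecutive frames; if the eye aspect ratio stays below 0.25 throughout, an alarm is raised.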
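The condition above can be sketched as follows. This assumes the commonly used eye aspect ratio definition, EAR = (|p2 - p6| + |p3 - p5|) / (2 |p1 - p4|), over the six landmarks p1..p6; the project may compute it differently. The function names and the sample coordinates are hypothetical, while the 0.25 threshold and the 20-frame window come from the text.

```python
from math import dist

EAR_THRESHOLD = 0.25  # from the text above
CONSEC_FRAMES = 20    # from the text above


def eye_aspect_ratio(eye):
    """eye: six (x, y) landmarks p1..p6, starting at the left corner."""
    p1, p2, p3, p4, p5, p6 = eye
    vertical = dist(p2, p6) + dist(p3, p5)
    horizontal = dist(p1, p4)
    return vertical / (2.0 * horizontal)


def should_alert(ear_per_frame):
    """True once EAR stays below the threshold for CONSEC_FRAMES frames in a row."""
    run = 0
    for ear in ear_per_frame:
        run = run + 1 if ear < EAR_THRESHOLD else 0
        if run >= CONSEC_FRAMES:
            return True
    return False


# Made-up landmark coordinates for an open and a nearly closed eye.
open_eye = [(0, 0), (2, 2), (4, 2), (6, 0), (4, -2), (2, -2)]
closed_eye = [(0, 0), (2, 0.5), (4, 0.5), (6, 0), (4, -0.5), (2, -0.5)]

print(eye_aspect_ratio(open_eye), eye_aspect_ratio(closed_eye))
print(should_alert([eye_aspect_ratio(closed_eye)] * CONSEC_FRAMES))
```

Requiring a run of consecutive frames, rather than a single low reading, keeps an ordinary blink from triggering the alarm.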
Relationship
Implementation
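Project: akshaybahadur21/Digit-Recognizer
This code classifies handwritten digits with softmax regression. It has quite a few dependencies, so it is convenient to install Conda, which bundles the packages that machine learning work needs.
Description
Softmax regression (also known as multinomial logistic regression, maximum entropy classifier, or multi-class logistic regression) is a generalization of logistic regression that can be used for multi-class classification, assuming the classes are mutually exclusive. By contrast, the standard logistic regression model is used for binary classification tasks.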
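As a small illustration of the description above, the sketch below turns raw class scores into probabilities with softmax and scores them with cross-entropy. It is a standalone NumPy example with made-up scores, not the project's code, and the function names are chosen only for this illustration.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax; logits has shape (n_examples, n_classes)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # avoids overflow
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def cross_entropy(probs, labels):
    """Mean negative log-likelihood of integer class labels."""
    n = probs.shape[0]
    return float(-np.log(probs[np.arange(n), labels]).mean())


# Two examples, three mutually exclusive classes (made-up scores).
logits = np.array([[2.0, 1.0, 0.1],
                   [0.0, 0.0, 0.0]])
probs = softmax(logits)

print(probs.sum(axis=1))     # each row sums to 1
print(probs.argmax(axis=1))  # predicted class per example
print(cross_entropy(probs, np.array([0, 2])))
```

With only two classes this reduces to ordinary logistic regression, which is why softmax regression is described as its generalization.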
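Python implementation
The code uses the MNIST handwritten digit dataset, in which each image is 28×28 pixels. Three methods are tried for classifying the digits 0 to 9: logistic regression, a shallow neural network, and a deep neural network.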
# Logistic (softmax) regression
import numpy as np
import matplotlib.pyplot as plt


def softmax(z):
    """z is (examples, classes); returns probabilities as (classes, examples)."""
    z = z - np.max(z)  # shift for numerical stability
    sm = (np.exp(z).T / np.sum(np.exp(z), axis=1))
    return sm


def initialize(dim1, dim2):
    """Return zero-initialized weights (dim1, dim2) and biases (dim2, 1)."""
    w = np.zeros(shape=(dim1, dim2))
    b = np.zeros(shape=(dim2, 1))
    return w, b


def propagate(w, b, X, Y):
    """
    :param w: weights, shape (no of features, no of classes)
    :param b: bias, shape (no of classes, 1)
    :param X: data, shape (no of features, no of examples)
    :param Y: one-hot true labels, shape (no of classes, no of examples)
    :return: gradients and cost
    """
    m = X.shape[1]  # number of examples

    # Forward prop
    A = softmax((np.dot(w.T, X) + b).T)
    cost = (-1 / m) * np.sum(Y * np.log(A))

    # Backward prop
    dw = (1 / m) * np.dot(X, (A - Y).T)
    # Sum per class; summing over everything is always zero because each
    # column of A and of Y sums to 1, so the bias would never be updated.
    db = (1 / m) * np.sum(A - Y, axis=1, keepdims=True)

    cost = np.squeeze(cost)
    grads = {"dw": dw, "db": db}
    return grads, cost


def optimize(w, b, X, Y, num_iters, alpha, print_cost=False):
    """
    Run gradient descent.
    :param num_iters: number of gradient descent iterations
    :param alpha: learning rate
    :return: learned parameters, last gradients, recorded costs
    """
    costs = []
    for i in range(num_iters):
        grads, cost = propagate(w, b, X, Y)
        dw = grads["dw"]
        db = grads["db"]
        w = w - alpha * dw
        b = b - alpha * db

        # Record the cost every 50 iterations
        if i % 50 == 0:
            costs.append(cost)

        # Print the cost every 50 iterations
        if print_cost and i % 50 == 0:
            print("Cost after iteration %i: %f" % (i, cost))

    params = {"w": w, "b": b}
    grads = {"dw": dw, "db": db}
    return params, grads, costs


def predict(w, b, X):
    """Return the most probable class index for every column of X."""
    y_pred = np.argmax(softmax((np.dot(w.T, X) + b).T), axis=0)
    return y_pred


def model(X_train, Y_train, Y, X_test, Y_test, num_iters, alpha, print_cost):
    """
    Train and evaluate the classifier.
    :param Y_train: one-hot training labels
    :param Y: integer training labels, used for the accuracy
    :param Y_test: integer test labels
    """
    w, b = initialize(X_train.shape[0], Y_train.shape[0])
    parameters, grads, costs = optimize(w, b, X_train, Y_train, num_iters, alpha, print_cost)

    w = parameters["w"]
    b = parameters["b"]

    y_prediction_train = predict(w, b, X_train)
    y_prediction_test = predict(w, b, X_test)

    print("Train accuracy: {} %".format(sum(y_prediction_train == Y) / (float(len(Y))) * 100))
    print("Test accuracy: {} %".format(sum(y_prediction_test == Y_test) / (float(len(Y_test))) * 100))

    d = {"costs": costs,
         "Y_prediction_test": y_prediction_test,
         "Y_prediction_train": y_prediction_train,
         "w": w,
         "b": b,
         "learning_rate": alpha,
         "num_iterations": num_iters}

    # Plot learning curve (with costs)
    # costs = np.squeeze(d['costs'])
    # plt.plot(costs)
    # plt.ylabel('cost')
    # plt.xlabel('iterations (per hundreds)')
    # plt.title("Learning rate =" + str(d["learning_rate"]))
    # plt.plot()
    # plt.show()
    # plt.close()

    # pri(X_test.T, y_prediction_test)
    return d


def pri(X_test, y_prediction_test):
    """Show the third test image together with its prediction."""
    example = X_test[2, :]
    print("Prediction for the example is ", y_prediction_test[2])
    plt.imshow(np.reshape(example, [28, 28]))
    plt.plot()
    plt.show()
# Shallow neural network (one hidden layer)
import numpy as np
import matplotlib.pyplot as plt


def softmax(z):
    """z is (examples, classes); returns probabilities as (classes, examples)."""
    z = z - np.max(z)  # shift for numerical stability
    sm = (np.exp(z).T / np.sum(np.exp(z), axis=1))
    return sm


def layers(X, Y):
    """Return the input size and the number of classes."""
    n_x = X.shape[0]
    n_y = Y.shape[0]
    return n_x, n_y


def initialize_nn(n_x, n_h, n_y):
    """Randomly initialize the parameters of the hidden and output layers."""
    np.random.seed(2)
    W1 = np.random.randn(n_h, n_x) * 0.01
    b1 = np.random.rand(n_h, 1)
    W2 = np.random.rand(n_y, n_h)
    b2 = np.random.rand(n_y, 1)

    parameters = {"W1": W1, "b1": b1, "W2": W2, "b2": b2}
    return parameters


def forward_prop(X, parameters):
    """tanh hidden layer followed by a softmax output layer."""
    W1 = parameters['W1']
    b1 = parameters['b1']
    W2 = parameters['W2']
    b2 = parameters['b2']

    Z1 = np.dot(W1, X) + b1
    A1 = np.tanh(Z1)
    Z2 = np.dot(W2, A1) + b2
    A2 = softmax(Z2.T)

    cache = {"Z1": Z1, "A1": A1, "Z2": Z2, "A2": A2}
    return A2, cache


def compute_cost(A2, Y, parameters):
    """Cross-entropy cost averaged over the examples."""
    m = Y.shape[1]
    logprobs = np.multiply(np.log(A2), Y)
    cost = - np.sum(logprobs) / m
    cost = np.squeeze(cost)
    return cost


def back_prop(parameters, cache, X, Y):
    """Gradients of the cost with respect to all parameters."""
    m = Y.shape[1]
    W2 = parameters['W2']
    A1 = cache['A1']
    A2 = cache['A2']

    dZ2 = A2 - Y
    dW2 = (1 / m) * np.dot(dZ2, A1.T)
    db2 = (1 / m) * np.sum(dZ2, axis=1, keepdims=True)
    dZ1 = np.multiply(np.dot(W2.T, dZ2), 1 - np.square(A1))  # tanh derivative
    dW1 = (1 / m) * np.dot(dZ1, X.T)
    db1 = (1 / m) * np.sum(dZ1, axis=1, keepdims=True)

    grads = {"dW1": dW1, "db1": db1, "dW2": dW2, "db2": db2}
    return grads


def update_params(parameters, grads, alpha):
    """One gradient descent step with learning rate alpha."""
    W1 = parameters['W1'] - alpha * grads['dW1']
    b1 = parameters['b1'] - alpha * grads['db1']
    W2 = parameters['W2'] - alpha * grads['dW2']
    b2 = parameters['b2'] - alpha * grads['db2']

    parameters = {"W1": W1, "b1": b1, "W2": W2, "b2": b2}
    return parameters


def model_nn(X, Y, Y_real, test_x, test_y, n_h, num_iters, alpha, print_cost):
    """Train the network; Y is one-hot, Y_real and test_y are integer labels."""
    np.random.seed(3)
    n_x, n_y = layers(X, Y)
    parameters = initialize_nn(n_x, n_h, n_y)

    costs = []
    for i in range(0, num_iters):
        A2, cache = forward_prop(X, parameters)
        cost = compute_cost(A2, Y, parameters)
        grads = back_prop(parameters, cache, X, Y)

        # Use a slightly smaller learning rate after 1500 iterations.
        if (i > 1500):
            alpha1 = 0.95 * alpha
            parameters = update_params(parameters, grads, alpha1)
        else:
            parameters = update_params(parameters, grads, alpha)

        if i % 100 == 0:
            costs.append(cost)
        if print_cost and i % 100 == 0:
            print("Cost after iteration for %i: %f" % (i, cost))

    predictions = predict_nn(parameters, X)
    print("Train accuracy: {} %".format(sum(predictions == Y_real) / (float(len(Y_real))) * 100))
    predictions = predict_nn(parameters, test_x)
    print("Test accuracy: {} %".format(sum(predictions == test_y) / (float(len(test_y))) * 100))

    # plt.plot(costs)
    # plt.ylabel('cost')
    # plt.xlabel('iterations (per hundreds)')
    # plt.title("Learning rate =" + str(alpha))
    # plt.show()

    return parameters


def predict_nn(parameters, X):
    """Return the most probable class index for every column of X."""
    A2, cache = forward_prop(X, parameters)
    predictions = np.argmax(A2, axis=0)
    return predictions
# Deep neural network (L layers)
import numpy as np
import matplotlib.pyplot as plt


def softmax(z):
    """z is (examples, classes); returns (classes, examples) and a cache of z."""
    cache = z
    z = z - np.max(z)  # shift for numerical stability
    sm = (np.exp(z).T / np.sum(np.exp(z), axis=1))
    return sm, cache


def relu(z):
    """ReLU activation; the input is cached for the backward pass."""
    s = np.maximum(0, z)
    cache = z
    return s, cache


def softmax_backward(dA, cache):
    """Backward pass through the output activation."""
    z = cache
    z = z - np.max(z)
    s = (np.exp(z).T / np.sum(np.exp(z), axis=1))
    # Uses only the diagonal term s * (1 - s); combined with dA_last in
    # L_model_backward this reduces to A_last - Y.
    dZ = dA * s * (1 - s)
    return dZ


def relu_backward(dA, cache):
    """Backward pass through ReLU: the gradient is zero where the input was <= 0."""
    Z = cache
    dZ = np.array(dA, copy=True)  # just converting dz to a correct object.
    dZ[Z <= 0] = 0
    return dZ


def initialize_parameters_deep(dims):
    """dims lists the layer sizes, starting with the input size."""
    np.random.seed(3)
    params = {}
    L = len(dims)

    for l in range(1, L):
        params['W' + str(l)] = np.random.randn(dims[l], dims[l - 1]) * 0.01
        params['b' + str(l)] = np.zeros((dims[l], 1))
    return params


def linear_forward(A, W, b):
    """Linear part of a layer: Z = W.A + b."""
    Z = np.dot(W, A) + b
    cache = (A, W, b)
    return Z, cache


def linear_activation_forward(A_prev, W, b, activation):
    """LINEAR -> ACTIVATION, where activation is "relu" or "softmax"."""
    if activation == "softmax":
        Z, linear_cache = linear_forward(A_prev, W, b)
        A, activation_cache = softmax(Z.T)
    elif activation == "relu":
        Z, linear_cache = linear_forward(A_prev, W, b)
        A, activation_cache = relu(Z)

    cache = (linear_cache, activation_cache)
    return A, cache


def L_model_forward(X, params):
    """Forward pass: [LINEAR -> RELU] * (L - 1) -> LINEAR -> SOFTMAX."""
    caches = []
    A = X
    L = len(params) // 2  # number of layers in the neural network

    # Implement [LINEAR -> RELU]*(L-1). Add "cache" to the "caches" list.
    for l in range(1, L):
        A_prev = A
        A, cache = linear_activation_forward(A_prev, params["W" + str(l)], params["b" + str(l)], activation='relu')
        caches.append(cache)

    A_last, cache = linear_activation_forward(A, params["W" + str(L)], params["b" + str(L)], activation='softmax')
    caches.append(cache)
    return A_last, caches


def compute_cost(A_last, Y):
    """Cross-entropy cost averaged over the examples."""
    m = Y.shape[1]
    cost = (-1 / m) * np.sum(Y * np.log(A_last))
    cost = np.squeeze(cost)  # To make sure your cost's shape is what we expect (e.g. this turns [[17]] into 17).
    return cost


def linear_backward(dZ, cache):
    """Gradients of the linear part of a layer."""
    A_prev, W, b = cache
    m = A_prev.shape[1]

    dW = (1. / m) * np.dot(dZ, A_prev.T)
    db = (1. / m) * np.sum(dZ, axis=1, keepdims=True)
    dA_prev = np.dot(W.T, dZ)
    return dA_prev, dW, db


def linear_activation_backward(dA, cache, activation):
    """Backward pass for one LINEAR -> ACTIVATION layer."""
    linear_cache, activation_cache = cache

    if activation == "relu":
        dZ = relu_backward(dA, activation_cache)
        dA_prev, dW, db = linear_backward(dZ, linear_cache)
    elif activation == "softmax":
        dZ = softmax_backward(dA, activation_cache)
        dA_prev, dW, db = linear_backward(dZ, linear_cache)

    return dA_prev, dW, db


def L_model_backward(A_last, Y, caches):
    """Backward pass through the whole network."""
    grads = {}
    L = len(caches)  # the number of layers
    m = A_last.shape[1]
    Y = Y.reshape(A_last.shape)  # after this line, Y is the same shape as A_last

    dA_last = - (np.divide(Y, A_last) - np.divide(1 - Y, 1 - A_last))
    current_cache = caches[-1]
    grads["dA" + str(L)], grads["dW" + str(L)], grads["db" + str(L)] = linear_activation_backward(dA_last, current_cache, activation="softmax")

    for l in reversed(range(L - 1)):
        current_cache = caches[l]
        dA_prev_temp, dW_temp, db_temp = linear_activation_backward(grads["dA" + str(l + 2)], current_cache, activation="relu")
        grads["dA" + str(l + 1)] = dA_prev_temp
        grads["dW" + str(l + 1)] = dW_temp
        grads["db" + str(l + 1)] = db_temp

    return grads


def update_params(params, grads, alpha):
    """One gradient descent step with learning rate alpha."""
    L = len(params) // 2  # number of layers in the neural network

    for l in range(L):
        params["W" + str(l + 1)] = params["W" + str(l + 1)] - alpha * grads["dW" + str(l + 1)]
        params["b" + str(l + 1)] = params["b" + str(l + 1)] - alpha * grads["db" + str(l + 1)]

    return params


def model_DL(X, Y, Y_real, test_x, test_y, layers_dims, alpha, num_iterations, print_cost):  # lr was 0.009
    """
    Implements a L-layer neural network: [LINEAR->RELU]*(L-1)->LINEAR->SOFTMAX.

    Arguments:
    X -- training data, numpy array of shape (number of features, number of examples)
    Y -- one-hot training labels, of shape (number of classes, number of examples)
    Y_real -- integer training labels, used for the accuracy
    test_x, test_y -- test data and integer test labels
    layers_dims -- list containing the input size and each layer size, of length (number of layers + 1).
    alpha -- learning rate of the gradient descent update rule
    num_iterations -- number of iterations of the optimization loop
    print_cost -- if True, it prints the cost every 100 steps

    Returns:
    params -- params learnt by the model. They can then be used to predict.
    """
    np.random.seed(1)
    costs = []  # keep track of cost

    params = initialize_parameters_deep(layers_dims)

    for i in range(0, num_iterations):
        A_last, caches = L_model_forward(X, params)
        cost = compute_cost(A_last, Y)
        grads = L_model_backward(A_last, Y, caches)

        # Step-wise learning rate decay.
        if (i > 800 and i < 1700):
            alpha1 = 0.80 * alpha
            params = update_params(params, grads, alpha1)
        elif (i >= 1700):
            alpha1 = 0.50 * alpha
            params = update_params(params, grads, alpha1)
        else:
            params = update_params(params, grads, alpha)

        if print_cost and i % 100 == 0:
            print("Cost after iteration %i: %f" % (i, cost))
        if print_cost and i % 100 == 0:
            costs.append(cost)

    predictions = predict(params, X)
    print("Train accuracy: {} %".format(sum(predictions == Y_real) / (float(len(Y_real))) * 100))
    predictions = predict(params, test_x)
    print("Test accuracy: {} %".format(sum(predictions == test_y) / (float(len(test_y))) * 100))

    # plt.plot(np.squeeze(costs))
    # plt.ylabel('cost')
    # plt.xlabel('iterations (per tens)')
    # plt.title("Learning rate =" + str(alpha))
    # plt.show()

    return params


def predict(parameters, X):
    """Return the most probable class index for every column of X."""
    A_last, cache = L_model_forward(X, parameters)
    predictions = np.argmax(A_last, axis=0)
    return predictions
Writing digits through the webcam
Run the code
python Dig-Rec.py
Showing images through the webcam
Run the code
python Digit-Recognizer.py
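Project: akshaybahadur21/Devanagiri-Recognizer
This code helps you classify different Devanagari characters with a CNN.
Techniques used
A convolutional neural network, built with TensorFlow as the framework and the Keras API for high-level abstraction.
Network structure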
CONV2D→MAXPOOL→CONV2D→MAXPOOL→FC→Softmax→Classification
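To get a feel for how an image shrinks as it passes through this stack, the following sketch traces the feature-map shape layer by layer. The source does not give kernel sizes, strides, filter counts, channel count, or the number of classes, so every such value below is an assumption chosen for illustration, and the helper names are hypothetical.

```python
def window_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_shapes(input_size=32, num_classes=46):
    """Trace (height, width, channels) through CONV2D/MAXPOOL x2, FC, Softmax."""
    # All hyperparameters here are illustrative assumptions, not the project's.
    layers = [
        ("CONV2D", {"kernel": 5, "stride": 1, "filters": 32}),
        ("MAXPOOL", {"kernel": 2, "stride": 2}),
        ("CONV2D", {"kernel": 5, "stride": 1, "filters": 64}),
        ("MAXPOOL", {"kernel": 2, "stride": 2}),
    ]
    size, channels = input_size, 1  # single-channel (grayscale) input assumed
    shapes = [("INPUT", (size, size, channels))]
    for name, cfg in layers:
        size = window_out(size, cfg["kernel"], cfg["stride"])
        channels = cfg.get("filters", channels)  # pooling keeps the channel count
        shapes.append((name, (size, size, channels)))
    shapes.append(("FC (flattened input)", (size * size * channels,)))
    shapes.append(("Softmax", (num_classes,)))
    return shapes


for name, shape in trace_shapes():
    print(name, shape)
```

Tracing shapes this way is a quick check before trying a different structure, since it shows how many values the fully connected layer would receive.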
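Additional points
1. You can try different network structures;
2. Add regularization to prevent overfitting;
3. You can add more images to the training set to improve accuracy;
Python implementation
The code uses the DHCD (Devnagari Character Dataset) dataset, in which each image is 32 × 32 pixels. See the project repository for the full code.
Run the code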
python Dev-Rec.py
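Project: akshaybahadur21/Face-Recoinion
This code performs face recognition with facenet. The concept of facenet was first introduced in a research paper; its main idea is a triplet loss function for comparing images of different people. To provide stability and better detection, the author added a few features of his own.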
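As a rough illustration of the triplet loss idea, the sketch below computes it on tiny made-up embeddings: the loss pushes an anchor image closer to another image of the same person than to an image of a different person by at least a margin. The margin of 0.2 and the 2-D embeddings are assumptions for this example, not values taken from the project.

```python
import numpy as np


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Mean triplet loss over a batch of embeddings of shape (n, d)."""
    pos_dist = np.sum((anchor - positive) ** 2, axis=1)  # same person
    neg_dist = np.sum((anchor - negative) ** 2, axis=1)  # different person
    return float(np.mean(np.maximum(pos_dist - neg_dist + margin, 0.0)))


# Made-up 2-D embeddings; a real network produces many more dimensions.
anchor = np.array([[0.0, 1.0]])
same_person = np.array([[0.1, 0.9]])
other_person = np.array([[1.0, 0.0]])

print(triplet_loss(anchor, same_person, other_person))  # already well separated
print(triplet_loss(anchor, other_person, same_person))  # roles swapped: large loss
```

The loss is zero once the different-person distance exceeds the same-person distance by the margin, so training effort goes to the triplets that are still confused.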
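The project depends on the following packages:
numpy matplotlib cv2 keras dlib h5py scipy
Description
A face recognition system is a technology that can identify or verify a person from a digital image or from a frame of a video source. Face recognition systems work in various ways, but in general they compare selected facial features from a given image with the faces in a database.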
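The comparison against a database can be sketched as a nearest-neighbor lookup over face embeddings. The names, the 3-D embeddings, and the 0.7 distance threshold below are all hypothetical; the project's actual matching rule is not described in the text.

```python
import numpy as np


def identify(embedding, database, threshold=0.7):
    """Return the closest known name, or None if nobody is close enough."""
    best_name, best_dist = None, float("inf")
    for name, known in database.items():
        d = float(np.linalg.norm(embedding - known))
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name if best_dist < threshold else None


# Hypothetical 3-D embeddings; a real network outputs far more dimensions.
database = {
    "alice": np.array([1.0, 0.0, 0.0]),
    "bob": np.array([0.0, 1.0, 0.0]),
}

print(identify(np.array([0.9, 0.1, 0.0]), database))
print(identify(np.array([0.0, 0.0, 1.0]), database))
```

The threshold is what separates verification from plain nearest-neighbor search: without it, every unknown face would be assigned to whoever happens to be closest.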
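Added features
1. Faces are detected only when the eyes are open (a security measure);
2. The face alignment feature in dlib is used to predict effectively during live streaming;
Python implementation
1. Network used - Inception Network;
2. Original paper - Google's Facenet;
Procedure
1. To train the network, run Train-inception.py. This is not required, however, because a trained model is already saved as face-rec_Google.h5 and only needs to be loaded at run time;
2. Next, put some images into the database. Check the /images folder; you can paste pictures there or capture them with the webcam;
3. Run rec-feat.py to launch the application;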
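Project: akshaybahadur21/Emojinator
This project recognizes and classifies different emojis. For now, however, it only supports emotions expressed through hand gestures.
Dependencies
numpy matplotlib cv2 keras dlib h5py scipy
Description
Emojis are ideograms and smileys used in electronic messages and web pages. They come in many kinds, including facial expressions, common objects, places, weather, and animals.
Features
1. A filter for detecting the hand;
2. A CNN for training the model;
Python implementation
1. Network used - convolutional neural network
Procedure
1. First, create a gesture database by running CreateGest.py. Try moving your hand around a little within the frame so that the model does not overfit during training;
2. Repeat this for every gesture you need;
3. Run CreateCSV.py to convert the images into a CSV file;
4. Train the model by running TrainEmojinator.py;
5. Run Emojinator.py to test the model through the webcam;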
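The image-to-CSV step in the procedure above can be sketched as follows. This is not CreateCSV.py itself: the one-row-per-image layout (label first, then flattened pixels), the 4×4 image size, and the function names are assumptions made for illustration.

```python
import csv
import io

import numpy as np


def images_to_csv(images, labels, out):
    """Write one row per image: the label followed by the flattened pixels."""
    writer = csv.writer(out)
    for image, label in zip(images, labels):
        writer.writerow([label, *image.reshape(-1).tolist()])


def csv_to_arrays(src, shape):
    """Read the rows back into an image array and a label array."""
    rows = [row for row in csv.reader(src) if row]
    labels = np.array([int(r[0]) for r in rows])
    pixels = np.array([[int(v) for v in r[1:]] for r in rows], dtype=np.uint8)
    return pixels.reshape(-1, *shape), labels


# Two fake 4x4 grayscale "gesture" images stand in for webcam captures.
images = [np.full((4, 4), 10, dtype=np.uint8),
          np.full((4, 4), 200, dtype=np.uint8)]
buffer = io.StringIO()  # an in-memory file keeps the example self-contained
images_to_csv(images, [0, 1], buffer)

buffer.seek(0)
restored, labels = csv_to_arrays(buffer, (4, 4))
print(restored.shape, labels)
```

Storing each image as a flat row makes the dataset easy to load in one call when training, at the cost of a larger file than the original images.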
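All of these projects are impressive, and all of them can run on your own computer. If you would rather not install anything, you can run them more easily online on the Deep Cognition platform.
Thanks go to the many open-source projects on the web. Try out whichever projects interest you, run them, and draw inspiration from them. The projects above are only a small part of what deep learning and computer vision can do. There is much more to try, and it can be turned into things that help make the world a better place. Code change world!
Moreover, everyone should be interested in many different things. I believe this can improve the world, our lives, the way we work, and the way we think about and solve problems. If we use the resources we have today to make these fields of knowledge work together, we can have a huge positive impact on the world and on our lives.
This article is original content from the Yunqi Community (雲棲社區) and may not be reproduced without permission.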