import numpy as np
import time
a = np.random.rand(1000000)
b = np.random.rand(1000000)
tic = time.time()
c = np.dot(a, b)  # vectorized dot product: no explicit Python loop
print("cost " + str((time.time() - tic) * 1000) + "ms")
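For contrast, a quick sketch (timings are machine-dependent) comparing an explicit Python loop against the vectorized np.dot above:

```python
import time
import numpy as np

a = np.random.rand(1_000_000)
b = np.random.rand(1_000_000)

# Explicit-loop version of the dot product
tic = time.time()
c_loop = 0.0
for i in range(len(a)):
    c_loop += a[i] * b[i]
loop_ms = (time.time() - tic) * 1000

# Vectorized version
tic = time.time()
c_vec = np.dot(a, b)
vec_ms = (time.time() - tic) * 1000

print(f"loop: {loop_ms:.1f} ms, vectorized: {vec_ms:.1f} ms")
```

On typical hardware the vectorized version is orders of magnitude faster.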
import numpy as np
a = np.random.randn(5)  # rank-1 array, shape (5,): avoid using these
print("a:", a.shape, "\n", a)
b = np.random.randn(5, 1)
print("b:",b.shape,"\n", b)
c = np.random.randn(1, 5)
print("c:",c.shape,"\n", c)
a = a.reshape(5, 1)
assert(a.shape == (5, 1))
3. Shallow Neural Networks
3.1 Neural network overview
3.2 Neural network representation
3.5 Explanation of the vectorized implementation
3.6 Activation functions
3.7 Why use non-linear activation functions
If the activation functions are linear, the output after several layers is still a linear function of the input, so stacking multiple layers adds nothing.
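This can be checked numerically; a minimal sketch with made-up layer shapes, showing two identity-activation layers collapsing into a single linear layer:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 5))                       # 3 features, 5 examples
W1 = rng.standard_normal((4, 3)); b1 = rng.standard_normal((4, 1))
W2 = rng.standard_normal((2, 4)); b2 = rng.standard_normal((2, 1))

# Two layers with identity (linear) activation...
a2 = W2 @ (W1 @ x + b1) + b2

# ...are exactly one linear layer W' x + b'
W_prime = W2 @ W1
b_prime = W2 @ b1 + b2
same = np.allclose(a2, W_prime @ x + b_prime)
print(same)  # → True
```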
3.8 Derivatives of activation functions
3.9 Gradient descent for neural networks
3.11 Random initialization
Why W cannot be initialized to an all-zero matrix when a layer has multiple neurons
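A minimal sketch of the symmetry problem (shapes are illustrative): with zero weights, every hidden unit computes the same activation and receives an identical gradient row, so the units never differentiate from one another:

```python
import numpy as np

# Tiny 2-layer net with W initialized to zeros
n_x, n_h = 3, 4
W1 = np.zeros((n_h, n_x)); b1 = np.zeros((n_h, 1))
W2 = np.zeros((1, n_h));   b2 = np.zeros((1, 1))

x = np.random.randn(n_x, 1)
z1 = W1 @ x + b1
a1 = np.tanh(z1)           # every hidden unit outputs the same value

# One backprop step: every row of dW1 is identical,
# so the hidden units stay copies of each other forever.
y = np.array([[1.0]])
a2 = 1 / (1 + np.exp(-(W2 @ a1 + b2)))
dz2 = a2 - y
dW1 = ((W2.T @ dz2) * (1 - a1 ** 2)) @ x.T
print(np.allclose(dW1, dW1[0]))  # → True (all rows equal)
```

Random (small) initial values break this symmetry, which is why np.random.randn(...) * 0.01 is used instead.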
4. Deep Neural Networks
4.1 Deep neural networks
4.3 Checking matrix dimensions
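A sketch of the dimension rule with an illustrative list of layer sizes: W[l] has shape (n[l], n[l-1]) and b[l] has shape (n[l], 1):

```python
import numpy as np

# Layer sizes for an illustrative net: n[0] is the input dimension
layer_dims = [5, 4, 3, 2, 1]

params = {}
for l in range(1, len(layer_dims)):
    params[f"W{l}"] = np.random.randn(layer_dims[l], layer_dims[l - 1]) * 0.01
    params[f"b{l}"] = np.zeros((layer_dims[l], 1))
    # Check: W[l] is (n[l], n[l-1]) and b[l] is (n[l], 1)
    assert params[f"W{l}"].shape == (layer_dims[l], layer_dims[l - 1])
    assert params[f"b{l}"].shape == (layer_dims[l], 1)
```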
4.7 Parameters vs. hyperparameters
Course 2: Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization
1. Practical Aspects of Deep Learning
1.1 Train / dev / test sets
1.2 Bias and variance
1.4 Regularization
What happens when lambda is very large: the weights are pushed close to zero, so the network behaves like a much simpler (nearly linear) model — less overfitting, but possibly high bias.
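A sketch of the L2-regularized cost (the helper name l2_cost is made up for illustration): the Frobenius-norm penalty lambda/(2m) * sum ||W[l]||^2 is added to the cross-entropy cost.

```python
import numpy as np

def l2_cost(cross_entropy, weights, lambd, m):
    """Cost with the L2 (Frobenius-norm) penalty added."""
    l2 = (lambd / (2 * m)) * sum(np.sum(np.square(W)) for W in weights)
    return cross_entropy + l2

# With a very large lambda, the penalty term dominates the cost,
# so gradient descent drives the weights toward 0.
W = [np.ones((3, 3))]
print(l2_cost(0.5, W, lambd=0.01, m=10))   # → 0.5045
print(l2_cost(0.5, W, lambd=100.0, m=10))  # → 45.5 (penalty dominates)
```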
1.6 Drop Out Regularization
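A minimal sketch of inverted dropout, the variant taught in the course: zero out units with probability 1 - keep_prob, then divide by keep_prob so the expected activations are unchanged (no rescaling needed at test time):

```python
import numpy as np

def inverted_dropout(a, keep_prob, rng=None):
    """Inverted dropout: drop units, then scale up so E[a] is unchanged."""
    rng = rng or np.random.default_rng()
    mask = rng.random(a.shape) < keep_prob   # keep each unit w.p. keep_prob
    return (a * mask) / keep_prob

a = np.ones((4, 1000))
a_drop = inverted_dropout(a, keep_prob=0.8)
print(a_drop.mean())  # ≈ 1.0 in expectation
```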
1.8 其餘Regularization方法
early stopping
1.9 Normalizing inputs
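A sketch of input normalization (subtract the mean, divide by the standard deviation, feature-wise); the same mu and sigma^2 computed on the training set should be reused for the test set:

```python
import numpy as np

def normalize(X, eps=1e-8):
    """Zero-mean, unit-variance normalization; features in rows, examples in columns."""
    mu = X.mean(axis=1, keepdims=True)
    sigma2 = X.var(axis=1, keepdims=True)
    return (X - mu) / np.sqrt(sigma2 + eps), mu, sigma2

X = np.random.randn(3, 500) * 10 + 5
Xn, mu, sigma2 = normalize(X)
# At test time: X_test_norm = (X_test - mu) / np.sqrt(sigma2 + eps)
```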
1.10 vanishing/exploding gradients
1.11 權重初始化
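A sketch of variance-scaled initialization, which mitigates vanishing/exploding activations: He initialization (variance 2/n_in) for ReLU, Xavier-style (variance 1/n_in) for tanh. The helper name is made up:

```python
import numpy as np

def init_weights(n_in, n_out, activation="relu", rng=None):
    """He init (var = 2/n_in) for ReLU; Xavier-style (var = 1/n_in) for tanh."""
    rng = rng or np.random.default_rng(0)
    scale = np.sqrt(2.0 / n_in) if activation == "relu" else np.sqrt(1.0 / n_in)
    return rng.standard_normal((n_out, n_in)) * scale

W = init_weights(500, 100, "relu")
print(W.var())  # close to 2/500 = 0.004
```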
1.13 Gradient Check
1.14 Gradient Check Implementation Notes
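A minimal sketch of gradient checking with the two-sided difference, verified here on a toy quadratic. The relative-distance formula ||num - ana|| / (||num|| + ||ana||) should come out around 1e-7 or smaller when backprop is correct:

```python
import numpy as np

def grad_check(f, grad_f, theta, eps=1e-7):
    """Compare an analytic gradient against the two-sided numerical gradient."""
    num = np.zeros_like(theta)
    for i in range(theta.size):
        plus = theta.copy();  plus[i] += eps
        minus = theta.copy(); minus[i] -= eps
        num[i] = (f(plus) - f(minus)) / (2 * eps)
    ana = grad_f(theta)
    return np.linalg.norm(num - ana) / (np.linalg.norm(num) + np.linalg.norm(ana))

f = lambda t: np.sum(t ** 2)       # toy cost
grad_f = lambda t: 2 * t           # its analytic gradient
diff = grad_check(f, grad_f, np.random.randn(5))
print(diff)  # ~1e-7 or smaller
```

Note the implementation tips from the lecture: use it only to debug (it is slow), remember the regularization term, and turn dropout off while checking.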
2. 優化算法
2.1 Mini-batch gradient descent
The batch size should fit the CPU/GPU memory.
2.3 Exponentially weighted averages
A moving average smooths out short-term fluctuations and brings out longer-term trends or cycles; mathematically, a moving average can be viewed as a kind of convolution.
Bias correction
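A sketch of the exponentially weighted average v_t = beta * v_{t-1} + (1 - beta) * x_t, with and without the bias correction v_t / (1 - beta^t); on a constant signal the corrected version is exact from the very first step:

```python
import numpy as np

def ewa(xs, beta=0.9, correct_bias=True):
    """Exponentially weighted average with optional bias correction."""
    v, out = 0.0, []
    for t, x in enumerate(xs, start=1):
        v = beta * v + (1 - beta) * x
        out.append(v / (1 - beta ** t) if correct_bias else v)
    return np.array(out)

xs = np.ones(5)
print(ewa(xs, correct_bias=False))  # starts near 0.1, warms up slowly
print(ewa(xs, correct_bias=True))   # → [1. 1. 1. 1. 1.]
```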
2.6 Gradient Descent with Momentum
2.7 RMSprop
2.8 Adam優化算法
Momentum + RMSprop
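A sketch of one Adam step combining the momentum accumulator (v) with the RMSprop accumulator (s), both bias-corrected; the hyperparameter defaults follow the values suggested in the course:

```python
import numpy as np

def adam_step(theta, grad, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: momentum (v) + RMSprop (s), both bias-corrected."""
    state["t"] += 1
    state["v"] = beta1 * state["v"] + (1 - beta1) * grad
    state["s"] = beta2 * state["s"] + (1 - beta2) * grad ** 2
    v_hat = state["v"] / (1 - beta1 ** state["t"])
    s_hat = state["s"] / (1 - beta2 ** state["t"])
    return theta - lr * v_hat / (np.sqrt(s_hat) + eps)

# Toy usage: minimize ||theta||^2 (gradient is 2*theta)
theta = np.array([1.0, -2.0])
state = {"t": 0, "v": np.zeros(2), "s": np.zeros(2)}
for _ in range(200):
    theta = adam_step(theta, 2 * theta, state, lr=0.05)
print(theta)  # driven toward 0
```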
2.9 Learning rate decay
Ways to gradually decrease the learning rate
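One common schedule from the course, alpha = alpha0 / (1 + decay_rate * epoch_num), as a sketch:

```python
def decayed_lr(alpha0, decay_rate, epoch):
    """1/t decay: alpha = alpha0 / (1 + decay_rate * epoch_num)."""
    return alpha0 / (1 + decay_rate * epoch)

for epoch in [0, 1, 2, 10]:
    print(epoch, decayed_lr(0.2, 1.0, epoch))
# → 0.2, 0.1, 0.0667..., 0.0181...
```

Other variants mentioned in the lecture include exponential decay (alpha0 * k**epoch) and discrete staircase schedules.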
2.10 The problem of local optima
In high-dimensional spaces, saddle points are common, but local optima are actually unlikely to be encountered.
Plateaus are a problem: learning becomes very slow, but methods like Adam can mitigate this.
3. Hyperparameter Tuning, Batch Normalization and Programming Frameworks
3.1 Searching for hyperparameters
Try random values: don't use a grid
Coarse to fine
3.4 Batch Normalization
A question: inputs can be normalized in regression — can something similar be done for the hidden layers of a neural network?
The mean and variance can be controlled through the learnable parameters gamma and beta.
3.6 Why Batch Normalization works
By normalizing values to a similar range, it speeds up learning
Batch normalization reduces the problem of the input values (to each layer) shifting
Has a slight regularization effect (like dropout, it adds some noise to each hidden layer's activations)
3.7 Batch Normalization at test time
At test time, use the mean and variance estimated during training with an exponentially weighted average.
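A sketch of this train/test split: use batch statistics (and update running averages) during training, and reuse the stored running averages at test time. Function names and the momentum value are illustrative:

```python
import numpy as np

def batchnorm_train(z, gamma, beta, running, momentum=0.9, eps=1e-8):
    """Normalize with batch statistics; update running averages for test time."""
    mu = z.mean(axis=1, keepdims=True)
    var = z.var(axis=1, keepdims=True)
    running["mu"] = momentum * running["mu"] + (1 - momentum) * mu
    running["var"] = momentum * running["var"] + (1 - momentum) * var
    z_norm = (z - mu) / np.sqrt(var + eps)
    return gamma * z_norm + beta

def batchnorm_test(z, gamma, beta, running, eps=1e-8):
    """At test time, reuse the running (exponentially weighted) mean/variance."""
    z_norm = (z - running["mu"]) / np.sqrt(running["var"] + eps)
    return gamma * z_norm + beta

gamma, beta = np.ones((3, 1)), np.zeros((3, 1))
running = {"mu": np.zeros((3, 1)), "var": np.ones((3, 1))}
for _ in range(100):                                  # simulated training batches
    batchnorm_train(np.random.randn(3, 64) + 5.0, gamma, beta, running)
out = batchnorm_test(np.random.randn(3, 4) + 5.0, gamma, beta, running)
```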
3.8 Softmax regression
Multi-class classification rather than binary; a generalization of logistic regression.
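A sketch of a numerically stable softmax (shift by the max logit before exponentiating, which does not change the result):

```python
import numpy as np

def softmax(z):
    """Softmax over classes (rows); columns are examples."""
    z = z - z.max(axis=0, keepdims=True)   # avoid overflow in exp
    e = np.exp(z)
    return e / e.sum(axis=0, keepdims=True)

z = np.array([[2.0], [1.0], [0.1]])
p = softmax(z)
print(p.ravel())  # probabilities summing to 1; largest for the largest logit
```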
3.10 Deep learning frameworks
Course 3: Structuring Machine Learning Projects
1. ML Strategy (1)
1.1 Why ML strategy
1.2 Orthogonalization
Fit training set well on cost function
If it doesn't fit well, a bigger neural network or switching to a better optimization algorithm might help.
Fit dev set well on cost function
If it doesn't fit well, regularization or a bigger training set might help.
Fit test set well on cost function
If it doesn't fit well, a bigger dev set might help.
Performs well in real world
If it doesn't perform well, the dev/test set is not set up correctly, or the cost function is not measuring the right thing.