#Softmax Classifier

The softmax classifier is quite similar to logistic regression; in fact, softmax is what logistic regression develops into. Since we are now doing multi-class classification, we need more probabilities, one to represent each class. The softmax formula:

$$P(y=j\mid x;\theta)=\frac{e^{\theta_j^{T}x}}{\sum_{l=1}^{k}e^{\theta_l^{T}x}}$$

Here a question arises: why not take the maximum score directly, instead of going around this big detour only to end up picking the largest value anyway? ① What we actually want is the max, but max has one drawback: it is not differentiable. So we need a function that mimics max. exp is the exponential function, and larger values grow much faster under it, so it pulls the largest score apart from the rest while staying differentiable. This design also makes each feature's influence on the probability multiplicative. ② Softmax developed out of logistic regression, so it naturally uses the cross-entropy loss function:

$$J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}\sum_{j=1}^{k}\mathbf{1}\{y^{(i)}=j\}\log\frac{e^{\theta_j^{T}x^{(i)}}}{\sum_{l=1}^{k}e^{\theta_l^{T}x^{(i)}}}$$

where the indicator is 1 for the target class and 0 for all the others. Taking the derivative:

$$\nabla_{\theta_j}J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}x^{(i)}\Big(\mathbf{1}\{y^{(i)}=j\}-P(y^{(i)}=j\mid x^{(i)};\theta)\Big)$$

This form is very concise, and it is consistent with the forms obtained for linear regression (with the least-mean-squares objective) and for two-class classification (with the cross-entropy objective).

The main implementation flow: first the exp normalization, which gives the probability that the current sample belongs to each class; then take the logarithm to get the cost function; then differentiate, which yields exactly the gradient above.

###Properties of the Softmax parameters

Subtracting some vector φ from every optimal parameter vector has no effect on the predictions, because the factor $e^{-\varphi^{T}x}$ cancels between numerator and denominator:

$$\frac{e^{(\theta_j-\varphi)^{T}x}}{\sum_{l=1}^{k}e^{(\theta_l-\varphi)^{T}x}}=\frac{e^{\theta_j^{T}x}e^{-\varphi^{T}x}}{\sum_{l=1}^{k}e^{\theta_l^{T}x}e^{-\varphi^{T}x}}=\frac{e^{\theta_j^{T}x}}{\sum_{l=1}^{k}e^{\theta_l^{T}x}}$$

So the model has multiple sets of optimal solutions: each different φ means a different solution, and since φ has no influence on the result, many solutions can exist.

###The relationship between Softmax and logistic regression

Set k = 2 and subtract θ₂ from both parameter vectors:

$$P(y=1\mid x)=\frac{e^{\theta_1^{T}x}}{e^{\theta_1^{T}x}+e^{\theta_2^{T}x}}=\frac{1}{1+e^{-(\theta_1-\theta_2)^{T}x}}$$

which is exactly the sigmoid of logistic regression. So softmax is an extension of logistic regression; going back to two classes it behaves the same way, and both use the cross-entropy loss.

###Code implementation

We use the handwritten digit recognition dataset (MNIST). A quick standalone check of the formulas comes first, in the sketch below; the full implementation follows.
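As a sanity check on the formulas, here is a minimal NumPy sketch (my own illustration, not code from the original post; all names are made up) that computes a numerically stable softmax and the cross-entropy cost, and verifies the φ-shift invariance:

```python
import numpy as np

def softmax(scores):
    # Subtract each row's max before exponentiating: this is exactly the
    # phi-shift from above (it cancels out), and it keeps np.exp from
    # overflowing on large scores.
    shifted = scores - scores.max(axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    # Mean negative log-probability assigned to the target class.
    m = probs.shape[0]
    return -np.log(probs[np.arange(m), labels]).mean()

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))      # 4 samples, 3 features
W = rng.normal(size=(3, 5))      # 5 classes
y = np.array([0, 2, 1, 4])

P = softmax(X @ W)
print(P.sum(axis=1))             # each row sums to 1
print(cross_entropy(P, y))

# phi-shift invariance: subtracting the same vector phi from every
# column of W leaves the predicted probabilities unchanged.
phi = rng.normal(size=(3, 1))
print(np.allclose(P, softmax(X @ (W - phi))))  # True
```

The training code below implements this same normalization (without the stability shift) inside a gradient-ascent loop.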
```python
import numpy as np
from keras.datasets import mnist


class DataPrecessing(object):
    def loadFile(self):
        # Load MNIST, scale pixels to [0, 1], and flatten each 28x28
        # image into a 784-dimensional row vector.
        (x_train, x_target_tarin), (x_test, x_target_test) = mnist.load_data()
        x_train = x_train.astype('float32') / 255.0
        x_test = x_test.astype('float32') / 255.0
        x_train = x_train.reshape(len(x_train), np.prod(x_train.shape[1:]))
        x_test = x_test.reshape(len(x_test), np.prod(x_test.shape[1:]))
        x_train = np.mat(x_train)
        x_test = np.mat(x_test)
        x_target_tarin = np.mat(x_target_tarin)
        x_target_test = np.mat(x_target_test)
        return x_train, x_target_tarin, x_test, x_target_test

    def Calculate_accuracy(self, target, prediction):
        # Fraction of samples whose predicted class matches the target.
        score = 0
        for i in range(len(target)):
            if target[i] == prediction[i]:
                score += 1
        return score / len(target)

    def predict(self, test, weights):
        # The predicted class is the column with the largest score; exp
        # normalization is monotonic, so the argmax of the raw scores
        # equals the argmax of the softmax probabilities.
        h = test * weights
        return h.argmax(axis=1)
```
This imports the dataset, handles the format conversions, and so on.
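For orientation, a quick illustrative look at what `loadFile` returns, assuming the standard Keras MNIST split of 60,000 training and 10,000 test images:

```python
dp = DataPrecessing()
x_train, x_target_tarin, x_test, x_target_test = dp.loadFile()
print(x_train.shape)         # (60000, 784): flattened 28x28 images
print(x_target_tarin.shape)  # (1, 60000): np.mat turns the 1-D labels into a row matrix
print(x_test.shape)          # (10000, 784)
```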
```python
def gradientAscent(feature_data, label_data, k, maxCycle, alpha):
    '''Train a softmax model by gradient ascent.
    input:  feature_data(mat): feature matrix
            label_data(list): target labels
            k(int): number of classes
            maxCycle(int): max iterations
            alpha(float): learning rate
    '''
    Dataprecessing = DataPrecessing()
    x_train, x_target_tarin, x_test, x_target_test = Dataprecessing.loadFile()
    x_target_tarin = x_target_tarin.tolist()[0]
    x_target_test = x_target_test.tolist()[0]
    m, n = np.shape(feature_data)
    weights = np.mat(np.ones((n, k)))  # one weight column per class
    i = 0
    while i <= maxCycle:
        err = np.exp(feature_data * weights)  # unnormalized scores e^{x w_j}
        if i % 100 == 0:
            # Periodically report the cost and train/test accuracy.
            print('cost score : ', cost(err, label_data))
            train_predict = Dataprecessing.predict(x_train, weights)
            test_predict = Dataprecessing.predict(x_test, weights)
            print('Train_accuracy : ', Dataprecessing.Calculate_accuracy(x_target_tarin, train_predict))
            print('Test_accuracy : ', Dataprecessing.Calculate_accuracy(x_target_test, test_predict))
        # Normalize with a negative sign so err holds -P(y=j|x); adding 1
        # at the target column then gives (1{y=j} - P), the gradient term
        # from the derivation above.
        rowsum = -err.sum(axis=1)
        rowsum = rowsum.repeat(k, axis=1)
        err = err / rowsum
        for x in range(m):
            err[x, label_data[x]] += 1
        weights = weights + (alpha / m) * feature_data.T * err  # ascend the log-likelihood
        i += 1
    return weights


def cost(err, label_data):
    # Mean cross-entropy: err holds e^{x w_j}, so the probability of the
    # target class is err[i, y_i] / sum_j err[i, j].
    m = np.shape(err)[0]
    sum_cost = 0.0
    for i in range(m):
        if err[i, label_data[i]] / np.sum(err[i, :]) > 0:
            sum_cost -= np.log(err[i, label_data[i]] / np.sum(err[i, :]))
        else:
            sum_cost -= 0
    return sum_cost / m
```
The implementation is actually fairly simple.
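To make the update step concrete, here is a one-iteration trace on toy numbers (my own illustration, not from the original post) showing how `err` turns into the (indicator − probability) matrix from the derivation:

```python
import numpy as np

# One toy iteration of the update inside gradientAscent: m=2 samples, k=3 classes.
feature_data = np.mat([[1.0, 0.0],
                       [0.0, 1.0]])
weights = np.mat(np.ones((2, 3)))
label_data = [0, 2]

err = np.exp(feature_data * weights)           # e^{x w_j}, shape (2, 3)
rowsum = -err.sum(axis=1).repeat(3, axis=1)
err = err / rowsum                             # now -P(y=j|x); every row sums to -1
for x in range(2):
    err[x, label_data[x]] += 1                 # target column becomes 1 - P
print(err)                                     # rows are (1{y=j} - P)
grad_step = feature_data.T * err               # X^T (1 - P): direction of ascent
print(grad_step)
```

Each row of `err` sums to zero after the target column is bumped, so the update vanishes exactly when the predicted probabilities match the labels.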
```python
Dataprecessing = DataPrecessing()
x_train, x_target_tarin, x_test, x_target_test = Dataprecessing.loadFile()
x_target_tarin = x_target_tarin.tolist()[0]
gradientAscent(x_train, x_target_tarin, 10, 100000, 0.001)
```
Run the function.
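If you keep the return value, the trained weights can be reused for a final evaluation on the test set; a small sketch (assuming the snippet above is changed to capture `weights = gradientAscent(...)`):

```python
weights = gradientAscent(x_train, x_target_tarin, 10, 100000, 0.001)
test_labels = x_target_test.tolist()[0]
test_predict = Dataprecessing.predict(x_test, weights)
print('Final test accuracy : ', Dataprecessing.Calculate_accuracy(test_labels, test_predict))
```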
###GitHub code

https://github.com/GreenArrow2017/MachineLearning/tree/master/MachineLearning/Linear%20Model/LogosticRegression