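First, note that logistic regression is not the same as linear regression: it is a classification model, and specifically a binary classifier.
We start with the sigmoid function, which is defined as:
$$g(z) = \frac{1}{1 + e^{-z}}$$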
Its curve is shown below:
[Figure: the S-shaped sigmoid curve]
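What properties does the sigmoid function have?
1. It is symmetric about the point (0, 0.5).
2. Its range is the open interval (0, 1).
3. It is monotonically increasing.
4. It is smooth.
5. It is steep in the middle and flat on both sides.
6. Its derivative is g(z)(1 - g(z)), so it can be computed directly from the function's own value.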
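The properties listed above can be checked numerically. The following is a minimal sketch using NumPy; the grid of sample points, the step size `eps`, and the tolerance are arbitrary choices made for illustration.

```python
import numpy as np

def g(z):
    """Sigmoid function g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-6.0, 6.0, 121)

# Property 1: symmetry about the point (0, 0.5), i.e. g(-z) = 1 - g(z).
symmetric = bool(np.allclose(g(-z), 1.0 - g(z)))

# Property 2: every value lies strictly between 0 and 1.
in_range = bool(np.all((g(z) > 0.0) & (g(z) < 1.0)))

# Property 3: the values increase along the sampled grid.
increasing = bool(np.all(np.diff(g(z)) > 0.0))

# Property 6: g'(z) = g(z) * (1 - g(z)), compared with a central difference.
eps = 1e-6
numeric_derivative = (g(z + eps) - g(z - eps)) / (2.0 * eps)
derivative_matches = bool(
    np.allclose(numeric_derivative, g(z) * (1.0 - g(z)), atol=1e-8)
)

print(symmetric, in_range, increasing, derivative_matches)
```

A finite grid can only illustrate these properties, not prove them, but it is a cheap sanity check before relying on the derivative identity in training code.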
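The logistic regression hypothesis can therefore be written as:
$$h_\theta(x) = g(\theta^T x) = \frac{1}{1 + e^{-\theta^T x}}$$
where θ is the weight vector and x is the input. The equation θ^T x = 0 defines the decision boundary, which separates the data of the two classes.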
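To make the role of the decision boundary concrete, here is a small sketch. The weights `theta` and the three sample points are hypothetical values chosen for illustration, and classifying a point as 1 when the probability is at least 0.5 is an assumed tie-breaking rule.

```python
import numpy as np

def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x) with the sigmoid g."""
    return 1.0 / (1.0 + np.exp(-np.dot(theta, x)))

# Hypothetical weights; the first component multiplies a constant 1 (bias term).
theta = np.array([-3.0, 1.0, 1.0])
points = np.array([
    [1.0, 1.0, 1.0],  # theta^T x = -1, negative side of the boundary
    [1.0, 2.0, 1.0],  # theta^T x = 0, exactly on the boundary
    [1.0, 3.0, 2.0],  # theta^T x = 2, positive side of the boundary
])

scores = points @ theta                      # theta^T x for each point
probabilities = hypothesis(theta, points.T)  # sigmoid of each score
labels = (probabilities >= 0.5).astype(int)  # assumed rule: ties go to class 1

print(scores, probabilities, labels)
```

The sign of θ^T x alone decides the class: the sigmoid only turns the score into a probability, and a probability of 0.5 corresponds to a score of exactly 0.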
Why use the sigmoid function?
1. Because of the properties of the sigmoid function itself.
2. Because it can be derived, as follows.
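Recall the Bernoulli distribution:
$$f(x \mid p) = p^x (1 - p)^{1 - x}, \quad x \in \{0, 1\}$$
When x = 1, f(1|p) = p; when x = 0, f(0|p) = 1 - p.
The Bernoulli distribution also belongs to the exponential family, whose general form is:
$$f(x \mid \eta) = b(x) \exp\left(\eta^T T(x) - a(\eta)\right)$$
Since:
$$f(x \mid p) = \exp\left(x \ln p + (1 - x)\ln(1 - p)\right) = \exp\left(x \ln\frac{p}{1 - p} + \ln(1 - p)\right)$$
we have:
$$\eta = \ln\frac{p}{1 - p}$$
so:
$$p = \frac{1}{1 + e^{-\eta}}$$
Because:
$$\eta = \theta^T x$$
we get:
$$p = \frac{1}{1 + e^{-\theta^T x}} = h_\theta(x)$$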
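The link between the Bernoulli parameter p and the sigmoid can be checked numerically. The sketch below assumes the standard natural parameter η = ln(p / (1 - p)) and log-partition term ln(1 + e^η) = -ln(1 - p); the probabilities and outcomes are arbitrary illustrative values.

```python
import numpy as np

p = np.array([0.1, 0.25, 0.5, 0.9])  # illustrative Bernoulli parameters
x = np.array([0, 1, 1, 0])           # illustrative outcomes, one per parameter

# Natural parameter (log-odds) and its inverse, which is the sigmoid.
eta = np.log(p / (1.0 - p))
p_recovered = 1.0 / (1.0 + np.exp(-eta))

# The same probability mass written in the direct and exponential-family forms.
pmf_direct = p**x * (1.0 - p) ** (1 - x)
pmf_exp_family = np.exp(eta * x + np.log(1.0 - p))

print(np.allclose(p_recovered, p), np.allclose(pmf_direct, pmf_exp_family))
```

Inverting the log-odds is what produces the sigmoid, so modelling η as the linear function θ^T x leads directly to the logistic regression hypothesis.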
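The logistic regression cost function is:
$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log h_\theta(x^{(i)}) + (1 - y^{(i)})\log\left(1 - h_\theta(x^{(i)})\right)\right]$$
Why is it defined this way?
Take a single sample as an example:
$$C(\theta) = \begin{cases} -\log h_\theta(x), & y = 1 \\ -\log\left(1 - h_\theta(x)\right), & y = 0 \end{cases}$$
This is equivalent to:
$$C(\theta) = -y \log h_\theta(x) - (1 - y)\log\left(1 - h_\theta(x)\right)$$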
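When y = 1, its graph is as follows:
[Figure: C(θ) versus hθ(x) for y = 1]
That is, the closer hθ(x) is to 1, the smaller C(θ) becomes.
Similarly, when y = 0, its graph is as follows:
[Figure: C(θ) versus hθ(x) for y = 0]
That is, the closer hθ(x) is to 0, the smaller C(θ) becomes.
Minimizing this cost therefore pushes the predictions for the two classes apart, which is how the classes are separated.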
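The behaviour of the per-sample cost can be tabulated instead of plotted. This is a minimal sketch assuming the usual cross-entropy form of the single-sample cost; `sample_cost` and the prediction values are illustrative names and numbers, not part of the referenced library.

```python
import numpy as np

def sample_cost(h, y):
    """Cross-entropy cost of a prediction h in (0, 1) for a label y in {0, 1}."""
    return -(y * np.log(h) + (1 - y) * np.log(1 - h))

predictions = np.array([0.1, 0.5, 0.9, 0.99])

cost_when_y1 = sample_cost(predictions, 1)  # shrinks as the prediction approaches 1
cost_when_y0 = sample_cost(predictions, 0)  # grows as the prediction approaches 1

for h, c1, c0 in zip(predictions, cost_when_y1, cost_when_y0):
    print(f"h={h:.2f}  cost(y=1)={c1:.4f}  cost(y=0)={c0:.4f}")
```

A confident wrong prediction is penalised far more heavily than an uncertain one, which is the behaviour the two curves describe.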
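The derivative of the cost function is as follows:
$$\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$$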
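A gradient formula like this is easy to get wrong by a sign or a factor, so a finite-difference check is a useful habit. The sketch below assumes the averaged cross-entropy cost and the vectorised gradient (1/m) X^T (h - y); the random data, seed, and step size are arbitrary illustrative choices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    """Averaged cross-entropy cost J(theta)."""
    h = sigmoid(X @ theta)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

def gradient(theta, X, y):
    """Analytic gradient (1/m) * X^T (h - y)."""
    return X.T @ (sigmoid(X @ theta) - y) / len(y)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = (rng.random(20) > 0.5).astype(float)
theta = rng.normal(size=3)

# Central-difference approximation of each partial derivative.
eps = 1e-6
numeric = np.zeros_like(theta)
for j in range(len(theta)):
    step = np.zeros_like(theta)
    step[j] = eps
    numeric[j] = (cost(theta + step, X, y) - cost(theta - step, X, y)) / (2 * eps)

max_abs_diff = float(np.max(np.abs(numeric - gradient(theta, X, y))))
print(max_abs_diff)
```

The implementation further down drops the 1/m factor and works with the summed gradient, which only rescales the effective learning rate.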
The derivation is as follows:
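The derivation in the original post was given as an image. What follows is a reconstruction under stated assumptions rather than the author's exact steps: it assumes the averaged cross-entropy cost J(θ), writes h^{(i)} for h_θ(x^{(i)}), and uses the sigmoid property g'(z) = g(z)(1 - g(z)) together with the chain rule.

```latex
\begin{aligned}
\frac{\partial J(\theta)}{\partial \theta_j}
&= -\frac{1}{m}\sum_{i=1}^{m}
   \left[\frac{y^{(i)}}{h^{(i)}} - \frac{1 - y^{(i)}}{1 - h^{(i)}}\right]
   \frac{\partial h^{(i)}}{\partial \theta_j} \\
&= -\frac{1}{m}\sum_{i=1}^{m}
   \left[\frac{y^{(i)}}{h^{(i)}} - \frac{1 - y^{(i)}}{1 - h^{(i)}}\right]
   h^{(i)}\left(1 - h^{(i)}\right) x_j^{(i)} \\
&= -\frac{1}{m}\sum_{i=1}^{m}
   \left[y^{(i)}\left(1 - h^{(i)}\right) - \left(1 - y^{(i)}\right) h^{(i)}\right] x_j^{(i)} \\
&= \frac{1}{m}\sum_{i=1}^{m}\left(h^{(i)} - y^{(i)}\right) x_j^{(i)}
\end{aligned}
```

The second line uses ∂h/∂θ_j = h(1 - h) x_j, which is the sigmoid derivative property applied to g(θ^T x).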
The material above draws on:
https://zhuanlan.zhihu.com/p/28415991
Next comes the implementation. The code is from: https://github.com/eriklindernoren/ML-From-Scratch
from __future__ import print_function, division
import numpy as np
import math
from mlfromscratch.utils import make_diagonal, Plot
from mlfromscratch.deep_learning.activation_functions import Sigmoid


class LogisticRegression():
    """ Logistic Regression classifier.
    Parameters:
    -----------
    learning_rate: float
        The step length that will be taken when following the negative gradient during
        training.
    gradient_descent: boolean
        True or false depending if gradient descent should be used when training. If
        false then we use batch optimization by least squares.
    """
    def __init__(self, learning_rate=.1, gradient_descent=True):
        self.param = None
        self.learning_rate = learning_rate
        self.gradient_descent = gradient_descent
        self.sigmoid = Sigmoid()

    def _initialize_parameters(self, X):
        n_features = np.shape(X)[1]
        # Initialize parameters between [-1/sqrt(N), 1/sqrt(N)]
        limit = 1 / math.sqrt(n_features)
        self.param = np.random.uniform(-limit, limit, (n_features,))

    def fit(self, X, y, n_iterations=4000):
        self._initialize_parameters(X)
        # Tune parameters for n iterations
        for i in range(n_iterations):
            # Make a new prediction
            y_pred = self.sigmoid(X.dot(self.param))
            if self.gradient_descent:
                # Move against the gradient of the loss function with
                # respect to the parameters to minimize the loss
                self.param -= self.learning_rate * -(y - y_pred).dot(X)
            else:
                # Make a diagonal matrix of the sigmoid gradient column vector
                diag_gradient = make_diagonal(self.sigmoid.gradient(X.dot(self.param)))
                # Batch opt:
                self.param = np.linalg.pinv(X.T.dot(diag_gradient).dot(X)).dot(X.T).dot(diag_gradient.dot(X).dot(self.param) + y - y_pred)

    def predict(self, X):
        y_pred = np.round(self.sigmoid(X.dot(self.param))).astype(int)
        return y_pred
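Note: np.linalg.pinv() computes the pseudo-inverse of a matrix. The first branch (gradient_descent=True) solves for the parameters with gradient descent; because every update uses all of the training samples, it is batch gradient descent rather than stochastic gradient descent.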
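The two branches of `fit` can be compared side by side on toy data. The sketch below is a standalone re-implementation under stated assumptions, not the library's code: the synthetic data, seed, iteration counts, and the averaged gradient (the library uses the summed gradient) are illustrative choices, and reading the second branch as an iteratively reweighted least-squares step is an interpretation of its algebra.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Synthetic, noisy two-feature data (hypothetical; not the iris data used below).
rng = np.random.default_rng(1)
X = np.hstack([np.ones((200, 1)), rng.normal(size=(200, 2))])
true_theta = np.array([0.5, 2.0, -1.0])
y = (rng.random(200) < sigmoid(X @ true_theta)).astype(float)

# Branch 1: batch gradient descent; every step uses all 200 samples.
theta_gd = np.zeros(3)
for _ in range(5000):
    theta_gd -= 0.1 * X.T @ (sigmoid(X @ theta_gd) - y) / len(y)

# Branch 2: reweighted least-squares step, same algebra as the else-branch.
theta_ls = np.zeros(3)
for _ in range(25):
    p = sigmoid(X @ theta_ls)
    W = np.diag(p * (1.0 - p))  # diagonal matrix of sigmoid gradients
    theta_ls = np.linalg.pinv(X.T @ W @ X) @ X.T @ (W @ X @ theta_ls + y - p)

print(theta_gd)
print(theta_ls)
```

On data like this, both procedures should settle on essentially the same parameters; the second typically needs far fewer iterations, but each iteration solves a linear system.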
The make_diagonal() function, which converts a vector into a diagonal matrix, is as follows:
import numpy as np

def make_diagonal(x):
    """ Converts a vector into a diagonal matrix """
    m = np.zeros((len(x), len(x)))
    for i in range(len(m[0])):
        m[i, i] = x[i]
    return m
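For a 1-D input, NumPy's built-in `np.diag` builds the same matrix without an explicit loop. The sketch below repeats the helper so it runs on its own; the vector `v` is an arbitrary example.

```python
import numpy as np

def make_diagonal(x):
    """Loop-based helper, repeated here so the comparison is self-contained."""
    m = np.zeros((len(x), len(x)))
    for i in range(len(m[0])):
        m[i, i] = x[i]
    return m

v = np.array([0.25, 0.5, 0.75])

loop_result = make_diagonal(v)
builtin_result = np.diag(v)  # for a 1-D array, returns a 2-D diagonal matrix

same = bool(np.array_equal(loop_result, builtin_result))
print(same)
```

Note that `np.diag` behaves differently for 2-D input, where it extracts the diagonal instead of building a matrix, so the equivalence holds only for vectors.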
The Sigmoid code is as follows:
import numpy as np

class Sigmoid():
    def __call__(self, x):
        return 1 / (1 + np.exp(-x))

    def gradient(self, x):
        # g'(x) = g(x) * (1 - g(x))
        return self.__call__(x) * (1 - self.__call__(x))
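With the direct formula, `np.exp(-x)` overflows to infinity for very negative inputs and NumPy emits a runtime warning, even though the final result still rounds to 0. One common workaround, shown here as a sketch and not as part of the referenced library, is to pick the algebraically equivalent form that keeps the exponent non-positive. The name `stable_sigmoid` is hypothetical, and the function assumes a 1-D array input.

```python
import numpy as np

def stable_sigmoid(x):
    """Sigmoid for a 1-D array, evaluated without overflowing np.exp."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    non_negative = x >= 0
    # For x >= 0 the exponent -x is non-positive, so exp cannot overflow.
    out[non_negative] = 1.0 / (1.0 + np.exp(-x[non_negative]))
    # For x < 0 use the equivalent form exp(x) / (1 + exp(x)).
    exp_x = np.exp(x[~non_negative])
    out[~non_negative] = exp_x / (1.0 + exp_x)
    return out

z = np.array([-1000.0, -1.0, 0.0, 1.0, 1000.0])
values = stable_sigmoid(z)
print(values)
```

For the normalized iris features used in this post the inputs stay small, so the simpler class above is sufficient; the stable form matters mainly for unscaled features or large weights.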
Finally, here is the main script that runs everything:
from __future__ import print_function
from sklearn import datasets
import numpy as np
import matplotlib.pyplot as plt

# Import helper functions
import sys
sys.path.append("/content/drive/My Drive/learn/ML-From-Scratch/")
from mlfromscratch.utils import make_diagonal, normalize, train_test_split, accuracy_score
from mlfromscratch.deep_learning.activation_functions import Sigmoid
from mlfromscratch.utils import Plot
from mlfromscratch.supervised_learning import LogisticRegression


def main():
    # Load dataset
    data = datasets.load_iris()
    X = normalize(data.data[data.target != 0])
    y = data.target[data.target != 0]
    y[y == 1] = 0
    y[y == 2] = 1

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, seed=1)

    clf = LogisticRegression(gradient_descent=True)
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)

    accuracy = accuracy_score(y_test, y_pred)
    print("Accuracy:", accuracy)
    # Reduce dimension to two using PCA and plot the results
    Plot().plot_in_2d(X_test, y_pred, title="Logistic Regression", accuracy=accuracy)


if __name__ == "__main__":
    main()
Result:
Accuracy: 0.9393939393939394