Logistic Regression: Yes or No

Date: 2019-12-08

Tags: logistic regression, yes


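Logistic regression solves a classification problem: you need a piece of code that answers YES or NO. Spam filtering is a typical example. When an email arrives, the model has to decide whether or not it is spam.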

A simple example

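I borrow Andrew Ng's example of student exam scores and university admission to walk through an implementation of logistic regression. Suppose you have historical records of students' scores on two exams together with whether each student was admitted, and you need to predict whether a new batch of students will be admitted. Part of the data looks like this: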

exam1	exam2	admitted (0: failed; 1: passed)
34.62365962451697	78.0246928153624	0
30.28671076822607	43.89499752400101	0
35.84740876993872	72.90219802708364	0
60.18259938620976	86.30855209546826	1
79.0327360507101	75.3443764369103	1
45.08327747668339	56.3163717815305	0
61.10666453684766	96.51142588489624	1

Here the exam1 and exam2 scores are the model input, X. Whether the student was admitted is the model output, Y, where $Y \in \{0, 1\}$.
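As a minimal sketch, the sample rows above could be turned into X and Y like this. The leading column of ones is an assumption inferred from the later code, which treats `theta[0]` as an intercept; the names `data` and `scores` are only illustrative.

```python
import numpy as np

# The seven sample rows shown above: exam1, exam2, admitted
data = np.array([
    [34.62365962451697, 78.0246928153624, 0],
    [30.28671076822607, 43.89499752400101, 0],
    [35.84740876993872, 72.90219802708364, 0],
    [60.18259938620976, 86.30855209546826, 1],
    [79.0327360507101, 75.3443764369103, 1],
    [45.08327747668339, 56.3163717815305, 0],
    [61.10666453684766, 96.51142588489624, 1],
])

scores = data[:, :2]  # exam1 and exam2
y = data[:, 2]        # 0 or 1

# Prepend a column of ones so that theta[0] can act as the intercept term
X = np.column_stack([np.ones(len(scores)), scores])

print(X.shape)  # (7, 3)
```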

Hypothesis function

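We define a hypothesis function $h_{\theta}(x)$ that predicts the probability of being admitted, so we want its output to lie between 0 and 1. The sigmoid function is a good match: it is an S-shaped curve that maps any real input into that range.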

We implement this function in Python:

import numpy as np

def sigmoid(z):
    # Element-wise logistic function; accepts scalars and NumPy arrays
    return 1 / (1 + np.exp(-z))
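A quick sanity check of the sigmoid function, as a sketch: an input of 0 should give exactly 0.5, large negative inputs should approach 0, and large positive inputs should approach 1.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

print(sigmoid(0))                        # 0.5
print(sigmoid(np.array([-10, 0, 10])))   # close to 0, exactly 0.5, close to 1
```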

We define $g(\theta)$ as:

$$g(\theta) = \theta_0 + \theta_1 X_1 + \theta_2 X_2 = \theta^T X$$

We can then define $h_{\theta}(x)$ as:

$$h_{\theta}(x) = \frac{1}{1 + e^{-g(\theta)}} = \frac{1}{1 + e^{-\theta^T X}}$$
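To make the formula concrete, here is a worked example with made-up numbers. Assume, purely for illustration, $\theta = (-25, 0.2, 0.2)$ and a hypothetical student with exam scores $X_1 = 60$ and $X_2 = 86$. Neither the parameters nor the student come from the dataset.

$$\theta^T X = -25 + 0.2 \times 60 + 0.2 \times 86 = 4.2$$

$$h_{\theta}(x) = \frac{1}{1 + e^{-4.2}} \approx 0.985$$

Under these assumed parameters, the model would give this student roughly a 98.5% chance of being admitted.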

Cost function $J(\theta)$

With $h_{\theta}(x)$ defined, we can now define $J(\theta)$:

$$J(\theta) = \frac{1}{m} \sum_{i=1}^{m} Cost(h_{\theta}(x^{i}), y^{i})$$

For gradient descent to find the global optimum, $J(\theta)$ must be a convex function. So we define Cost as follows:

$$Cost(h_{\theta}(x), y) = \begin{cases} -\log(h_{\theta}(x)) & \text{if } y = 1 \\ -\log(1 - h_{\theta}(x)) & \text{if } y = 0 \end{cases}$$
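To get a feel for this definition, the sketch below evaluates the per-sample cost for a few predicted probabilities. The helper name `cost` is only illustrative. A confident, correct prediction costs almost nothing, while a confident, wrong one is penalized heavily.

```python
import numpy as np

def cost(h, y):
    # Per-sample cost: -log(h) when y = 1, -log(1 - h) when y = 0
    return -np.log(h) if y == 1 else -np.log(1 - h)

for h in (0.99, 0.5, 0.01):
    print(f"h={h:4.2f}  cost if y=1: {cost(h, 1):.3f}  cost if y=0: {cost(h, 0):.3f}")
```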

Finally, we work out the partial derivatives that gradient descent needs:

$$\frac{\partial}{\partial \theta_j} J(\theta) = \frac{1}{m} \sum_{i=1}^{m} (h_{\theta}(x^{i}) - y^{i}) x_{j}^{i}$$
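For readers who want the missing step, here is a sketch of the derivation for a single sample. Write $h = h_{\theta}(x)$. Because $h = 1 / (1 + e^{-\theta^T X})$, its derivative is $\frac{\partial h}{\partial \theta_j} = h(1 - h)x_j$. Then:

$$\frac{\partial}{\partial \theta_j}\left[-y\log(h) - (1 - y)\log(1 - h)\right] = -\frac{y}{h}h(1 - h)x_j + \frac{1 - y}{1 - h}h(1 - h)x_j$$

$$= \left[-y(1 - h) + (1 - y)h\right]x_j = (h - y)x_j$$

Averaging this over all $m$ samples gives the expression for $\frac{\partial}{\partial \theta_j} J(\theta)$.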

The Python implementation is as follows:

import numpy as np
from sigmoid import sigmoid  # the sigmoid function above, saved as sigmoid.py


def cost_function(theta, X, y):
    m = y.size

    # Predicted probabilities h_theta(x) for every sample
    h = sigmoid(X.dot(theta))

    # Cost averaged over the m samples
    cost = (1 / m) * np.sum(-y * np.log(h) - (1 - y) * np.log(1 - h))

    # Partial derivative of the cost with respect to each theta_j
    grad = (1 / m) * (h - y).dot(X)

    return cost, grad
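One way to gain confidence in the gradient code is to compare it against a numerical estimate. The sketch below uses central finite differences on four rows of the sample table. The helper `numerical_gradient` and the test point `theta` are illustrative assumptions, not part of the original implementation.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost_function(theta, X, y):
    m = y.size
    h = sigmoid(X.dot(theta))
    cost = (1 / m) * np.sum(-y * np.log(h) - (1 - y) * np.log(1 - h))
    grad = (1 / m) * (h - y).dot(X)
    return cost, grad

def numerical_gradient(theta, X, y, eps=1e-6):
    # Central finite differences, one coordinate at a time
    grad = np.zeros_like(theta)
    for j in range(theta.size):
        step = np.zeros_like(theta)
        step[j] = eps
        grad[j] = (cost_function(theta + step, X, y)[0]
                   - cost_function(theta - step, X, y)[0]) / (2 * eps)
    return grad

# Four rows from the sample table, with a leading column of ones
X = np.array([[1.0, 34.62365962451697, 78.0246928153624],
              [1.0, 30.28671076822607, 43.89499752400101],
              [1.0, 60.18259938620976, 86.30855209546826],
              [1.0, 79.0327360507101, 75.3443764369103]])
y = np.array([0.0, 0.0, 1.0, 1.0])
theta = np.array([0.1, -0.01, 0.005])  # arbitrary test point, not a fitted value

analytic = cost_function(theta, X, y)[1]
numeric = numerical_gradient(theta, X, y)
print(np.max(np.abs(analytic - numeric)))  # should be very small
```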

We use SciPy's BFGS optimizer to search for the $\theta$ that minimizes the cost:

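import numpy as np
import scipy.optimize as opt

# Assumes X (with a leading column of ones) and y are already loaded,
# and that cost_function is the function defined above.
initial_theta = np.zeros(X.shape[1])  # assumed starting point: all zeros


def cost_func(t):
    return cost_function(t, X, y)[0]


def grad_func(t):
    return cost_function(t, X, y)[1]


# full_output=True returns the optimal theta, the final cost, and extra diagnostics
theta, cost, *unused = opt.fmin_bfgs(f=cost_func, fprime=grad_func,
                                     x0=initial_theta, maxiter=400, full_output=True, disp=False)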

This yields the $\theta$ that the model needs, which completes the training.
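The same kind of fit can also be expressed with `scipy.optimize.minimize`, which accepts a function returning both cost and gradient when `jac=True`. The sketch below is an alternative, not the approach used in this article. It runs on synthetic, deliberately noisy data rather than the admissions dataset, and `true_theta` is made up. The small clip inside `cost_function` is an added safeguard against `log(0)` and is not part of the original code.

```python
import numpy as np
import scipy.optimize as opt

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost_function(theta, X, y):
    m = y.size
    # Clip to keep log() finite; this safeguard is an addition for the sketch
    h = np.clip(sigmoid(X.dot(theta)), 1e-12, 1 - 1e-12)
    cost = (1 / m) * np.sum(-y * np.log(h) - (1 - y) * np.log(1 - h))
    grad = (1 / m) * (h - y).dot(X)
    return cost, grad

# Synthetic, noisy data (NOT the admissions dataset)
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 2))
X = np.column_stack([np.ones(200), features])
true_theta = np.array([-0.5, 2.0, 1.0])  # made-up generating parameters
y = (rng.random(200) < sigmoid(X.dot(true_theta))).astype(float)

# jac=True tells minimize that cost_function returns (cost, gradient)
result = opt.minimize(cost_function, x0=np.zeros(3), args=(X, y),
                      jac=True, method='BFGS', options={'maxiter': 400})

print(result.x)    # fitted theta
print(result.fun)  # final cost
```

Passing the data through `args` avoids the two wrapper functions, at the price of relying on the cost function's `(cost, grad)` return convention.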

Visualization

To make patterns in the data easier to see, we can visualize it:

import matplotlib.pyplot as plt
import numpy as np


def plot_data(X, y):
    # Split the samples by label: 1 = admitted, 0 = not admitted
    x1 = X[y == 1]
    x2 = X[y == 0]

    plt.scatter(x1[:, 0], x1[:, 1], marker='+', label='Admitted')
    plt.scatter(x2[:, 0], x2[:, 1], marker='.', label='Not admitted')
    plt.legend()


def plot_decision_boundary(theta, X, y):
    # X has a leading column of ones, so columns 1 and 2 hold the exam scores
    plot_data(X[:, 1:3], y)

    # Only need two points to define a line, so choose two endpoints
    plot_x = np.array([np.min(X[:, 1]) - 2, np.max(X[:, 1]) + 2])

    # The boundary is where theta0 + theta1*x1 + theta2*x2 = 0; solve for x2
    plot_y = (-1 / theta[2]) * (theta[1] * plot_x + theta[0])

    plt.plot(plot_x, plot_y, label='Decision boundary')

    plt.legend(loc=1)
    plt.axis([30, 100, 30, 100])
    plt.show()

The resulting plot shows the two classes of points with the decision boundary drawn as a straight line.

Prediction

Once $\theta$ has been found, we can use it to make predictions, so we write a predict function:

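import numpy as np
from sigmoid import sigmoid  # the sigmoid function above, saved as sigmoid.py


def predict(theta, X):
    # Predicted probability of admission for every row of X
    prob = sigmoid(X.dot(theta))
    # Boolean array: True where the probability exceeds 0.5
    return prob > 0.5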

For each sample in X, we predict that the student will be admitted when the probability is greater than 0.5.
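A natural follow-up is to compare the predictions with the known labels. The sketch below does this for the seven sample rows from the table. The `theta` used here is hand-picked for illustration and is not the result of training, so the accuracy it prints says nothing about the real model.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def predict(theta, X):
    return sigmoid(X.dot(theta)) > 0.5

# The seven sample rows from the table, with a leading column of ones
X = np.array([[1.0, 34.62365962451697, 78.0246928153624],
              [1.0, 30.28671076822607, 43.89499752400101],
              [1.0, 35.84740876993872, 72.90219802708364],
              [1.0, 60.18259938620976, 86.30855209546826],
              [1.0, 79.0327360507101, 75.3443764369103],
              [1.0, 45.08327747668339, 56.3163717815305],
              [1.0, 61.10666453684766, 96.51142588489624]])
y = np.array([0, 0, 0, 1, 1, 0, 1])
theta = np.array([-25.0, 0.2, 0.2])  # hand-picked for illustration, not fitted

p = predict(theta, X)
accuracy = np.mean(p == y)  # fraction of rows where prediction matches the label
print(p.astype(int))
print(accuracy)
```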

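That covers a basic implementation of logistic regression. It is one of the most fundamental and powerful algorithms in ML, and I hope this article has been helpful.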
