Machine Learning Regression Algorithms: Linear Regression

1. Concept

Linear regression is one of the simpler regression algorithms. It is a supervised-learning method, similar to logistic regression, except that linear regression does not pass its output through a sigmoid function.

Linear regression fits a straight line that, to some degree, captures the trend and distribution of the data points. Once the line is fitted, it can be used to extrapolate where later points will fall, which is how it makes predictions.
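To illustrate the idea, here is a minimal sketch (with made-up data, not the article's code): fit a line to noisy points with `np.polyfit`, then use the fitted line to predict a value beyond the observed range.

```python
import numpy as np

# Hypothetical noisy points along y = 2x + 1 (illustrative data only)
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2 * x + 1 + rng.normal(0, 0.5, size=x.shape)

# Fit a degree-1 polynomial (a straight line) by least squares
slope, intercept = np.polyfit(x, y, 1)

# Predict a point outside the observed x range
y_pred = slope * 20 + intercept
print(slope, intercept, y_pred)
```

The recovered slope and intercept land close to the true 2 and 1, and the extrapolated prediction at x = 20 is close to 41.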

 

2. Computation

The computation is the same as logistic regression, minus the sigmoid function.
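Concretely, dropping the sigmoid turns the logistic-regression update step alpha * Xᵀ(sigmoid(Xw) - y) into alpha * Xᵀ(Xw - y), which drives down the squared error. A minimal sketch of that update loop (the toy data and step size here are illustrative assumptions, not from the article):

```python
import numpy as np

# Toy data on the exact line y = 3x, with a bias column of ones appended
X = np.array([[1.0, 1.0],
              [2.0, 1.0],
              [3.0, 1.0]])
y = np.array([[3.0], [6.0], [9.0]])

w = np.zeros((2, 1))
alpha = 0.01  # step size (illustrative)
for _ in range(5000):
    grad = X.T @ (X @ w - y)  # gradient of the squared error
    w -= alpha * grad

print(w.ravel())  # converges toward [3, 0]: slope 3, intercept 0
```

This is the same update the article's hand-written implementation performs below, just on smaller data.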

 

3. Implementation

Linear regression implemented two ways, with sklearn and with a hand-written algorithm:

# !/usr/bin/env python
# -*- coding: utf-8 -*-
import numpy as np
import matplotlib.pyplot as plt
from sklearn import linear_model

cs = ['black', 'blue', 'brown', 'red', 'yellow', 'green']


def create_sample():
    np.random.seed(5)  # fix the random seed so runs are reproducible
    n_dim = 2
    num = 100
    k = 1
    data_mat = 1 * np.random.randn(1, n_dim)
    for i in range(num - 1):
        k += 1
        b = k * np.random.randn(1, n_dim)  # spread grows with k
        data_mat = np.concatenate((data_mat, b))
    return {'data_mat': data_mat}


def grad_ascent(data_mat, class_label, alpha):
    # Despite the name, this performs gradient descent on the squared error.
    data_matrix = np.mat(data_mat).transpose()
    label_mat = np.mat(class_label).transpose()
    m, n = np.shape(data_matrix)
    data_matrix = augment(data_matrix)  # append a bias column of ones
    n += 1
    weight = np.ones((n, 1))
    while True:
        error = data_matrix * weight - label_mat
        step = alpha * data_matrix.transpose() * error
        if np.abs(np.sum(step)) < 0.00001:  # stop once the update is negligible
            break
        weight = weight - step
    return np.asarray(weight).flatten()


def augment(data_matrix):
    n, n_dim = data_matrix.shape
    a = np.mat(np.ones((n, 1)))
    return np.concatenate((data_matrix, a), axis=1)


def plot_data(samples, color, plot_type='o'):
    plt.plot(samples[:, 0], samples[:, 1], plot_type, markerfacecolor=color, markersize=14)


def sk_linear_regression(x, y):
    linear_regression = linear_model.LinearRegression()
    linear_regression.fit(x, y)
    return np.asarray((linear_regression.coef_, linear_regression.intercept_)).flatten()


def main():
    data = create_sample()
    weight_sk = sk_linear_regression(data['data_mat'][:, 0:1], data['data_mat'][:, 1:2])
    print(weight_sk)
    weight = grad_ascent(data['data_mat'][:, 0], data['data_mat'][:, 1], 0.000001)
    print(weight)
    plot_data(data['data_mat'], 'red')
    # Draw both fitted lines (y = w0 * x + w1) across a wide x range
    lx = [-200, 200]
    ly = [-200 * weight[0] + weight[1], 200 * weight[0] + weight[1]]
    ly_sk = [-200 * weight_sk[0] + weight_sk[1], 200 * weight_sk[0] + weight_sk[1]]
    plt.plot(lx, ly)
    plt.plot(lx, ly_sk)
    plt.show()


if __name__ == '__main__':
    main()
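As a cross-check on the iterative result, ordinary least squares also has a closed-form solution via the normal equations, w = (XᵀX)⁻¹ Xᵀy; `np.linalg.lstsq` solves the same problem more stably. A minimal sketch with hypothetical data (the line y = 0.5x + 2 and the noise level are made up for illustration):

```python
import numpy as np

# Hypothetical data along y = 0.5x + 2 plus a little noise
rng = np.random.default_rng(5)
x = rng.standard_normal(100)
y = 0.5 * x + 2.0 + rng.normal(0, 0.1, size=100)

# Augment with a ones column (as the augment() helper above does), then solve
X = np.column_stack([x, np.ones_like(x)])
(slope, intercept), *_ = np.linalg.lstsq(X, y, rcond=None)
print(slope, intercept)  # close to 0.5 and 2.0
```

Both gradient descent and the closed-form solve should land on essentially the same weights; the iterative route only matters when the data is too large to solve directly.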

Result:

sklearn: [0.1165388985642626 3.958251157566739]
own implementation: [0.11655941 3.85822306]

 

 

As you can see, the difference is small, and when plotted the two fitted lines essentially overlap.
