Machine Learning: Evaluating Classification Results

Date: 2019-12-06


Using logistic regression as an example, this article introduces ways to evaluate classification results.

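Precision and Recall

For highly skewed data, classification accuracy is not an appropriate way to judge a model. Precision and recall are two better metrics for deciding whether a model is good or bad.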

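Confusion Matrix for Binary Classification

Precision and recall are defined on top of the confusion matrix. Taking binary classification as an example, class 0 is the majority class in the skewed data and the focus is on class 1. The confusion matrix is as follows: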

True \ Predicted	0	1
0	9978 (TN)	12 (FP)
1	2 (FN)	8 (TP)

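TN is the number of correct negative predictions: 9978 samples whose true class is 0 were also predicted as 0;
FN is the number of incorrect negative predictions: 2 samples whose true class is 1 were predicted as 0;
FP is the number of incorrect positive predictions: 12 samples whose true class is 0 were predicted as 1;
TP is the number of correct positive predictions: 8 samples whose true class is 1 were also predicted as 1.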
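To make the point about skewed data concrete, the short sketch below plugs the four counts from the table into accuracy, precision, and recall. It also computes the accuracy of a hypothetical baseline that always predicts 0 (the name `baseline_accuracy` is introduced here for illustration only).

```python
# Counts taken from the confusion matrix above
tn, fp, fn, tp = 9978, 12, 2, 8

total = tn + fp + fn + tp
accuracy = (tn + tp) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)

# A hypothetical model that always predicts 0 gets every class-0 sample right
baseline_accuracy = (tn + fp) / total

print(f"accuracy={accuracy}, baseline={baseline_accuracy}")
print(f"precision={precision}, recall={recall}")
```

The always-0 baseline reaches a slightly higher accuracy than the model while finding none of the class-1 samples, which is why accuracy alone is misleading here.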

TN, FN, FP, and TP can be implemented as follows (y_true is the true class and y_predict is the predicted result):

import numpy as np

def TN(y_true, y_predict):
    return np.sum((y_true == 0) & (y_predict == 0))

def FP(y_true, y_predict):
    return np.sum((y_true == 0) & (y_predict == 1))

def FN(y_true, y_predict):
    return np.sum((y_true == 1) & (y_predict == 0))

def TP(y_true, y_predict):
    return np.sum((y_true == 1) & (y_predict == 1))

The confusion matrix is then:

def confusion_matrix(y_true, y_predict):
    # Rows are true classes, columns are predicted classes
    return np.array([
        [TN(y_true, y_predict), FP(y_true, y_predict)],
        [FN(y_true, y_predict), TP(y_true, y_predict)]
    ])
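As a quick check of the layout, the following sketch builds the same 2x2 matrix for a small set of made-up labels. The arrays and the helper `count` are hypothetical and exist only for this illustration.

```python
import numpy as np

# Hypothetical toy labels, for illustration only
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 0])
y_predict = np.array([0, 1, 0, 0, 1, 0, 1, 0])

def count(true_label, predicted_label):
    """Number of samples with the given true and predicted labels."""
    return int(np.sum((y_true == true_label) & (y_predict == predicted_label)))

# Rows are true classes, columns are predicted classes
matrix = np.array([
    [count(0, 0), count(0, 1)],   # TN, FP
    [count(1, 0), count(1, 1)],   # FN, TP
])
print(matrix)
```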

Scikit Learn provides a ready-made confusion_matrix() function:

from sklearn.metrics import confusion_matrix

confusion_matrix(y_true, y_predict)

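Precision and Recall: Definitions and Implementation

With the confusion matrix in hand, precision and recall are easy to express.

Precision is the fraction of the samples predicted as class 1 that really are class 1:

$$ precision=\frac{TP}{TP+FP} $$

In code: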

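def precision_score(y_true, y_predict):
    tp = TP(y_true, y_predict)
    fp = FP(y_true, y_predict)
    # NumPy integer division by zero yields nan instead of raising,
    # so check the denominator explicitly.
    if tp + fp == 0:
        return 0.0
    return tp / (tp + fp)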

Recall is the fraction of the samples whose true class is 1 that are predicted correctly:

$$ recall=\frac{TP}{TP+FN} $$

In code:

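def recall_score(y_true, y_predict):
    tp = TP(y_true, y_predict)
    fn = FN(y_true, y_predict)
    # NumPy integer division by zero yields nan instead of raising,
    # so check the denominator explicitly.
    if tp + fn == 0:
        return 0.0
    return tp / (tp + fn)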
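One edge case is worth checking when dividing these counts. np.sum returns a NumPy integer, and dividing NumPy scalars by zero produces nan with a RuntimeWarning rather than raising ZeroDivisionError, so an except clause would not catch it. The sketch below illustrates this with made-up labels in which nothing is predicted as class 1.

```python
import numpy as np

# A prediction with no positives at all: TP and FP are both zero
y_true = np.array([0, 0, 1, 1])
y_predict = np.array([0, 0, 0, 0])

tp = np.sum((y_true == 1) & (y_predict == 1))
fp = np.sum((y_true == 0) & (y_predict == 1))

with np.errstate(invalid="ignore"):   # silence the RuntimeWarning for the demo
    ratio = tp / (tp + fp)            # nan, no exception is raised

# An explicit check on the denominator gives a well-defined result instead
precision = 0.0 if tp + fp == 0 else float(tp / (tp + fp))
print(ratio, precision)
```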

Scikit Learn also provides precision_score() for computing precision and recall_score() for computing recall:

from sklearn.metrics import precision_score
precision_score(y_true, y_predict)

from sklearn.metrics import recall_score
recall_score(y_true, y_predict)

F1 Score

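Precision and recall emphasize different things. Sometimes precision matters more (for example, stock prediction), and sometimes recall matters more (for example, diagnosing patients). When both need to be taken into account, the F1 Score can be used.

The F1 Score is the harmonic mean of precision and recall: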

$$ \frac{1}{F1}=\frac{1}{2}(\frac{1}{precision}+\frac{1}{recall}) $$

That is, $F1=\frac{2 \cdot precision \cdot recall}{precision+recall}$, and the F1 Score always lies in the interval $[0, 1]$.

In code:

def f1_score(y_true, y_predict):
    precision = precision_score(y_true, y_predict)
    recall = recall_score(y_true, y_predict)
    # Avoid 0/0 when both precision and recall are zero
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

Scikit Learn provides the f1_score() function to compute the F1 Score:

from sklearn.metrics import f1_score

f1_score(y_true, y_predict)
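To see why a harmonic mean is used, the sketch below compares it with the arithmetic mean on made-up precision and recall values. The helper `f1` is a hypothetical stand-alone version of the formula that works on the two numbers directly.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

balanced = f1(0.5, 0.5)        # both metrics equal
lopsided = f1(0.1, 0.9)        # one metric very low
arithmetic = (0.1 + 0.9) / 2   # arithmetic mean of the lopsided pair

print(f"balanced={balanced:.2f}, lopsided={lopsided:.2f}, arithmetic={arithmetic:.2f}")
```

The lopsided pair has the same arithmetic mean as the balanced pair, yet its F1 Score is far lower: the harmonic mean is only high when both precision and recall are high.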

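Precision-Recall Curve

The probability formula in Scikit Learn's logistic regression is $\hat p=\sigma(\theta^T \cdot x_b)$, and its decision boundary is $\theta^T \cdot x_b=0$. But what happens if the decision boundary is not 0?

Suppose the decision boundary is $\theta^T \cdot x_b=threshold$. As threshold takes different values (0, some value above 0, some value below 0), the resulting precision and recall differ as well, as shown in the figure:

Circles and stars are the two classes, with the focus on the star class. As threshold increases, precision generally rises while recall falls.

To change the decision boundary, use logistic regression's decision_function(), which returns the $\theta^T \cdot x_b$ values for the input data as decision_scores, and then build a new prediction from them. In the code below, only samples with decision_scores >= 5 are predicted as 1 (the default is decision_scores >= 0), and the rest are predicted as 0: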

from sklearn.linear_model import LogisticRegression

log_reg = LogisticRegression()

# X_train, y_train are the training data set
log_reg.fit(X_train, y_train)

# X_test, y_test are the test data set
decision_scores = log_reg.decision_function(X_test)

y_log_predict_threshold = np.array(decision_scores >= 5, dtype='int')

In this way different values of y_log_predict_threshold can be obtained, and with them different precision and recall values.
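The effect of moving the threshold can be sketched without a trained model. The scores and labels below are made up for illustration, and `precision_recall_at` is a hypothetical helper, not part of Scikit Learn.

```python
import numpy as np

# Hypothetical decision scores and true labels, for illustration only
decision_scores = np.array([-3.0, -1.5, -0.5, 0.5, 1.0, 2.0, 6.0, 7.0])
y_true = np.array([0, 0, 1, 0, 1, 1, 1, 1])

def precision_recall_at(threshold):
    """Precision and recall when scores >= threshold are predicted as 1."""
    y_predict = (decision_scores >= threshold).astype(int)
    tp = np.sum((y_true == 1) & (y_predict == 1))
    fp = np.sum((y_true == 0) & (y_predict == 1))
    fn = np.sum((y_true == 1) & (y_predict == 0))
    # Every threshold used below predicts at least one positive,
    # so the denominators are non-zero here.
    return float(tp / (tp + fp)), float(tp / (tp + fn))

results = {t: precision_recall_at(t) for t in (-1, 0, 5)}
for t, (p, r) in results.items():
    print(f"threshold={t}: precision={p:.2f}, recall={r:.2f}")
```

On this toy data, raising the threshold from -1 to 5 moves precision up and recall down, which matches the trade-off described above.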

Taking the handwritten digits data set as an example, label the samples marked 9 as class 1 and all others as class 0:

from sklearn import datasets

digits = datasets.load_digits()
X = digits.data
y = digits.target.copy()

y[digits.target==9] = 1
y[digits.target!=9] = 0

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=500)

Next, train the logistic regression model:

from sklearn.linear_model import LogisticRegression

log_reg = LogisticRegression()
log_reg.fit(X_train, y_train)

Get the $\theta^T \cdot x_b$ values for the test data set X_test:

decision_scores = log_reg.decision_function(X_test)

Split the range of decision_scores into a series of values, thresholds, and use each one in turn as the decision boundary to obtain the corresponding precision and recall:

from sklearn.metrics import precision_score
from sklearn.metrics import recall_score

precisions = []
recalls = []
thresholds = np.arange(np.min(decision_scores), np.max(decision_scores), 0.1)

for threshold in thresholds:
    y_predict = np.array(decision_scores >= threshold, dtype='int')
    precisions.append(precision_score(y_test, y_predict))
    recalls.append(recall_score(y_test, y_predict))

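Plotting precision and recall against the decision boundary gives the following figure:

The relationship between precision and recall, that is, the Precision-Recall curve, is:

Scikit Learn's precision_recall_curve() function takes the true labels and decision_scores and returns precisions, recalls, and thresholds: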

from sklearn.metrics import precision_recall_curve

precisions, recalls, thresholds = precision_recall_curve(y_test, decision_scores)
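One detail of the return values is easy to miss: precisions and recalls each contain one more element than thresholds, because a final point with precision 1 and recall 0 is appended. The sketch below checks this on made-up labels and scores.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# Hypothetical labels and scores, for illustration only
y_true = np.array([0, 0, 1, 0, 1, 1, 1, 1])
scores = np.array([-3.0, -1.5, -0.5, 0.5, 1.0, 2.0, 6.0, 7.0])

precisions, recalls, thresholds = precision_recall_curve(y_true, scores)

# precisions and recalls carry one extra trailing point (precision 1, recall 0)
print(len(precisions), len(recalls), len(thresholds))
```

When plotting either metric against thresholds, drop that last element first, for example precisions[:-1].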

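The further out the Precision-Recall curve lies (that is, the larger the area under it), the better the model.

Precision and Recall in Multiclass Classification

In multiclass classification, the confusion_matrix() provided by Scikit Learn returns the multiclass confusion matrix directly. For precision and recall, the average parameter of the Scikit Learn functions has to be set, for example to micro: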

from sklearn.metrics import precision_score

precision_score(y_test, y_predict, average='micro')
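What micro averaging computes can be sketched from a multiclass confusion matrix. The 3-class matrix below is made up for illustration. Micro averaging pools the counts of all classes before dividing, while macro averaging (another value accepted by average) takes the mean of the per-class precisions.

```python
import numpy as np

# Hypothetical 3-class confusion matrix: rows are true classes, columns are predictions
cm = np.array([
    [5, 1, 0],
    [2, 3, 1],
    [0, 0, 4],
])

tp = np.diag(cm)              # correct predictions per class
predicted = cm.sum(axis=0)    # TP + FP per class (column sums)

micro_precision = float(tp.sum() / predicted.sum())   # pool counts, then divide
macro_precision = float(np.mean(tp / predicted))      # divide per class, then average

print(f"micro={micro_precision:.4f}, macro={macro_precision:.4f}")
```

When every sample has exactly one label, the pooled TP + FP equals the number of samples, so micro-averaged precision coincides with accuracy.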

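ROC Curve

For judging model quality by the area under a curve, the ROC curve is a better choice than the Precision-Recall curve.

The ROC curve involves two metrics, TPR and FPR. TPR is simply recall: $TPR=\frac{TP}{TP+FN}$. FPR is the fraction of the samples whose true class is 0 (the majority class in the skewed data) that are predicted incorrectly: $FPR=\frac{FP}{TN+FP}$. The implementation is: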

def TPR(y_true, y_predict):
    tp = TP(y_true, y_predict)
    fn = FN(y_true, y_predict)
    # NumPy integer division by zero yields nan instead of raising,
    # so check the denominator explicitly.
    if tp + fn == 0:
        return 0.0
    return tp / (tp + fn)


def FPR(y_true, y_predict):
    fp = FP(y_true, y_predict)
    tn = TN(y_true, y_predict)
    if fp + tn == 0:
        return 0.0
    return fp / (fp + tn)

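As the decision boundary varies, these two metrics move in the same direction. Using the handwritten digits data from above again, their values at different decision boundaries are computed as: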

fprs = []
tprs = []

for threshold in thresholds:
    y_predict = np.array(decision_scores >= threshold, dtype='int')
    fprs.append(FPR(y_test, y_predict))
    tprs.append(TPR(y_test, y_predict))

The resulting plot of TPR against FPR (that is, the ROC curve) is:

Scikit Learn's roc_curve() function takes the true labels and decision_scores and returns FPR, TPR, and thresholds:

from sklearn.metrics import roc_curve

fprs, tprs, thresholds = roc_curve(y_test, decision_scores)

The roc_auc_score() function takes the true labels and decision_scores and returns the area under the ROC curve.

from sklearn.metrics import roc_auc_score

roc_auc_score(y_test, decision_scores)

The larger the area, the better the model.
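To show what that area measures, the sketch below builds the ROC points by hand for made-up scores and labels and integrates them with the trapezoidal rule. It is an illustration under those assumptions, not a description of how Scikit Learn computes the value internally.

```python
import numpy as np

# Hypothetical decision scores and true labels, for illustration only
scores = np.array([-3.0, -1.5, -0.5, 0.5, 1.0, 2.0, 6.0, 7.0])
y_true = np.array([0, 0, 1, 0, 1, 1, 1, 1])

n_pos = np.sum(y_true == 1)
n_neg = np.sum(y_true == 0)

# Sweep the threshold from "predict nothing as 1" down to the lowest score
thresholds = np.concatenate(([np.inf], np.unique(scores)[::-1]))
fprs, tprs = [], []
for threshold in thresholds:
    y_predict = (scores >= threshold).astype(int)
    tprs.append(np.sum((y_true == 1) & (y_predict == 1)) / n_pos)
    fprs.append(np.sum((y_true == 0) & (y_predict == 1)) / n_neg)

fprs, tprs = np.array(fprs), np.array(tprs)

# Trapezoidal rule: sum of width * average height between neighbouring points
auc = float(np.sum(np.diff(fprs) * (tprs[1:] + tprs[:-1]) / 2))
print(f"AUC = {auc:.4f}")
```

Both rates start at 0 and end at 1 as the threshold drops, which is the shared trend mentioned earlier. On this toy data a class-1 sample outscores a class-0 sample in 14 of the 15 possible pairs, and the area equals that same ratio.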

Source Code

Github | ML-Algorithms-Action
