A Detailed Walkthrough: Model Evaluation and Hyperparameter Tuning in Machine Learning

Selected from Python-Machine-Learning-Book on GitHub

Author: Sebastian Raschka

Translated and compiled by Sam

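This series took quite a while to write, but it is finally done. Tonight I am consolidating the earlier posts into one article and packaging the related code to share; reply 「評估」 in the official-account backend to get it.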

Because the article is long, the table of contents comes first.

🚙 🚙 🚙

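1. Getting to know pipelines

1.1 Data import and preprocessing

1.2 Building a workflow with a pipeline

2. K-fold cross-validation

2.1 How k-fold cross-validation works

2.2 Implementing k-fold cross-validation

3. Tuning with curves

3.1 Model accuracy

3.2 Plotting a learning curve: sample count vs. accuracy

3.3 Plotting a validation curve: hyperparameter vs. accuracy

4. Grid search

4.1 Brute-force search with two nested for loops

4.2 Brute-force search with a parameter dictionary

5. Nested cross-validation

6. Evaluation metrics

6.1 The confusion matrix and its implementation

6.2 Implementing the evaluation metrics

6.3 The ROC curve and its implementation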

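1. Getting to know pipelines 🐢

Let's start with how pipeline workflows operate.

The idea of a "pipeline workflow" may be unfamiliar. You can think of it as a container that wraps every operation we need to perform, such as data standardization, feature dimensionality reduction, principal component analysis, and model prediction, and runs them in sequence. The example below makes this concrete.

1.1 Data import and preprocessing

We use a binary classification dataset, Breast Cancer Wisconsin, which contains 569 samples. The first column is the primary-key ID, the second column is the class label (M = malignant, B = benign), and columns 3-32 hold real-valued features.

First, load the dataset: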

# Load the dataset
import urllib.error
import pandas as pd
try:
    df = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases'
                     '/breast-cancer-wisconsin/wdbc.data', header=None)
except urllib.error.URLError:
    # Fall back to the copy in the book's GitHub repository
    df = pd.read_csv('https://raw.githubusercontent.com/rasbt/'
                     'python-machine-learning-book/master/code/'
                     'datasets/wdbc/wdbc.data', header=None)
print('rows, columns:', df.shape)
df.head()

Use LabelEncoder, which we covered earlier, to encode the class labels:

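from sklearn.preprocessing import LabelEncoder
X = df.loc[:, 2:].values   # columns 3-32: the 30 features
y = df.loc[:, 1].values    # column 2: the class label (M or B)
le = LabelEncoder()
# Encode the target as a 0/1 variable
y = le.fit_transform(y)
le.transform(['M', 'B'])   # array([1, 0]): M (malignant) -> 1, B (benign) -> 0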

Split the data into training and test sets:

# Create the training and test sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = \
    train_test_split(X, y, test_size=0.20, random_state=1)

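1.2 Building a workflow with a pipeline

Many machine learning algorithms need features on the same scale, so the features have to be standardized. We also want to compress the original 30 dimensions into fewer, which is a job for principal component analysis (PCA). After that, we can fit a logistic regression classifier.

A Pipeline object takes a list of tuples as input. The first element of each tuple is a name for the step, and the second is a sklearn transformer or estimator. Every intermediate step of a pipeline is a transformer; the last step is an estimator.

For this dataset the pipeline has two intermediate steps, StandardScaler and PCA, both transformers, and a logistic regression classifier as the final estimator.

In this example, when the pipeline pipe_lr runs its fit method:

1) StandardScaler runs its fit and transform methods;

2) the transformed data is passed to PCA;

3) PCA likewise runs fit and transform;

4) finally the data is passed to LogisticRegression, which trains an LR model.

A pipeline can contain any number of intermediate transformers. The figure below illustrates how a pipeline works (note that the pipeline runs fit, while each transformer runs fit_transform):

The code for the steps above is as follows: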

from sklearn.preprocessing import StandardScaler # standardizes the data
from sklearn.decomposition import PCA # reduces the feature dimensionality
from sklearn.linear_model import LogisticRegression # the predictive model
from sklearn.pipeline import Pipeline
pipe_lr = Pipeline([('scl', StandardScaler()),
                    ('pca', PCA(n_components=2)),
                    ('clf', LogisticRegression(random_state=1))])
pipe_lr.fit(X_train, y_train)
print('Test Accuracy: %.3f' % pipe_lr.score(X_test, y_test))
y_pred = pipe_lr.predict(X_test)

Test Accuracy: 0.947
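
To make the four fit steps listed earlier concrete, the sketch below spells them out by hand and compares the result with a Pipeline built from the same steps. It is only an illustration under one assumption: scikit-learn's bundled load_breast_cancer data stands in for the CSV download (it encodes the labels the other way round, which does not affect accuracy), so the printed accuracy need not match the value above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: the bundled copy of the dataset replaces the CSV download.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=1)

# Manual version: every transformer is fitted on the training data only.
scaler = StandardScaler()
pca = PCA(n_components=2)
clf = LogisticRegression(random_state=1)

X_train_std = scaler.fit_transform(X_train)     # steps 1-2
X_train_pca = pca.fit_transform(X_train_std)    # step 3
clf.fit(X_train_pca, y_train)                   # step 4

# At prediction time the fitted transformers only call transform.
X_test_pca = pca.transform(scaler.transform(X_test))
manual_acc = clf.score(X_test_pca, y_test)

# Pipeline version: the same steps behind a single fit/score call.
pipe = Pipeline([('scl', StandardScaler()),
                 ('pca', PCA(n_components=2)),
                 ('clf', LogisticRegression(random_state=1))])
pipe.fit(X_train, y_train)
pipe_acc = pipe.score(X_test, y_test)

print('manual: %.3f, pipeline: %.3f' % (manual_acc, pipe_acc))
```

Both routes should report the same accuracy; the pipeline simply removes the bookkeeping and makes it harder to fit a transformer on test data by accident.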

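2. K-fold cross-validation 🔍

There is little need to argue for why we evaluate a model's generalization ability. When a model performs poorly, it is either too complex and overfits (high variance) or too simple and underfits (high bias). How to evaluate it, and with which data, is the central question of model evaluation.

The usual approach is to split the dataset into three non-overlapping parts: training, validation, and test. When data is scarce, however, that split is not very practical, and this is where k-fold cross-validation comes into its own.

2.1 How k-fold cross-validation works

Before going further, here is a diagram of the principle (using 10-fold cross-validation as the example).

The steps of k-fold cross-validation:

Step 1: Randomly split the original data into k parts, sampling without replacement.

Step 2: Use k-1 parts to train the model and the remaining part to test it.

Step 3: Repeat Step 2 k times, holding out a different part each time, to obtain k models and their evaluation results.

Step 4: Average the k results and use the mean as the performance estimate for the parameters/model.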
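
The four steps can be sketched in plain Python. This is a minimal illustration, not sklearn's implementation: the names k_fold_indices, cross_validate, and the evaluate callback are hypothetical, and the split is not stratified by class.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # Step 1: random split, no replacement
    folds = [indices[i::k] for i in range(k)]   # k parts of nearly equal size
    for i in range(k):
        test_idx = folds[i]                     # Step 2: one part is held out
        train_idx = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train_idx, test_idx               # Step 3: repeated k times

def cross_validate(evaluate, n_samples, k):
    """Step 4: average the k scores returned by evaluate(train_idx, test_idx)."""
    scores = [evaluate(train, test)
              for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)

# Every sample lands in exactly one test fold.
splits = list(k_fold_indices(23, 5))
print([len(test) for _, test in splits])
```

In real use, evaluate would fit a model on the training indices and return its score on the test indices.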

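2.2 Implementing k-fold cross-validation

How should K be chosen? The usual default is 10 folds, adjusted to the situation. Bear in mind that a large K means many models to train, which hurts efficiency, and the training sets of those models are nearly identical, so their results are too. Commonly used values of K lie between 5 and 12.

Following the steps above, here is 10-fold cross-validation in sklearn: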

import numpy as np
from sklearn.model_selection import StratifiedKFold
# No shuffling, so no random_state: recent scikit-learn versions raise an
# error when random_state is set while shuffle=False
kfold = StratifiedKFold(n_splits=10).split(X_train, y_train)
scores = []
for k, (train, test) in enumerate(kfold):
    pipe_lr.fit(X_train[train], y_train[train])
    score = pipe_lr.score(X_train[test], y_train[test])
    scores.append(score)
    print('Fold: %s, Class dist.: %s, Acc: %.3f' % (k+1,
          np.bincount(y_train[train]), score))
print('\nCV accuracy: %.3f +/- %.3f' % (np.mean(scores), np.std(scores)))


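Of course, there is no need to write it this way in practice. sklearn already wraps the procedure in a ready-made function that can be called directly.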

from sklearn.model_selection import cross_val_score
scores = cross_val_score(estimator=pipe_lr,
                         X=X_train,
                         y=y_train,
                         cv=10,
                         n_jobs=1)
print('CV accuracy scores: %s' % scores)
print('CV accuracy: %.3f +/- %.3f' % (np.mean(scores), np.std(scores)))

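3. Tuning with curves 🌈

The curves in question are the learning curve and the validation curve.

3.1 Model accuracy

Model accuracy reflects how well the model is doing. Look at the figure below:

1) The model in the upper-left panel has high bias. Its training and validation accuracy are both low, so it is probably underfitting. The fix for underfitting is to increase model capacity, for example by constructing more features or weakening the regularization term.

2) The model in the upper-right panel has high variance, which shows up as a large gap between training and validation accuracy. Fixes for overfitting include enlarging the training set or reducing model complexity, for example by strengthening the regularization term or reducing the number of features through feature selection.

3) The model in the lower-right panel is a good one.

3.2 Plotting a learning curve: sample count vs. accuracy

Straight to the code: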

import matplotlib.pyplot as plt
from sklearn.model_selection import learning_curve
pipe_lr = Pipeline([('scl', StandardScaler()),
                    ('clf', LogisticRegression(penalty='l2', random_state=0))])
train_sizes, train_scores, test_scores =\
                learning_curve(estimator=pipe_lr,
                               X=X_train,
                               y=y_train,
                               train_sizes=np.linspace(0.1, 1.0, 10), # 10 evenly spaced values between 0.1 and 1
                               cv=10,
                               n_jobs=1)
train_mean = np.mean(train_scores, axis=1)
train_std = np.std(train_scores, axis=1)
test_mean = np.mean(test_scores, axis=1)
test_std = np.std(test_scores, axis=1)
plt.plot(train_sizes, train_mean,
         color='blue', marker='o',
         markersize=5, label='training accuracy')
plt.fill_between(train_sizes,
                 train_mean + train_std,
                 train_mean - train_std,
                 alpha=0.15, color='blue')
plt.plot(train_sizes, test_mean,
         color='green', linestyle='--',
         marker='s', markersize=5,
         label='validation accuracy')
plt.fill_between(train_sizes,
                 test_mean + test_std,
                 test_mean - test_std,
                 alpha=0.15, color='green')
plt.grid()
plt.xlabel('Number of training samples')
plt.ylabel('Accuracy')
plt.legend(loc='lower right')
plt.ylim([0.8, 1.0])
plt.tight_layout()
plt.show()

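The train_sizes parameter of learning_curve controls the absolute or relative number of training samples used to generate the curve. Here we set train_sizes=np.linspace(0.1, 1.0, 10), which takes 10 evenly spaced values between 0.1 and 1.0 and so evaluates the model at 10 evenly spaced training-set sizes. For a classifier, learning_curve uses stratified k-fold cross-validation by default to compute the cross-validated accuracy, and we set k through cv.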
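
When train_sizes holds floats, sklearn treats them as fractions of the largest training subset available in one cross-validation split. The sketch below shows roughly how such fractions map to sample counts; n_max is a hypothetical fold size chosen for illustration, not a value taken from this dataset.

```python
import numpy as np

fractions = np.linspace(0.1, 1.0, 10)   # ten evenly spaced fractions from 0.1 to 1.0
n_max = 400                             # hypothetical size of one CV training fold
abs_sizes = (fractions * n_max).astype(int)   # truncated to whole samples

print(fractions)
print(abs_sizes)
```

The absolute sizes actually used are the train_sizes array that learning_curve returns, which is what the plot above puts on its x-axis.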

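The figure below shows that the model does well on the validation folds, but a small gap remains between training and validation accuracy, so the model may be overfitting slightly.

3.3 Plotting a validation curve: hyperparameter vs. accuracy

A validation curve is a tool for improving model performance. It closely resembles a learning curve, except that it plots accuracy against different values of a model parameter rather than against different training-set sizes: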

from sklearn.model_selection import validation_curve
param_range = [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]
train_scores, test_scores = validation_curve(
                estimator=pipe_lr,
                X=X_train,
                y=y_train,
                param_name='clf__C',
                param_range=param_range,
                cv=10)
train_mean = np.mean(train_scores, axis=1)
train_std = np.std(train_scores, axis=1)
test_mean = np.mean(test_scores, axis=1)
test_std = np.std(test_scores, axis=1)
plt.plot(param_range, train_mean,
         color='blue', marker='o',
         markersize=5, label='training accuracy')
plt.fill_between(param_range, train_mean + train_std,
                 train_mean - train_std, alpha=0.15,
                 color='blue')
plt.plot(param_range, test_mean,
         color='green', linestyle='--',
         marker='s', markersize=5,
         label='validation accuracy')
plt.fill_between(param_range,
                 test_mean + test_std,
                 test_mean - test_std,
                 alpha=0.15, color='green')
plt.grid()
plt.xscale('log')
plt.legend(loc='lower right')
plt.xlabel('Parameter C')
plt.ylabel('Accuracy')
plt.ylim([0.8, 1.0])
plt.tight_layout()
plt.show()

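This gives the validation curve for the parameter C. Like learning_curve, validation_curve uses stratified k-fold cross-validation by default to evaluate a classifier. Inside validation_curve we name the parameter to evaluate (here clf__C, the C parameter of the LogisticRegression step in the pipeline).

The figure below shows that the best value of C is 0.1.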

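4. Grid search

Grid search is one of the most common tuning methods, so it deserves a brief introduction.

Machine learning algorithms have a class of parameters that must be set by hand rather than learned from the data. These are called hyperparameters; examples include the learning rate, the regularization coefficient, and the depth of a decision tree.

Grid search looks for the parameter values that make the model perform best, and its principle is simply brute-force search: we fix a set of candidate values for each parameter in advance, enumerate every combination, and keep the best one.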
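
The enumeration itself fits in a few lines of plain Python. The sketch below is an illustration only: iter_grid, grid_search, and the score function passed in are hypothetical names, and the candidate values are the ones looped over in section 4.1.

```python
from itertools import product

# Candidate values for each hyperparameter.
grid = {'gamma': [0.001, 0.01, 0.1, 1, 10, 100],
        'C': [0.001, 0.01, 0.1, 1, 10, 100]}

def iter_grid(grid):
    """Yield one dict per combination of the candidate values."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))

def grid_search(score_fn, grid):
    """Return (best_score, best_params) for a user-supplied score function."""
    return max(((score_fn(params), params) for params in iter_grid(grid)),
               key=lambda pair: pair[0])

print(len(list(iter_grid(grid))))   # number of combinations to try
```

The cost grows multiplicatively: two parameters with six candidates each already mean 36 model fits, before any cross-validation.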

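4.1 Brute-force search with two nested for loops

The grid search below ends up with C = 100 and gamma = 0.001 as the best parameter values. Note that this naive version scores every candidate on the test set, so the best score it reports is an optimistic estimate.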

# naive grid search implementation
from sklearn.datasets import load_iris
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
iris = load_iris()
# Separate variable names keep the breast-cancer split (X_train, X_test, ...)
# intact for the sections that follow
X_train_iris, X_test_iris, y_train_iris, y_test_iris = train_test_split(
    iris.data, iris.target, random_state=0)
print("Size of training set: %d   size of test set: %d"
      % (X_train_iris.shape[0], X_test_iris.shape[0]))
best_score = 0
for gamma in [0.001, 0.01, 0.1, 1, 10, 100]:
    for C in [0.001, 0.01, 0.1, 1, 10, 100]:
        # for each combination of parameters
        # train an SVC
        svm = SVC(gamma=gamma, C=C)
        svm.fit(X_train_iris, y_train_iris)
        # evaluate the SVC on the test set
        score = svm.score(X_test_iris, y_test_iris)
        # if we got a better score, store the score and parameters
        if score > best_score:
            best_score = score
            best_parameters = {'C': C, 'gamma': gamma}
print("best score: ", best_score)
print("best parameters: ", best_parameters)

Output:
Size of training set: 112   size of test set: 38
best score:  0.973684210526
best parameters:  {'C': 100, 'gamma': 0.001}

4.2 Brute-force search with a parameter dictionary

The grid search below selects a linear kernel with C = 0.1 as the best parameters.

from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
pipe_svc = Pipeline([('scl', StandardScaler()),
                     ('clf', SVC(random_state=1))])
param_range = [0.0001, 0.001, 0.01, 0.1, 1.0, 10.0, 100.0, 1000.0]
param_grid = [{'clf__C': param_range,
               'clf__kernel': ['linear']},
              {'clf__C': param_range,
               'clf__gamma': param_range,
               'clf__kernel': ['rbf']}]
gs = GridSearchCV(estimator=pipe_svc,
                  param_grid=param_grid,
                  scoring='accuracy',
                  cv=10,
                  n_jobs=-1)
gs = gs.fit(X_train, y_train)
print(gs.best_score_)
print(gs.best_params_)

Output:
0.978021978022
{'clf__C': 0.1, 'clf__kernel': 'linear'}

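The param_grid argument of GridSearchCV is a list of dictionaries. For the linear SVM we only evaluate C; for the RBF-kernel SVM we evaluate both C and gamma. Finally, best_params_ gives the best parameter combination.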
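
To see how much work this grid implies, the candidates can be counted with sklearn's ParameterGrid helper. This is a small side calculation rather than part of the original workflow; it reuses the grid defined above.

```python
from sklearn.model_selection import ParameterGrid

param_range = [0.0001, 0.001, 0.01, 0.1, 1.0, 10.0, 100.0, 1000.0]
param_grid = [{'clf__C': param_range,
               'clf__kernel': ['linear']},
              {'clf__C': param_range,
               'clf__gamma': param_range,
               'clf__kernel': ['rbf']}]

n_candidates = len(ParameterGrid(param_grid))   # 8 linear + 8 * 8 rbf candidates
n_fits = n_candidates * 10                      # one fit per candidate per fold (cv=10)
print(n_candidates, n_fits)
```

The two dictionaries are enumerated separately and then combined, so gamma is never varied for the linear kernel, where it has no effect.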

Next, we build the model directly from the best parameters (best_estimator_):

clf = gs.best_estimator_
clf.fit(X_train, y_train)
print('Test accuracy: %.3f' % clf.score(X_test, y_test))

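Grid search works well, but exhaustive enumeration is time-consuming. sklearn also implements random search through the RandomizedSearchCV class, which randomly samples parameter combinations.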
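
A minimal sketch of random search follows. The choices here are assumptions made for illustration, not settings from the original post: the bundled load_breast_cancer data, log-uniform sampling ranges matching the grid above, 20 sampled candidates, and 5 folds.

```python
from scipy.stats import loguniform
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: the bundled copy of the dataset replaces the CSV download.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=1)

pipe_svc = Pipeline([('scl', StandardScaler()),
                     ('clf', SVC(random_state=1))])

# Distributions to sample from, instead of fixed lists of values.
param_distributions = {'clf__C': loguniform(1e-4, 1e3),
                       'clf__gamma': loguniform(1e-4, 1e3),
                       'clf__kernel': ['rbf']}

rs = RandomizedSearchCV(estimator=pipe_svc,
                        param_distributions=param_distributions,
                        n_iter=20,          # number of sampled combinations
                        scoring='accuracy',
                        cv=5,
                        random_state=1,
                        n_jobs=1)
rs.fit(X_train, y_train)
print(rs.best_score_)
print(rs.best_params_)
```

The budget is set by n_iter rather than by the size of the grid, so adding another parameter to search does not multiply the number of fits.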

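5. Nested cross-validation 🎯

Nested cross-validation is used to select among algorithms on a given dataset: the outer loop uses k-fold splitting to divide the data into training and test folds, and the inner loop runs cross-validation on each training fold to tune the parameters. In the paper Bias in Error Estimation When Using Cross-validation for Model Selection, Varma and Simon report that the error estimated with nested cross-validation is almost unbiased with respect to the true error.

In other words, an outer k-fold cross-validation splits the data into training and test sets, and an inner cross-validation handles model selection.

The figure below shows a nested cross-validation made up of a 5-fold outer loop and a 2-fold inner loop, also known as 5x2 cross-validation:

We keep using the same dataset as before; the package imports are omitted here.

Prediction accuracy of the SVM classifier: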

gs = GridSearchCV(estimator=pipe_svc,
                  param_grid=param_grid,
                  scoring='accuracy',
                  cv=2)

# Note: cv=2 in the inner GridSearchCV combined with cv=5 in
# cross_val_score below produces the 5 x 2 nested CV shown in the figure.

scores = cross_val_score(gs, X_train, y_train, scoring='accuracy', cv=5)
print('CV accuracy: %.3f +/- %.3f' % (np.mean(scores), np.std(scores)))

CV accuracy: 0.965 +/- 0.025

Prediction accuracy of the decision tree classifier:

from sklearn.tree import DecisionTreeClassifier

gs = GridSearchCV(estimator=DecisionTreeClassifier(random_state=0),
                  param_grid=[{'max_depth': [1, 2, 3, 4, 5, 6, 7, None]}],
                  scoring='accuracy',
                  cv=2)
scores = cross_val_score(gs, X_train, y_train, scoring='accuracy', cv=5)
print('CV accuracy: %.3f +/- %.3f' % (np.mean(scores), np.std(scores)))

CV accuracy: 0.921 +/- 0.029
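
Passing a GridSearchCV object to cross_val_score hides the two loops. The sketch below writes the outer loop by hand to show where each one lives. It is an illustration under one assumption: it runs on the full bundled load_breast_cancer data, so its numbers will differ from the output above.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

# Assumption: the bundled copy of the dataset replaces the CSV download.
X, y = load_breast_cancer(return_X_y=True)

outer_cv = StratifiedKFold(n_splits=5)
outer_scores = []
for train_idx, test_idx in outer_cv.split(X, y):
    # Inner loop: the 2-fold grid search sees only the outer training fold.
    gs = GridSearchCV(estimator=DecisionTreeClassifier(random_state=0),
                      param_grid=[{'max_depth': [1, 2, 3, 4, 5, 6, 7, None]}],
                      scoring='accuracy',
                      cv=2)
    gs.fit(X[train_idx], y[train_idx])
    # Outer loop: score the refitted best model on the untouched test fold.
    outer_scores.append(gs.score(X[test_idx], y[test_idx]))

print('CV accuracy: %.3f +/- %.3f' % (np.mean(outer_scores), np.std(outer_scores)))
```

Because each outer test fold plays no part in choosing max_depth, the averaged score estimates how the whole procedure (tuning included) generalizes.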

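6. Evaluation metrics 🎲

6.1 The confusion matrix and its implementation

Most readers will have heard of the confusion matrix. It looks roughly like this:

A few terms need defining first (class 0 is treated as the positive class here):

TP (True Positive): actual 0, predicted 0

FN (False Negative): actual 0, predicted 1

FP (False Positive): actual 1, predicted 0

TN (True Negative): actual 1, predicted 1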

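Several common metrics follow from these:

Accuracy = (TP + TN) / (TP + FN + FP + TN): the overall accuracy of the classifier (across all classes)

Precision = TP / (TP + FP): the accuracy among samples predicted as 0

Recall = TP / (TP + FN): the accuracy among samples that are actually 0

Specificity = TN / (TN + FP): the accuracy among samples that are actually 1

Negative predictive value = TN / (TN + FN): the accuracy among samples predicted as 1

F1-Score = 2 * Precision * Recall / (Precision + Recall): for a given class, a metric that combines Precision and Recall; it ranges from 0 to 1, where 1 is best and 0 is worst

F-beta Score = (1 + beta^2) * Precision * Recall / (beta^2 * Precision + Recall): another measure that combines Precision and Recall, a generalization of the F1-Score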

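Here is another example:

There are plenty of articles about the confusion matrix online, so there is no need to memorize it; look it up when you need it. The code for the confusion matrix: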

from sklearn.metrics import confusion_matrix

pipe_svc.fit(X_train, y_train)
y_pred = pipe_svc.predict(X_test)
confmat = confusion_matrix(y_true=y_test, y_pred=y_pred)
print(confmat)

Output:
[[71  1]
 [ 2 40]]

fig, ax = plt.subplots(figsize=(2.5, 2.5))
ax.matshow(confmat, cmap=plt.cm.Blues, alpha=0.3)
for i in range(confmat.shape[0]):
    for j in range(confmat.shape[1]):
        ax.text(x=j, y=i, s=confmat[i, j], va='center', ha='center')

plt.xlabel('predicted label')
plt.ylabel('true label')

plt.tight_layout()
plt.show()

6.2 Implementing the evaluation metrics

Implementations of precision, recall, and the F1 score:

from sklearn.metrics import precision_score, recall_score, f1_score

print('Precision: %.3f' % precision_score(y_true=y_test, y_pred=y_pred))
print('Recall: %.3f' % recall_score(y_true=y_test, y_pred=y_pred))
print('F1: %.3f' % f1_score(y_true=y_test, y_pred=y_pred))

Precision: 0.976
Recall: 0.952
F1: 0.964
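
These numbers can be checked by hand from the confusion matrix printed in section 6.1. One detail matters: sklearn's metric functions default to pos_label=1, so in the sketch below class 1 is the positive class, unlike the class-0 convention used in the definitions earlier.

```python
# Confusion matrix from section 6.1: rows are true labels, columns are predictions.
confmat = [[71, 1],
           [2, 40]]

# With class 1 as the positive class (sklearn's default pos_label):
tn, fp = confmat[0]   # true label 0: predicted 0 / predicted 1
fn, tp = confmat[1]   # true label 1: predicted 0 / predicted 1

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print('Accuracy: %.3f' % accuracy)
print('Precision: %.3f' % precision)
print('Recall: %.3f' % recall)
print('F1: %.3f' % f1)
```

Precision, recall, and F1 come out at 0.976, 0.952, and 0.964, matching the sklearn output above.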

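Choosing the best model automatically under a specified metric:

The metric used for evaluation can be fixed through the arguments of make_scorer (here f1_score with pos_label=0), and the resulting scorer is passed straight to GridSearchCV through its scoring argument.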

from sklearn.metrics import make_scorer

scorer = make_scorer(f1_score, pos_label=0)

c_gamma_range = [0.01, 0.1, 1.0, 10.0]

param_grid = [{'clf__C': c_gamma_range,
               'clf__kernel': ['linear']},
              {'clf__C': c_gamma_range,
               'clf__gamma': c_gamma_range,
               'clf__kernel': ['rbf']}]

gs = GridSearchCV(estimator=pipe_svc,
                  param_grid=param_grid,
                  scoring=scorer,
                  cv=10,
                  n_jobs=-1)
gs = gs.fit(X_train, y_train)
print(gs.best_score_)
print(gs.best_params_)

0.982798668208
{'clf__C': 0.1, 'clf__kernel': 'linear'}
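
The pos_label argument changes which class the F1 score is computed for. As a small illustration, the sketch below rebuilds label vectors that reproduce the confusion matrix from section 6.1 (the ordering of the samples is an arbitrary assumption) and scores them both ways.

```python
from sklearn.metrics import f1_score

# Labels arranged to reproduce the confusion matrix [[71, 1], [2, 40]].
y_true = [0] * 72 + [1] * 42
y_pred = [0] * 71 + [1] * 1 + [0] * 2 + [1] * 40

f1_pos1 = f1_score(y_true, y_pred)               # default: class 1 is positive
f1_pos0 = f1_score(y_true, y_pred, pos_label=0)  # class 0 is positive

print('F1 (pos_label=1): %.3f' % f1_pos1)
print('F1 (pos_label=0): %.3f' % f1_pos0)
```

A scorer built with pos_label=0 therefore ranks the candidate models by how well they identify class 0, which may or may not be the class of interest in a given problem.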

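6.3 The ROC curve and its implementation

Understanding the ROC curve requires understanding the confusion matrix first; see section 6.1 for the details. Two concepts matter most here:

True positive rate (TPR): the proportion of positive samples that the model correctly predicts as positive:

TPR = TP / (TP + FN)

False positive rate (FPR): the proportion of negative samples that the model wrongly predicts as positive:

FPR = FP / (FP + TN)

The ROC curve:

The ROC (receiver operating characteristic) curve shows the relationship between a classifier's true positive rate and false positive rate, as in the figure below:

Plotting the ROC curve:

For a given classifier and test set we obviously get only one classification result, that is, one (FPR, TPR) pair, whereas a curve needs a whole series of FPR and TPR values.

How do we get them? Simple: take the probabilities the model predicts and apply different thresholds to obtain different predictions. What does that mean?

For example:

Take 5 samples whose true target labels are y = (1, 1, 0, 0, 1).

The classifier predicts the probability of class 1 for each sample as p = (0.5, 0.6, 0.55, 0.4, 0.7).

We have to choose a threshold to turn the probabilities into classes.

With a threshold of 0.1, all 5 samples are assigned to class 1.

With 0.3, the result is the same.

With 0.45, only sample 4 is assigned to class 0.

Computing FPR and TPR for every classification result obtained this way and joining the points gives the ROC curve. The more threshold values we try, the smoother the ROC curve.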
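
The five-sample example can be worked through in a few lines of plain Python. Two things in this sketch are assumptions: the helper name roc_point, and the convention that a probability equal to the threshold counts as class 1. The thresholds above 0.45 are extra values added to trace out more of the curve.

```python
y = [1, 1, 0, 0, 1]
p = [0.5, 0.6, 0.55, 0.4, 0.7]

def roc_point(y_true, scores, threshold):
    """Return (FPR, TPR) when every score >= threshold is labelled 1."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, q in zip(y_true, pred) if t == 1 and q == 1)
    fp = sum(1 for t, q in zip(y_true, pred) if t == 0 and q == 1)
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    return fp / n_neg, tp / n_pos

for threshold in [0.1, 0.3, 0.45, 0.52, 0.58, 0.65, 0.8]:
    fpr, tpr = roc_point(y, p, threshold)
    print('threshold=%.2f  FPR=%.2f  TPR=%.2f' % (threshold, fpr, tpr))
```

Each threshold contributes one point; sorting the points by FPR and connecting them gives the ROC curve for these five samples.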

Code for the ROC curve:

from sklearn.metrics import roc_curve, auc

pipe_lr = Pipeline([('scl', StandardScaler()),
                    ('pca', PCA(n_components=2)),
                    ('clf', LogisticRegression(penalty='l2',
                                               random_state=0,
                                               C=100.0))])

# Using every feature makes the predictions too good for an interesting
# ROC curve, so only two features are kept
X_train2 = X_train[:, [4, 14]]

# No shuffling, so no random_state: recent scikit-learn versions raise an
# error when random_state is set while shuffle=False
cv = list(StratifiedKFold(n_splits=3).split(X_train, y_train))

fig = plt.figure(figsize=(7, 5))

mean_tpr = 0.0
mean_fpr = np.linspace(0, 1, 100)
all_tpr = []

for i, (train, test) in enumerate(cv):
    probas = pipe_lr.fit(X_train2[train],
                         y_train[train]).predict_proba(X_train2[test])

    fpr, tpr, thresholds = roc_curve(y_train[test],
                                     probas[:, 1],
                                     pos_label=1)
    # np.interp is used in place of the deprecated scipy.interp alias
    mean_tpr += np.interp(mean_fpr, fpr, tpr)
    mean_tpr[0] = 0.0
    roc_auc = auc(fpr, tpr)
    plt.plot(fpr,
             tpr,
             lw=1,
             label='ROC fold %d (area = %0.2f)'
                   % (i+1, roc_auc))

plt.plot([0, 1],
         [0, 1],
         linestyle='--',
         color=(0.6, 0.6, 0.6),
         label='random guessing')

mean_tpr /= len(cv)
mean_tpr[-1] = 1.0
mean_auc = auc(mean_fpr, mean_tpr)
plt.plot(mean_fpr, mean_tpr, 'k--',
         label='mean ROC (area = %0.2f)' % mean_auc, lw=2)
plt.plot([0, 0, 1],
         [0, 1, 1],
         lw=2,
         linestyle=':',
         color='black',
         label='perfect performance')

plt.xlim([-0.05, 1.05])
plt.ylim([-0.05, 1.05])
plt.xlabel('false positive rate')
plt.ylabel('true positive rate')
plt.title('Receiver Operator Characteristic')
plt.legend(loc="lower right")

plt.tight_layout()
plt.show()

Check the AUC and accuracy:

pipe_lr = pipe_lr.fit(X_train2, y_train)
y_labels = pipe_lr.predict(X_test[:, [4, 14]])
y_probas = pipe_lr.predict_proba(X_test[:, [4, 14]])[:, 1]
# note that we use probabilities for roc_auc
# the `[:, 1]` selects the positive class label only

from sklearn.metrics import roc_auc_score, accuracy_score
print('ROC AUC: %.3f' % roc_auc_score(y_true=y_test, y_score=y_probas))
print('Accuracy: %.3f' % accuracy_score(y_true=y_test, y_pred=y_labels))

ROC AUC: 0.752
Accuracy: 0.711