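Different types of models call for different evaluation metrics. Let's start with the two most direct types, regression and classification. For regression problems, where the result is continuous,
the commonly used metrics are roughly these three: MSE (mean squared error), MAE (mean absolute error), and RMSE (root mean squared error). The formulas for the three are not reproduced here.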
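For reference, the three regression metrics can be sketched in a few lines of plain Python. This is only a minimal illustration: the function names (`mse`, `mae`, `rmse`) and the sample values are assumptions made up for this example, not taken from the book.

```python
import math


def mse(actual, predicted):
    """Mean squared error: the average of the squared residuals."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean absolute error: the average of the absolute residuals."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(actual, predicted))


# Hypothetical sample values, chosen only for illustration.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 8.0]

print("MSE :", mse(actual, predicted))
print("MAE :", mae(actual, predicted))
print("RMSE:", rmse(actual, predicted))
```

Because RMSE is the square root of MSE, it is expressed in the same unit as the target, which usually makes it easier to compare against MAE.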
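For discrete classification problems, we usually look at the ROC curve and the AUC (the area under that curve). For a good model, the ROC curve rises straight to 1 at the very start and then stays at 1, which gives AUC = 1.0.
Getting as close to that value as possible is the goal we work toward.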
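To make the relation between the ROC curve and the AUC concrete, here is a rough from-scratch sketch in plain Python. The helper names (`roc_points`, `auc_trapezoid`) and the toy labels and scores are assumptions for illustration; in practice `roc_curve` and `auc` from `sklearn.metrics` do this work.

```python
def roc_points(labels, scores):
    """Sweep the decision threshold from high to low and collect (FPR, TPR) points.

    Assumes labels are 0/1 and that both classes are present.
    """
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    # Rank samples by score, highest first.
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        current = ranked[i][0]
        # Consume every sample tied at the current score before emitting a point.
        while i < len(ranked) and ranked[i][0] == current:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / neg)
        tpr.append(tp / pos)
    return fpr, tpr


def auc_trapezoid(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2.0
               for k in range(1, len(fpr)))


# Hypothetical toy data: three positives followed by three negatives.
labels = [1, 1, 1, 0, 0, 0]
perfect_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]  # every positive outranks every negative
mixed_scores = [0.9, 0.4, 0.7, 0.8, 0.2, 0.1]    # one negative outranks two positives

auc_perfect = auc_trapezoid(*roc_points(labels, perfect_scores))
auc_mixed = auc_trapezoid(*roc_points(labels, mixed_scores))
print("perfect ranking AUC:", auc_perfect)
print("mixed ranking AUC  :", auc_mixed)
```

With the perfect ranking the curve climbs to TPR = 1 before FPR leaves 0, so the area is 1; every positive/negative pair that is ranked the wrong way round removes part of that area.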
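Here is a practical example, taken from the book 《預測分析核心算法》 (Machine Learning in Python: Essential Techniques for Predictive Analysis).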
# -*- coding: utf-8 -*-
__author__ = 'gxjun'
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import linear_model
from sklearn.metrics import roc_curve, auc


# Count the confusion-matrix entries at a given decision threshold
def confusionMatrix(predicted, actual, threshold):
    if len(predicted) != len(actual):
        return -1
    tp = 0.0
    fp = 0.0
    tn = 0.0
    fn = 0.0
    for i in range(len(actual)):
        if actual[i] > 0.5:                  # actual positive
            if predicted[i] > threshold:
                tp += 1.0
            else:
                fn += 1.0
        else:                                # actual negative
            if predicted[i] < threshold:
                tn += 1.0
            else:
                fp += 1.0
    rtn = [tp, tn, fp, fn]
    return rtn


target_url = ("https://archive.ics.uci.edu/ml/machine-learning-databases/undocumented/connectionist-bench/sonar/sonar.all-data")
# The file has no header row. read_csv(prefix='V') is gone in recent pandas,
# so the 'V' prefix is added to the column names afterwards.
data = pd.read_csv(target_url, header=None).add_prefix('V')
print('-' * 80)
print(data.head())
print('-' * 80)
print(data.tail())
print('-' * 80)
print(data.describe())
print('-' * 80)

# Encode the last column as 1.0 for 'M' and 0.0 otherwise
label = []
for i in range(208):
    if data.iat[i, -1] == 'M':
        label.append(1.0)
    else:
        label.append(0.0)
print(label)

dataRows = data.iloc[:, 0:-1]
x_train = np.array(dataRows)
y_train = np.array(label)
# Note: the "test" set is simply the first third of the training rows
x_test = np.array(dataRows[0:int(208 / 3)])
y_test = np.array(label[0:int(208 / 3)])
print("x_train shape: {} , y_train shape: {}".format(x_train.shape, y_train.shape))
print("x_test shape: {} , y_test shape: {}".format(x_test.shape, y_test.shape))

# train model
rockModel = linear_model.LinearRegression()
rockModel.fit(x_train, y_train)
prob = rockModel.predict(x_train)
print('-' * 80)
confusionMatrain = confusionMatrix(prob, y_train, threshold=0.5)
# print(confusionMatrain)

fpr, tpr, threshold = roc_curve(y_train, prob)
roc_auc = auc(fpr, tpr)

plt.clf()
plt.plot(fpr, tpr, label='ROC curve (area = %0.2f)' % roc_auc)
plt.plot([0, 1], [0, 1], 'k-')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.0])
plt.xlabel("FP rate")
plt.ylabel("TP rate")
plt.title("ROC")
plt.legend(loc="lower right")
plt.show()
The result is: