Preface:
Whenever loss functions come up, many people share a misconception: they assume that the scoring argument used in GridSearchCV (grid search with cross validation) is the loss function. It is not. When we construct an XGBRegressor, the objective parameter of the constructor is the real loss function. The loss function that xgboost's sklearn API expects returns the first- and second-order derivatives (gradient and Hessian), whereas the scoring function used by GridSearchCV returns a single float score (call it an accuracy or an error value).
You should be careful with the notation.
There are 2 levels of optimization here:
- The loss function optimized when the XGBRegressor is fitted to the data.
- The scoring function that is optimized during the grid search.
I prefer calling the second scoring function instead of loss function, since loss function usually refers to a term that is subject to optimization during the model fitting process itself.
(Source: Scikit-Learn: Custom Loss Function for GridSearchCV)
Therefore, in the rest of this article objective is consistently called the "objective function", and scoring the "scoring function".
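To make the distinction concrete, here is a minimal sketch (not from the original article; my_objective, my_score and the parameter grid are hypothetical): the objective function is handed to XGBRegressor and must return the gradient and Hessian, while the scoring function is wrapped with sklearn.metrics.make_scorer and handed to GridSearchCV, where it must return a single float.

# Hedged sketch of the two levels of optimization; names and values are illustrative.
import numpy as np
import xgboost as xgb
from sklearn.metrics import make_scorer
from sklearn.model_selection import GridSearchCV

def my_objective(real, predict):
    # objective: plain squared error, optimized while each model is fitted
    grad = predict - real              # first-order derivative
    hess = np.ones_like(grad)          # second-order derivative (constant 1)
    return grad, hess

def my_score(y_true, y_pred):
    # scorer: negative mean absolute error, a single float, higher is better
    return -float(np.mean(np.abs(y_true - y_pred)))

search = GridSearchCV(
    estimator=xgb.XGBRegressor(objective=my_objective, booster="gblinear"),
    param_grid={"n_estimators": [50, 100, 200]},
    scoring=make_scorer(my_score))
# search.fit(X, y) then optimizes the objective inside every single fit
# and uses the scorer to compare the fitted models across the parameter grid.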
========== Original article ===================
Many specific tasks need a customized objective function to get better results. Here we take xgboost regression as an example and walk through how to customize the objective function. A simple example:
def customObj1(real, predict):
    grad = predict - real
    hess = np.power(np.abs(grad), 0.5)
    return grad, hess
Many online tutorials define the objective function with preds as the first parameter and dtrain as the second. Because this article uses xgboost's sklearn API, the custom objective must instead follow the sklearn signature. The objective is passed in like this:
model = xgb.XGBRegressor(objective=customObj1, booster="gblinear")
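For comparison, here is a minimal sketch of the same objective written for the native xgb.train API (not part of the original article; the commented call assumes training data like the X_train / Y_train used further below). In the native API the custom objective receives (preds, dtrain) and reads the labels from the DMatrix:

import numpy as np
import xgboost as xgb

def customObj1_native(preds, dtrain):
    real = dtrain.get_label()               # labels come from the DMatrix here
    grad = preds - real
    hess = np.power(np.abs(grad), 0.5)
    return grad, hess

# dtrain = xgb.DMatrix(X_train, label=Y_train)
# booster = xgb.train({"booster": "gblinear"}, dtrain,
#                     num_boost_round=100, obj=customObj1_native)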
Below is an animation of the predictions at different numbers of iterations:
We can see that the choice of objective function noticeably affects how fast the model converges, but all of them converge to roughly the same final fit, as shown below:
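One way to read this result (a sketch of the reasoning, not from the original article): all three objectives share the same gradient, grad = predict - real, and xgboost's updates are Newton-style steps of roughly the form

    step ≈ -Σ grad / (Σ hess + lambda)

so the Hessian mainly rescales the step size: hess = 1 (reg:linear) gives the plain gradient step, while hess = |grad|**0.5 or |grad|**0.1 enlarges or shrinks each step depending on the size of the residual. The fixed point, where the gradients sum to zero, is the same in all three cases, which is why the convergence speed differs but the final curve does not.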
The complete code is as follows:
# coding=utf-8
import os

import pandas as pd
import numpy as np
import xgboost as xgb
import matplotlib.pyplot as plt

plt.rcParams.update({'figure.autolayout': True})

# toy training data
df = pd.DataFrame({'x': [-2.1, -0.9, 0, 1, 2, 2.5, 3, 4],
                   'y': [-10, 0, -5, 10, 20, 10, 30, 40]})
X_train = df.drop('y', axis=1)
Y_train = df['y']
# x values used only for prediction / plotting
X_pred = [-4, -3, -2, -1, 0, 0.4, 0.6, 1, 1.4, 1.6, 2, 3, 4, 5, 6, 7, 8]


def process_list(list_in):
    # pretty-print helper for the debug prints below
    result = map(lambda x: "%8.2f" % round(float(x), 2), list_in)
    return list(result)


def customObj3(real, predict):
    # same gradient as squared error, but hess = |grad| ** 0.1
    grad = predict - real
    hess = np.power(np.abs(grad), 0.1)
    # print 'predict', process_list(predict.tolist()), type(predict)
    # print ' real  ', process_list(real.tolist()), type(real)
    # print ' grad  ', process_list(grad.tolist()), type(grad)
    # print ' hess  ', process_list(hess.tolist()), type(hess), '\n'
    return grad, hess


def customObj1(real, predict):
    # same gradient as squared error, but hess = |grad| ** 0.5
    grad = predict - real
    hess = np.power(np.abs(grad), 0.5)
    return grad, hess


os.makedirs("output", exist_ok=True)  # directory for the saved frames

for n_estimators in range(5, 600, 5):
    booster_str = "gblinear"
    model = xgb.XGBRegressor(objective=customObj1, booster=booster_str,
                             n_estimators=n_estimators)
    model2 = xgb.XGBRegressor(objective="reg:linear", booster=booster_str,
                              n_estimators=n_estimators)
    model3 = xgb.XGBRegressor(objective=customObj3, booster=booster_str,
                              n_estimators=n_estimators)

    model.fit(X=X_train, y=Y_train)
    model2.fit(X=X_train, y=Y_train)
    model3.fit(X=X_train, y=Y_train)

    y_pred = model.predict(pd.DataFrame({'x': X_pred}))
    y_pred2 = model2.predict(pd.DataFrame({'x': X_pred}))
    y_pred3 = model3.predict(pd.DataFrame({'x': X_pred}))

    # one frame per n_estimators value
    plt.figure(figsize=(6, 5))
    plt.axes().set(title='n_estimators=' + str(n_estimators))
    plt.plot(df['x'], df['y'], marker='o', linestyle=":", label="Real Y")
    plt.plot(X_pred, y_pred, label="predict - real; |grad|**0.5")
    plt.plot(X_pred, y_pred3, label="predict - real; |grad|**0.1")
    plt.plot(X_pred, y_pred2, label="reg:linear")
    plt.xlim(-4.5, 8.5)
    plt.ylim(-25, 55)
    plt.legend()
    # plt.show()
    plt.savefig("output/n_estimators_" + str(n_estimators) + ".jpg")
    plt.close()
    print(n_estimators)