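Fitting Linear Models with Custom Loss Functions and Regularization in Python
Suppose we have 100 sample points, each with a 10-dimensional feature vector (9 base predictors plus 1 intercept). To make the experiment more illustrative, we add noise to the samples: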
import numpy as np
from sklearn.preprocessing import StandardScaler

# Generate predictors
X_raw = np.random.random(100*9)
X_raw = np.reshape(X_raw, (100, 9))

# Standardize the predictors
scaler = StandardScaler().fit(X_raw)
X = scaler.transform(X_raw)

# Add an intercept column to the model.
# Taking absolute values keeps every predictor non-negative, so with
# positive coefficients Y_true is strictly positive and MAPE is defined.
X = np.abs(np.concatenate((np.ones((X.shape[0],1)), X), axis=1))

# Define my "true" beta coefficients
beta = np.array([2,6,7,3,5,7,1,2,2,8])

# Y = Xb
Y_true = np.matmul(X,beta)

# Observed data with multiplicative (log-normal) noise
Y = Y_true*np.exp(np.random.normal(loc=0.0, scale=0.2, size=100))
Two loss functions are considered:
Mean Absolute Percentage Error (MAPE)
Weighted MAPE
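As a sketch of what the two losses compute, they can be written as follows. The symbols $y_i$ (observed value), $\hat{y}_i$ (prediction), $w_i$ (sample weight) and $n$ (number of samples) are notation assumed here for illustration; both expressions are in percent.

```latex
\mathrm{MAPE}(y, \hat{y}) = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right|
\qquad
\mathrm{WMAPE}(y, \hat{y}, w) = \frac{100}{\sum_{i=1}^{n} w_i} \sum_{i=1}^{n} w_i \left| \frac{y_i - \hat{y}_i}{y_i} \right|
```

With equal weights the weighted form reduces to plain MAPE. Neither expression is defined when some $y_i = 0$.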
The loss function is:
def mean_absolute_percentage_error(y_true, y_pred, sample_weights=None):
    y_true = np.array(y_true)
    y_pred = np.array(y_pred)
    assert len(y_true) == len(y_pred)

    # MAPE divides by y_true, so drop observations where y_true is zero
    if np.any(y_true==0):
        print("Found zeroes in y_true. MAPE undefined. Removing from set...")
        idx = np.where(y_true==0)
        y_true = np.delete(y_true, idx)
        y_pred = np.delete(y_pred, idx)
        if sample_weights is not None:
            sample_weights = np.array(sample_weights)
            sample_weights = np.delete(sample_weights, idx)

    if sample_weights is None:
        # Unweighted MAPE, in percent
        return(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)
    else:
        # Weighted MAPE: weighted average of the absolute percentage errors
        sample_weights = np.array(sample_weights)
        assert len(sample_weights) == len(y_true)
        return(100/sum(sample_weights)*np.dot(
            sample_weights, (np.abs((y_true - y_pred) / y_true))
        ))
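A small worked example may help in checking the arithmetic. The helper `mape` here is a hypothetical, compact restatement of the loss (it skips the zero-handling branch), and the two data points and the weights are made up purely for illustration.

```python
import numpy as np

def mape(y_true, y_pred, sample_weights=None):
    """Compact, hypothetical restatement of the MAPE loss (in percent)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ape = np.abs((y_true - y_pred) / y_true)  # absolute percentage errors
    if sample_weights is None:
        return 100 * np.mean(ape)
    w = np.asarray(sample_weights, dtype=float)
    return 100 * np.dot(w, ape) / np.sum(w)

y_true = [100.0, 200.0]
y_pred = [110.0, 190.0]  # off by 10% and 5% respectively

# Unweighted: (10 + 5) / 2, about 7.5
unweighted = mape(y_true, y_pred)
# Weighted with weights 1 and 3: (1*10 + 3*5) / 4, about 6.25
weighted = mape(y_true, y_pred, sample_weights=[1, 3])
print(unweighted, weighted)
```

Giving the second point three times the weight pulls the result toward its smaller 5% error, which is the effect sample weights are meant to have.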
The conventional solution:
The solution used in this approach:
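For reference, the two approaches can be contrasted as follows. This is a sketch that assumes the conventional approach means ordinary least squares with its closed-form solution; $L$ stands for whichever loss is chosen (MAPE or weighted MAPE) and $\hat{\beta}$ for the fitted coefficients, both notation introduced here.

```latex
\hat{\beta}_{\mathrm{OLS}} = \arg\min_{\beta} \lVert Y - X\beta \rVert_2^2 = (X^{\top} X)^{-1} X^{\top} Y
\qquad \text{vs.} \qquad
\hat{\beta} = \arg\min_{\beta} L(Y, X\beta)
```

The first has a closed form whenever $X^{\top} X$ is invertible. The second generally has none for a loss such as MAPE, so it is solved with a numerical optimizer.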
The corresponding code is:
from scipy.optimize import minimize

loss_function = mean_absolute_percentage_error

def objective_function(beta, X, Y):
    # loss_function expects (y_true, y_pred): observed Y first,
    # model predictions second
    error = loss_function(Y, np.matmul(X,beta))
    return(error)

# You must provide a starting point at which to initialize
# the parameter search space
beta_init = np.array([1]*X.shape[1])
result = minimize(objective_function, beta_init, args=(X,Y),
                  method='BFGS', options={'maxiter': 500})

# The optimal values for the input parameters are stored
# in result.x
beta_hat = result.x
print(beta_hat)
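One way to sanity-check such a fit is to compare it with an ordinary least-squares baseline on the same data. The sketch below is self-contained and makes several assumptions that are not part of the post: a fixed random seed, predictors drawn directly from [0, 1) without standardization, the hypothetical helper names `mape` and `objective`, and the least-squares solution as the optimizer's starting point (the post starts from a vector of ones).

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)  # fixed seed, assumed for reproducibility

# Synthetic data in the spirit of the post: intercept plus 9 non-negative predictors
X = np.hstack([np.ones((100, 1)), rng.random((100, 9))])
beta_true = np.array([2, 6, 7, 3, 5, 7, 1, 2, 2, 8], dtype=float)
Y = (X @ beta_true) * np.exp(rng.normal(0.0, 0.2, size=100))  # multiplicative noise

def mape(y_true, y_pred):
    """MAPE in percent; assumes y_true contains no zeros."""
    return 100 * np.mean(np.abs((y_true - y_pred) / y_true))

def objective(beta, X, Y):
    # Observed values first, predictions second
    return mape(Y, X @ beta)

# Least-squares baseline, also used as the optimizer's starting point
beta_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
mape_ols = objective(beta_ols, X, Y)

result = minimize(objective, beta_ols, args=(X, Y),
                  method="BFGS", options={"maxiter": 500})
beta_mape = result.x
mape_fit = result.fun

print("converged:", result.success)
print("in-sample MAPE, OLS:     ", mape_ols)
print("in-sample MAPE, MAPE fit:", mape_fit)
```

Because the search starts at the least-squares solution, the in-sample MAPE of the returned point should be no worse than that baseline. MAPE is not smooth, so `result.success` may be `False` even when the returned coefficients are usable; it is worth inspecting `result.message` in that case.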