Stock Prediction Analysis: k-Nearest Neighbors

Date: 2020-01-25


In the previous post we looked at linear regression. This time, let's see how k-nearest neighbors performs.

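The k-nearest neighbor (kNN) classification algorithm is a theoretically mature method and one of the simplest machine learning algorithms. The idea is this: if most of the k samples closest to a given sample in feature space belong to a particular class, then that sample is assigned to that class as well. For a regression task such as price prediction, the same idea applies, with the prediction taken as the average of the target values of the k nearest neighbors.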
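To make the majority-vote idea concrete, here is a minimal sketch in plain Python. The function name `knn_classify` and the toy points and labels are hypothetical illustrations, not part of the original code, and Euclidean distance is assumed as the distance measure.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Euclidean distance from the query to every training point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # keep the k closest neighbours
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # majority vote over the neighbours' labels
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# toy data: two well-separated clusters in a 2-D feature space (made-up values)
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["down", "down", "down", "up", "up", "up"]

prediction = knn_classify(points, labels, query=(1.1, 1.0), k=3)
print(prediction)
```

The query point sits inside the first cluster, so all three of its nearest neighbours carry the same label and the vote is unanimous.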

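# import the libraries: neighbors provides kNN, GridSearchCV runs the grid search, MinMaxScaler does the normalization
from sklearn import neighbors
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import MinMaxScaler
# numpy, pandas and matplotlib are used further down
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# scale every feature to the range [0, 1]
scaler = MinMaxScaler(feature_range=(0, 1))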

We use the same training and validation sets as in the previous section:

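# scale the data (normalization) and convert the results back into pandas DataFrames
x_train_scaled = scaler.fit_transform(x_train)
x_train = pd.DataFrame(x_train_scaled)
# note: fit_transform refits the scaler on the validation set; scaler.transform(x_valid) would reuse the training-set scaling instead
x_valid_scaled = scaler.fit_transform(x_valid)
x_valid = pd.DataFrame(x_valid_scaled)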
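Min-max scaling maps each feature column to [0, 1] via (x - min) / (max - min). The sketch below shows that arithmetic in plain Python, with the minima and maxima recorded on the training rows and then reused for the validation rows, which is what `scaler.transform` would do. The helper names `fit_min_max` and `apply_min_max` and the sample rows are hypothetical.

```python
def fit_min_max(rows):
    """Record the per-column minimum and maximum of the given rows."""
    columns = list(zip(*rows))
    return [min(col) for col in columns], [max(col) for col in columns]


def apply_min_max(rows, mins, maxs):
    """Scale each column with previously recorded minima and maxima."""
    scaled = []
    for row in rows:
        scaled.append([
            # a constant column has zero range; map it to 0.0 to avoid dividing by zero
            (value - lo) / (hi - lo) if hi != lo else 0.0
            for value, lo, hi in zip(row, mins, maxs)
        ])
    return scaled


# made-up feature rows: [year, month]
train_rows = [[2018.0, 1.0], [2018.0, 6.0], [2019.0, 11.0]]
valid_rows = [[2019.0, 12.0]]

mins, maxs = fit_min_max(train_rows)                   # statistics from training data only
train_scaled = apply_min_max(train_rows, mins, maxs)
valid_scaled = apply_min_max(valid_rows, mins, maxs)   # may fall outside [0, 1]
print(train_scaled)
print(valid_scaled)
```

Because the validation month (12) exceeds the training maximum (11), its scaled value lands slightly above 1. Calling `fit_transform` on the validation set would instead recompute the minima and maxima from the validation rows themselves, so the two sets would no longer share one scale.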

# use grid search to find the best value of n_neighbors
params = {'n_neighbors': [2, 3, 4, 5, 6, 7, 8, 9]}
knn = neighbors.KNeighborsRegressor()
# create the model: a 5-fold cross-validated search over params
model = GridSearchCV(knn, params, cv=5)
# fit the model and make predictions
model.fit(x_train, y_train)
preds = model.predict(x_valid)
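As a rough illustration of what the grid search does, the following plain-Python sketch tries each candidate k, scores it with 5-fold cross-validation, and keeps the best one. It is a simplification under stated assumptions, not scikit-learn's implementation: the functions `knn_regress`, `cv_mse` and `grid_search_k` and the toy data are hypothetical, and the sketch ranks candidates by mean squared error, whereas `GridSearchCV` by default uses the estimator's own score (R² for a regressor).

```python
import math


def knn_regress(train_x, train_y, query, k):
    """Predict the mean target of the k training points nearest to query."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k


def cv_mse(xs, ys, k, n_folds=5):
    """Mean squared error of kNN regression, averaged over contiguous folds."""
    n = len(xs)
    fold_errors = []
    for fold in range(n_folds):
        start, stop = fold * n // n_folds, (fold + 1) * n // n_folds
        # hold out one contiguous slice, train on everything else
        held_x, held_y = xs[start:stop], ys[start:stop]
        rest_x, rest_y = xs[:start] + xs[stop:], ys[:start] + ys[stop:]
        errors = [
            (knn_regress(rest_x, rest_y, x, k) - y) ** 2
            for x, y in zip(held_x, held_y)
        ]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / n_folds


def grid_search_k(xs, ys, candidates, n_folds=5):
    """Return the candidate k with the lowest cross-validated error, plus all scores."""
    scores = {k: cv_mse(xs, ys, k, n_folds) for k in candidates}
    best_k = min(scores, key=scores.get)
    return best_k, scores


# made-up data: one feature, target is twice the feature
xs = [[float(i)] for i in range(20)]
ys = [2.0 * i for i in range(20)]

best_k, scores = grid_search_k(xs, ys, candidates=[2, 3, 4, 5])
print(best_k, scores)
```

After the search, `GridSearchCV` refits the estimator on the full training set with the winning parameter, which is why `model.predict` can be called directly on the validation data.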

Results

# compute the RMSE
rms = np.sqrt(np.mean(np.power(np.array(y_valid) - np.array(preds), 2)))
# display the result (optional)
rms

115.17086550026721
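For reference, the RMSE is the square root of the mean of the squared differences between actual and predicted values. A small worked example in plain Python, using made-up numbers and a hypothetical helper `rmse`, shows the arithmetic:

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# made-up values: errors are -3, 4, 0, 0, so the squared errors are 9, 16, 0, 0
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [13.0, 16.0, 30.0, 40.0]

value = rmse(actual, predicted)
print(value)  # sqrt(25 / 4) = 2.5
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones do.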

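The RMSE does not differ much from that of linear regression, but a plot of the predicted values against the actual values should give a clearer picture.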

# plot the training data (green), the actual validation values (blue) and the predictions (orange), assuming matplotlib's default color cycle
valid['Predictions'] = preds
plt.plot(valid[['Close', 'Predictions']])
plt.plot(train['Close'])

Inference

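The RMSE is close to that of the linear regression model, and the plot shows the same pattern. Like linear regression, kNN identified a drop in January 2018, because that has been the pattern over the past few years.

We can safely say that regression algorithms do not perform well on this dataset.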

Reference: https://www.jiqizhixin.com/articles/2019-01-04-16
