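This article was originally published by AI Front (AI前線). Original link: Predicting Bitcoin Prices with TensorFlow and Keras
Author | 黃功詳 Steeve Huang
Translator | Erica Yi
Editor | Emily
AI Front introduction: Cryptocurrencies, and Bitcoin in particular, have recently been among the hottest topics on social media and search engines. Because cryptocurrencies are highly volatile, a smart and well-reasoned investment strategy could yield large returns. Suddenly it seems as if everyone in the world is talking about cryptocurrency. Compared with traditional financial instruments, however, cryptocurrencies lack established indicators, which makes their prices relatively hard to predict. Using Bitcoin as an example, this article shows how to use deep learning to predict cryptocurrency prices and to gain insight into Bitcoin's future trend.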
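Preparation
To run the code below without problems, make sure the following environment and libraries are installed: Python 2 (the scripts use urllib2), TensorFlow, Keras, pandas, NumPy, h5py, scikit-learn, and matplotlib.
Data collection
The data used for prediction can be collected from Kaggle or Poloniex. For consistency, the column names of the data collected from Poloniex are changed to match those used on Kaggle.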
import json
import numpy as np
import os
import pandas as pd
import urllib2
# connect to poloniex's API
url = 'https://poloniex.com/public?command=returnChartData&currencyPair=USDT_BTC&start=1356998100&end=9999999999&period=300'
# parse json returned from the API to Pandas DF
openUrl = urllib2.urlopen(url)
r = openUrl.read()
openUrl.close()
d = json.loads(r.decode())
df = pd.DataFrame(d)
original_columns=[u'close', u'date', u'high', u'low', u'open']
new_columns = ['Close','Timestamp','High','Low','Open']
df = df.loc[:,original_columns]
df.columns = new_columns
df.to_csv('data/bitcoin2015to2017.csv',index=None)
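(Original code available on GitHub)
Preparing the data
Before the collected data can be used for prediction, it has to be parsed. The PastSampler class, adapted from this blog post, splits the data into a list of inputs and a list of labels. The input size (N) is 256 and the output size (K) is 16. Note that Poloniex records data every five minutes, so each input spans 1280 minutes and each output covers 80 minutes.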
import numpy as np
import pandas as pd
class PastSampler:
    ''' Forms training samples for predicting future values from past value '''
    def __init__(self, N, K, sliding_window = True):
        ''' Predict K future sample using N previous samples '''
        self.K = K
        self.N = N
        self.sliding_window = sliding_window
    def transform(self, A):
        M = self.N + self.K #Number of samples per row (sample + target)
        #indexes
        if self.sliding_window:
            I = np.arange(M) + np.arange(A.shape[0] - M + 1).reshape(-1, 1)
        else:
            if A.shape[0]%M == 0:
                I = np.arange(M)+np.arange(0,A.shape[0],M).reshape(-1,1)
            else:
                I = np.arange(M)+np.arange(0,A.shape[0] -M,M).reshape(-1,1)
        B = A[I].reshape(-1, M * A.shape[1], A.shape[2])
        ci = self.N * A.shape[1] #Number of features per sample
        return B[:, :ci], B[:, ci:] #Sample matrix, Target matrix
#data file path
dfp = 'data/bitcoin2015to2017.csv'
#Columns of price data to use
columns = ['Close']
df = pd.read_csv(dfp)
time_stamps = df['Timestamp']
df = df.loc[:,columns]
original_df = pd.read_csv(dfp).loc[:,columns]
(Original code available on GitHub)
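To make the index arithmetic in `transform` concrete, here is a minimal sketch on a toy series, assuming NumPy is available. The helper name `window_indices` and the toy sizes (3 past samples, 2 future samples, 12 data points) are illustrative assumptions and not part of the original code. It builds the same index matrix `I` as the class above but skips the final reshape.

```python
import numpy as np

def window_indices(length, n_past, n_future, sliding_window=True):
    """Each row lists the positions of one window: n_past inputs, then n_future targets."""
    m = n_past + n_future
    if sliding_window:
        starts = np.arange(length - m + 1)      # one window per starting position
    elif length % m == 0:
        starts = np.arange(0, length, m)        # non-overlapping windows, exact fit
    else:
        starts = np.arange(0, length - m, m)    # non-overlapping windows, drop the remainder
    return np.arange(m) + starts.reshape(-1, 1)

series = np.arange(12)  # toy stand-in for the 5-minute close prices
idx = window_indices(len(series), n_past=3, n_future=2, sliding_window=False)
windows = series[idx]
inputs, targets = windows[:, :3], windows[:, 3:]
print(inputs.tolist())   # [[0, 1, 2], [5, 6, 7]]
print(targets.tolist())  # [[3, 4], [8, 9]]
```

With the article's N = 256 and K = 16, each row would hold 272 consecutive five-minute samples: 256 for the input and 16 for the label.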
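After creating the PastSampler class, I applied it to the collected data. Because the raw values range from 0 to more than 10000, the data need to be scaled so that the neural network can learn from them more easily.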
file_name='bitcoin2015to2017_close.h5'
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
# normalization
for c in columns:
    df[c] = scaler.fit_transform(df[c].values.reshape(-1,1))
#Features are input sample dimensions(channels)
A = np.array(df)[:,None,:]
original_A = np.array(original_df)[:,None,:]
time_stamps = np.array(time_stamps)[:,None,None]
#Make samples of temporal sequences of pricing data (channel)
NPS, NFS = 256, 16 #Number of past and future samples
ps = PastSampler(NPS, NFS, sliding_window=False)
B, Y = ps.transform(A)
input_times, output_times = ps.transform(time_stamps)
original_B, original_Y = ps.transform(original_A)
import h5py
with h5py.File(file_name, 'w') as f:
    f.create_dataset("inputs", data = B)
    f.create_dataset('outputs', data = Y)
    f.create_dataset("input_times", data = input_times)
    f.create_dataset('output_times', data = output_times)
    f.create_dataset("original_datas", data=np.array(original_df))
    f.create_dataset('original_inputs',data=original_B)
    f.create_dataset('original_outputs',data=original_Y)
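The scaling step above maps every price into the range [0, 1]. As a rough sketch of the arithmetic behind it, the snippet below applies the min-max formula in plain NumPy and then inverts it. This is the formula MinMaxScaler uses with its default feature range, but the function names and the prices are made up for illustration.

```python
import numpy as np

def minmax_fit(values):
    """Return the (min, max) pair needed to scale now and to invert later."""
    return float(values.min()), float(values.max())

def minmax_scale(values, lo, hi):
    """Map values linearly so that lo becomes 0 and hi becomes 1."""
    return (values - lo) / (hi - lo)

def minmax_invert(scaled, lo, hi):
    """Undo minmax_scale and recover values in the original units."""
    return scaled * (hi - lo) + lo

prices = np.array([250.0, 1000.0, 4000.0, 10250.0])  # made-up prices
lo, hi = minmax_fit(prices)
scaled = minmax_scale(prices, lo, hi)
restored = minmax_invert(scaled, lo, hi)
print(scaled.tolist())
print(restored.tolist())
```

Keeping the fitted minimum and maximum matters because the model's predictions are also in [0, 1] and have to be mapped back to prices before plotting.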
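Building the models
CNN
With a kernel sliding over the input data, a one-dimensional convolutional neural network (1D CNN) is expected to capture the locality of the data well, as shown in the figure below.
(Illustration of a CNN, from cs231n.github.io/convolution…)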
import pandas as pd
import numpy as numpy
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv1D, MaxPooling1D, LeakyReLU, PReLU
from keras.utils import np_utils
from keras.callbacks import CSVLogger, ModelCheckpoint
import h5py
import os
import tensorflow as tf
from keras.backend.tensorflow_backend import set_session
# Make the program use only one GPU
os.environ['CUDA_DEVICE_ORDER'] = 'PCI_BUS_ID'
os.environ['CUDA_VISIBLE_DEVICES'] = '1'
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
set_session(tf.Session(config=config))
with h5py.File(''.join(['bitcoin2015to2017_close.h5']), 'r') as hf:
    datas = hf['inputs'].value
    labels = hf['outputs'].value
output_file_name='bitcoin2015to2017_close_CNN_2_relu'
step_size = datas.shape[1]
batch_size= 8
nb_features = datas.shape[2]
epochs = 100
#split training validation
training_size = int(0.8* datas.shape[0])
training_datas = datas[:training_size,:]
training_labels = labels[:training_size,:]
validation_datas = datas[training_size:,:]
validation_labels = labels[training_size:,:]
#build model
# 2 layers
model = Sequential()
model.add(Conv1D(activation='relu', input_shape=(step_size, nb_features), strides=3, filters=8, kernel_size=20))
model.add(Dropout(0.5))
model.add(Conv1D( strides=4, filters=nb_features, kernel_size=16))
'''
# 3 Layers
model.add(Conv1D(activation='relu', input_shape=(step_size, nb_features), strides=3, filters=8, kernel_size=8))
#model.add(LeakyReLU())
model.add(Dropout(0.5))
model.add(Conv1D(activation='relu', strides=2, filters=8, kernel_size=8))
#model.add(LeakyReLU())
model.add(Dropout(0.5))
model.add(Conv1D( strides=2, filters=nb_features, kernel_size=8))
# 4 layers
model.add(Conv1D(activation='relu', input_shape=(step_size, nb_features), strides=2, filters=8, kernel_size=2))
#model.add(LeakyReLU())
model.add(Dropout(0.5))
model.add(Conv1D(activation='relu', strides=2, filters=8, kernel_size=2))
#model.add(LeakyReLU())
model.add(Dropout(0.5))
model.add(Conv1D(activation='relu', strides=2, filters=8, kernel_size=2))
#model.add(LeakyReLU())
model.add(Dropout(0.5))
model.add(Conv1D( strides=2, filters=nb_features, kernel_size=2))
'''
model.compile(loss='mse', optimizer='adam')
model.fit(training_datas, training_labels,verbose=1, batch_size=batch_size,validation_data=(validation_datas,validation_labels), epochs = epochs, callbacks=[CSVLogger(output_file_name+'.csv', append=True),ModelCheckpoint('weights/'+output_file_name+'-{epoch:02d}-{val_loss:.5f}.hdf5', monitor='val_loss', verbose=1,mode='min')])
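(Original code available on GitHub)
The first model I built is a CNN. The following code sets the GPU number to "1" (I have four GPUs; you can set it to whichever one you like). TensorFlow did not seem to run well on multiple GPUs for me, so restricting it to a single GPU is the safer choice. If you don't have a GPU, don't worry; simply ignore these lines.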
os.environ['CUDA_DEVICE_ORDER'] = 'PCI_BUS_ID'
os.environ['CUDA_VISIBLE_DEVICES'] ='1'
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
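The code for building the CNN model is simple. A Dropout layer is used to guard against overfitting. The loss function is the mean squared error (MSE), and the optimizer is Adam, a widely used state-of-the-art choice.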
model = Sequential()
model.add(Conv1D(activation='relu', input_shape=(step_size, nb_features), strides=3, filters=8, kernel_size=20))
model.add(Dropout(0.5))
model.add(Conv1D( strides=4, filters=nb_features, kernel_size=16))
model.compile(loss='mse', optimizer='adam')
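The only thing you need to watch is the dimensions of the input and output between layers. With no padding (the Keras default for Conv1D), the output length of a convolutional layer is:
Output time steps = floor((Input time steps - Kernel size) / Strides) + 1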
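As a quick check of this formula, the sketch below applies it to the kernel sizes and strides of the 2-, 3-, and 4-layer configurations used in this article, starting from the 256-step input. The helper names are illustrative and not part of the original code. Each stack should end at 16 time steps, matching the 16-step label.

```python
def conv1d_output_steps(input_steps, kernel_size, strides):
    """Output length of a 1D convolution with no padding ('valid')."""
    return (input_steps - kernel_size) // strides + 1

def stack_output_steps(input_steps, layers):
    """Apply the formula layer by layer; layers is a list of (kernel_size, strides)."""
    for kernel_size, strides in layers:
        input_steps = conv1d_output_steps(input_steps, kernel_size, strides)
    return input_steps

two_layer = [(20, 3), (16, 4)]
three_layer = [(8, 3), (8, 2), (8, 2)]
four_layer = [(2, 2), (2, 2), (2, 2), (2, 2)]

print(conv1d_output_steps(256, 20, 3))       # 79, after the first layer of the 2-layer model
print(stack_output_steps(256, two_layer))    # 16
print(stack_output_steps(256, three_layer))  # 16
print(stack_output_steps(256, four_layer))   # 16
```

If you change a kernel size or stride, rerunning a check like this shows immediately whether the final layer still lines up with the label length.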
At the end of the file I added two callbacks, CSVLogger and ModelCheckpoint. The former logs the training and validation loss of every epoch, and the latter saves the model weights after each epoch.
model.fit(training_datas, training_labels,verbose=1, batch_size=batch_size,validation_data=(validation_datas,validation_labels), epochs = epochs, callbacks=[CSVLogger(output_file_name+'.csv', append=True),ModelCheckpoint('weights/'+output_file_name+'-{epoch:02d}-{val_loss:.5f}.hdf5', monitor='val_loss', verbose=1,mode='min')])
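Once training has finished, the CSV log is a convenient place to look up which epoch had the lowest validation loss. The following is a minimal sketch using only the standard library. It assumes the log has `epoch` and `val_loss` columns, which is what CSVLogger writes when validation data is supplied. The log contents and the `best_epoch` helper are made up for illustration.

```python
import csv
import io

# Made-up log contents; a real run would open output_file_name + '.csv' instead.
sample_log = """epoch,loss,val_loss
0,0.00210,0.00095
1,0.00087,0.00041
2,0.00065,0.00052
"""

def best_epoch(log_file):
    """Return (epoch, val_loss) for the row with the lowest validation loss."""
    rows = list(csv.DictReader(log_file))
    best = min(rows, key=lambda row: float(row["val_loss"]))
    return int(best["epoch"]), float(best["val_loss"])

print(best_epoch(io.StringIO(sample_log)))  # (1, 0.00041)
```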
LSTM
A Long Short-Term Memory (LSTM) network is a variant of the recurrent neural network (RNN). It was designed to address the vanishing gradient problem of vanilla RNNs, and it is said to be able to remember inputs over a longer span of time steps.
(Illustration of an LSTM, from colah.github.io/posts/2015-…)
import pandas as pd
import numpy as numpy
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten,Reshape
from keras.layers import Conv1D, MaxPooling1D
from keras.utils import np_utils
from keras.layers import LSTM, LeakyReLU
from keras.callbacks import CSVLogger, ModelCheckpoint
import h5py
import os
import tensorflow as tf
from keras.backend.tensorflow_backend import set_session
os.environ['CUDA_DEVICE_ORDER'] = 'PCI_BUS_ID'
os.environ['CUDA_VISIBLE_DEVICES'] = '1'
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
set_session(tf.Session(config=config))
with h5py.File(''.join(['bitcoin2015to2017_close.h5']), 'r') as hf:
    datas = hf['inputs'].value
    labels = hf['outputs'].value
step_size = datas.shape[1]
units= 50
second_units = 30
batch_size = 8
nb_features = datas.shape[2]
epochs = 100
output_size=16
output_file_name='bitcoin2015to2017_close_LSTM_1_tanh_leaky_'
#split training validation
training_size = int(0.8* datas.shape[0])
training_datas = datas[:training_size,:]
training_labels = labels[:training_size,:,0]
validation_datas = datas[training_size:,:]
validation_labels = labels[training_size:,:,0]
#build model
model = Sequential()
model.add(LSTM(units=units,activation='tanh', input_shape=(step_size,nb_features),return_sequences=False))
model.add(Dropout(0.8))
model.add(Dense(output_size))
model.add(LeakyReLU())
model.compile(loss='mse', optimizer='adam')
model.fit(training_datas, training_labels, batch_size=batch_size,validation_data=(validation_datas,validation_labels), epochs = epochs, callbacks=[CSVLogger(output_file_name+'.csv', append=True),ModelCheckpoint('weights/'+output_file_name+'-{epoch:02d}-{val_loss:.5f}.hdf5', monitor='val_loss', verbose=1,mode='min')])
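(Original code available on GitHub)
An LSTM is easier to implement than a CNN because you don't need to work out the relationship among kernel size, strides, input size, and output size. Just make sure the input and output dimensions are defined correctly in the network.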
model = Sequential()
model.add(LSTM(units=units,activation='tanh', input_shape=(step_size,nb_features),return_sequences=False))
model.add(Dropout(0.8))
model.add(Dense(output_size))
model.add(LeakyReLU())
model.compile(loss='mse', optimizer='adam')
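The dimension check mentioned above mostly comes down to the label shape. The CNN ends in a Conv1D layer that outputs (batch, 16, nb_features), while the LSTM ends in Dense(output_size), which outputs (batch, 16). That is why the LSTM script slices the labels with `[:, :, 0]`. The sketch below shows the shapes with zero-filled NumPy arrays standing in for the HDF5 datasets. The window count of 10 is arbitrary.

```python
import numpy as np

n_windows, n_past, n_future, n_features = 10, 256, 16, 1  # n_windows is arbitrary
datas = np.zeros((n_windows, n_past, n_features))     # stand-in for hf['inputs']
labels = np.zeros((n_windows, n_future, n_features))  # stand-in for hf['outputs']

cnn_labels = labels           # matches a Conv1D output of shape (batch, 16, nb_features)
rnn_labels = labels[:, :, 0]  # matches a Dense(16) output of shape (batch, 16)

print(datas.shape[1:])    # (256, 1), the input_shape given to the first layer
print(cnn_labels.shape)   # (10, 16, 1)
print(rnn_labels.shape)   # (10, 16)
```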
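GRU
The Gated Recurrent Unit (GRU) is another variant of the RNN. Its structure is simpler than that of the LSTM: it has only a reset gate and an update gate, and it has no separate memory cell. GRUs are said to perform on par with LSTMs while being more efficient. (That holds in this post as well: the LSTM takes about 45 seconds per epoch, while the GRU takes less than 40 seconds per epoch.)
(Illustration of a GRU, from www.jackdermody.net/brightwire/…)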
import pandas as pd
import numpy as numpy
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten,Reshape
from keras.layers import Conv1D, MaxPooling1D, LeakyReLU
from keras.utils import np_utils
from keras.layers import GRU,CuDNNGRU
from keras.callbacks import CSVLogger, ModelCheckpoint
import h5py
import os
import tensorflow as tf
from keras.backend.tensorflow_backend import set_session
os.environ['CUDA_DEVICE_ORDER'] = 'PCI_BUS_ID'
os.environ['CUDA_VISIBLE_DEVICES'] = '1'
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
set_session(tf.Session(config=config))
with h5py.File(''.join(['bitcoin2015to2017_close.h5']), 'r') as hf:
    datas = hf['inputs'].value
    labels = hf['outputs'].value
output_file_name='bitcoin2015to2017_close_GRU_1_tanh_relu_'
step_size = datas.shape[1]
units= 50
batch_size = 8
nb_features = datas.shape[2]
epochs = 100
output_size=16
#split training validation
training_size = int(0.8* datas.shape[0])
training_datas = datas[:training_size,:]
training_labels = labels[:training_size,:,0]
validation_datas = datas[training_size:,:]
validation_labels = labels[training_size:,:,0]
#build model
model = Sequential()
model.add(GRU(units=units, input_shape=(step_size,nb_features),return_sequences=False))
model.add(Activation('tanh'))
model.add(Dropout(0.2))
model.add(Dense(output_size))
model.add(Activation('relu'))
model.compile(loss='mse', optimizer='adam')
model.fit(training_datas, training_labels, batch_size=batch_size,validation_data=(validation_datas,validation_labels), epochs = epochs, callbacks=[CSVLogger(output_file_name+'.csv', append=True),ModelCheckpoint('weights/'+output_file_name+'-{epoch:02d}-{val_loss:.5f}.hdf5', monitor='val_loss', verbose=1,mode='min')])
(Original code available on GitHub)
Simply replace the second line of the LSTM model code,
model.add(LSTM(units=units,activation='tanh', input_shape=(step_size,nb_features),return_sequences=False))
with:
model.add(GRU(units=units,activation='tanh', input_shape=(step_size,nb_features),return_sequences=False))
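One way to see why the GRU is the lighter of the two layers is to count trainable weights. The sketch below uses the textbook formulation, in which every gate or candidate block has an input kernel, a recurrent kernel, and one bias vector. The helper name is illustrative, and actual counts can differ slightly between implementations (for example, variants that keep an extra bias vector).

```python
def recurrent_params(units, features, n_blocks):
    """Textbook weight count: n_blocks sets of input kernel, recurrent kernel, and bias."""
    per_block = units * features + units * units + units
    return n_blocks * per_block

units, features = 50, 1  # the layer size and feature count used in this article
lstm_params = recurrent_params(units, features, n_blocks=4)  # input, forget, output gates + candidate
gru_params = recurrent_params(units, features, n_blocks=3)   # reset, update gates + candidate
print(lstm_params, gru_params)  # 10400 7800
```

Under these assumptions the GRU layer has three quarters of the LSTM layer's weights, which is in line with the somewhat shorter epoch time reported above.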
Plotting the results
Since the plots of the three models look similar, I only show the CNN version here. First, we need to rebuild the model and load the trained weights into it.
from keras import applications
from keras.models import Sequential
from keras.models import Model
from keras.layers import Dropout, Flatten, Dense, Activation
from keras.callbacks import CSVLogger
import tensorflow as tf
import numpy as np
import random
from keras.layers import LSTM
from keras.layers import Conv1D, MaxPooling1D, LeakyReLU
from keras import backend as K
import keras
from keras.callbacks import CSVLogger, ModelCheckpoint
from keras.backend.tensorflow_backend import set_session
from keras import optimizers
import h5py
from sklearn.preprocessing import MinMaxScaler
import os
import pandas as pd
# import matplotlib
import matplotlib.pyplot as plt
os.environ['CUDA_DEVICE_ORDER'] = 'PCI_BUS_ID'
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
with h5py.File(''.join(['bitcoin2015to2017_close.h5']), 'r') as hf:
    datas = hf['inputs'].value
    labels = hf['outputs'].value
    input_times = hf['input_times'].value
    output_times = hf['output_times'].value
    original_inputs = hf['original_inputs'].value
    original_outputs = hf['original_outputs'].value
    original_datas = hf['original_datas'].value
scaler=MinMaxScaler()
#split training validation
training_size = int(0.8* datas.shape[0])
training_datas = datas[:training_size,:,:]
training_labels = labels[:training_size,:,:]
validation_datas = datas[training_size:,:,:]
validation_labels = labels[training_size:,:,:]
validation_original_outputs = original_outputs[training_size:,:,:]
validation_original_inputs = original_inputs[training_size:,:,:]
validation_input_times = input_times[training_size:,:,:]
validation_output_times = output_times[training_size:,:,:]
ground_true = np.append(validation_original_inputs,validation_original_outputs, axis=1)
ground_true_times = np.append(validation_input_times,validation_output_times, axis=1)
step_size = datas.shape[1]
batch_size= 8
nb_features = datas.shape[2]
model = Sequential()
# 2 layers
model.add(Conv1D(activation='relu', input_shape=(step_size, nb_features), strides=3, filters=8, kernel_size=20))
# model.add(LeakyReLU())
model.add(Dropout(0.25))
model.add(Conv1D( strides=4, filters=nb_features, kernel_size=16))
model.load_weights('weights/bitcoin2015to2017_close_CNN_2_relu-44-0.00030.hdf5')
model.compile(loss='mse', optimizer='adam')
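(Original code available on GitHub)
Second, we need to invert the scaling of the predictions. Because MinMaxScaler was applied earlier, the predicted values lie in the range [0, 1].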
predicted = model.predict(validation_datas)
predicted_inverted = []
for i in range(original_datas.shape[1]):
    scaler.fit(original_datas[:,i].reshape(-1,1))
    predicted_inverted.append(scaler.inverse_transform(predicted[:,:,i]))
print(np.array(predicted_inverted).shape)
#get only the close data
ground_true = ground_true[:,:,0].reshape(-1)
ground_true_times = ground_true_times.reshape(-1)
ground_true_times = pd.to_datetime(ground_true_times, unit='s')
# since we are appending in the first dimension
predicted_inverted = np.array(predicted_inverted)[0,:,:].reshape(-1)
print(np.array(predicted_inverted).shape)
validation_output_times = pd.to_datetime(validation_output_times.reshape(-1), unit='s')
(Original code available on GitHub)
Create two DataFrames, one for the actual market price of Bitcoin and one for the predicted price. For visualization purposes, the plot only shows data from August 2017 onward.
ground_true_df = pd.DataFrame()
ground_true_df['times'] = ground_true_times
ground_true_df['value'] = ground_true
prediction_df = pd.DataFrame()
prediction_df['times'] = validation_output_times
prediction_df['value'] = predicted_inverted
prediction_df = prediction_df.loc[(prediction_df["times"].dt.year == 2017 )&(prediction_df["times"].dt.month > 7 ),: ]
ground_true_df = ground_true_df.loc[(ground_true_df["times"].dt.year == 2017 )&(ground_true_df["times"].dt.month > 7 ),:]
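(Original code available on GitHub)
Use pyplot to draw the figure. Because each prediction covers only 16 time steps, leaving the points unconnected makes the result easier to read. The predicted data are therefore plotted as red dots, as indicated by 'ro' in the third line. In the figure below, the blue line shows the actual market data and the red dots show the predicted Bitcoin prices.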
plt.figure(figsize=(20,10))
plt.plot(ground_true_df.times,ground_true_df.value, label = 'Actual')
plt.plot(prediction_df.times,prediction_df.value,'ro', label='Predicted')
plt.legend(loc='upper left')
plt.show()
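(Original code available on GitHub)
(Best result of Bitcoin price prediction with the 2-layer CNN)
As the figure above shows, the predicted prices are very close to the actual Bitcoin prices. To choose the best model, I decided to test several network configurations, which produced the table below.
(Prediction results of the different models)
Each row of the table above is the model with the best validation loss out of 100 training epochs. The results suggest that LeakyReLU always seems to yield a better loss than regular ReLU. However, the 4-layer CNN with Leaky ReLU as the activation function produces a large validation loss. This may be due to a faulty deployment of the model and may need to be re-validated. The CNN models train very fast (2 seconds per epoch on a GPU) but perform slightly worse than the LSTM and GRU. The best model appears to be the LSTM with tanh and Leaky ReLU as activation functions, although the 3-layer CNN seems to capture local temporal dependencies in the data better.
(LSTM with tanh and Leaky ReLU as activation functions)
(3-layer CNN with Leaky ReLU as the activation function)
Although the predictions look quite good, overfitting is still a concern. When the LSTM is trained with LeakyReLU there is a gap between the training and validation loss (5.97E-06 vs. 3.92E-05), and regularization should be applied to reduce it.
Regularization
To find the best regularization strategy, I ran experiments with several different L1 and L2 values. First, we need to define a new function that fits the data to the LSTM. As an example, I use a bias regularizer here, which regularizes the bias vector.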
from keras.layers import CuDNNLSTM
def fit_lstm(reg):
    global training_datas, training_labels, batch_size, epochs,step_size,nb_features, units
    model = Sequential()
    model.add(CuDNNLSTM(units=units, bias_regularizer=reg, input_shape=(step_size,nb_features),return_sequences=False))
    model.add(Activation('tanh'))
    model.add(Dropout(0.2))
    model.add(Dense(output_size))
    model.add(LeakyReLU())
    model.compile(loss='mse', optimizer='adam')
    model.fit(training_datas, training_labels, batch_size=batch_size, epochs = epochs, verbose=0)
    return model
(Original code available on GitHub)
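A bias regularizer adds a penalty, computed from the bias vector, to the training loss. As a rough illustration of the two penalty types tested here, the sketch below computes an L1 penalty (coefficient times the sum of absolute values) and an L2 penalty (coefficient times the sum of squares) in plain Python. The function names and the bias values are made up.

```python
def l1_penalty(weights, coefficient):
    """L1 penalty: coefficient times the sum of absolute values."""
    return coefficient * sum(abs(w) for w in weights)

def l2_penalty(weights, coefficient):
    """L2 penalty: coefficient times the sum of squares."""
    return coefficient * sum(w * w for w in weights)

bias = [0.5, -0.25, 0.0, 1.0]  # made-up bias vector
print(l1_penalty(bias, 0.01))
print(l2_penalty(bias, 0.01))
```

The L1 penalty grows linearly with each entry and tends to push small entries to exactly zero, while the L2 penalty grows quadratically and mainly discourages large entries.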
The experiment trains the model 30 times, with 30 epochs each time.
from keras import regularizers
from sklearn.metrics import mean_squared_error
def experiment(validation_datas,validation_labels,original_datas,ground_true,ground_true_times,validation_original_outputs, validation_output_times, nb_repeat, reg):
    error_scores = list()
    #get only the close data
    ground_true = ground_true[:,:,0].reshape(-1)
    ground_true_times = ground_true_times.reshape(-1)
    ground_true_times = pd.to_datetime(ground_true_times, unit='s')
    validation_output_times = pd.to_datetime(validation_output_times.reshape(-1), unit='s')
    for i in range(nb_repeat):
        model = fit_lstm(reg)
        predicted = model.predict(validation_datas)
        predicted_inverted = []
        scaler.fit(original_datas[:,0].reshape(-1,1))
        predicted_inverted.append(scaler.inverse_transform(predicted))
        # since we are appending in the first dimension
        predicted_inverted = np.array(predicted_inverted)[0,:,:].reshape(-1)
        error_scores.append(mean_squared_error(validation_original_outputs[:,:,0].reshape(-1),predicted_inverted))
    return error_scores
regs = [regularizers.l1(0),regularizers.l1(0.1), regularizers.l1(0.01), regularizers.l1(0.001), regularizers.l1(0.0001),regularizers.l2(0.1), regularizers.l2(0.01), regularizers.l2(0.001), regularizers.l2(0.0001)]
nb_repeat = 30
results = pd.DataFrame()
for reg in regs:
    name = ('l1 %.4f,l2 %.4f' % (reg.l1, reg.l2))
    print("Training "+ str(name))
    results[name] = experiment(validation_datas,validation_labels,original_datas,ground_true,ground_true_times,validation_original_outputs, validation_output_times, nb_repeat,reg)
results.describe().to_csv('result/lstm_bias_reg.csv')
results.describe()
(Original code available on GitHub)
If you are using a Jupyter notebook, you can view the table below directly in the output.
(Results of the bias regularizer experiment)
To visualize the comparison, we can use a boxplot:
results.describe().boxplot()
plt.show()
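(Original code available on GitHub)
According to the comparison, an L2 regularizer with a coefficient of 0.01 on the bias vector seems to give the best result.
Finding the best combination among all the regularizers, including those on the activity, bias, kernel, and recurrent matrix, would require testing them one by one, which is not feasible on my current hardware. I therefore leave this for future work.
Conclusion
In this article, you have learned how to collect Bitcoin price data, prepare it for training, build CNN, LSTM, and GRU models to predict the price, plot the predictions, and apply regularization to the model.
Future work for this blog is to find the best hyperparameters for the best model, and possibly to use social media data to make the predictions more accurate. This is my first post on Medium. If you spot any errors or have any questions, please don't hesitate to leave a comment below.
For more information, see my GitHub.
Original article:
blog.goodaudience.com/predicting-…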