How to Predict Stock Prices with an RNN






The RNN is a very popular model for time-series data and has already proved its worth in fields such as NLP and time-series forecasting. Since this article focuses on putting RNNs into practice rather than explaining RNN theory, readers who want the theory should study RNNs systematically elsewhere.


The example below is implemented with TensorFlow. Implementing an RNN or LSTM in TensorFlow is straightforward: create the RNN or LSTM cell and then build the network around it. However, unlike an ordinary feed-forward network, an RNN or LSTM processes time series, so the data has to be arranged in a different format for batch training.

Below is a typical RNN model:

[Figure in the original post: a standard RNN unrolled over t time steps, with each box A being one recurrent cell.]
Data Preprocessing






First, we import the dependencies, split the data into training and test sets, and normalize it:

import pandas as pd
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler

WINDOW_SIZE = 4  # use the previous 4 closing prices to predict the next one

df = pd.read_csv('SP_2000_2017_Daily.csv')
data = df[:4100].Close.values.astype(float)   # .values replaces the deprecated .as_matrix()
data = np.reshape(data, (-1, 1))
mm = MinMaxScaler(feature_range=(-1, 1))
data = mm.fit_transform(data)
print(data.shape)
x, y = set_window(data)   # set_window is defined in the next snippet

# -------------------- split into training and test sets --------------------
train_x = x[:2000]
train_y = y[:2000]
test_x = x[2000:4000]
test_y = y[2000:4000]

Because the RNN structure takes its input window by window, the data also has to be split into windows:

def set_window(data, windowSize=WINDOW_SIZE):
    x = []
    label = []
    length = len(data)
    for i in range(length - windowSize):
        x.append(data[i:i + windowSize, 0])   # a window of windowSize consecutive prices
        label.append(data[i + windowSize])    # the price right after the window
    return x, label
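For example, with a window of 4 the function turns a toy series of 10 points into 6 (window, label) pairs. The snippet below is only a sanity check on made-up data, not the S&P series used above:

toy = np.arange(10, dtype=float).reshape(-1, 1)   # pretend price series, shape (10, 1)
toy_x, toy_y = set_window(toy, windowSize=4)
print(len(toy_x))           # 6 samples
print(toy_x[0], toy_y[0])   # window [0. 1. 2. 3.] -> label [4.]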





Building the Network






With the data prepared, the next step is to build the network.

The network accepts training data of shape [batchSize, 4], i.e. the previous 4 data points are used to predict the next one.

hidden_size is the number of neurons inside a single LSTM or RNN cell, i.e. one box A in the structure diagram; you can picture that many neurons living inside it.


class RNN(object):
    def __init__(self):
        self.stateNum = WINDOW_SIZE
        self.batchSize = 50
        self.time_step = 20     # time step (not actually used below)
        self.hidden_size = 100  # number of units in the hidden layer
        self._build_net()
        self.sess = tf.Session()
        self.sess.run(tf.global_variables_initializer())

    def _build_net(self):
        self.x = tf.placeholder(tf.float32, [None, self.stateNum])
        self.y = tf.placeholder(tf.float32, [None, 1])

        w = tf.Variable(tf.truncated_normal([self.hidden_size, 1], stddev=0.1))
        b = tf.Variable(tf.constant(0.1, shape=[1]))

        # reshape [batch_size, stateNum] -> [batch_size, max_time, input_dim]
        input_data = tf.reshape(self.x, [-1, self.stateNum, 1])
        rnn_cell = tf.nn.rnn_cell.BasicRNNCell(self.hidden_size)

        batchSize = tf.shape(self.x)[0]

        init_state = rnn_cell.zero_state(batchSize, tf.float32)
        outputs_rnn, final_state = tf.nn.dynamic_rnn(rnn_cell, input_data,
                                                     dtype=tf.float32,
                                                     initial_state=init_state)
        output = tf.reshape(outputs_rnn, [-1, self.hidden_size])  # per-step outputs, flattened (unused)
        # regress the next price from the last step's hidden state
        self.prediction = tf.matmul(final_state, w) + b

        self.loss = tf.reduce_mean(tf.square(self.y - self.prediction))
        self.train = tf.train.AdamOptimizer(0.001).minimize(self.loss)

    def train_net(self, x, label):
        batch_num = len(x) // self.batchSize
        for epoch in range(100):
            loss_sum = 0
            for i in range(batch_num):
                feed = {self.x: x[i * self.batchSize:(i + 1) * self.batchSize],
                        self.y: label[i * self.batchSize:(i + 1) * self.batchSize]}
                loss, _, pre = self.sess.run([self.loss, self.train, self.prediction],
                                             feed_dict=feed)
                # print(pre)
                loss_sum += loss
            print(str(epoch) + ':' + str(loss_sum))

    def predict(self, x):
        feed = {self.x: x}
        prediction = self.sess.run(self.prediction, feed_dict=feed)
        return prediction


The key question in the code above is: why reshape the input into this structure?

The reshaped tensor is passed as an argument to tf.nn.dynamic_rnn, so let's first look at that function's parameter documentation:

cell: An instance of RNNCell.

inputs: The RNN inputs. If `time_major == False` (default), this must be a `Tensor` of shape: `[batch_size, max_time, ...]`, or a nested tuple of such elements. If `time_major == True`, this must be a `Tensor` of shape: `[max_time, batch_size, ...]`, or a nested tuple of such elements. This may also be a (possibly nested) tuple of Tensors satisfying this property. The first two dimensions must match across all the inputs, but otherwise the ranks and other shape components may differ. In this case, input to `cell` at each time-step will replicate the structure of these tuples, except for the time dimension (from which the time is taken). The input to `cell` at each time step will be a `Tensor` or (possibly nested) tuple of Tensors each with dimensions `[batch_size, ...]`.

sequence_length: (optional) An int32/int64 vector sized `[batch_size]`. Used to copy-through state and zero-out outputs when past a batch element's sequence length. So it's more for correctness than performance.

initial_state: (optional) An initial state for the RNN. If `cell.state_size` is an integer, this must be a `Tensor` of appropriate type and shape `[batch_size, cell.state_size]`. If `cell.state_size` is a tuple, this should be a tuple of tensors having shapes `[batch_size, s] for s in cell.state_size`.

dtype: (optional) The data type for the initial state and expected output. Required if initial_state is not provided or RNN state has a heterogeneous dtype.

parallel_iterations: (Default: 32). The number of iterations to run in parallel. Those operations which do not have any temporal dependency and can be run in parallel, will be. This parameter trades off time for space. Values >> 1 use more memory but take less time, while smaller values use less memory but computations take longer.

swap_memory: Transparently swap the tensors produced in forward inference but needed for back prop from GPU to CPU. This allows training RNNs which would typically not fit on a single GPU, with very minimal (or no) performance penalty.

time_major: The shape format of the `inputs` and `outputs` Tensors. If true, these `Tensors` must be shaped `[max_time, batch_size, depth]`. If false, these `Tensors` must be shaped `[batch_size, max_time, depth]`. Using `time_major = True` is a bit more efficient because it avoids transposes at the beginning and end of the RNN calculation. However, most TensorFlow data is batch-major, so by default this function accepts input and emits output in batch-major form.

scope: VariableScope for the created subgraph; defaults to "rnn".


The parameter list is long, so start with the description of inputs: its format is [batch_size, max_time, ...]. Here max_time is how far the RNN is unrolled, i.e. the length t shown in the figure, and the trailing ... is the dimensionality of each input x. With that, reshaping to [-1, 4, 1] is easy to explain: -1 means that dimension is inferred automatically. If our batch_size is 50, the -1 works out to 50*4/(4*1) = 50, so 50 sequences are fed into the RNN at once, each of length 4 (four boxes), and each box receives a 1-dimensional value.
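As a quick sanity check of that shape arithmetic (a standalone numpy sketch, not part of the article's code), the inferred -1 does come out to the batch size:

import numpy as np

batch = np.random.rand(50, 4)          # a batch in the placeholder's [batch_size, 4] layout
rnn_input = batch.reshape(-1, 4, 1)    # -1 is inferred: 50*4 / (4*1) = 50
print(rnn_input.shape)                 # (50, 4, 1) -> [batch_size, max_time, input_dim]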


The call returns two results: outputs and state.

outputs has shape [batch_size, max_time, cell.output_size]. For the network designed here that is [50, 4, 100]: for the 50 sequences in this batch, it holds the output of each of the 4 unrolled cells, every cell having 100 neurons. By comparison, an ordinary NN would produce something like [50, 100]; after unrolling, the RNN instead yields [50, 4, 100].

Once outputs is understood, state is easy. It is the final state, i.e. the hidden output of the last unrolled cell after a batch of sequences has been fed in. Since 100 neurons were configured, its shape is [50, 100]. In the RNN diagram it corresponds to the hidden output of the last box A.
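A minimal sketch (assuming TensorFlow 1.x, like the rest of the article) that confirms the two shapes returned by tf.nn.dynamic_rnn:

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 4, 1])    # [batch_size, max_time, input_dim]
cell = tf.nn.rnn_cell.BasicRNNCell(100)
outputs, final_state = tf.nn.dynamic_rnn(cell, x, dtype=tf.float32)

print(outputs.get_shape())       # (?, 4, 100) -> one 100-dim output per unrolled step
print(final_state.get_shape())   # (?, 100)    -> hidden output of the last box A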




Model Training




The training part is simple: instantiate the RNN object, call its training method on the training set, and then evaluate on the test set:

rnn = RNN()
rnn.train_net(train_x, train_y)

result = rnn.predict(test_x)
prediction = mm.inverse_transform(result)
y = mm.inverse_transform(test_y)
print(result)
print(prediction)

fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(range(len(prediction)), prediction)
ax.plot(range(len(y)), y)
ax.legend(['prediction', 'true'])

cal_accr(prediction, y)   # accuracy helper used by the author; not defined in this article

plt.show()








Training Results







Since neither the number of neurons nor the window size has been tuned, the prediction is not very good. If you are interested, run the code on your own data and adjust the parameters.
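A rough way to experiment with those two knobs is a small grid search. This is only a sketch: it assumes RNN.__init__ is extended to accept window_size and hidden_size arguments, which the class above hard-codes.

for window_size in (4, 8, 16):
    for hidden_size in (50, 100, 200):
        tf.reset_default_graph()                  # fresh graph for each configuration
        x, y = set_window(data, windowSize=window_size)
        train_x, train_y = x[:2000], y[:2000]
        test_x, test_y = x[2000:4000], y[2000:4000]
        model = RNN(window_size, hidden_size)     # hypothetical extended constructor
        model.train_net(train_x, train_y)
        result = model.predict(test_x)
        print(window_size, hidden_size, float(np.mean(np.square(result - test_y))))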




This article was originally shared via the WeChat public account 人工智能學術前沿 (AI_Frontier).