1. Basic workflow for building a neural network

Date: 2019-11-12

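First define a function that adds a neural layer, then work through these steps:

1. Prepare the training data
2. Define the nodes that will receive the data
3. Define the neural layers: a hidden layer and a prediction layer
4. Define the loss expression
5. Choose an optimizer to minimize the loss

Then initialize all variables and call sess.run on the optimizer, iterating 1000 times to train: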

import tensorflow as tf
import numpy as np

# Add a layer
def add_layer(inputs, in_size, out_size, activation_function=None):
    # add one more layer and return the output of this layer
    Weights = tf.Variable(tf.random_normal([in_size, out_size]))
    biases = tf.Variable(tf.zeros([1, out_size]) + 0.1)
    Wx_plus_b = tf.matmul(inputs, Weights) + biases
    if activation_function is None:
        outputs = Wx_plus_b
    else:
        outputs = activation_function(Wx_plus_b)
    return outputs

# 1. Training data
# Make up some real data
x_data = np.linspace(-1, 1, 300)[:, np.newaxis]
noise = np.random.normal(0, 0.05, x_data.shape)
y_data = np.square(x_data) - 0.5 + noise

# 2. Define the nodes that will receive the data
# define placeholders for inputs to the network
xs = tf.placeholder(tf.float32, [None, 1])
ys = tf.placeholder(tf.float32, [None, 1])

# 3. Define the neural layers: hidden layer and prediction layer
# add hidden layer: the input is xs, and the hidden layer has 10 neurons
l1 = add_layer(xs, 1, 10, activation_function=tf.nn.relu)
# add output layer: the input is the hidden layer l1, and the prediction layer outputs 1 value
prediction = add_layer(l1, 10, 1, activation_function=None)

# 4. Define the loss expression
# the error between prediction and real data
loss = tf.reduce_mean(tf.reduce_sum(tf.square(ys - prediction),
                     reduction_indices=[1]))

# 5. Choose an optimizer to minimize the loss
# this line defines how the loss is reduced; the learning rate is 0.1
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)


# important step: initialize all variables
init = tf.global_variables_initializer()
sess = tf.Session()
# nothing defined above is computed until sess.run is called
sess.run(init)

# train for 1000 iterations by calling sess.run on the optimizer
for i in range(1000):
    # train_step and loss both depend on placeholders, so the data must be passed in via feed_dict
    sess.run(train_step, feed_dict={xs: x_data, ys: y_data})
    if i % 50 == 0:
        # to see the step improvement
        print(sess.run(loss, feed_dict={xs: x_data, ys: y_data}))
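
To make the tensor shapes in this network easier to follow, here is a minimal NumPy-only sketch of the forward pass and the loss. It is not the TensorFlow implementation: the helper names dense and relu and the fixed random seed are assumptions made for illustration, and no training is performed.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed (an assumption, for repeatability)


def relu(z):
    """Element-wise max(z, 0)."""
    return np.maximum(z, 0.0)


def dense(inputs, weights, biases, activation=None):
    """Compute inputs @ weights + biases, then apply the optional activation."""
    z = inputs @ weights + biases
    return z if activation is None else activation(z)


# Same synthetic data as the example: y = x^2 - 0.5 plus Gaussian noise
x_data = np.linspace(-1, 1, 300)[:, np.newaxis]        # shape (300, 1)
noise = rng.normal(0, 0.05, x_data.shape)
y_data = np.square(x_data) - 0.5 + noise

# Randomly initialised parameters with the shapes add_layer would create
W1, b1 = rng.normal(size=(1, 10)), np.zeros((1, 10)) + 0.1
W2, b2 = rng.normal(size=(10, 1)), np.zeros((1, 1)) + 0.1

l1 = dense(x_data, W1, b1, activation=relu)            # hidden layer, shape (300, 10)
prediction = dense(l1, W2, b2)                         # output layer, shape (300, 1)

# Sum the squared error per sample (axis 1), then average over all samples
loss = np.mean(np.sum(np.square(y_data - prediction), axis=1))
print(l1.shape, prediction.shape, loss)
```

The biases have shape (1, out_size), so broadcasting adds the same bias row to every sample, which is also why the placeholders can leave the batch dimension as None.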

2. Explanation of the main steps

An earlier article, "TensorFlow 入門", covered installing TensorFlow, so here it is simply imported:
```
import tensorflow as tf
import numpy as np
```

Import the training data x and y, or generate it randomly:

x_data = np.random.rand(100).astype(np.float32)
y_data = x_data*0.1 + 0.3

First define the parameters Weights and biases, the fitting formula y, and the error formula loss:

Weights = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
biases = tf.Variable(tf.zeros([1]))
y = Weights*x_data + biases
loss = tf.reduce_mean(tf.square(y-y_data))

Choose gradient descent, the most basic optimizer:

optimizer = tf.train.GradientDescentOptimizer(0.5)

The key idea of a neural network is to drive the loss to its minimum:

train = optimizer.minimize(loss)

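Everything above is only a definition; before running the model, all variables must be initialized:

init = tf.global_variables_initializer()

Next, activate the structure. A session acts like a pointer to the part of the graph to be processed:

sess = tf.Session()

Running init through the session activates it; do not forget this step:

sess.run(init)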

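Train for 201 steps:

for step in range(201):

Each step runs train, that is, the optimizer:

    sess.run(train)

Every 20 steps, print the result; sess.run evaluates Weights and biases so that they can be printed:

    if step % 20 == 0:
        print(step, sess.run(Weights), sess.run(biases))

So the crucial part is how y, loss, and optimizer are defined.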
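
To show what the optimizer does at each step, the same fit can be sketched in plain NumPy with the gradients written out by hand. This is an illustration under stated assumptions, not TensorFlow's internal implementation: the gradient formulas are derived from loss = mean((W*x + b - y)^2), and the fixed seed and scalar parameters are choices made here.

```python
import numpy as np

rng = np.random.default_rng(0)           # fixed seed (an assumption, for repeatability)
x_data = rng.random(100)
y_data = x_data * 0.1 + 0.3              # the target relationship: W = 0.1, b = 0.3

W = rng.uniform(-1.0, 1.0)               # random initial weight, as in the example
b = 0.0                                  # initial bias of zero
learning_rate = 0.5

for step in range(201):
    error = (W * x_data + b) - y_data
    # Partial derivatives of mean(error^2) with respect to W and b
    grad_W = 2.0 * np.mean(error * x_data)
    grad_b = 2.0 * np.mean(error)
    # One gradient descent update
    W -= learning_rate * grad_W
    b -= learning_rate * grad_b
    if step % 20 == 0:
        print(step, W, b)
```

The printed values should move toward W = 0.1 and b = 0.3, the constants used to generate y_data.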

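3. Basic TensorFlow concepts and code

The "TensorFlow 入門" article also mentioned several basic concepts; here are a few common usages.

Session

Matrix multiplication: tf.matmul

product = tf.matmul(matrix1, matrix2)  # matrix product, equivalent to np.dot(m1, m2)

Define a Session. It is an object, so note the capital letter:

sess = tf.Session()

The result has to be fetched through sess.run: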

result = sess.run(product)
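
The comment above compares tf.matmul with np.dot. As a small sketch of that equivalence, the NumPy version below multiplies a 1x2 matrix by a 2x1 matrix; the values of matrix1 and matrix2 are hypothetical, since the original does not define them.

```python
import numpy as np

# Hypothetical example values; the inner dimensions (2 and 2) must match
matrix1 = np.array([[3, 3]])      # shape (1, 2)
matrix2 = np.array([[2], [2]])    # shape (2, 1)

product = np.dot(matrix1, matrix2)   # 3*2 + 3*2
print(product)         # [[12]]
print(product.shape)   # (1, 1)
```

Unlike the TensorFlow version, NumPy computes the result immediately, with no session involved.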

Variable

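Define variables with tf.Variable. Unlike in plain Python, a value is a variable only after it has been explicitly declared as one. Here the initial value is 0, and the variable is also given the name counter:

state = tf.Variable(0, name='counter')

Assigning new_value to state updates counter (new_value is assumed to be a tensor defined beforehand, for example tf.add(state, 1)):

update = tf.assign(state, new_value)

Whenever variables are defined, they must be initialized:

init = tf.global_variables_initializer()  # required whenever variables are defined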

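placeholder:

Use a placeholder when a node needs input data. In TensorFlow, a placeholder describes a node that is waiting for input; only its type has to be specified, and a dictionary is then used to "feed" these nodes when they are executed. In effect the variable is put on hold and data is passed in from outside on every run. Note that placeholder and feed_dict go together.

A brief note on the feed mechanism: feed data is supplied as an argument to the run() call. A feed is valid only within the call that uses it; once that call returns, the feed is gone.

import tensorflow as tf

input1 = tf.placeholder(tf.float32)
input2 = tf.placeholder(tf.float32)
output = tf.multiply(input1, input2)

with tf.Session() as sess:
  print(sess.run(output, feed_dict={input1: [7.], input2: [2.]}))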

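4. Basic neural network concepts

Activation function:

For example, if a neuron is sensitive to a cat's eyes, it is activated when it sees them; the corresponding parameters are then tuned, and its contribution grows.

Below are several common activation functions.
The x-axis is the value passed in, and the y-axis is the value passed out:

In the prediction layer, the activation function decides which values are sent on to the prediction result:

Activation functions commonly used in TensorFlow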
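
As a rough illustration of how an activation function maps incoming values to outgoing ones, three widely used functions can be written in NumPy. relu mirrors the tf.nn.relu used in the earlier example; sigmoid and tanh are picked here as typical examples and are an assumption, not a list taken from the original.

```python
import numpy as np


def relu(x):
    """Pass positive values through unchanged and clip negatives to 0."""
    return np.maximum(x, 0.0)


def sigmoid(x):
    """Squash any real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def tanh(x):
    """Squash any real value into the open interval (-1, 1)."""
    return np.tanh(x)


x = np.array([-2.0, 0.0, 2.0])   # incoming values (x-axis)
print(relu(x))                   # [0. 0. 2.]
print(sigmoid(x))                # outgoing values between 0 and 1
print(tanh(x))                   # outgoing values between -1 and 1
```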

Adding a neural layer:

The input parameters are inputs, in_size, out_size, and activation_function.

import tensorflow as tf

def add_layer(inputs, in_size, out_size,  activation_function=None):

  Weights = tf.Variable(tf.random_normal([in_size, out_size]))
  biases = tf.Variable(tf.zeros([1, out_size]) + 0.1)
  Wx_plus_b = tf.matmul(inputs, Weights) + biases

  if activation_function is None:
    outputs = Wx_plus_b
  else:
    outputs = activation_function(Wx_plus_b)

  return outputs

The loss function for classification problems, cross_entropy:

# the error between prediction and real data
# the loss function is cross entropy
cross_entropy = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(prediction),
                                              reduction_indices=[1]))       # loss
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
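
To see what this expression computes, here is a small NumPy sketch of the same formula applied to made-up numbers. The one-hot labels and the two prediction arrays are hypothetical, and the sketch assumes every predicted probability is strictly positive (as with a softmax output), since log(0) is not finite.

```python
import numpy as np


def cross_entropy(ys, prediction):
    """Mean over samples of -sum(ys * log(prediction)) across classes (axis 1)."""
    return np.mean(-np.sum(ys * np.log(prediction), axis=1))


# Hypothetical one-hot labels for two samples and three classes
ys = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0]])

# Hypothetical predicted probabilities; each row sums to 1
confident = np.array([[0.9, 0.05, 0.05],
                      [0.1, 0.8, 0.1]])
unsure = np.array([[0.2, 0.4, 0.4],
                   [0.6, 0.2, 0.2]])

print(cross_entropy(ys, confident))   # smaller loss: the true classes get high probability
print(cross_entropy(ys, unsure))      # larger loss: the true classes get low probability
```

Because the labels are one-hot, only the probability assigned to the correct class contributes to each sample's loss.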

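Overfitting:

The third figure below shows overfitting: the model fits the historical data too closely, so its predictions on new data are far off:

TensorFlow has a very useful tool called dropout. Given only the fraction of units to keep (not drop), it reduces overfitting effectively.

Dropout means that while a deep network is being trained, some of its units are temporarily removed from the network with a certain probability, which amounts to finding a thinner network inside the original one; this is explained in great detail in a separate blog post.

In code, dropout is added inside the add_layer function. keep_prob is the fraction that is kept (not dropped), and it is fed in through sess.run during iteration: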

# keep_prob is assumed to be a module-level placeholder, e.g. keep_prob = tf.placeholder(tf.float32)
def add_layer(inputs, in_size, out_size, layer_name, activation_function=None):
    # add one more layer and return the output of this layer
    Weights = tf.Variable(tf.random_normal([in_size, out_size]))
    biases = tf.Variable(tf.zeros([1, out_size]) + 0.1)
    Wx_plus_b = tf.matmul(inputs, Weights) + biases

    # apply dropout here: drop a fraction of Wx_plus_b
    # keep_prob is the fraction kept; it is fed in through sess.run during iteration
    Wx_plus_b = tf.nn.dropout(Wx_plus_b, keep_prob)

    if activation_function is None:
        outputs = Wx_plus_b
    else:
        outputs = activation_function(Wx_plus_b)
    tf.summary.histogram(layer_name + '/outputs', outputs)
    return outputs
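
The effect of keep_prob can be sketched in NumPy. The function below is a simplified stand-in written for illustration, not TensorFlow's code: it keeps each element with probability keep_prob and scales the survivors by 1/keep_prob, which is the scaling tf.nn.dropout applies so that the expected sum stays unchanged. The function name, the seed, and the all-ones input are assumptions.

```python
import numpy as np


def dropout(x, keep_prob, rng):
    """Keep each element with probability keep_prob; scale the kept ones by 1/keep_prob."""
    mask = rng.random(x.shape) < keep_prob      # True where the element is kept
    return np.where(mask, x / keep_prob, 0.0)


rng = np.random.default_rng(0)                  # fixed seed (an assumption)
x = np.ones((1000, 10))                         # hypothetical layer output

dropped = dropout(x, keep_prob=0.5, rng=rng)
print((dropped == 0).mean())                    # fraction dropped, close to 0.5
print(dropped.mean())                           # mean stays close to 1.0 thanks to the rescaling

# With keep_prob = 1.0 nothing is dropped, which is the usual setting at test time
unchanged = dropout(x, keep_prob=1.0, rng=rng)
```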

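5. Visualization with TensorBoard

TensorFlow ships with TensorBoard, which can automatically display a flow graph of the neural network we have built:

This is done by wrapping each block in a tf.name_scope context; note the differences marked in the code comments: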

import tensorflow as tf


def add_layer(inputs, in_size, out_size, activation_function=None):
    # add one more layer and return the output of this layer
    # difference: an outer scope for the layer, containing smaller components
    with tf.name_scope('layer'):
        # difference: the smaller components
        with tf.name_scope('weights'):
            Weights = tf.Variable(tf.random_normal([in_size, out_size]), name='W')
        with tf.name_scope('biases'):
            biases = tf.Variable(tf.zeros([1, out_size]) + 0.1, name='b')
        with tf.name_scope('Wx_plus_b'):
            Wx_plus_b = tf.add(tf.matmul(inputs, Weights), biases)
        if activation_function is None:
            outputs = Wx_plus_b
        else:
            outputs = activation_function(Wx_plus_b)
        return outputs


# define placeholder for inputs to network
# difference: an outer scope containing the inputs x and y
with tf.name_scope('inputs'):
    xs = tf.placeholder(tf.float32, [None, 1], name='x_input')
    ys = tf.placeholder(tf.float32, [None, 1], name='y_input')

# add hidden layer
l1 = add_layer(xs, 1, 10, activation_function=tf.nn.relu)
# add output layer
prediction = add_layer(l1, 10, 1, activation_function=None)

# the error between prediction and real data
# difference: define the loss scope
with tf.name_scope('loss'):
    loss = tf.reduce_mean(tf.reduce_sum(tf.square(ys - prediction),
                                        reduction_indices=[1]))

# difference: define the train scope
with tf.name_scope('train'):
    train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

sess = tf.Session()

# difference: sess.graph writes the whole graph to a file in the "logs/" folder
# then open a terminal, go to the parent directory of that folder, and run: tensorboard --logdir='logs/'
# it prints an address; open it in a browser and look under the graph tab
writer = tf.summary.FileWriter("logs/", sess.graph)
# important step
sess.run(tf.global_variables_initializer())

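After running the code above, open a terminal, go to the parent directory of the logs folder, and run tensorboard --logdir='logs/'. The command prints an address; open it in a browser and click the graph tab to see the flow graph: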

6. Saving and loading

Once a neural network has been trained, it can be saved and then loaded again the next time it is needed:

import tensorflow as tf
import numpy as np

## Save to file
# remember to define the same dtype and shape when restoring
W = tf.Variable([[1,2,3],[3,4,5]], dtype=tf.float32, name='weights')
b = tf.Variable([[1,2,3]], dtype=tf.float32, name='biases')

init = tf.global_variables_initializer()

saver = tf.train.Saver()

# use saver to save all variables to the given path
with tf.Session() as sess:
    sess.run(init)
    save_path = saver.save(sess, "my_net/save_net.ckpt")
    print("Save to path: ", save_path)


################################################

# restore variables
# run this part separately from the save code above (for example in a new script);
# in the same graph the new variables would be renamed (weights_1, biases_1) and restore would fail
# redefine the same shape and same type for your variables
W = tf.Variable(np.arange(6).reshape((2, 3)), dtype=tf.float32, name="weights")
b = tf.Variable(np.arange(3).reshape((1, 3)), dtype=tf.float32, name="biases")

# no init step is needed

saver = tf.train.Saver()
# use saver to restore W and b from save_net.ckpt at the given path
with tf.Session() as sess:
    saver.restore(sess, "my_net/save_net.ckpt")
    print("weights:", sess.run(W))
    print("biases:", sess.run(b))
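
The save-then-restore round trip can also be illustrated without TensorFlow. The sketch below uses NumPy's np.savez and np.load as a stand-in for tf.train.Saver; it is a different mechanism, shown only to make the idea concrete. The file name save_net.npz and the temporary directory are assumptions made for this example.

```python
import os
import tempfile

import numpy as np

# The same values as the W and b saved in the example above
W = np.array([[1, 2, 3], [3, 4, 5]], dtype=np.float32)
b = np.array([[1, 2, 3]], dtype=np.float32)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "save_net.npz")   # hypothetical file name

    # Save: each array is stored under the keyword name given here
    np.savez(path, weights=W, biases=b)

    # Restore: look the arrays up by the same names
    with np.load(path) as data:
        restored_W = data["weights"]
        restored_b = data["biases"]

print("weights:", restored_W)
print("biases:", restored_b)
```

As with the checkpoint, the arrays are looked up by name, so the names used when loading must match the ones used when saving.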