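Define a function that adds a neural layer
1. The training data
2. Define nodes to receive the data
3. Define the neural layers: a hidden layer and a prediction layer
4. Define the loss expression
5. Choose an optimizer to minimize the loss
Then initialize all variables and run the optimizer through sess.run, iterating 1000 times to learn: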
import tensorflow as tf
import numpy as np

# Add a layer
def add_layer(inputs, in_size, out_size, activation_function=None):
    # add one more layer and return the output of this layer
    Weights = tf.Variable(tf.random_normal([in_size, out_size]))
    biases = tf.Variable(tf.zeros([1, out_size]) + 0.1)
    Wx_plus_b = tf.matmul(inputs, Weights) + biases
    if activation_function is None:
        outputs = Wx_plus_b
    else:
        outputs = activation_function(Wx_plus_b)
    return outputs

# 1. The training data
# Make up some real data
x_data = np.linspace(-1, 1, 300)[:, np.newaxis]
noise = np.random.normal(0, 0.05, x_data.shape)
y_data = np.square(x_data) - 0.5 + noise

# 2. Define nodes to receive the data
# define placeholders for inputs to the network
xs = tf.placeholder(tf.float32, [None, 1])
ys = tf.placeholder(tf.float32, [None, 1])

# 3. Define the neural layers: a hidden layer and a prediction layer
# add hidden layer: the input is xs, and the hidden layer has 10 neurons
l1 = add_layer(xs, 1, 10, activation_function=tf.nn.relu)
# add output layer: the input is the hidden layer l1, and the prediction layer outputs 1 value
prediction = add_layer(l1, 10, 1, activation_function=None)

# 4. Define the loss expression
# the error between prediction and real data
loss = tf.reduce_mean(tf.reduce_sum(tf.square(ys - prediction),
                                    reduction_indices=[1]))

# 5. Choose an optimizer to minimize the loss
# this line defines how the loss is reduced; the learning rate is 0.1
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

# important step: initialize all variables
# (later TensorFlow releases name this tf.global_variables_initializer())
init = tf.initialize_all_variables()
sess = tf.Session()
# nothing defined above has been computed yet; computation starts only at sess.run
sess.run(init)

# iterate 1000 times, running the optimizer through sess.run
for i in range(1000):
    # training: train_step and loss both depend on placeholders,
    # so the data has to be passed in through feed_dict
    sess.run(train_step, feed_dict={xs: x_data, ys: y_data})
    if i % 50 == 0:
        # to see the step improvement
        print(sess.run(loss, feed_dict={xs: x_data, ys: y_data}))
import tensorflow as tf
import numpy as np

x_data = np.random.rand(100).astype(np.float32)
y_data = x_data*0.1 + 0.3

Weights = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
biases = tf.Variable(tf.zeros([1]))
y = Weights*x_data + biases
loss = tf.reduce_mean(tf.square(y-y_data))

optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)

init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)

for step in range(201):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(Weights), sess.run(biases))
So the key is how y, loss, and optimizer are defined.
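To make concrete what the optimizer does for this linear fit, here is a minimal NumPy sketch that applies the same kind of gradient-descent update by hand. It is not TensorFlow's implementation: the function name `fit_line`, the zero starting point, and the fixed random seed are assumptions made for illustration, and the gradients are derived manually from the mean-squared-error loss.

```python
import numpy as np

def fit_line(x, y, learning_rate=0.5, steps=201):
    """Fit y ~ w * x + b by plain gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        error = (w * x + b) - y              # prediction minus target
        grad_w = 2.0 * np.mean(error * x)    # d(loss)/dw
        grad_b = 2.0 * np.mean(error)        # d(loss)/db
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

rng = np.random.RandomState(0)
x_data = rng.rand(100).astype(np.float32)
y_data = x_data * 0.1 + 0.3

w, b = fit_line(x_data, y_data)
print(w, b)  # should end up close to 0.1 and 0.3
```

The three pieces map directly onto the TensorFlow version: `w * x + b` plays the role of y, the mean of the squared error is the loss, and the two update lines are what the optimizer automates.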
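The TensorFlow introduction post also mentioned a few basic concepts; here are some common usages.
Matrix multiplication: tf.matmul
product = tf.matmul(matrix1, matrix2) # matrix multiply, like np.dot(m1, m2)
Define a Session. It is an object; note the capital S: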
sess = tf.Session()
The result has to be fetched through sess.run:
result = sess.run(product)
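Use tf.Variable to define a variable. Unlike in plain Python, something is a variable only if you explicitly define it as one. Here the initial value is 0, and it can also be given a name, counter: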
state = tf.Variable(0, name='counter')
Assign new_value to state, and counter is updated:
update = tf.assign(state, new_value)
If there are any variables, they must be initialized:
init = tf.initialize_all_variables() # required whenever variables are defined
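To feed data into a node, use a placeholder. In TensorFlow a placeholder describes a node that is waiting for input; you only need to specify its type, and when the node is executed you "feed" it with a dictionary. In effect the variable is held open first, and data is passed in from outside on each run. Note that placeholder and feed_dict are always used together.
A brief note on the feed mechanism: the data to feed is supplied as an argument to the run() call. A feed is valid only within the call that uses it; when the call ends, the feed disappears.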
import tensorflow as tf

input1 = tf.placeholder(tf.float32)
input2 = tf.placeholder(tf.float32)
# tf.mul is the name in early releases; it was renamed tf.multiply in TensorFlow 1.0
output = tf.mul(input1, input2)

with tf.Session() as sess:
    print(sess.run(output, feed_dict={input1: [7.], input2: [2.]}))
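The feed mechanism can be illustrated without TensorFlow. The following is a toy sketch in plain Python, not how TensorFlow is implemented: the classes `Placeholder` and `Multiply` and the function `run` are hypothetical names made up for this example. It shows that building the graph computes nothing, and that fed values exist only for the duration of one `run` call.

```python
class Placeholder:
    """A node with no value of its own; it must be fed at run time."""
    def evaluate(self, feed_dict):
        if self not in feed_dict:
            raise ValueError("placeholder was not fed")
        return feed_dict[self]

class Multiply:
    """A node that multiplies the values of two other nodes element-wise."""
    def __init__(self, a, b):
        self.a, self.b = a, b

    def evaluate(self, feed_dict):
        left = self.a.evaluate(feed_dict)
        right = self.b.evaluate(feed_dict)
        return [x * y for x, y in zip(left, right)]

def run(node, feed_dict=None):
    """Evaluate a node; the fed values are only visible during this call."""
    return node.evaluate(feed_dict or {})

input1 = Placeholder()
input2 = Placeholder()
output = Multiply(input1, input2)   # nothing is computed yet

print(run(output, feed_dict={input1: [7.0], input2: [2.0]}))  # [14.0]
```

Calling `run(output)` again without a `feed_dict` raises an error, because nothing from the previous call is retained.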
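For example, if a neuron is sensitive to a cat's eyes, then when it sees a cat's eyes it is activated, its parameters are tuned accordingly, and its contribution grows.
Below are several common activation functions:
The x axis is the value passed in, and the y axis is the value passed on:
In the prediction layer, the activation function decides which values are sent on to the prediction result:
Activation functions commonly used in TensorFlow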
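As a rough illustration of what such functions compute, here is a NumPy sketch of three activations that TensorFlow provides as `tf.nn.relu`, `tf.nn.sigmoid`, and `tf.nn.tanh`. The NumPy versions below are stand-ins written for this example, not the library's code.

```python
import numpy as np

def relu(x):
    """Pass positive values through unchanged; clamp negatives to zero."""
    return np.maximum(x, 0.0)

def sigmoid(x):
    """Squash any real value into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    """Squash any real value into the range (-1, 1)."""
    return np.tanh(x)

x = np.array([-2.0, 0.0, 2.0])   # values passed in (the x axis)
print(relu(x))                   # values passed on (the y axis)
print(sigmoid(x))
print(tanh(x))
```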
The input parameters are inputs, in_size, out_size, and activation_function
import tensorflow as tf

def add_layer(inputs, in_size, out_size, activation_function=None):
    Weights = tf.Variable(tf.random_normal([in_size, out_size]))
    biases = tf.Variable(tf.zeros([1, out_size]) + 0.1)
    Wx_plus_b = tf.matmul(inputs, Weights) + biases
    if activation_function is None:
        outputs = Wx_plus_b
    else:
        outputs = activation_function(Wx_plus_b)
    return outputs
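The shapes involved in add_layer are easier to see in a plain NumPy forward pass. The sketch below mirrors the same arithmetic; the function name `dense_forward`, the sample count of 300, and the fixed seed are assumptions chosen for illustration, and no training happens here.

```python
import numpy as np

def dense_forward(inputs, weights, biases, activation=None):
    """Compute activation(inputs @ weights + biases); identity if activation is None."""
    wx_plus_b = np.matmul(inputs, weights) + biases   # biases broadcast over rows
    return wx_plus_b if activation is None else activation(wx_plus_b)

rng = np.random.RandomState(0)
in_size, out_size = 1, 10
inputs = rng.rand(300, in_size)            # 300 samples, 1 feature each
weights = rng.randn(in_size, out_size)     # same shape as Weights in add_layer
biases = np.zeros((1, out_size)) + 0.1     # same shape as biases in add_layer

hidden = dense_forward(inputs, weights, biases,
                       activation=lambda z: np.maximum(z, 0.0))  # ReLU
print(hidden.shape)  # (300, 10)
```

An input of shape (samples, in_size) multiplied by weights of shape (in_size, out_size) gives (samples, out_size), which is why the next layer's in_size must equal this layer's out_size.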
# the error between prediction and real data
# the loss function uses cross entropy
cross_entropy = tf.reduce_mean(-tf.reduce_sum(ys * tf.log(prediction),
                                              reduction_indices=[1]))  # loss
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
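A small worked example may help with the cross-entropy expression. The NumPy sketch below evaluates the same formula on two hand-written samples, assuming `ys` holds one-hot labels and `prediction` holds strictly positive per-class probabilities (for example the output of a softmax layer); the numbers are made up for illustration.

```python
import numpy as np

def cross_entropy(labels, probabilities):
    """Mean over samples of -sum(labels * log(probabilities)) across classes."""
    per_sample = -np.sum(labels * np.log(probabilities), axis=1)
    return np.mean(per_sample)

# Two samples, three classes; each prediction row sums to 1.
ys = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0]])
prediction = np.array([[0.7, 0.2, 0.1],
                       [0.1, 0.8, 0.1]])

loss = cross_entropy(ys, prediction)
print(loss)  # equals -(log(0.7) + log(0.8)) / 2
```

Because the labels are one-hot, only the probability assigned to the correct class contributes to each sample's loss, so the loss shrinks as that probability approaches 1.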
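The third figure below shows overfitting: the model fits the historical data too precisely and then makes large errors when predicting on new data:
TensorFlow has a handy tool called dropout: you only need to give it the fraction that is not dropped, and it reduces overfitting well.
Dropout means that during the training of a deep network, some units are temporarily removed from the network with a certain probability, which amounts to finding a thinner network inside the original one. This blog post explains it in great detail
In code, dropout is added inside the add_layer function; keep_prob is the fraction kept (not dropped), and it is fed through sess.run during iteration: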
def add_layer(inputs, in_size, out_size, layer_name, activation_function=None):
    # add one more layer and return the output of this layer
    Weights = tf.Variable(tf.random_normal([in_size, out_size]))
    biases = tf.Variable(tf.zeros([1, out_size]) + 0.1)
    Wx_plus_b = tf.matmul(inputs, Weights) + biases
    # here to dropout
    # drop a certain fraction of Wx_plus_b
    # keep_prob is the fraction kept; it is defined outside this function
    # and fed through sess.run during iteration
    Wx_plus_b = tf.nn.dropout(Wx_plus_b, keep_prob)
    if activation_function is None:
        outputs = Wx_plus_b
    else:
        outputs = activation_function(Wx_plus_b)
    # renamed tf.summary.histogram in later TensorFlow releases
    tf.histogram_summary(layer_name + '/outputs', outputs)
    return outputs
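What dropout does to a layer's values can be sketched in NumPy. This is an illustration rather than TensorFlow's code: it keeps each element with probability `keep_prob` and scales the survivors by `1 / keep_prob`, the behaviour that TensorFlow documents for `tf.nn.dropout`. The function signature and the explicit `rng` argument are assumptions made for this example.

```python
import numpy as np

def dropout(x, keep_prob, rng):
    """Keep each element with probability keep_prob and rescale the survivors."""
    mask = rng.rand(*x.shape) < keep_prob
    return np.where(mask, x / keep_prob, 0.0)

rng = np.random.RandomState(0)
activations = np.ones((4, 5))

dropped = dropout(activations, keep_prob=0.5, rng=rng)
print(dropped)  # every entry is either 0.0 or 2.0

# With keep_prob = 1.0 nothing is dropped, which is the usual setting at test time.
unchanged = dropout(activations, keep_prob=1.0, rng=rng)
```

The rescaling keeps the expected sum of the layer's output the same whether or not dropout is active.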
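TensorFlow ships with TensorBoard, which can automatically display the flow graph of the neural network we build:
This is done by using with tf.name_scope to define each frame; note the differences marked in the code comments: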
import tensorflow as tf

def add_layer(inputs, in_size, out_size, activation_function=None):
    # add one more layer and return the output of this layer
    # difference: the outer frame defines the layer, with smaller parts inside
    with tf.name_scope('layer'):
        # difference: smaller parts
        with tf.name_scope('weights'):
            Weights = tf.Variable(tf.random_normal([in_size, out_size]), name='W')
        with tf.name_scope('biases'):
            biases = tf.Variable(tf.zeros([1, out_size]) + 0.1, name='b')
        with tf.name_scope('Wx_plus_b'):
            Wx_plus_b = tf.add(tf.matmul(inputs, Weights), biases)
        if activation_function is None:
            outputs = Wx_plus_b
        else:
            outputs = activation_function(Wx_plus_b)
        return outputs

# define placeholders for inputs to the network
# difference: outer frame containing the inputs x and y
with tf.name_scope('inputs'):
    xs = tf.placeholder(tf.float32, [None, 1], name='x_input')
    ys = tf.placeholder(tf.float32, [None, 1], name='y_input')

# add hidden layer
l1 = add_layer(xs, 1, 10, activation_function=tf.nn.relu)
# add output layer
prediction = add_layer(l1, 10, 1, activation_function=None)

# the error between prediction and real data
# difference: define the frame loss
with tf.name_scope('loss'):
    loss = tf.reduce_mean(tf.reduce_sum(tf.square(ys - prediction),
                                        reduction_indices=[1]))

# difference: define the frame train
with tf.name_scope('train'):
    train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

sess = tf.Session()
# difference: sess.graph writes all the frames into a file in the folder "logs/"
# then open a terminal, go to the directory one level above that folder,
# and run: tensorboard --logdir='logs/'
# it returns an address; open it in a browser and look under the graph tab
# (SummaryWriter is named tf.summary.FileWriter from TensorFlow 1.0 on)
writer = tf.train.SummaryWriter("logs/", sess.graph)
# important step
sess.run(tf.initialize_all_variables())
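After running the code above, open a terminal, go to the directory one level above the folder where the logs were written, and run the command tensorboard --logdir='logs/'. It returns an address; open that address in a browser and click the graph tab to see the flow graph:
After a neural network has been trained, it can be saved and then loaded again the next time it is used: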
import tensorflow as tf
import numpy as np

## Save to file
# remember to define the same dtype and shape when restoring
W = tf.Variable([[1,2,3],[3,4,5]], dtype=tf.float32, name='weights')
b = tf.Variable([[1,2,3]], dtype=tf.float32, name='biases')

init = tf.initialize_all_variables()
saver = tf.train.Saver()

# use the saver to save all variables to the given path
with tf.Session() as sess:
    sess.run(init)
    save_path = saver.save(sess, "my_net/save_net.ckpt")
    print("Save to path: ", save_path)

################################################
# restore variables
# run this part separately, in a fresh graph; otherwise the second pair of
# variables is given different names and will not match the saved ones
# redefine the same shape and same type for your variables
W = tf.Variable(np.arange(6).reshape((2, 3)), dtype=tf.float32, name="weights")
b = tf.Variable(np.arange(3).reshape((1, 3)), dtype=tf.float32, name="biases")

# no init step needed
saver = tf.train.Saver()

# use the saver to restore the W and b stored in save_net.ckpt
with tf.Session() as sess:
    saver.restore(sess, "my_net/save_net.ckpt")
    print("weights:", sess.run(W))
    print("biases:", sess.run(b))
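TensorFlow can currently save only the variables, not the structure of the whole neural network, so to reuse a network you need to redefine the structure and then load the variables into it to continue learning.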
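The idea of saving only the values and rebuilding the structure by hand can be shown with a NumPy analogy. This sketch does not use TensorFlow's checkpoint format; it stores two arrays with `np.savez` under the same names as above, and the temporary directory and the `.npz` file name are assumptions made for this example.

```python
import os
import tempfile
import numpy as np

# "Save": only the variable values are written, keyed by name.
W = np.array([[1, 2, 3], [3, 4, 5]], dtype=np.float32)
b = np.array([[1, 2, 3]], dtype=np.float32)
path = os.path.join(tempfile.mkdtemp(), "save_net.npz")
np.savez(path, weights=W, biases=b)

# "Restore": redefine containers of the same shape and dtype, then fill them.
W_restored = np.zeros((2, 3), dtype=np.float32)
b_restored = np.zeros((1, 3), dtype=np.float32)
with np.load(path) as saved:
    W_restored[...] = saved["weights"]
    b_restored[...] = saved["biases"]

print(W_restored)
print(b_restored)
```

As with the TensorFlow version, the file says nothing about how the arrays are used; the code that loads them has to know the names, shapes, and dtypes in advance.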