TensorFlow Notes

Date: 2019-11-09

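1. Neural network implementation workflow

  (1) Prepare the dataset, extract features, and feed them to the neural network as input.

  (2) Build the NN structure from input to output (build the computation graph first, then execute it in a session).

      (NN forward propagation -> compute the output)

  (3) Feed a large amount of feature data to the NN and iteratively optimize its parameters.

      (NN backpropagation -> optimize the parameters and train the model)

  (4) Use the trained model for prediction and classification.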
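
The four steps above can be sketched without TensorFlow at all. The toy dataset, the one-weight model, the learning rate, and the iteration count below are all assumptions chosen for illustration; this is a minimal plain-Python sketch of the workflow, not how TensorFlow implements it.

```python
# Step 1: a toy dataset (assumed for illustration); targets follow y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w = 0.0               # the single trainable parameter
learning_rate = 0.01  # assumed value


def forward(x, w):
    """Step 2: forward propagation, here just a one-weight linear model."""
    return w * x


# Step 3: iterate, following the gradient of the mean squared error w.r.t. w.
for _ in range(500):
    grad = sum(2.0 * (forward(x, w) - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= learning_rate * grad

# Step 4: use the trained parameter to predict an unseen input.
prediction = forward(5.0, w)
print(w, prediction)
```

After training, w should sit very close to 2, so the prediction for x = 5 lands near 10.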

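2. Main TensorFlow functions

  (1) Variable declaration: tf.Variable(initial_value, name=)

    i) initial_value is the value the variable starts with; name is an optional, user-defined variable name.

    ii) Purpose: stores and updates the parameters of the neural network.

    iii) Tensors vs. variables: tf.Variable is an operation whose output is a tensor, so a variable is a special kind of tensor.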

    iv) Each column of the parameter's shape holds the weights for one variable.

  (2) Common tensor-creation functions

　　tf.zeros(shape, dtype=, name=)

　　tf.zeros_like(tensor, dtype=, name=)

　　tf.constant(value, dtype=, shape=, name=)

　　tf.fill(dims, value, name=)

　　tf.ones_like(tensor, dtype=, name=)

　　tf.ones(shape, dtype=, name=)

  (3) Random-number generation functions

　　tf.random_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)

　　tf.truncated_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)

　　tf.random_uniform(shape, minval=0.0, maxval=1.0, dtype=tf.float32, seed=None, name=None)

　　tf.random_shuffle(value, seed=None, name=None)

  (4) Placeholder: tf.placeholder(dtype, shape=, name=)

x=tf.placeholder(tf.float32,shape=(None,3), name='input')
...
sess.run( y ,feed_dict={ x:[[1,2,3]]  } )

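    i) dtype is the data type, shape is the dimensions, and name is the name.

    ii) When the computation is run in a session, the corresponding values are supplied through the feed_dict dictionary.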

  (5) Matrix multiplication: tf.matmul(matrix1, matrix2)

    i) By contrast, matrix1 * matrix2 multiplies the two matrices element by element.
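
To make the difference concrete, here is a small plain-Python sketch. The helper names matmul and elementwise and the two 2x2 matrices are made up for this example; they only mimic what tf.matmul and the * operator compute.

```python
def matmul(a, b):
    """Matrix product: each row of a dotted with each column of b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def elementwise(a, b):
    """Element-wise product of two matrices with the same shape."""
    return [[x * y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


m1 = [[1, 2], [3, 4]]
m2 = [[5, 6], [7, 8]]

print(matmul(m1, m2))       # [[19, 22], [43, 50]]
print(elementwise(m1, m2))  # [[5, 12], [21, 32]]
```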

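  (6) Initializing all variables (parameters): tf.global_variables_initializer()

# Option 1
init_op = tf.global_variables_initializer()
sess.run(init_op)

# Option 2 (older name, deprecated in favor of option 1)
init_op = tf.initialize_all_variables()
sess.run(init_op)

    i) Variables may depend on one another, so initializing them one at a time is cumbersome; a single initializer op covers them all.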

  (7) Activation functions

　　tf.nn.relu()
　　tf.nn.sigmoid()
　　tf.nn.tanh()
　　tf.nn.elu()
　　tf.nn.bias_add()
　　tf.nn.crelu()
　　tf.nn.relu6()
　　tf.nn.softplus()
　　tf.nn.softsign()
    tf.nn.dropout()  # guards against overfitting by dropping some neurons

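  (8) Loss functions

    i) Mean squared error (MSE): the most commonly used loss function for regression problems. Its advantage is that it suits gradient descent: the loss falls quickly when the error is large and slowly when the error is small, which helps convergence. Its drawback is that outliers far outside the normal range affect it strongly.

     loss = tf.losses.mean_squared_error(y_true, y_pred)
     loss = tf.reduce_mean(tf.square(y_true - y_pred))  # an equivalent implementation of the line above

    ii) Mean absolute error (MAE): use it when you want extra robustness against outliers.

     mae = tf.losses.absolute_difference(y_true, y_pred)
     mae_loss = tf.reduce_sum(mae)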
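
The outlier sensitivity is easy to see on made-up numbers. The sketch below uses plain Python rather than tf.losses, and it averages over samples for both metrics (the MAE snippet above sums instead); the data and function names are assumptions for illustration.

```python
def mse(y_true, y_pred):
    """Mean of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean of absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [1.0, 2.0, 3.0, 4.0]
y_pred_clean = [1.5, 2.5, 3.5, 4.5]     # every prediction is off by 0.5
y_pred_outlier = [1.5, 2.5, 3.5, 14.0]  # the last prediction is off by 10

print(mse(y_true, y_pred_clean), mae(y_true, y_pred_clean))      # 0.25 0.5
print(mse(y_true, y_pred_outlier), mae(y_true, y_pred_outlier))  # 25.1875 2.875
```

One bad sample multiplies the MSE by roughly 100 here, while the MAE grows by less than a factor of 6.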

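  (9) Custom loss functions

    tf.greater(x, y): tests whether x is greater than y, broadcasting first when the dimensions differ
    tf.where(condition, x, y): returns x where condition is true, otherwise y
    tf.reduce_mean(): mean along the given dimensions
    tf.reduce_sum(): sum along the given dimensions
    tf.reduce_prod(): product along the given dimensions
    tf.reduce_min(): minimum along the given dimensions
    tf.reduce_max(): maximum along the given dimensions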
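
One way these pieces might combine is an asymmetric loss that charges more for predicting too low than too high. The sketch below mimics tf.greater, tf.where, and tf.reduce_sum with plain-Python helpers; the cost constants, the helper names, and the loss itself are assumptions for illustration, not something these notes prescribe.

```python
COST_UNDER = 9.0  # assumed penalty per unit when the prediction is too low
COST_OVER = 1.0   # assumed penalty per unit when the prediction is too high


def greater(xs, ys):
    """Element-wise x > y, like tf.greater on same-shaped inputs."""
    return [x > y for x, y in zip(xs, ys)]


def where(condition, xs, ys):
    """Pick from xs where condition holds, else from ys, like tf.where."""
    return [x if c else y for c, x, y in zip(condition, xs, ys)]


def asymmetric_loss(y_true, y_pred):
    over = [(p - t) * COST_OVER for t, p in zip(y_true, y_pred)]
    under = [(t - p) * COST_UNDER for t, p in zip(y_true, y_pred)]
    per_sample = where(greater(y_pred, y_true), over, under)
    return sum(per_sample)  # plays the role of tf.reduce_sum


y_true = [10.0, 10.0]
print(asymmetric_loss(y_true, [12.0, 12.0]))  # 4.0  (2 too high, twice)
print(asymmetric_loss(y_true, [8.0, 8.0]))    # 36.0 (2 too low, twice)
```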

  (10) Optimizers

　　tf.train.GradientDescentOptimizer
　　tf.train.AdadeltaOptimizer
　　tf.train.AdagradOptimizer
　　tf.train.AdagradDAOptimizer
　　tf.train.MomentumOptimizer
　　tf.train.AdamOptimizer
　　tf.train.FtrlOptimizer
　　tf.train.RMSPropOptimizer

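  (11) Regularization (adds a model-complexity term to the loss function, weighting the weights W to dampen noise in the training data; the bias b is usually not regularized)

    The two regularization functions in TensorFlow:

    i) L1 regularization: tf.contrib.layers.l1_regularizer(REGULARIZER)(w) => loss_l1(w) = Σ |w_i|

    ii) L2 regularization: tf.contrib.layers.l2_regularizer(REGULARIZER)(w) => loss_l2(w) = Σ |w_i²|

      Example: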

w=tf.Variable(...)
y=tf.matmul( x,w )

loss=tf.reduce_mean(tf.square(y_ - y ))



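# Option 1
loss_total = loss + tf.contrib.layers.l2_regularizer(regularizer_rate)(w)    # or: regularizer = tf.contrib.layers.l2_regularizer(regularizer_rate); regular_loss = regularizer(w)

# Option 2: add the regularization term to a collection named 'losses'
tf.add_to_collection('losses', tf.contrib.layers.l2_regularizer(regularizer_rate)(w))
loss_total = loss + tf.add_n(tf.get_collection('losses'))

# Option 2 is the one commonly used.
# Reasons: 1. Option 1 makes the definition of the loss very long.
#          2. The code that defines the network and the code that computes the loss may not live in the same function, in which case option 1 is inconvenient.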
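
For a feel of the numbers, the two penalty formulas can be evaluated by hand. The weight matrix, the rate, and the data loss below are invented for illustration. As far as I know, TensorFlow's l2_regularizer is built on tf.nn.l2_loss, which also halves the sum of squares, so the library's value differs from this sketch by that constant factor.

```python
def l1_penalty(weights):
    """Sum of absolute values of all weights."""
    return sum(abs(w) for row in weights for w in row)


def l2_penalty(weights):
    """Sum of squares of all weights."""
    return sum(w * w for row in weights for w in row)


w = [[1.0, -2.0], [-3.0, 4.0]]  # assumed weights
regularizer_rate = 0.5          # assumed regularization weight
data_loss = 1.0                 # stand-in for the data-fitting loss

loss_total_l1 = data_loss + regularizer_rate * l1_penalty(w)
loss_total_l2 = data_loss + regularizer_rate * l2_penalty(w)

print(l1_penalty(w), l2_penalty(w))  # 10.0 30.0
print(loss_total_l1, loss_total_l2)  # 6.0 16.0
```

The L2 term grows much faster for large weights, which is why it pushes hardest on the biggest entries.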

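  (12) Learning-rate schedule: learning_rate = tf.train.exponential_decay(learning_rate_base, global_step, decay_step, decay_rate, staircase=True/False)

      i) The function computes: learning_rate = learning_rate_base * decay_rate ^ (global_step / decay_step)

      ii) learning_rate_base is the initial learning rate; global_step is the current iteration count and is not a trainable parameter; decay_step sets the decay speed; decay_rate is the decay rate. With staircase=True the learning rate is updated once every decay_step steps (global_step / decay_step is truncated to an integer); otherwise it is updated every step.

        Example: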

global_step=tf.Variable(0,trainable=False)
learning_rate = tf.train.exponential_decay(learning_rate_base, global_step, decay_step, decay_rate, staircase=True)
train_=tf.train.GradientDescentOptimizer(learning_rate).minimize(...loss...,global_step=global_step)
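
The decay formula is short enough to reproduce in plain Python. The function below is a sketch of the formula only, with argument names borrowed from these notes and invented sample values; it is not the TensorFlow implementation.

```python
def exponential_decay(learning_rate_base, global_step, decay_step,
                      decay_rate, staircase=False):
    """learning_rate_base * decay_rate ** (global_step / decay_step)."""
    if staircase:
        exponent = global_step // decay_step  # integer division: stepwise decay
    else:
        exponent = global_step / decay_step   # smooth decay at every step
    return learning_rate_base * decay_rate ** exponent


# Assumed values: start at 0.1 and halve every 100 steps.
for step in (0, 99, 100, 150):
    smooth = exponential_decay(0.1, step, 100, 0.5)
    stepped = exponential_decay(0.1, step, 100, 0.5, staircase=True)
    print(step, smooth, stepped)
```

With staircase=True the rate stays at 0.1 through step 99 and drops to 0.05 at step 100; without it the rate shrinks a little at every step.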

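  (13) Moving-average model: tf.train.ExponentialMovingAverage(decay, step)

      i) Basic idea: during gradient-descent training, a shadow variable is maintained for every weight and updated each time the weight is updated. As training proceeds, the shadow variable settles near the true weight value, so using the shadow values at prediction time can give better results.

      where: shadowVariable = decay * shadowVariable + (1 - decay) * variable; for details see https://blog.csdn.net/u012436149/article/details/56484572

      ii) decay is the decay rate. When step is supplied (it is passed as the num_updates argument), the effective rate becomes decay = min(decay, (1 + step) / (10 + step)), so the average moves faster early in training.

      iii) decay is usually set very close to 1 (for example 0.999 or 0.9999).

        Example: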

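step = tf.Variable(0, trainable=False)   # simulates the iteration count of the network
v1 = tf.Variable(0, dtype=tf.float32)    # variable whose moving average is tracked

# Option 1
# Define the moving-average object, giving it the decay rate and the variable that controls the decay rate
ema = tf.train.ExponentialMovingAverage(0.99, step)

# Define the op that updates the moving averages. It takes a list; each run updates the averages of the listed variables
maintain_average_op = ema.apply([v1])  # in practice: maintain_average_op = ema.apply(tf.trainable_variables()), which averages every variable; tf.trainable_variables() returns all trainable variables as a list
...
sess.run(maintain_average_op)  # run this after every optimization step that updates v1

# Option 2
ema = tf.train.ExponentialMovingAverage(0.99, step)
maintain_average_op = ema.apply(tf.trainable_variables())
# Bind the training step and the moving-average update together so they run as a single node
with tf.control_dependencies([train_step, maintain_average_op]):
    train_op = tf.no_op(name='train')
# Use ema.average(variable) to read the moving average of a given parameter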
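
The shadow-variable update and the step-dependent decay can be traced by hand. The helper names and the sample values below are assumptions for illustration; the arithmetic follows the two formulas given in these notes rather than TensorFlow's source.

```python
def effective_decay(decay, step=None):
    """min(decay, (1 + step) / (10 + step)) when step is given, else decay."""
    if step is None:
        return decay
    return min(decay, (1 + step) / (10 + step))


def update_shadow(shadow, variable, decay, step=None):
    """shadowVariable = decay * shadowVariable + (1 - decay) * variable."""
    d = effective_decay(decay, step)
    return d * shadow + (1 - d) * variable


shadow = 0.0  # the shadow value starts equal to the variable's initial value

# Early in training (step 0) the effective decay is only 0.1,
# so the shadow jumps most of the way toward the new value 5.0.
shadow = update_shadow(shadow, 5.0, decay=0.99, step=0)
print(shadow)

# Late in training (step 10000) the effective decay is capped at 0.99,
# so the shadow barely moves toward the new value 10.0.
shadow = update_shadow(shadow, 10.0, decay=0.99, step=10000)
print(shadow)
```

The first update lands at about 4.5 and the second at about 4.555, which shows how the step-based cap makes the average responsive early and stable later.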