TensorFlow: Advanced TensorFlow

1. Merging and Splitting

import  matplotlib
from    matplotlib import pyplot as plt
# Default parameters for plots
matplotlib.rcParams['font.size'] = 20
matplotlib.rcParams['figure.titlesize'] = 20
matplotlib.rcParams['figure.figsize'] = [9, 7]
matplotlib.rcParams['font.family'] = ['STKaiTi']
matplotlib.rcParams['axes.unicode_minus']=False 
import numpy as np
import  tensorflow as tf
from    tensorflow import keras
from    tensorflow.keras import datasets, layers, optimizers
import  os
from mpl_toolkits.mplot3d import Axes3D

1.1 Merging

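In TensorFlow, tensors are merged with tf.concat(tensors, axis), where tensors is a list holding all the tensors to merge and axis is the dimension along which they are concatenated. All other dimensions must have matching lengths. Tensors A and B are merged as follows: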

a = tf.random.normal([2,4]) # simulated grade book A
b = tf.random.normal([2,4]) # simulated grade book B
tf.concat([a,b],axis=0)

<tf.Tensor: shape=(4, 4), dtype=float32, numpy=
array([[ 0.16198424, -0.7170487 , -0.20940438, -0.46842927],
       [ 0.48012358,  0.82777774, -0.37541786, -0.6629169 ],
       [-0.15179256, -0.41726607, -1.9034436 ,  0.72443116],
       [-0.48332193,  0.23101914,  0.87393326, -1.2042308 ]],
      dtype=float32)>

tf.concat([a,b],axis=1)

<tf.Tensor: shape=(2, 8), dtype=float32, numpy=
array([[ 0.16198424, -0.7170487 , -0.20940438, -0.46842927, -0.15179256,
        -0.41726607, -1.9034436 ,  0.72443116],
       [ 0.48012358,  0.82777774, -0.37541786, -0.6629169 , -0.48332193,
         0.23101914,  0.87393326, -1.2042308 ]], dtype=float32)>
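The shape rule for tf.concat can be checked with a small sketch. The grade-book shapes below (classes, students, subjects) are illustrative assumptions, not data from the text; only the concatenation axis may differ in length between the inputs.

```python
import tensorflow as tf

# Hypothetical grade books: 4 and 6 classes, 35 students, 8 subjects each.
a = tf.zeros([4, 35, 8])
b = tf.ones([6, 35, 8])

# Only axis 0 (the class axis) differs, so concatenating along it is valid.
merged = tf.concat([a, b], axis=0)
merged_shape = merged.shape.as_list()
print(merged_shape)

# A mismatch on a non-concatenation axis (32 vs. 35 students) is rejected.
try:
    tf.concat([tf.zeros([4, 32, 8]), b], axis=0)
    mismatch_rejected = False
except (tf.errors.InvalidArgumentError, ValueError):
    mismatch_rejected = True
print(mismatch_rejected)
```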

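tf.stack(tensors, axis) merges several tensors of identical shape by creating a new dimension. When axis ≥ 0, the new dimension is inserted before position axis; when axis < 0, it is inserted after position axis (so axis=-1 appends it at the end).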

a = tf.random.normal([2,2])
b = tf.random.normal([2,2])
tf.stack([a,b],axis=0) # insert the new dimension at the front

<tf.Tensor: shape=(2, 2, 2), dtype=float32, numpy=
array([[[-2.1611633 ,  0.4318549 ],
        [-1.7556009 ,  0.6960343 ]],

       [[-0.84239227,  0.9800302 ],
        [ 0.5497298 ,  0.0607984 ]]], dtype=float32)>

The new dimension can also be inserted at other positions, for example at the very end:

a = tf.random.normal([2,2])
b = tf.random.normal([2,2])
tf.stack([a,b],axis=-1)

<tf.Tensor: shape=(2, 2, 2), dtype=float32, numpy=
array([[[-2.09798   ,  0.5505884 ],
        [-1.1372471 ,  0.08376882]],

       [[-1.0453051 ,  0.47830236],
        [-1.1234645 , -0.97358865]]], dtype=float32)>
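With square [2,2] inputs, both axis choices give shape (2, 2, 2), so the position of the new dimension is hard to see. The following sketch uses non-square tensors (the shapes are an illustrative assumption) and contrasts tf.stack with tf.concat.

```python
import tensorflow as tf

a = tf.zeros([35, 8])
b = tf.ones([35, 8])

# tf.stack creates a new dimension of length 2 (one slot per input tensor).
front = tf.stack([a, b], axis=0)   # new dimension at the front
back = tf.stack([a, b], axis=-1)   # new dimension at the end

# tf.concat joins along an existing dimension instead.
joined = tf.concat([a, b], axis=0)

front_shape = front.shape.as_list()
back_shape = back.shape.as_list()
joined_shape = joined.shape.as_list()
print(front_shape, back_shape, joined_shape)
```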

1.2 Splitting

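Splitting is the inverse of merging: it breaks one tensor into several tensors.

Use tf.split(x, num_or_size_splits, axis) to split a tensor:

-x: the tensor to split
-axis: index of the dimension to split along
-num_or_size_splits: the splitting scheme. A single number such as 10 splits the tensor into 10 equal parts; a list gives the length of each part, e.g. [2,4,2,2] splits the tensor into 4 parts of lengths 2, 4, 2, and 2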

x = tf.random.normal([4,2])
print(x)
result = tf.split(x, axis = 0, num_or_size_splits=2)
result

tf.Tensor(
[[ 0.77127916  0.62768835]
 [-0.76758057  1.3676474 ]
 [-0.10122015 -0.917917  ]
 [-0.1592799  -0.33670765]], shape=(4, 2), dtype=float32)






[<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
 array([[ 0.77127916,  0.62768835],
        [-0.76758057,  1.3676474 ]], dtype=float32)>,
 <tf.Tensor: shape=(2, 2), dtype=float32, numpy=
 array([[-0.10122015, -0.917917  ],
        [-0.1592799 , -0.33670765]], dtype=float32)>]

tf.split(x, axis = 0, num_or_size_splits=[1,2,1])

[<tf.Tensor: shape=(1, 2), dtype=float32, numpy=array([[0.77127916, 0.62768835]], dtype=float32)>,
 <tf.Tensor: shape=(2, 2), dtype=float32, numpy=
 array([[-0.76758057,  1.3676474 ],
        [-0.10122015, -0.917917  ]], dtype=float32)>,
 <tf.Tensor: shape=(1, 2), dtype=float32, numpy=array([[-0.1592799 , -0.33670765]], dtype=float32)>]
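Since splitting is described as the inverse of merging, a short round-trip sketch can confirm it. The input values here are chosen for illustration only.

```python
import tensorflow as tf

x = tf.reshape(tf.range(8), [4, 2])

# Split the 4 rows into parts of length 1, 2 and 1 along axis 0.
parts = tf.split(x, num_or_size_splits=[1, 2, 1], axis=0)
part_shapes = [p.shape.as_list() for p in parts]
print(part_shapes)

# Concatenating the parts along the same axis restores the original tensor.
restored = tf.concat(parts, axis=0)
round_trip_ok = bool(tf.reduce_all(tf.equal(restored, x)).numpy())
print(round_trip_ok)
```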

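To split a dimension entirely into pieces of length 1, you can use tf.unstack(x, axis) directly. It behaves like a special case of tf.split with the piece length fixed at 1, so only the axis needs to be specified. Unlike tf.split, it also removes the split dimension from each resulting tensor.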

x = tf.random.normal([4,2])
tf.unstack(x, axis = 0)

[<tf.Tensor: shape=(2,), dtype=float32, numpy=array([-0.69661826,  0.42463547], dtype=float32)>,
 <tf.Tensor: shape=(2,), dtype=float32, numpy=array([ 0.40786335, -0.9408407 ], dtype=float32)>,
 <tf.Tensor: shape=(2,), dtype=float32, numpy=array([-0.71312106, -0.33494622], dtype=float32)>,
 <tf.Tensor: shape=(2,), dtype=float32, numpy=array([0.9833806, 0.7918092], dtype=float32)>]
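The difference between the two functions shows up in the shapes of the pieces. A minimal comparison, with illustrative input values:

```python
import tensorflow as tf

x = tf.reshape(tf.range(8), [4, 2])

# tf.split keeps the split dimension (length 1) in every piece.
split_piece_shape = tf.split(x, num_or_size_splits=4, axis=0)[0].shape.as_list()

# tf.unstack removes that dimension from every piece.
pieces = tf.unstack(x, axis=0)
unstack_piece_shape = pieces[0].shape.as_list()
print(split_piece_shape, unstack_piece_shape)

# tf.stack along the same axis is the inverse of tf.unstack.
restored = tf.stack(pieces, axis=0)
restored_ok = bool(tf.reduce_all(tf.equal(restored, x)).numpy())
print(restored_ok)
```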

2. Data Statistics

Neural network computations frequently need statistics of the data, such as the maximum, the mean, and norms.

2.1 Vector Norms

The L1 norm is defined as the sum of the absolute values of all elements of a vector 𝒙:

x = tf.ones([2,2])
tf.norm(x, ord = 1)

<tf.Tensor: shape=(), dtype=float32, numpy=4.0>

The L2 norm is defined as the square root of the sum of the squares of all elements of 𝒙:

tf.norm(x, ord = 2)

<tf.Tensor: shape=(), dtype=float32, numpy=2.0>

The ∞-norm is defined as the maximum absolute value over all elements of 𝒙:

tf.norm(x, ord = np.inf)

<tf.Tensor: shape=(), dtype=float32, numpy=1.0>
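The three definitions can be verified against tf.norm by writing them out with basic reduction ops. The vector [3, -4] is an illustrative choice that gives easily checked values.

```python
import numpy as np
import tensorflow as tf

x = tf.constant([3., -4.])

# Built-in norms.
l1 = float(tf.norm(x, ord=1))
l2 = float(tf.norm(x, ord=2))
linf = float(tf.norm(x, ord=np.inf))

# The same quantities written out from their definitions.
l1_manual = float(tf.reduce_sum(tf.abs(x)))              # sum of |x_i|
l2_manual = float(tf.sqrt(tf.reduce_sum(tf.square(x))))  # sqrt of sum of x_i^2
linf_manual = float(tf.reduce_max(tf.abs(x)))            # max of |x_i|

print(l1, l2, linf)
```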

2.2 Max, Min, Mean, and Sum

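tf.reduce_max, tf.reduce_min, tf.reduce_mean, and tf.reduce_sum compute the maximum, minimum, mean, and sum of a tensor along a given dimension. When no axis is specified, they compute the global maximum, minimum, mean, and sum.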

x = tf.random.normal([2,3])
tf.reduce_max(x, axis = 1)

<tf.Tensor: shape=(2,), dtype=float32, numpy=array([1.1455595, 0.8110037], dtype=float32)>

tf.reduce_min(x, axis = 1)

<tf.Tensor: shape=(2,), dtype=float32, numpy=array([-0.8374149, -1.2768023], dtype=float32)>

tf.reduce_mean(x, axis = 1)

<tf.Tensor: shape=(2,), dtype=float32, numpy=array([ 0.21712641, -0.16247804], dtype=float32)>

tf.reduce_sum(x, axis = 1)

<tf.Tensor: shape=(2,), dtype=float32, numpy=array([ 0.6513792 , -0.48743412], dtype=float32)>
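A small deterministic sketch of the global and per-axis variants follows. The input values are illustrative, and the keepdims argument is shown as an extra option that the text above does not cover.

```python
import tensorflow as tf

x = tf.constant([[1., 2., 3.],
                 [4., 5., 6.]])

# No axis given: reduce over all elements.
global_max = float(tf.reduce_max(x))

# axis=0 reduces over rows, leaving one value per column.
col_sum = tf.reduce_sum(x, axis=0).numpy().tolist()

# keepdims=True keeps the reduced axis with length 1.
row_mean = tf.reduce_mean(x, axis=1, keepdims=True)
row_mean_shape = row_mean.shape.as_list()
row_mean_values = row_mean.numpy().tolist()

print(global_max, col_sum, row_mean_shape, row_mean_values)
```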

When computing an error function, TensorFlow's MSE function yields the error of each sample. To obtain the average error over all samples, apply tf.reduce_mean along the sample dimension:

out = tf.random.normal([4,10]) # network predictions
y = tf.constant([1,2,2,0]) # true labels
y = tf.one_hot(y,depth=10) # one-hot encoding
loss = keras.losses.mse(y,out) # error of each sample
loss = tf.reduce_mean(loss) # average error
loss

<tf.Tensor: shape=(), dtype=float32, numpy=1.0784723>
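To make the two reduction steps explicit, the sketch below computes a per-sample mean squared error with plain tensor ops instead of keras.losses.mse, then averages over the samples. The labels and predictions are made-up values.

```python
import tensorflow as tf

y = tf.one_hot(tf.constant([0, 2]), depth=3)   # [[1,0,0],[0,0,1]]
out = tf.constant([[1., 0., 0.],
                   [0., 1., 0.]])              # hypothetical predictions

# Step 1: average the squared error over the class axis, one value per sample.
per_sample = tf.reduce_mean(tf.square(y - out), axis=1)
per_sample_shape = per_sample.shape.as_list()

# Step 2: average over the sample axis to get a scalar loss.
loss = float(tf.reduce_mean(per_sample))
print(per_sample.numpy(), loss)
```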

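Besides the extreme values of a tensor, we often also want the index at which they occur, for example when predicting labels in a classification task. Consider a 10-class problem where the network output out has shape [2,10], representing the probabilities of 2 samples belonging to each of the 10 classes. Because the position of each element identifies the class whose probability it holds, the prediction usually takes the index of the largest probability as the predicted class of the sample: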

out = tf.random.normal([2,10])
out = tf.nn.softmax(out, axis=1) # convert to probabilities with softmax
out

<tf.Tensor: shape=(2, 10), dtype=float32, numpy=
array([[0.03961995, 0.26136935, 0.01498432, 0.03388612, 0.03053044,
        0.05304638, 0.05151249, 0.0134019 , 0.17832054, 0.3233286 ],
       [0.06895317, 0.13860522, 0.14075696, 0.02185706, 0.04494175,
        0.21044637, 0.20726745, 0.04014605, 0.01419329, 0.11283264]],
      dtype=float32)>

tf.argmax(x, axis) and tf.argmin(x, axis) return the indices of the maximum and minimum values of x along the axis dimension:

pred = tf.argmax(out, axis=1)
pred

<tf.Tensor: shape=(2,), dtype=int64, numpy=array([9, 5], dtype=int64)>
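Because the example above uses random values, here is a deterministic sketch with hand-picked probabilities (an illustrative assumption) so the indices can be checked by eye.

```python
import tensorflow as tf

probs = tf.constant([[0.10, 0.70, 0.20],
                     [0.80, 0.15, 0.05]])

# Index of the largest / smallest probability in each row.
most_likely = tf.argmax(probs, axis=1).numpy().tolist()
least_likely = tf.argmin(probs, axis=1).numpy().tolist()
print(most_likely, least_likely)
```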

2.3 Tensor Comparison

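To compute metrics such as accuracy for a classification task, the predictions are usually compared with the true labels, and the number of correct results is counted. Consider the predictions for 10 samples: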

out = tf.random.normal([10,10])
out = tf.nn.softmax(out, axis=1)
pred = tf.argmax(out, axis=1)
pred

<tf.Tensor: shape=(10,), dtype=int64, numpy=array([3, 2, 4, 3, 0, 4, 5, 0, 2, 6], dtype=int64)>

These are the simulated predictions for 10 samples. We now compare them with the true labels of these 10 samples:

y = tf.random.uniform([10],dtype=tf.int64,maxval=10)
y

<tf.Tensor: shape=(10,), dtype=int64, numpy=array([7, 3, 9, 2, 7, 4, 3, 1, 4, 5], dtype=int64)>

The function tf.equal(a, b) (or tf.math.equal(a, b)) compares the two tensors element by element:

out = tf.equal(pred,y)
out

<tf.Tensor: shape=(10,), dtype=bool, numpy=
array([False, False, False, False, False,  True, False, False, False,
       False])>

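tf.equal() returns a boolean tensor holding the comparison result, so counting its True elements gives the number of correct predictions. To do this, first cast the boolean tensor to a numeric tensor, then sum up the 1s:

out = tf.cast(out, dtype=tf.float32) # bool => float32
correct = tf.reduce_sum(out) # count the True elements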
correct

<tf.Tensor: shape=(), dtype=float32, numpy=1.0>
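The comparison, cast, and reduction steps can be wrapped into one helper. The function name accuracy and the sample labels are hypothetical; taking the mean instead of the sum gives the fraction of correct predictions directly.

```python
import tensorflow as tf

def accuracy(pred, y):
    """Fraction of positions where pred equals y (hypothetical helper)."""
    correct = tf.cast(tf.equal(pred, y), dtype=tf.float32)  # bool => 0./1.
    return tf.reduce_mean(correct)

pred = tf.constant([3, 2, 4, 3, 0], dtype=tf.int64)
y = tf.constant([3, 1, 4, 0, 0], dtype=tf.int64)

acc = float(accuracy(pred, y))  # 3 of the 5 predictions match
print(acc)
```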

2.4 Padding and Tiling

Padding

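Padding is done with tf.pad(x, paddings), where paddings is a nested list of [Left Padding, Right Padding] pairs, one per dimension. For example, [[0,0],[2,1],[1,2]] means: do not pad the first dimension; pad the second dimension with two units on the left (at the start) and one unit on the right (at the end); pad the third dimension with one unit on the left and two units on the right.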

b = tf.constant([1,2,3,4])
tf.pad(b, [[0,2]])  # first dimension: no padding on the left, two units on the right

<tf.Tensor: shape=(6,), dtype=int32, numpy=array([1, 2, 3, 4, 0, 0])>

tf.pad(b, [[2,2]]) # first dimension: two units on the left, two units on the right

<tf.Tensor: shape=(8,), dtype=int32, numpy=array([0, 0, 1, 2, 3, 4, 0, 0])>
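The nested paddings list extends to higher-rank tensors with one pair per dimension. As a sketch, assume a batch of 28x28 single-channel images (an illustrative shape) that should become 32x32 by padding two pixels on every side:

```python
import tensorflow as tf

images = tf.ones([4, 28, 28, 1])  # [batch, height, width, channels]

# One [left, right] pair per dimension; batch and channel axes stay unpadded.
padded = tf.pad(images, [[0, 0], [2, 2], [2, 2], [0, 0]])
padded_shape = padded.shape.as_list()

# Padding uses zeros by default, so the sum of all elements is unchanged.
total_before = float(tf.reduce_sum(images))
total_after = float(tf.reduce_sum(padded))
print(padded_shape, total_before, total_after)
```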

Tiling

tf.tile repeats the data any number of times along any dimension:

x = tf.random.normal([2,2])
tf.tile(x, [1,2])

<tf.Tensor: shape=(2, 4), dtype=float32, numpy=
array([[ 1.462598  ,  1.7452018 ,  1.462598  ,  1.7452018 ],
       [-1.4659724 , -0.47004214, -1.4659724 , -0.47004214]],
      dtype=float32)>
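The second argument of tf.tile gives the repeat count for each dimension. A deterministic sketch with illustrative values:

```python
import tensorflow as tf

x = tf.constant([[1, 2],
                 [3, 4]])

# Repeat twice along axis 0 (rows), once along axis 1 (columns unchanged).
rows_twice = tf.tile(x, [2, 1]).numpy().tolist()

# Repeat twice along both axes.
both_twice_shape = tf.tile(x, [2, 2]).shape.as_list()
print(rows_twice, both_twice_shape)
```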

3. Data Clipping

In TensorFlow, tf.maximum(x, a) clips data from below, so that 𝑥 ∈ [𝑎, +∞), and tf.minimum(x, a) clips data from above, so that 𝑥 ∈ (−∞, 𝑎]. For example:

x = tf.range(9)
tf.maximum(x, 3) # lower bound 3

<tf.Tensor: shape=(9,), dtype=int32, numpy=array([3, 3, 3, 3, 4, 5, 6, 7, 8])>

tf.minimum(x, 5) # upper bound 5

<tf.Tensor: shape=(9,), dtype=int32, numpy=array([0, 1, 2, 3, 4, 5, 5, 5, 5])>

The ReLU function can be implemented as:

def relu(x):
    return tf.maximum(x,0.) # clip from below at 0
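ReLU keeps positive values and replaces negative ones with 0, which is a lower clip at 0 and therefore needs tf.maximum. A quick check against the built-in tf.nn.relu, with illustrative inputs:

```python
import tensorflow as tf

def relu(x):
    # Lower clip at 0: negative entries become 0, the rest pass through.
    return tf.maximum(x, 0.)

x = tf.constant([-2., -0.5, 0., 1.5])
ours = relu(x).numpy().tolist()
builtin = tf.nn.relu(x).numpy().tolist()
print(ours, builtin)
```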

Combining tf.maximum(x, a) and tf.minimum(x, b) clips the data at both the lower and the upper bound, so that 𝑥 ∈ [𝑎, 𝑏]:

x = tf.range(9)
tf.minimum(tf.maximum(x, 2), 7)

<tf.Tensor: shape=(9,), dtype=int32, numpy=array([2, 2, 2, 3, 4, 5, 6, 7, 7])>

More conveniently, tf.clip_by_value clips at both bounds in one call:

tf.clip_by_value(x,2,7) # clip to the range 2~7

<tf.Tensor: shape=(9,), dtype=int32, numpy=array([2, 2, 2, 3, 4, 5, 6, 7, 7])>

4. Advanced Operations

4.1 tf.gather

tf.gather(x, indices, axis) collects the entries of x at the given index numbers along one axis:

x = tf.random.uniform([4,3,2],maxval=100,dtype=tf.int32)
tf.gather(x,[0,1],axis=0)

<tf.Tensor: shape=(2, 3, 2), dtype=int32, numpy=
array([[[51, 45],
        [36, 18],
        [56, 57]],

       [[18, 16],
        [64, 82],
        [13,  4]]])>

In fact, for the request above, slicing with x[:2] is more convenient:

x[0:2]

<tf.Tensor: shape=(2, 3, 2), dtype=int32, numpy=
array([[[51, 45],
        [36, 18],
        [56, 57]],

       [[18, 16],
        [64, 82],
        [13,  4]]])>

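However, for irregular index patterns, for example spot-checking the grades of students No. 1, 4, 9, 12, 13, and 27 in every class, slicing becomes very cumbersome, whereas tf.gather is designed for exactly this need and is very convenient to use. The snippet below picks entries No. 1, 4, and 9 (indices 0, 3, 8) along axis 0 of a smaller tensor: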

x = tf.random.uniform([10,3,2],maxval=100,dtype=tf.int32)
tf.gather(x,[0,3,8],axis=0)

<tf.Tensor: shape=(3, 3, 2), dtype=int32, numpy=
array([[[86, 82],
        [32, 80],
        [35, 71]],

       [[97, 16],
        [22, 83],
        [20, 82]],

       [[79, 86],
        [13, 46],
        [68, 23]]])>
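The student example from the text maps naturally onto gathering along a non-leading axis. The sketch below assumes a grade-book tensor of shape [4, 35, 8] (classes, students, subjects), which is an illustrative layout rather than one taken from the code above.

```python
import tensorflow as tf

# Hypothetical grade book: 4 classes, 35 students, 8 subjects.
x = tf.random.uniform([4, 35, 8], maxval=100, dtype=tf.int32)

# Students No. 1, 4, 9, 12, 13, 27 correspond to 0-based indices below.
picked = tf.gather(x, [0, 3, 8, 11, 12, 26], axis=1)
picked_shape = picked.shape.as_list()

# The second gathered entry is the student at index 3 in every class.
second_matches = bool(tf.reduce_all(tf.equal(picked[:, 1], x[:, 3])).numpy())
print(picked_shape, second_matches)
```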

4.2 tf.gather_nd

tf.gather_nd samples several points at once by specifying the coordinates of each sampling point:

x = tf.random.normal([3,4,4])
tf.gather_nd(x, [[1,2], [2,3]])

<tf.Tensor: shape=(2, 4), dtype=float32, numpy=
array([[-0.5388145 ,  0.00821999,  0.41113982,  1.0409608 ],
       [-0.42242923, -0.29552126,  0.6467382 , -1.7555269 ]],
      dtype=float32)>

tf.gather_nd(x, [[1,1,3], [2,3,3]])

<tf.Tensor: shape=(2,), dtype=float32, numpy=array([ 0.07165062, -1.7555269 ], dtype=float32)>
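How the length of each coordinate affects the result is easier to follow with known values. In this sketch the tensor holds 0..23 (an illustrative choice): a 2-element coordinate selects a whole row, a 3-element coordinate selects a single scalar.

```python
import tensorflow as tf

x = tf.reshape(tf.range(24), [2, 3, 4])

# Coordinates of length 2 index the first two axes and return vectors.
rows = tf.gather_nd(x, [[0, 1], [1, 2]]).numpy().tolist()

# Coordinates of length 3 index all axes and return scalars.
scalars = tf.gather_nd(x, [[0, 1, 2], [1, 2, 3]]).numpy().tolist()
print(rows, scalars)
```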

4.3 tf.boolean_mask

Besides sampling by index, we can also sample with a mask. tf.boolean_mask(x, mask, axis) samples along the axis dimension according to mask:

x = tf.random.normal([3,4,4])
tf.boolean_mask(x,mask=[True, True,False],axis=0)

<tf.Tensor: shape=(2, 4, 4), dtype=float32, numpy=
array([[[ 1.0542077 , -0.48954943, -0.7491975 , -0.43464097],
        [-0.46667233, -1.2484705 , -1.7732694 , -1.2128644 ],
        [ 1.7540843 ,  0.48327965,  0.95591843, -1.5143739 ],
        [ 1.3619318 ,  1.168045  , -0.351565  ,  0.1630519 ]],

       [[-0.13046652, -2.2438517 , -2.3416731 ,  1.4573859 ],
        [ 0.3127366 ,  1.4858567 ,  0.24127336, -1.2466795 ],
        [-0.05732883, -0.75874144,  0.6504554 ,  0.756288  ],
        [-2.8709486 ,  0.11397363, -0.15979192, -0.07177942]]],
      dtype=float32)>

Multi-dimensional mask sampling:

x = tf.random.uniform([2,3,4],maxval=100,dtype=tf.int32)
print(x)
tf.boolean_mask(x,[[True,True,False],[False,False,True]])

tf.Tensor(
[[[63 32 59 60]
  [56 92 36 63]
  [53 66 69 30]]

 [[75 96 67 15]
  [17 11 64 38]
  [17 81 53 21]]], shape=(2, 3, 4), dtype=int32)






<tf.Tensor: shape=(3, 4), dtype=int32, numpy=
array([[63, 32, 59, 60],
       [56, 92, 36, 63],
       [17, 81, 53, 21]])>
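A multi-dimensional mask selects the same entries as listing the coordinates of its True positions. The following sketch, with illustrative values, compares tf.boolean_mask with the combination of tf.where and tf.gather_nd.

```python
import tensorflow as tf

x = tf.reshape(tf.range(24), [2, 3, 4])
mask = tf.constant([[True, True, False],
                    [False, False, True]])

# Sampling with the mask directly.
by_mask = tf.boolean_mask(x, mask).numpy().tolist()

# Same result via the coordinates of the True entries: (0,0), (0,1), (1,2).
coords = tf.where(mask)
by_coords = tf.gather_nd(x, coords).numpy().tolist()
print(by_mask)
```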

4.4 tf.where

tf.where(cond, a, b) reads each element from a or b depending on whether the corresponding element of cond is true or false:

a = tf.ones([3,3])
b = tf.zeros([3,3])
cond = tf.constant([[True,False,False],[False,True,False],[True,True,False]])
tf.where(cond,a,b)

<tf.Tensor: shape=(3, 3), dtype=float32, numpy=
array([[1., 0., 0.],
       [0., 1., 0.],
       [1., 1., 0.]], dtype=float32)>

When a = b = None, i.e. when a and b are not given, tf.where returns the index coordinates of all True elements in cond:

tf.where(cond)

<tf.Tensor: shape=(4, 2), dtype=int64, numpy=
array([[0, 0],
       [1, 1],
       [2, 0],
       [2, 1]], dtype=int64)>
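A common use of the three-argument form is replacing elements that fail a condition while keeping the tensor's shape. A minimal sketch with illustrative values, zeroing out the negative entries:

```python
import tensorflow as tf

x = tf.constant([-1., 2., -3., 4.])

# Keep x where it is positive, otherwise take the value from the zero tensor.
cleaned = tf.where(x > 0, x, tf.zeros_like(x)).numpy().tolist()
print(cleaned)
```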

Let us look at an example:

x = tf.random.normal([3,3])
mask = x > 0
mask

<tf.Tensor: shape=(3, 3), dtype=bool, numpy=
array([[False,  True, False],
       [ True, False, False],
       [ True,  True, False]])>

Use tf.where to extract the indices of the True elements of this mask:

indices=tf.where(mask) # indices of the True elements
indices

<tf.Tensor: shape=(4, 2), dtype=int64, numpy=
array([[0, 1],
       [1, 0],
       [2, 0],
       [2, 1]], dtype=int64)>

With the indices in hand, tf.gather_nd recovers all the positive elements:

tf.gather_nd(x,indices) # extract the positive values

<tf.Tensor: shape=(4,), dtype=float32, numpy=array([0.8857748 , 0.5748998 , 1.3066388 , 0.82504845], dtype=float32)>

The same result can be obtained in a single step with either of the following:

tf.boolean_mask(x,x > 0)

<tf.Tensor: shape=(4,), dtype=float32, numpy=array([0.8857748 , 0.5748998 , 1.3066388 , 0.82504845], dtype=float32)>

x[x>0]

<tf.Tensor: shape=(4,), dtype=float32, numpy=array([0.8857748 , 0.5748998 , 1.3066388 , 0.82504845], dtype=float32)>

4.5 tf.scatter_nd

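tf.scatter_nd(indices, updates, shape) efficiently writes data into selected positions of a tensor, but it can only write onto a blank all-zero tensor, so updating an existing tensor may require combining it with other operations.

The following example shows how scatter_nd performs the update:

# positions to write to
indices = tf.constant([[4], [3],[1],[7]])
# data to write
updates = tf.constant([4.4,3.3,1.1,7.7])
# write updates at indices into an all-zero vector of length 8
tf.scatter_nd(indices, updates, [8])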

<tf.Tensor: shape=(8,), dtype=float32, numpy=array([0. , 1.1, 0. , 3.3, 4.4, 0. , 0. , 7.7], dtype=float32)>

Now consider updating multi-dimensional data:

# positions to write to
indices = tf.constant([[1],[3]])
# data to write
updates = tf.constant([[[5,5,5,5],[6,6,6,6]],
                       [[1,1,1,1],[2,2,2,2]]])
# write updates at indices onto a blank tensor of shape [4,2,4]
tf.scatter_nd(indices, updates, [4,2,4])

<tf.Tensor: shape=(4, 2, 4), dtype=int32, numpy=
array([[[0, 0, 0, 0],
        [0, 0, 0, 0]],

       [[5, 5, 5, 5],
        [6, 6, 6, 6]],

       [[0, 0, 0, 0],
        [0, 0, 0, 0]],

       [[1, 1, 1, 1],
        [2, 2, 2, 2]]])>
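Since tf.scatter_nd only writes onto zeros, updating an existing tensor needs one more step. The sketch below shows two possible approaches under illustrative values: a manual one that masks out the old entries before adding the scattered updates, and the built-in tf.tensor_scatter_nd_update, which is not mentioned in the text above.

```python
import tensorflow as tf

x = tf.ones([8])                               # existing tensor to update
indices = tf.constant([[4], [3], [1], [7]])
updates = tf.constant([4.4, 3.3, 1.1, 7.7])

# Manual approach: zero the target positions, then add the scattered updates.
written = tf.scatter_nd(indices, updates, [8])
keep = 1.0 - tf.scatter_nd(indices, tf.ones_like(updates), [8])
manual = (x * keep + written).numpy().tolist()

# Built-in alternative that writes into a copy of an existing tensor.
builtin = tf.tensor_scatter_nd_update(x, indices, updates).numpy().tolist()
print(manual)
```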

4.6 tf.meshgrid

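tf.meshgrid conveniently generates the coordinates of a 2D grid of sampling points, which is useful for visualization and similar applications.

Sample 100 points along the x axis and 100 points along the y axis; tf.meshgrid(x, y) then returns the coordinates of these 10,000 points, conceptually a tensor of shape [100,100,2]. For convenience, tf.meshgrid returns this tensor already split along axis=2 into two tensors a and b, where a holds the x coordinates of all points and b holds the y coordinates, each of shape [100,100]: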

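x = tf.linspace(-8.,8,100) # sampling points along the x axis
y = tf.linspace(-8.,8,100) # sampling points along the y axis
x,y = tf.meshgrid(x,y) # generate the grid points, returned already split
x.shape,y.shape # shapes of the x and y coordinate tensors of all points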

(TensorShape([100, 100]), TensorShape([100, 100]))
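A tiny grid makes the layout of the two returned tensors visible. The coordinates below are illustrative; stacking the two results along a new last axis gives the combined [rows, cols, 2] tensor that the text describes.

```python
import tensorflow as tf

x = tf.constant([0., 1., 2.])   # 3 sampling points along x
y = tf.constant([10., 20.])     # 2 sampling points along y

gx, gy = tf.meshgrid(x, y)      # both have shape [len(y), len(x)]
gx_values = gx.numpy().tolist()
gy_values = gy.numpy().tolist()

# Re-combine into one tensor holding an (x, y) pair per grid point.
points_shape = tf.stack([gx, gy], axis=2).shape.as_list()
print(gx_values, gy_values, points_shape)
```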

Consider the Sinc function of two variables x and y: $z = \frac{\sin\sqrt{x^2 + y^2}}{\sqrt{x^2 + y^2}}$

The Sinc function is implemented in TensorFlow as follows:

z = tf.sqrt(x**2+y**2) 
z = tf.sin(z)/z # Sinc function

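fig = plt.figure()
ax = fig.add_subplot(projection='3d') # in recent Matplotlib releases Axes3D(fig) is no longer attached to the figure automatically
ax.contour3D(x.numpy(), y.numpy(), z.numpy(), 50)
plt.show()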


5. Loading Datasets

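In TensorFlow, the keras.datasets module provides automatic download, management, loading, and conversion of commonly used classic datasets. TensorFlow additionally provides the tf.data.Dataset object, which makes it convenient to apply common dataset functionality such as multi-threading, preprocessing, shuffling, and training on batches.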

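Calling datasets.xxx.load_data() loads a classic dataset automatically, where xxx is the name of the specific dataset. By default TensorFlow caches the data in the .keras/datasets folder under the user's home directory, so users do not need to care how the dataset is stored. If the dataset is not in the cache, it is downloaded from the web, extracted, and loaded automatically; if it is already cached, it is loaded directly: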

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import datasets # module for loading classic datasets
(x, y), (x_test, y_test) = datasets.mnist.load_data()
print('x:', x.shape, 'y:', y.shape, 'x test:', x_test.shape, 'y test:', y_test)

x: (60000, 28, 28) y: (60000,) x test: (10000, 28, 28) y test: [7 2 1 ... 4 5 6]

load_data() returns the data in a dataset-specific format. For image datasets such as MNIST and CIFAR10 it returns 2 tuples: the first holds the training data x, y, and the second holds the test data x_test, y_test. All of the data is stored in NumPy arrays.

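Once the data is loaded into memory, it needs to be converted into a Dataset object to take advantage of the convenient features TensorFlow provides. Dataset.from_tensor_slices converts the training images x and labels y into a Dataset object: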

train_db = tf.data.Dataset.from_tensor_slices((x, y))
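To see what from_tensor_slices produces without downloading MNIST, the sketch below uses small made-up NumPy arrays. The first axis is treated as the sample axis, so each dataset element is one (x, y) pair.

```python
import numpy as np
import tensorflow as tf

# Stand-in data: 6 samples with 2 features each, plus one label per sample.
x = np.arange(12, dtype=np.float32).reshape(6, 2)
y = np.arange(6, dtype=np.int64)

db = tf.data.Dataset.from_tensor_slices((x, y))

# Each element is a single sample: x has shape [2], y is a scalar.
first_x, first_y = next(iter(db))
first_x_shape = first_x.shape.as_list()
first_label = int(first_y.numpy())
num_samples = sum(1 for _ in db)
print(first_x_shape, first_label, num_samples)
```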

5.1 Shuffling

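Dataset.shuffle(buffer_size) randomly shuffles the order of the samples in a Dataset object, so that the data is not produced in the same fixed order in every training run, which could otherwise let the model try to "memorize" the label sequence:

train_db = train_db.shuffle(10000)

Here buffer_size sets the size of the shuffle buffer; a fairly large value is usually sufficient. These Dataset utility functions return a new Dataset object, so all processing steps can be chained in the form db = db.shuffle().step2().step3(), where step2 and step3 stand for further processing steps. This makes the pipeline very convenient to write.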
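Shuffling changes only the order of the elements, not the elements themselves. A minimal sketch on a toy dataset of the integers 0..9 (an illustrative stand-in for real samples), with the buffer as large as the dataset:

```python
import tensorflow as tf

db = tf.data.Dataset.range(10).shuffle(buffer_size=10)

# Iterating yields every element exactly once, in a random order.
order = [int(v.numpy()) for v in db]
print(order)
```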

5.2 Batching

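To exploit the parallel computing power of GPUs, a network usually processes several samples at the same time. This way of training is called batch training, and the number of samples processed together is called the batch size. To make a Dataset produce batch size samples at a time, set it to batch mode: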

train_db = train_db.batch(128)

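Here 128 is the batch size, i.e. 128 samples are computed in parallel at a time. The batch size is usually chosen according to the available GPU memory; when memory is insufficient, reduce the batch size to lower the memory footprint.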
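When the number of samples is not a multiple of the batch size, the last batch is smaller. The sketch below shows this on a toy dataset of 10 elements with batch size 4 (illustrative numbers), together with the drop_remainder option, which is not covered in the text above.

```python
import tensorflow as tf

db = tf.data.Dataset.range(10)

# Default behaviour: the final batch holds the leftover 2 elements.
sizes = [int(b.shape[0]) for b in db.batch(4)]

# drop_remainder=True discards the incomplete final batch.
sizes_full_only = [int(b.shape[0]) for b in db.batch(4, drop_remainder=True)]
print(sizes, sizes_full_only)
```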

5.3 Preprocessing

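Datasets loaded from keras.datasets mostly do not match the input format a model requires, so users need to implement their own preprocessing function. The map(func) utility of the Dataset object makes it easy to call user-defined preprocessing logic, which is implemented in the function func: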

# The preprocessing logic lives in preprocess; define it first, then pass the function itself to map
def preprocess(x, y): # user-defined preprocessing function
    # x and y are passed in automatically, with shapes [b, 28, 28] and [b]
    # scale to the range 0~1
    x = tf.cast(x, dtype=tf.float32) / 255.
    x = tf.reshape(x, [-1, 28*28]) # flatten
    y = tf.cast(y, dtype=tf.int32) # cast to an integer tensor
    y = tf.one_hot(y, depth=10) # one-hot encoding
    # the returned x, y replace the x, y passed in, which is how the preprocessing takes effect
    return x,y

train_db = train_db.map(preprocess)
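The effect of the mapped function can be checked on fake data with MNIST-like shapes. The arrays below are made-up stand-ins (all pixels set to 255), and the function is repeated here only so that the sketch runs on its own.

```python
import numpy as np
import tensorflow as tf

def preprocess(x, y):
    x = tf.cast(x, dtype=tf.float32) / 255.   # scale to 0~1
    x = tf.reshape(x, [-1, 28 * 28])          # flatten each image
    y = tf.one_hot(tf.cast(y, dtype=tf.int32), depth=10)
    return x, y

# Fake data: 8 all-white 28x28 images with labels 0..7.
fake_x = np.full([8, 28, 28], 255, dtype=np.uint8)
fake_y = np.arange(8, dtype=np.uint8)

# Batch first, so preprocess receives tensors of shape [b, 28, 28] and [b].
db = tf.data.Dataset.from_tensor_slices((fake_x, fake_y)).batch(4).map(preprocess)
bx, by = next(iter(db))

bx_shape, by_shape = bx.shape.as_list(), by.shape.as_list()
bx_max = float(tf.reduce_max(bx))
labels_back = tf.argmax(by, axis=1).numpy().tolist()
print(bx_shape, by_shape, bx_max, labels_back)
```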

5.4 Training Loop

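A Dataset object can be iterated with

for step, (x,y) in enumerate(train_db): # iterate over the dataset, with a step counter
    pass # training step goes here

or:

for x,y in train_db: # iterate over the dataset
    pass # training step goes here

Each iteration returns x,y, a batch of samples and their labels. Once every sample in train_db has been visited, the for loop exits. Training on one batch is called a step; going through the whole training set once, which takes many steps, is called an epoch. In practice, a dataset usually has to be iterated for several epochs to reach good training results.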

Alternatively, set:

train_db = train_db.repeat(20) # iterate over the dataset 20 times before stopping

so that the loop for x,y in train_db runs for 20 epochs before it exits. Either approach achieves the same effect.
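The relation between steps and epochs can be checked by counting batches. This sketch uses a toy dataset of 10 elements with batch size 4 (illustrative numbers), so one epoch takes 3 steps; an explicit epoch loop and repeat(2) then give the same total.

```python
import tensorflow as tf

db = tf.data.Dataset.range(10).batch(4)   # 3 steps per epoch

# Option 1: an explicit outer loop over epochs.
steps_loop = 0
for epoch in range(2):
    for batch in db:
        steps_loop += 1

# Option 2: let the dataset repeat itself for 2 epochs.
steps_repeat = sum(1 for _ in db.repeat(2))
print(steps_loop, steps_repeat)
```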

6. MNIST Handwritten Digit Recognition in Practice

# import the required libraries
import  matplotlib
from    matplotlib import pyplot as plt
# Default parameters for plots
matplotlib.rcParams['font.size'] = 20
matplotlib.rcParams['figure.titlesize'] = 20
matplotlib.rcParams['figure.figsize'] = [9, 7]
matplotlib.rcParams['font.family'] = ['STKaiTi']
matplotlib.rcParams['axes.unicode_minus']=False 
import  tensorflow as tf
from    tensorflow import keras
from    tensorflow.keras import datasets, layers, optimizers
import  os
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
print(tf.__version__)

def preprocess(x, y): 
    # [b, 28, 28], [b]
    x = tf.cast(x, dtype=tf.float32) / 255.
    x = tf.reshape(x, [-1, 28*28])
    y = tf.cast(y, dtype=tf.int32)
    y = tf.one_hot(y, depth=10)
    return x,y
(x, y), (x_test, y_test) = datasets.mnist.load_data()
print('x:', x.shape, 'y:', y.shape, 'x test:', x_test.shape, 'y test:', y_test)

# data preprocessing
batchsz = 512
train_db = tf.data.Dataset.from_tensor_slices((x, y))
train_db = train_db.shuffle(1000).batch(batchsz).map(preprocess).repeat(20)

test_db = tf.data.Dataset.from_tensor_slices((x_test, y_test))
test_db = test_db.shuffle(1000).batch(batchsz).map(preprocess)

x,y = next(iter(train_db))
print('train sample:', x.shape, y.shape)
# print(x[0], y[0])

def main():
    # learning rate
    lr = 1e-2
    accs,losses = [], []
    # create the parameters
    # 784 => 256
    w1, b1 = tf.Variable(tf.random.normal([784, 256], stddev=0.1)), tf.Variable(tf.zeros([256]))
    # 256 => 128
    w2, b2 = tf.Variable(tf.random.normal([256, 128], stddev=0.1)), tf.Variable(tf.zeros([128]))
    # 128 => 10
    w3, b3 = tf.Variable(tf.random.normal([128, 10], stddev=0.1)), tf.Variable(tf.zeros([10]))
    # start training
    for step, (x,y) in enumerate(train_db):
        # [b, 28, 28] => [b, 784]
        x = tf.reshape(x, (-1, 784))
        with tf.GradientTape() as tape: 
            # layer1.
            h1 = x @ w1 + b1
            h1 = tf.nn.relu(h1)
            # layer2
            h2 = h1 @ w2 + b2
            h2 = tf.nn.relu(h2)
            # output
            out = h2 @ w3 + b3
            # out = tf.nn.relu(out)
            # compute loss
            # [b, 10] - [b, 10]
            loss = tf.square(y-out)
            # [b, 10] => scalar
            loss = tf.reduce_mean(loss)
        grads = tape.gradient(loss, [w1, b1, w2, b2, w3, b3])  # compute the gradients
        for p, g in zip([w1, b1, w2, b2, w3, b3], grads): # update the parameters
            p.assign_sub(lr * g)
        # print
        if step % 80 == 0:
            print(step, 'loss:', float(loss))
            losses.append(float(loss))
        if step %80 == 0:
            # evaluate/test
            total, total_correct = 0., 0

            for x, y in test_db:
                # layer1.
                h1 = x @ w1 + b1
                h1 = tf.nn.relu(h1)
                # layer2
                h2 = h1 @ w2 + b2
                h2 = tf.nn.relu(h2)
                # output
                out = h2 @ w3 + b3
                # [b, 10] => [b]
                pred = tf.argmax(out, axis=1)
                # convert one_hot y to number y
                y = tf.argmax(y, axis=1)
                # bool type
                correct = tf.equal(pred, y)
                # bool tensor => int tensor => numpy
                total_correct += tf.reduce_sum(tf.cast(correct, dtype=tf.int32)).numpy()
                total += x.shape[0]

            print(step, 'Evaluate Acc:', total_correct/total)

            accs.append(total_correct/total)


    plt.figure()
    x = [i*80 for i in range(len(losses))]
    plt.plot(x, losses, color='C0', marker='s', label='train')
    plt.ylabel('MSE')
    plt.xlabel('Step')
    plt.legend()
    plt.savefig('train.svg')

    plt.figure()
    plt.plot(x, accs, color='C1', marker='s', label='test')
    plt.ylabel('Accuracy')
    plt.xlabel('Step')
    plt.legend()
    plt.savefig('test.svg')
    plt.show()

if __name__ == '__main__':
    main()