[Machine Learning] Learning TensorFlow (Part 1)

Date: 2019-11-19

Tags: machine learning, tensorflow, learning

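Thanks to Professor Hu He of Renmin University of China, whose lectures were excellent.

First, what is a tensor? It is a high-dimensional array: a matrix is two-dimensional, and a tensor generalizes this to n dimensions (each tensor has a type and a shape).

TensorFlow works by first defining a graph made up of nodes. At run time it computes only the nodes that the requested node depends on, which is efficient and scales to large data. This design also lends itself to distributed execution: the computation of a complex neural network can be split up and run in parallel on other cores.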
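To make the "only compute what the requested node depends on" idea concrete, here is a toy sketch in plain Python. It is not how TensorFlow is implemented; the Node class and the run function are made up purely for illustration.

```python
# A toy dependency graph in plain Python (not TensorFlow's real implementation).
class Node:
    def __init__(self, name, fn, inputs=()):
        self.name = name
        self.fn = fn          # function that combines the input values
        self.inputs = inputs  # nodes this node depends on


def run(target, computed):
    """Evaluate `target`, visiting only the nodes it depends on."""
    values = [run(node, computed) for node in target.inputs]
    computed.append(target.name)  # record which nodes were actually evaluated
    return target.fn(*values)


x = Node("x", lambda: 3)
y = Node("y", lambda: 4)
f = Node("f", lambda a, b: a * a * b + b + 2, inputs=(x, y))
unused = Node("unused", lambda a: a * 100, inputs=(x,))  # defined but never requested

computed = []
result = run(f, computed)
print(result)    # 42
print(computed)  # ['x', 'y', 'f']
```

The node named unused is part of the definition but is never evaluated, because f does not depend on it.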

Related frameworks: Theano, Torch, Caffe (especially for image processing), Deeplearning4j, H2O, MXNet, TensorFlow

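Runtime environment

Download Docker

Open the Docker Quickstart Terminal

The text highlighted in red shows the IP address of the Docker virtual machine (use it wherever localhost appears later)

docker pull tensorflow/tensorflow    // find and download the TensorFlow image

docker images    // list the downloaded images

docker run -p 8888:8888 tensorflow/tensorflow    // run it on port 8888

A token will be printed. Copy that link and replace localhost with the IP address above to open Jupyter, the notebook editor that ships with the TensorFlow image.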

A basic skeleton

# import
import tensorflow as tf
# define variables (nodes)
x = tf.Variable(3, name="x")
y = tf.Variable(4, name="y")
f = x*x*y + y + 2
# create a session
sess = tf.Session()
# initialize the variables defined above
sess.run(x.initializer)
sess.run(y.initializer)
# run the session
result = sess.run(f)
print(result)  # 42
# release resources (note the parentheses: close must be called)
sess.close()

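There is also a more concise way to create and run a session

# a better way
with tf.Session() as sess:
    x.initializer.run()
    y.initializer.run()
    # eval() evaluates f
    result = f.eval()

The two initialization lines can also be written as

init = tf.global_variables_initializer()

init.run()

The session can also be created with sess = tf.InteractiveSession(), which is more convenient to work with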

init = tf.global_variables_initializer()
sess = tf.InteractiveSession()
init.run()
result = f.eval()
print(result)

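TensorFlow code therefore has two parts: a definition part and an execution part

TensorFlow operates on graphs: there is an automatically created default graph, and there are graphs you define yourself

# the default graph created by the system
>>> x1 = tf.Variable(1)
>>> x1.graph is tf.get_default_graph()
True
# a user-defined graph
>>> graph = tf.Graph()
>>> with graph.as_default():
...     x2 = tf.Variable(2)
...
>>> x2.graph is graph
True
>>> x2.graph is tf.get_default_graph()
False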

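Lifecycle of a node value

In the code below, the first session evaluates y and z separately, so w and x are computed twice. The second approach, sess.run([y, z]), evaluates both in a single graph run, so the shared part is found and x is computed only once.

Node values are dropped when a graph run ends (only variable values are kept by the session). Anything not saved separately has to be computed again with another run.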

w = tf.constant(3)
x = w + 2
y = x + 5
z = x * 3
with tf.Session() as sess:
    print(y.eval()) # 10
    print(z.eval()) # 15

with tf.Session() as sess:
    y_val, z_val = sess.run([y, z])
    print(y_val) # 10
    print(z_val) # 15
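The difference between the two approaches can be imitated in plain Python. The sketch below is only an analogy, not TensorFlow internals: the graph dictionary, the run function and the counter are made up for illustration. Values are cached for the duration of one run and thrown away afterwards.

```python
# Count how often x is evaluated in separate runs versus one joint run.
eval_count = {"x": 0}


def compute_x(w):
    eval_count["x"] += 1
    return w + 2


# node name -> (function, names of the input nodes)
graph = {
    "w": (lambda: 3, ()),
    "x": (compute_x, ("w",)),
    "y": (lambda x: x + 5, ("x",)),
    "z": (lambda x: x * 3, ("x",)),
}


def run(fetches):
    """One 'graph run': values are cached during the run and discarded after it."""
    cache = {}

    def value_of(name):
        if name not in cache:
            fn, inputs = graph[name]
            cache[name] = fn(*(value_of(i) for i in inputs))
        return cache[name]

    return [value_of(name) for name in fetches]


# Two separate runs: x is evaluated twice.
y_val = run(["y"])[0]
z_val = run(["z"])[0]
separate_count = eval_count["x"]

# One run fetching both: x is evaluated once.
eval_count["x"] = 0
y_val2, z_val2 = run(["y", "z"])
joint_count = eval_count["x"]

print(y_val, z_val, separate_count)  # 10 15 2
print(y_val2, z_val2, joint_count)   # 10 15 1
```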

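Linear Regression with TensorFlow

y = wx + b = w'x'    // x' is x with one extra component that is always 1, and w' is w with b appended

The California housing dataset is used here

TensorFlow works with column vectors here, so the target needs reshape(-1, 1) to turn the 1-D array into a column vector

Solving with the normal equation

import numpy as np
from sklearn.datasets import fetch_california_housing
housing = fetch_california_housing()
# get the data dimensions: number of rows and columns of the matrix
m, n = housing.data.shape
# np.c_ concatenates columns; this adds a column that is all ones
housing_data_plus_bias = np.c_[np.ones((m, 1)), housing.data]
# the data is small, so it can be loaded directly into constant nodes;
# this does not work for massive data (which is fed in mini-batches instead)
X = tf.constant(housing_data_plus_bias, dtype=tf.float32, name="X")
# reshape the target into a column vector
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")
XT = tf.transpose(X)
# solve for theta with the normal equation, as covered earlier for linear models
theta = tf.matmul(tf.matmul(tf.matrix_inverse(tf.matmul(XT, X)), XT), y)
# compute the weights
with tf.Session() as sess:
    theta_value = theta.eval()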

The usual NumPy or scikit-learn approach is arguably more direct. Because the underlying libraries differ, the values they compute are not exactly identical.

# using numpy
X = housing_data_plus_bias
y = housing.target.reshape(-1, 1)
theta_numpy = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)
# using sklearn
from sklearn.linear_model import LinearRegression
lin_reg = LinearRegression()
lin_reg.fit(housing.data, housing.target.reshape(-1, 1))
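To see what the normal equation theta = (X^T X)^-1 X^T y does without downloading the dataset, here is a minimal sketch in plain Python for a single feature plus the bias column. The data is synthetic (y = 1 + 2x exactly) and every variable name is made up for illustration.

```python
# Normal equation for one feature plus a bias column, written out by hand.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0 + 2.0 * x for x in xs]  # synthetic targets: bias 1, weight 2

# Prepend the constant 1 (the bias trick), like np.c_[np.ones((m, 1)), data].
X = [[1.0, x] for x in xs]

# Entries of the 2x2 matrix X^T X = [[a, b], [b, d]] and the vector X^T y = [p, q].
a = sum(row[0] * row[0] for row in X)
b = sum(row[0] * row[1] for row in X)
d = sum(row[1] * row[1] for row in X)
p = sum(row[0] * y for row, y in zip(X, ys))
q = sum(row[1] * y for row, y in zip(X, ys))

# Invert the 2x2 matrix in closed form and multiply by [p, q].
det = a * d - b * b
bias = (d * p - b * q) / det
weight = (a * q - b * p) / det

print(bias, weight)  # 1.0 2.0
```

The recovered bias and weight match the values used to generate the data, which is what the TensorFlow, NumPy and scikit-learn versions above compute at a larger scale.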

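This may raise the question of why TensorFlow seems to make things more complicated. That is only because the data here is small; for large-scale computation, TensorFlow's automatic optimization pays off considerably.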

Solving with gradient descent

# gradient descent needs the features to be scaled first
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaled_housing_data = scaler.fit_transform(housing.data)
scaled_housing_data_plus_bias = np.c_[np.ones((m, 1)), scaled_housing_data]
# iterate 1000 times
n_epochs = 1000
learning_rate = 0.01
# because gradient descent is used, X is built from the scaled data
X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")
# gradient descent needs an initial value
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0), name="theta")
# current prediction: X is m x (n+1), theta is (n+1) x 1, so the product has the shape of y
y_pred = tf.matmul(X, theta, name="predictions")
# overall error
error = y_pred - y
# tf.reduce_mean can average along a given axis, or over all elements as here
mse = tf.reduce_mean(tf.square(error), name="mse")
# the training step is written by hand for now; in practice TensorFlow's
# built-in autodiff can compute the gradient automatically
# (2.0 rather than 2 so the division is not truncated under Python 2)
gradients = 2.0 / m * tf.matmul(tf.transpose(X), error)
training_op = tf.assign(theta, theta - learning_rate * gradients)
# initialize and start solving
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        # print the current mean squared error every 100 epochs
        if epoch % 100 == 0:
            print("Epoch", epoch, "MSE =", mse.eval())
        sess.run(training_op)
    best_theta = theta.eval()

The autodiff mentioned in the code above looks like this; it computes the gradient automatically

gradients = tf.gradients(mse, [theta])[0]
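As a sanity check on the hand-written formula 2/m * X^T (X theta - y), the sketch below compares it with a finite-difference estimate in plain Python. Note that this is numerical differentiation, not autodiff; it only illustrates that the manual formula is the gradient of the MSE, which is the quantity tf.gradients is asked for. The data and all names are made up.

```python
# Compare the manual MSE gradient with a central finite-difference estimate.
X = [[1.0, 0.5], [1.0, -1.0], [1.0, 2.0]]  # tiny made-up data, bias column first
y = [1.0, 0.0, 3.0]
theta = [0.2, -0.4]
m = len(X)


def errors_for(theta):
    return [sum(xi * t for xi, t in zip(row, theta)) - yi for row, yi in zip(X, y)]


def mse(theta):
    return sum(e * e for e in errors_for(theta)) / m


def manual_gradient(theta):
    errors = errors_for(theta)
    # component j of 2/m * X^T error
    return [2.0 / m * sum(row[j] * e for row, e in zip(X, errors))
            for j in range(len(theta))]


def numeric_gradient(theta, eps=1e-6):
    grads = []
    for j in range(len(theta)):
        up = list(theta)
        up[j] += eps
        down = list(theta)
        down[j] -= eps
        grads.append((mse(up) - mse(down)) / (2 * eps))
    return grads


analytic = manual_gradient(theta)
numeric = numeric_gradient(theta)
max_diff = max(abs(a - n) for a, n in zip(analytic, numeric))
print(max_diff < 1e-6)  # True
```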

Using an Optimizer

The whole gradient descent update above is wrapped up in the following

optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)

training_op = optimizer.minimize(mse)

There are many other optimizers like this

For example, one with momentum: optimizer = tf.train.MomentumOptimizer(learning_rate=learning_rate, momentum=0.9)
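What momentum adds to plain gradient descent can be sketched in a few lines of plain Python. The update rule used here (accumulate the gradient, then step along the accumulation) is the commonly described form of momentum and is an assumption for illustration, not a copy of the library's code; the function being minimized is made up.

```python
# Gradient descent with momentum on f(t) = (t - 3)^2.
# Assumed update rule: accumulation = momentum * accumulation + gradient,
# then theta -= learning_rate * accumulation.
learning_rate = 0.01
momentum = 0.9


def gradient(theta):
    return 2.0 * (theta - 3.0)


theta = 0.0
accumulation = 0.0
for step in range(1000):
    accumulation = momentum * accumulation + gradient(theta)
    theta -= learning_rate * accumulation

print(round(theta, 4))  # 3.0
```

With momentum set to 0 the loop reduces to ordinary gradient descent.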

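Feeding data to the training algorithm

When the data reaches several GB or tens of GB, loading it directly with constant is clearly impractical, so we use a placeholder instead

(the row dimension is usually None, meaning any number of rows)

The data is fed only when the graph is actually run, so new data can be supplied on every run.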

>>> A = tf.placeholder(tf.float32, shape=(None, 3))
>>> B = A + 5
>>> with tf.Session() as sess:
...     B_val_1 = B.eval(feed_dict={A: [[1, 2, 3]]})
...     B_val_2 = B.eval(feed_dict={A: [[4, 5, 6], [7, 8, 9]]})
...
>>> print(B_val_1)
[[ 6. 7. 8.]]
>>> print(B_val_2)
[[ 9. 10. 11.]
 [ 12. 13. 14.]]

With this, mini-batches can be defined to randomly draw a fixed number of rows at a time; even several TB of data can be sampled this way.

# X and y are expected to be placeholders here rather than constants, e.g.
# X = tf.placeholder(tf.float32, shape=(None, n + 1), name="X")
# y = tf.placeholder(tf.float32, shape=(None, 1), name="y")
batch_size = 100
n_batches = int(np.ceil(m / batch_size))
# draw random rows with replacement
def fetch_batch(epoch, batch_index, batch_size):
    # seed from the position in training so each batch is reproducible
    np.random.seed(epoch * n_batches + batch_index)
    indices = np.random.randint(m, size=batch_size)
    X_batch = scaled_housing_data_plus_bias[indices]
    y_batch = housing.target.reshape(-1, 1)[indices]
    return X_batch, y_batch
# start running
with tf.Session() as sess:
    sess.run(init)
    # train on a freshly drawn batch every step
    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
    # final result
    best_theta = theta.eval()
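The sampling logic of fetch_batch can be tried on its own, without TensorFlow or NumPy. The sketch below uses the standard random module and a made-up ten-row dataset; it shows that seeding from the epoch and batch index makes each batch reproducible, and that rows are drawn with replacement.

```python
import math
import random

# Hypothetical tiny dataset standing in for the housing data.
data = [(float(i), 2.0 * i) for i in range(10)]  # (feature, target) pairs
m = len(data)
batch_size = 4
n_batches = int(math.ceil(m / batch_size))


def fetch_batch(epoch, batch_index, batch_size):
    # Seed from the position in training so each batch is reproducible.
    rng = random.Random(epoch * n_batches + batch_index)
    # Sampling with replacement: the same row may appear more than once.
    indices = [rng.randrange(m) for _ in range(batch_size)]
    return [data[i] for i in indices]


first = fetch_batch(0, 0, batch_size)
again = fetch_batch(0, 0, batch_size)
print(n_batches)       # 3
print(first == again)  # True
```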

Saving and Restoring models

Sometimes a model that has been running for days cannot continue for some reason, so the part trained so far needs to be saved to disk.

init = tf.global_variables_initializer()
saver = tf.train.Saver()
# save the model
with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epochs):
        if epoch % 100 == 0:
            #print("Epoch", epoch, "MSE =", mse.eval())
            # checkpoint every 100 epochs
            save_path = saver.save(sess, "/tmp/my_model.ckpt")
        sess.run(training_op)

    best_theta = theta.eval()
    save_path = saver.save(sess, "/tmp/my_model_final.ckpt")

# restore the model (restoring replaces running init)
with tf.Session() as sess:
    saver.restore(sess, "/tmp/my_model_final.ckpt")
    best_theta_restored = theta.eval()
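The checkpoint-and-resume pattern itself does not depend on TensorFlow. Here is a minimal sketch using a plain JSON file; this is not TensorFlow's checkpoint format, and the path, the stored epoch field and the toy training loop are all assumptions made for illustration.

```python
import json
import os
import tempfile

# Hypothetical checkpoint location in a fresh temporary directory.
checkpoint_path = os.path.join(tempfile.mkdtemp(), "my_model.json")


def save(state, path):
    with open(path, "w") as f:
        json.dump(state, f)


def restore(path):
    with open(path) as f:
        return json.load(f)


# "Train": move theta toward 3, saving a checkpoint every 100 epochs.
theta = 0.0
for epoch in range(250):
    if epoch % 100 == 0:
        save({"epoch": epoch, "theta": theta}, checkpoint_path)
    theta -= 0.01 * 2.0 * (theta - 3.0)

# Simulate an interruption: reload the last checkpoint to resume from it.
state = restore(checkpoint_path)
print(state["epoch"])  # 200
```

Storing the epoch next to the parameters is a choice made in this sketch so that training can pick up where it stopped.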

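About TensorBoard

As is well known, neural networks and machine learning models are mostly black boxes, which can be unsettling. TensorBoard's job is to make that black box a little more transparent.

Starting TensorBoard

Run docker ps to find the ID of the running container

Enter the container (for example with docker exec)

Run tensorboard --logdir=tf_logs to open the logs already written to the tf_logs directory. The code that generates them is shown below.

from datetime import datetime

# one timestamped log directory per run
now = datetime.utcnow().strftime("%Y%m%d%H%M%S")
root_logdir = "tf_logs"
logdir = "{}/run-{}/".format(root_logdir, now)
...
mse_summary = tf.summary.scalar('MSE', mse)
file_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())
...
            # inside the mini-batch training loop: log the MSE every 10 batches
            if batch_index % 10 == 0:
                summary_str = mse_summary.eval(feed_dict={X: X_batch, y: y_batch})
                step = epoch * n_batches + batch_index
                file_writer.add_summary(summary_str, step)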
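The two bookkeeping ideas in that snippet, one log directory per run and a global step number, can be checked without TensorFlow. The sketch below uses made-up small values for n_epochs and n_batches.

```python
from datetime import datetime, timezone

# One directory per run, named by a UTC timestamp.
root_logdir = "tf_logs"
now = datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")
logdir = "{}/run-{}/".format(root_logdir, now)

# Hypothetical small values, just to show which steps get logged.
n_epochs, n_batches = 2, 25
logged_steps = []
for epoch in range(n_epochs):
    for batch_index in range(n_batches):
        if batch_index % 10 == 0:
            # global step: batches seen so far across all epochs
            logged_steps.append(epoch * n_batches + batch_index)

print(logdir.startswith("tf_logs/run-"))  # True
print(logged_steps)  # [0, 10, 20, 25, 35, 45]
```

Because the step counts batches across epochs, the points from later epochs continue along the x-axis instead of overwriting earlier ones.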
