[Tensorflow] Cookbook - The Tensorflow Way

This chapter covers TensorFlow basics, mainly chapters 1 and 2 of the cookbook.

Approach: learn to use it first, customize it later.

Ref: How to get started with TensorFlow?

Ref: How to study TensorFlow code efficiently?

 

Also recommended: three pieces of equipment for this field:

 


 

How TensorFlow Works?

Steps

  1. Import or generate datasets
  2. Transform and normalize data
  3. Partition datasets into train, test, and validation sets
  4. Set algorithm parameters (hyperparameters)
  5. Initialize variables and placeholders
  6. Define the model structure
  7. Declare the loss functions
  8. Initialize and train the model
  9. Evaluate the model
  10. Tune hyperparameters
  11. Deploy/predict new outcomes
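
Roughly, these steps map onto code like the following minimal sketch, a toy one-parameter regression written with the same 0.x-style API as the rest of this post (it is essentially the same example worked through later in the Back Propagation section; all names are illustrative):

import numpy as np
import tensorflow as tf

# 1-3. import / generate the data (no real split for this toy case)
x_all = np.random.normal(1, 0.1, 100)
y_all = np.repeat(10., 100)

# 4. algorithm parameters
learning_rate = 0.02

# 5. placeholders and variables
x_data   = tf.placeholder(shape=[1], dtype=tf.float32)
y_target = tf.placeholder(shape=[1], dtype=tf.float32)
A = tf.Variable(tf.random_normal(shape=[1]))

# 6-7. model structure and loss
my_output = tf.mul(x_data, A)
loss = tf.square(my_output - y_target)

# 8. initialize and train
my_opt = tf.train.GradientDescentOptimizer(learning_rate)
train_step = my_opt.minimize(loss)
sess = tf.Session()
sess.run(tf.initialize_all_variables())
for i in range(100):
    idx = np.random.choice(100)
    sess.run(train_step, feed_dict={x_data: [x_all[idx]], y_target: [y_all[idx]]})

# 9. evaluate (A should end up close to 10)
print(sess.run(A))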

 

Declaring Variables and Tensors

Tensors 【roughly the equivalent of setting constants】

import tensorflow as tf

# Fixed tensors -- four ways to create them
zero_tsr = tf.zeros([2, 3])
ones_tsr = tf.ones([2, 3])
filled_tsr = tf.fill([2, 3], 42)
constant_tsr = tf.constant([1, 2, 3])

# Tensors of similar shape (copying the shape of an existing tensor)
zeros_similar = tf.zeros_like(constant_tsr)
ones_similar = tf.ones_like(constant_tsr)
# For comparison: copy-style creation of a variable
w2 = tf.Variable(w1.initialized_value())
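
Note that w1 is not defined above; a minimal sketch of this copy-style initialization, assuming w1 is itself a variable, might look like:

w1 = tf.Variable(tf.truncated_normal([2, 3], stddev=0.1))
w2 = tf.Variable(w1.initialized_value())        # starts from w1's initial value
w3 = tf.Variable(w1.initialized_value() * 2.0)  # a scaled copy of that initial value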

# Sequence tensors -- created from a range
linear_tsr = tf.linspace(start=0.0, stop=1.0, num=3)
integer_seq_tsr = tf.range(start=6, limit=15, delta=3)

# Random tensors
randunif_tsr = tf.random_uniform([2, 3], minval=0, maxval=1)
randnorm_tsr = tf.random_normal([2, 3], mean=0.0, stddev=1.0)
runcnorm_tsr = tf.truncated_normal([2, 3], mean=0.0, stddev=1.0)

# Shuffling / cropping an existing tensor
input_tensor = runcnorm_tsr
shuffled_output = tf.random_shuffle(input_tensor)
crop_size = 10
cropped_output = tf.random_crop(input_tensor, crop_size)
#cropped_image = tf.random_crop(my_image, [height/2, width/2, 3])

my_var = tf.Variable(tf.zeros([2, 3]))

# We can convert any numpy array to a Python list, or
# constant to a tensor using the function convert_to_tensor().

Printing / displaying a tensor's contents: run the tensor through a session so it actually "flows" (gets computed); only then can its contents be printed.

# How to print the contents of a tensor
# <Sol 1>
sess = tf.Session()
print(sess.run(zero_tsr))
sess.close()

# <Sol 2> recommended!
with tf.Session():
    print(zero_tsr.eval())

 

Computational Graph and Placeholder

import tensorflow as tf

# Define 'symbolic' variables, also called placeholders
a = tf.placeholder("float")
b = tf.placeholder("float")
y = tf.mul(a, b)        # construct an op node

sess = tf.Session()     # open a session
# Run the session: feed in data, evaluate the node, and print the result
print(sess.run(y, feed_dict={a: 3, b: 3}))
# Task done, close the session
sess.close()

 

A plain-language explanation: http://blog.csdn.net/fei13971414170/article/details/73309106

In TensorFlow, data does not live as integers, floats, or strings; it is wrapped in an object called a tensor. 【a tensor is an Object】

A tensor covers quantities of zero up to arbitrarily many dimensions:

    • zero-dimensional ones are called scalars (constants);
    • one-dimensional ones are called vectors;
    • two-dimensional ones are called matrices;
    • higher-dimensional ones are simply called tensors.
# tensor1 is a zero-dimensional int32 tensor
tensor1 = tf.constant(1234)
# tensor2 is a one-dimensional int32 tensor
tensor2 = tf.constant([123, 456, 789])
# tensor3 is a two-dimensional int32 tensor
tensor3 = tf.constant([ [123, 456, 789], [222, 333, 444] ])

 

Placeholders

When you set a tensor's value through feed_dict, the value you supply must have the same type as the one the placeholder was declared with.

x = tf.placeholder(tf.string)
y = tf.placeholder(tf.int32)
z = tf.placeholder(tf.float32)

with tf.Session() as sess:
    output = sess.run(x, feed_dict={x: 'Test String', y: 123, z: 45.67})

 

Using Placeholders and Variables

What's the difference between tf.placeholder and tf.Variable

In short, you use tf.Variable for trainable variables such as weights (W) and biases (B) for your model. 【weights and biases: the data that gets trained】

weights = tf.Variable(
    tf.truncated_normal([IMAGE_PIXELS, hidden1_units],
                        stddev=1.0 / math.sqrt(float(IMAGE_PIXELS))),
    name='weights')
biases = tf.Variable(tf.zeros([hidden1_units]), name='biases')

tf.placeholder is used to feed actual training examples. 【used to receive the real training samples passed in at run time】

images_placeholder = tf.placeholder(tf.float32, shape=(batch_size, IMAGE_PIXELS))
labels_placeholder = tf.placeholder(tf.int32, shape=(batch_size))

This is how you feed the training examples during the training:

for step in xrange(FLAGS.max_steps):
    feed_dict = {                  # written this way it reads a little more clearly
        images_placeholder: images_feed,
        labels_placeholder: labels_feed,
    }
    _, loss_value = sess.run([train_op, loss], feed_dict=feed_dict)

Your tf.variables will be trained (modified) as the result of this training.

 

The difference is that with tf.Variable you have to provide an initial value when you declare it.

With tf.placeholder you don't have to provide an initial value and you can specify it at run time with the feed_dict argument inside Session.run .
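
A minimal side-by-side sketch of the difference (toy shapes; the names here are illustrative only):

import tensorflow as tf

# A Variable must be given an initial value when it is declared,
# and must be initialized before use.
W = tf.Variable(tf.zeros([2, 2]), name='W')

# A placeholder takes no initial value; it is fed at run time.
x = tf.placeholder(tf.float32, shape=[2, 2], name='x')

y = tf.matmul(x, W)

sess = tf.Session()
sess.run(tf.initialize_all_variables())                    # initializes W
print(sess.run(y, feed_dict={x: [[1., 2.], [3., 4.]]}))    # feeds x
sess.close()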

 

Working with Matrices 

Matrix computations

# Matrices and Matrix Operations
#----------------------------------
#
# This function introduces various ways to create
# matrices and how to use them in Tensorflow

import numpy as np
import tensorflow as tf
from tensorflow.python.framework import ops
ops.reset_default_graph()

# Declaring matrices
sess = tf.Session()

# Identity matrix
identity_matrix = tf.diag([1.0, 1.0, 1.0])
print(sess.run(identity_matrix))
#[[ 1. 0. 0.]
# [ 0. 1. 0.]
# [ 0. 0. 1.]]

# 2x3 random norm matrix
A = tf.truncated_normal([2, 3])
print(sess.run(A))
#[[  4.79249517e-04  -6.00280046e-01   1.36713833e-01]
# [ -1.25442386e+00   8.82814229e-02  -2.52978474e-01]]

# 2x3 constant matrix
B = tf.fill([2, 3], 5.0)
print(sess.run(B))
#[[ 5. 5. 5.]
# [ 5. 5. 5.]]

# 3x2 random uniform matrix
C = tf.random_uniform([3, 2])
print(sess.run(C))
#[[ 0.07532465  0.23328125]
# [ 0.21761775  0.35856724]
# [ 0.88200712  0.27035964]]

print(sess.run(C)) # Note that we are reinitializing, hence the new random variables

# Create matrix from np array [python lib]
D = tf.convert_to_tensor(np.array([[1., 2., 3.], [-3., -7., -1.], [0., 5., -2.]]))
print(sess.run(D))
#[[ 1.  2.  3.]
# [-3. -7. -1.]
# [ 0.  5. -2.]]

# Matrix addition/subtraction
print(sess.run(A+B))
#[[ 6.08604956  4.14716625  4.46166086]
# [ 3.24823093  4.94008398  5.14025211]]
 
print(sess.run(B-B))
#[[ 0. 0. 0.]
# [ 0. 0. 0.]]

# Matrix Multiplication
print(sess.run(tf.matmul(B, identity_matrix)))
#[[ 5. 5. 5.]
# [ 5. 5. 5.]]

# Matrix Transpose -- since C is a random tensor, every sess.run re-samples it,
# so D would be a better choice for this experiment
print(sess.run(tf.transpose(C)))  # Again, new random variables

# Matrix Determinant (determinant of a square matrix)
print(sess.run(tf.matrix_determinant(D)))

# Matrix Inverse
print(sess.run(tf.matrix_inverse(D)))

# Cholesky Decomposition
print(sess.run(tf.cholesky(identity_matrix)))

# Eigenvalues and Eigenvectors
print(sess.run(tf.self_adjoint_eig(D)))
#( array([-10.65907521, -0.22750691, 2.88658212]),
# array([[ 0.21749542, 0.63250104, -0.74339638],
# [ 0.84526515, 0.2587998 , 0.46749277],
# [-0.4880805 , 0.73004459, 0.47834331]]) )

Note: Cholesky Decomposition

It factors a positive-definite Hermitian matrix into the product of a lower-triangular matrix and its conjugate transpose.
If the matrix A is positive-definite Hermitian, it can be decomposed as:

\mathbf{A} = \mathbf{L} \mathbf{L}^{*},

where L is a lower-triangular matrix whose diagonal entries are strictly positive, and L* is the conjugate transpose of L.
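
A quick numerical check of this factorization, reusing the session from the matrix example above (the 2x2 positive-definite matrix here is made up for illustration; D above is not positive-definite, so it cannot be used):

# A small symmetric positive-definite matrix
P = tf.constant([[4., 2.], [2., 3.]])
L = tf.cholesky(P)                                # lower-triangular factor
print(sess.run(L))
print(sess.run(tf.matmul(L, tf.transpose(L))))    # recovers P (for real matrices, L* is just L^T)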

 

Declaring Operations

Basic numerical operations

# Operations
#----------------------------------
#
# This function introduces various operations
# in Tensorflow

# Declaring Operations
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from tensorflow.python.framework import ops
ops.reset_default_graph()

# Open graph session
sess = tf.Session()

# div() vs truediv() vs floordiv()
print(sess.run(tf.div(3, 4)))            #0
print(sess.run(tf.truediv(3, 4)))        #0.75
print(sess.run(tf.floordiv(3.0, 4.0)))   #0.0

# Mod function
print(sess.run(tf.mod(22.0, 5.0)))       #2.0

# Cross Product
print(sess.run(tf.cross([1., 0., 0.], [0., 1., 0.])))   #[ 0. 0. 1.]

# Trig functions
print(sess.run(tf.sin(3.1416))) #-7.23998e-06

print(sess.run(tf.cos(3.1416))) #-1.0

# Tangent
print(sess.run(tf.div(tf.sin(3.1416/4.), tf.cos(3.1416/4.))))   #1.0

# Custom operation
test_nums = range(15)
#from tensorflow.python.ops import math_ops
#print(sess.run(tf.equal(test_num, 3)))

def custom_polynomial(x_val):
    # Return 3x^2 - x + 10
    return(tf.sub(3 * tf.square(x_val), x_val) + 10)

print(sess.run(custom_polynomial(11)))

# What should we get with list comprehension
expected_output = [3*x*x - x + 10 for x in test_nums]
print(expected_output)

# Tensorflow custom function output
for num in test_nums:
    print(sess.run(custom_polynomial(num)))

This lays the groundwork for implementing the activation functions that come next.

 

Implementing Activation Functions

# Activation Functions
#----------------------------------
#
# This function introduces activation
# functions in Tensorflow

# Implementing Activation Functions
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from tensorflow.python.framework import ops
ops.reset_default_graph()

# Open graph session
sess = tf.Session()

# X range
x_vals = np.linspace(start=-10., stop=10., num=100)

# ReLU activation
print(sess.run(tf.nn.relu([-3., 3., 10.])))      #[ 0. 3. 10.]
y_relu = sess.run(tf.nn.relu(x_vals))

# ReLU-6 activation
# This is defined as min(max(0,x),6).  This will come in handy when we
# discuss deeper neural networks in Chapter 8, Convolutional Neural Networks,
# and Chapter 9, Recurrent Neural Networks.
print(sess.run(tf.nn.relu6([-3., 3., 10.])))
y_relu6 = sess.run(tf.nn.relu6(x_vals))

# Sigmoid activation
print(sess.run(tf.nn.sigmoid([-1., 0., 1.])))
y_sigmoid = sess.run(tf.nn.sigmoid(x_vals))

# Hyperbolic Tangent activation
print(sess.run(tf.nn.tanh([-1., 0., 1.])))
y_tanh = sess.run(tf.nn.tanh(x_vals))

# Softsign activation
print(sess.run(tf.nn.softsign([-1., 0., 1.])))
y_softsign = sess.run(tf.nn.softsign(x_vals))

# Softplus activation
print(sess.run(tf.nn.softplus([-1., 0., 1.])))
y_softplus = sess.run(tf.nn.softplus(x_vals))

# Exponential linear activation
print(sess.run(tf.nn.elu([-1., 0., 1.])))
y_elu = sess.run(tf.nn.elu(x_vals))

# Plot the different functions
plt.plot(x_vals, y_softplus, 'r--', label='Softplus', linewidth=2)
plt.plot(x_vals, y_relu, 'b:', label='ReLU', linewidth=2)
plt.plot(x_vals, y_relu6, 'g-.', label='ReLU6', linewidth=2)
plt.plot(x_vals, y_elu, 'k-', label='ExpLU', linewidth=0.5)
plt.ylim([-1.5, 7])
plt.legend(loc='upper left')
plt.show()

plt.plot(x_vals, y_sigmoid, 'r--', label='Sigmoid', linewidth=2)
plt.plot(x_vals, y_tanh, 'b:', label='Tanh', linewidth=2)
plt.plot(x_vals, y_softsign, 'g-.', label='Softsign', linewidth=2)
plt.ylim([-2, 2])
plt.legend(loc='upper left')
plt.show()

Result:  

 

 

Working with Data Sources

# Data gathering
#----------------------------------
#
# This function gives us the ways to access
# the various data sets we will need

# Data Gathering
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from tensorflow.python.framework import ops
ops.reset_default_graph()

# Iris Data
from sklearn import datasets
iris = datasets.load_iris()
print(len(iris.data))
print(len(iris.target))
print(iris.data[0])
print(set(iris.target))

# Low Birthrate Data
import requests
birthdata_url = 'https://www.umass.edu/statdata/statdata/data/lowbwt.dat'
birth_file = requests.get(birthdata_url)
birth_data = birth_file.text.split('\r\n')[5:]
birth_header = [x for x in birth_data[0].split(' ') if len(x)>=1]
birth_data = [[float(x) for x in y.split(' ') if len(x)>=1] for y in birth_data[1:] if len(y)>=1]
print(len(birth_data))
print(len(birth_data[0]))

# Housing Price Data
import requests
housing_url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/housing/housing.data'
housing_header = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT', 'MEDV']
housing_file = requests.get(housing_url)
housing_data = [[float(x) for x in y.split(' ') if len(x)>=1] for y in housing_file.text.split('\n') if len(y)>=1]
print(len(housing_data))
print(len(housing_data[0]))

# MNIST Handwriting Data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)   # loaded from the local MNIST_data/ directory
print(len(mnist.train.images))
print(len(mnist.test.images))
print(len(mnist.validation.images))
print(mnist.train.labels[1,:])

# Ham/Spam Text Data
import requests
import io
from zipfile import ZipFile

# Get/read zip file
zip_url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/00228/smsspamcollection.zip'
r = requests.get(zip_url)
z = ZipFile(io.BytesIO(r.content))
file = z.read('SMSSpamCollection')

# Format Data
text_data = file.decode()
text_data = text_data.encode('ascii', errors='ignore')
text_data = text_data.decode().split('\n')
text_data = [x.split('\t') for x in text_data if len(x)>=1]
[text_data_target, text_data_train] = [list(x) for x in zip(*text_data)]
print(len(text_data_train))
print(set(text_data_target))
print(text_data_train[1])

# Movie Review Data
import requests
import io
import tarfile

movie_data_url = 'http://www.cs.cornell.edu/people/pabo/movie-review-data/rt-polaritydata.tar.gz'
r = requests.get(movie_data_url)

# Stream data into temp object
stream_data = io.BytesIO(r.content)
tmp = io.BytesIO()
while True:
    s = stream_data.read(16384)
    if not s:
        break
    tmp.write(s)
stream_data.close()
tmp.seek(0)

# Extract tar file
tar_file = tarfile.open(fileobj=tmp, mode="r:gz")
pos = tar_file.extractfile('rt-polaritydata/rt-polarity.pos')
neg = tar_file.extractfile('rt-polaritydata/rt-polarity.neg')

# Save pos/neg reviews
pos_data = []
for line in pos:
    pos_data.append(line.decode('ISO-8859-1').encode('ascii', errors='ignore').decode())
neg_data = []
for line in neg:
    neg_data.append(line.decode('ISO-8859-1').encode('ascii', errors='ignore').decode())
tar_file.close()
print(len(pos_data))
print(len(neg_data))
print(neg_data[0])

# The Works of Shakespeare Data
import requests
shakespeare_url = 'http://www.gutenberg.org/cache/epub/100/pg100.txt'

# Get Shakespeare text
response = requests.get(shakespeare_url)
shakespeare_file = response.content

# Decode binary into string
shakespeare_text = shakespeare_file.decode('utf-8')

# Drop first few descriptive paragraphs.
shakespeare_text = shakespeare_text[7675:]
print(len(shakespeare_text))

# English-German Sentence Translation Data
import requests
import io
from zipfile import ZipFile

sentence_url = 'http://www.manythings.org/anki/deu-eng.zip'
r = requests.get(sentence_url)
z = ZipFile(io.BytesIO(r.content))
file = z.read('deu.txt')

# Format Data
eng_ger_data = file.decode()
eng_ger_data = eng_ger_data.encode('ascii', errors='ignore')
eng_ger_data = eng_ger_data.decode().split('\n')
eng_ger_data = [x.split('\t') for x in eng_ger_data if len(x)>=1]
[english_sentence, german_sentence] = [list(x) for x in zip(*eng_ger_data)]
print(len(english_sentence))
print(len(german_sentence))
print(eng_ger_data[10])

 

  


 

Outlines

  • Operations in a Computational Graph
  • Layering Nested Operations
  • Working with Multiple Layers
  • Implementing Loss Functions
  • Implementing Back Propagation
  • Working with Batch and Stochastic Training
  • Combining Everything Together
  • Evaluating Models

Here we use simple linear regression to work through some basic algorithm exercises.

 

Building a basic "computational graph"

Operations in a Computational Graph

# Operations on a Computational Graph
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from tensorflow.python.framework import ops
ops.reset_default_graph()

# Create graph
sess = tf.Session()

# Create data to feed in
x_vals = np.array([1., 3., 5., 7., 9.])

# Construct the computational graph
x_data = tf.placeholder(tf.float32)   # placeholder: the input is not fixed in advance
m = tf.constant(3.)                   # constant
prod = tf.mul(x_data, m)              # operation

# Feed the data in
for x_val in x_vals:
    print(sess.run(prod, feed_dict={x_data: x_val}))   # feed x_val (input data) into x_data (the placeholder)

merged = tf.merge_all_summaries()
my_writer = tf.train.SummaryWriter('/home/unsw/Programmer/1-python/Tensorflow/TF/TensreFlowMachineLearningCookbook_Code/myTest', sess.graph)

For details of all available summary operations, see the summary operations documentation.

In TensorFlow, an operation only runs when you execute it or when another operation depends on its output. The summary nodes we just created sit at the edge of your graph: no other operation depends on their results, so to generate the summaries we would have to run every one of them ourselves. Doing that by hand is tedious, so tf.merge_all_summaries combines them into a single op.

You can then run that merged op; at a given step it produces a serialized Summary protobuf object containing all the summary data. Finally, to write the summary data to disk, pass the protobuf object to a tf.train.SummaryWriter.

The SummaryWriter constructor takes a logdir argument. This logdir is important: all events are written to the directory it points to. SummaryWriter also takes an optional GraphDef argument; if you supply it, TensorBoard will display your graph as well.
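
A minimal sketch of that workflow, using the same 0.x-style API as the code above (the scalar name, log directory, and the loss/train_step/feed_dict names are illustrative assumptions, not part of the example above):

tf.scalar_summary('loss', loss)          # a summary node tracking some scalar tensor
merged = tf.merge_all_summaries()        # merge every summary node into one op
my_writer = tf.train.SummaryWriter('/tmp/tf_logs', sess.graph)   # logdir + optional graph

for step in range(100):
    summary_str, _ = sess.run([merged, train_step], feed_dict=feed_dict)
    my_writer.add_summary(summary_str, step)   # write the serialized Summary protobuf to disk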

 

Layering Nested Operations

# Layering Nested Operations

import numpy as np
import tensorflow as tf
from tensorflow.python.framework import ops
ops.reset_default_graph()

# Create graph
sess = tf.Session()

# Create data to feed in
my_array = np.array([[1., 3., 5., 7., 9.],
                     [-2., 0., 2., 4., 6.],
                     [-6., -3., 0., 3., 6.]])
x_vals = np.array([my_array, my_array + 1])
x_data = tf.placeholder(tf.float32, shape=(3, 5))

m1 = tf.constant([[1.], [0.], [-1.], [2.], [4.]])
sess.run(m1)
m2 = tf.constant([[2.]])
sess.run(m2)
a1 = tf.constant([[10.]])
sess.run(a1)

# 1st Operation Layer = Multiplication
prod1 = tf.matmul(x_data, m1)

# 2nd Operation Layer = Multiplication
prod2 = tf.matmul(prod1, m2)

# 3rd Operation Layer = Addition
add1 = tf.add(prod2, a1)

for x_val in x_vals:
    print(sess.run(add1, feed_dict={x_data: x_val}))

merged = tf.merge_all_summaries()
my_writer = tf.train.SummaryWriter('/home/nick/OneDrive/Documents/tensor_flow_book/Code/2_Tensorflow_Way', sess.graph)

 

Working with Multiple Layers

Figure: a simple convolutional network

 

# Layering Nested Operations
import numpy as np
import tensorflow as tf
from tensorflow.python.framework import ops
ops.reset_default_graph()

# Create graph
sess = tf.Session()

# Create a small random 'image' of size 4x4
x_shape = [1, 4, 4, 1]
x_val = np.random.uniform(size=x_shape)
x_data = tf.placeholder(tf.float32, shape=x_shape)   # 【x_data <--feed-- x_val】

# Convolution: a 2x2 moving-average window with stride 2
my_filter = tf.constant(0.25, shape=[2, 2, 1, 1])
my_strides = [1, 2, 2, 1]
mov_avg_layer = tf.nn.conv2d(x_data, my_filter, my_strides,
                             padding='SAME', name='Moving_Avg_Window')

# Define a custom layer which will be sigmoid(Ax+b) where
# x is a 2x2 matrix and A and b are 2x2 matrices
def custom_layer(input_matrix):
    input_matrix_sqeezed = tf.squeeze(input_matrix)
    A = tf.constant([[1., 2.], [-1., 3.]])    # Const
    b = tf.constant(1., shape=[2, 2])         # Const_1
    # Above: the layer's inputs; below: the layer's computation
    temp1 = tf.matmul(A, input_matrix_sqeezed)   # [MatMul]
    temp = tf.add(temp1, b)                      # Ax + b   [Add]
    return(tf.sigmoid(temp))                     # [Sigmoid]

# Add custom layer to graph
with tf.name_scope('Custom_Layer') as scope:
    custom_layer1 = custom_layer(mov_avg_layer)   # <-- the convolution output is handed to the custom layer

# The output should be an array that is 2x2, but size (1,2,2,1)
#print(sess.run(mov_avg_layer, feed_dict={x_data: x_val}))

# After custom operation, size is now 2x2 (squeezed out size 1 dims)
print(sess.run(custom_layer1, feed_dict={x_data: x_val}))

merged = tf.merge_all_summaries()
my_writer = tf.train.SummaryWriter('/home/unsw/Programmer/1-python/Tensorflow/TF/TensreFlowMachineLearningCookbook_Code/myTest', sess.graph)

tf.squeeze()

Given a tensor input, this op returns a tensor of the same type with all dimensions of size 1 removed.

If you do not want to remove every size-1 dimension, you can specify which ones to remove via squeeze_dims.

# 't' is a tensor of shape [1, 2, 1, 3, 1, 1]
shape(squeeze(t))  # => [2, 3]

# 't' is a tensor of shape [1, 2, 1, 3, 1, 1]
shape(squeeze(t, [2, 4]))  # => [1, 2, 3, 1]  only the specified size-1 dims are removed
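
The shape(squeeze(...)) lines above are the pseudocode from the documentation; a runnable sketch of the same thing might look like:

import numpy as np
import tensorflow as tf

sess = tf.Session()
t = tf.constant(np.zeros([1, 2, 1, 3, 1, 1]), dtype=tf.float32)
print(sess.run(tf.shape(tf.squeeze(t))))           # [2 3]
print(sess.run(tf.shape(tf.squeeze(t, [2, 4]))))   # [1 2 3 1]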

 

  

Loss functions

Implementing Loss Functions 

The interface looks much like scikit-learn's. 【The math behind the earlier activation functions and these loss functions deserves its own separate summary.】

# Loss Functions
#----------------------------------
#
# This python script illustrates the different
# loss functions for regression and classification.

import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.python.framework import ops
ops.reset_default_graph()

# Create graph
sess = tf.Session()

###### Numerical Predictions ######
x_vals = tf.linspace(-1., 1., 500)
target = tf.constant(0.)

# L2 loss
# L = (pred - actual)^2
l2_y_vals = tf.square(target - x_vals)
l2_y_out = sess.run(l2_y_vals)

# L1 loss
# L = abs(pred - actual)
l1_y_vals = tf.abs(target - x_vals)
l1_y_out = sess.run(l1_y_vals)

# Pseudo-Huber loss
# L = delta^2 * (sqrt(1 + ((pred - actual)/delta)^2) - 1)
delta1 = tf.constant(0.25)
phuber1_y_vals = tf.mul(tf.square(delta1), tf.sqrt(1. + tf.square((target - x_vals)/delta1)) - 1.)
phuber1_y_out = sess.run(phuber1_y_vals)

delta2 = tf.constant(5.)
phuber2_y_vals = tf.mul(tf.square(delta2), tf.sqrt(1. + tf.square((target - x_vals)/delta2)) - 1.)
phuber2_y_out = sess.run(phuber2_y_vals)

# Plot the output:
x_array = sess.run(x_vals)   # <-- the x axis is the value of x_vals
plt.plot(x_array, l2_y_out, 'b-', label='L2 Loss')
plt.plot(x_array, l1_y_out, 'r--', label='L1 Loss')
plt.plot(x_array, phuber1_y_out, 'k-.', label='P-Huber Loss (0.25)')
plt.plot(x_array, phuber2_y_out, 'g:', label='P-Huber Loss (5.0)')
plt.ylim(-0.2, 0.4)
plt.legend(loc='lower right', prop={'size': 11})
plt.show()
###############################################################################################
###### Categorical Predictions ######
x_vals = tf.linspace(-3., 5., 500)
target = tf.constant(1.)
targets = tf.fill([500,], 1.)

# Hinge loss
# Use for predicting binary (-1, 1) classes
# L = max(0, 1 - (pred * actual))
hinge_y_vals = tf.maximum(0., 1. - tf.mul(target, x_vals))
hinge_y_out = sess.run(hinge_y_vals)

# Cross entropy loss
# L = -actual * (log(pred)) - (1-actual)(log(1-pred))
xentropy_y_vals = - tf.mul(target, tf.log(x_vals)) - tf.mul((1. - target), tf.log(1. - x_vals))
xentropy_y_out = sess.run(xentropy_y_vals)

# Sigmoid entropy loss
# L = -actual * (log(sigmoid(pred))) - (1-actual)(log(1-sigmoid(pred)))
# or
# L = max(actual, 0) - actual * pred + log(1 + exp(-abs(actual)))
xentropy_sigmoid_y_vals = tf.nn.sigmoid_cross_entropy_with_logits(x_vals, targets)
xentropy_sigmoid_y_out = sess.run(xentropy_sigmoid_y_vals)

# Weighted (softmax) cross entropy loss
# L = -actual * (log(pred)) * weights - (1-actual)(log(1-pred))
# or
# L = (1 - pred) * actual + (1 + (weights - 1) * pred) * log(1 + exp(-actual))
weight = tf.constant(0.5)
xentropy_weighted_y_vals = tf.nn.weighted_cross_entropy_with_logits(x_vals, targets, weight)
xentropy_weighted_y_out = sess.run(xentropy_weighted_y_vals)

# Plot the output
x_array = sess.run(x_vals)
plt.plot(x_array, hinge_y_out, 'b-', label='Hinge Loss')
plt.plot(x_array, xentropy_y_out, 'r--', label='Cross Entropy Loss')
plt.plot(x_array, xentropy_sigmoid_y_out, 'k-.', label='Cross Entropy Sigmoid Loss')
plt.plot(x_array, xentropy_weighted_y_out, 'g:', label='Weighted Cross Entropy Loss (x0.5)')
plt.ylim(-1.5, 3)
#plt.xlim(-1, 3)
plt.legend(loc='lower right', prop={'size': 11})
plt.show()

# Softmax entropy loss
# L = -actual * (log(softmax(pred))) - (1-actual)(log(1-softmax(pred)))
unscaled_logits = tf.constant([[1., -3., 10.]])
target_dist = tf.constant([[0.1, 0.02, 0.88]])
softmax_xentropy = tf.nn.softmax_cross_entropy_with_logits(unscaled_logits, target_dist)
print(sess.run(softmax_xentropy))

# Sparse entropy loss
# Use when classes and targets have to be mutually exclusive
# L = sum( -actual * log(pred) )
unscaled_logits = tf.constant([[1., -3., 10.]])
sparse_target_dist = tf.constant([2])
sparse_xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(unscaled_logits, sparse_target_dist)
print(sess.run(sparse_xentropy))
 

Result: 

 

Loss function | Use            | Benefits                            | Disadvantages
L2            | Regression     | More stable                         | Less robust
L1            | Regression     | More robust                         | Less stable
Pseudo-Huber  | Regression     | More robust and stable              | One more parameter
Hinge         | Classification | Creates a max margin for use in SVM | Unbounded loss affected by outliers
Cross-entropy | Classification | More stable                         | Unbounded loss, less robust
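
A tiny numeric check of the robustness/stability trade-off in the table above (plain numpy; the residual values are chosen arbitrarily for illustration):

import numpy as np

residuals = np.array([0.1, 1.0, 10.0])   # a small error, a moderate error, and an outlier
delta = 1.0

l2 = residuals**2
l1 = np.abs(residuals)
phuber = delta**2 * (np.sqrt(1. + (residuals/delta)**2) - 1.)

print(l2)       # [ 1.0e-02  1.0e+00  1.0e+02 ] -> the outlier dominates (less robust)
print(l1)       # [  0.1      1.       10.    ] -> grows linearly (more robust, not smooth at 0)
print(phuber)   # [ ~0.005   ~0.414    ~9.05  ] -> quadratic near zero, linear for large errors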

 

Implementing Back Propagation 

  • Online training (non-batch)

# Back Propagation
#----------------------------------
#
# This python function shows how to implement back propagation
# in regression and classification models.

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from tensorflow.python.framework import ops
ops.reset_default_graph()

# Create graph
sess = tf.Session()

# Regression Example:
# We will create sample data as follows:
# x-data: 100 random samples from a normal ~ N(1, 0.1)
# target: 100 values of the value 10.
# We will fit the model:
# x-data * A = target
# Theoretically, A = 10.

# Create data
x_vals   = np.random.normal(1, 0.1, 100)
y_vals   = np.repeat(10., 100)
x_data   = tf.placeholder(shape=[1], dtype=tf.float32)
y_target = tf.placeholder(shape=[1], dtype=tf.float32)

# Create variable (one model parameter = A)
A = tf.Variable(tf.random_normal(shape=[1]))

# Add operation to graph
my_output = tf.mul(x_data, A)

# Add L2 loss operation to graph
loss = tf.square(my_output - y_target)

# Initialize variables
init = tf.initialize_all_variables()
sess.run(init)

# Create Optimizer (solver)
my_opt = tf.train.GradientDescentOptimizer(0.02)
train_step = my_opt.minimize(loss)

# Run Loop: 100 times
for i in range(100):
    rand_index = np.random.choice(100)
    rand_x = [ x_vals[rand_index] ]
    rand_y = [ y_vals[rand_index] ]
    sess.run(train_step, feed_dict={x_data: rand_x, y_target: rand_y})
    if (i+1)%25==0:
        print('Step #' + str(i+1) + ' A = ' + str(sess.run(A)))
        print('Loss = ' + str(sess.run(loss, feed_dict={x_data: rand_x, y_target: rand_y})))   # <-- inspect the loss

Result:

Step #200 A = [ 3.09365487]
Loss = [[ 0.00431256]]
Step #400 A = [ 0.26573253]
Loss = [[ 0.0705369]]
Step #600 A = [-0.67981237]
Loss = [[ 0.03407416]]
Step #800 A = [-0.81504095]
Loss = [[ 0.17166217]]
Step #1000 A = [-1.02729309]
Loss = [[ 0.25573865]]
Step #1200 A = [-0.96181494]
Loss = [[ 0.04576259]]
Step #1400 A = [-1.02739477]
Loss = [[ 0.05697485]]
Ending Accuracy = 0.98

 

  • The batch index for batch training

The example below mainly demonstrates adding an extra dimension to the data, dedicated to serving as the batch index: expand_dims(...).

The specifics of batch training are covered in the next section.
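
As a quick illustration of what expand_dims does before the full example below (shapes only; this reuses the tf/np imports and the session from the code above):

v = tf.constant([1., 2., 3.])            # shape [3]
v_batched = tf.expand_dims(v, 0)         # shape [1, 3]: a leading batch dimension is added
print(sess.run(tf.shape(v_batched)))     # [1 3]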

# Classification Example
# We will create sample data as follows:
# x-data: sample 50 random values from a normal = N(-1, 1)
#         + sample 50 random values from a normal = N(3, 1)
# target: 50 values of 0 + 50 values of 1.
# These are essentially 100 values of the corresponding output index
# We will fit the binary classification model:
# If sigmoid(x+A) < 0.5 -> 0 else 1
# Theoretically, A should be -(mean1 + mean2)/2

ops.reset_default_graph()

# Create graph
sess = tf.Session()

# Create data
x_vals   = np.concatenate((np.random.normal(-1, 1, 50), np.random.normal(3, 1, 50)))
y_vals   = np.concatenate((np.repeat(0., 50), np.repeat(1., 50)))
x_data   = tf.placeholder(shape=[1], dtype=tf.float32)
y_target = tf.placeholder(shape=[1], dtype=tf.float32)

# Create variable (one model parameter = A)
A = tf.Variable(tf.random_normal(mean=10, shape=[1]))

# Add operation to graph
# Want to create the operation sigmoid(x + A)
# Note, the sigmoid() part is in the loss function
my_output = tf.add(x_data, A)   # 【just a simple addition here】

# Now we have to add another dimension to each (batch size of 1)
my_output_expanded = tf.expand_dims(my_output, 0)
y_target_expanded = tf.expand_dims(y_target, 0)

# Initialize variables
init = tf.initialize_all_variables()
sess.run(init)

# Add classification loss (cross entropy)
xentropy = tf.nn.sigmoid_cross_entropy_with_logits(my_output_expanded, y_target_expanded)

# Create Optimizer
my_opt = tf.train.GradientDescentOptimizer(0.05)
train_step = my_opt.minimize(xentropy)

# Run loop
for i in range(1400):
    rand_index = np.random.choice(100)
    rand_x = [x_vals[rand_index]]
    rand_y = [y_vals[rand_index]]
    sess.run(train_step, feed_dict={x_data: rand_x, y_target: rand_y})
    if (i+1)%200==0:
        print('Step #' + str(i+1) + ' A = ' + str(sess.run(A)))
        print('Loss = ' + str(sess.run(xentropy, feed_dict={x_data: rand_x, y_target: rand_y})))

# Evaluate Predictions
predictions = []
for i in range(len(x_vals)):
    x_val = [x_vals[i]]
    prediction = sess.run(tf.round(tf.sigmoid(my_output)), feed_dict={x_data: x_val})   # my_output is the raw output; apply sigmoid, then round
    predictions.append(prediction[0])

accuracy = sum(x==y for x,y in zip(predictions, y_vals))/100.   # compare predictions with targets to get the accuracy
print('Ending Accuracy = ' + str(np.round(accuracy, 2)))

 

Working with Batch and Stochastic Training

Given the use of np.random.choice(100), the previous example was already stochastic (online) training.

【This is a nice example.】

# Batch and Stochastic Training
#----------------------------------
#
#  This python function illustrates two different training methods:
#  batch and stochastic training.  For each model, we will use
#  a regression model that predicts one model variable.

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from tensorflow.python.framework import ops
ops.reset_default_graph()

# We will implement a regression example in stochastic and batch training

# Stochastic Training:
# Create graph
sess = tf.Session()

# Create data
x_vals   = np.random.normal(1, 0.1, 100)
y_vals   = np.repeat(10., 100)
x_data   = tf.placeholder(shape=[1], dtype=tf.float32)
y_target = tf.placeholder(shape=[1], dtype=tf.float32)

# Create variable (one model parameter = A)
A = tf.Variable(tf.random_normal(shape=[1]))

# Add operation to graph
my_output = tf.mul(x_data, A)

# Add L2 loss operation to graph
loss = tf.square(my_output - y_target)

# Initialize variables
init = tf.initialize_all_variables()
sess.run(init)

# Create Optimizer
my_opt = tf.train.GradientDescentOptimizer(0.02)
train_step = my_opt.minimize(loss)

loss_stochastic = []
# Run Loop
for i in range(100):
    rand_index = np.random.choice(100)
    rand_x = [x_vals[rand_index]]
    rand_y = [y_vals[rand_index]]
    sess.run(train_step, feed_dict={x_data: rand_x, y_target: rand_y})
    if (i+1)%5==0:
        print('Step #' + str(i+1) + ' A = ' + str(sess.run(A)))
        temp_loss = sess.run(loss, feed_dict={x_data: rand_x, y_target: rand_y})
        print('Loss = ' + str(temp_loss))
        loss_stochastic.append(temp_loss)
# Batch Training:
# Re-initialize graph
ops.reset_default_graph()
sess = tf.Session()

# Declare batch size
batch_size = 20

# Create data
x_vals   = np.random.normal(1, 0.1, 100)
y_vals   = np.repeat(10., 100)
x_data   = tf.placeholder(shape=[None, 1], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)

# Create variable (one model parameter = A)
A = tf.Variable(tf.random_normal(shape=[1,1]))

# Add operation to graph
my_output = tf.matmul(x_data, A)

# Add L2 loss operation to graph (this is a regression problem)
loss = tf.reduce_mean(tf.square(my_output - y_target))   # mean over the batch, because this is batch training

# Initialize variables
init = tf.initialize_all_variables()
sess.run(init)

# Create Optimizer
my_opt = tf.train.GradientDescentOptimizer(0.02)
train_step = my_opt.minimize(loss)

loss_batch = []
# Run Loop
for i in range(100):   # NB: is this the standard minibatch-and-epoch scheme?
    rand_index = np.random.choice(100, size=batch_size)
    rand_x = np.transpose([x_vals[rand_index]])
    rand_y = np.transpose([y_vals[rand_index]])
    sess.run(train_step, feed_dict={x_data: rand_x, y_target: rand_y})
    if (i+1)%5==0:
        print('Step #' + str(i+1) + ' A = ' + str(sess.run(A)))
        temp_loss = sess.run(loss, feed_dict={x_data: rand_x, y_target: rand_y})   # <-- inspect the loss
        print('Loss = ' + str(temp_loss))
        loss_batch.append(temp_loss)

plt.plot(range(0, 100, 5), loss_stochastic, 'b-', label='Stochastic Loss')
plt.plot(range(0, 100, 5), loss_batch, 'r--', label='Batch Loss, size=20')
plt.legend(loc='upper right', prop={'size': 11})
plt.show()

Result: 

 

Binary classifier for Iris Dataset

# Combining Everything Together
#----------------------------------
# This file will perform binary classification on the
# iris dataset. We will only predict whether a flower is
# I.setosa or not.
#
# We will create a simple binary classifier by creating a line
# and running everything through a sigmoid to get a binary predictor.
# The two features we will use are petal length and petal width.
#
# We will use batch training, but this can be easily
# adapted to stochastic training.

import matplotlib.pyplot as plt
import numpy as np
from sklearn import datasets
import tensorflow as tf
from tensorflow.python.framework import ops
ops.reset_default_graph()

# Load the iris data
# iris.target = {0, 1, 2}, where '0' is setosa
# iris.data ~ [sepal.length, sepal.width, petal.length, petal.width]
iris = datasets.load_iris()
binary_target = np.array([1. if x==0 else 0. for x in iris.target])
iris_2d = np.array([[x[2], x[3]] for x in iris.data])   # Jeff: only consider two features.

# Declare batch size
batch_size = 20

# Create graph
sess = tf.Session()

Construct graph:

# Declare placeholders: three entry points for feed_dict
x1_data  = tf.placeholder(shape=[None, 1], dtype=tf.float32)
x2_data  = tf.placeholder(shape=[None, 1], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)

# Create variables A and b (0 = x1 - A*x2 + b)
A = tf.Variable(tf.random_normal(shape=[1, 1]))
b = tf.Variable(tf.random_normal(shape=[1, 1]))

###################################
# Add model to graph:
# x1 - A*x2 + b
###################################
my_mult   = tf.matmul(x2_data, A)
my_add    = tf.add(my_mult, b)
my_output = tf.sub(x1_data, my_add)
#my_output = tf.sub(x_data[0], tf.add(tf.matmul(x_data[1], A), b))


# Add classification loss (cross entropy)
xentropy = tf.nn.sigmoid_cross_entropy_with_logits(my_output, y_target)

# Create Optimizer
my_opt     = tf.train.GradientDescentOptimizer(0.05)
train_step = my_opt.minimize(xentropy)

# Initialize variables
init = tf.initialize_all_variables()
sess.run(init)

Training loop: 

# Run Loop
for i in range(1000):
    rand_index = np.random.choice(len(iris_2d), size=batch_size)
    #rand_x = np.transpose([iris_2d[rand_index]])
    rand_x = iris_2d[rand_index]
    # Pull the two features out of the sampled batch
    rand_x1 = np.array([[x[0]] for x in rand_x])
    rand_x2 = np.array([[x[1]] for x in rand_x])
    #rand_y = np.transpose([binary_target[rand_index]])
    rand_y = np.array([[y] for y in binary_target[rand_index]])
    sess.run(train_step, feed_dict={x1_data: rand_x1, x2_data: rand_x2, y_target: rand_y})
    if (i+1)%200==0:
        print('Step #' + str(i+1) + ' A = ' + str(sess.run(A)) + ', b = ' + str(sess.run(b)))
 
# Visualize Results
# Pull out slope/intercept
[[slope]] = sess.run(A)
[[intercept]] = sess.run(b)

# Create fitted line
x = np.linspace(0, 3, num=50)
ablineValues = []
for i in x:
  ablineValues.append(slope*i+intercept)

# Plot the fitted line over the data
setosa_x = [a[1] for i,a in enumerate(iris_2d) if binary_target[i]==1]
setosa_y = [a[0] for i,a in enumerate(iris_2d) if binary_target[i]==1]
non_setosa_x = [a[1] for i,a in enumerate(iris_2d) if binary_target[i]==0]
non_setosa_y = [a[0] for i,a in enumerate(iris_2d) if binary_target[i]==0]
plt.plot(setosa_x, setosa_y, 'rx', ms=10, mew=2, label='setosa')
plt.plot(non_setosa_x, non_setosa_y, 'ro', label='Non-setosa')
plt.plot(x, ablineValues, 'b-')
plt.xlim([0.0, 2.7])
plt.ylim([0.0, 7.1])
plt.suptitle('Linear Separator For I.setosa', fontsize=20)
plt.xlabel('Petal Length')
plt.ylabel('Petal Width')
plt.legend(loc='lower right')
plt.show()

Result: 

 

Evaluating Models

(Omitted here; better covered as a dedicated topic.)
