Setting up a TensorFlow-GPU environment: Win10 + Python + Anaconda + CUDA

References: http://blog.csdn.net/sb19931201/article/details/53648615

https://segmentfault.com/a/1190000009803319

 

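The Python build of TensorFlow comes in a CPU version and a GPU version. NVIDIA GPUs are very well suited to machine learning training.

Installing Python and TensorFlow is fairly simple: follow the links above. Both are managed mainly through Anaconda.

Using an NVIDIA GPU requires installing CUDA and cuDNN.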

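Points to note:

1. Whether the graphics card supports GPU acceleration

2. The software versions

The combination used in this post: Windows 10 -- Python 3.5 -- tensorflow-gpu 1.4.0 -- CUDA cuda_8.0.61_win10 -- cuDNN cudnn-8.0-windows10-x64-v6.0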
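A quick way to confirm that an environment matches the combination above is to compare the running interpreter and the installed TensorFlow against it. The following is only a sketch: `EXPECTED` and `version_mismatches` are hypothetical names introduced here, and the listed versions are simply the ones this post reports as working, not the only valid set.

```python
import sys

# Version combination reported in this post; treat it as one known-good
# set, not as the only combination that works.
EXPECTED = {"python": (3, 5), "tensorflow-gpu": "1.4.0"}


def version_mismatches(python_version, tensorflow_version):
    """Return human-readable mismatches against EXPECTED (empty if none)."""
    problems = []
    found_python = tuple(python_version[:2])
    if found_python != EXPECTED["python"]:
        problems.append("python %d.%d, expected %d.%d"
                        % (found_python + EXPECTED["python"]))
    if tensorflow_version != EXPECTED["tensorflow-gpu"]:
        problems.append("tensorflow %s, expected %s"
                        % (tensorflow_version, EXPECTED["tensorflow-gpu"]))
    return problems


if __name__ == "__main__":
    try:
        import tensorflow as tf
        tf_version = tf.__version__
    except ImportError:
        tf_version = None  # TensorFlow is not installed in this environment
    for problem in version_mismatches(sys.version_info, tf_version):
        print("mismatch:", problem)
```

An empty result means the interpreter and TensorFlow match the versions listed above. The CUDA and cuDNN versions still have to be checked by hand.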

 

CUDA

The NVIDIA® CUDA® Toolkit provides a development environment for creating high performance GPU-accelerated applications. With the CUDA Toolkit, you can develop, optimize and deploy your applications on GPU-accelerated embedded systems, desktop workstations, enterprise data centers, cloud-based platforms and HPC supercomputers. The toolkit includes GPU-accelerated libraries, debugging and optimization tools, a C/C++ compiler and a runtime library to deploy your application.

Overview and download of the latest version: https://developer.nvidia.com/cuda-toolkit

Downloads for every CUDA version: https://developer.nvidia.com/cuda-toolkit-archive. Follow the installer prompts.

 

cuDNN

The NVIDIA CUDA® Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers. cuDNN is part of the NVIDIA Deep Learning SDK.

cuDNN is distributed as a DLL file, which must be copied into the bin folder of the CUDA installation directory.
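The copy step can also be scripted. The sketch below assumes the cuDNN archive has already been extracted. `copy_cudnn_dlls` is a hypothetical helper, and both paths in the usage section are assumptions that must be adjusted to the actual machine (writing into `Program Files` usually needs an elevated prompt).

```python
import glob
import os
import shutil


def copy_cudnn_dlls(cudnn_bin_dir, cuda_bin_dir):
    """Copy every cudnn*.dll from the extracted cuDNN archive into CUDA's bin.

    Returns the sorted list of destination paths that were written.
    """
    copied = []
    for src in sorted(glob.glob(os.path.join(cudnn_bin_dir, "cudnn*.dll"))):
        dst = os.path.join(cuda_bin_dir, os.path.basename(src))
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied


if __name__ == "__main__":
    # Both paths are assumptions: adjust them to where the cuDNN archive was
    # extracted and where the CUDA 8.0 installer put the toolkit.
    cudnn_bin = r"D:\Downloads\cudnn-8.0-windows10-x64-v6.0\cuda\bin"
    cuda_bin = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\bin"
    if os.path.isdir(cudnn_bin) and os.path.isdir(cuda_bin):
        print(copy_cudnn_dlls(cudnn_bin, cuda_bin))
    else:
        print("Edit cudnn_bin and cuda_bin to match this machine first.")
```

Only files matching `cudnn*.dll` are copied, so anything else in the extracted folder is left alone.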

 

Test code, taken from the official TensorFlow website:

import tensorflow as tf
import numpy as np

# Generate phony data with NumPy: 100 points in total.
x_data = np.float32(np.random.rand(2, 100))  # random inputs
y_data = np.dot([0.100, 0.200], x_data) + 0.300

# Build a linear model: y = W * x_data + b
b = tf.Variable(tf.zeros([1]))
W = tf.Variable(tf.random_uniform([1, 2], -1.0, 1.0))
y = tf.matmul(W, x_data) + b

# Minimize the mean squared error
loss = tf.reduce_mean(tf.square(y - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)

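# Initialize the variables. initialize_all_variables() is deprecated (see the
# warning in the output below); tf.global_variables_initializer() replaces it.
init = tf.initialize_all_variables()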

# Launch the graph
sess = tf.Session()
sess.run(init)

# Fit the plane
for step in range(0, 201):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(W), sess.run(b))

# The best fit obtained is W: [[0.100  0.200]], b: [0.300]

Output:

The log shows that the graphics card's compute capability is 6.1.

D:\Tools\Anaconda35\python.exe D:/PythonProj/tensorFlow/tensor8.py
WARNING:tensorflow:From D:\Tools\Anaconda35\lib\site-packages\tensorflow\python\util\tf_should_use.py:107: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Use `tf.global_variables_initializer` instead.
2017-11-19 17:08:40.225423: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\platform\cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
2017-11-19 17:08:40.882335: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1030] Found device 0 with properties: 
name: GeForce GTX 1060 3GB major: 6 minor: 1 memoryClockRate(GHz): 1.7085
pciBusID: 0000:01:00.0
totalMemory: 3.00GiB freeMemory: 254.16MiB
2017-11-19 17:08:40.883414: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\35\tensorflow\core\common_runtime\gpu\gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1060 3GB, pci bus id: 0000:01:00.0, compute capability: 6.1)
0 [[ 0.29419887 -0.23337287]] [ 1.0515306]
20 [[ 0.00030054  0.03563837]] [ 0.44433528]
40 [[ 0.04815638  0.14494912]] [ 0.35854429]
60 [[ 0.07746208  0.17898612]] [ 0.32386735]
80 [[ 0.09062619  0.19159497]] [ 0.30974501]
100 [[ 0.09614999  0.19658807]] [ 0.30398068]
120 [[ 0.09842454  0.1986087 ]] [ 0.30162627]
140 [[ 0.09935603  0.1994319 ]] [ 0.3006644]
160 [[ 0.09973686  0.19976793]] [ 0.30027145]
180 [[ 0.09989249  0.1999052 ]] [ 0.30011091]
200 [[ 0.09995609  0.19996127]] [ 0.30004531]

Process finished with exit code 0
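The compute capability can also be pulled out of such a log programmatically, for example when checking several machines. This is a minimal sketch that relies only on the wording of the "Creating TensorFlow device" line in the output above. `compute_capabilities` is a hypothetical helper, and other TensorFlow versions may word the line differently.

```python
import re

# Matches the "compute capability: 6.1" fragment of TensorFlow's
# "Creating TensorFlow device" log line shown above.
_CAPABILITY_RE = re.compile(r"compute capability:\s*(\d+)\.(\d+)")


def compute_capabilities(log_text):
    """Return a list of (major, minor) tuples, one per device line found."""
    return [(int(major), int(minor))
            for major, minor in _CAPABILITY_RE.findall(log_text)]


sample = ("Creating TensorFlow device (/device:GPU:0) -> (device: 0, "
          "name: GeForce GTX 1060 3GB, pci bus id: 0000:01:00.0, "
          "compute capability: 6.1)")
print(compute_capabilities(sample))  # [(6, 1)]
```

An empty list means no GPU device line was found, which usually indicates that TensorFlow fell back to the CPU.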

 

MNIST tutorial: training ran roughly a hundred times faster than with the CPU version.

from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf

# Load the training data
MNIST_data_folder=r"D:\WorkSpace\tensorFlow\data"
mnist=input_data.read_data_sets(MNIST_data_folder,one_hot=True)
print(mnist.train.next_batch(1))
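# Build the abstract model: placeholders for the inputs (x) and labels (y_).
# W, b and y form the simple softmax model; the network below does not use them.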
x = tf.placeholder("float", [None, 784])
W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x,W) + b)
y_ = tf.placeholder("float", [None,10])
# Weight initialization
def weight_variable(shape):
  initial = tf.truncated_normal(shape, stddev=0.1)
  return tf.Variable(initial)

def bias_variable(shape):
  initial = tf.constant(0.1, shape=shape)
  return tf.Variable(initial)

# Convolution and pooling
def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
  return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                        strides=[1, 2, 2, 1], padding='SAME')

# First convolutional layer
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
x_image = tf.reshape(x, [-1,28,28,1])
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

# Second convolutional layer
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

# Densely connected layer
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])

h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

#Dropout
keep_prob = tf.placeholder("float")
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
# Output layer
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])

y_conv=tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

# Train and evaluate the model
cross_entropy = -tf.reduce_sum(y_*tf.log(y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

sess = tf.InteractiveSession()
init = tf.global_variables_initializer()
sess.run(init)

for i in range(20000):
  batch = mnist.train.next_batch(50)
  if i%100 == 0:
    train_accuracy = accuracy.eval(feed_dict={
        x:batch[0], y_: batch[1], keep_prob: 1.0})
    print("step %d, training accuracy %g"%(i, train_accuracy))
  train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

print("test accuracy %g"%accuracy.eval(feed_dict={
    x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
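
To put a number on the CPU versus GPU comparison mentioned for this tutorial, each run can be timed and the ratio computed. The sketch below uses only the standard library. `time_call` and `speedup` are hypothetical helpers, and the timings in the usage section are made-up placeholders, not measurements from this post.

```python
import time


def time_call(fn, *args, **kwargs):
    """Run fn once and return (elapsed_seconds, result)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return time.perf_counter() - start, result


def speedup(cpu_seconds, gpu_seconds):
    """How many times faster the GPU run was than the CPU run."""
    return cpu_seconds / gpu_seconds


if __name__ == "__main__":
    # Stand-in workload; in practice wrap the training loop in a function
    # and time it once per TensorFlow build (CPU and GPU).
    elapsed, total = time_call(sum, range(1000000))
    print("elapsed %.4f s, result %d" % (elapsed, total))
    # Hypothetical timings, only to show how the ratio is formed.
    print("speedup: %.0fx" % speedup(200.0, 2.0))
```

For a fair comparison, the same script, batch size, and number of steps should be used in both runs.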