How to Set Up a GPU Deep Learning Environment

Date: 2019-12-04

This article walks you through a complete installation of the following:

Tensorflow-gpu

CUDA

cudnn

VS

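First, I'll assume you already have anaconda installed; if not, go install it. As for how, just search for any tutorial, since there are so many anaconda installation guides out there. Remember to add it to your environment variables; any tutorial will cover that too.

OK, now that anaconda is installed, do the following:

First, check whether your graphics card can run GPU workloads.

Open your NVIDIA Control Panel (if you don't know how, search for it; I'm leaving it out here to save space).

Click Help > System Information > Components, and look here:

OK, this means your graphics card currently supports CUDA up to version 10.2.120. You can also install a lower version than this, though that comes with its own requirements, which we'll get to later.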
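To make the "no newer than the maximum" rule concrete, here is a minimal sketch that compares dotted version strings. The helper names `parse_version` and `driver_supports` are made up for illustration, and the check only covers the upper bound; it says nothing about the other requirements mentioned above.

```python
def parse_version(text):
    """Turn a dotted version string such as '10.2.120' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def driver_supports(max_supported, wanted):
    """Return True if the wanted CUDA version is not newer than the maximum."""
    return parse_version(wanted) <= parse_version(max_supported)


# 10.2.120 is the maximum shown in the control panel in this post.
print(driver_supports("10.2.120", "9.0"))   # True: CUDA 9.0 is below the maximum
print(driver_supports("10.2.120", "11.0"))  # False: CUDA 11.0 would be too new
```

Comparing tuples of integers avoids the trap of comparing the strings directly, where "9.0" would sort after "10.2".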

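Now you need to decide which version of tensorflow-gpu to use. Personally, I think you have to settle on the tensorflow-gpu version first before you can pin down CUDA and cudnn. Generally, the higher the version, the more it demands of your hardware; as a compromise I chose 1.8.0. How you choose depends on your hardware: if it's decent, you can pick a somewhat higher version. Also, some code is very picky and only runs on a specific tensorflow version, so pick a suitable gpu version before installing rather than installing blindly.

On the tensorflow website we can see a chart like this:

For you this chart is surely out of date, though, since google keeps releasing new versions, and by the time you read this guide it has probably moved on several generations. No matter; pick something roughly comparable and don't chase the newest, since the latest version may have compatibility problems.

I chose 1.8.0, and you can see that it requires python3.5-3.6 (we'll pick 3.6), VS2015, and CUDA9 with cudnn7.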
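If you want to keep the requirements in one place, a small lookup table is enough. The sketch below records only the 1.8.0 row quoted in this post; the names `REQUIREMENTS` and `python_ok` are made up for illustration, and rows for other versions would have to be read off the chart yourself.

```python
import sys

# Requirements for tensorflow-gpu 1.8.0 as quoted in this post.
REQUIREMENTS = {
    "1.8.0": {
        "python": [(3, 5), (3, 6)],
        "vs": "2015",
        "cuda": "9",
        "cudnn": "7",
    },
}


def python_ok(tf_version, python_version=None):
    """Check a (major, minor) Python version against the table above."""
    if python_version is None:
        python_version = sys.version_info[:2]  # the interpreter running this
    return tuple(python_version) in REQUIREMENTS[tf_version]["python"]


print(python_ok("1.8.0", (3, 6)))  # True
print(python_ok("1.8.0", (3, 7)))  # False
```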

OK, that's everything we need. Now let's go download it all!

Here are the downloads; go ahead and grab them:

CUDA：https://developer.nvidia.com/cuda-toolkit-archive

Cudnn：https://developer.nvidia.com/rdp/cudnn-archive

Tensorflow：https://github.com/fo40225/tensorflow-windows-wheel

DXSDK_jun10.exe ：https://www.microsoft.com/en-us/download/details.aspx?id=6812
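Microsoft's DirectX Software Development Kit; it is installed so that cuda_samples can be compiled later.

Open the links above, pick the versions you want and download them. Pay close attention to matching the CUDA and cudnn versions.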

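Now let's start installing!

First create a new environment in anaconda, like this:

Python 3.6 is set here because of the tensorflow chart above; if you are installing a lower gpu version, you may need python3.5.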
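For readers who prefer the command line, the same environment can presumably be created with `conda create`. The sketch below only builds and prints the command so you can paste it into the Anaconda Prompt; the environment name tf-gpu is the one used later in this post, and the helper name `conda_create_command` is made up for illustration.

```python
def conda_create_command(env_name, python_version):
    """Build the conda command that creates an environment with a given Python."""
    return ["conda", "create", "-n", env_name, "python=" + python_version]


command = conda_create_command("tf-gpu", "3.6")
print(" ".join(command))  # conda create -n tf-gpu python=3.6
```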

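Once it's created, there's nothing more to do with it for now; let's move on to other things.

Now you need to install vs2015. After downloading, you should have these files:

Double-click and wait a while,

then follow the steps in the picture and click Next.

We only chose vs for its c++ environment. If you want to run python in vs you can, but I use pycharm; first impressions stick, after all.

Once the installation finishes, we move on to the next step.

By the way, one thing to mention: if this is your first time installing vs, congratulations. If you've installed vs before, well, lucky you: you need to remove the old vs completely. For how to clean it out, see this: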

https://blog.csdn.net/a359877454/article/details/52679041

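OK, the next thing to install is this:

DXSDK_jun10.exe

Don't fuss over it, just install it. Whether or not it ends with an error, running through it once is enough.

Next comes the main event, CUDA. Download everything for the version you're installing, including the patches, then install the base installer first and the patches afterwards.

This is everything I downloaded for CUDA: one base installer plus 4 patches.

If it tells you at the start that it is incompatible, don't worry. In this screen:

just uncheck the driver components item at the very end.

Once everything is installed, check whether the installation succeeded: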

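Unzip cudnn.

Inside are three directories: bin, include and lib. Copy the three folders into the corresponding CUDA folder (in practice, this adds the files from cuDNN's three directories to CUDA's bin, include and lib folders; nothing in the CUDA folders needs to be deleted, and no files get overwritten). The default folder is C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0. Besides these three folders, the unzipped cudnn contains one more file, which also needs to go into the v9.0 folder.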
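If you'd rather not drag folders around by hand, the copy step can be scripted. This is a minimal sketch, not an official NVIDIA procedure: the function name `merge_cudnn_into_cuda` is made up for illustration, and it skips any file that already exists in the target so nothing gets overwritten, matching the description above.

```python
import shutil
from pathlib import Path


def merge_cudnn_into_cuda(cudnn_dir, cuda_dir):
    """Copy every file under cudnn_dir into the same relative place in cuda_dir.

    Files that already exist in cuda_dir are left untouched.
    Returns the list of files that were actually copied.
    """
    cudnn_dir, cuda_dir = Path(cudnn_dir), Path(cuda_dir)
    copied = []
    for source in sorted(cudnn_dir.rglob("*")):
        if not source.is_file():
            continue
        target = cuda_dir / source.relative_to(cudnn_dir)
        if target.exists():
            continue  # never overwrite what CUDA already installed
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(str(source), str(target))
        copied.append(target)
    return copied
```

A call would look something like `merge_cudnn_into_cuda(r"D:\downloads\cudnn", r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0")`, where the first path is a hypothetical location for the unzipped cudnn. Writing under Program Files usually needs an administrator prompt.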

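Once this step is done, you need to compile cuda_samples. That means opening a file from here:

Pick the file that matches your vs version; I use vs2015, for example, so I open vs2015.sln.

Once opened, it looks like this:

Note that the parts in the red box above must be set to 64-bit and Release.

Then find 1_Utilities in the right-hand panel, right-click it and choose Build.

After a moment, a message like this appears at the bottom:

Then you've got it right!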

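With the configuration done, we can verify that it worked, mainly using CUDA's built-in deviceQuery.exe and bandwidthTest.exe. Open cmd, cd to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\extras\demo_suite under the install directory, then run bandwidthTest.exe and deviceQuery.exe in turn.

What we care about is whether the final result is = PASS.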
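The same check can be run from Python instead of reading the console by eye. This is a sketch under one assumption: that each tool prints a line of the form "Result = PASS" near the end, as described above. The helper names `reports_pass` and `run_demo` are made up for illustration.

```python
import subprocess
from pathlib import Path

# Default demo_suite location for CUDA 9.0, as given in this post.
DEMO_SUITE = Path(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\extras\demo_suite")


def reports_pass(output):
    """Return True if the last 'Result = ...' line in the output says PASS."""
    results = [line.split("=", 1)[1].strip()
               for line in output.splitlines()
               if line.strip().lower().startswith("result") and "=" in line]
    return bool(results) and results[-1].upper() == "PASS"


def run_demo(name):
    """Run one demo_suite executable and report whether it printed PASS."""
    completed = subprocess.run([str(DEMO_SUITE / name)],
                               stdout=subprocess.PIPE,
                               stderr=subprocess.STDOUT,
                               universal_newlines=True)
    return reports_pass(completed.stdout)


if __name__ == "__main__" and DEMO_SUITE.exists():
    for tool in ("bandwidthTest.exe", "deviceQuery.exe"):
        print(tool, "PASS" if run_demo(tool) else "did not report PASS")
```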

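The environment is now basically set up; the last step is installing tensorflow-gpu.

My way of installing it is a bit unusual: I download the package first and then install it, whereas many bloggers install directly with pip or conda.

I'm a beginner, so I installed from the download, because pip was too slow.

All we need to do is find the downloaded whl file

Then open

First type activate tf-gpu; tf-gpu is the name you gave the new anaconda environment earlier.

Then type pip install, press space, don't press Enter yet, drag your whl file in, and then press Enter.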
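Dragging the file into the window just pastes its full path after pip install. If you prefer to script that, here is a minimal sketch that finds the wheel in a download folder and builds the equivalent command. The function name `pip_install_command` is made up for illustration; run it with the Python of the activated tf-gpu environment so that `sys.executable` points at the right interpreter.

```python
import sys
from pathlib import Path


def pip_install_command(download_dir):
    """Build a pip command for the single .whl file found in download_dir."""
    wheels = sorted(Path(download_dir).glob("*.whl"))
    if len(wheels) != 1:
        raise ValueError("expected exactly one .whl file, found %d" % len(wheels))
    # "python -m pip" makes sure pip belongs to the interpreter running this.
    return [sys.executable, "-m", "pip", "install", str(wheels[0])]
```

The returned list can be passed to `subprocess.run(command, check=True)`, or joined with spaces and pasted into the prompt.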

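Then comes another long wait until the installation finishes.

Once it's installed, your gpu setup is complete. Now just open pycharm and, under File > Settings > Project: code, select the python of the new anaconda environment (in my case, the python3.6 of tf-gpu)

Then click OK. Now you can try whether your code runs on the gpu!

One thing to note: since you've installed the gpu tensorflow, don't install the cpu one as well; two tensorflows in one environment will conflict.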
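A quick way to spot that conflict is to look for both package names in the output of pip freeze. The sketch below parses such text; the sample string is hypothetical and only there for illustration, and the helper names are made up.

```python
import re
import subprocess
import sys

TENSORFLOW_NAMES = {"tensorflow", "tensorflow-gpu"}


def tensorflow_packages(freeze_text):
    """Return the TensorFlow package names listed in pip-freeze-style text."""
    found = set()
    for line in freeze_text.splitlines():
        # Lines look like "name==version" or "name @ file:///path".
        name = re.split(r"==| @ ", line, maxsplit=1)[0]
        name = name.strip().lower().replace("_", "-")
        if name in TENSORFLOW_NAMES:
            found.add(name)
    return sorted(found)


def current_environment_packages():
    """Ask the pip of the running interpreter what is installed."""
    output = subprocess.check_output([sys.executable, "-m", "pip", "freeze"],
                                     universal_newlines=True)
    return tensorflow_packages(output)


# Hypothetical pip freeze output, for illustration only.
sample = "numpy==1.14.0\ntensorflow==1.8.0\ntensorflow-gpu==1.8.0\n"
found = tensorflow_packages(sample)
if len(found) > 1:
    print("Conflict, both are installed:", ", ".join(found))
```

In a real environment, call `current_environment_packages()` from the tf-gpu interpreter; if it returns both names, uninstall the cpu package.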

After installing all this, we can run the following code to check what problems remain:

import ctypes
import imp
import sys


def main():
    try:
        import tensorflow as tf
        print("TensorFlow successfully installed.")
        if tf.test.is_built_with_cuda():
            print("The installed version of TensorFlow includes GPU support.")
        else:
            print("The installed version of TensorFlow does not include GPU support.")
        sys.exit(0)
    except ImportError:
        print("ERROR: Failed to import the TensorFlow module.")

    candidate_explanation = False

    python_version = sys.version_info.major, sys.version_info.minor
    print("\n- Python version is %d.%d." % python_version)
    if not (python_version == (3, 5) or python_version == (3, 6)):
        candidate_explanation = True
        print("- The official distribution of TensorFlow for Windows requires "
              "Python version 3.5 or 3.6.")

    try:
        _, pathname, _ = imp.find_module("tensorflow")
        print("\n- TensorFlow is installed at: %s" % pathname)
    except ImportError:
        candidate_explanation = True
        print("""
- No module named TensorFlow is installed in this Python environment. You may
  install it using the command `pip install tensorflow`.""")

    try:
        msvcp140 = ctypes.WinDLL("msvcp140.dll")
    except OSError:
        candidate_explanation = True
        print("""
- Could not load 'msvcp140.dll'. TensorFlow requires that this DLL be
  installed in a directory that is named in your %PATH% environment
  variable. You may install this DLL by downloading Microsoft Visual
  C++ 2015 Redistributable Update 3 from this URL:
  https://www.microsoft.com/en-us/download/details.aspx?id=53587""")

    try:
        cudart64_80 = ctypes.WinDLL("cudart64_80.dll")
    except OSError:
        candidate_explanation = True
        print("""
- Could not load 'cudart64_80.dll'. The GPU version of TensorFlow
  requires that this DLL be installed in a directory that is named in
  your %PATH% environment variable. Download and install CUDA 8.0 from
  this URL: https://developer.nvidia.com/cuda-toolkit""")

    try:
        nvcuda = ctypes.WinDLL("nvcuda.dll")
    except OSError:
        candidate_explanation = True
        print("""
- Could not load 'nvcuda.dll'. The GPU version of TensorFlow requires that
  this DLL be installed in a directory that is named in your %PATH%
  environment variable. Typically it is installed in 'C:\\Windows\\System32'.
  If it is not present, ensure that you have a CUDA-capable GPU with the
  correct driver installed.""")

    cudnn5_found = False
    try:
        cudnn5 = ctypes.WinDLL("cudnn64_5.dll")
        cudnn5_found = True
    except OSError:
        candidate_explanation = True
        print("""
- Could not load 'cudnn64_5.dll'. The GPU version of TensorFlow
  requires that this DLL be installed in a directory that is named in
  your %PATH% environment variable. Note that installing cuDNN is a
  separate step from installing CUDA, and it is often found in a
  different directory from the CUDA DLLs. You may install the
  necessary DLL by downloading cuDNN 5.1 from this URL:
  https://developer.nvidia.com/cudnn""")

    cudnn6_found = False
    try:
        cudnn = ctypes.WinDLL("cudnn64_6.dll")
        cudnn6_found = True
    except OSError:
        candidate_explanation = True

    if not cudnn5_found or not cudnn6_found:
        print()
        if not cudnn5_found and not cudnn6_found:
            print("- Could not find cuDNN.")
        elif not cudnn5_found:
            print("- Could not find cuDNN 5.1.")
        else:
            print("- Could not find cuDNN 6.")
        print("""
  The GPU version of TensorFlow requires that the correct cuDNN DLL be installed
  in a directory that is named in your %PATH% environment variable. Note that
  installing cuDNN is a separate step from installing CUDA, and it is often
  found in a different directory from the CUDA DLLs. The correct version of
  cuDNN depends on your version of TensorFlow:

  * TensorFlow 1.2.1 or earlier requires cuDNN 5.1. ('cudnn64_5.dll')
  * TensorFlow 1.3 or later requires cuDNN 6. ('cudnn64_6.dll')

  You may install the necessary DLL by downloading cuDNN from this URL:
  https://developer.nvidia.com/cudnn""")

    if not candidate_explanation:
        print("""
- All required DLLs appear to be present. Please open an issue on the
  TensorFlow GitHub page: https://github.com/tensorflow/tensorflow/issues""")

    sys.exit(-1)


if __name__ == "__main__":
    main()

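If the result you get looks like this, congratulations: you have successfully installed tensorflow-gpu, and from here on you can test whatever code you write!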
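One caveat: when the import fails, the script above looks for the CUDA 8.0 and cuDNN 5.1/6 DLLs, while this post installs CUDA 9.0 and cuDNN 7. Here is a minimal sketch of the same idea for this setup, assuming the matching file names are cudart64_90.dll and cudnn64_7.dll; the function name `missing_dlls` is made up for illustration.

```python
import ctypes

# DLL names assumed for the versions installed in this post (CUDA 9.0, cuDNN 7).
EXPECTED_DLLS = ["msvcp140.dll", "nvcuda.dll", "cudart64_90.dll", "cudnn64_7.dll"]


def missing_dlls(names, load=None):
    """Return the DLL names that could not be loaded."""
    if load is None:
        load = ctypes.WinDLL  # only available on Windows
    missing = []
    for name in names:
        try:
            load(name)
        except OSError:
            missing.append(name)
    return missing


# Only run the real check on Windows, where ctypes.WinDLL exists.
if hasattr(ctypes, "WinDLL"):
    for name in missing_dlls(EXPECTED_DLLS):
        print("Could not load", name)
```

If a name is printed, the folder containing that DLL is probably missing from your PATH environment variable.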

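Note: the above is what I arrived at after going through a lot of installation guides, and it is how I installed things myself. Plenty went wrong along the way, but in the end I did get the gpu environment working.

The above won't necessarily work for everyone; it is simply how I did it, written up honestly. If the installation fails when you follow this method, search for someone else's approach. Don't worry, a failed installation won't harm your computer. I failed 3 times and spent three days before it finally worked.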