y450 archlinux cuda6.5

January 28, 2018 4:11 PM

This Arch Linux install is fully up to date, so gcc is already at 7.x, which is too new.

[qiangge@lqspc ~]$ gcc --version
gcc (GCC) 7.2.1 20180116
Copyright © 2017 Free Software Foundation, Inc.
本程序是自由軟件;請參看源代碼的版權聲明。本軟件沒有任何擔保;
包括沒有適銷性和某一專用目的下的適用性擔保。

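(The last two lines are gcc's license notice as localized into Chinese; I am still not quite used to the system's Chinese translations.)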
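The "too new" check can also be scripted. This is only a minimal sketch: `gcc_too_new` is a hypothetical helper name, and the 4.7 limit in the usage comment is simply the version that ended up working for me further down, not an official figure from NVIDIA.

```shell
# gcc_too_new <max_major> <max_minor> <version>
# Succeeds (exit 0) when <version> is newer than <max_major>.<max_minor>.
# Hypothetical helper; the limit is whatever you pass in.
gcc_too_new() {
    local max_major=$1 max_minor=$2 ver=$3
    local major=${ver%%.*}      # "7.2.1" -> "7"
    local rest=${ver#*.}        # "7.2.1" -> "2.1"
    local minor=${rest%%.*}     # "2.1"   -> "2"
    [ "$major" -gt "$max_major" ] ||
        { [ "$major" -eq "$max_major" ] && [ "$minor" -gt "$max_minor" ]; }
}

# Example use against the installed compiler:
#   if gcc_too_new 4 7 "$(gcc -dumpversion)"; then
#       echo "default gcc is newer than 4.7, an older one is needed for nvcc"
#   fi
```

Passing the limit as arguments keeps the guess about the supported version out of the function itself.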

Overall steps

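  1. Confirm that the installed Arch Linux is fairly recent and that you do not want to downgrade gcc and friends.
  2. Confirm the graphics card model of the y450 laptop: G 110M.
  3. Determine which CUDA version can be installed. I took a detour here: I started with a plain pa cuda, which gave me version 9.1, and after repeated tests the installation simply did not work. Looking up the compute capability (hopefully I spelled that right) of the card from the previous step showed that it supports at most 1.2; after installing, it turned out to be 1.1. Cards at 1.2 or below can use CUDA 6.5 at most, or an earlier version.
  4. Use yaourt cuda to find and install the matching version from the previous step. During installation /tmp ran out of space, so I created a new directory and mounted it on /tmp, overriding the memory-backed /tmp so the build could use disk space instead. It ran out because the machine only has 8G of RAM, which makes the default /tmp just 4G.
  5. After installing, test with the deviceQuery example under /opt/cuda/samples; it is best to copy the samples into your own /home directory first.
  6. At first none of the examples would compile, because of two errors. I solved them mainly with help from the CUDA community: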
(1)Here is a patch to /usr/include/bits/floatn.h for avoiding __FLOAT128 only when compiling via NVCC
(2)Here is how to use another GCC when compiling via NVCC
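  1. The first error comes from floatn.h. Following the forum post, the fix is essentially to add one more condition to the existing check, so that the block is skipped when compiling CUDA code.
  2. The second error is that the default gcc is too new for CUDA 6.5. I first tried gcc 5 (using the method from the next step), but that could only compile deviceQuery. Some googling showed that something around 4.7 is required. Building 4.7 with yaourt failed on this machine (and of course it needs the larger /tmp again); compiling a compiler fails very easily, and it cost me several days of electricity. Electricity in Shanghai is not cheap, especially when you rent. There had to be a way, though, and the reference material is in Arch Linux's yaourt source. The author there says a symlink has to be added for a shared library: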
sudo ln -s /usr/lib/libisl.so /usr/lib/libisl.so.10 && sudo ldconfig

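Otherwise the build fails. As a dedicated tinkerer I naturally had to try it without the symlink first, and sure enough it did not work: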

/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.4/cc1plus: error while loading shared libraries: libisl.so.10: cannot open shared object file: No such file or directory
make: *** [Makefile:196:bandwidthTest.o] 錯誤 1

With the symlink in place, a second error showed up, one the author presumably had not considered, haha:

/usr/lib/gcc/x86_64-unknown-linux-gnu/4.7.4/cc1plus: error while loading shared libraries: libmpfr.so.4: cannot open shared object file: No such file or directory

The fix follows the same idea with a very similar command; I leave it to the reader.
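For readers who would rather not work it out, here is a rough sketch of the same trick wrapped in a function. `link_soname` is a hypothetical helper, and I am assuming that an unversioned `libmpfr.so` exists in `/usr/lib` the same way `libisl.so` does. Pointing an old soname at a newer library only works while the two stay binary compatible, so treat it as a workaround rather than a proper fix.

```shell
# link_soname <libdir> <existing .so name> <missing versioned soname>
# Creates <libdir>/<soname> as a symlink to <libdir>/<existing>.
# Hypothetical helper; never overwrites a file that already exists.
link_soname() {
    local libdir=$1 existing=$2 soname=$3
    if [ ! -e "$libdir/$existing" ]; then
        echo "missing $libdir/$existing" >&2
        return 1
    fi
    if [ -e "$libdir/$soname" ]; then
        echo "$libdir/$soname already exists, leaving it alone" >&2
        return 0
    fi
    ln -s "$libdir/$existing" "$libdir/$soname"
}

# On the real system this needs root, followed by ldconfig:
#   link_soname /usr/lib libisl.so  libisl.so.10
#   link_soname /usr/lib libmpfr.so libmpfr.so.4
#   ldconfig
```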

  7. There are two ways to solve the gcc problem, and they amount to the same thing; see reference 1 and reference 2. The end result:
[qiangge@lqspc ~]$ ll /opt/cuda/
bin/                          jre/                          libnvvp/                      samples/
doc/                          lib/                          NVIDIA_SLA_cuDNN_Support.txt  share/
extras/                       lib64/                        nvvm/                         src/
include/                      libnsight/                    open64/                       tools/
[qiangge@lqspc ~]$ ll /opt/cuda/bin/gcc/
總用量 8.0K
drwxr-xr-x 2 root 4.0K 1月  28 22:52 .
lrwxrwxrwx 1 root   16 1月  28 22:52 gcc -> /usr/bin/gcc-4.7
lrwxrwxrwx 1 root   16 1月  28 22:52 cpp -> /usr/bin/cpp-4.7
lrwxrwxrwx 1 root   16 1月  28 22:52 g++ -> /usr/bin/g++-4.7
drwxr-xr-x 4 root 4.0K 1月  22 09:45 ..
[qiangge@lqspc ~]$
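A directory like the one listed above can be recreated with a few symlinks. This is a minimal sketch, assuming the older compiler is installed as `gcc-4.7`, `g++-4.7` and `cpp-4.7` under `/usr/bin` as in the listing; `make_compiler_dir` is a hypothetical helper name. How this setup makes nvcc pick the directory up is covered in the two references, not here; nvcc's `-ccbin` (`--compiler-bindir`) option is one way to point it at such a directory explicitly.

```shell
# make_compiler_dir <dir> <suffix> [bindir]
# Fills <dir> with gcc, g++ and cpp symlinks that point at
# <bindir>/<tool>-<suffix>. Hypothetical helper; bindir defaults to /usr/bin.
make_compiler_dir() {
    local dir=$1 suffix=$2 bindir=${3:-/usr/bin}
    local tool
    mkdir -p "$dir" || return 1
    for tool in gcc g++ cpp; do
        # -f replaces an existing link, -n avoids descending into it
        ln -sfn "$bindir/$tool-$suffix" "$dir/$tool" || return 1
    done
}

# On the real system (as root):
#   make_compiler_dir /opt/cuda/bin/gcc 4.7
```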
[qiangge@lqspc 1_Utilities]$ cd bandwidthTest/
[qiangge@lqspc bandwidthTest]$ nvidia-smi
Mon Jan 29 00:01:22 2018       
+------------------------------------------------------+                       
| NVIDIA-SMI 340.106    Driver Version: 340.106        |                       
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce G 110M      Off  | 0000:01:00.0     N/A |                  N/A |
| N/A   52C   P12    N/A /  N/A |     50MiB /   255MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0            Not Supported                                               |
+-----------------------------------------------------------------------------+
[qiangge@lqspc bandwidthTest]$
[qiangge@lqspc bandwidthTest]$ ./bandwidthTest 
[CUDA Bandwidth Test] - Starting...
Running on...

 Device 0: GeForce G 110M
 Quick Mode

 Host to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)    Bandwidth(MB/s)
   33554432         2551.5

 Device to Host Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)    Bandwidth(MB/s)
   33554432         1675.0

 Device to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)    Bandwidth(MB/s)
   33554432         6319.8

Result = PASS
[qiangge@lqspc bandwidthTest]$
[qiangge@lqspc 1_Utilities]$ cd deviceQuery
[qiangge@lqspc deviceQuery]$ ls
deviceQuery  deviceQuery.cpp  deviceQuery.o  Makefile  NsightEclipse.xml  readme.txt
[qiangge@lqspc deviceQuery]$ ./deviceQuery 
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce G 110M"
  CUDA Driver Version / Runtime Version          6.5 / 6.5
  CUDA Capability Major/Minor version number:    1.1
  Total amount of global memory:                 256 MBytes (268107776 bytes)
  ( 2) Multiprocessors, (  8) CUDA Cores/MP:     16 CUDA Cores
  GPU Clock rate:                                1000 MHz (1.00 GHz)
  Memory Clock rate:                             700 Mhz
  Memory Bus Width:                              64-bit
  Maximum Texture Dimension Size (x,y,z)         1D=(8192), 2D=(65536, 32768), 3D=(2048, 2048, 2048)
  Maximum Layered 1D Texture Size, (num) layers  1D=(8192), 512 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(8192, 8192), 512 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       16384 bytes
  Total number of registers available per block: 8192
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  768
  Maximum number of threads per block:           512
  Max dimension size of a thread block (x,y,z): (512, 512, 64)
  Max dimension size of a grid size    (x,y,z): (65535, 65535, 1)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             256 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      No
  Device PCI Bus ID / PCI location ID:           1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 6.5, CUDA Runtime Version = 6.5, NumDevs = 1, Device0 = GeForce G 110M
Result = PASS
[qiangge@lqspc deviceQuery]$
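Going back to step 4, the /tmp workaround can be sketched roughly as follows. The helper names and the `/home/build-tmp` path are made up for illustration, and the 8 GiB threshold is only a guess at what a large build might need, not a measured number.

```shell
# avail_kb <dir>: free space, in KiB, on the filesystem that holds <dir>.
avail_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# needs_bigger_tmp <dir> <needed_kb>
# Succeeds when <dir> has less than <needed_kb> KiB free.
needs_bigger_tmp() {
    [ "$(avail_kb "$1")" -lt "$2" ]
}

# On the real system (as root), with a hypothetical disk-backed directory:
#   if needs_bigger_tmp /tmp $((8 * 1024 * 1024)); then
#       mkdir -p /home/build-tmp && chmod 1777 /home/build-tmp
#       mount --bind /home/build-tmp /tmp   # hides the memory-backed /tmp
#   fi
#   ...build...
#   umount /tmp                             # the memory-backed /tmp reappears
```

The bind mount only lasts until it is unmounted or the machine reboots, which suits a one-off build.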

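The specs are low, but it should do for learning. If not, maybe I will buy a somewhat newer second-hand desktop graphics card. Is second-hand being stingy? It is, but I do not really need to buy one anyway: the company has a 1080 Ti, and I can stay late and learn on that. The point here was just to do the installation myself once and have something simple for learning, practice and testing. Along the way I also helped a friend get CUDA 6.5 working on a y550; that card is something like a G 240M, which also tops out at compute capability 1.2. He uses Ubuntu, though, and bloated Ubuntu is still not my cup of tea. Afterwards I noticed that my disk was almost full, which turned out to mean it was time for a pacman -Sc. Later I may set up automatic cleanup of cached packages that are no longer installed.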
