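nvidia-smi produces output, so the GPU driver is present, and the CUDA version reported by nvcc is 9.0 as expected, not 9.1. I could not tell where the problem was, so I simply reinstalled everything.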
sudo tee /proc/acpi/bbswitch <<<ON
# ON
nvidia-smi
Output:
Tue May 28 22:21:07 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.67                 Driver Version: 390.67                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 950M    Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   50C    P0    N/A /  N/A |      0MiB /  2004MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
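For scripting this check, the driver version can be pulled out of the nvidia-smi banner. A minimal Python sketch, assuming the banner keeps the "Driver Version: X.Y" wording shown above; `parse_driver_version` is a hypothetical helper name, not part of any NVIDIA tool.

```python
import re


def parse_driver_version(smi_text):
    """Return the driver version string found in nvidia-smi output, or None."""
    match = re.search(r"Driver Version:\s*(\d+(?:\.\d+)*)", smi_text)
    return match.group(1) if match else None


# Banner line copied from the nvidia-smi output in this post.
banner = "| NVIDIA-SMI 390.67 Driver Version: 390.67 |"
print(parse_driver_version(banner))  # 390.67
```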
nvcc --version
Output:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:03_CDT_2017
Cuda compilation tools, release 9.0, V9.0.176
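The toolkit release can be extracted from this output in the same spirit, which makes the "9.0, not 9.1" check explicit. A minimal Python sketch, assuming the "release X.Y, VX.Y.Z" wording shown above; `parse_nvcc_release` is a hypothetical helper.

```python
import re


def parse_nvcc_release(nvcc_text):
    """Return (release, full_version) from `nvcc --version` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+),\s+V(\d+(?:\.\d+)+)", nvcc_text)
    return (match.group(1), match.group(2)) if match else None


# Last line of the nvcc output in this post.
sample = "Cuda compilation tools, release 9.0, V9.0.176"
print(parse_nvcc_release(sample))  # ('9.0', '9.0.176')
```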
lspci | grep -i nvidia
Output:
01:00.0 3D controller: NVIDIA Corporation GM107M [GeForce GTX 950M] (rev a2)
Check whether PyTorch can use CUDA:
python -c 'import torch; print(torch.cuda.is_available())'
Output:
False
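The checks above can be collected into one script so they are easy to rerun after each reinstall step. A minimal Python sketch using only the standard library; `run_check` is a hypothetical helper, and the sketch assumes that `python` on the PATH is the interpreter where PyTorch is installed.

```python
import shutil
import subprocess


def run_check(cmd):
    """Run cmd; return (exit_code, combined_output), or None if the program is missing."""
    if shutil.which(cmd[0]) is None:
        return None
    proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, universal_newlines=True)
    return proc.returncode, proc.stdout


# The same commands used in this post.
CHECKS = [
    ["nvidia-smi"],
    ["nvcc", "--version"],
    ["lspci"],
    ["python", "-c", "import torch; print(torch.cuda.is_available())"],
]

if __name__ == "__main__":
    for cmd in CHECKS:
        result = run_check(cmd)
        if result is None:
            print(cmd[0], "-> not found on PATH")
            continue
        code, output = result
        if cmd[0] == "lspci":
            # Same effect as `lspci | grep -i nvidia`.
            output = "\n".join(line for line in output.splitlines()
                               if "nvidia" in line.lower())
        print(cmd[0], "-> exit code", code)
        print(output.strip())
```

A missing program is reported separately from a failing one, which helps tell "not installed" apart from "installed but broken".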
sudo /usr/local/cuda-9.0/bin/uninstall_cuda_9.0.pl
# After this, only the cuDNN files are left, so the directory can be deleted entirely.
sudo rm -rf /usr/local/cuda-9.0/
sudo apt-get remove --purge nvidia-cuda-dev nvidia-cuda-toolkit nvidia-nsight nvidia-visual-profiler
sudo apt autoremove --purge bumblebee-nvidia nvidia-driver nvidia-settings
sudo apt-get install nvidia-smi
sudo apt-get install bumblebee-nvidia nvidia-driver nvidia-settings
sudo apt-get install mesa-utils
Show information about the NVIDIA GPU:
optirun glxinfo|grep NVIDIA
Run the test program:
optirun glxgears -info
The GPU driver is used successfully. The output is:
GL_RENDERER = GeForce GTX 950M/PCIe/SSE2
GL_VERSION = 4.6.0 NVIDIA 390.67
GL_VENDOR = NVIDIA Corporation
sudo ./cuda_9.0.176_384.81_linux.run
During installation, answer no to this prompt only:
Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 384.81? (y)es/(n)o/(q)uit: n
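The driver prompt is presumably declined because a driver is already installed through apt (390.67 in the glxgears output above), and it is newer than the 384.81 bundled with the runfile. That reading is an inference from the version numbers in this post, not something the installer states. A small Python sketch of the comparison; `version_tuple` is a hypothetical helper.

```python
def version_tuple(version):
    """Turn '390.67' into (390, 67) so versions compare numerically, not as text."""
    return tuple(int(part) for part in version.split("."))


installed = "390.67"  # driver reported by nvidia-smi and glxgears in this post
bundled = "384.81"    # driver offered by the cuda_9.0.176_384.81 runfile

if version_tuple(installed) >= version_tuple(bundled):
    print("keep the installed driver, answer n")
else:
    print("installed driver is older than the bundled one")
```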
<https://developer.nvidia.com/rdp/cudnn-archive>
Log in and download the version that matches your CUDA release. I chose cudnn-9.0-linux-x64-v7.5.0.56.
Copy the extra cuDNN libraries into the corresponding CUDA directories:
sudo cp lib64/* /usr/local/cuda/lib64/
sudo cp include/* /usr/local/cuda/include/
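To confirm which cuDNN version ended up in place, the version macros in the copied header can be read back. A minimal Python sketch, assuming cuDNN 7.x, where cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL; `read_cudnn_version` is a hypothetical helper, and the header path follows the cp destination above.

```python
import os
import re


def read_cudnn_version(header_text):
    """Return (major, minor, patch) from cudnn.h contents, or None if a macro is missing."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"^\s*#define\s+" + name + r"\s+(\d+)",
                          header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


header_path = "/usr/local/cuda/include/cudnn.h"  # destination of the cp command
if os.path.exists(header_path):
    with open(header_path, errors="replace") as handle:
        print(read_cudnn_version(handle.read()))
else:
    print("cudnn.h not found at", header_path)
```

For the archive chosen here, the expected result would be a 7.5.x tuple. A different major version would point to a mismatched download.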
Then check the environment variables and switch the NVIDIA GPU on:
# Check LD_LIBRARY_PATH and PATH
sudo vim ~/.bashrc
# Use Bumblebee (bbswitch) to power on the NVIDIA GPU
sudo tee /proc/acpi/bbswitch<<<ON
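What to look for in ~/.bashrc is the CUDA bin directory on PATH and the CUDA lib64 directory on LD_LIBRARY_PATH. A minimal Python sketch of that check: /usr/local/cuda/lib64 comes from the cp step above, while /usr/local/cuda/bin is an assumption (the usual location of nvcc), and `missing_cuda_paths` is a hypothetical helper.

```python
import os


def missing_cuda_paths(env, cuda_home="/usr/local/cuda"):
    """Return {variable: directory} for each CUDA directory absent from env."""
    wanted = {
        "PATH": cuda_home + "/bin",               # assumed location of nvcc
        "LD_LIBRARY_PATH": cuda_home + "/lib64",  # where the cuDNN libraries were copied
    }
    missing = {}
    for var, directory in wanted.items():
        entries = [e.rstrip("/") for e in env.get(var, "").split(":") if e]
        if directory not in entries:
            missing[var] = directory
    return missing


for var, directory in missing_cuda_paths(os.environ).items():
    print("add to ~/.bashrc: export {0}={1}:${0}".format(var, directory))
```

The comparison is textual, so an entry such as /usr/local/cuda-9.0/bin is reported as missing even when /usr/local/cuda is a symlink to it.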
Check again whether PyTorch can use CUDA:
python -c "import torch;print(torch.cuda.is_available())"
Output:
True
Check whether TensorFlow can use the GPU:
python3 -c "import tensorflow as tf;print(tf.test.is_gpu_available());print(tf.test.gpu_device_name())"
Output:
2019-05-28 22:52:25.862539: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-05-28 22:52:26.319239: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-05-28 22:52:26.319674: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: GeForce GTX 950M major: 5 minor: 0 memoryClockRate(GHz): 1.124
pciBusID: 0000:01:00.0
totalMemory: 1.96GiB freeMemory: 1.92GiB
2019-05-28 22:52:26.319696: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
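The log can be reduced to the fields that matter here: the device name and its compute capability (the major/minor pair). A minimal Python sketch, assuming the TensorFlow 1.x log wording shown above; `parse_tf_device` is a hypothetical helper.

```python
import re


def parse_tf_device(log_text):
    """Return (device_name, compute_capability) from a TensorFlow 1.x GPU log, or None."""
    match = re.search(r"name:\s*(.+?)\s+major:\s*(\d+)\s+minor:\s*(\d+)", log_text)
    if match is None:
        return None
    name, major, minor = match.groups()
    return name, "{}.{}".format(major, minor)


# Device line copied from the log in this post.
log = ("Found device 0 with properties: "
       "name: GeForce GTX 950M major: 5 minor: 0 memoryClockRate(GHz): 1.124")
print(parse_tf_device(log))  # ('GeForce GTX 950M', '5.0')
```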
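Everything works now. It probably does not get more complicated than my case: uninstall and reinstall everything, with both the uninstall steps and the install steps covered above.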