[tf.keras] Installing CUDA and cuDNN on Linux as a Non-root User

TF 2.0 on Linux fails with the following error because the installed cuDNN version is too old:

E tensorflow/stream_executor/cuda/cuda_dnn.cc:319] Loaded runtime CuDNN library: 7.4.1 but source was compiled with: 7.6.0.  CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library.  If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
...
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above. [Op:Conv2D]

Fix: upgrade cuDNN. TF 2.0 works with CUDA 10.0 and cuDNN 7.6.4.
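The rule quoted in the log message (same major version, and a minor version at least as high as the one TensorFlow was compiled with) can be checked mechanically. The following is a minimal sketch, not part of the original fix: cudnn_is_compatible is a hypothetical helper name, and it assumes plain MAJOR.MINOR.PATCH version strings for cuDNN 7.0 or later.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 when the loaded cuDNN version satisfies the
# rule from the log (same major, loaded minor >= compiled minor).
cudnn_is_compatible() {
    loaded=$1      # version found at runtime, e.g. 7.4.1
    compiled=$2    # version TensorFlow was built against, e.g. 7.6.0

    loaded_major=${loaded%%.*}
    compiled_major=${compiled%%.*}

    # Drop "MAJOR." first, then everything after the next dot, to get MINOR.
    loaded_rest=${loaded#*.}
    compiled_rest=${compiled#*.}
    loaded_minor=${loaded_rest%%.*}
    compiled_minor=${compiled_rest%%.*}

    [ "$loaded_major" -eq "$compiled_major" ] &&
        [ "$loaded_minor" -ge "$compiled_minor" ]
}

# The two versions reported in the error above.
if cudnn_is_compatible 7.4.1 7.6.0; then
    echo "7.4.1 is compatible"
else
    echo "7.4.1 is too old"
fi
```

With the versions from the log, the check takes the "too old" branch, while 7.6.4 against 7.6.0 would pass.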

Root user

If you have root privileges on the Linux server, simply delete the existing cuDNN and then install the new version.
Delete the existing cuDNN:

sudo rm -rf /usr/local/cuda/include/cudnn.h
sudo rm -rf /usr/local/cuda/lib64/libcudnn*

From inside the cuda folder extracted from the cuDNN archive, install the new version:

sudo cp include/cudnn.h /usr/local/cuda/include/
sudo cp lib64/lib* /usr/local/cuda/lib64/

Create the symbolic links (using version 7.6.4 as an example):

cd /usr/local/cuda/lib64/
sudo chmod +r libcudnn.so.7.6.4
sudo ln -sf libcudnn.so.7.6.4 libcudnn.so.7
sudo ln -sf libcudnn.so.7 libcudnn.so   
sudo ldconfig
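To confirm which version ended up installed, the version macros in the cuDNN 7 header can be read back. This is a sketch under assumptions: cudnn_header_version is a hypothetical helper, and it relies on cudnn.h defining CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL one per line, as the cuDNN 7 headers do.

```shell
#!/bin/sh
# Hypothetical helper: print MAJOR.MINOR.PATCH as defined in a cuDNN 7 header.
cudnn_header_version() {
    header=$1
    # Each macro sits on its own line, e.g. "#define CUDNN_MAJOR 7".
    major=$(awk '$1 == "#define" && $2 == "CUDNN_MAJOR" {print $3}' "$header")
    minor=$(awk '$1 == "#define" && $2 == "CUDNN_MINOR" {print $3}' "$header")
    patch=$(awk '$1 == "#define" && $2 == "CUDNN_PATCHLEVEL" {print $3}' "$header")
    echo "$major.$minor.$patch"
}

# Only query the system-wide location if a header is actually installed there.
if [ -r /usr/local/cuda/include/cudnn.h ]; then
    cudnn_header_version /usr/local/cuda/include/cudnn.h
fi
```

After the steps above, the printed version should be 7.6.4.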

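Non-root user

Without root privileges, one option is to install your own copies of CUDA and cuDNN.

Installing CUDA in the user directory

Download cuda_10.0.130_410.48_linux.run (the Ubuntu runfile) from the official site, https://developer.nvidia.com/cuda-10.0-download-archive, and start the installer with sh cuda_10.0.130_410.48_linux.run. Then answer the prompts as follows: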

# Press q to exit the license agreement text.
 
Do you accept the previously read EULA?
accept/decline/quit: accept

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 410.48?
(y)es/(n)o/(q)uit: no

Install the CUDA 10.0 Toolkit?
(y)es/(n)o/(q)uit: yes

# Replace with your own username
Enter Toolkit Location
 [ default is /usr/local/cuda-10.0 ]: /home/wuliyttaotao/cuda-10.0

Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: n

Install the CUDA 10.0 Samples?
(y)es/(n)o/(q)uit: y

# Keep the default path; press Enter
Enter CUDA Samples Location
 [ default is /home/wuliyttaotao ]:
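Once the installer finishes, it may be worth checking that the toolkit really landed in the directory entered at the prompt. The sketch below is an illustration only: cuda_toolkit_present is a hypothetical helper, and it assumes the toolkit was installed to $HOME/cuda-10.0 as in the answers above.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if DIR looks like a CUDA toolkit install
# (an executable nvcc compiler plus a lib64 directory).
cuda_toolkit_present() {
    toolkit_dir=$1
    [ -x "$toolkit_dir/bin/nvcc" ] && [ -d "$toolkit_dir/lib64" ]
}

toolkit="$HOME/cuda-10.0"   # the location entered at the installer prompt
if cuda_toolkit_present "$toolkit"; then
    # nvcc reports the toolkit release it belongs to.
    "$toolkit/bin/nvcc" --version
else
    echo "no CUDA toolkit found in $toolkit"
fi
```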

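Configuring cuDNN

Copy the cuDNN files into the CUDA installation directory. (Here cuDNN was extracted to ~/cuda, ~/cuda-10.0 is the CUDA installation directory chosen above, and ~ stands for /home/wuliyttaotao.)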

cp ~/cuda/include/cudnn.h ~/cuda-10.0/include
cp ~/cuda/lib64/lib* ~/cuda-10.0/lib64

chmod a+r ~/cuda-10.0/include/cudnn.h ~/cuda-10.0/lib64/libcudnn*

Create the symbolic links:

cd ~/cuda-10.0/lib64
ln -sf libcudnn.so.7.6.4 libcudnn.so.7
ln -sf libcudnn.so.7 libcudnn.so
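# Without root, plain ldconfig cannot rewrite /etc/ld.so.cache; -n only refreshes the links in the given directory
ldconfig -n .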
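The link chain libcudnn.so -> libcudnn.so.7 -> libcudnn.so.7.6.4 can be verified by resolving it. This is a minimal sketch: cudnn_link_target is a hypothetical helper, and it assumes GNU readlink (for the -f option) and the ~/cuda-10.0/lib64 directory used above.

```shell
#!/bin/sh
# Hypothetical helper: print the file name that libcudnn.so finally resolves to.
cudnn_link_target() {
    lib_dir=$1
    # readlink -f follows every link in the chain (GNU coreutils).
    basename "$(readlink -f "$lib_dir/libcudnn.so")"
}

lib_dir="$HOME/cuda-10.0/lib64"
if [ -e "$lib_dir/libcudnn.so" ]; then
    cudnn_link_target "$lib_dir"
fi
```

If the links were created as shown, the resolved name is libcudnn.so.7.6.4.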

Configuring user environment variables

Edit ~/.bashrc and add the following two lines (replace wuliyttaotao with your own username):

export PATH=/home/wuliyttaotao/cuda-10.0/bin${PATH:+:${PATH}}  
export LD_LIBRARY_PATH=/home/wuliyttaotao/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

After saving ~/.bashrc, run source ~/.bashrc for the changes to take effect.
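The ${PATH:+:${PATH}} form in the export lines appends ":" plus the old value only when the variable is already set and non-empty. That avoids a trailing colon, which the shell and the dynamic loader would treat as an empty entry meaning the current directory. The sketch below illustrates the same expansion; prepend_path is a hypothetical helper that additionally skips directories already in the list, so sourcing ~/.bashrc repeatedly would not keep growing the variable.

```shell
#!/bin/sh
# Hypothetical helper: print NEW_DIR prepended to a colon-separated list,
# with no trailing colon for an empty list and no duplicate entries.
prepend_path() {
    new_dir=$1
    current=$2
    case ":$current:" in
        *":$new_dir:"*)
            # Already present: leave the list unchanged.
            echo "$current"
            ;;
        *)
            # Same ${VAR:+:...} expansion as in the export lines.
            echo "$new_dir${current:+:$current}"
            ;;
    esac
}

prepend_path /home/wuliyttaotao/cuda-10.0/bin ""
prepend_path /home/wuliyttaotao/cuda-10.0/bin /usr/bin:/bin
```

The first call prints just the new directory; the second prints it followed by :/usr/bin:/bin.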

References

How to upgrade cuDNN on Linux -- ZONG_XP
Installing multiple versions of CUDA and cuDNN on Linux as a non-root user (CUDA 8, CUDA 10.1, etc.) -- 隨性拂塵傾心
