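Installing NVIDIA graphics drivers on Ubuntu often ends in a boot loop that keeps you from reaching the desktop. (This article uses 16.04 LTS desktop; 17.04 and later use a different display server, so the details may differ.) The method recommended here installs the latest driver (version 396) together with the CUDA Toolkit, and was tested successfully on the latest Titan V card.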
sudo apt install synaptic
sudo synaptic
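Drivers installed this way have been tested by Ubuntu, so they are the safer choice. They are somewhat older, though: the default NVIDIA driver in my Ubuntu 16.04 LTS installation is version 384.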
sudo apt install nvidia-384
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install nvidia-396
# For development
sudo apt install nvidia-396-dev
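After a reboot it helps to confirm which driver version the kernel actually loaded. The sketch below is one way to do that. It assumes the proprietary driver exposes /proc/driver/nvidia/version with a first line of the form "NVRM version: NVIDIA UNIX x86_64 Kernel Module  396.24 ...". The function name nvidia_driver_version and its optional file argument are illustrative, not part of any NVIDIA tool.

```shell
# Print the version of the loaded NVIDIA kernel module, or "none" if the
# version file is missing (driver not installed or not loaded).
# The optional argument overrides the file path; it is a hypothetical
# convenience for trying the function against a saved copy.
nvidia_driver_version() {
    version_file="${1:-/proc/driver/nvidia/version}"
    if [ -r "$version_file" ]; then
        # Assumed line format:
        #   NVRM version: NVIDIA UNIX x86_64 Kernel Module  396.24  <build date>
        sed -n 's/^NVRM version:.*Kernel Module *\([0-9][0-9.]*\).*/\1/p' "$version_file"
    else
        echo "none"
    fi
}
```

Calling nvidia_driver_version with no argument reads the live file. If it prints "none", the module is not loaded, which usually means the installation did not take effect.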
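The CUDA package bundles a graphics driver and can install it for you. Download it from https://developer.nvidia.com/cuda-downloads. It already supports the latest Volta architecture (currently used only by the Titan V graphics card and the Tesla compute cards built on the GV100 chip).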
# Download the CUDA 9.1 installer and the 2018.5.5 patch:
wget -c https://developer.nvidia.com/compute/cuda/9.1/Prod/local_installers/cuda_9.1.85_387.26_linux
wget -c https://developer.nvidia.com/compute/cuda/9.1/Prod/patches/3/cuda_9.1.85.3_linux
# Then run sudo chmod +x ... and execute the files.
Ubuntu 18.04:
https://developer.nvidia.com/compute/cuda/10.0/Prod/local_installers/cuda-repo-ubuntu1804-10-0-local-10.0.130-410.48_1.0-1_amd64
Install:
sudo dpkg -i cuda-repo-ubuntu1804-10-0-local-10.0.130-410.48_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-<version>/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda
After installation, some additional setup is needed before CUDA can be used. The installer prints the following:
===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-9.1
Samples:  Installed in /home/openthings, but missing recommended libraries

Please make sure that
 -   PATH includes /usr/local/cuda-9.1/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-9.1/lib64, or, add /usr/local/cuda-9.1/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-9.1/bin

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-9.1/doc/pdf for detailed information on setting up CUDA.

***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least 384.00 is required for CUDA 9.1 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run -silent -driver
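The two path settings the installer asks for can be sketched as follows. This assumes the toolkit landed in /usr/local/cuda-9.1, as the summary above reports. The variable name CUDA_HOME is only a convenience chosen for this example.

```shell
# Assumed install prefix, taken from the installer summary; adjust it if
# your toolkit version or location differs.
CUDA_HOME=/usr/local/cuda-9.1

# Prepend the toolkit binaries (nvcc and others) to PATH.
export PATH="$CUDA_HOME/bin${PATH:+:$PATH}"

# Prepend the CUDA libraries to the dynamic linker search path.
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

To make the settings permanent, append these lines to ~/.bashrc and open a new terminal. Running nvcc --version should then report the installed toolkit release.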
To install a newer driver, download it from the NVIDIA website (http://www.nvidia.cn/Download/index.aspx?lang=cn).
The installer requires the X server to be shut down first. Run:
sudo service lightdm stop
Press Ctrl+Alt+F1 to switch to a text console; Ctrl+Alt+F7 returns to the graphical interface.
After the installer finishes, start lightdm again by running:
sudo service lightdm start
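However, drivers installed this way are less thoroughly tested. Besides being more complicated to install, they can cause the system to hang after a reboot, leaving you unable to log in.
If that happens, boot into "Advanced options > Recovery" mode and reconfigure from the command line.
Run: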
dpkg-reconfigure lightdm
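For system repair measures, see:
If that still does not work, the only remaining option is to reinstall the system.
Install the NVIDIA-supported Docker engine and you can use the GPU inside containers. The steps are as follows: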
# If you have nvidia-docker 1.0 installed: we need to remove it and all existing GPU containers
docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
sudo apt-get purge -y nvidia-docker

# Add the package repositories
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
  sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update

# Install nvidia-docker2 and reload the Docker daemon configuration
sudo apt-get install -y nvidia-docker2
sudo pkill -SIGHUP dockerd

# Test nvidia-smi with the latest official CUDA image
docker run --runtime=nvidia --rm nvidia/cuda nvidia-smi
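The distribution variable in the script is built by sourcing /etc/os-release and joining its ID and VERSION_ID fields. That string selects which repository list is downloaded. The sketch below isolates this step so you can see the resulting URL before adding the repository. The function names distribution_id and repo_list_url, and the optional file argument, are illustrative additions and not part of nvidia-docker.

```shell
# Build the "<ID><VERSION_ID>" string (for example ubuntu16.04) from an
# os-release style file. The file argument is a hypothetical convenience;
# it defaults to the real /etc/os-release.
distribution_id() {
    release_file="${1:-/etc/os-release}"
    # Source the file in a subshell so its variables do not leak into
    # the calling shell.
    ( . "$release_file" && printf '%s%s\n' "$ID" "$VERSION_ID" )
}

# Print the URL of the apt source list that the curl command would fetch.
repo_list_url() {
    printf 'https://nvidia.github.io/nvidia-docker/%s/nvidia-docker.list\n' \
        "$(distribution_id "$1")"
}
```

Running repo_list_url with no argument prints the URL for the current machine. If the distribution string is not one the repository publishes, the download in the script above will not return a usable list.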
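Note that running Docker as shown above now supports GPUs directly; there is no longer any need to run the separate nvidia-docker command. This greatly improves compatibility with container orchestration systems, and Kubernetes can now run GPU workloads in Docker containers as well.
The current release depends on Docker 18.03. If you already have a different version installed, you can specify the version to install, as follows: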
sudo apt install docker-ce=18.03.1~ce-0~ubuntu
For details, see: