
Installing the GPU Driver, CUDA, and cuDNN on a GPU Server

Date: 2023-06-18 21:39:11


Contents

1. Installing the server driver
2. Installing CUDA
3. Installing cuDNN
4. Installing Docker
5. Installing nvidia-docker2
5.1 On Ubuntu
5.2 On CentOS
6. Testing GPU access from a Docker container

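1. Installing the server driver

GPU driver download page: /Download/index.aspx?lang=cn

After the driver is installed, run nvidia-smi to check the driver information.

To look up the GPU model, run: lspci | grep -i vga
If the card is listed as a 3D controller rather than a VGA device, which is common for data-center cards, lspci | grep -i nvidia finds it instead.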

root@hk-MZ32-AR0-00:~# nvidia-smi
Fri Feb 10 17:27:58
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.106.00   Driver Version: 460.106.00   CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:04:00.0 Off |                    0 |
| N/A   46C    P0    27W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla T4            Off  | 00000000:06:00.0 Off |                    0 |
| N/A   43C    P0    28W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  Tesla T4            Off  | 00000000:0D:00.0 Off |                    0 |
| N/A   48C    P0    28W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  Tesla T4            Off  | 00000000:0F:00.0 Off |                    0 |
| N/A   45C    P0    26W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   4  Tesla T4            Off  | 00000000:17:00.0 Off |                    0 |
| N/A   48C    P0    27W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   5  Tesla T4            Off  | 00000000:19:00.0 Off |                    0 |
| N/A   48C    P0    28W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   6  Tesla T4            Off  | 00000000:21:00.0 Off |                    0 |
| N/A   45C    P0    26W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   7  Tesla T4            Off  | 00000000:23:00.0 Off |                    0 |
| N/A   45C    P0    27W /  70W |      0MiB / 15109MiB |      4%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
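When this check is scripted, the two version fields in the header can be pulled out of the nvidia-smi text. The sketch below is only an illustration: parse_smi_header is a hypothetical helper name, and it assumes the header line has the same layout as the output above.

```shell
# Hypothetical helper: read nvidia-smi output on stdin and print the driver
# version and the CUDA version reported in the header.
# Assumes the header layout "Driver Version: X   CUDA Version: Y" shown above.
parse_smi_header() {
  sed -n 's/.*Driver Version: \([0-9.]*\).*CUDA Version: \([0-9.]*\).*/driver=\1 cuda=\2/p'
}

# Try it on the header line from this machine (no GPU needed):
printf '%s\n' '| NVIDIA-SMI 460.106.00   Driver Version: 460.106.00   CUDA Version: 11.2     |' | parse_smi_header
```

On a machine with the driver installed, nvidia-smi | parse_smi_header should print the same kind of line. The CUDA Version in this header is the newest CUDA runtime the driver can serve; it does not mean a CUDA toolkit is installed.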

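2. Installing CUDA

When installing CUDA, check the GPU driver version first: each CUDA release requires a minimum driver version. Reference document: (a copy is attached).

The driver on this test machine is 460.106.00, so I chose CUDA 11.0.

Download page: /cuda-toolkit-archive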
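The driver check described above can be sketched as a small version comparison. Assumptions: version_ge is a hypothetical helper, GNU sort with the -V option is available, and 450.51.05 is used as the threshold only because it is the driver version embedded in the installer file name used in this walkthrough. NVIDIA's compatibility table remains the authority.

```shell
# Hypothetical helper: succeed when version $1 >= version $2.
# Relies on GNU sort -V (version sort); the smaller version sorts first.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

installed_driver="460.106.00"   # from nvidia-smi on this machine
bundled_driver="450.51.05"      # from cuda_11.0.2_450.51.05_linux.run

if version_ge "$installed_driver" "$bundled_driver"; then
  echo "driver $installed_driver is at least $bundled_driver"
else
  echo "driver $installed_driver is older than $bundled_driver; upgrade it first"
fi
```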

root@hk-MZ32-AR0-00:~# wget http://developer./compute/cuda/11.0.2/local_installers/cuda_11.0.2_450.51.05_linux.run
---01-29 19:55:42-- http://developer./compute/cuda/11.0.2/local_installers/cuda_11.0.2_450.51.05_linux.run
Resolving developer. (developer.)... 152.199.39.144
Connecting to developer. (developer.)|152.199.39.144|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://developer./compute/cuda/11.0.2/local_installers/cuda_11.0.2_450.51.05_linux.run [following]
---01-29 19:55:43-- https://developer./compute/cuda/11.0.2/local_installers/cuda_11.0.2_450.51.05_linux.run
Connecting to developer. (developer.)|152.199.39.144|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://developer./compute/cuda/11.0.2/local_installers/cuda_11.0.2_450.51.05_linux.run [following]
---01-29 19:55:44-- https://developer./compute/cuda/11.0.2/local_installers/cuda_11.0.2_450.51.05_linux.run
Resolving developer. (developer.)... 125.64.2.195, 125.64.2.196, 150.138.231.66, ...
Connecting to developer. (developer.)|125.64.2.195|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3066694836 (2.9G) [application/octet-stream]
Saving to: ‘cuda_11.0.2_450.51.05_linux.run’

100%[=====================================================================================================================================>] 3,066,694,836 11.3MB/s in 4m 25s

-01-29 20:00:15 (11.0 MB/s) - ‘cuda_11.0.2_450.51.05_linux.run’ saved [3066694836/3066694836]

root@hk-MZ32-AR0-00:~# ./cuda_11.0.2_450.51.05_linux.run
┌──────────────────────────────────────────────────────────────────────────────┐
│ Existing package manager installation of the driver found. It is strongly    │
│ recommended that you remove this before continuing.                          │
│ Abort                                                                        │
│ Continue                                                                     │
│                                                                              │
│ Up/Down: Move | 'Enter': Select                                              │
└──────────────────────────────────────────────────────────────────────────────┘

# Use the Up/Down keys to select Continue and press Enter. The following screen appears:

┌──────────────────────────────────────────────────────────────────────────────┐
│ End User License Agreement                                                   │
│ --------------------------                                                   │
│                                                                              │
│ NVIDIA Software License Agreement and CUDA Supplement to                     │
│ Software License Agreement.                                                  │
│                                                                              │
│ Preface                                                                      │
│ -------                                                                      │
│                                                                              │
│ The Software License Agreement in Chapter 1 and the Supplement               │
│ in Chapter 2 contain license terms and conditions that govern                │
│ the use of NVIDIA software. By accepting this agreement, you                 │
│ agree to comply with all the terms and conditions applicable                 │
│ to the product(s) included herein.                                           │
│                                                                              │
│ NVIDIA Driver                                                                │
│                                                                              │
│──────────────────────────────────────────────────────────────────────────────│
│ Do you accept the above EULA? (accept/decline/quit):                         │
│                                                                              │
└──────────────────────────────────────────────────────────────────────────────┘

# Type accept and press Enter. The following screen appears:

┌──────────────────────────────────────────────────────────────────────────────┐
│ CUDA Installer                                                               │
│ - [X] Driver                                                                 │
│      [X] 450.51.05                                                           │
│ + [X] CUDA Toolkit 11.0                                                      │
│   [X] CUDA Samples 11.0                                                      │
│   [X] CUDA Demo Suite 11.0                                                   │
│   [X] CUDA Documentation 11.0                                                │
│   Options                                                                    │
│   Install                                                                    │
│                                                                              │
│ Up/Down: Move | Left/Right: Expand | 'Enter': Select | 'A': Advanced options │
└──────────────────────────────────────────────────────────────────────────────┘

# Move to Driver with the Up/Down keys and press Space to deselect it, because the driver was already installed in step 1. Then move to Install and press Enter. The installation runs and prints:

===========
= Summary =
===========

Driver:   Not Selected
Toolkit:  Installed in /usr/local/cuda-11.0/
Samples:  Installed in /home/hk/, but missing recommended libraries

Please make sure that
 -   PATH includes /usr/local/cuda-11.0/bin
 -   LD_LIBRARY_PATH includes /usr/local/cuda-11.0/lib64, or, add /usr/local/cuda-11.0/lib64 to /etc/ld.so.conf and run ldconfig as root

To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-11.0/bin

Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-11.0/doc/pdf for detailed information on setting up CUDA.
***WARNING: Incomplete installation! This installation did not install the CUDA Driver. A driver of version at least .00 is required for CUDA 11.0 functionality to work.
To install the driver using this installer, run the following command, replacing <CudaInstaller> with the name of this run file:
    sudo <CudaInstaller>.run --silent --driver

Logfile is /var/log/cuda-installer.log
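The interactive choices above (skip the driver, install the toolkit) can also be expressed as runfile options, which helps when several servers need the same setup. This is a sketch under assumptions: --silent appears in the installer's own message above, and --toolkit is the runfile option that installs only the toolkit. The helper name cuda_toolkit_install_cmd is hypothetical, and it only prints the command so it can be reviewed before it is run.

```shell
# Hypothetical helper: print (do not run) a non-interactive install command
# for the runfile given as $1. Invoking the file through sh avoids needing
# the execute bit on a freshly downloaded installer.
cuda_toolkit_install_cmd() {
  printf 'sh %s --silent --toolkit\n' "$1"
}

cuda_toolkit_install_cmd cuda_11.0.2_450.51.05_linux.run
```

Run the printed command as root once it looks right. Samples, the demo suite and documentation are not selected by this form.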

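Add the CUDA commands to the system PATH

# Append the export line to /etc/profile with echo. A bare export redirected to the file appends nothing, because export prints no output.
root@hk-MZ32-AR0-00:~# echo 'export PATH=$PATH:/usr/local/cuda/bin' >> /etc/profile
root@hk-MZ32-AR0-00:~# source /etc/profile
# Run nvcc to display the CUDA version information
root@hk-MZ32-AR0-00:~# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) - NVIDIA Corporation
Built on Thu_Jun_11_22:26:38_PDT_
Cuda compilation tools, release 11.0, V11.0.194
Build cuda_11.0_bu.TC445_37.28540450_0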
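Because /etc/profile is read at every login, it is worth avoiding duplicate entries when the setup is repeated. The sketch below shows one way to do that; append_path_once is a hypothetical helper name, and the two directories are the ones named in the installer summary (through the /usr/local/cuda symlink, which is assumed to point at cuda-11.0).

```shell
# Hypothetical helper: print the colon-separated list $1 with directory $2
# appended, unless $2 is already one of its entries.
append_path_once() {
  case ":$1:" in
    *":$2:"*) printf '%s\n' "$1" ;;
    *)        printf '%s\n' "${1:+$1:}$2" ;;
  esac
}

# Applied to the current shell; the same two lines could go in /etc/profile.
PATH=$(append_path_once "$PATH" /usr/local/cuda/bin)
LD_LIBRARY_PATH=$(append_path_once "${LD_LIBRARY_PATH:-}" /usr/local/cuda/lib64)
export PATH LD_LIBRARY_PATH
```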

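3. Installing cuDNN

Download page: /rdp/cudnn-archive

When downloading cuDNN, also check which CUDA version the package targets (marked with a red box in the screenshot).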
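The CUDA version match can be checked from the archive name, since the file used here carries a _cuda11 tag. A minimal sketch, assuming the archive follows that naming pattern and that nvcc prints a "release X.Y" line as in the output shown earlier; both helper names are hypothetical.

```shell
# Hypothetical helper: print the CUDA major version encoded in a cuDNN
# archive name such as cudnn-linux-x86_64-8.7.0.84_cuda11-archive.tar.xz.
cudnn_archive_cuda_major() {
  printf '%s\n' "$1" | sed -n 's/.*_cuda\([0-9][0-9]*\)-archive.*/\1/p'
}

# Hypothetical helper: read "nvcc -V" output on stdin and print the CUDA
# major version from the "release X.Y" line.
nvcc_cuda_major() {
  sed -n 's/.*release \([0-9][0-9]*\)\..*/\1/p'
}

archive_major=$(cudnn_archive_cuda_major cudnn-linux-x86_64-8.7.0.84_cuda11-archive.tar.xz)
toolkit_major=$(echo 'Cuda compilation tools, release 11.0, V11.0.194' | nvcc_cuda_major)
[ "$archive_major" = "$toolkit_major" ] && echo "cuDNN archive matches CUDA $toolkit_major"
```

On the server itself, replace the echo with nvcc -V piped into nvcc_cuda_major.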

root@hk-MZ32-AR0-00:~# rz
ZMODEM Session started e50
------------------------
Sent cudnn-linux-x86_64-8.7.0.84_cuda11-archive.tar.xz
root@hk-MZ32-AR0-00:~# tar -xvf cudnn-linux-x86_64-8.7.0.84_cuda11-archive.tar.xz
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_adv_infer_static.a
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_adv_infer_static_v8.a
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_adv_train_static.a
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_adv_train_static_v8.a
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_cnn_infer_static.a
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_cnn_infer_static_v8.a
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_cnn_train_static.a
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_cnn_train_static_v8.a
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_ops_infer_static.a
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_ops_infer_static_v8.a
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_ops_train_static.a
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_ops_train_static_v8.a
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn.so.8
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn.so
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn.so.8.7.0
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_adv_infer.so.8
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_adv_infer.so
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_adv_infer.so.8.7.0
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_adv_train.so.8
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_adv_train.so
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_adv_train.so.8.7.0
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_cnn_infer.so
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_cnn_infer.so.8
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_cnn_infer.so.8.7.0
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_cnn_train.so
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_cnn_train.so.8.7.0
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_cnn_train.so.8
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_ops_infer.so.8.7.0
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_ops_infer.so
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_ops_infer.so.8
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_ops_train.so.8.7.0
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_ops_train.so
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/libcudnn_ops_train.so.8
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/include/
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/include/cudnn_v8.h
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/include/cudnn_adv_infer_v8.h
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/include/cudnn_adv_train_v8.h
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/include/cudnn_backend_v8.h
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/include/cudnn_cnn_infer_v8.h
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/include/cudnn_cnn_train_v8.h
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/include/cudnn_ops_infer_v8.h
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/include/cudnn_ops_train_v8.h
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/include/cudnn_version_v8.h
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/include/cudnn.h
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/include/cudnn_adv_infer.h
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/include/cudnn_adv_train.h
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/include/cudnn_backend.h
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/include/cudnn_cnn_infer.h
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/include/cudnn_cnn_train.h
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/include/cudnn_ops_infer.h
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/include/cudnn_ops_train.h
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/include/cudnn_version.h
cudnn-linux-x86_64-8.7.0.84_cuda11-archive/LICENSE

root@hk-MZ32-AR0-00:~# ll cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/
total 256
drwxr-xr-x 2 25503 2174      4096 Nov 22 04:14 ./
drwxr-xr-x 4 25503 2174      4096 Nov 22 04:14 ../
lrwxrwxrwx 1 25503 2174        23 Nov 22 03:58 libcudnn_adv_infer.so -> libcudnn_adv_infer.so.8*
lrwxrwxrwx 1 25503 2174        27 Nov 22 03:58 libcudnn_adv_infer.so.8 -> libcudnn_adv_infer.so.8.7.0*
-rwxr-xr-x 1 25503 2174 130381904 Nov 22 03:58 libcudnn_adv_infer.so.8.7.0*
-rw-r--r-- 1 25503 2174 132979922 Nov 22 03:58 libcudnn_adv_infer_static.a
lrwxrwxrwx 1 25503 2174        27 Nov 22 03:58 libcudnn_adv_infer_static_v8.a -> libcudnn_adv_infer_static.a
lrwxrwxrwx 1 25503 2174        23 Nov 22 03:58 libcudnn_adv_train.so -> libcudnn_adv_train.so.8*
lrwxrwxrwx 1 25503 2174        27 Nov 22 03:58 libcudnn_adv_train.so.8 -> libcudnn_adv_train.so.8.7.0*
-rwxr-xr-x 1 25503 2174 121095120 Nov 22 03:58 libcudnn_adv_train.so.8.7.0*
-rw-r--r-- 1 25503 2174 123566296 Nov 22 03:58 libcudnn_adv_train_static.a
lrwxrwxrwx 1 25503 2174        27 Nov 22 03:58 libcudnn_adv_train_static_v8.a -> libcudnn_adv_train_static.a
lrwxrwxrwx 1 25503 2174        23 Nov 22 03:58 libcudnn_cnn_infer.so -> libcudnn_cnn_infer.so.8*
lrwxrwxrwx 1 25503 2174        27 Nov 22 03:58 libcudnn_cnn_infer.so.8 -> libcudnn_cnn_infer.so.8.7.0*
-rwxr-xr-x 1 25503 2174 639185544 Nov 22 03:58 libcudnn_cnn_infer.so.8.7.0*
-rw-r--r-- 1 25503 2174 829548950 Nov 22 03:58 libcudnn_cnn_infer_static.a
lrwxrwxrwx 1 25503 2174        27 Nov 22 03:58 libcudnn_cnn_infer_static_v8.a -> libcudnn_cnn_infer_static.a
lrwxrwxrwx 1 25503 2174        23 Nov 22 03:58 libcudnn_cnn_train.so -> libcudnn_cnn_train.so.8*
lrwxrwxrwx 1 25503 2174        27 Nov 22 03:58 libcudnn_cnn_train.so.8 -> libcudnn_cnn_train.so.8.7.0*
-rwxr-xr-x 1 25503 2174 102197000 Nov 22 03:58 libcudnn_cnn_train.so.8.7.0*
-rw-r--r-- 1 25503 2174 153525776 Nov 22 03:58 libcudnn_cnn_train_static.a
lrwxrwxrwx 1 25503 2174        27 Nov 22 03:58 libcudnn_cnn_train_static_v8.a -> libcudnn_cnn_train_static.a
lrwxrwxrwx 1 25503 2174        23 Nov 22 03:58 libcudnn_ops_infer.so -> libcudnn_ops_infer.so.8*
lrwxrwxrwx 1 25503 2174        27 Nov 22 03:58 libcudnn_ops_infer.so.8 -> libcudnn_ops_infer.so.8.7.0*
-rwxr-xr-x 1 25503 2174  97489336 Nov 22 03:58 libcudnn_ops_infer.so.8.7.0*
-rw-r--r-- 1 25503 2174 100636906 Nov 22 03:58 libcudnn_ops_infer_static.a
lrwxrwxrwx 1 25503 2174        27 Nov 22 03:58 libcudnn_ops_infer_static_v8.a -> libcudnn_ops_infer_static.a
lrwxrwxrwx 1 25503 2174        23 Nov 22 03:58 libcudnn_ops_train.so -> libcudnn_ops_train.so.8*
lrwxrwxrwx 1 25503 2174        27 Nov 22 03:58 libcudnn_ops_train.so.8 -> libcudnn_ops_train.so.8.7.0*
-rwxr-xr-x 1 25503 2174  74703096 Nov 22 03:58 libcudnn_ops_train.so.8.7.0*
-rw-r--r-- 1 25503 2174  75156862 Nov 22 03:58 libcudnn_ops_train_static.a
lrwxrwxrwx 1 25503 2174        27 Nov 22 03:58 libcudnn_ops_train_static_v8.a -> libcudnn_ops_train_static.a
lrwxrwxrwx 1 25503 2174        13 Nov 22 03:58 libcudnn.so -> libcudnn.so.8*
lrwxrwxrwx 1 25503 2174        17 Nov 22 03:58 libcudnn.so.8 -> libcudnn.so.8.7.0*
-rwxr-xr-x 1 25503 2174    150200 Nov 22 03:58 libcudnn.so.8.7.0*
root@hk-MZ32-AR0-00:~# ll cudnn-linux-x86_64-8.7.0.84_cuda11-archive/include/
total 448
drwxr-xr-x 2 25503 2174  4096 Nov 22 04:14 ./
drwxr-xr-x 4 25503 2174  4096 Nov 22 04:14 ../
-rw-r--r-- 1 25503 2174 29025 Nov 22 03:58 cudnn_adv_infer.h
-rw-r--r-- 1 25503 2174 29025 Nov 22 03:58 cudnn_adv_infer_v8.h
-rw-r--r-- 1 25503 2174 27700 Nov 22 03:58 cudnn_adv_train.h
-rw-r--r-- 1 25503 2174 27700 Nov 22 03:58 cudnn_adv_train_v8.h
-rw-r--r-- 1 25503 2174 24727 Nov 22 03:58 cudnn_backend.h
-rw-r--r-- 1 25503 2174 24727 Nov 22 03:58 cudnn_backend_v8.h
-rw-r--r-- 1 25503 2174 29083 Nov 22 03:58 cudnn_cnn_infer.h
-rw-r--r-- 1 25503 2174 29083 Nov 22 03:58 cudnn_cnn_infer_v8.h
-rw-r--r-- 1 25503 2174 10217 Nov 22 03:58 cudnn_cnn_train.h
-rw-r--r-- 1 25503 2174 10217 Nov 22 03:58 cudnn_cnn_train_v8.h
-rw-r--r-- 1 25503 2174  2968 Nov 22 03:58 cudnn.h
-rw-r--r-- 1 25503 2174 49631 Nov 22 03:58 cudnn_ops_infer.h
-rw-r--r-- 1 25503 2174 49631 Nov 22 03:58 cudnn_ops_infer_v8.h
-rw-r--r-- 1 25503 2174 25733 Nov 22 03:58 cudnn_ops_train.h
-rw-r--r-- 1 25503 2174 25733 Nov 22 03:58 cudnn_ops_train_v8.h
-rw-r--r-- 1 25503 2174  2968 Nov 22 03:58 cudnn_v8.h
-rw-r--r-- 1 25503 2174  3113 Nov 22 03:58 cudnn_version.h
-rw-r--r-- 1 25503 2174  3113 Nov 22 03:58 cudnn_version_v8.h

root@hk-MZ32-AR0-00:~# cp -P cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/* /usr/local/cuda/lib64/
root@hk-MZ32-AR0-00:~# cp -P cudnn-linux-x86_64-8.7.0.84_cuda11-archive/include/* /usr/local/cuda/include/

root@hk-MZ32-AR0-00:~# ll /usr/local/cuda/lib64/libcudnn*
lrwxrwxrwx 1 root root        23 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_adv_infer.so -> libcudnn_adv_infer.so.8*
lrwxrwxrwx 1 root root        27 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_adv_infer.so.8 -> libcudnn_adv_infer.so.8.7.0*
-rwxr-xr-x 1 root root 130381904 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_adv_infer.so.8.7.0*
-rw-r--r-- 1 root root 132979922 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_adv_infer_static.a
lrwxrwxrwx 1 root root        27 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_adv_infer_static_v8.a -> libcudnn_adv_infer_static.a
lrwxrwxrwx 1 root root        23 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_adv_train.so -> libcudnn_adv_train.so.8*
lrwxrwxrwx 1 root root        27 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_adv_train.so.8 -> libcudnn_adv_train.so.8.7.0*
-rwxr-xr-x 1 root root 121095120 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_adv_train.so.8.7.0*
-rw-r--r-- 1 root root 123566296 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_adv_train_static.a
lrwxrwxrwx 1 root root        27 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_adv_train_static_v8.a -> libcudnn_adv_train_static.a
lrwxrwxrwx 1 root root        23 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_cnn_infer.so -> libcudnn_cnn_infer.so.8*
lrwxrwxrwx 1 root root        27 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_cnn_infer.so.8 -> libcudnn_cnn_infer.so.8.7.0*
-rwxr-xr-x 1 root root 639185544 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_cnn_infer.so.8.7.0*
-rw-r--r-- 1 root root 829548950 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_cnn_infer_static.a
lrwxrwxrwx 1 root root        27 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_cnn_infer_static_v8.a -> libcudnn_cnn_infer_static.a
lrwxrwxrwx 1 root root        23 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_cnn_train.so -> libcudnn_cnn_train.so.8*
lrwxrwxrwx 1 root root        27 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_cnn_train.so.8 -> libcudnn_cnn_train.so.8.7.0*
-rwxr-xr-x 1 root root 102197000 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_cnn_train.so.8.7.0*
-rw-r--r-- 1 root root 153525776 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_cnn_train_static.a
lrwxrwxrwx 1 root root        27 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_cnn_train_static_v8.a -> libcudnn_cnn_train_static.a
lrwxrwxrwx 1 root root        23 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_ops_infer.so -> libcudnn_ops_infer.so.8*
lrwxrwxrwx 1 root root        27 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_ops_infer.so.8 -> libcudnn_ops_infer.so.8.7.0*
-rwxr-xr-x 1 root root  97489336 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_ops_infer.so.8.7.0*
-rw-r--r-- 1 root root 100636906 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_ops_infer_static.a
lrwxrwxrwx 1 root root        27 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_ops_infer_static_v8.a -> libcudnn_ops_infer_static.a
lrwxrwxrwx 1 root root        23 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_ops_train.so -> libcudnn_ops_train.so.8*
lrwxrwxrwx 1 root root        27 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_ops_train.so.8 -> libcudnn_ops_train.so.8.7.0*
-rwxr-xr-x 1 root root  74703096 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_ops_train.so.8.7.0*
-rw-r--r-- 1 root root  75156862 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_ops_train_static.a
lrwxrwxrwx 1 root root        27 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn_ops_train_static_v8.a -> libcudnn_ops_train_static.a
lrwxrwxrwx 1 root root        13 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn.so -> libcudnn.so.8*
lrwxrwxrwx 1 root root        17 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn.so.8 -> libcudnn.so.8.7.0*
-rwxr-xr-x 1 root root    150200 Feb 10 17:39 /usr/local/cuda/lib64/libcudnn.so.8.7.0*
root@hk-MZ32-AR0-00:~# ll /usr/local/cuda/lib64/libcudnn* | wc -l
33
root@hk-MZ32-AR0-00:~# ll cudnn-linux-x86_64-8.7.0.84_cuda11-archive/
include/ lib/ LICENSE
root@hk-MZ32-AR0-00:~# ll cudnn-linux-x86_64-8.7.0.84_cuda11-archive/lib/* |wc -l
33
root@hk-MZ32-AR0-00:~# ll /usr/local/cuda/include/cudn*
-rw-r--r-- 1 root root 29025 Feb 10 17:39 /usr/local/cuda/include/cudnn_adv_infer.h
-rw-r--r-- 1 root root 29025 Feb 10 17:39 /usr/local/cuda/include/cudnn_adv_infer_v8.h
-rw-r--r-- 1 root root 27700 Feb 10 17:39 /usr/local/cuda/include/cudnn_adv_train.h
-rw-r--r-- 1 root root 27700 Feb 10 17:39 /usr/local/cuda/include/cudnn_adv_train_v8.h
-rw-r--r-- 1 root root 24727 Feb 10 17:39 /usr/local/cuda/include/cudnn_backend.h
-rw-r--r-- 1 root root 24727 Feb 10 17:39 /usr/local/cuda/include/cudnn_backend_v8.h
-rw-r--r-- 1 root root 29083 Feb 10 17:39 /usr/local/cuda/include/cudnn_cnn_infer.h
-rw-r--r-- 1 root root 29083 Feb 10 17:39 /usr/local/cuda/include/cudnn_cnn_infer_v8.h
-rw-r--r-- 1 root root 10217 Feb 10 17:39 /usr/local/cuda/include/cudnn_cnn_train.h
-rw-r--r-- 1 root root 10217 Feb 10 17:39 /usr/local/cuda/include/cudnn_cnn_train_v8.h
-rw-r--r-- 1 root root  2968 Feb 10 17:39 /usr/local/cuda/include/cudnn.h
-rw-r--r-- 1 root root 49631 Feb 10 17:39 /usr/local/cuda/include/cudnn_ops_infer.h
-rw-r--r-- 1 root root 49631 Feb 10 17:39 /usr/local/cuda/include/cudnn_ops_infer_v8.h
-rw-r--r-- 1 root root 25733 Feb 10 17:39 /usr/local/cuda/include/cudnn_ops_train.h
-rw-r--r-- 1 root root 25733 Feb 10 17:39 /usr/local/cuda/include/cudnn_ops_train_v8.h
-rw-r--r-- 1 root root  2968 Feb 10 17:39 /usr/local/cuda/include/cudnn_v8.h
-rw-r--r-- 1 root root  3113 Feb 10 17:39 /usr/local/cuda/include/cudnn_version.h
-rw-r--r-- 1 root root  3113 Feb 10 17:39 /usr/local/cuda/include/cudnn_version_v8.h
root@hk-MZ32-AR0-00:~# ll /usr/local/cuda/include/cudn* |wc -l
18
root@hk-MZ32-AR0-00:~# ll cudnn-linux-x86_64-8.7.0.84_cuda11-archive/include/* | wc -l
18
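Besides comparing file counts, the copied headers can confirm which cuDNN version is now in place. In cuDNN 8 the version macros CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL are defined in cudnn_version.h. The sketch below reads them with awk; cudnn_version is a hypothetical helper name, and the path in the usage comment assumes the copy destination used above.

```shell
# Hypothetical helper: print MAJOR.MINOR.PATCHLEVEL from the cudnn_version.h
# file given as $1, using its "#define CUDNN_*" lines.
cudnn_version() {
  awk '
    $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
    $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
    $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
    END { print major "." minor "." patch }
  ' "$1"
}

# Usage on the server (path assumed from the cp commands above):
#   cudnn_version /usr/local/cuda/include/cudnn_version.h
```

For the archive installed in this walkthrough the expected result is 8.7.0, matching the libcudnn.so.8.7.0 file name.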

4. Installing Docker

root@hk-MZ32-AR0-00:~# curl -fsSL /docker-ce/linux/ubuntu/gpg | sudo apt-key add -
root@hk-MZ32-AR0-00:~# add-apt-repository "deb [arch=amd64] /docker-ce/linux/ubuntu $(lsb_release -cs) stable"
root@hk-MZ32-AR0-00:~# apt-get -y install docker-ce
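Before moving on to nvidia-docker2, it can help to confirm that the tools installed so far are actually on the PATH. A minimal preflight sketch; require_cmd is a hypothetical helper, and the command list in the usage comment is only a suggestion based on the steps in this article.

```shell
# Hypothetical helper: succeed only if every named command is found on PATH;
# report the first missing one on stderr.
require_cmd() {
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing command: $cmd" >&2
      return 1
    fi
  done
}

# Usage on the server:
#   require_cmd nvidia-smi nvcc docker && echo "driver, CUDA and Docker are present"
```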

5. Installing nvidia-docker2

5.1 On Ubuntu

root@hk-MZ32-AR0-00:~# curl -s -L https://nvidia.github.io/nvidia-docker/$(. /etc/os-release;echo $ID$VERSION_ID)/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
deb https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/$(ARCH) /
#deb https://nvidia.github.io/libnvidia-container/experimental/ubuntu18.04/$(ARCH) /
deb https://nvidia.github.io/nvidia-container-runtime/stable/ubuntu18.04/$(ARCH) /
#deb https://nvidia.github.io/nvidia-container-runtime/experimental/ubuntu18.04/$(ARCH) /
deb https://nvidia.github.io/nvidia-docker/ubuntu18.04/$(ARCH) /
root@hk-MZ32-AR0-00:~# sed -i 's/18.04/22.04/g' /etc/apt/sources.list.d/nvidia-docker.list
root@hk-MZ32-AR0-00:~# apt-get update
Hit:1 /ubuntu bionic InRelease
Hit:2 /docker-ce/linux/ubuntu focal InRelease
Get:3 /ubuntu bionic-security InRelease [88.7 kB]
Hit:4 https://mirrors.tuna./ubuntu bionic InRelease
Get:5 https://mirrors.tuna./ubuntu bionic-updates InRelease [88.7 kB]
Get:6 /ubuntu bionic-updates InRelease [88.7 kB]
Get:7 https://mirrors.tuna./ubuntu bionic-backports InRelease [83.3 kB]
Get:8 https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64 InRelease [1,484 B]
Hit:9 /ubuntu/18.04/prod bionic InRelease
Get:10 https://mirrors.tuna./ubuntu bionic-security InRelease [88.7 kB]
Get:11 /ubuntu bionic-proposed InRelease [242 kB]
Get:12 https://mirrors.tuna./ubuntu bionic-proposed InRelease [242 kB]
Hit:13 /graphics-drivers/ppa/ubuntu focal InRelease
Hit:14 /deb stable InRelease
Get:15 https://mirrors.tuna./ubuntu bionic-updates/main i386 Packages [1,604 kB]
Get:16 /ubuntu bionic-backports InRelease [83.3 kB]
Get:17 https://mirrors.tuna./ubuntu bionic-updates/main amd64 Packages [2,909 kB]
Get:18 /ubuntu bionic-security/main amd64 DEP-11 Metadata [76.8 kB]
Get:19 https://mirrors.tuna./ubuntu bionic-updates/main amd64 DEP-11 Metadata [297 kB]
Get:20 https://mirrors.tuna./ubuntu bionic-updates/universe amd64 DEP-11 Metadata [302 kB]
Get:21 https://mirrors.tuna./ubuntu bionic-updates/multiverse amd64 DEP-11 Metadata [2,468 B]
Get:22 https://mirrors.tuna./ubuntu bionic-backports/main amd64 DEP-11 Metadata [8,108 B]
Get:23 https://mirrors.tuna./ubuntu bionic-backports/universe amd64 DEP-11 Metadata [10.0 kB]
Get:24 https://nvidia.github.io/libnvidia-container/stable/ubuntu22.04/amd64 InRelease [1,484 B]
Get:25 /ubuntu bionic-security/universe amd64 DEP-11 Metadata [61.0 kB]
Get:26 /ubuntu bionic-security/multiverse amd64 DEP-11 Metadata [2,464 B]
Get:27 /ubuntu bionic-updates/main amd64 Packages [2,909 kB]
Get:28 https://mirrors.tuna./ubuntu bionic-security/main amd64 DEP-11 Metadata [76.8 kB]
Get:29 https://mirrors.tuna./ubuntu bionic-security/universe amd64 DEP-11 Metadata [61.1 kB]
Get:30 https://mirrors.tuna./ubuntu bionic-security/multiverse amd64 DEP-11 Metadata [2,464 B]
Get:31 https://nvidia.github.io/nvidia-container-runtime/stable/ubuntu22.04/amd64 InRelease [1,481 B]
Get:32 https://mirrors.tuna./ubuntu bionic-proposed/main Sources [81.3 kB]
Get:33 https://mirrors.tuna./ubuntu bionic-proposed/main Translation-en [38.8 kB]
Get:34 https://nvidia.github.io/nvidia-docker/ubuntu22.04/amd64 InRelease [1,474 B]
Get:35 https://mirrors.tuna./ubuntu bionic-proposed/main amd64 DEP-11 Metadata [6,552 B]
Get:36 https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64 Packages [22.3 kB]
Get:37 https://nvidia.github.io/libnvidia-container/stable/ubuntu22.04/amd64 Packages [22.3 kB]
Get:38 https://nvidia.github.io/nvidia-container-runtime/stable/ubuntu22.04/amd64 Packages [7,416 B]
Get:39 https://nvidia.github.io/nvidia-docker/ubuntu22.04/amd64 Packages [4,488 B]
Get:40 /ubuntu bionic-updates/main i386 Packages [1,604 kB]
Get:41 /ubuntu bionic-updates/main amd64 DEP-11 Metadata [297 kB]
Get:42 /ubuntu bionic-updates/universe amd64 DEP-11 Metadata [302 kB]
Get:43 /ubuntu bionic-updates/multiverse amd64 DEP-11 Metadata [2,468 B]
Get:44 /ubuntu bionic-proposed/main Sources [81.3 kB]
Get:45 /ubuntu bionic-proposed/main Translation-en [38.8 kB]
Get:46 /ubuntu bionic-proposed/main amd64 DEP-11 Metadata [6,516 B]
Get:47 /ubuntu bionic-backports/main amd64 DEP-11 Metadata [8,092 B]
Get:48 /ubuntu bionic-backports/universe amd64 DEP-11 Metadata [10.1 kB]
Fetched 11.9 MB in 11s (1,115 kB/s)
Reading package lists... Done
root@test:/etc/apt/sources.list.d#
root@test:/etc/apt/sources.list.d# apt-get install nvidia-docker2
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages were automatically installed and are no longer required:
  libevent-2.1-7 libnatpmp1 libxvmc1 transmission-common
Use 'apt autoremove' to remove them.
The following additional packages will be installed:
  libnvidia-container-tools libnvidia-container1 nvidia-container-toolkit nvidia-container-toolkit-base
The following NEW packages will be installed:
  libnvidia-container-tools libnvidia-container1 nvidia-container-toolkit nvidia-container-toolkit-base nvidia-docker2
0 upgraded, 5 newly installed, 0 to remove and 80 not upgraded.
Need to get 3,773 kB of archives.
After this operation, 14.6 MB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64 libnvidia-container1 1.12.0-1 [927 kB]
Get:2 https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64 libnvidia-container-tools 1.12.0-1 [24.5 kB]
Get:3 https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64 nvidia-container-toolkit-base 1.12.0-1 [2,066 kB]
Get:4 https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64 nvidia-container-toolkit 1.12.0-1 [750 kB]
Get:5 https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/amd64 nvidia-docker2 2.12.0-1 [5,544 B]
Fetched 3,773 kB in 2min 13s (28.3 kB/s)
Selecting previously unselected package libnvidia-container1:amd64.
(Reading database ... 74 files and directories currently installed.)
Preparing to unpack .../libnvidia-container1_1.12.0-1_amd64.deb ...
Unpacking libnvidia-container1:amd64 (1.12.0-1) ...
Selecting previously unselected package libnvidia-container-tools.
Preparing to unpack .../libnvidia-container-tools_1.12.0-1_amd64.deb ...
Unpacking libnvidia-container-tools (1.12.0-1) ...
Selecting previously unselected package nvidia-container-toolkit-base.
Preparing to unpack .../nvidia-container-toolkit-base_1.12.0-1_amd64.deb ...
Unpacking nvidia-container-toolkit-base (1.12.0-1) ...
Selecting previously unselected package nvidia-container-toolkit.
Preparing to unpack .../nvidia-container-toolkit_1.12.0-1_amd64.deb ...
Unpacking nvidia-container-toolkit (1.12.0-1) ...
Selecting previously unselected package nvidia-docker2.
Preparing to unpack .../nvidia-docker2_2.12.0-1_all.deb ...
Unpacking nvidia-docker2 (2.12.0-1) ...
Setting up nvidia-container-toolkit-base (1.12.0-1) ...
Setting up libnvidia-container1:amd64 (1.12.0-1) ...
Setting up libnvidia-container-tools (1.12.0-1) ...
Setting up nvidia-container-toolkit (1.12.0-1) ...
Setting up nvidia-docker2 (2.12.0-1) ...
Processing triggers for libc-bin (2.31-0ubuntu9.7) ...
root@hk-MZ32-AR0-00:~# systemctl restart docker
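The repository URL in the first command is built from the expression $(. /etc/os-release;echo $ID$VERSION_ID), which sources the OS description file and joins two of its fields. The sketch below isolates that step so the value can be inspected before it is put into a URL; distribution_of is a hypothetical helper, and the optional file argument exists only to make it easy to try on a sample file.

```shell
# Hypothetical helper: print ID followed by VERSION_ID from an os-release
# style file ($1, default /etc/os-release). The subshell keeps the sourced
# variables out of the calling shell.
distribution_of() {
  (
    . "${1:-/etc/os-release}" || exit 1
    printf '%s%s\n' "$ID" "$VERSION_ID"
  )
}

# Usage on the server:
#   distribution_of            # e.g. ubuntu18.04 on the machine above
```

Checking this value first shows which repository path the curl command will request, which matters here because the list file was afterwards edited by hand with sed.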

5.2 On CentOS

[root@bj ~]# sudo yum install -y nvidia-docker2
Loaded plugins: fastestmirror, product-id, search-disabled-repos, subscription-manager
This system is not registered with an entitlement server. You can use subscription-manager to register.
Loading mirror speeds from cached hostfile
epel/x86_64/metalink                                    | 6.2 kB 00:00:00
 * base:
 * epel: mirrors.
 * extras: mirrors.
 * updates: mirrors.
base                                                    | 3.6 kB 00:00:00
docker-ce-stable                                        | 3.5 kB 00:00:00
extras                                                  | 2.9 kB 00:00:00
libnvidia-container/x86_64/signature                    | 833 B  00:00:00
Retrieving key from https://nvidia.github.io/libnvidia-container/gpgkey
Importing GPG key 0xF796ECB0:
 Userid     : "NVIDIA CORPORATION (Open Source Projects) <cudatools@>"
 Fingerprint: c95b 321b 61e8 8c18 09c4 f759 ddca e044 f796 ecb0
 From       : https://nvidia.github.io/libnvidia-container/gpgkey
libnvidia-container/x86_64/signature                    | 2.1 kB 00:00:00 !!!
nvidia-container-runtime/x86_64/signature               | 833 B  00:00:00
Retrieving key from https://nvidia.github.io/nvidia-container-runtime/gpgkey
Importing GPG key 0xF796ECB0:
 Userid     : "NVIDIA CORPORATION (Open Source Projects) <cudatools@>"
 Fingerprint: c95b 321b 61e8 8c18 09c4 f759 ddca e044 f796 ecb0
 From       : https://nvidia.github.io/nvidia-container-runtime/gpgkey
nvidia-container-runtime/x86_64/signature               | 2.1 kB 00:00:00 !!!
nvidia-docker/x86_64/signature                          | 833 B  00:00:00
Retrieving key from https://nvidia.github.io/nvidia-docker/gpgkey
Importing GPG key 0xF796ECB0:
 Userid     : "NVIDIA CORPORATION (Open Source Projects) <cudatools@>"
 Fingerprint: c95b 321b 61e8 8c18 09c4 f759 ddca e044 f796 ecb0
 From       : https://nvidia.github.io/nvidia-docker/gpgkey
nvidia-docker/x86_64/signature                          | 2.1 kB 00:00:00 !!!
teamviewer/x86_64/signature                             | 867 B  00:00:00
teamviewer/x86_64/signature                             | 2.5 kB 00:00:00 !!!
updates                                                 | 2.9 kB 00:00:00
(1/3): nvidia-container-runtime/x86_64/primary          | 11 kB  00:00:01
(2/3): nvidia-docker/x86_64/primary                     | 8.0 kB 00:00:01
(3/3): libnvidia-container/x86_64/primary               | 27 kB  00:00:03
libnvidia-container                                     171/171
nvidia-container-runtime                                71/71
nvidia-docker                                           54/54
Resolving Dependencies
--> Running transaction check
---> Package nvidia-docker2.noarch 0:2.11.0-1 will be installed
--> Processing Dependency: nvidia-container-toolkit >= 1.10.0-1 for package: nvidia-docker2-2.11.0-1.noarch
--> Running transaction check
---> Package nvidia-container-toolkit.x86_64 0:1.11.0-1 will be installed
--> Processing Dependency: nvidia-container-toolkit-base = 1.11.0-1 for package: nvidia-container-toolkit-1.11.0-1.x86_64
--> Processing Dependency: libnvidia-container-tools < 2.0.0 for package: nvidia-container-toolkit-1.11.0-1.x86_64
--> Processing Dependency: libnvidia-container-tools >= 1.11.0-1 for package: nvidia-container-toolkit-1.11.0-1.x86_64
--> Running transaction check
---> Package libnvidia-container-tools.x86_64 0:1.11.0-1 will be installed
--> Processing Dependency: libnvidia-container1(x86-64) >= 1.11.0-1 for package: libnvidia-container-tools-1.11.0-1.x86_64
--> Processing Dependency: libnvidia-container.so.1(NVC_1.0)(64bit) for package: libnvidia-container-tools-1.11.0-1.x86_64
--> Processing Dependency: libnvidia-container.so.1()(64bit) for package: libnvidia-container-tools-1.11.0-1.x86_64
---> Package nvidia-container-toolkit-base.x86_64 0:1.11.0-1 will be installed
--> Running transaction check
---> Package libnvidia-container1.x86_64 0:1.11.0-1 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

====================================================================================================
 Package                          Arch      Version     Repository             Size
====================================================================================================
Installing:
 nvidia-docker2                   noarch    2.11.0-1    libnvidia-container    8.7 k
Installing for dependencies:
 libnvidia-container-tools        x86_64    1.11.0-1    libnvidia-container     50 k
 libnvidia-container1             x86_64    1.11.0-1    libnvidia-container    1.0 M
 nvidia-container-toolkit         x86_64    1.11.0-1    libnvidia-container    780 k
 nvidia-container-toolkit-base    x86_64    1.11.0-1    libnvidia-container    2.5 M

Transaction Summary
====================================================================================================
Install  1 Package (+4 Dependent packages)

Total download size: 4.3 M
Installed size: 12 M
Downloading packages:
(1/5): libnvidia-container-tools-1.11.0-1.x86_64.rpm        | 50 kB  00:00:01
(2/5): libnvidia-container1-1.11.0-1.x86_64.rpm             | 1.0 MB 00:00:03
(3/5): nvidia-container-toolkit-1.11.0-1.x86_64.rpm         | 780 kB 00:00:03
(4/5): nvidia-docker2-2.11.0-1.noarch.rpm                   | 8.7 kB 00:00:00
(5/5): nvidia-container-toolkit-base-1.11.0-1.x86_64.rpm    | 2.5 MB 00:00:43
----------------------------------------------------------------------------------------------------
Total                                            94 kB/s | 4.3 MB 00:00:46
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : nvidia-container-toolkit-base-1.11.0-1.x86_64    1/5
  Installing : libnvidia-container1-1.11.0-1.x86_64             2/5
  Installing : libnvidia-container-tools-1.11.0-1.x86_64        3/5
  Installing : nvidia-container-toolkit-1.11.0-1.x86_64         4/5
  Installing : nvidia-docker2-2.11.0-1.noarch                   5/5
  Verifying  : libnvidia-container1-1.11.0-1.x86_64             1/5
  Verifying  : nvidia-container-toolkit-base-1.11.0-1.x86_64    2/5
  Verifying  : nvidia-container-toolkit-1.11.0-1.x86_64         3/5
  Verifying  : libnvidia-container-tools-1.11.0-1.x86_64        4/5
  Verifying  : nvidia-docker2-2.11.0-1.noarch                   5/5

Installed:
  nvidia-docker2.noarch 0:2.11.0-1

Dependency Installed:
  libnvidia-container-tools.x86_64 0:1.11.0-1
  libnvidia-container1.x86_64 0:1.11.0-1
  nvidia-container-toolkit.x86_64 0:1.11.0-1
  nvidia-container-toolkit-base.x86_64 0:1.11.0-1

Complete!

On CentOS, installing nvidia-docker2 with yum already pulls in nvidia-container-toolkit. If using the GPU inside a container still fails with an error, update nvidia-container-toolkit:

# Configure the yum repository: nvidia-container-toolkit.repo
[root@bj ~]# distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
> && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.repo | tee /etc/yum.repos.d/nvidia-container-toolkit.repo
[libnvidia-container]
name=libnvidia-container
baseurl=https://nvidia.github.io/libnvidia-container/stable/centos7/$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=1
gpgkey=https://nvidia.github.io/libnvidia-container/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
[libnvidia-container-experimental]
name=libnvidia-container-experimental
baseurl=https://nvidia.github.io/libnvidia-container/experimental/centos7/$basearch
repo_gpgcheck=1
gpgcheck=0
enabled=0
gpgkey=https://nvidia.github.io/libnvidia-container/gpgkey
sslverify=1
sslcacert=/etc/pki/tls/certs/ca-bundle.crt
[root@bj ~]# yum install -y nvidia-container-toolkit
Loaded plugins: fastestmirror, product-id, search-disabled-repos, subscription-manager
This system is not registered with an entitlement server. You can use subscription-manager to register.
Repository libnvidia-container is listed more than once in the configuration
Repository libnvidia-container-experimental is listed more than once in the configuration
Loading mirror speeds from cached hostfile
 * base: mirrors.
 * epel: mirrors.
 * extras: mirrors.
 * updates: mirrors.
Resolving Dependencies
--> Running transaction check
---> Package nvidia-container-toolkit.x86_64 0:1.11.0-1 will be updated
---> Package nvidia-container-toolkit.x86_64 0:1.12.0-0.1.rc.3 will be an update
--> Processing Dependency: nvidia-container-toolkit-base = 1.12.0-0.1.rc.3 for package: nvidia-container-toolkit-1.12.0-0.1.rc.3.x86_64
--> Processing Dependency: libnvidia-container-tools >= 1.12.0-0.1.rc.3 for package: nvidia-container-toolkit-1.12.0-0.1.rc.3.x86_64
--> Running transaction check
---> Package libnvidia-container-tools.x86_64 0:1.11.0-1 will be updated
---> Package libnvidia-container-tools.x86_64 0:1.12.0-0.1.rc.3 will be an update
--> Processing Dependency: libnvidia-container1(x86-64) >= 1.12.0-0.1.rc.3 for package: libnvidia-container-tools-1.12.0-0.1.rc.3.x86_64
---> Package nvidia-container-toolkit-base.x86_64 0:1.11.0-1 will be updated
---> Package nvidia-container-toolkit-base.x86_64 0:1.12.0-0.1.rc.3 will be an update
--> Running transaction check
---> Package libnvidia-container1.x86_64 0:1.11.0-1 will be updated
---> Package libnvidia-container1.x86_64 0:1.12.0-0.1.rc.3 will be an update
--> Finished Dependency Resolution
Dependencies Resolved
================================================================================
 Package                        Arch    Version          Repository                        Size
================================================================================
Updating:
 nvidia-container-toolkit       x86_64  1.12.0-0.1.rc.3  libnvidia-container-experimental  797 k
Updating for dependencies:
 libnvidia-container-tools      x86_64  1.12.0-0.1.rc.3  libnvidia-container-experimental   50 k
 libnvidia-container1           x86_64  1.12.0-0.1.rc.3  libnvidia-container-experimental  1.0 M
 nvidia-container-toolkit-base  x86_64  1.12.0-0.1.rc.3  libnvidia-container-experimental  3.4 M
Transaction Summary
================================================================================
Upgrade  1 Package (+3 Dependent packages)
Total download size: 5.2 M
Downloading packages:
Delta RPMs disabled because /usr/bin/applydeltarpm not installed.
(1/4): libnvidia-container-tools-1.12.0-0.1.rc.3.x86_64.rpm       |  50 kB  00:00:00
(2/4): nvidia-container-toolkit-1.12.0-0.1.rc.3.x86_64.rpm        | 797 kB  00:00:00
(3/4): libnvidia-container1-1.12.0-0.1.rc.3.x86_64.rpm            | 1.0 MB  00:00:02
(4/4): nvidia-container-toolkit-base-1.12.0-0.1.rc.3.x86_64.rpm   | 3.4 MB  00:00:00
--------------------------------------------------------------------------------
Total                                            2.0 MB/s | 5.2 MB  00:00:02
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Updating   : nvidia-container-toolkit-base-1.12.0-0.1.rc.3.x86_64   1/8
  Updating   : libnvidia-container1-1.12.0-0.1.rc.3.x86_64            2/8
  Updating   : libnvidia-container-tools-1.12.0-0.1.rc.3.x86_64       3/8
  Updating   : nvidia-container-toolkit-1.12.0-0.1.rc.3.x86_64        4/8
  Cleanup    : nvidia-container-toolkit-1.11.0-1.x86_64               5/8
  Cleanup    : libnvidia-container-tools-1.11.0-1.x86_64              6/8
  Cleanup    : libnvidia-container1-1.11.0-1.x86_64                   7/8
  Cleanup    : nvidia-container-toolkit-base-1.11.0-1.x86_64          8/8
  Verifying  : libnvidia-container1-1.12.0-0.1.rc.3.x86_64            1/8
  Verifying  : nvidia-container-toolkit-base-1.12.0-0.1.rc.3.x86_64   2/8
  Verifying  : libnvidia-container-tools-1.12.0-0.1.rc.3.x86_64       3/8
  Verifying  : nvidia-container-toolkit-1.12.0-0.1.rc.3.x86_64        4/8
  Verifying  : libnvidia-container-tools-1.11.0-1.x86_64              5/8
  Verifying  : nvidia-container-toolkit-base-1.11.0-1.x86_64          6/8
  Verifying  : nvidia-container-toolkit-1.11.0-1.x86_64               7/8
  Verifying  : libnvidia-container1-1.11.0-1.x86_64                   8/8
Updated:
  nvidia-container-toolkit.x86_64 0:1.12.0-0.1.rc.3
Dependency Updated:
  libnvidia-container-tools.x86_64 0:1.12.0-0.1.rc.3
  libnvidia-container1.x86_64 0:1.12.0-0.1.rc.3
  nvidia-container-toolkit-base.x86_64 0:1.12.0-0.1.rc.3
Complete!
[root@bj ~]# systemctl restart docker
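
After Docker restarts, it can help to confirm which package versions are actually installed and whether Docker lists an `nvidia` runtime. The following is only a minimal sketch and not part of the original procedure: the helper name `has_nvidia_runtime` is hypothetical, and the check assumes that `docker info` prints a `Runtimes:` line that includes `nvidia` once nvidia-docker2 has registered its runtime.

```shell
#!/usr/bin/env bash
# Sketch: post-update sanity check on CentOS (helper name is hypothetical).

# Reads `docker info` text on stdin; succeeds if the "Runtimes:" line
# mentions the nvidia runtime as a whole word.
has_nvidia_runtime() {
    grep -i '^ *Runtimes:' | grep -qw 'nvidia'
}

# Only query the real system when both tools are present.
if command -v rpm >/dev/null 2>&1 && command -v docker >/dev/null 2>&1; then
    # Show the installed versions of the packages touched by the update.
    rpm -q nvidia-container-toolkit libnvidia-container-tools libnvidia-container1

    if docker info 2>/dev/null | has_nvidia_runtime; then
        echo "nvidia runtime is registered with Docker"
    else
        echo "nvidia runtime not found in 'docker info'" >&2
    fi
fi
```

If the runtime is missing, the `docker run --gpus all` test in the next section is likely to fail as well, so this check narrows down where the problem is before a container is started.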

6. Testing GPU access from a Docker container

root@hk-MZ32-AR0-00:~# docker run --rm --gpus all nvidia/cuda:10.0-base nvidia-smi
Sat Feb 11 07:13:48
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.106.00   Driver Version: 460.106.00   CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:04:00.0 Off |                    0 |
| N/A   47C    P0    27W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla T4            Off  | 00000000:06:00.0 Off |                    0 |
| N/A   43C    P0    28W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  Tesla T4            Off  | 00000000:0D:00.0 Off |                    0 |
| N/A   49C    P0    28W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  Tesla T4            Off  | 00000000:0F:00.0 Off |                    0 |
| N/A   45C    P0    26W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   4  Tesla T4            Off  | 00000000:17:00.0 Off |                    0 |
| N/A   48C    P0    27W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   5  Tesla T4            Off  | 00000000:19:00.0 Off |                    0 |
| N/A   49C    P0    28W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   6  Tesla T4            Off  | 00000000:21:00.0 Off |                    0 |
| N/A   45C    P0    26W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   7  Tesla T4            Off  | 00000000:23:00.0 Off |                    0 |
| N/A   45C    P0    28W /  70W |      0MiB / 15109MiB |      5%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
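
The visual check above can also be scripted by comparing the number of GPUs the host reports with the number visible inside the container. This is a minimal sketch under stated assumptions rather than part of the original procedure: the function name `count_gpus` is hypothetical, the image tag is the one used in the test above, and `nvidia-smi -L` is assumed to print one `GPU <index>: ...` line per device.

```shell
#!/usr/bin/env bash
# Sketch: compare host and container GPU counts (function name is hypothetical).

# Reads `nvidia-smi -L` output on stdin and prints the number of GPU lines.
# `|| true` keeps the exit status at 0 when grep counts zero matches.
count_gpus() {
    grep -c '^GPU [0-9][0-9]*:' || true
}

# Only run the real comparison when both tools are available.
if command -v nvidia-smi >/dev/null 2>&1 && command -v docker >/dev/null 2>&1; then
    host_count=$(nvidia-smi -L | count_gpus)
    container_count=$(docker run --rm --gpus all nvidia/cuda:10.0-base nvidia-smi -L | count_gpus)

    if [ "$host_count" -eq "$container_count" ]; then
        echo "OK: container sees all $host_count GPU(s)"
    else
        echo "Mismatch: host=$host_count container=$container_count" >&2
    fi
fi
```

On the server shown above, both counts would be expected to be 8 if every Tesla T4 is passed through to the container.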
