
Installing CUDA, cuDNN, PyCharm, and tensorflow-gpu on Ubuntu 18.04

Date: 2022-12-13 20:23:33

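Preface

A Linux guest inside a virtual machine apparently cannot install the NVIDIA GPU driver, so TensorFlow on a virtualized Ubuntu system can only use the CPU. It looks like I will have to set up a dual boot on a physical machine or use a server instead.

The reason is that the NVIDIA driver installation fails. Running

sudo sh cuda_11.2.2_460.32.03_linux.run

produces this error:

Installation failed. See log at /var/log/cuda-installer.log for details.

which indicates that the NVIDIA driver installation failed.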

I then tried installing the NVIDIA driver on its own, downloaded from the official site (/Download/index.aspx?lang=cn):

su root
sh NVIDIA-Linux-x86_64-515.65.01.run

This reports:

WARNING: You do not appear to have an NVIDIA GPU supported by the 515.65.01 NVIDIA Linux graphics driver installed in this system. For further details, please see the appendix SUPPORTED NVIDIA GRAPHICS CHIPS in the README available on the Linux driver download page at .

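After looking into the cause, I found that a Linux guest in a virtual machine apparently cannot install the NVIDIA driver. Running

ubuntu-drivers devices

also lists only the VMware driver.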
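One way to confirm this before attempting a driver install is to look at what the guest actually sees on the PCI bus. The sketch below is a minimal helper, assuming lspci (from the pciutils package) is available. The function name has_nvidia_gpu is hypothetical and not part of any NVIDIA tool.

```shell
# Succeed if a PCI device listing (as printed by `lspci`) contains an
# NVIDIA display device. has_nvidia_gpu is a hypothetical helper name.
# $1: text of a PCI device listing
has_nvidia_gpu() {
    printf '%s\n' "$1" | grep -Ei 'vga|3d|display' | grep -qi 'nvidia'
}

# Usage on a real system:
#   if has_nvidia_gpu "$(lspci)"; then
#       echo "NVIDIA GPU visible"
#   else
#       echo "no NVIDIA GPU visible; the driver installer will refuse to run"
#   fi
```

In a VMware guest the listing typically shows only the virtual display adapter, which matches the installer warning quoted above.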

Still, this record of a failed installation on VMware Ubuntu 18.04 can serve as my notes for future reference.

My installation process follows.

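I. Update the package sources (sometimes this speeds up downloads, sometimes it makes no difference; you can skip it or add the mirrors up front)

For convenience, install vim:

sudo apt-get install vim

If this reports an error, run:

sudo apt-get update
sudo apt-get install vim

If that still fails, make sure no other apt or dpkg process is running, then remove the lock file and retry:

sudo rm /var/lib/dpkg/lock
sudo apt-get install vim

Next: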

sudo vim /etc/apt/sources.list

After opening sources.list, move the cursor to the end, press i to enter insert mode, and add the Tsinghua and Aliyun mirrors:

deb http://mirrors.tuna./ubuntu/ bionic main restricted universe multiverse
deb-src http://mirrors.tuna./ubuntu/ bionic main restricted universe multiverse
deb http://mirrors.tuna./ubuntu/ bionic-updates main restricted universe multiverse
deb-src http://mirrors.tuna./ubuntu/ bionic-updates main restricted universe multiverse
deb http://mirrors.tuna./ubuntu/ bionic-security main restricted universe multiverse
deb-src http://mirrors.tuna./ubuntu/ bionic-security main restricted universe multiverse
deb /ubuntu/ bionic main restricted universe multiverse
deb /ubuntu/ bionic-security main restricted universe multiverse
deb /ubuntu/ bionic-updates main restricted universe multiverse
deb /ubuntu/ bionic-proposed main restricted universe multiverse
deb /ubuntu/ bionic-backports main restricted universe multiverse
deb-src /ubuntu/ bionic main restricted universe multiverse
deb-src /ubuntu/ bionic-security main restricted universe multiverse
deb-src /ubuntu/ bionic-updates main restricted universe multiverse
deb-src /ubuntu/ bionic-proposed main restricted universe multiverse
deb-src /ubuntu/ bionic-backports main restricted universe multiverse

Press Esc, then type :wq and press Enter to save and quit. Then refresh the package index:

sudo apt-get update
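As an alternative to typing the entries by hand, they can be generated. This is only a sketch: gen_bionic_sources is a hypothetical helper, and http://mirror.example/ubuntu/ is a placeholder that must be replaced with your mirror's real address.

```shell
# Print deb/deb-src entries for the main bionic suites.
# gen_bionic_sources is a hypothetical helper name.
# $1: mirror base URL (placeholder in the example below)
gen_bionic_sources() {
    local mirror="$1" suite
    for suite in bionic bionic-updates bionic-security; do
        printf 'deb %s %s main restricted universe multiverse\n' "$mirror" "$suite"
        printf 'deb-src %s %s main restricted universe multiverse\n' "$mirror" "$suite"
    done
}

# Example: back up the current list first, then append the generated entries.
#   sudo cp /etc/apt/sources.list /etc/apt/sources.list.bak
#   gen_bionic_sources "http://mirror.example/ubuntu/" | sudo tee -a /etc/apt/sources.list
#   sudo apt-get update
```

Keeping a .bak copy makes it easy to restore the original list if a mirror turns out to be unreachable.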

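II. Download and install CUDA and cuDNN

Before starting, check which versions are compatible: /install/source

I installed tensorflow-gpu 2.6.0, CUDA 11.2 (which seems to require NVIDIA driver ≥ 460.32.03), cuDNN 8.1, and GCC 7.3.1.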
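The driver requirement can be checked mechanically. The sketch below assumes GNU sort (for its -V version ordering), which Ubuntu ships. version_ge is a hypothetical helper name, and 460.32.03 is the minimum quoted above.

```shell
# Succeed when version $1 is greater than or equal to version $2.
# version_ge is a hypothetical helper; it relies on GNU `sort -V`.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage on a machine where the driver is already running:
#   installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
#   if version_ge "$installed" 460.32.03; then
#       echo "driver $installed is new enough for CUDA 11.2"
#   fi
```

Comparing with sort -V avoids the pitfalls of comparing dotted version strings as plain text or as numbers.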

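1. Download CUDA:

/cuda-toolkit-archive

Find the matching version.

Copy the link into Xunlei (Thunder) to download; it is very fast. When the download finishes, drag the file into the VM's home directory (you can create a new folder for it).

2. Download cuDNN

/rdp/cudnn-archive

Drag it into the VM.

3. Install CUDA

References: "Installing CUDA + cuDNN on Linux" and "Configuring multiple CUDA versions on Ubuntu (10.0, 10.1)"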

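My installation process:

(1) Install CUDA:

First check whether GCC is installed, because the next step may fail without it (the error is shown below):

gcc -v

If it is not installed, install gcc, paying attention to version compatibility:

sudo apt install gcc

gcc -v

This shows the system default, version 7.5.0. The GCC version listed for TensorFlow 2.6.0 in the official table is 7.3.1, which I could not find, so first see whether the next command passes the gcc version check.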

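sudo sh cuda_11.2.2_460.32.03_linux.run

A possible error:

Failed to verify gcc version. See log at /var/log/cuda-installer.log for details.

If there is no error, type accept. If you left Driver selected and the installation fails, start over, press Enter on Driver to deselect it, and install the NVIDIA driver yourself (I could not install it in the VM). Then move the cursor to Install and press Enter.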

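At this point

nvidia-smi

still fails (because the VM has no NVIDIA driver). On a physical machine, install the driver as described in the Preface.

If nvidia-smi reports the following after installation:

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

rebuild the driver's kernel module with dkms:

sudo apt-get install dkms
sudo dkms install -m nvidia -v 515.65.01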

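(2) Add environment variables

vim ~/.bashrc

Move the cursor to the end, press i to enter insert mode, and add:

export PATH="/usr/local/cuda/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda/lib64:$LD_LIBRARY_PATH"

Press Esc to leave insert mode, then type :wq to save and quit. Run the following to apply the updated environment variables:

source ~/.bashrc

Note that the paths above use /cuda rather than /cuda-11.2, because a symbolic link is set up next so that multiple CUDA versions can coexist. Run the commands below to create the symlink, replacing /cuda-11.2 with the name of your own CUDA install directory.

sudo rm -rf /usr/local/cuda # remove the previously created symlink
sudo ln -s /usr/local/cuda-11.2 /usr/local/cuda # create the new symlink

If several CUDA versions are installed, the same two commands switch between them. Finally, nvcc -V should print the CUDA version, which completes this step.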
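The remove-then-link pair can be wrapped in one function. This is a sketch under two assumptions: toolkits live in directories named cuda-<version> under a common prefix, and <prefix>/cuda is either absent or already a symlink. switch_cuda is a hypothetical helper name.

```shell
# Point <prefix>/cuda at <prefix>/cuda-<version>.
# switch_cuda is a hypothetical helper name.
# $1: CUDA version such as 11.2
# $2: install prefix (defaults to /usr/local)
switch_cuda() {
    local version="$1" prefix="${2:-/usr/local}"
    local target="$prefix/cuda-$version"
    if [ ! -d "$target" ]; then
        echo "no such CUDA install: $target" >&2
        return 1
    fi
    # -s: symbolic link, -f: replace an existing link,
    # -n: treat an existing link to a directory as a file, not a directory
    ln -sfn "$target" "$prefix/cuda"
}

# Usage (writing under /usr/local needs root, so run this in a root shell):
#   switch_cuda 11.2
#   nvcc -V
```

Checking that the target directory exists first means a mistyped version number leaves the current link untouched.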

At this point:

@ubuntu:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) - NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0
@ubuntu:~$ ls /usr/src | grep nvidia
nvidia-515.65.01

nvidia-smi should now display successfully.

(3) Install cuDNN

tar -xzvf /home/qmj/cudnnfiles/cudnn-11.2-linux-x64-v8.1.1.33.tgz

Extraction creates a folder named cuda, in the same directory as cuda_11.2.2_460.32.03_linux.run.

sudo cp /home/qmj/CUDAfiles/cuda/include/cudnn*.h /usr/local/cuda/include/
sudo cp /home/qmj/CUDAfiles/cuda/lib64/libcudnn* /usr/local/cuda/lib64/
sudo chmod a+r /usr/local/cuda/include/cudnn*.h
sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
# check the cuDNN version
cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2

Done.
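The grep above prints the raw #define lines. To get a single version string instead, something like the following sketch could be used. It assumes the header contains the CUDNN_MAJOR, CUDNN_MINOR, and CUDNN_PATCHLEVEL defines, as cuDNN 8 headers do. cudnn_version is a hypothetical helper name.

```shell
# Print the cuDNN version as MAJOR.MINOR.PATCHLEVEL from cudnn_version.h.
# cudnn_version is a hypothetical helper name.
# $1: path to the header (defaults to the location used above)
cudnn_version() {
    local header="${1:-/usr/local/cuda/include/cudnn_version.h}"
    awk '$1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
         $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
         $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
         END { print major "." minor "." patch }' "$header"
}

# Usage:
#   cudnn_version
```

Matching on the exact second field keeps the CUDNN_VERSION line, which only references the other macros, from being picked up.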

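III. Install pip

I decided to be lazy: rather than installing a separate Python, I use the system's bundled Python 3.6.

Install pip and its dependencies, then upgrade pip:

sudo apt-get install python3-pip python3-dev
sudo pip3 install --upgrade pip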

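IV. Install PyCharm

Download it and drag it into the Ubuntu home directory: /pycharm/download/#section=linux

Extract it:

tar -xzvf pycharm-community-.2.tar

Then start it for the first time from the folder containing pycharm.sh:

sh pycharm.sh

Afterwards, from the folder containing pycharm.sh, you can run

sh pycharm.sh &

to open PyCharm.

Reference: "Installing PyCharm"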

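V. Install tensorflow-gpu

pip3 install tensorflow-gpu==2.6.0 -i https://pypi.tuna./simple/

If that is too slow, switch to the Aliyun mirror; otherwise skip this command:

pip3 install tensorflow-gpu==2.6.0 -i /pypi/simple/?spm=a2c6h.25603864.0.0.7a345992gApCnw

When creating a project in PyCharm, select Python 3.6 as the interpreter and check the "Inherit global site-packages" option so that all installed packages are available.

My tensorflow-gpu is not running as fast as I would like. I will look into it later.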
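When TensorFlow seems slow, a first thing to rule out is that it is silently running on the CPU. The sketch below wraps a one-line check. It assumes TensorFlow 2.x, where tf.config.list_physical_devices is available. check_tf_gpu is a hypothetical helper name.

```shell
# Ask TensorFlow which GPUs it can see.
# check_tf_gpu is a hypothetical helper name.
# $1: Python interpreter to use (defaults to python3)
# Returns 0 if at least one GPU is listed, 1 if none is (or the import
# fails), and 2 if the interpreter itself is missing.
check_tf_gpu() {
    local py="${1:-python3}"
    if ! command -v "$py" >/dev/null 2>&1; then
        echo "interpreter not found: $py" >&2
        return 2
    fi
    "$py" -c 'import tensorflow as tf; gpus = tf.config.list_physical_devices("GPU"); print(gpus); raise SystemExit(0 if gpus else 1)'
}

# Usage:
#   check_tf_gpu && echo "TensorFlow sees a GPU" || echo "TensorFlow is CPU-only here"
```

In the VM described in the Preface this would be expected to print an empty list, since no NVIDIA driver is loaded there.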

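VI. Install other packages

sudo apt-get install python3-pandas

Change the package name at the end to install something else. Note that the -i mirror option belongs to pip, not apt-get: if a pip3 install is too slow, append -i https://... with a mirror address.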

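Appendix

Installing gcc 7.3.0

/instg-9000-A800_9000_9010/atlastrain_03_0062.html

A C/C++ compiler must be installed first:

sudo apt install gcc g++

Then run the following steps as the root user:

(1) sudo passwd root

Set a password (skip this if one is already set).

su root

Switch to the root user (to leave, type exit and press Enter).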

(2) Download gcc-7.3.0.tar.gz from https://mirrors.tuna./gnu/gcc/gcc-7.3.0/gcc-7.3.0.tar.gz.

Building gcc uses a large amount of temporary space, so first clear the /tmp directory:

sudo rm -rf /tmp/*

Install the dependencies.

(1) On CentOS/BCLinux:

yum install bzip2

(2) On Ubuntu/Debian:

apt-get install bzip2

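Build and install gcc.

Go to the directory containing the gcc-7.3.0.tar.gz source package and extract it:

tar -zxvf gcc-7.3.0.tar.gz

Enter the extracted folder and download the gcc prerequisites:

cd gcc-7.3.0

./contrib/download_prerequisites

If that command fails, download the prerequisites manually into the "gcc-7.3.0/" folder:

wget /pub/gcc/infrastructure/gmp-6.1.0.tar.bz2
wget /pub/gcc/infrastructure/mpfr-3.1.4.tar.bz2
wget /pub/gcc/infrastructure/mpc-1.0.3.tar.gz
wget /pub/gcc/infrastructure/isl-0.16.1.tar.bz2

After downloading them, run the command again:

./contrib/download_prerequisites

If the checksum verification fails, make sure each package was downloaded completely in a single attempt, with no duplicate downloads.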

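Configure, build, and install:

./configure --enable-languages=c,c++ --disable-multilib --with-system-zlib --prefix=/usr/local/gcc7.3.0

make -j15 # Check the CPU count with grep -w processor /proc/cpuinfo|wc -l; 15 is the example value, so set it to suit your machine. (make -j4 took me 1 hour; the errors I hit and their fixes are described below.)

make install

Note:

The "--prefix" option sets the install path for gcc 7.3.0. You may choose it freely, but do not use "/usr/local" or "/usr", because that would conflict with the gcc installed from the system package sources and break the system's original gcc build environment. The example uses "/usr/local/gcc7.3.0".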
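Rather than counting processors by hand, the job count for make can be derived at run time. This sketch assumes nproc from GNU coreutils, which Ubuntu provides. make_jobs is a hypothetical helper name, and the optional cap is an illustrative addition for machines with little memory.

```shell
# Print a job count for `make -j`: the number of available CPUs,
# optionally capped. make_jobs is a hypothetical helper name.
# $1: maximum number of jobs (optional; 0 or omitted means no cap)
make_jobs() {
    local cpus cap="${1:-0}"
    cpus="$(nproc)"
    if [ "$cap" -gt 0 ] && [ "$cpus" -gt "$cap" ]; then
        cpus="$cap"
    fi
    echo "$cpus"
}

# Usage:
#   make -j"$(make_jobs)"      # use every available CPU
#   make -j"$(make_jobs 4)"    # use at most 4 jobs
```

A cap can help in a VM, where each parallel compiler process competes for limited memory.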

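(3) Configure environment variables.

Training runs need the upgraded gcc build environment, so set the environment variable in the training script with:

export LD_LIBRARY_PATH=${install_path}/lib64:${LD_LIBRARY_PATH}

Here ${install_path} is the gcc 7.3.0 install path chosen with --prefix in the configure step (step 4.c of the referenced guide); in this example it is "/usr/local/gcc7.3.0/".

Note:

Set this environment variable only when you actually need the upgraded gcc build environment.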
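Because this export is meant to live in a script that may be sourced more than once, a guarded version can be convenient. The sketch below is one possible approach. prepend_ld_path is a hypothetical helper name, and the path in the usage line is the example prefix from above.

```shell
# Prepend a directory to LD_LIBRARY_PATH unless it is already listed.
# prepend_ld_path is a hypothetical helper name.
# $1: directory to add
prepend_ld_path() {
    local dir="$1"
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$dir:"*)
            ;;  # already present, nothing to do
        *)
            # Add the separating colon only when the variable is non-empty.
            export LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
            ;;
    esac
}

# Usage:
#   prepend_ld_path /usr/local/gcc7.3.0/lib64
```

Skipping the colon when the variable starts out empty avoids a trailing empty entry, which the dynamic linker would treat as the current directory.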

The errors below occurred during make -j4:

1.

root@ubuntu:/home/qmj/gcc-7.3.0# make -j4
Command 'make' not found, but can be installed with:
apt install make
apt install make-guile

Installing make fixes this.

2.

make -j4
make[3]: Leaving directory '/home/qmj/gcc-7.3.0/build-x86_64-pc-linux-gnu/libiberty'
make[2]: Leaving directory '/home/qmj/gcc-7.3.0'
Makefile:25224: recipe for target 'stage1-bubble' failed
make[1]: *** [stage1-bubble] Error 2
make[1]: Leaving directory '/home/qmj/gcc-7.3.0'
Makefile:941: recipe for target 'all' failed
make: *** [all] Error 2
make[2]: Leaving directory '/home/qmj/gcc-7.3.0'
Makefile:25224: recipe for target 'stage1-bubble' failed
make[1]: *** [stage1-bubble] Error 2
make[1]: Leaving directory '/home/qmj/gcc-7.3.0'
Makefile:941: recipe for target 'all' failed
make: *** [all] Error 2
configure: error: C++ compiler missing or inoperational
Makefile:11605: recipe for target 'configure-stage1-libcpp' failed
make[2]: *** [configure-stage1-libcpp] Error 1
make[2]: Leaving directory '/home/qmj/gcc-7.3.0'
Makefile:25224: recipe for target 'stage1-bubble' failed
make[1]: *** [stage1-bubble] Error 2
make[1]: Leaving directory '/home/qmj/gcc-7.3.0'
Makefile:941: recipe for target 'all' failed
make: *** [all] Error 2

Fix: the C++ compiler is missing, so install g++ and rebuild.

exit

Press Enter to leave the root shell.

sudo apt-get install g++
su root
sudo rm -rf /tmp/*
cd gcc-7.3.0
./configure --enable-languages=c,c++ --disable-multilib --with-system-zlib --prefix=/usr/local/gcc7.3.0
make -j4

3.

../.././gcc/lto-compress.c:34:10: fatal error: zlib.h: No such file or directory
#include <zlib.h>
^~~~~~~~
compilation terminated.
Makefile:1099: recipe for target 'lto-compress.o' failed
make[3]: *** [lto-compress.o] Error 1
make[3]: *** Waiting for unfinished jobs....
rm gcc.pod
make[3]: Leaving directory '/home/qmj/gcc-7.3.0/host-x86_64-pc-linux-gnu/gcc'
Makefile:4555: recipe for target 'all-stage1-gcc' failed
make[2]: *** [all-stage1-gcc] Error 2
make[2]: Leaving directory '/home/qmj/gcc-7.3.0'
Makefile:25224: recipe for target 'stage1-bubble' failed
make[1]: *** [stage1-bubble] Error 2
make[1]: Leaving directory '/home/qmj/gcc-7.3.0'
Makefile:941: recipe for target 'all' failed
make: *** [all] Error 2

Fix: the zlib headers are missing, so install zlib1g-dev and rebuild.

exit

Press Enter to leave the root shell.

sudo apt-get install zlib1g-dev
su root
sudo rm -rf /tmp/*
cd gcc-7.3.0
./configure --enable-languages=c,c++ --disable-multilib --with-system-zlib --prefix=/usr/local/gcc7.3.0
make -j4

4.

libtool: link: ranlib .libs/libtsan.a
libtool: link: rm -fr .libs/libtsan.lax
libtool: link: ( cd ".libs" && rm -f "libtsan.la" && ln -s "../libtsan.la" "libtsan.la" )
make[4]: Leaving directory '/home/qmj/gcc-7.3.0/x86_64-pc-linux-gnu/libsanitizer/tsan'
make[4]: Entering directory '/home/qmj/gcc-7.3.0/x86_64-pc-linux-gnu/libsanitizer'
true "AR_FLAGS=rc" "CC_FOR_BUILD=gcc" "CFLAGS=-g -O2" "CXXFLAGS=-g -O2 -D_GNU_SOURCE" "CFLAGS_FOR_BUILD=-g -O2" "CFLAGS_FOR_TARGET=-g -O2" "INSTALL=/usr/bin/install -c" "INSTALL_DATA=/usr/bin/install -c -m 644" "INSTALL_PROGRAM=/usr/bin/install -c" "INSTALL_SCRIPT=/usr/bin/install -c" "JC1FLAGS=" "LDFLAGS=" "LIBCFLAGS=-g -O2" "LIBCFLAGS_FOR_TARGET=-g -O2" "MAKE=make" "MAKEINFO=/home/qmj/gcc-7.3.0/missing makeinfo --split-size=5000000 --split-size=5000000 " "PICFLAG=" "PICFLAG_FOR_TARGET=" "SHELL=/bin/bash" "RUNTESTFLAGS=" "exec_prefix=/usr/local/gcc7.3.0" "infodir=/usr/local/gcc7.3.0/share/info" "libdir=/usr/local/gcc7.3.0/lib" "prefix=/usr/local/gcc7.3.0" "includedir=/usr/local/gcc7.3.0/include" "AR=ar" "AS=/home/qmj/gcc-7.3.0/host-x86_64-pc-linux-gnu/gcc/as" "LD=/home/qmj/gcc-7.3.0/host-x86_64-pc-linux-gnu/gcc/collect-ld" "LIBCFLAGS=-g -O2" "NM=/home/qmj/gcc-7.3.0/host-x86_64-pc-linux-gnu/gcc/nm" "PICFLAG=" "RANLIB=ranlib" "DESTDIR=" DO=all multi-do # make
make[4]: Leaving directory '/home/qmj/gcc-7.3.0/x86_64-pc-linux-gnu/libsanitizer'
make[3]: Leaving directory '/home/qmj/gcc-7.3.0/x86_64-pc-linux-gnu/libsanitizer'
make[2]: Leaving directory '/home/qmj/gcc-7.3.0/x86_64-pc-linux-gnu/libsanitizer'
make[1]: Leaving directory '/home/qmj/gcc-7.3.0'

This time the output ends without an error, so the build appears to have finished.

Still as root, continue with:

make install

The output ends with:

Libraries have been installed in:
/usr/local/gcc7.3.0/lib/../lib64

If you ever happen to want to link against installed libraries
in a given directory, LIBDIR, you must either use libtool, and
specify the full pathname of the library, or use the `-LLIBDIR'
flag during linking and do at least one of the following:
- add LIBDIR to the `LD_LIBRARY_PATH' environment variable
  during execution
- add LIBDIR to the `LD_RUN_PATH' environment variable
  during linking
- use the `-Wl,-rpath -Wl,LIBDIR' linker flag
- have your system administrator add LIBDIR to `/etc/ld.so.conf'

See any operating system documentation about shared libraries for
more information, such as the ld(1) and ld.so(8) manual pages.

make[4]: Nothing to be done for 'install-data-am'.
make[4]: Leaving directory '/home/qmj/gcc-7.3.0/x86_64-pc-linux-gnu/libatomic'
make[3]: Leaving directory '/home/qmj/gcc-7.3.0/x86_64-pc-linux-gnu/libatomic'
make[2]: Leaving directory '/home/qmj/gcc-7.3.0/x86_64-pc-linux-gnu/libatomic'
make[1]: Leaving directory '/home/qmj/gcc-7.3.0'

Done!
