[Linux] Switching Between Multiple CUDA Versions


u013250861 · Published 2023-12-29 02:30:21

CUDA versions required by each TensorFlow and PyTorch release:

PyTorch: pytorch.org/get-started/previous-versions/

I. Install CUDA

1. Check the existing CUDA environment.
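One way to carry out this check is sketched below. The helper name list_cuda_installs is hypothetical, and /usr/local is assumed to be the install root (the runfile default); pass another directory if your toolkits live elsewhere.

```shell
# List CUDA toolkit directories (cuda-X.Y) found under an install root.
# Assumption: toolkits live in /usr/local unless another root is passed.
list_cuda_installs() {
    local root="${1:-/usr/local}"
    local dir
    for dir in "${root}"/cuda-*; do
        # Skip an unexpanded glob and anything that is not a directory.
        [ -d "${dir}" ] || continue
        # Print only the last path component, e.g. cuda-10.0.
        printf '%s\n' "${dir##*/}"
    done
}
```

Run list_cuda_installs to see the versioned directories, nvcc -V to see which toolkit is first on PATH, and ls -l /usr/local/cuda to see where the default symlink points. Note that nvidia-smi reports the driver version and the highest CUDA version that driver supports, not the installed toolkit.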

2. Download the CUDA 10.1 runfile from the official NVIDIA website to the server.

3. Install CUDA 10.1

Run the following command:

	sudo sh cuda_10.1.105_418.39_linux.run

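When the installer starts, select continue.

Then type accept and press Enter.

Because the system already has an NVIDIA GPU driver, you probably do not want the driver bundled with CUDA 10.1. Move the cursor to the Driver option and press the space bar to deselect it.

Move to the Install option, press Enter, and wait for the installation to finish.

4. To leave the existing CUDA environment untouched, do not modify the environment variables at this point. Section III explains how to use the newly installed CUDA 10.1.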

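5. If this is the only CUDA version installed, you can configure the environment variables as follows.

After installation, open the .bashrc file in your home directory:

	vim ~/.bashrc

Append the following lines to the end of the file:

	export PATH=/usr/local/cuda-10.1/bin${PATH:+:${PATH}}
	export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Save and close the file, then reload the environment:

	source ~/.bashrc

Extract cudnn-10.1-linux-x64-v7.6.5.32.tgz:

	tar zxvf ./cudnn-10.1-linux-x64-v7.6.5.32.tgz -C ./

Copy the extracted cuda/include/cudnn.h into /usr/local/cuda/include, and copy everything under cuda/lib64/ into /usr/local/cuda/lib64. Here /usr/local/cuda is normally the symlink the installer creates to the installed toolkit. Then give those files read and execute permissions:

	sudo chmod 755 /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*

Check that the installation succeeded:

	nvcc -V

A successful install prints output like the following. Any version number means it worked; the exact build date and patch number depend on which 10.1 release was installed.

	nvcc: NVIDIA (R) Cuda compiler driver
	Copyright (c) 2005-2019 NVIDIA Corporation
	Built on Sun_Jul_28_19:07:16_PDT_2019
	Cuda compilation tools, release 10.1, V10.1.243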
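The export lines in .bashrc prepend the CUDA directories every time the file is sourced, so repeated source ~/.bashrc calls in one session make PATH grow. A minimal sketch of a guard against that is shown here; the helper name prepend_once is hypothetical and not part of CUDA or bash.

```shell
# Print CURRENT with DIR prepended, unless DIR is already one of its
# colon-separated entries. Hypothetical helper for illustration only.
prepend_once() {
    local dir="$1"
    local current="$2"
    case ":${current}:" in
        *":${dir}:"*)
            # Already present: return the list unchanged.
            printf '%s\n' "${current}"
            ;;
        *)
            # Not present: put DIR first, adding ':' only if CURRENT is non-empty.
            printf '%s\n' "${dir}${current:+:${current}}"
            ;;
    esac
}

# Possible use in ~/.bashrc (assumes CUDA 10.1 in the default location):
#   export PATH="$(prepend_once /usr/local/cuda-10.1/bin "${PATH}")"
#   export LD_LIBRARY_PATH="$(prepend_once /usr/local/cuda-10.1/lib64 "${LD_LIBRARY_PATH}")"
```

The function only builds the new value and leaves the export to the caller, which keeps it free of eval and easy to check on its own.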

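II. Install cuDNN

1. On the official site, pick the cuDNN version that matches the installed CUDA toolkit. This article installs CUDA 10.1, so choose cuDNN 7.6.5, the version that pairs with TensorFlow 2.1.0, and select Local Installer for Linux x86_64 (Tar).

2. Copy the download link for the cuDNN library and fetch it with wget, or download it to your own computer and then upload it to the server.

The downloaded file is named cudnn-10.1-linux-x64-v7.6.5.32.tgz.

3. Extract the cuDNN archive, enter the extracted folder, and copy the files into /usr/local/cuda-10.1.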

	tar -xvf cudnn-10.1-linux-x64-v7.6.5.32.tgz
	cd cuda
	sudo cp lib64/* /usr/local/cuda-10.1/lib64/
	sudo cp include/* /usr/local/cuda-10.1/include/
	sudo chmod a+r /usr/local/cuda-10.1/lib64/*
	sudo chmod a+r /usr/local/cuda-10.1/include/*

4. Check the cuDNN version with the command cat /usr/local/cuda-10.1/include/cudnn.h | grep CUDNN_MAJOR -A2
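The grep above prints the three version macros on separate lines. As a sketch, they can be combined into a single version string. The helper name cudnn_version is hypothetical, and it assumes the header defines the macros in the form #define CUDNN_MAJOR 7, as the cuDNN 7.x cudnn.h does. From cuDNN 8 onward these macros live in cudnn_version.h, so pass that file instead.

```shell
# Print the cuDNN version (MAJOR.MINOR.PATCHLEVEL) found in a header file.
# Hypothetical helper; prints nothing if CUDNN_MAJOR is not defined there.
cudnn_version() {
    local header="$1"
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { if (major != "") printf "%s.%s.%s\n", major, minor, patch }
    ' "${header}"
}

# Possible use for the install in this article:
#   cudnn_version /usr/local/cuda-10.1/include/cudnn.h
```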

5. Update the symbolic links. Version 7.6.5 is installed here; adjust the numbers in the commands below if your version differs.

	cd /usr/local/cuda-10.1/lib64/
	sudo rm -rf libcudnn.so libcudnn.so.7
	sudo ln -s libcudnn.so.7.6.5 libcudnn.so.7
	sudo ln -s libcudnn.so.7 libcudnn.so
	sudo ldconfig -v
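The same relinking can be sketched as a function that derives the major version from the full version, so only one number has to be changed. The name relink_cudnn is hypothetical. It uses ln -sfn to replace existing links in place instead of removing them first, and it assumes you have write permission to the directory (for /usr/local that usually means a root shell).

```shell
# Recreate libcudnn.so.<major> and libcudnn.so for a given full version.
# Hypothetical helper: relink_cudnn <lib_dir> <full_version>
relink_cudnn() {
    local lib_dir="$1"
    local full_version="$2"
    local major="${full_version%%.*}"          # 7.6.5 -> 7
    local target="libcudnn.so.${full_version}"

    # Refuse to create dangling links if the real library is missing.
    if [ ! -f "${lib_dir}/${target}" ]; then
        echo "relink_cudnn: ${lib_dir}/${target} not found" >&2
        return 1
    fi

    (
        cd "${lib_dir}" || exit 1
        # -f replaces an existing link, -n avoids following it first.
        ln -sfn "${target}" "libcudnn.so.${major}"
        ln -sfn "libcudnn.so.${major}" libcudnn.so
    )
}

# Possible use for the install in this article, followed by ldconfig as above:
#   relink_cudnn /usr/local/cuda-10.1/lib64 7.6.5
```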

6. Finally, to avoid affecting the original CUDA environment, run:

	source /etc/profile

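At this point a second version of CUDA and cuDNN has been installed quietly alongside the original.

However, nvcc -V still reports 10.0. The next section explains how to actually switch CUDA versions.

III. Switch CUDA versions

Switch to a regular user and check the CUDA version. It is still 10.0 (the original version).

Next we use a CUDA version-switching script: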

	#!/usr/bin/env bash
	
	# Copyright (c) 2018 Patrick Hohenecker
	#
	# Permission is hereby granted, free of charge, to any person obtaining a copy
	# of this software and associated documentation files (the "Software"), to deal
	# in the Software without restriction, including without limitation the rights
	# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
	# copies of the Software, and to permit persons to whom the Software is
	# furnished to do so, subject to the following conditions:
	#
	# The above copyright notice and this permission notice shall be included in all
	# copies or substantial portions of the Software.
	#
	# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
	# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
	# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
	# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
	# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
	# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
	# SOFTWARE.
	
	# author:   Patrick Hohenecker <mail@paho.at>
	# version:  2018.1
	# date:     May 15, 2018
	
	
	set -e
	
	
	# ensure that the script has been sourced rather than just executed
	if [[ "${BASH_SOURCE[0]}" = "${0}" ]]; then
	    echo "Please use 'source' to execute switch-cuda.sh!"
	    exit 1
	fi
	
	INSTALL_FOLDER="/usr/local"  # the location to look for CUDA installations at
	TARGET_VERSION=${1}          # the target CUDA version to switch to (if provided)
	
	# if no version to switch to has been provided, then just print all available CUDA installations
	if [[ -z ${TARGET_VERSION} ]]; then
	    echo "The following CUDA installations have been found (in '${INSTALL_FOLDER}'):"
	    ls -l "${INSTALL_FOLDER}" | grep -E -o "cuda-[0-9]+\\.[0-9]+$" | while read -r line; do
	        echo "* ${line}"
	    done
	    set +e
	    return
	# otherwise, check whether there is an installation of the requested CUDA version
	elif [[ ! -d "${INSTALL_FOLDER}/cuda-${TARGET_VERSION}" ]]; then
	    echo "No installation of CUDA ${TARGET_VERSION} has been found!"
	    set +e
	    return
	fi
	
	# the path of the installation to use
	cuda_path="${INSTALL_FOLDER}/cuda-${TARGET_VERSION}"
	
	# filter out those CUDA entries from the PATH that are not needed anymore
	path_elements=(${PATH//:/ })
	new_path="${cuda_path}/bin"
	for p in "${path_elements[@]}"; do
	    if [[ ! ${p} =~ ^${INSTALL_FOLDER}/cuda ]]; then
	        new_path="${new_path}:${p}"
	    fi
	done
	
	# filter out those CUDA entries from the LD_LIBRARY_PATH that are not needed anymore
	ld_path_elements=(${LD_LIBRARY_PATH//:/ })
	new_ld_path="${cuda_path}/lib64:${cuda_path}/extras/CUPTI/lib64"
	for p in "${ld_path_elements[@]}"; do
	    if [[ ! ${p} =~ ^${INSTALL_FOLDER}/cuda ]]; then
	        new_ld_path="${new_ld_path}:${p}"
	    fi
	done
	
	# update environment variables
	export CUDA_HOME="${cuda_path}"
	export CUDA_ROOT="${cuda_path}"
	export LD_LIBRARY_PATH="${new_ld_path}"
	export PATH="${new_path}"
	
	echo "Switched to CUDA ${TARGET_VERSION}."
	
	set +e
	return

Create a file named switch-cuda.sh, paste the code above into it, and then source it:

	vi switch-cuda.sh
	source switch-cuda.sh
	source switch-cuda.sh 10.1


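Running source switch-cuda.sh with no argument makes the script scan /usr/local for installed CUDA versions and list them. To switch, pass the version number you want, for example source switch-cuda.sh 10.1. Afterwards nvcc reports the new version as well.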
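To confirm the switch without reading the full nvcc banner, the release number can be extracted from its output. This is a small sketch; the helper name nvcc_release is hypothetical, and it assumes the banner contains a line of the form "Cuda compilation tools, release 10.1, V10.1.243" as in the sample output in Section I.

```shell
# Read `nvcc -V` output on stdin and print only the release number (e.g. 10.1).
# Hypothetical helper; prints nothing if no "release X.Y" text is found.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Possible use after switching:
#   source switch-cuda.sh 10.1
#   nvcc -V | nvcc_release
```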

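Because the script relies only on export statements, the change applies to the current shell session alone. After you restart the terminal, the CUDA environment reverts to the default 10.0, so later sessions are unaffected and there is no need to switch back manually.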
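The reason the switch does not persist is that export changes only the environment of the current shell process and its children. The sketch below illustrates this with a subshell standing in for one terminal session; DEMO_CUDA_HOME is a hypothetical variable name chosen so that the demo does not touch your real CUDA_HOME.

```shell
# Show that an exported variable disappears when the shell that set it ends.
show_scoped_export() {
    # The parentheses start a subshell, standing in for one terminal session.
    (
        export DEMO_CUDA_HOME="/usr/local/cuda-10.1"
        printf 'inside:  %s\n' "${DEMO_CUDA_HOME}"
    )
    # Back in the parent shell, the variable set in the subshell is gone.
    printf 'outside: %s\n' "${DEMO_CUDA_HOME:-<unset>}"
}
```

To make one version the default for every new session, the equivalent export lines would go into ~/.bashrc, as in step 5 of Section I.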


IV. References

[1] CUDA toolkit: CUDA Toolkit Archive

[2] cuDNN library: cuDNN Archive

[3] CUDA switching script: GitHub - phohenecker/switch-cuda: A simple bash script for switching between installed versions of CUDA.

[4] Installing multiple CUDA versions: 在ubuntu上安装多个版本的CUDA，并实现CUDA版本的自由切换 (史蒂夫卡's blog, CSDN)

[5] 【Linux】在一台机器上同时安装多个版本的CUDA（切换CUDA版本） (TangPlusHPC's blog, CSDN)
