I've gone back and reorganized this to follow the correct installation order: on Deepin 15.8 it definitely works.

Attracted by how convenient Deepin Linux is to use, I rebuilt my study and work platform on it. Compared with Ubuntu, the UI really does fit Chinese habits, and most importantly QQ and WeChat work without any fiddling. I'm recording the process here as a reference for anyone who comes after.

Environment: Deepin Linux 15.8, GTX 1080 Ti, CUDA 9.0, cuDNN 7.4, tensorflow-gpu 1.12, pytorch-gpu

1. Update system packages

  1. Update from the Control Center, or use steps 2 and 3 below
  2. sudo apt update
  3. sudo apt upgrade

2. Graphics driver issues

CUDA 9.0 bundles driver 384 by default. Don't count on it: once installed, the system won't start, mainly because lightdm can't come up. The 390 driver that ships with the system meets the version requirement on paper, but it doesn't work properly later on.

Workaround: install the driver bundled with CUDA 9.2. That is, download the CUDA 9.2 installer and install only the NVIDIA driver from it, nothing else. I covered this in my Ubuntu 18.04 write-up; see that article.

First uninstall the existing driver. It's a bit brute-force, heh. The command will most likely report that there is nothing to remove all the way through, but to be safe run it anyway, to avoid surprises later.

sudo apt purge nvidia*

This also works:

sudo apt remove --purge nvidia*

Disable the open-source nouveau driver:

sudo gedit /etc/modprobe.d/blacklist-nouveau.conf

Add the following to the file:

blacklist nouveau
options nouveau modeset=0

Alternatively, run the following command:

sudo gedit /etc/modprobe.d/blacklist.conf

and append at the end of the file:

blacklist nouveau

Save the file.

Then run:

sudo update-initramfs -u

Do not reboot yet. After a reboot the desktop will no longer come up, so remember: once the steps above are done, do not reboot until you have the installer from the next step ready.
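
Before stopping here, an optional sanity check I like to do: confirm the file you edited really contains the blacklist entry. Note that nouveau will still show up as loaded until the reboot later on.

cat /etc/modprobe.d/blacklist-nouveau.conf   # or blacklist.conf, whichever you edited
lsmod | grep nouveau                         # still listed for now; it should disappear after the driver install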

3. Download the CUDA 9.2 installer — not to install CUDA 9.2 itself, but for the driver it bundles

Download link: developer.nvidia.com/cu

Only download the first file; all we want is the graphics driver bundled with CUDA 9.2. Of course, you could also download the standalone graphics driver instead.

The downloaded file is named cuda_9.2.148_396.37_linux.run. Copy it into your home directory. Why? So you won't need to change directories later: after the reboot you will be typing commands in a text console where Chinese input may not be available, while downloads normally land in the Downloads folder, whose name is in Chinese. Leaving it there is a hassle.
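
If you already downloaded it into the Chinese-named Downloads folder, a copy along these lines saves typing later (the ~/下载 path is an assumption about the default Chinese-locale folder name; adjust it to wherever your browser saved the file):

cp ~/下载/cuda_9.2.148_396.37_linux.run ~/
ls -lh ~/cuda_9.2.148_396.37_linux.run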

4. Install the graphics driver bundled with CUDA 9.2

Once the download is finished, reboot. The desktop won't come up anymore; don't panic.

Press CTRL+ALT+F2 to switch to TTY2, log in, and enter on the command line:

sudo chmod +x cuda_9.2.148_396.37_linux.run
sudo ./cuda_9.2.148_396.37_linux.run

or

sudo sh ./cuda_9.2.148_396.37_linux.run

Next comes the license agreement; press the space bar to page through to the end:

Do you accept the previously read EULA?

accept/decline/quit:accept

You are attempting to install on an unsupported configuration. Do you wish to continue?

(y)es/(n)o [ default is no ]: yes

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 396.37?

(y)es/(n)o/(q)uit: y

Do you want to install the OpenGL libraries?

(y)es/(n)o/(q)uit [ default is yes ]:

Do you want to run nvidia-xconfig?

This will update the system X configuration file so that the NVIDIA X driver

is used. The pre-existing X configuration file will be backed up.

This option should not be used on systems that require a custom

X configuration, such as systems with multiple GPU vendors.

(y)es/(n)o/(q)uit [ default is no ]: y

Install the CUDA 9.2 Toolkit?

(y)es/(n)o/(q)uit: n

Install the CUDA 9.2 Samples?

(y)es/(n)o/(q)uit: n

Wait for the driver installation to complete. I have repeated this several times and it succeeded every time. Then reboot.
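
After the reboot, a quick way to confirm the driver is actually in use (nvidia-smi is installed together with the driver):

nvidia-smi               # should list the GTX 1080 Ti with driver version 396.37
lsmod | grep nouveau     # should print nothing now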

5. Before installing CUDA 9.0, it is recommended to install the following supporting packages first:

sudo apt install exfat-fuse exfat-utils unrar p7zip-full p7zip-rar rar unzip gtk2-engines-pixbuf libxml2-utils build-essential gcc-4.8 gcc-4.8-multilib g++-4.8 g++-4.8-multilib gcc-5 gcc-5-multilib g++-5 g++-5-multilib gcc-6 gcc-6-multilib g++-6 g++-6-multilib gcc-7 gcc-7-multilib g++-7 g++-7-multilib curl terminator libx11-dev libxmu-dev libxi-dev libgl1-mesa-glx libglu1-mesa libglu1-mesa-dev libgles2-mesa-dev libfreeimage3 libfreeimage-dev freeglut3 freeglut3-dev libffi-dev -y

GCC/G++ setup: installing CUDA 9.0 requires the gcc 5 compiler. Since other software may well need other (newer) versions, for convenience I install all of the GCC versions and activate whichever one I need through update-alternatives.

gcc 5 is what we need here.

Of course, for convenience you can register every GCC version as below; that's up to your own preference.

sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.8 40
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-5 50
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-6 60
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-7 70

sudo update-alternatives --config gcc

A selection menu will appear; choose the gcc-5 entry.

sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-4.8 40
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-5 50
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-6 60
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-7 70

sudo update-alternatives --config g++

Choose the g++-5 entry.
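
To double-check which compilers are now active, both of these should report version 5.x:

gcc --version
g++ --version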

6. Install CUDA 9.0

https://developer.nvidia.com/cuda-90-download-archive?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1604&target_type=runfilelocal

Download everything: the base installer plus the four patches.

sudo sh cuda_9.0.176_384.81_linux.run

Do you accept the previously read EULA?

accept/decline/quit:accept

You are attempting to install on an unsupported configuration. Do you wish to continue?

(y)es/(n)o [ default is no ]: yes

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 384.81?

(y)es/(n)o/(q)uit: no

Install the CUDA 9.0 Toolkit?

(y)es/(n)o/(q)uit: y

Enter Toolkit Location

[ default is /usr/local/cuda-9.0 ]:

Do you want to install a symbolic link at /usr/local/cuda?

(y)es/(n)o/(q)uit: y

Install the CUDA 9.0 Samples?

(y)es/(n)o/(q)uit: y

Enter CUDA Samples Location

[ default is /home/oliver ]:

The installation then proceeds.

Next install the four patches: accept the EULA, and when asked whether the install path is correct, change it if it isn't, otherwise just press Enter; all four patches work the same way. Then edit the environment configuration. This is slightly different from Ubuntu 18.04: the first time, I wrote the path settings into /etc/profile, rebooted, and they had no effect whatsoever. I don't know whether that was just me. I then put the path configuration into my own user's .bashrc and it worked. I specifically checked with echo $PATH: it really does not take effect unless written to the user's .bashrc. Baffling.

sudo gedit ~/.bashrc

Add the following lines (or just copy them from this article). As an aside, I also configure Java here; if you don't use Java, ignore those lines.

export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=.:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/cuda
export JAVA_HOME=/usr/local/jdk8
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH

source ~/.bashrc

or reboot.
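
A quick check that the toolkit and the PATH settings took effect (nvcc is installed by the CUDA 9.0 runfile):

nvcc -V                                  # should report "release 9.0"
echo $PATH | tr ':' '\n' | grep cuda     # /usr/local/cuda/bin should appear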

7. Next, install cuDNN 7.4

https://developer.nvidia.com/rdp/cudnn-download

Download the matching version:

cudnn-9.0-linux-x64-v7.4.1.5, about 347 MB.

Move it into your home directory. There are two ways to extract it: (1) open it with the archive manager and drag the contents out, or (2) use the command line:

tar -xzvf cudnn-9.0-linux-x64-v7.4.1.5.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
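
To confirm the copied headers are the expected version: cuDNN 7.x keeps its version macros in cudnn.h, so something like the following should print major 7, minor 4, patch 1:

grep -A 2 "#define CUDNN_MAJOR" /usr/local/cuda/include/cudnn.h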

A brute-force test of CUDA

Open a terminal:

cd /home/oliver/NVIDIA_CUDA-9.0_Samples
make >>ll.log

Wait for the build to finish, then run a couple of the samples:

cd /home/oliver/NVIDIA_CUDA-9.0_Samples/bin/x86_64/linux/release
oliver@oliver-PC:~/NVIDIA_CUDA-9.0_Samples/bin/x86_64/linux/release$ ./convolutionFFT2D
[./convolutionFFT2D] - Starting...
GPU Device 0: "GeForce GTX 1080 Ti" with compute capability 6.1

Testing built-in R2C / C2R FFT-based convolution
...allocating memory
...generating random input data
...creating R2C & C2R FFT plans for 2048 x 2048
...uploading to GPU and padding convolution kernel and input data
...transforming convolution kernel
...running GPU FFT convolution: 4400.440143 MPix/s (0.909000 ms)
...reading back GPU convolution results
...running reference CPU convolution
...comparing the results: rel L2 = 1.058234E-07 (max delta = 9.978815E-07)
L2norm Error OK
...shutting down
Testing custom R2C / C2R FFT-based convolution
...allocating memory
...generating random input data
...creating C2C FFT plan for 2048 x 1024
...uploading to GPU and padding convolution kernel and input data
...transforming convolution kernel
...running GPU FFT convolution: 4454.342975 MPix/s (0.898000 ms)
...reading back GPU FFT results
...running reference CPU convolution
...comparing the results: rel L2 = 1.087169E-07 (max delta = 1.084381E-06)
L2norm Error OK
...shutting down
Testing updated custom R2C / C2R FFT-based convolution
...allocating memory
...generating random input data
...creating C2C FFT plan for 2048 x 1024
...uploading to GPU and padding convolution kernel and input data
...transforming convolution kernel
...running GPU FFT convolution: 5547.850087 MPix/s (0.721000 ms)
...reading back GPU FFT results
...running reference CPU convolution
...comparing the results: rel L2 = 1.084221E-07 (max delta = 9.397971E-07)
L2norm Error OK
...shutting down
Test Summary: 0 errors
Test passed
oliver@oliver-PC:~/NVIDIA_CUDA-9.0_Samples/bin/x86_64/linux/release$ ./clock_nvrtc
CUDA Clock sample
> Using CUDA Device [0]: GeForce GTX 1080 Ti
> GPU Device has SM 6.1 compute capability
Average clocks/block = 3174.812500
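
For a more standard check, deviceQuery and bandwidthTest are built into the same release directory by the samples Makefile and are the usual smoke tests; both should finish with "Result = PASS":

cd ~/NVIDIA_CUDA-9.0_Samples/bin/x86_64/linux/release
./deviceQuery
./bandwidthTest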

8. Install Anaconda

https://www.anaconda.com/download/#linux

The latest release currently ships Python 3.7, but there doesn't seem to be a TensorFlow build compatible with 3.7 yet, so to be safe download the Python 3.6 version. The 5.2 release comes with 3.6 and is the one I recommend downloading.

Installation is simple:

chmod +x Anaconda3-5.2.0-Linux-x86_64.sh
./Anaconda3-5.2.0-Linux-x86_64.sh
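
After the installer finishes, open a new terminal (or source ~/.bashrc) so that conda is on the PATH, then confirm the base environment is the expected Python:

conda --version
python --version    # should report Python 3.6.x if you installed the 5.2 release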

9. Install TensorFlow

The newest TensorFlow can now be made to work with Python 3.7, but the installation takes a trick; see the article on fixing the Python 3.7 TensorFlow install (thanks to its author, Wu Zihan). If you installed Python 3.6, installation is simple:

pip install tensorflow-gpu

This should install version 1.12. If the download is slow, fetch the wheel offline and install it instead:

pip install tensorflow_gpu-1.12.0-cp36-cp36m-manylinux1_x86_64.whl
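
Either way, a one-line smoke test (using the Anaconda python) before the full test in section 16 doesn't hurt:

python -c "import tensorflow as tf; print(tf.__version__)"    # expect 1.12.0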

10. Install Keras

pip install keras
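
A quick check that Keras picked up the TensorFlow backend; importing it normally also prints "Using TensorFlow backend.":

python -c "import keras; print(keras.__version__)"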

11. Install PyTorch

pytorch.org

Since we are using Anaconda, I suggest changing the command to:

pip install torch torchvision

It automatically picks the GPU build of PyTorch for your platform. The current stable release is 1.0. If the download is slow, install offline from the wheels:

pip install torch-1.0.0-cp36-cp36m-manylinux1_x86_64.whl

pip install torchvision

12. Install OpenAI's gym

github.com/openai/gym
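
gym installs with pip; a plain install plus a tiny smoke test (CartPole-v0 is one of the standard built-in environments) should be enough:

pip install gym
python -c "import gym; print(gym.make('CartPole-v0').observation_space)"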

13. Install AllenNLP

github.com/allenai/allennlp

Actually, a plain

pip install allennlp

is all you need.
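
As a minimal check that the install (and its PyTorch dependency) resolved correctly, this import should simply succeed:

python -c "import allennlp"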

14. For installing PyCharm, see my Ubuntu 18.04 article.

15. Test (PyTorch)

import torch
print(torch.__version__)
print(torch.cuda.is_available())
x = torch.randn(10)
if torch.cuda.is_available():
    device = torch.device("cuda")
    y = torch.ones_like(x, device=device)
    x = x.to(device)
    z = x + y
    print(z)
    print(z.to("cpu", torch.double))

Result:

/home/oliver/anaconda3/bin/python /home/oliver/PycharmProjects/untitled/test.py
0.4.1.post2
True
tensor([ 1.7476, 2.9535, 1.6507, 0.2742, 0.6676, 2.6718, 2.4384, 3.2783,
-0.0905, 1.4600], device='cuda:0')
tensor([ 1.7476, 2.9535, 1.6507, 0.2742, 0.6676, 2.6718, 2.4384, 3.2783,
-0.0905, 1.4600], dtype=torch.float64)

The test passes.

16. tensorflow-gpu test

import tensorflow as tf
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

/home/oliver/anaconda3/bin/python /home/oliver/PycharmProjects/untitled/r.py
/home/oliver/anaconda3/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: compiletime version 3.6 of module tensorflow.python.framework.fast_tensor_util does not match runtime version 3.7
return f(*args, **kwds)
2018-12-02 12:03:41.443411: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-12-02 12:03:41.538054: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-12-02 12:03:41.538381: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.645
pciBusID: 0000:01:00.0
totalMemory: 10.91GiB freeMemory: 10.42GiB
2018-12-02 12:03:41.538392: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
Device mapping:
/job:localhost/replica:0/task:0/device:XLA_GPU:0 -> device: XLA_GPU device
/job:localhost/replica:0/task:0/device:XLA_CPU:0 -> device: XLA_CPU device
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1
2018-12-02 12:03:41.694077: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-12-02 12:03:41.694095: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2018-12-02 12:03:41.694100: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2018-12-02 12:03:41.694377: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10078 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-12-02 12:03:41.694879: I tensorflow/core/common_runtime/direct_session.cc:307] Device mapping:
/job:localhost/replica:0/task:0/device:XLA_GPU:0 -> device: XLA_GPU device
/job:localhost/replica:0/task:0/device:XLA_CPU:0 -> device: XLA_CPU device
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1

Process finished with exit code 0
