Tuesday, July 24, 2018

Installing an optimized TensorFlow build with GPU (CUDA) support

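Problem: TensorFlow prints "Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2" or a similar warning about missing optimizations. This means the TensorFlow binary was built with only a generic instruction set and was not optimized for the newer instruction sets available on this machine.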

Solutions:

1. Build it yourself

- Windows
https://software.intel.com/zh-cn/articles/tutorial-for-compiling-tensorflow-on-windows

Life is short, so I did not go this route.

2. Use a wheel that someone else has already compiled

- Linux

Browse the repository and pick the wheel that matches your setup:
https://github.com/mind/wheels

Then, inside your own virtual environment:
pip --no-cache-dir install https://github.com/mind/wheels/releases/download/{RELEASE}/{WHEEL}
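
Since the point of this step is to match the wheel to the CPU's instruction sets, it helps to check what the CPU reports before choosing one. A minimal sketch, assuming a Linux machine whose /proc/cpuinfo contains a "flags" line; cpu_flags, report and WANTED are hypothetical names used only for illustration and are not part of TensorFlow or of the wheels repository:

```python
# Flags that the TensorFlow warning typically mentions (assumed list for illustration).
WANTED = ("sse4_1", "sse4_2", "avx", "avx2", "fma")


def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def report(cpuinfo_text):
    """Map each flag in WANTED to True/False depending on whether the CPU has it."""
    flags = cpu_flags(cpuinfo_text)
    return {name: name in flags for name in WANTED}

# Usage on a Linux box:
#     with open("/proc/cpuinfo") as f:
#         print(report(f.read()))
```

If a flag comes back False, a wheel compiled for that instruction set would likely fail to run on this machine, so a more conservative build is the safer choice.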

- Windows

Download the wheel you need into your virtual environment's directory:
https://github.com/fo40225/tensorflow-windows-wheel


Install it inside your own virtual environment:
pip --no-cache-dir install {WHEEL_FILE}
(for example, tensorflow_gpu-1.9.0-cp36-cp36m-win_amd64.whl)
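
The wheel file name itself says which interpreter and platform it was built for, and pip refuses a wheel whose tags do not match. A minimal sketch of reading those tags, assuming the standard wheel naming scheme (name-version-python tag-ABI tag-platform tag); parse_wheel_name and current_python_tag are hypothetical helpers, and the second one assumes CPython:

```python
import sys


def parse_wheel_name(filename):
    """Split a wheel file name into its name, version and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
    elif len(parts) == 6:
        # An optional build tag sits between the version and the Python tag.
        name, version, _build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected wheel file name: " + filename)
    return {
        "name": name,
        "version": version,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def current_python_tag():
    """Return the CPython tag of the running interpreter, e.g. 'cp36' for 3.6."""
    return "cp%d%d" % (sys.version_info[0], sys.version_info[1])
```

For the example file above, the Python tag is cp36 (CPython 3.6) and the platform tag is win_amd64 (64-bit Windows), so the virtual environment needs to be a 64-bit Python 3.6.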


3. Try it in the Python interpreter

import tensorflow as tf

- Troubleshooting

If you see "ImportError: Could not find 'cudart64_92.dll'. TensorFlow requires that this DLL be installed in a directory that is named in your %PATH% environment variable. Download and install CUDA 9.2 from this URL: https://developer.nvidia.com/cuda-toolkit"

There is no way around it: go and install CUDA 9.2 from https://developer.nvidia.com/cuda-toolkit
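
The error is about %PATH%, so before and after installing it can be worth checking whether the DLL is actually reachable. A minimal sketch using only the standard library; find_on_path is a hypothetical helper name, and the DLL name is the one taken from the error message above:

```python
import os


def find_on_path(filename, path_value=None):
    """Return every directory listed in PATH that contains `filename`."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for directory in path_value.split(os.pathsep):
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits

# Usage:
#     print(find_on_path("cudart64_92.dll"))
```

An empty list means no directory on PATH holds the file, which matches the situation the ImportError describes. The same check works for any other DLL that TensorFlow reports as missing.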

- Troubleshooting

If you see "ImportError: Could not find 'cudnn64_7.dll'. TensorFlow requires that this DLL be installed in a directory that is named in your %PATH% environment variable. Note that installing cuDNN is a separate step from installing CUDA, and this DLL is often found in a different directory from the CUDA DLLs. You may install the necessary DLL by downloading cuDNN 7 from this URL: https://developer.nvidia.com/cudnn"

Again, go and install it: https://developer.nvidia.com/cudnn

The download is a compressed archive.
Extract it and copy the files into the CUDA directory (I had just reinstalled CUDA 9.2 above, so I copied them into the CUDA 9.2 directory as well):
  1. Copy \cuda\bin\cudnn64_7.dll to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\bin.
  2. Copy \cuda\include\cudnn.h to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\include.
  3. Copy \cuda\lib\x64\cudnn.lib to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\lib\x64.
Reference: https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html
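
The three copy steps can also be scripted. This is only a sketch of the same steps with the standard library, not NVIDIA's installer; copy_cudnn and CUDNN_FILES are hypothetical names, the file names are the ones listed above, and writing under C:\Program Files typically requires an elevated (administrator) prompt:

```python
import os
import shutil

# (subdirectory, file name) pairs, taken from the three copy steps above.
CUDNN_FILES = [
    ("bin", "cudnn64_7.dll"),
    ("include", "cudnn.h"),
    (os.path.join("lib", "x64"), "cudnn.lib"),
]


def copy_cudnn(extracted_cuda_dir, cuda_install_dir):
    """Copy the cuDNN files from the extracted 'cuda' folder into the CUDA install.

    Returns the list of destination paths that were written.
    """
    copied = []
    for subdir, name in CUDNN_FILES:
        src = os.path.join(extracted_cuda_dir, subdir, name)
        dst_dir = os.path.join(cuda_install_dir, subdir)
        if not os.path.isdir(dst_dir):
            # Refuse to create directories: a missing one suggests a wrong CUDA path.
            raise FileNotFoundError(dst_dir)
        copied.append(shutil.copy2(src, dst_dir))
    return copied

# Usage (paths are examples; adjust to where the archive was extracted):
#     copy_cudnn(r"C:\Downloads\cuda",
#                r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2")
```

The function deliberately fails when a destination directory does not exist, since that usually means the CUDA version in the path does not match what is installed.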


With that, the installation is done.


4. Run something


# TensorFlow 1.x API (tf.Session), matching the 1.9.0 wheel installed above.
import tensorflow as tf

# Build a graph.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)

# Create a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

# Run the op.
print(sess.run(c))

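It stalled for a moment at the start, but it ran to completion.

Note that the device placement log shows the GPU is now being used, for example: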
MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0 
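
When the log gets long, the placement lines can be checked programmatically instead of by eye. A minimal sketch, assuming every placement line has the same shape as the single line shown above (other TensorFlow versions may format it differently); parse_placement is a hypothetical helper, not a TensorFlow function:

```python
import re

# Pattern modeled on the one log line shown above: "<op>: (<type>): <...>/device:<KIND>:<N>"
_PLACEMENT = re.compile(
    r"^(?P<op>[^:]+): \((?P<type>[^)]+)\): .*/device:(?P<kind>[A-Za-z]+):(?P<index>\d+)$"
)


def parse_placement(line):
    """Return (op name, device kind, device index), or None if the line does not match."""
    match = _PLACEMENT.match(line.strip())
    if match is None:
        return None
    return match.group("op"), match.group("kind"), int(match.group("index"))

# Usage:
#     parse_placement("MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0")
```

For the line above this yields the op name MatMul on device kind GPU with index 0, so filtering for entries whose device kind is not GPU is a quick way to spot ops that stayed on the CPU.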

Finally, it works.
