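Problem: TensorFlow prints "Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2", or otherwise runs unoptimized. This means the installed TensorFlow binary was built for a baseline instruction set only and is not optimized for the newer instruction sets available on the current platform.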
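To see which of these instruction sets the CPU actually advertises, here is a minimal sketch for Linux. It assumes the kernel exposes /proc/cpuinfo with a "flags" line (true on x86 Linux, not on Windows); the helper name cpu_flags is made up for this note and is not part of TensorFlow.

```python
import os

def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no "flags" line found

# Assumption: Linux only. On other systems this branch is simply skipped.
cpuinfo_path = "/proc/cpuinfo"
if os.path.exists(cpuinfo_path):
    with open(cpuinfo_path) as f:
        flags = cpu_flags(f.read())
    for name in ("sse4_1", "sse4_2", "avx", "avx2", "fma"):
        print(name, "yes" if name in flags else "no")
else:
    print("no /proc/cpuinfo here; this sketch only covers Linux")
```

The flags that print "yes" are the ones an optimized build could take advantage of.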
Solutions:
1. Build it yourself
- Windows
https://software.intel.com/zh-cn/articles/tutorial-for-compiling-tensorflow-on-windows
Life is short; I did not try this.
2. Use a package someone else has already compiled
- Linux
Browse the repository and pick the wheel that matches your setup
https://github.com/mind/wheels
Then, inside your own virtual environment:
pip --no-cache-dir install https://github.com/mind/wheels/releases/download/{RELEASE}/{WHEEL}
- Windows
Browse the repository and download the wheel you need into your virtual environment's directory
https://github.com/fo40225/tensorflow-windows-wheel
Install it inside your own virtual environment:
pip --no-cache-dir install <wheel filename> (for example tensorflow_gpu-1.9.0-cp36-cp36m-win_amd64.whl)
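Picking the right wheel mostly comes down to reading the tags in its filename. The sketch below splits a wheel filename following the standard wheel naming scheme (name-version-python tag-ABI tag-platform tag) and compares the Python tag with the running interpreter. The helper name wheel_tags is made up for this note, and the comparison only makes sense for CPython-specific tags such as cp36.

```python
import sys

def wheel_tags(filename):
    """Split a wheel filename into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: " + filename)
    # The last three fields are always python tag, ABI tag, platform tag;
    # an optional build tag may sit between the version and the python tag.
    return {
        "name": parts[0],
        "version": parts[1],
        "python": parts[-3],
        "abi": parts[-2],
        "platform": parts[-1],
    }

tags = wheel_tags("tensorflow_gpu-1.9.0-cp36-cp36m-win_amd64.whl")
running = "cp{}{}".format(sys.version_info.major, sys.version_info.minor)
print(tags)
print("python tag matches this interpreter:", tags["python"] == running)
```

For the example filename, cp36 means CPython 3.6 and win_amd64 means 64-bit Windows, so pip will refuse it in a virtual environment built on any other Python version.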
3. Try it in the Python interpreter
import tensorflow as tf
- Troubleshooting
If you see "ImportError: Could not find 'cudart64_92.dll'. TensorFlow requires that this DLL be installed in a directory that is named in your %PATH% environment variable. Download and install CUDA 9.2 from this URL: https://developer.nvidia.com/cuda-toolkit"
Just go and install the CUDA version named in the message (9.2 here) from https://developer.nvidia.com/cuda-toolkit
- Troubleshooting
If you see "ImportError: Could not find 'cudnn64_7.dll'. TensorFlow requires that this DLL be installed in a directory that is named in your %PATH% environment variable. Note that installing cuDNN is a separate step from installing CUDA, and this DLL is often found in a different directory from the CUDA DLLs. You may install the necessary DLL by downloading cuDNN 7 from this URL: https://developer.nvidia.com/cudnn"
Just go and download cuDNN 7 from https://developer.nvidia.com/cudnn
The download is a compressed archive.
Extract it and copy the files into the CUDA directory (I had already reinstalled CUDA 9.2 above, so I copied them into the CUDA 9.2 directory):
- Copy \cuda\bin\cudnn64_7.dll to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\bin.
- Copy \cuda\include\cudnn.h to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\include.
- Copy \cuda\lib\x64\cudnn.lib to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.2\lib\x64.
Reference: https://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html
And with that, it is done.
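Both ImportErrors above come down to a DLL not being in any directory listed in PATH. A minimal sketch for checking that before importing TensorFlow again is shown below. The helper name find_on_path is made up for this note, and the DLL names are the ones from the error messages above (they will differ for other CUDA and cuDNN versions).

```python
import os

def find_on_path(filename, path_value=None):
    """Return every directory in a PATH-style string that contains filename."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for directory in path_value.split(os.pathsep):
        directory = directory.strip('"')  # PATH entries are sometimes quoted
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits

# DLL names taken from the two error messages above.
for dll in ("cudart64_92.dll", "cudnn64_7.dll"):
    found = find_on_path(dll)
    print(dll, "->", found if found else "not found on PATH")
```

If a DLL is reported as not found right after installing, open a new terminal first, since an already open one keeps its old PATH.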
4. Run a test
# Create a graph.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Create a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Run the op.
print(sess.run(c))
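To confirm that the printed matrix is correct, the same product can be worked out without TensorFlow. The sketch below uses a plain-Python helper (matmul is a made-up name for this note) on the same values, laid out row by row as tf.constant does with the given shapes.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]

a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]    # shape [2, 3]
b = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # shape [3, 2]
print(matmul(a, b))  # [[22.0, 28.0], [49.0, 64.0]]
```

The session run should print the same four numbers as a 2x2 array.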
It stalled for a moment at first, but it ran to completion.
Note that the output shows the GPU is now in use, for example:
MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
Success at last.
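When the graph has more than a few ops, scanning the placement output by eye gets tedious. Here is a minimal sketch that pulls the device type out of a placement line. It assumes the line has exactly the format shown above (real logs may carry a timestamp or file prefix that would need stripping first), and parse_placement is a made-up helper, not a TensorFlow function.

```python
import re

# Matches lines such as:
#   MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0
PLACEMENT = re.compile(r"^(?P<op>[^:]+): \((?P<type>[^)]+)\): (?P<device>\S+)$")

def parse_placement(line):
    """Return (op name, op type, device kind), or None if the line does not match."""
    match = PLACEMENT.match(line.strip())
    if match is None:
        return None
    # The device kind is the token right after the final "device:".
    kind = match.group("device").rsplit("device:", 1)[-1].split(":")[0]
    return match.group("op"), match.group("type"), kind

line = "MatMul: (MatMul): /job:localhost/replica:0/task:0/device:GPU:0"
print(parse_placement(line))  # ('MatMul', 'MatMul', 'GPU')
```

An op whose device kind comes back as CPU instead of GPU is the one to look at first.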