Continue Discussion 46 replies
April 2020

quantwin

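A couple of questions:
1. How can I tell whether the GPU is faster or slower than the CPU? For example, with an i7 and a GTX 1050, is it worth installing the GPU version?
2. Is the computation done either entirely on the CPU or entirely on the GPU? To compare the two, do I need to install both versions?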

1 reply
April 2020 ▶ quantwin

snowkylin

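  1. You can run the same model on the GPU and on the CPU and pick whichever device is faster. The choice also depends on the model type: for a convolutional neural network a GTX 1050 will probably beat an i7, but for reinforcement learning that is not necessarily the case.

  2. Roughly speaking, yes. You do not need to install both versions: the default TensorFlow 2.1 install (pip install tensorflow) already supports both CPU and GPU. You can use

    cpus = tf.config.list_physical_devices(device_type='CPU')
    tf.config.set_visible_devices(devices=cpus)

to restrict computation to the CPU.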
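The comparison suggested in point 1 could be sketched roughly as follows. This is a minimal illustration, not a rigorous benchmark: the helper names `time_matmul` and `compare_devices`, the matrix size, and the choice of a plain matrix multiplication as the workload are all assumptions made for the example. Real models may behave quite differently, so timing your own model is the more reliable test.

```python
import timeit

try:
    import tensorflow as tf
except ImportError:  # keeps the sketch importable where TensorFlow is missing
    tf = None


def time_matmul(device_name, size=1000, repeats=10):
    """Return the average seconds per (size x size) matmul on the given device."""
    with tf.device(device_name):
        a = tf.random.normal([size, size])
        b = tf.random.normal([size, size])

        def step():
            # .numpy() copies the result to host memory, so the op must finish first
            return tf.matmul(a, b).numpy()

        step()  # warm-up run, so one-off initialisation is not counted
        return timeit.timeit(step, number=repeats) / repeats


def compare_devices(size=1000, repeats=10):
    """Time the same workload on the CPU and, if one is present, on the first GPU."""
    if tf is None:
        return {}
    results = {"/CPU:0": time_matmul("/CPU:0", size, repeats)}
    if tf.config.list_physical_devices("GPU"):
        results["/GPU:0"] = time_matmul("/GPU:0", size, repeats)
    return results


if __name__ == "__main__":
    for device, seconds in compare_devices().items():
        print(f"{device}: {seconds * 1000:.2f} ms per matmul")
```

With small inputs the GPU can easily come out slower because of transfer and launch overhead, so it is worth trying sizes close to your real workload.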

April 2020

quantwin

1. The tf.config.set_visible_devices(devices=cpus) setting does not seem to work for me. I searched the official TF2 documentation and still could not work out how to use it.

2. One approach, which acts like a global setting for the current program, is to put the following before import tensorflow:
import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

3. Another approach is to choose the device temporarily inside the program:
with tf.device('/cpu:0'):
    A = tf.constant([[1, 2], [3, 4]])
    B = tf.constant([[5, 6], [7, 8]])
    C = tf.matmul(A, B)
    print(C)

with tf.device('/gpu:0'):
    A = tf.constant([[1, 2], [3, 4]])
    B = tf.constant([[5, 6], [7, 8]])
    C = tf.matmul(A, B)
    print(C)

1 reply
April 2020 ▶ quantwin

snowkylin

See tf.config.set_visible_devices  |  TensorFlow v2.14.0 ; the topic is also covered at https://tf.wiki/zh/basic/tools.html#tf-config-gpu . A simple example program:

import tensorflow as tf
tf.debugging.set_log_device_placement(True)     # log the device each op runs on

cpus = tf.config.list_physical_devices('CPU')   # list the CPUs on this machine
tf.config.set_visible_devices(cpus)             # make only the CPUs visible to TensorFlow

A = tf.constant([[1, 2], [3, 4]])
B = tf.constant([[5, 6], [7, 8]])
C = tf.matmul(A, B)

print(C)

Output:

2020-04-21 11:37:29.007897: I tensorflow/core/common_runtime/eager/execute.cc:573] Executing op MatMul in device /job:localhost/replica:0/task:0/device:CPU:0
tf.Tensor(
[[19 22]
 [43 50]], shape=(2, 2), dtype=int32)

Of course, the safest approach is to create a new conda virtual environment and run

pip install tensorflow-cpu

to install the CPU-only build of TensorFlow.

April 2020

lunatic

Why do I get this error when installing CUDA: ('Connection broken: OSError("(10054, 'WSAECONNRESET')")', OSError("(10054, 'WSAECONNRESET')"))

1 reply
April 2020

snowkylin

This looks like a network problem. If you are installing with conda, I suggest configuring a mirror.

April 2020

Susan_Shen

When I ran pip install tensorflow, I got this error:
ERROR: Could not find a version that satisfies the requirement tensorflow (from versions: none)
ERROR: No matching distribution found for tensorflow
But I found that this command worked:
python -m pip install --upgrade https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-0.12.0-py3-none-any.whl
If you run into the same problem, you can try it.

1 reply
April 2020 ▶ Susan_Shen

snowkylin

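That TensorFlow version is far too old. This tutorial targets TensorFlow 2.1. If pip install tensorflow fails (which is rare), check your Python environment setup, create a fresh conda environment and install again, or search for the error message in a search engine.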

1 reply
April 2020 ▶ snowkylin

Susan_Shen

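OK, I found the cause.
The install kept failing because I had previously downloaded the 32-bit build of Python 3.7.4. After switching to the 64-bit build, everything worked.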

April 2020

9_Et

About the IDE setup: my interpreter path is /opt/anaconda3/envs/tf2/bin/python , and I am not sure whether that will cause problems. My system is macOS 10.15.3 (19D76).

April 2020

ikou-austin

That is fine; mine is the same as yours.

May 2020

MorningStar_Wang

Hello. With TF 2.1 set up this way in an Anaconda environment, I get errors when using the Profile feature of TensorBoard. The errors are as follows:

2020-05-08 11:09:24.374761: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcupti.so.10.1'; dlerror: libcupti.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /opt/gridview//pbs/dispatcher/lib::/usr/local/lib64:/usr/local/lib
2020-05-08 11:09:24.374801: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1307] function cupti_interface_->Subscribe ( &subscriber_, (CUpti_CallbackFunc) ApiCallback, this) failed with error CUPTI could not be loaded or symbol could not be found.
2020-05-08 11:09:24.374816: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1346] function cupti_interface_->ActivityRegisterCallbacks ( AllocCuptiActivityBuffer, FreeCuptiActivityBuffer) failed with error CUPTI could not be loaded or symbol could not be found.
2 replies
May 2020 ▶ MorningStar_Wang

MorningStar_Wang

Here is also the output of nvidia-smi and conda list:

Fri May  8 11:25:59 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.50       Driver Version: 430.50       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-PCIE...  Off  | 00000000:89:00.0 Off |                    0 |
| N/A   40C    P0    37W / 250W |   1354MiB / 32510MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-PCIE...  Off  | 00000000:8A:00.0 Off |                    0 |
| N/A   39C    P0    38W / 250W |    320MiB / 32510MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla V100-PCIE...  Off  | 00000000:8B:00.0 Off |                    0 |
| N/A   39C    P0    36W / 250W |    320MiB / 32510MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla V100-PCIE...  Off  | 00000000:8C:00.0 Off |                    0 |
| N/A   43C    P0    49W / 250W |  18828MiB / 32510MiB |     65%      Default |
+-------------------------------+----------------------+----------------------+
|   4  Tesla V100-PCIE...  Off  | 00000000:DA:00.0 Off |                    0 |
| N/A   40C    P0    46W / 250W |  18828MiB / 32510MiB |     73%      Default |
+-------------------------------+----------------------+----------------------+
|   5  Tesla V100-PCIE...  Off  | 00000000:DB:00.0 Off |                    0 |
| N/A   38C    P0    39W / 250W |    320MiB / 32510MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   6  Tesla V100-PCIE...  Off  | 00000000:DC:00.0 Off |                    0 |
| N/A   64C    P0   155W / 250W |   5789MiB / 32510MiB |     86%      Default |
+-------------------------------+----------------------+----------------------+
|   7  Tesla V100-PCIE...  Off  | 00000000:DD:00.0 Off |                    0 |
| N/A   40C    P0    40W / 250W |    320MiB / 32510MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2234      G   /usr/bin/X                                    22MiB |
|    0     83392      C   /usr/lhy/anaconda3/envs/tf2/bin/python       363MiB |
|    0    135660      C   /usr/lhy/anaconda3/envs/tf2/bin/python       955MiB |
|    1    135660      C   /usr/lhy/anaconda3/envs/tf2/bin/python       307MiB |
|    2    135660      C   /usr/lhy/anaconda3/envs/tf2/bin/python       307MiB |
|    3    135660      C   /usr/lhy/anaconda3/envs/tf2/bin/python       307MiB |
|    3    139517      C   python                                      4627MiB |
|    3    140633      C   python                                      4627MiB |
|    3    142274      C   python                                      4627MiB |
|    3    143266      C   python                                      4627MiB |
|    4    133393      C   python                                      4627MiB |
|    4    133906      C   python                                      4627MiB |
|    4    134679      C   python                                      4627MiB |
|    4    135379      C   python                                      4627MiB |
|    4    135660      C   /usr/lhy/anaconda3/envs/tf2/bin/python       307MiB |
|    5    135660      C   /usr/lhy/anaconda3/envs/tf2/bin/python       307MiB |
|    6     15156      C   python                                      5469MiB |
|    6    135660      C   /usr/lhy/anaconda3/envs/tf2/bin/python       307MiB |
|    7    135660      C   /usr/lhy/anaconda3/envs/tf2/bin/python       307MiB |
+-----------------------------------------------------------------------------+
# packages in environment at /usr/lhy/anaconda3/envs/tf2:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main    defaults
absl-py                   0.9.0                    pypi_0    pypi
astor                     0.8.1                    pypi_0    pypi
attrs                     19.3.0                   pypi_0    pypi
backcall                  0.1.0                    py37_0    defaults
bleach                    3.1.4                    pypi_0    pypi
ca-certificates           2020.1.1                      0    defaults
cachetools                4.1.0                    pypi_0    pypi
certifi                   2020.4.5.1               py37_0    defaults
chardet                   3.0.4                    pypi_0    pypi
cloudpickle               1.3.0                    pypi_0    pypi
cudatoolkit               10.1.243             h6bb024c_0    defaults
cudnn                     7.6.5                cuda10.1_0    defaults
cycler                    0.10.0                   pypi_0    pypi
decorator                 4.4.2                      py_0    defaults
defusedxml                0.6.0                    pypi_0    pypi
entrypoints               0.3                      pypi_0    pypi
gast                      0.2.2                    pypi_0    pypi
gin-config                0.1.3                    pypi_0    pypi
google-auth               1.14.0                   pypi_0    pypi
google-auth-oauthlib      0.4.1                    pypi_0    pypi
google-pasta              0.2.0                    pypi_0    pypi
grpcio                    1.28.1                   pypi_0    pypi
gym                       0.10.11                  pypi_0    pypi
h5py                      2.10.0                   pypi_0    pypi
idna                      2.9                      pypi_0    pypi
importlib-metadata        1.6.0                    pypi_0    pypi
ipykernel                 5.2.1                    pypi_0    pypi
ipython                   7.13.0           py37h5ca1d4c_0    defaults
ipython-genutils          0.2.0                    pypi_0    pypi
ipython_genutils          0.2.0                    py37_0    defaults
jedi                      0.17.0                   pypi_0    pypi
jinja2                    2.11.2                   pypi_0    pypi
joblib                    0.14.1                   pypi_0    pypi
json5                     0.9.4                    pypi_0    pypi
jsonschema                3.2.0                    pypi_0    pypi
jupyter-client            6.1.3                    pypi_0    pypi
jupyter_client            6.1.2                      py_0    defaults
jupyter_core              4.6.3                    py37_0    defaults
jupyterlab                2.1.0                    pypi_0    pypi
jupyterlab-server         1.1.1                    pypi_0    pypi
keras-applications        1.0.8                    pypi_0    pypi
keras-preprocessing       1.1.0                    pypi_0    pypi
kiwisolver                1.2.0                    pypi_0    pypi
ld_impl_linux-64          2.33.1               h53a641e_7    defaults
libedit                   3.1.20181209         hc058e9b_0    defaults
libffi                    3.2.1                hd88cf55_4    defaults
libgcc-ng                 9.1.0                hdf63c60_0    defaults
libsodium                 1.0.16               h1bed415_0    defaults
libstdcxx-ng              9.1.0                hdf63c60_0    defaults
lightgbm                  2.3.1                    pypi_0    pypi
markdown                  3.2.1                    pypi_0    pypi
markupsafe                1.1.1                    pypi_0    pypi
matplotlib                3.2.1                    pypi_0    pypi
minepy                    1.2.4                    pypi_0    pypi
mistune                   0.8.4                    pypi_0    pypi
nbconvert                 5.6.1                    pypi_0    pypi
nbformat                  5.0.6                    pypi_0    pypi
ncurses                   6.2                  he6710b0_0    defaults
notebook                  6.0.3                    pypi_0    pypi
numpy                     1.18.3                   pypi_0    pypi
oauthlib                  3.1.0                    pypi_0    pypi
openssl                   1.1.1g               h7b6447c_0    defaults
opt-einsum                3.2.1                    pypi_0    pypi
pandas                    1.0.3                    pypi_0    pypi
pandocfilters             1.4.2                    pypi_0    pypi
parso                     0.7.0                    pypi_0    pypi
pexpect                   4.8.0                    py37_0    defaults
pickleshare               0.7.5                    pypi_0    pypi
pip                       20.0.2                   py37_1    defaults
prometheus-client         0.7.1                    pypi_0    pypi
prompt-toolkit            3.0.5                    pypi_0    pypi
prompt_toolkit            3.0.4                         0    defaults
protobuf                  3.11.3                   pypi_0    pypi
ptyprocess                0.6.0                    pypi_0    pypi
pyaml                     20.4.0                   pypi_0    pypi
pyasn1                    0.4.8                    pypi_0    pypi
pyasn1-modules            0.2.8                    pypi_0    pypi
pydot                     1.4.1                    pypi_0    pypi
pyglet                    1.5.4                    pypi_0    pypi
pygments                  2.6.1                      py_0    defaults
pyparsing                 2.4.7                    pypi_0    pypi
pyrsistent                0.16.0                   pypi_0    pypi
python                    3.7.7           hcf32534_0_cpython    defaults
python-dateutil           2.8.1                      py_0    defaults
python-graphviz           0.14                     pypi_0    pypi
pytz                      2019.3                   pypi_0    pypi
pyyaml                    5.3.1                    pypi_0    pypi
pyzmq                     19.0.0                   pypi_0    pypi
readline                  8.0                  h7b6447c_0    defaults
requests                  2.23.0                   pypi_0    pypi
requests-oauthlib         1.3.0                    pypi_0    pypi
rsa                       4.0                      pypi_0    pypi
scikit-learn              0.22.2.post1             pypi_0    pypi
scipy                     1.4.1                    pypi_0    pypi
send2trash                1.5.0                    pypi_0    pypi
setuptools                46.1.3                   py37_0    defaults
six                       1.14.0                   py37_0    defaults
SQLite                    3.31.1               h7b6447c_0    defaults
tensorboard               2.1.1                    pypi_0    pypi
tensorflow                2.1.0                    pypi_0    pypi
tensorflow-addons         0.9.1                    pypi_0    pypi
tensorflow-estimator      2.1.0                    pypi_0    pypi
tensorflow-probability    0.9.0                    pypi_0    pypi
termcolor                 1.1.0                    pypi_0    pypi
terminado                 0.8.3                    pypi_0    pypi
testpath                  0.4.4                    pypi_0    pypi
tf-agents                 0.4.0                    pypi_0    pypi
tk                        8.6.8                hbc83047_0    defaults
tornado                   6.0.4            py37h7b6447c_1    defaults
tqdm                      4.45.0                   pypi_0    pypi
traitlets                 4.3.3                    py37_0    defaults
typeguard                 2.7.1                    pypi_0    pypi
urllib3                   1.25.9                   pypi_0    pypi
wcwidth                   0.1.9                      py_0    defaults
webencodings              0.5.1                    pypi_0    pypi
werkzeug                  1.0.1                    pypi_0    pypi
wheel                     0.34.2                   py37_0    defaults
wrapt                     1.12.1                   pypi_0    pypi
xgboost                   0.80                     pypi_0    pypi
xz                        5.2.5                h7b6447c_0    defaults
zeromq                    4.3.1                he6710b0_3    defaults
zipp                      3.1.0                    pypi_0    pypi
zlib                      1.2.11               h7b6447c_3    defaults
May 2020

snowkylin

Take a look at: python - Tensorflow CUDA - CUPTI error: CUPTI could not be loaded or symbol could not be found - Stack Overflow
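Since the log reports that libcupti.so.10.1 cannot be opened and prints the LD_LIBRARY_PATH it searched, a quick first check is whether any directory on that path actually contains the library. The sketch below is only a diagnostic aid under that assumption; `find_library` is a hypothetical helper name, and where CUPTI really lives depends on how CUDA was installed (on many system-wide installs it sits in an extras/CUPTI/lib64 directory under the CUDA toolkit, but treat that as something to verify on your machine).

```python
import glob
import os


def find_library(pattern, search_path=None):
    """Return files matching `pattern` in each directory of an LD_LIBRARY_PATH-style string."""
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    matches = []
    for directory in search_path.split(os.pathsep):
        if directory:  # skip empty entries such as the "::" seen in the log
            matches.extend(sorted(glob.glob(os.path.join(directory, pattern))))
    return matches


if __name__ == "__main__":
    found = find_library("libcupti.so*")
    if found:
        print("CUPTI candidates on LD_LIBRARY_PATH:")
        for path in found:
            print("  ", path)
    else:
        print("No libcupti.so* found on LD_LIBRARY_PATH")
```

If nothing is found, locating the library elsewhere on disk and adding its directory to LD_LIBRARY_PATH before starting Python is the usual next step.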

May 2020

freedomhnter

ERROR: Could not find a version that satisfies the requirement tensorflow (from versions: none)
ERROR: No matching distribution found for tensorflow

I followed your instructions but cannot install tensorflow.

1 reply
May 2020 ▶ freedomhnter

snowkylin

Please check whether your Python is the 64-bit build; see https://stackoverflow.com/questions/48720833/could-not-find-a-version-that-satisfies-the-requirement-tensorflow
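A quick way to check this from inside the interpreter that pip belongs to is shown below. It uses only the standard library; `python_bitness` is just an illustrative helper name.

```python
import platform
import struct
import sys


def python_bitness():
    """Return 32 or 64 depending on the pointer size of the running interpreter."""
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Version:", platform.python_version())
    print("Bitness:", python_bitness(), "bit")
    if python_bitness() != 64:
        print("This is not a 64-bit interpreter, which would explain "
              "pip finding no matching tensorflow distribution.")
```

Run it with the same python executable you use for `python -m pip install tensorflow`, since several Python installations can coexist on one machine.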

May 2020

MingCheung

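Hello! I installed TF 2.0. Every time it starts, the CUDA loading messages below are printed, and some of them appear twice, but the computation itself works fine. Do you know why?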
2020-05-22 09:09:35.094679: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-05-22 09:09:35.146396: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: TITAN X (Pascal) major: 6 minor: 1 memoryClockRate (GHz): 1.531
pciBusID: 0000:02:00.0
2020-05-22 09:09:35.147109: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 1 with properties:
name: Quadro P2000 major: 6 minor: 1 memoryClockRate (GHz): 1.4805
pciBusID: 0000:01:00.0
2020-05-22 09:09:35.147377: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-05-22 09:09:35.149235: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-05-22 09:09:35.150833: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-05-22 09:09:35.151172: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-05-22 09:09:35.153255: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-05-22 09:09:35.154837: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-05-22 09:09:35.159338: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-05-22 09:09:35.162403: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0, 1
2020-05-22 09:09:35.198717: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2194750000 Hz
2020-05-22 09:09:35.201895: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x56304c7438d0 executing computations on platform Host. Devices:
2020-05-22 09:09:35.201939: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): Host, Default Version
2020-05-22 09:09:35.545476: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x56304ca88560 executing computations on platform CUDA. Devices:
2020-05-22 09:09:35.545540: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): TITAN X (Pascal), Compute Capability 6.1
2020-05-22 09:09:35.545550: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (1): Quadro P2000, Compute Capability 6.1
2020-05-22 09:09:35.547244: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: TITAN X (Pascal) major: 6 minor: 1 memoryClockRate (GHz): 1.531
pciBusID: 0000:02:00.0
2020-05-22 09:09:35.548178: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 1 with properties:
name: Quadro P2000 major: 6 minor: 1 memoryClockRate (GHz): 1.4805
pciBusID: 0000:01:00.0
2020-05-22 09:09:35.548249: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-05-22 09:09:35.548280: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-05-22 09:09:35.548307: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-05-22 09:09:35.548333: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-05-22 09:09:35.548359: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-05-22 09:09:35.548384: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-05-22 09:09:35.548411: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-05-22 09:09:35.552360: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0, 1
2020-05-22 09:09:35.552423: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-05-22 09:09:35.555863: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-22 09:09:35.555894: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165] 0 1
2020-05-22 09:09:35.555908: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0: N N
2020-05-22 09:09:35.555919: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 1: N N
2020-05-22 09:09:35.559429: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 11435 MB memory) -> physical GPU (device: 0, name: TITAN X (Pascal), pci bus id: 0000:02:00.0, compute capability: 6.1)
2020-05-22 09:09:35.560428: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 4518 MB memory) -> physical GPU (device: 1, name: Quadro P2000, pci bus id: 0000:01:00.0, compute capability: 6.1)
<tf.Tensor: id=0, shape=(), dtype=int32, numpy=1>

2 replies
June 2020 ▶ MingCheung

hifeng2017

Same question. I see the same thing after installing tensorflow 2.2, but the output itself is normal.

June 2020 ▶ MingCheung

snowkylin

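@MingCheung @hifeng2017 TensorFlow prints some log messages to the terminal while it runs; this is normal. In practice, as long as there are no errors you are fine (warnings mostly mean that a deprecated API is being used, and you can adjust your code according to the message).

If you do not like these logs, see https://stackoverflow.com/questions/35911252/disable-tensorflow-debugging-information to turn them off.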
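As a rough sketch of what that usually looks like: the C++ side of TensorFlow reads the TF_CPP_MIN_LOG_LEVEL environment variable, so it has to be set before the import. The level "2" below is just one reasonable choice, and the try/except around the import is only there so the snippet also runs where TensorFlow is not installed.

```python
import os

# Must be set before TensorFlow is imported.
# "0" shows everything, "1" hides INFO, "2" hides INFO and WARNING, "3" also hides ERROR.
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"

try:
    import tensorflow as tf
except ImportError:
    tf = None  # TensorFlow is not installed in this environment

if tf is not None:
    # The Python-side logger (deprecation notices and similar) is configured separately.
    tf.get_logger().setLevel("ERROR")
    print(tf.reduce_sum(tf.random.normal([10, 10])))
```

Hiding warnings also hides genuinely useful hints, such as a GPU library failing to load, so it may be better to leave logging on while debugging an installation.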

June 2020

Yuanwanli1995

Thank you very much for sharing; it is very clearly written and has helped me a lot.
Why can't I inspect variable values in Spyder?

1 reply
June 2020 ▶ Yuanwanli1995

snowkylin

I have not used Spyder, so I can only suggest searching online for the error message. This handbook recommends PyCharm as the IDE; see https://tf.wiki/zh_hans/basic/installation.html#ide

June 2020

ORION丶

I get the error below when running import tensorflow as tf. What is the problem? It installed and worked fine before.
2020-06-28 21:14:21.568630: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
Traceback (most recent call last):
File "", line 1, in <module>
File "F:\Python\InstallPath\lib\site-packages\tensorflow\__init__.py", line 41, in <module>
from tensorflow.python.tools import module_util as module_util
File "F:\Python\InstallPath\lib\site-packages\tensorflow\python\__init__.py", line 84, in <module>
from tensorflow.python import keras
File "F:\Python\InstallPath\lib\site-packages\tensorflow\python\keras\__init__.py", line 27, in <module>
from tensorflow.python.keras import models
File "F:\Python\InstallPath\lib\site-packages\tensorflow\python\keras\models.py", line 27, in <module>
from tensorflow.python.keras.engine import sequential
File "F:\Python\InstallPath\lib\site-packages\tensorflow\python\keras\engine\sequential.py", line 27, in <module>
from tensorflow.python.keras.engine import training
File "F:\Python\InstallPath\lib\site-packages\tensorflow\python\keras\engine\training.py", line 37, in <module>
from tensorflow.python.keras.engine import data_adapter
File "F:\Python\InstallPath\lib\site-packages\tensorflow\python\keras\engine\data_adapter.py", line 55, in <module>
from scipy import sparse as scipy_sparse # pylint: disable=g-import-not-at-top
File "F:\Python\InstallPath\lib\site-packages\scipy\__init__.py", line 106, in <module>
from . import _distributor_init
File "F:\Python\InstallPath\lib\site-packages\scipy\_distributor_init.py", line 26, in <module>
WinDLL (os.path.abspath (filename))
File "F:\Python\InstallPath\lib\ctypes\__init__.py", line 364, in __init__
self._handle = _dlopen (self._name, mode)
OSError: [WinError 126] The specified module could not be found.

1 reply
June 2020 ▶ ORION丶

snowkylin

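Please tell me which TensorFlow version you installed, and see the section "Possible error messages when importing TensorFlow and their solutions" at https://tf.wiki/zh/basic/installation.html#id12 . Consider installing Microsoft Visual C++ Redistributable for Visual Studio 2015, 2017 and 2019 and see whether that helps.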

1 reply
June 2020 ▶ snowkylin

ORION丶

I first installed the GPU version 2.2.0 with CUDA 10.1 and cuDNN 7.6.5. I then downgraded to 2.1.0, but the problem is still there.

July 2020

zyk516

2020-07-24 17:01:38.684786: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2.2.0 installed successfully, but it runs extremely slowly. Is that because of the message above?

1 reply
July 2020 ▶ zyk516

snowkylin

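  1. What exactly does "runs extremely slowly" look like? Slow compared with what, and by how much?
  2. Please provide the code you are running and the complete terminal output.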
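One way to put numbers on the first question is to time the import, the first operation, and a repeat of the same operation separately, since a slow start-up and slow computation point to different causes. The sketch below assumes a trivial reduce_sum as the workload; `timed` and `profile_stages` are hypothetical helper names introduced for the example.

```python
import importlib
import time


def timed(label, fn):
    """Run fn(), print the wall-clock time it took, and return (result, seconds)."""
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.2f} s")
    return result, elapsed


def profile_stages():
    """Time the import, the first op (includes device setup) and a repeat of the same op."""
    try:
        tf, _ = timed("import tensorflow", lambda: importlib.import_module("tensorflow"))
    except ImportError:
        print("TensorFlow is not installed in this environment")
        return

    def op():
        # .numpy() waits for the result, so the whole computation is measured
        return tf.reduce_sum(tf.random.normal([10, 10])).numpy()

    timed("first op (includes device initialisation)", op)
    timed("same op again", op)


if __name__ == "__main__":
    profile_stages()
```

If only the first op is slow and the repeat is fast, the time is going into one-off device initialisation and not into the computation itself.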
July 2020

zyk516

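The code I ran is print(tf.reduce_sum(tf.random.normal([10, 10]))), and each run takes about 10 minutes. My machine has an i5 CPU and a GTX 950M GPU. I had tensorflow 2.2 installed before and it did not behave like this.
The complete terminal output is: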
2020-07-27 17:28:44.898662: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-07-27 17:28:45.943645: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-27 17:28:45.945260: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 950M computeCapability: 5.0
coreClock: 1.124GHz coreCount: 5 deviceMemorySize: 3.95GiB deviceMemoryBandwidth: 29.83GiB/s
2020-07-27 17:28:45.993423: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-07-27 17:28:46.454724: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-07-27 17:28:46.789430: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-07-27 17:28:46.891430: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-07-27 17:28:47.415397: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-07-27 17:28:47.655817: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-07-27 17:28:48.329574: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-07-27 17:28:48.329881: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-27 17:28:48.330447: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-27 17:28:48.330787: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-07-27 17:28:48.356413: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-07-27 17:28:48.577678: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2299965000 Hz
2020-07-27 17:28:48.577952: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4dd02d0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-27 17:28:48.577971: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-07-27 17:28:49.139140: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-27 17:28:49.139682: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4dd2e20 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-07-27 17:28:49.139747: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce GTX 950M, Compute Capability 5.0
2020-07-27 17:28:49.173354: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-27 17:28:49.176437: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 950M computeCapability: 5.0
coreClock: 1.124GHz coreCount: 5 deviceMemorySize: 3.95GiB deviceMemoryBandwidth: 29.83GiB/s
2020-07-27 17:28:49.194079: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-07-27 17:28:49.194111: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-07-27 17:28:49.194127: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-07-27 17:28:49.194144: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-07-27 17:28:49.194160: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-07-27 17:28:49.194175: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-07-27 17:28:49.194191: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-07-27 17:28:49.194313: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-27 17:28:49.194822: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-27 17:28:49.195171: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-07-27 17:28:49.220996: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-07-27 17:28:49.236580: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-27 17:28:49.236624: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2020-07-27 17:28:49.236634: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
2020-07-27 17:28:49.245993: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-27 17:28:49.246740: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-27 17:28:49.247191: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3708 MB memory) -> physical GPU (device: 0, name: GeForce GTX 950M, pci bus id: 0000:01:00.0, compute capability: 5.0)
The output is rather long; could you please take a look?

August 2020

YummyLau

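I ran into a few pitfalls; here are some tips for those who come after.
macOS Catalina 10.15.4.

To understand Anaconda environments, see 初学 Python 者自学 Anaconda 的正确姿势是什么? - 知乎 ("What is the right way for a Python beginner to learn Anaconda?", Zhihu).
If the TensorFlow installation fails on wrapt, update the package with conda; see ERROR: Cannot uninstall 'wrapt'. during upgrade · Issue #30191 · tensorflow/tensorflow · GitHub → conda update wrapt
Run pip install tensorflow inside the virtual environment, not in the host environment.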
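
A quick way to confirm which environment a package would be installed into is to ask the interpreter itself. The following is a minimal sketch using only the standard library; the helper name `describe_environment` is made up for illustration.

```python
import os
import sys


def describe_environment():
    """Report which interpreter and environment the current process uses."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        # True inside a venv/virtualenv; conda environments do not set this.
        "in_venv": sys.prefix != getattr(sys, "base_prefix", sys.prefix),
        # Set by `conda activate`; None when no conda environment is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running `python -m pip install tensorflow` with that same interpreter makes pip install into the environment the interpreter belongs to.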

August 2020

onlyearth

After running my code in PyCharm, the following messages appear, but there is no error and the output is correct. What causes this?

2020-08-08 18:02:52.694491: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-08-08 18:02:52.694817: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.

2020-08-08 18:02:57.836214: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'nvcuda.dll'; dlerror: nvcuda.dll not found
2020-08-08 18:02:57.836522: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2020-08-08 18:02:57.849447: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host:
2020-08-08 18:02:57.849992: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname:
2020-08-08 18:02:57.850409: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-08-08 18:02:57.862757: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x11aac2bc010 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-08-08 18:02:57.863116: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version

1 reply
August 2020 ▶ onlyearth

snowkylin

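TensorFlow may print some informational messages while running; this is normal.

If you mean cudart64_101.dll not found, it indicates that CUDA is not installed or not installed correctly. See the handbook's "GPU 版本 TensorFlow 安装指南" (guide to installing the GPU version of TensorFlow): https://tf.wiki/zh_hans/basic/installation.html#gputensorflow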
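
To see whether the libraries named in the log can be found at all, one option is to search the directories on PATH. This is a rough sketch and rests on the assumption that the DLLs are resolved through PATH on Windows; `find_on_path` is a hypothetical helper, not part of TensorFlow.

```python
import os


def find_on_path(filename, search_path=None):
    """Return every directory on the search path that contains `filename`."""
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    hits = []
    for directory in search_path.split(os.pathsep):
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    # File names taken from the log messages above.
    for dll in ("cudart64_101.dll", "nvcuda.dll"):
        locations = find_on_path(dll)
        print(dll, "->", locations if locations else "not found on PATH")
```

An empty result for cudart64_101.dll would be consistent with CUDA 10.1 being absent or its bin directory not being on PATH.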

August 2020

Xingwen_Zhu

import tensorflow as tf

A = tf.constant([[1, 2], [3, 4]])
B = tf.constant([[5, 6], [7, 8]])
C = tf.matmul(A, B)

print(C)

Why is my output the following?
Tensor("MatMul_9:0", shape=(2, 2), dtype=int32)

1 reply
August 2020 ▶ Xingwen_Zhu

snowkylin

Please check whether your TensorFlow version is 2.x:

import tensorflow as tf
print(tf.__version__)
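
A symbolic `Tensor("MatMul...")` instead of concrete values usually means the operation ran in graph mode, which is the default in TensorFlow 1.x. The sketch below checks both the version and whether eager execution is active; `is_tf2` is a hypothetical helper, and the TensorFlow import is guarded so the script also runs where TensorFlow is missing.

```python
def is_tf2(version):
    """Return True if a version string such as '2.3.0' has major version 2 or later."""
    major = version.split(".")[0]
    return major.isdigit() and int(major) >= 2


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        print("version:", tf.__version__)
        print("TensorFlow 2.x:", is_tf2(tf.__version__))
        # In eager mode print(C) shows the values; in graph mode it shows
        # a symbolic Tensor("MatMul...") instead.
        print("eager execution:", tf.executing_eagerly())
```
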
August 2020

KSCzzZ

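What causes the following error when running on the GPU? (The messages in PyCharm show that the program detects the GPU, including its model and compute capability.) tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [0] [Op:Assert] name: EagerVariableNameReuse
It happens both when I specify the GPU explicitly and with the default settings; if I switch to CPU only, the code runs fine. The error is raised at tf.keras.models.Sequential.
Any help would be much appreciated.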

1 reply
August 2020 ▶ KSCzzZ

snowkylin

Please post your complete code and the complete error message (as text, not a screenshot), along with your TensorFlow version.

1 reply
August 2020 ▶ snowkylin

KSCzzZ

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics
import os

os.environ['TF_CPP_MIN_LOG_LEVEL'] = "2"
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

def preprocess(x, y):
    x = tf.cast(x, dtype=tf.float32) / 255.0
    y = tf.cast(y, dtype=tf.int32)
    return x, y


(x, y), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
print(x.shape, y.shape, x_test.shape, y_test.shape)

batchsize = 128

db = tf.data.Dataset.from_tensor_slices((x, y))
db = db.map(preprocess).shuffle(10000).batch(batchsize)

db_test = tf.data.Dataset.from_tensor_slices((x_test, y_test))
db_test = db_test.map(preprocess).batch(batchsize)

db_iter = iter(db)
sample = next(db_iter)
print(sample[0].shape, sample[1].shape)

model = keras.Sequential([
    layers.Dense(256, activation=tf.nn.relu),
    layers.Dense(128, activation=tf.nn.relu),
    layers.Dense(64, activation=tf.nn.relu),
    layers.Dense(32, activation=tf.nn.relu),
    layers.Dense(10)
])

model.build(input_shape=[None, 28*28])
model.summary()

optimizer = optimizers.Adam(lr=1e-3)


def main():
    for epoch in range(30):

        for step, (x, y) in enumerate(db):
            x = tf.reshape(x, [-1, 28*28])
            with tf.GradientTape() as tape:
                y_onehot = tf.one_hot(y, depth=10)
                logits = model(x)
                loss1 = tf.reduce_mean(tf.losses.MSE(y_onehot, logits))
                loss2 = tf.reduce_mean(tf.losses.categorical_crossentropy(y_onehot, logits, from_logits=True))

            grads = tape.gradient(loss1, model.trainable_variables)
            optimizer.apply_gradients(zip(grads, model.trainable_variables))

            if step % 100 == 0:
                print(epoch, step, "loss:", float(loss1), float(loss2))

        total_correct = 0
        total_num = 0
        for x, y in db_test:

            x = tf.reshape(x, [-1, 28 * 28])
            logits = model(x)

            prob = tf.nn.softmax(logits, axis=1)

            pred = tf.argmax(prob, axis=1)
            pred = tf.cast(pred, dtype=tf.int32)

            correct = tf.equal(pred, y)
            correct = tf.reduce_sum(tf.cast(correct, dtype=tf.int32))
            total_correct += int(correct)
            total_num += x.shape[0]

        acc = total_correct / total_num
        print(epoch, "acc:", acc)

if __name__ == '__main__':
    main()


Traceback (most recent call last):
  File "E:/Python_code/test/FashionMnist_layer", line 30, in <module>
    model = keras.Sequential ([
  File "D:\Program\Python\lib\site-packages\tensorflow\python\training\tracking\base.py", line 457, in _method_wrapper
    result = method (self, *args, **kwargs)
  File "D:\Program\Python\lib\site-packages\tensorflow\python\keras\engine\sequential.py", line 116, in __init__
    super (functional.Functional, self).__init__(  # pylint: disable=bad-super-call
  File "D:\Program\Python\lib\site-packages\tensorflow\python\training\tracking\base.py", line 457, in _method_wrapper
    result = method (self, *args, **kwargs)
  File "D:\Program\Python\lib\site-packages\tensorflow\python\keras\engine\training.py", line 308, in __init__
    self._init_batch_counters ()
  File "D:\Program\Python\lib\site-packages\tensorflow\python\training\tracking\base.py", line 457, in _method_wrapper
    result = method (self, *args, **kwargs)
  File "D:\Program\Python\lib\site-packages\tensorflow\python\keras\engine\training.py", line 317, in _init_batch_counters
    self._train_counter = variables.Variable (0, dtype='int64', aggregation=agg)
  File "D:\Program\Python\lib\site-packages\tensorflow\python\ops\variables.py", line 262, in __call__
    return cls._variable_v2_call (*args, **kwargs)
  File "D:\Program\Python\lib\site-packages\tensorflow\python\ops\variables.py", line 244, in _variable_v2_call
    return previous_getter (
  File "D:\Program\Python\lib\site-packages\tensorflow\python\ops\variables.py", line 237, in <lambda>
    previous_getter = lambda **kws: default_variable_creator_v2 (None, **kws)
  File "D:\Program\Python\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 2633, in default_variable_creator_v2
    return resource_variable_ops.ResourceVariable (
  File "D:\Program\Python\lib\site-packages\tensorflow\python\ops\variables.py", line 264, in __call__
    return super (VariableMetaclass, cls).__call__(*args, **kwargs)
  File "D:\Program\Python\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 1507, in __init__
    self._init_from_args (
  File "D:\Program\Python\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 1661, in _init_from_args
    handle = eager_safe_variable_handle (
  File "D:\Program\Python\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 242, in eager_safe_variable_handle
    return _variable_handle_from_shape_and_dtype (
  File "D:\Program\Python\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 174, in _variable_handle_from_shape_and_dtype
    gen_logging_ops._assert (  # pylint: disable=protected-access
  File "D:\Program\Python\lib\site-packages\tensorflow\python\ops\gen_logging_ops.py", line 49, in _assert
    _ops.raise_from_not_ok_status (e, name)
  File "D:\Program\Python\lib\site-packages\tensorflow\python\framework\ops.py", line 6843, in raise_from_not_ok_status
    six.raise_from (core._status_to_exception (e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [0] [Op:Assert] name: EagerVariableNameReuse
1 reply
August 2020

snowkylin

I ran it on a Colab GPU without any problems. I suggest looking at InvalidArgumentError: assertion failed: [0] [Op:Assert] name: EagerVariableNameReuse · Issue #38518 · tensorflow/tensorflow · GitHub, and checking whether your GPU is being used by another program.
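
One way to check whether another process is holding the GPU is to ask `nvidia-smi` for its list of compute processes. The following is a sketch under the assumption that `nvidia-smi` is on PATH; the helper names are made up, and on some Windows driver modes the memory column may not be reported.

```python
import shutil
import subprocess


def parse_compute_apps(csv_text):
    """Parse `pid, process_name, used_memory` CSV rows into dictionaries."""
    rows = []
    for line in csv_text.splitlines():
        parts = [part.strip() for part in line.split(",")]
        if len(parts) < 3 or not parts[0].isdigit():
            continue  # skip blank lines and anything that is not a data row
        rows.append({
            "pid": int(parts[0]),
            "name": ",".join(parts[1:-1]),  # tolerate commas inside the name
            "used_memory": parts[-1],
        })
    return rows


def gpu_compute_apps():
    """Return the processes currently using the GPU, or [] if nvidia-smi is absent."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-compute-apps=pid,process_name,used_memory",
         "--format=csv,noheader"],
        stdout=subprocess.PIPE, stderr=subprocess.PIPE,
        universal_newlines=True, check=False,
    )
    return parse_compute_apps(result.stdout)


if __name__ == "__main__":
    apps = gpu_compute_apps()
    if not apps:
        print("no compute processes reported (or nvidia-smi not found)")
    for app in apps:
        print(app)
```

If another Python process from an earlier run shows up in the list, closing it before rerunning the script is a reasonable first thing to try.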

September 2020

Xavier_Lee

(tf2) C:\Users\Xavie>python
Python 3.7.9 (default, Aug 31 2020, 17:10:11) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2020-09-11 20:33:26.555854: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'cudart64_101.dll'; dlerror: cudart64_101.dll not found
2020-09-11 20:33:26.560825: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.

This is what I get. How can I fix it?

1 reply
September 2020 ▶ Xavier_Lee

snowkylin

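This message means that your computer either has no CUDA-capable GPU or the GPU is not configured correctly, so TensorFlow will compute on the CPU. If your machine really has no NVIDIA graphics card, or you do not plan to use the GPU, you can ignore the message.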
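
If the message is merely noise on a CPU-only machine, the C++ log level can be raised through the `TF_CPP_MIN_LOG_LEVEL` environment variable. This is a minimal sketch; the TensorFlow import is guarded so the snippet also runs where TensorFlow is not installed.

```python
import os

# Must be set before TensorFlow is imported for the first time.
# "0" shows everything, "1" hides INFO, "2" also hides WARNING,
# "3" also hides ERROR.
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"

try:
    import tensorflow as tf
except ImportError:
    tf = None  # TensorFlow is not installed in this environment

if tf is not None:
    print(tf.__version__)
```

Note that level "2" hides every warning, not just this one, so it is better suited to a setup that is already known to work.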

September 2020

zhanghanyuA

I always get this warning when using TensorFlow. What causes it?

2020-09-18 14:39:07.885039: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
(2, 2)
2020-09-18 14:39:07.896766: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fde58a96d90 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-18 14:39:07.896780: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
My TensorFlow is the CPU build, version 2.3.0.

2 replies
September 2020 ▶ zhanghanyuA

snowkylin

This isn't a warning; the word "warning" doesn't appear anywhere. These are just informational messages.
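
The severity of each line can be read from the single letter after the timestamp: `I` for info, `W` for warning, `E` for error. A small sketch that extracts it, assuming the timestamp layout seen in the logs in this thread (`log_severity` is a hypothetical helper):

```python
import re

# Timestamp, colon, one severity letter, then the source file.
_LOG_LINE = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+: ([IWEF]) ")
_SEVERITY = {"I": "INFO", "W": "WARNING", "E": "ERROR", "F": "FATAL"}


def log_severity(line):
    """Return the severity name of a TensorFlow C++ log line, or None."""
    match = _LOG_LINE.match(line)
    return _SEVERITY[match.group(1)] if match else None


if __name__ == "__main__":
    sample = ("2020-09-18 14:39:07.896780: I tensorflow/compiler/xla/service/"
              "service.cc:176] StreamExecutor device (0): Host, Default Version")
    print(log_severity(sample))  # prints INFO
```
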

September 2020 ▶ zhanghanyuA

Rachel

Did you manage to solve this? I'm running into the same problem.

1 reply
September 2020 ▶ Rachel

sc-learner

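As far as I know, getting rid of the first warning requires downloading the TensorFlow source code from GitHub and rebuilding it with Bazel, which is fairly involved. See https://tensorflow.google.cn/install/source for details. (I use the GPU version myself, so I had no need to resolve this.)

I haven't figured out the second warning yet; I need to read the XLA documentation carefully.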

September 2021

632117529

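Hello, what do the following two messages mean when I run the training code?
1. None of the MLIR Optimization Passes are enabled
2. Couldn't invoke ptxas.exe --version
The code is the example from this link: Convolutional Neural Network (CNN) | TensorFlow Core
It appears to train on the GPU, but the speed feels off. Could it be related to the machine having an AMD CPU? The CPU is a 5800X, the GPU is a 3080 Ti, and the environment is tensorflow-gpu 2.5.0, python=3.8, cudatoolkit=11.3, cudnn=8.2.

The full output is as follows: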
2021-09-12 17:35:28.302826: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
Epoch 1/10
2021-09-12 17:35:28.528250: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cudnn64_8.dll
2021-09-12 17:35:28.954801: I tensorflow/stream_executor/cuda/cuda_dnn.cc:359] Loaded cuDNN version 8201
2021-09-12 17:35:29.577930: E tensorflow/core/platform/windows/subprocess.cc:287] Call to CreateProcess failed. Error code: 2
2021-09-12 17:35:29.578036: W tensorflow/stream_executor/gpu/asm_compiler.cc:56] Couldn't invoke ptxas.exe --version
2021-09-12 17:35:29.582453: E tensorflow/core/platform/windows/subprocess.cc:287] Call to CreateProcess failed. Error code: 2
2021-09-12 17:35:29.582966: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] Internal: Failed to launch ptxas
Relying on driver to perform ptx compilation.
Modify $PATH to customize ptxas location.
This message will be only logged once.
2021-09-12 17:35:29.620691: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cublas64_11.dll
2021-09-12 17:35:30.122407: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library cublasLt64_11.dll
2021-09-12 17:35:30.155622: I tensorflow/stream_executor/cuda/cuda_blas.cc:1838] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
1563/1563 [==============================] - 7s 3ms/step - loss: 1.5276 - accuracy: 0.4408 - val_loss: 1.2432 - val_accuracy: 0.5580
Epoch 2/10
1563/1563 [==============================] - 4s 3ms/step - loss: 1.1622 - accuracy: 0.5898 - val_loss: 1.1072 - val_accuracy: 0.6117

September 2021

snowkylin

I'm not sure of the cause either. You could look at python - Tensorflow 2.4.1 - Couldn't invoke ptxas.exe - Stack Overflow and try a different combination of TensorFlow and CUDA versions.
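
The log itself suggests modifying PATH so that ptxas can be located. A quick sketch to check whether the executable is currently reachable from Python, using only the standard library (`find_ptxas` is a made-up helper name):

```python
import shutil


def find_ptxas():
    """Return the full path of ptxas if it is on PATH, otherwise None."""
    return shutil.which("ptxas")


if __name__ == "__main__":
    path = find_ptxas()
    if path is None:
        print("ptxas not found on PATH; per the log, TensorFlow then "
              "relies on the driver for PTX compilation")
    else:
        print("ptxas found at", path)
```

If the result is `None`, that would match the "Call to CreateProcess failed. Error code: 2" lines in the log above.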

February 2022

Erisnoit

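Hello, after installing TensorFlow and CUDA in conda following your method and order, should it be able to run on the GPU right away?
I followed the steps above, but the GPU is not detected, and there seems to be no error about missing files. The CPU works normally.
print(tf.test.gpu_device_name()) gives what looks like an error; the output is roughly as follows.
[] [PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]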
2022-02-18 19:55:37.195693: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.

The versions are:
cudatoolkit 11.3.1 h2bc3f7f_2
cudnn 8.2.1 cuda11.3_0
tensorboard 2.6.0 py_1
tensorboard-data-server 0.6.0 py37hca6d32c_0
tensorboard-plugin-wit 1.6.0 py_0
tensorflow 2.6.0 mkl_py37h9d15365_0
tensorflow-base 2.6.0 mkl_py37h3d85931_0
tensorflow-estimator 2.6.0 pyh7b7c402_0

I have tried all kinds of reinstalls and am at a loss. Has anyone seen this before?

1 reply
March 2022 ▶ Erisnoit

snowkylin

The output here doesn't show any error. Which line says that the GPU is not detected?
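
One thing that may be worth checking in the package list above is the build string of the tensorflow package, which begins with `mkl`. The sketch below guesses the variant from a `conda list` row; it rests on the assumption that build strings starting with `mkl` or `eigen` denote CPU builds and those starting with `gpu` denote GPU builds, and `build_variant` is a hypothetical helper.

```python
def build_variant(conda_list_line):
    """Guess 'gpu' or 'cpu' from the build string of a `conda list` row.

    Assumes the row has the form `name version build [channel]` and that the
    build string starts with the variant name (e.g. `mkl_...` or `gpu_...`).
    """
    fields = conda_list_line.split()
    if len(fields) < 3:
        return None
    prefix = fields[2].split("_")[0].lower()
    if prefix == "gpu":
        return "gpu"
    if prefix in ("mkl", "eigen"):
        return "cpu"
    return None  # unknown naming scheme; make no claim


if __name__ == "__main__":
    # Row copied from the package list in the question above.
    print(build_variant("tensorflow 2.6.0 mkl_py37h9d15365_0"))
```

Inside Python, `tf.test.is_built_with_cuda()` reports whether the installed binary was built with CUDA support, and `tf.config.list_physical_devices('GPU')` lists the GPUs TensorFlow can see.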