TensorFlow Installation and Environment Setup

Thank you very much for sharing. It is clearly written and has helped me a lot.
Why can't I inspect variable values in Spyder?

I have not used Spyder, so I can only suggest searching online for the error message. This handbook recommends PyCharm as the IDE; see https://tf.wiki/zh_hans/basic/installation.html#ide

I get the error below when running import tensorflow as tf. What is the problem? Earlier installations worked without issues.
2020-06-28 21:14:21.568630: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
Traceback (most recent call last):
  File "", line 1, in
  File "F:\Python\InstallPath\lib\site-packages\tensorflow\__init__.py", line 41, in
    from tensorflow.python.tools import module_util as module_util
  File "F:\Python\InstallPath\lib\site-packages\tensorflow\python\__init__.py", line 84, in
    from tensorflow.python import keras
  File "F:\Python\InstallPath\lib\site-packages\tensorflow\python\keras\__init__.py", line 27, in
    from tensorflow.python.keras import models
  File "F:\Python\InstallPath\lib\site-packages\tensorflow\python\keras\models.py", line 27, in
    from tensorflow.python.keras.engine import sequential
  File "F:\Python\InstallPath\lib\site-packages\tensorflow\python\keras\engine\sequential.py", line 27, in
    from tensorflow.python.keras.engine import training
  File "F:\Python\InstallPath\lib\site-packages\tensorflow\python\keras\engine\training.py", line 37, in
    from tensorflow.python.keras.engine import data_adapter
  File "F:\Python\InstallPath\lib\site-packages\tensorflow\python\keras\engine\data_adapter.py", line 55, in
    from scipy import sparse as scipy_sparse # pylint: disable=g-import-not-at-top
  File "F:\Python\InstallPath\lib\site-packages\scipy\__init__.py", line 106, in
    from . import _distributor_init
  File "F:\Python\InstallPath\lib\site-packages\scipy\_distributor_init.py", line 26, in
    WinDLL(os.path.abspath(filename))
  File "F:\Python\InstallPath\lib\ctypes\__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 126] The specified module could not be found.

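Please state which TensorFlow version you installed, and see the section "Possible error messages when importing TensorFlow and their solutions" at https://tf.wiki/zh/basic/installation.html#id12. Also consider installing the Microsoft Visual C++ Redistributable for Visual Studio 2015, 2017 and 2019 and check whether that helps.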
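
When the import itself fails, the installed versions can still be collected without importing TensorFlow. The following is a minimal sketch, assuming Python 3.8 or later (for `importlib.metadata`); the helper name `installed_version` is hypothetical and the package list is only an example:

```python
from importlib import metadata
import platform
import sys


def installed_version(package):
    """Return the installed version string of a pip package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python:", sys.version.split()[0], "on", platform.platform())
    # Example package names; adjust to whatever is relevant to the report.
    for name in ("tensorflow", "tensorflow-gpu", "scipy", "numpy"):
        print(name, "->", installed_version(name))
```

Pasting this output together with the traceback should make it easier for others to spot a version mismatch.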

I first installed the GPU version 2.2.0 with CUDA 10.1 and cuDNN 7.6.5. I later downgraded to 2.1.0, but the problem persists.

2020-07-24 17:01:38.684786: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2.2.0 installed successfully, but it runs extremely slowly. Is that caused by the message above?

  1. What exactly does "extremely slow" look like? Slow compared with what, and by how much?
  2. Please provide the code you ran and the complete terminal output.
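
To put a number on "slow", it may help to time several consecutive calls, since the first call on a GPU usually includes one-off initialisation. This is only a sketch: `time_calls` is a hypothetical helper, and the small TensorFlow operation is just a stand-in for whatever code is being measured.

```python
import time


def time_calls(fn, repeats=3):
    """Call fn repeatedly and return the wall-clock duration of each call in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        # The first call typically includes device initialisation; later calls do not.
        op = lambda: tf.reduce_sum(tf.random.normal([10, 10])).numpy()
        for i, seconds in enumerate(time_calls(op)):
            print(f"call {i}: {seconds:.3f} s")
```

If only the first call is slow, the time is going into start-up rather than into the computation itself.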

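The code I ran is print(tf.reduce_sum(tf.random.normal([10, 10]))), and it takes about 10 minutes to run. My machine has an i5 CPU and a GTX 950M GPU. I installed TensorFlow 2.2 before and it did not behave like this.
The complete terminal output is: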
2020-07-27 17:28:44.898662: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-07-27 17:28:45.943645: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-27 17:28:45.945260: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 950M computeCapability: 5.0
coreClock: 1.124GHz coreCount: 5 deviceMemorySize: 3.95GiB deviceMemoryBandwidth: 29.83GiB/s
2020-07-27 17:28:45.993423: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-07-27 17:28:46.454724: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-07-27 17:28:46.789430: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-07-27 17:28:46.891430: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-07-27 17:28:47.415397: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-07-27 17:28:47.655817: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-07-27 17:28:48.329574: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-07-27 17:28:48.329881: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-27 17:28:48.330447: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-27 17:28:48.330787: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-07-27 17:28:48.356413: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-07-27 17:28:48.577678: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2299965000 Hz
2020-07-27 17:28:48.577952: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4dd02d0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-07-27 17:28:48.577971: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-07-27 17:28:49.139140: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-27 17:28:49.139682: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4dd2e20 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-07-27 17:28:49.139747: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce GTX 950M, Compute Capability 5.0
2020-07-27 17:28:49.173354: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-27 17:28:49.176437: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: GeForce GTX 950M computeCapability: 5.0
coreClock: 1.124GHz coreCount: 5 deviceMemorySize: 3.95GiB deviceMemoryBandwidth: 29.83GiB/s
2020-07-27 17:28:49.194079: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-07-27 17:28:49.194111: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-07-27 17:28:49.194127: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-07-27 17:28:49.194144: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-07-27 17:28:49.194160: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-07-27 17:28:49.194175: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-07-27 17:28:49.194191: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-07-27 17:28:49.194313: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-27 17:28:49.194822: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-27 17:28:49.195171: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2020-07-27 17:28:49.220996: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-07-27 17:28:49.236580: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-27 17:28:49.236624: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
2020-07-27 17:28:49.236634: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
2020-07-27 17:28:49.245993: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-27 17:28:49.246740: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-07-27 17:28:49.247191: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3708 MB memory) -> physical GPU (device: 0, name: GeForce GTX 950M, pci bus id: 0000:01:00.0, compute capability: 5.0)
The output is rather long; I would appreciate it if you could take a look.

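I ran into a few pitfalls; here is some advice for those who come after.
macOS Catalina 10.15.4.

Understanding Anaconda environments: "What is the right way for a Python beginner to learn Anaconda on their own?" (Zhihu)
Use conda to fix the wrapt problem that appears while installing TensorFlow: "ERROR: Cannot uninstall 'wrapt'. during upgrade · Issue #30191 · tensorflow/tensorflow · GitHub" → conda update wrapt
pip install tensorflow must be run inside the virtual environment, not the host environment.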
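
A quick way to see which environment `pip` and `python` actually belong to is to ask the interpreter. The sketch below is illustrative only; `environment_info` is a hypothetical helper, and it assumes conda sets `CONDA_DEFAULT_ENV` when an environment is activated.

```python
import os
import sys


def environment_info():
    """Describe which Python environment the current interpreter belongs to."""
    return {
        "executable": sys.executable,
        # True inside a venv (or a recent virtualenv); conda is reported separately.
        "in_venv": sys.prefix != getattr(sys, "base_prefix", sys.prefix),
        # Name of the active conda environment, or None if none is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


if __name__ == "__main__":
    for key, value in environment_info().items():
        print(f"{key}: {value}")
```

Running `python -m pip install tensorflow` with the same interpreter then installs into the environment that was just reported.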

After running code in PyCharm, the following output appears, but there is no error and the result is correct. What causes this?

2020-08-08 18:02:52.694491: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library ‘cudart64_101.dll’; dlerror: cudart64_101.dll not found
2020-08-08 18:02:52.694817: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.

2020-08-08 18:02:57.836214: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library ‘nvcuda.dll’; dlerror: nvcuda.dll not found
2020-08-08 18:02:57.836522: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2020-08-08 18:02:57.849447: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:169] retrieving CUDA diagnostic information for host:
2020-08-08 18:02:57.849992: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:176] hostname:
2020-08-08 18:02:57.850409: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-08-08 18:02:57.862757: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x11aac2bc010 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-08-08 18:02:57.863116: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version

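TensorFlow may print some informational messages while running; this is normal.

If you mean cudart64_101.dll not found, it indicates that CUDA is not installed or not installed correctly. See the handbook's "GPU version TensorFlow installation guide" at https://tf.wiki/zh_hans/basic/installation.html#gputensorflow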
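
To check whether the libraries named in the log can be found at all, one option is to look for them in the directories listed in `PATH`. This is a rough sketch, assuming the CUDA DLLs are located through `PATH` as is usual on Windows; `find_on_path` is a hypothetical helper, not part of TensorFlow.

```python
import os


def find_on_path(filename, path=None):
    """Return every directory listed in PATH that contains the given file."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for directory in path.split(os.pathsep):
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    # File names taken from the log messages in this thread.
    for name in ("nvcuda.dll", "cudart64_101.dll"):
        found = find_on_path(name)
        print(name, "->", found if found else "not found on PATH")
```

An empty result suggests that either the driver or the CUDA toolkit is missing, or that its directory has not been added to `PATH`.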

import tensorflow as tf

A = tf.constant([[1, 2], [3, 4]])
B = tf.constant([[5, 6], [7, 8]])
C = tf.matmul(A, B)

print(C)

Why is my output:
Tensor("MatMul_9:0", shape=(2, 2), dtype=int32)

Please check whether your TensorFlow version is 2.X.

import tensorflow as tf
print(tf.__version__)

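What causes the following error when running on a GPU? (PyCharm's messages show that the program detects the GPU and reports its model and compute capability.) tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [0] [Op:Assert] name: EagerVariableNameReuse
This happens whether I specify the GPU explicitly or use the default settings; if I switch to CPU-only, the code runs without problems. The error is raised at tf.keras.models.Sequential.
Any help would be greatly appreciated.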

Please paste your complete code and the full error message (as text, not screenshots), along with your TensorFlow version.

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import datasets, layers, optimizers, Sequential, metrics
import os

os.environ ['TF_CPP_MIN_LOG_LEVEL'] = "2"
os.environ ['CUDA_VISIBLE_DEVICES'] = '0'

def preprocess (x, y):
    x = tf.cast (x, dtype=tf.float32)/255.0
    y = tf.cast (y, dtype=tf.int32)
    return x, y


(x, y), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data ()
print (x.shape, y.shape, x_test.shape, y_test.shape)

batchsize = 128

db = tf.data.Dataset.from_tensor_slices ((x,y))
db = db.map (preprocess).shuffle (10000).batch (batchsize)

db_test = tf.data.Dataset.from_tensor_slices ((x_test,y_test))
db_test = db_test.map (preprocess).batch (batchsize)

db_iter = iter (db)
sample = next (db_iter)
print (sample [0].shape, sample [1].shape)

model = keras.Sequential ([
    layers.Dense (256, activation=tf.nn.relu),
    layers.Dense (128, activation=tf.nn.relu),
    layers.Dense (64, activation=tf.nn.relu),
    layers.Dense (32, activation=tf.nn.relu),
    layers.Dense (10)
])

model.build (input_shape=[None, 28*28])
model.summary ()

optimizer = optimizers.Adam (lr=1e-3)


def main ():
    for epoch in range (30):

        for step, (x, y) in enumerate (db):
            x = tf.reshape (x, [-1, 28*28])
            with tf.GradientTape () as tape:
                y_onehot = tf.one_hot (y,depth=10)
                logits = model (x)
                loss1 = tf.reduce_mean (tf.losses.MSE (y_onehot, logits))
                loss2 = tf.reduce_mean (tf.losses.categorical_crossentropy (y_onehot, logits, from_logits=True))

            grads = tape.gradient (loss1, model.trainable_variables)
            optimizer.apply_gradients (zip (grads,model.trainable_variables))

            if step % 100 == 0:
                print (epoch, step, "loss:", float (loss1), float (loss2))

        total_correct = 0
        total_num = 0
        for x,y in db_test:

            x = tf.reshape (x, [-1, 28 * 28])
            logits = model (x)

            prob = tf.nn.softmax (logits, axis=1)

            pred = tf.argmax (prob, axis=1)
            pred = tf.cast (pred, dtype=tf.int32)

            correct = tf.equal (pred, y)
            correct = tf.reduce_sum (tf.cast (correct, dtype=tf.int32))
            total_correct += int (correct)
            total_num +=x.shape [0]

        acc = total_correct / total_num
        print (epoch, "acc:", acc)

if __name__ == '__main__':
    main ()


Traceback (most recent call last):
  File "E:/Python_code/test/FashionMnist_layer", line 30, in <module>
    model = keras.Sequential ([
  File "D:\Program\Python\lib\site-packages\tensorflow\python\training\tracking\base.py", line 457, in _method_wrapper
    result = method (self, *args, **kwargs)
  File "D:\Program\Python\lib\site-packages\tensorflow\python\keras\engine\sequential.py", line 116, in __init__
    super (functional.Functional, self).__init__(  # pylint: disable=bad-super-call
  File "D:\Program\Python\lib\site-packages\tensorflow\python\training\tracking\base.py", line 457, in _method_wrapper
    result = method (self, *args, **kwargs)
  File "D:\Program\Python\lib\site-packages\tensorflow\python\keras\engine\training.py", line 308, in __init__
    self._init_batch_counters ()
  File "D:\Program\Python\lib\site-packages\tensorflow\python\training\tracking\base.py", line 457, in _method_wrapper
    result = method (self, *args, **kwargs)
  File "D:\Program\Python\lib\site-packages\tensorflow\python\keras\engine\training.py", line 317, in _init_batch_counters
    self._train_counter = variables.Variable (0, dtype='int64', aggregation=agg)
  File "D:\Program\Python\lib\site-packages\tensorflow\python\ops\variables.py", line 262, in __call__
    return cls._variable_v2_call (*args, **kwargs)
  File "D:\Program\Python\lib\site-packages\tensorflow\python\ops\variables.py", line 244, in _variable_v2_call
    return previous_getter (
  File "D:\Program\Python\lib\site-packages\tensorflow\python\ops\variables.py", line 237, in <lambda>
    previous_getter = lambda **kws: default_variable_creator_v2 (None, **kws)
  File "D:\Program\Python\lib\site-packages\tensorflow\python\ops\variable_scope.py", line 2633, in default_variable_creator_v2
    return resource_variable_ops.ResourceVariable (
  File "D:\Program\Python\lib\site-packages\tensorflow\python\ops\variables.py", line 264, in __call__
    return super (VariableMetaclass, cls).__call__(*args, **kwargs)
  File "D:\Program\Python\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 1507, in __init__
    self._init_from_args (
  File "D:\Program\Python\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 1661, in _init_from_args
    handle = eager_safe_variable_handle (
  File "D:\Program\Python\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 242, in eager_safe_variable_handle
    return _variable_handle_from_shape_and_dtype (
  File "D:\Program\Python\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 174, in _variable_handle_from_shape_and_dtype
    gen_logging_ops._assert (  # pylint: disable=protected-access
  File "D:\Program\Python\lib\site-packages\tensorflow\python\ops\gen_logging_ops.py", line 49, in _assert
    _ops.raise_from_not_ok_status (e, name)
  File "D:\Program\Python\lib\site-packages\tensorflow\python\framework\ops.py", line 6843, in raise_from_not_ok_status
    six.raise_from (core._status_to_exception (e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [0] [Op:Assert] name: EagerVariableNameReuse

I ran this on a Colab GPU without any problem. I suggest consulting "InvalidArgumentError: assertion failed: [0] [Op:Assert] name: EagerVariableNameReuse · Issue #38518 · tensorflow/tensorflow · GitHub" and checking whether your GPU is being used by another program.
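
One way to see whether other programs are holding the GPU is to ask `nvidia-smi` for its list of compute processes. The sketch below assumes `nvidia-smi` is installed with the NVIDIA driver and reachable on `PATH` (on some Windows setups it is not); `gpu_processes` is a hypothetical helper.

```python
import shutil
import subprocess


def gpu_processes(executable="nvidia-smi"):
    """Return nvidia-smi's CSV listing of compute processes, or None if unavailable."""
    if shutil.which(executable) is None:
        return None
    result = subprocess.run(
        [executable, "--query-compute-apps=pid,process_name,used_memory", "--format=csv"],
        capture_output=True,
        text=True,
    )
    return result.stdout if result.returncode == 0 else None


if __name__ == "__main__":
    listing = gpu_processes()
    print(listing if listing is not None else "nvidia-smi not available")
```

If the listing shows another Python process or a stale notebook kernel, closing it before rerunning the script is a reasonable first step.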

(tf2) C:\Users\Xavie>python
Python 3.7.9 (default, Aug 31 2020, 17:10:11) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2020-09-11 20:33:26.555854: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library ‘cudart64_101.dll’; dlerror: cudart64_101.dll not found
2020-09-11 20:33:26.560825: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.

This is what I get. How can I fix it?

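This message means that your computer either has no CUDA-capable GPU or the GPU is not configured correctly, so the CPU will be used for computation. If your computer really has no NVIDIA graphics card, or you do not plan to use the GPU, you can ignore this message.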
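
If the messages are merely distracting, TensorFlow's C++ logging can be turned down with the `TF_CPP_MIN_LOG_LEVEL` environment variable. A minimal sketch, assuming a CPU-only setup where these messages are harmless; the chosen level `"2"` is an example, not a recommendation for every case:

```python
import os

# Must be set before TensorFlow is imported to take effect.
# "0" shows everything, "1" hides INFO, "2" also hides WARNING, "3" also hides ERROR.
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"

try:
    import tensorflow as tf
except ImportError:
    tf = None  # TensorFlow is not installed in this environment

if tf is not None:
    print(tf.__version__)
```

Hiding warnings also hides genuine problems, so it is worth switching the level back to `"0"` when debugging a GPU setup.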

I always get this warning when using TensorFlow. What is the reason?

2020-09-18 14:39:07.885039: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
(2, 2)
2020-09-18 14:39:07.896766: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fde58a96d90 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-18 14:39:07.896780: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
The TensorFlow version is 2.3.0, CPU build.