TensorFlow Distributed Training

I haven't tried an environment that is both multi-machine and multi-GPU, but judging from the official documentation, MultiWorkerMirroredStrategy alone seems to be enough:

tf.distribute.experimental.MultiWorkerMirroredStrategy is very similar to MirroredStrategy. It implements synchronous distributed training across multiple workers, each with potentially multiple GPUs.
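
A minimal sketch of what multi-worker training looks like, based on the official tutorial's pattern: each machine describes the cluster through the TF_CONFIG environment variable, and the strategy takes care of replicating variables and synchronizing gradients. The worker addresses, task index, and toy model below are placeholders for illustration, not values from the original.

```python
import json
import os

import numpy as np
import tensorflow as tf

# Each worker must set TF_CONFIG before creating the strategy.
# Hypothetical two-worker cluster; the host:port addresses are placeholders.
os.environ["TF_CONFIG"] = json.dumps({
    "cluster": {"worker": ["host1:12345", "host2:12345"]},
    "task": {"type": "worker", "index": 0},  # 0 or 1, depending on the machine
})

# Synchronous training across all workers; each worker may itself
# have multiple GPUs, which the strategy also mirrors across.
strategy = tf.distribute.experimental.MultiWorkerMirroredStrategy()

# Variables must be created inside the strategy scope so they are
# replicated and kept in sync via collective all-reduce ops.
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(10,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")

# Toy random data just to make the sketch runnable; in real use the
# global batch is split across workers automatically by model.fit.
x = np.random.random((256, 10)).astype("float32")
y = np.random.random((256, 1)).astype("float32")
model.fit(x, y, epochs=2, batch_size=32)
```

The same script runs on every machine; only the "index" field in TF_CONFIG differs per worker.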