Can I use PyTorch or TensorFlow projects on a machine without a GPU?

I'm a newbie when it comes to Python and machine learning. I'm trying to run two projects related to "Deep Image Matting":

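I'm just trying to run the tests in these projects, but I keep running into problems. Can I run them on a machine without a GPU? I thought a GPU only speeds up processing, and I just want to see these projects run before I get a computer with a GPU. Apologies in advance, since I know I'm very new to this.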

When I try the TensorFlow project:

  1. I get an error on the line gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction = args.gpu_fraction), probably because I was on TF2 and the project requires TF1.
  2. After I downgraded to TF1 and tried to run the test, I got the warning W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations. followed by the error InvalidArgumentError (see above for traceback): No OpKernel was registered to support Op 'MaxPoolWithArgmax' with these attrs. Registered devices: [CPU], Registered kernels: <no registered kernels>. Now I'm stuck because I have no clue what this means.

When I try the PyTorch project:

  1. First I get this error: RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
  2. So I added map_location=torch.device('cpu') where the model is loaded, but now I get RuntimeError: Error(s) in loading state_dict for VGG16: size mismatch for conv6_1.weight: copying a param with shape torch.Size([512, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([512, 512, 3, 3]). And I'm stuck again.

Can anyone help?

Thanks in advance!

Comments
  • 曾经的你 replied:

    For the PyTorch one, there were two problems, and it looks like you've solved the first one on your own with map_location. The second problem is that the weights in your checkpoint and the weights in your model don't have the same shape! A quick detour to the GitHub repo; let's visit net.py in core. Take a look at lines 26 to 28:

    # model released before 2019.09.09 should use kernel_size=1 & padding=0
    # self.conv6_1 = nn.Conv2d(512, 512, kernel_size=1, padding=0,bias=True)
    self.conv6_1 = nn.Conv2d(512, 512, kernel_size=3, padding=1,bias=True)
    

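    My guess is that the checkpoint holds weights for a conv6_1 with a kernel size of 1 rather than 3, matching the commented-out line. So try uncommenting the line with kernel_size=1 and commenting out the line with kernel_size=3.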
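
If you'd rather not edit net.py by hand for each checkpoint, one option is to read the kernel size off the checkpoint itself. The snippet below is only a minimal sketch of that idea, not the repo's code: build_conv6_1 is a hypothetical helper, the checkpoint is a stand-in created in memory, and the flat key layout (conv6_1.weight, conv6_1.bias) is an assumption taken from the error message. A real checkpoint file may nest its weights differently.

```python
import io

import torch
import torch.nn as nn


def build_conv6_1(weight_shape):
    """Hypothetical helper: build conv6_1 to match a checkpoint weight shape."""
    out_ch, in_ch, k_h, _k_w = weight_shape
    # padding = k // 2 preserves spatial size for odd kernels
    # (kernel 1 -> padding 0, kernel 3 -> padding 1, as in net.py).
    return nn.Conv2d(in_ch, out_ch, kernel_size=k_h, padding=k_h // 2, bias=True)


# Stand-in for an older checkpoint: conv6_1 saved with kernel_size=1.
old_layer = nn.Conv2d(512, 512, kernel_size=1, padding=0, bias=True)
checkpoint = {f"conv6_1.{k}": v for k, v in old_layer.state_dict().items()}
buffer = io.BytesIO()
torch.save(checkpoint, buffer)
buffer.seek(0)

# map_location maps all storages to the CPU, so no GPU is needed to load.
state = torch.load(buffer, map_location=torch.device("cpu"))

# Pick the kernel size from the stored weight instead of hard-coding it.
conv6_1 = build_conv6_1(state["conv6_1.weight"].shape)
conv6_1.load_state_dict(
    {"weight": state["conv6_1.weight"], "bias": state["conv6_1.bias"]}
)
print(conv6_1.kernel_size)
```

With the stand-in checkpoint the layer comes out with a 1x1 kernel, so load_state_dict no longer reports a size mismatch. A checkpoint with 3x3 weights would produce the kernel_size=3, padding=1 layer instead.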