
Fixing RuntimeError: Expected all tensors to be on the same device, b

Posted by AiCharm on 2023/05/14 17:08:32

I ran into this error today while converting a .pt file to an ONNX file.

Error

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat2 in method wrapper_mm)

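Cause

Some tensors in the code **are on the CPU while others are on the GPU**. Every tensor that takes part in the same operation has to be on the same device.

PyTorch has two ways to save a model:

1. Save the whole network, both its structure and its parameters. The object passed to save is the network itself, net.

2. Save only the trained parameters. The object passed to save is net.state_dict().

There are two matching ways to load a model. For the first approach, torch.load('.pth') directly creates the new network object. For the second, you first import and construct the corresponding network, then load the parameters with net.load_state_dict(torch.load('.pth')).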
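Whichever way a checkpoint is loaded, `torch.load` by default restores tensors to the device they were saved from, which is one way a CPU/GPU mix can creep in. The sketch below illustrates the second (state_dict) approach under some assumptions: `TinyNet` and the temporary checkpoint path are hypothetical names invented for the example, and `map_location` is used to place the loaded tensors on one known device.

```python
import os
import tempfile

import torch
from torch import nn


class TinyNet(nn.Module):  # hypothetical model, for illustration only
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(2, 3)

    def forward(self, x):
        return self.fc(x)


device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Save only the parameters (the state_dict approach).
ckpt_path = os.path.join(tempfile.mkdtemp(), "tiny_net.pth")
torch.save(TinyNet().state_dict(), ckpt_path)

# Rebuild the network first, then load the parameters onto one known device.
net = TinyNet()
state = torch.load(ckpt_path, map_location=device)
net.load_state_dict(state)
net = net.to(device)

loaded_devices = {p.device.type for p in net.parameters()}
print(loaded_devices)
```

Passing `map_location` also lets a checkpoint saved on a GPU machine be opened on a CPU-only machine.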

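Solution

Find the line that the traceback points to, then use print statements to check which device each tensor on that line is actually on:

print(tensor_a.is_cuda, tensor_b.is_cuda)

Then move them all to the same device, either the CPU or the GPU.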
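The check can be wrapped in a small helper. This is only a sketch: `device_report` and `on_same_device` are hypothetical names, not PyTorch functions. It reads each tensor's `.device` attribute, which says a little more than `.is_cuda` because it also distinguishes `cuda:0` from `cuda:1`.

```python
import torch


def device_report(**tensors):
    """Map each tensor's name to the device it currently lives on."""
    return {name: str(t.device) for name, t in tensors.items()}


def on_same_device(*tensors):
    """Return True when all given tensors share one device."""
    return len({t.device for t in tensors}) <= 1


a = torch.randn(4, 2)
b = torch.zeros(2, 3)
print(device_report(a=a, b=b))
print(on_same_device(a, b))
```

Calling `device_report` on the tensors named in the traceback, just before the failing line, shows which one needs to be moved.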

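Worked examples

Case 1

The traceback points to this line in utils.py:

index = idx_range.index_select(0, reverse_mapping)

Check with a print statement:

print(index.is_cuda, idx_range.is_cuda)

The output shows that idx_range is on the CPU while index is on the GPU, so moving idx_range to the GPU fixes it. Tensor.to() returns a new tensor instead of modifying the original, so the result has to be assigned back:

  idx_range = idx_range.to(device)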

If you then get the following error:

NameError: name 'device' is not defined

add this statement near the top of the script:

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
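Case 1 can be reproduced in miniature as follows. This is a sketch with made-up values: the real contents of `idx_range` and `reverse_mapping` in utils.py are not shown in the article, so the tensors here are stand-ins. It moves `idx_range` to whatever device `reverse_mapping` is on, which avoids relying on a separate `device` variable.

```python
import torch

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Stand-ins for the tensors in utils.py (values are made up for illustration).
idx_range = torch.arange(10)                               # CPU by default
reverse_mapping = torch.tensor([3, 1, 2], device=device)   # may be on the GPU

# .to() returns a new tensor rather than modifying in place, so reassign.
idx_range = idx_range.to(reverse_mapping.device)

index = idx_range.index_select(0, reverse_mapping)
print(index, index.device)
```

On a CPU-only machine both tensors are already on the CPU and the `.to()` call is a no-op, so the same code runs in either environment.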

Case 2

Failing code

if __name__ == '__main__':
    model = Perception(2, 3, 2).cuda()

    input = torch.randn(4, 2).cuda()
    output = model(input)
    # output = output.cuda()

    label = torch.Tensor([0, 1, 1, 0]).long()  # bug: label stays on the CPU while output is on cuda:0
    criterion = nn.CrossEntropyLoss()
    loss_nn = criterion(output, label)
    print(loss_nn)
    loss_functional = F.cross_entropy(output, label)
    print(loss_functional)

Fix

Move every tensor involved to the same device with Tensor.to(device):

if __name__ == '__main__':
    # Define one device, then move the model and every tensor to it with .to(device)
    device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
    model = Perception(2, 3, 2).to(device)

    input = torch.randn(4, 2).to(device)
    output = model(input)  # already on device, because model and input both are

    label = torch.Tensor([0, 1, 1, 0]).long().to(device)
    criterion = nn.CrossEntropyLoss()
    loss_nn = criterion(output, label)
    print(loss_nn)
    loss_functional = F.cross_entropy(output, label)
    print(loss_functional)

Full code:

import torch
from torch import nn
import torch.nn.functional as F


class Linear(nn.Module):  # subclass nn.Module
    def __init__(self, in_dim, out_dim):
        super(Linear, self).__init__()  # call the nn.Module constructor

        # Wrap the tensors in nn.Parameter so they are registered as learnable parameters
        self.w = nn.Parameter(torch.randn(in_dim, out_dim))
        self.b = nn.Parameter(torch.randn(out_dim))

    # forward implements the forward pass
    def forward(self, x):
        x = x.matmul(self.w)
        y = x + self.b.expand_as(x)  # expand_as broadcasts the bias to the shape of x

        return y

class Perception(nn.Module):
    def __init__(self, in_dim, hid_dim, out_dim):
        super(Perception, self).__init__()
        self.layer = nn.Sequential(
            nn.Linear(in_dim, hid_dim),
            nn.Sigmoid(),
            nn.Linear(hid_dim, out_dim),
            nn.Sigmoid()
        )
        # self.layer1 = Linear(in_dim, hid_dim)
        # self.layer2 = Linear(hid_dim, out_dim)

    def forward(self, x):
        # x = self.layer1(x)
        # y = torch.sigmoid(x)
        # y = self.layer2(y)
        # y = torch.sigmoid(y)
        y = self.layer(x)
        return y


if __name__ == '__main__':
    # Fall back to the CPU when no GPU is available
    device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
    model = Perception(2, 3, 2).to(device)

    input = torch.randn(4, 2).to(device)
    output = model(input)  # already on device, because model and input both are

    label = torch.Tensor([0, 1, 1, 0]).long().to(device)
    criterion = nn.CrossEntropyLoss()
    loss_nn = criterion(output, label)
    print(loss_nn)
    loss_functional = F.cross_entropy(output, label)
    print(loss_functional)

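Addendum

If you run into this error: Tensor for argument #2 'mat1' is on CPU, but expected it to be on GPU (while checking arguments for addmm)

it means the model is computing on the GPU while one of its inputs is still on the CPU. Add .to(device) to both the input variables and the model so that everything is moved to the GPU.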

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

g = g.to(device)

model = model.to(device)
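When the input is a whole batch rather than a single tensor such as `g`, the same idea can be applied recursively. The following is a sketch under stated assumptions: `model_device` and `move_to` are hypothetical helpers written for this example, and `model_device` assumes all of the model's parameters sit on one device.

```python
import torch
from torch import nn


def model_device(model):
    """Device of the model's first parameter (assumes all parameters share it)."""
    return next(model.parameters()).device


def move_to(obj, device):
    """Recursively move tensors inside lists, tuples, and dicts to `device`."""
    if torch.is_tensor(obj):
        return obj.to(device)
    if isinstance(obj, dict):
        return {k: move_to(v, device) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return type(obj)(move_to(v, device) for v in obj)
    return obj  # leave non-tensor values untouched


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(2, 3).to(device)

batch = {"x": torch.randn(4, 2), "meta": [torch.zeros(1), "unchanged"]}
batch = move_to(batch, model_device(model))
out = model(batch["x"])
print(out.device)
```

Reading the device from the model keeps the inputs in step with it even if the model is later moved elsewhere.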

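[Notice] This content comes from a blogger in the Huawei Cloud developer community and does not represent the views or positions of Huawei Cloud or the Huawei Cloud developer community. Reposts must state the source of the article (Huawei Cloud community), the article link, the author, and other basic information; otherwise the author and the community reserve the right to pursue liability. If you find content in this community that appears to be plagiarized, you are welcome to report it by email with supporting evidence. Once verified, the community will remove the infringing content immediately. Report email: cloudbbs@huaweicloud.com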
