A Detailed Guide to addmm() and addmm_() in PyTorch

Posted by 悲恋花丶无心之人 on 2021/02/03 02:20:53

I. Function Explanation

This function is defined in torch/_C/_VariableFunctions.py, and it implements the following formula:

out = \beta \times mat + \alpha \times \left ( mat1 @ mat2 \right )

In other words, it takes five parameters: each element of mat is multiplied by beta; mat1 and mat2 are matrix-multiplied (rows of the left matrix times columns of the right) and the product is scaled by alpha; finally the two results are added together. This may sound abstract, so the code below should make everything clear.
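Before looking at the definition, here is a minimal sanity check of the formula, with numbers small enough to verify by hand (a sketch; the values are chosen arbitrarily):

import torch

mat = torch.ones(2, 2)                     # the matrix that gets scaled by beta
mat1 = torch.tensor([[1., 2.], [3., 4.]])
mat2 = torch.eye(2)                        # identity, so mat1 @ mat2 == mat1

out = torch.addmm(mat, mat1, mat2, beta=2, alpha=3)
# Expected: 2 * 1 + 3 * mat1 = [[5., 8.], [11., 14.]]
print(out)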


  
def addmm(self, beta=1, mat, alpha=1, mat1, mat2, out=None):  # real signature unknown; restored from __doc__
    """
    addmm(beta=1, mat, alpha=1, mat1, mat2, out=None) -> Tensor

    Performs a matrix multiplication of the matrices :attr:`mat1` and :attr:`mat2`.
    The matrix :attr:`mat` is added to the final result.

    If :attr:`mat1` is a :math:`(n \times m)` tensor, :attr:`mat2` is a
    :math:`(m \times p)` tensor, then :attr:`mat` must be
    :ref:`broadcastable <broadcasting-semantics>` with a :math:`(n \times p)` tensor
    and :attr:`out` will be a :math:`(n \times p)` tensor.

    :attr:`alpha` and :attr:`beta` are scaling factors on the matrix-matrix product between
    :attr:`mat1` and :attr:`mat2` and the added matrix :attr:`mat` respectively.

    .. math::
        out = \beta\ mat + \alpha\ (mat1 \mathbin{@} mat2)

    For inputs of type `FloatTensor` or `DoubleTensor`, arguments :attr:`beta` and
    :attr:`alpha` must be real numbers, otherwise they should be integers.

    Args:
        beta (Number, optional): multiplier for :attr:`mat` (:math:`\beta`)
        mat (Tensor): matrix to be added
        alpha (Number, optional): multiplier for :math:`mat1 @ mat2` (:math:`\alpha`)
        mat1 (Tensor): the first matrix to be multiplied
        mat2 (Tensor): the second matrix to be multiplied
        out (Tensor, optional): the output tensor

    Example::

        >>> M = torch.randn(2, 3)
        >>> mat1 = torch.randn(2, 3)
        >>> mat2 = torch.randn(3, 3)
        >>> torch.addmm(M, mat1, mat2)
        tensor([[-4.8716,  1.4671, -1.3746],
                [ 0.7573, -3.9555, -2.8681]])
    """
    pass
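As the docstring points out, mat only needs to be broadcastable with the (n × p) product, so a plain bias vector works as the added matrix. This is essentially the computation of a fully connected layer, y = X @ W + b (a sketch; the shapes here are arbitrary):

import torch

X = torch.randn(4, 3)     # a batch of 4 samples with 3 features
W = torch.randn(3, 5)     # weight matrix
b = torch.randn(5)        # bias, broadcast across the 4 rows of X @ W

y = torch.addmm(b, X, W)  # same as X @ W + b
print(y.shape)            # torch.Size([4, 5])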

II. Code Example

1. First, the complete code. Feel free to copy, paste, and run it; each part is explained step by step below.


  
  1. """
  2. @author:nickhuang1996
  3. """
  4. import torch
  5. rectangle_height = 3
  6. rectangle_width = 3
  7. inputs = torch.randn(rectangle_height, rectangle_width)
  8. for i in range(rectangle_height):
  9. for j in range(rectangle_width):
  10. inputs[i] = i * torch.ones(rectangle_width)
  11. '''
  12. inputs and its transpose
  13. -->inputs = tensor([[0., 0., 0.],
  14. [1., 1., 1.],
  15. [2., 2., 2.]])
  16. -->inputs_t = tensor([[0., 1., 2.],
  17. [0., 1., 2.],
  18. [0., 1., 2.]])
  19. '''
  20. print("inputs:\n", inputs)
  21. inputs_t = inputs.t()
  22. print("inputs_t:\n", inputs_t)
  23. '''
  24. inputs_t @ inputs_t [[0., 1., 2.], [[0., 1., 2.], [[0., 3., 6.]
  25. = [0., 1., 2.], @ [0., 1., 2.], = [0., 3., 6.]
  26. [0., 1., 2.]] [0., 1., 2.]] [0., 3., 6.]]
  27. '''
  28. '''a, b, c and d = 1 * inputs + 1 * (inputs_t @ inputs_t)'''
  29. a = torch.addmm(input=inputs, mat1=inputs_t, mat2=inputs_t)
  30. b = inputs.addmm(mat1=inputs_t, mat2=inputs_t)
  31. c = torch.addmm(input=inputs, beta=1, mat1=inputs_t, mat2=inputs_t, alpha=1)
  32. d = inputs.addmm(beta=1, mat1=inputs_t, mat2=inputs_t, alpha=1)
  33. '''e and f = 1 * inputs + 1 * (inputs_t @ inputs_t)'''
  34. e = torch.addmm(inputs, inputs_t, inputs_t)
  35. f = inputs.addmm(inputs_t, inputs_t)
  36. '''1 * inputs + 1 * (inputs_t @ inputs_t)'''
  37. g = inputs.addmm(1, inputs_t, inputs_t)
  38. '''2 * inputs + 1 * (inputs_t @ inputs_t)'''
  39. g2 = inputs.addmm(2, inputs_t, inputs_t)
  40. '''h = 1 * inputs + 1 * (inputs_t @ inputs_t)'''
  41. h = inputs.addmm(1, 1, inputs_t, inputs_t)
  42. '''h12 = 1 * inputs + 2 * (inputs_t @ inputs_t)'''
  43. h12 = inputs.addmm(1, 2, inputs_t, inputs_t)
  44. '''h21 = 2 * inputs + 1 * (inputs_t @ inputs_t)'''
  45. h21 = inputs.addmm(2, 1, inputs_t, inputs_t)
  46. print("a:\n", a)
  47. print("b:\n", b)
  48. print("c:\n", c)
  49. print("d:\n", d)
  50. print("e:\n", e)
  51. print("f:\n", f)
  52. print("g:\n", g)
  53. print("g2:\n", g2)
  54. print("h:\n", h)
  55. print("h12:\n", h12)
  56. print("h21:\n", h21)
  57. print("inputs:\n", inputs)
  58. '''inputs = 1 * inputs - 2 * (inputs @ inputs_t)'''
  59. '''
  60. inputs @ inputs_t [[0., 0., 0.], [[0., 1., 2.], [[0., 0., 0.]
  61. = [1., 1., 1.], @ [0., 1., 2.], = [0., 3., 6.]
  62. [2., 2., 2.]] [0., 1., 2.]] [0., 6., 12.]]
  63. '''
  64. inputs.addmm_(1, -2, inputs, inputs_t) # In-place
  65. print("inputs:\n", inputs)

2. Here, inputs is a 3×3 matrix:


  
tensor([[0., 0., 0.],
        [1., 1., 1.],
        [2., 2., 2.]])

inputs_t is also a 3×3 matrix, the transpose of inputs:


  
tensor([[0., 1., 2.],
        [0., 1., 2.],
        [0., 1., 2.]])

inputs_t @ inputs_t is:


  
'''
inputs_t @ inputs_t   [[0., 1., 2.],     [[0., 1., 2.],     [[0., 3., 6.],
                    =  [0., 1., 2.],  @   [0., 1., 2.],  =   [0., 3., 6.],
                       [0., 1., 2.]]      [0., 1., 2.]]      [0., 3., 6.]]
'''

3. In the code, a, b, c, and d show the fully explicit form, i.e., all keyword arguments are spelled out. Note that the input argument can equivalently be the tensor the method is called on (a quick check follows the outputs below):

torch.addmm(input, mat1, mat2)  ==  inputs.addmm(mat1, mat2)

The formula computed is:

1 × inputs + 1 × (inputs_t @ inputs_t)


  
'''a, b, c and d = 1 * inputs + 1 * (inputs_t @ inputs_t)'''
a = torch.addmm(input=inputs, mat1=inputs_t, mat2=inputs_t)
b = inputs.addmm(mat1=inputs_t, mat2=inputs_t)
c = torch.addmm(input=inputs, beta=1, mat1=inputs_t, mat2=inputs_t, alpha=1)
d = inputs.addmm(beta=1, mat1=inputs_t, mat2=inputs_t, alpha=1)

  
a:
 tensor([[0., 3., 6.],
         [1., 4., 7.],
         [2., 5., 8.]])
b:
 tensor([[0., 3., 6.],
         [1., 4., 7.],
         [2., 5., 8.]])
c:
 tensor([[0., 3., 6.],
         [1., 4., 7.],
         [2., 5., 8.]])
d:
 tensor([[0., 3., 6.],
         [1., 4., 7.],
         [2., 5., 8.]])
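A quick check with the tensors defined above confirms that the function form and the method form are interchangeable (a sketch):

assert torch.equal(torch.addmm(inputs, inputs_t, inputs_t),
                   inputs.addmm(inputs_t, inputs_t))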

4. The next example illustrates the flexibility of the input argument's position even better; here beta and alpha are simply left at their defaults:

The formula computed is:

1 × inputs + 1 × (inputs_t @ inputs_t)


  
'''e and f = 1 * inputs + 1 * (inputs_t @ inputs_t)'''
e = torch.addmm(inputs, inputs_t, inputs_t)
f = inputs.addmm(inputs_t, inputs_t)

  
e:
 tensor([[0., 3., 6.],
         [1., 4., 7.],
         [2., 5., 8.]])
f:
 tensor([[0., 3., 6.],
         [1., 4., 7.],
         [2., 5., 8.]])

5. Adding one more positional argument in front of mat1 supplies beta (a keyword-form sketch follows the outputs below):

The formulas computed are:

g  = 1 × inputs + 1 × (inputs_t @ inputs_t)

g2 = 2 × inputs + 1 × (inputs_t @ inputs_t)


  
'''g = 1 * inputs + 1 * (inputs_t @ inputs_t)'''
g = inputs.addmm(1, inputs_t, inputs_t)
'''g2 = 2 * inputs + 1 * (inputs_t @ inputs_t)'''
g2 = inputs.addmm(2, inputs_t, inputs_t)

  
g:
 tensor([[0., 3., 6.],
         [1., 4., 7.],
         [2., 5., 8.]])
g2:
 tensor([[ 0.,  3.,  6.],
         [ 2.,  5.,  8.],
         [ 4.,  7., 10.]])
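A caveat: passing beta positionally like this relies on a legacy overload that is deprecated in newer PyTorch releases. On current versions, the keyword form computes the same thing (a sketch reusing the tensors above):

g2_kw = inputs.addmm(inputs_t, inputs_t, beta=2)  # same result as g2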

6. Adding yet another positional argument supplies alpha as well (keyword equivalents follow the outputs below):

The formulas computed are:

h   = 1 × inputs + 1 × (inputs_t @ inputs_t)

h12 = 1 × inputs + 2 × (inputs_t @ inputs_t)

h21 = 2 × inputs + 1 × (inputs_t @ inputs_t)


  
'''h = 1 * inputs + 1 * (inputs_t @ inputs_t)'''
h = inputs.addmm(1, 1, inputs_t, inputs_t)
'''h12 = 1 * inputs + 2 * (inputs_t @ inputs_t)'''
h12 = inputs.addmm(1, 2, inputs_t, inputs_t)
'''h21 = 2 * inputs + 1 * (inputs_t @ inputs_t)'''
h21 = inputs.addmm(2, 1, inputs_t, inputs_t)

  
h:
 tensor([[0., 3., 6.],
         [1., 4., 7.],
         [2., 5., 8.]])
h12:
 tensor([[ 0.,  6., 12.],
         [ 1.,  7., 13.],
         [ 2.,  8., 14.]])
h21:
 tensor([[ 0.,  3.,  6.],
         [ 2.,  5.,  8.],
         [ 4.,  7., 10.]])
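The same caveat applies to the positional alpha; on newer PyTorch the keyword equivalents would be (a sketch):

h_kw   = inputs.addmm(inputs_t, inputs_t, beta=1, alpha=1)
h12_kw = inputs.addmm(inputs_t, inputs_t, beta=1, alpha=2)
h21_kw = inputs.addmm(inputs_t, inputs_t, beta=2, alpha=1)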

7. Of course, none of the steps above modified inputs itself; it is still the original matrix, as the quick identity check below also confirms:


  
inputs:
 tensor([[0., 0., 0.],
         [1., 1., 1.],
         [2., 2., 2.]])
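One way to see that the out-of-place addmm() allocates a fresh tensor rather than touching inputs (a quick sketch):

a = torch.addmm(inputs, inputs_t, inputs_t)
print(a is inputs)  # False: a is a newly allocated tensor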

8. addmm_() computes exactly the same thing as addmm(); the difference is that addmm_() is an in-place operation, i.e., it modifies the calling tensor directly, writing the result back into the original variable. In the example below, inputs itself takes on the new value; there is no need to write some_variable = inputs.addmm_(...), because inputs is already the modified tensor.

inputs @ inputs_t is:


  
'''
inputs @ inputs_t   [[0., 0., 0.],     [[0., 1., 2.],     [[0., 0.,  0.],
                  =  [1., 1., 1.],  @   [0., 1., 2.],  =   [0., 3.,  6.],
                     [2., 2., 2.]]      [0., 1., 2.]]      [0., 6., 12.]]
'''

The formula computed is:

inputs = 1 × inputs - 2 × (inputs @ inputs_t)


  
'''inputs = 1 * inputs - 2 * (inputs @ inputs_t)'''
inputs.addmm_(1, -2, inputs, inputs_t)  # in-place: inputs itself is modified

  
inputs:
 tensor([[  0.,   0.,   0.],
         [  1.,  -5., -11.],
         [  2., -10., -22.]])
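A minimal sketch of the in-place behavior on its own made-up tensors: addmm_() returns the very tensor it modified, so no new memory is allocated (keyword form for newer PyTorch):

m = torch.zeros(3, 3)
ret = m.addmm_(torch.ones(3, 3), torch.ones(3, 3), beta=1, alpha=-2)
print(ret is m)  # True: the return value IS m itself
print(m)         # every entry is 0 + (-2) * 3 = -6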

III. Full Output of the Code


  
inputs:
 tensor([[0., 0., 0.],
         [1., 1., 1.],
         [2., 2., 2.]])
inputs_t:
 tensor([[0., 1., 2.],
         [0., 1., 2.],
         [0., 1., 2.]])
a:
 tensor([[0., 3., 6.],
         [1., 4., 7.],
         [2., 5., 8.]])
b:
 tensor([[0., 3., 6.],
         [1., 4., 7.],
         [2., 5., 8.]])
c:
 tensor([[0., 3., 6.],
         [1., 4., 7.],
         [2., 5., 8.]])
d:
 tensor([[0., 3., 6.],
         [1., 4., 7.],
         [2., 5., 8.]])
e:
 tensor([[0., 3., 6.],
         [1., 4., 7.],
         [2., 5., 8.]])
f:
 tensor([[0., 3., 6.],
         [1., 4., 7.],
         [2., 5., 8.]])
g:
 tensor([[0., 3., 6.],
         [1., 4., 7.],
         [2., 5., 8.]])
g2:
 tensor([[ 0.,  3.,  6.],
         [ 2.,  5.,  8.],
         [ 4.,  7., 10.]])
h:
 tensor([[0., 3., 6.],
         [1., 4., 7.],
         [2., 5., 8.]])
h12:
 tensor([[ 0.,  6., 12.],
         [ 1.,  7., 13.],
         [ 2.,  8., 14.]])
h21:
 tensor([[ 0.,  3.,  6.],
         [ 2.,  5.,  8.],
         [ 4.,  7., 10.]])
inputs:
 tensor([[0., 0., 0.],
         [1., 1., 1.],
         [2., 2., 2.]])
inputs:
 tensor([[  0.,   0.,   0.],
         [  1.,  -5., -11.],
         [  2., -10., -22.]])

Source: nickhuang1996.blog.csdn.net, by 悲恋花丶无心之人. Copyright belongs to the original author; please contact the author for reprint permission.

Original link: nickhuang1996.blog.csdn.net/article/details/90638449
