Using Softmax and LogSoftmax in PyTorch

Posted by 悲恋花丶无心之人 on 2021/02/03 00:19:34

Contents

I. Function Explanation

II. Code Example

III. Complete Code


I. Function Explanation

1. The most common way to use the Softmax function is simply to specify the dim argument (a short sketch after the class source below illustrates the two cases):

(1) dim=0: performs softmax over all the elements of each column, so that the elements of every column sum to 1.

(2) dim=1: performs softmax over all the elements of each row, so that the elements of every row sum to 1.


  
class Softmax(Module):
    r"""Applies the Softmax function to an n-dimensional input Tensor
    rescaling them so that the elements of the n-dimensional output Tensor
    lie in the range [0,1] and sum to 1.
    Softmax is defined as:
    .. math::
        \text{Softmax}(x_{i}) = \frac{\exp(x_i)}{\sum_j \exp(x_j)}
    Shape:
        - Input: :math:`(*)` where `*` means, any number of additional
          dimensions
        - Output: :math:`(*)`, same shape as the input
    Returns:
        a Tensor of the same dimension and shape as the input with
        values in the range [0, 1]
    Arguments:
        dim (int): A dimension along which Softmax will be computed (so every slice
            along dim will sum to 1).
    .. note::
        This module doesn't work directly with NLLLoss,
        which expects the Log to be computed between the Softmax and itself.
        Use `LogSoftmax` instead (it's faster and has better numerical properties).
    Examples::
        >>> m = nn.Softmax(dim=1)
        >>> input = torch.randn(2, 3)
        >>> output = m(input)
    """
    __constants__ = ['dim']

    def __init__(self, dim=None):
        super(Softmax, self).__init__()
        self.dim = dim

    def __setstate__(self, state):
        self.__dict__.update(state)
        if not hasattr(self, 'dim'):
            self.dim = None

    def forward(self, input):
        return F.softmax(input, self.dim, _stacklevel=5)

    def extra_repr(self):
        return 'dim={dim}'.format(dim=self.dim)
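To make the dim behavior described above concrete, here is a minimal sketch with a made-up 2x3 tensor (the values are arbitrary and chosen only for illustration):

import torch
import torch.nn as nn

x = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])

softmax_col = nn.Softmax(dim=0)     # normalizes down each column
softmax_row = nn.Softmax(dim=1)     # normalizes across each row

print(softmax_col(x).sum(dim=0))    # each of the 3 column sums is (approximately) 1
print(softmax_row(x).sum(dim=1))    # each of the 2 row sums is (approximately) 1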

2. LogSoftmax simply applies a logarithm to the Softmax output, i.e. Log(Softmax(x)).


  
class LogSoftmax(Module):
    r"""Applies the :math:`\log(\text{Softmax}(x))` function to an n-dimensional
    input Tensor. The LogSoftmax formulation can be simplified as:
    .. math::
        \text{LogSoftmax}(x_{i}) = \log\left(\frac{\exp(x_i) }{ \sum_j \exp(x_j)} \right)
    Shape:
        - Input: :math:`(*)` where `*` means, any number of additional
          dimensions
        - Output: :math:`(*)`, same shape as the input
    Arguments:
        dim (int): A dimension along which LogSoftmax will be computed.
    Returns:
        a Tensor of the same dimension and shape as the input with
        values in the range [-inf, 0)
    Examples::
        >>> m = nn.LogSoftmax()
        >>> input = torch.randn(2, 3)
        >>> output = m(input)
    """
    __constants__ = ['dim']

    def __init__(self, dim=None):
        super(LogSoftmax, self).__init__()
        self.dim = dim

    def __setstate__(self, state):
        self.__dict__.update(state)
        if not hasattr(self, 'dim'):
            self.dim = None

    def forward(self, input):
        return F.log_softmax(input, self.dim, _stacklevel=5)
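As the note in the Softmax docstring above points out, LogSoftmax is the module meant to be paired with NLLLoss. A minimal sketch (with made-up logits and labels) showing that LogSoftmax followed by NLLLoss reproduces CrossEntropyLoss, which fuses the two steps:

import torch
import torch.nn as nn

logits = torch.randn(4, 6)                         # hypothetical last-layer outputs
targets = torch.tensor([0, 2, 5, 1])               # hypothetical class labels

log_probs = nn.LogSoftmax(dim=1)(logits)
loss_a = nn.NLLLoss()(log_probs, targets)
loss_b = nn.CrossEntropyLoss()(logits, targets)    # LogSoftmax + NLLLoss in one module

print(torch.allclose(loss_a, loss_b))              # expected: True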

II. Code Example

First, the input code:


  
import torch
import torch.nn as nn
import numpy as np

batch_size = 4
class_num = 6
inputs = torch.randn(batch_size, class_num)
for i in range(batch_size):
    for j in range(class_num):
        inputs[i][j] = (i + 1) * (j + 1)
print("inputs:", inputs)

This produces a tensor with batch_size = 4 rows and class_num = 6 columns (you can think of it as the output of the network's last layer):

tensor([[ 1.,  2.,  3.,  4.,  5.,  6.],
        [ 2.,  4.,  6.,  8., 10., 12.],
        [ 3.,  6.,  9., 12., 15., 18.],
        [ 4.,  8., 12., 16., 20., 24.]])
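As an aside, the same matrix can be built without the nested loops by broadcasting two torch.arange vectors; a small equivalent sketch:

inputs = torch.arange(1, batch_size + 1).float().unsqueeze(1) \
         * torch.arange(1, class_num + 1).float()    # inputs[i][j] == (i + 1) * (j + 1)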

Next, we apply Softmax to each row of this tensor:


  
Softmax = nn.Softmax(dim=1)
probs = Softmax(inputs)
print("probs:\n", probs)

which gives:

tensor([[4.2698e-03, 1.1606e-02, 3.1550e-02, 8.5761e-02, 2.3312e-01, 6.3369e-01],
        [3.9256e-05, 2.9006e-04, 2.1433e-03, 1.5837e-02, 1.1702e-01, 8.6467e-01],
        [2.9067e-07, 5.8383e-06, 1.1727e-04, 2.3553e-03, 4.7308e-02, 9.5021e-01],
        [2.0234e-09, 1.1047e-07, 6.0317e-06, 3.2932e-04, 1.7980e-02, 9.8168e-01]])
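The same probabilities can also be computed through the functional API; a small sketch:

import torch.nn.functional as F

probs_f = F.softmax(inputs, dim=1)    # same values as probs above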

We also apply LogSoftmax to each row of the tensor:


  
LogSoftmax = nn.LogSoftmax(dim=1)
log_probs = LogSoftmax(inputs)
print("log_probs:\n", log_probs)

which gives:

tensor([[-5.4562e+00, -4.4562e+00, -3.4562e+00, -2.4562e+00, -1.4562e+00, -4.5619e-01],
        [-1.0145e+01, -8.1454e+00, -6.1454e+00, -4.1454e+00, -2.1454e+00, -1.4541e-01],
        [-1.5051e+01, -1.2051e+01, -9.0511e+00, -6.0511e+00, -3.0511e+00, -5.1069e-02],
        [-2.0018e+01, -1.6018e+01, -1.2018e+01, -8.0185e+00, -4.0185e+00, -1.8485e-02]])
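Likewise, the functional form F.log_softmax computes this in a single step and is numerically more stable than applying torch.log to a separately computed softmax; a small sketch:

import torch.nn.functional as F

log_probs_f = F.log_softmax(inputs, dim=1)    # same values as log_probs above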

Let's verify whether the elements of each row sum to 1:


  
# probs_sum in dim=1
probs_sum = [0 for i in range(batch_size)]
for i in range(batch_size):
    for j in range(class_num):
        probs_sum[i] += probs[i][j]
    print(i, "row probs sum:", probs_sum[i])

The per-row sums show that each row indeed adds up to 1:

0 row probs sum: tensor(1.)
1 row probs sum: tensor(1.0000)
2 row probs sum: tensor(1.)
3 row probs sum: tensor(1.)
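The same check can be written in one line with a tensor reduction; a minimal sketch:

print(probs.sum(dim=1))    # every entry is (approximately) 1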

Let's also verify that LogSoftmax is the log of the Softmax result:


  
# to numpy
np_probs = probs.data.numpy()
print("numpy probs:\n", np_probs)

# np.log()
log_np_probs = np.log(np_probs)
print("log numpy probs:\n", log_np_probs)

which gives:

numpy probs:
 [[4.26977826e-03 1.16064614e-02 3.15496325e-02 8.57607946e-02 2.33122006e-01 6.33691311e-01]
  [3.92559559e-05 2.90064461e-04 2.14330270e-03 1.58369839e-02 1.17020354e-01 8.64669979e-01]
  [2.90672347e-07 5.83831024e-06 1.17265590e-04 2.35534250e-03 4.73083146e-02 9.50212955e-01]
  [2.02340233e-09 1.10474026e-07 6.03167746e-06 3.29318427e-04 1.79801770e-02 9.81684387e-01]]
log numpy probs:
 [[-5.4561934e+00 -4.4561934e+00 -3.4561934e+00 -2.4561932e+00 -1.4561933e+00 -4.5619333e-01]
  [-1.0145408e+01 -8.1454077e+00 -6.1454072e+00 -4.1454072e+00 -2.1454074e+00 -1.4540738e-01]
  [-1.5051069e+01 -1.2051069e+01 -9.0510693e+00 -6.0510693e+00 -3.0510693e+00 -5.1069155e-02]
  [-2.0018486e+01 -1.6018486e+01 -1.2018485e+01 -8.0184851e+00 -4.0184855e+00 -1.8485421e-02]]
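The comparison can also stay entirely in PyTorch; a minimal sketch:

print(torch.allclose(log_probs, torch.log(probs)))    # expected: True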

The values match log_probs above, so the verification is complete.


III. Complete Code


  
import torch
import torch.nn as nn
import numpy as np

batch_size = 4
class_num = 6
inputs = torch.randn(batch_size, class_num)
for i in range(batch_size):
    for j in range(class_num):
        inputs[i][j] = (i + 1) * (j + 1)
print("inputs:", inputs)

Softmax = nn.Softmax(dim=1)
probs = Softmax(inputs)
print("probs:\n", probs)

LogSoftmax = nn.LogSoftmax(dim=1)
log_probs = LogSoftmax(inputs)
print("log_probs:\n", log_probs)

# probs_sum in dim=1
probs_sum = [0 for i in range(batch_size)]
for i in range(batch_size):
    for j in range(class_num):
        probs_sum[i] += probs[i][j]
    print(i, "row probs sum:", probs_sum[i])

# to numpy
np_probs = probs.data.numpy()
print("numpy probs:\n", np_probs)

# np.log()
log_np_probs = np.log(np_probs)
print("log numpy probs:\n", log_np_probs)

Source: nickhuang1996.blog.csdn.net, by 悲恋花丶无心之人. Copyright belongs to the original author; please contact the author if you wish to repost.

Original link: nickhuang1996.blog.csdn.net/article/details/105889978
