The VGG-M Neural Network

Posted by 小小谢先生 on 2022/04/15 23:49:33

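Object-tracking papers frequently refer to the VGG-M network, which is the same model as CNN-M. It comes from the paper "Return of the Devil in the Details: Delving Deep into Convolutional Nets", which defines it as follows: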

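The architecture has five convolutional layers and three fully connected layers. Its defining trait is a first convolutional layer with a reduced stride and a smaller receptive field (7×7 filters with stride 2 in the code below), which was shown to be beneficial on the ILSVRC dataset. To keep computation time reasonable, conv2 uses a larger stride (2 instead of 1). The conv4 layer also uses fewer filters: 512, instead of the 1024 used in the similar Zeiler and Fergus network.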
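To see how these strides shape the network, the spatial size of the feature maps can be traced layer by layer. The sketch below is a minimal illustration in plain Python. The kernel, stride, and padding values are read from the code listing in this post; the 224×224 input size and the helper names (`conv_out`, `pool_out`, `trace_sizes`) are assumptions made for the example.

```python
import math


def conv_out(size, kernel, stride, padding=0):
    """Output length of a convolution along one spatial axis (floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel, stride, padding=0, ceil_mode=True):
    """Output length of a pooling layer; ceil_mode keeps a partial last window.

    Simplification: ignores the framework rule that drops a window starting
    entirely in the padding. It cannot trigger here because padding is 0.
    """
    span = size + 2 * padding - kernel
    steps = math.ceil(span / stride) if ceil_mode else span // stride
    return steps + 1


# (name, kind, kernel, stride, padding), as read from the listing in this post.
LAYERS = [
    ("conv1", "conv", 7, 2, 0),
    ("pool1", "pool", 3, 2, 0),
    ("conv2", "conv", 5, 2, 1),
    ("pool2", "pool", 3, 2, 0),
    ("conv3", "conv", 3, 1, 1),
    ("conv4", "conv", 3, 1, 1),
    ("conv5", "conv", 3, 1, 1),
    ("pool5", "pool", 3, 2, 0),
]


def trace_sizes(input_size):
    """Return [(layer_name, spatial_size), ...] for a square input."""
    sizes = []
    size = input_size
    for name, kind, kernel, stride, padding in LAYERS:
        if kind == "conv":
            size = conv_out(size, kernel, stride, padding)
        else:
            size = pool_out(size, kernel, stride, padding)
        sizes.append((name, size))
    return sizes


if __name__ == "__main__":
    # 224 is an assumed input size, not a value stated in the post.
    for name, size in trace_sizes(224):
        print(f"{name}: {size} x {size}")
    final = trace_sizes(224)[-1][1]
    print("flattened features:", 512 * final * final)
```

Under these assumptions the final feature map is 6×6 with 512 channels, that is 512 × 6 × 6 = 18432 values, which agrees with the input width of the first fully connected layer in the listing. A 221×221 input leads to the same 6×6 map.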

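To improve the robustness and generalization of the network, cross-channel local response normalization (SpatialCrossMapLRN) is sometimes added to VGG-M. The code is as follows: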


  
    import torch
    import torch.nn as nn

    # NOTE: the original post uses `pretrained_settings` without showing it.
    # The values below are assumed placeholders for illustration only;
    # replace them with the settings that belong to your weights file.
    pretrained_settings = {
        'vggm': {
            'imagenet': {
                'input_space': 'BGR',
                'input_size': [3, 221, 221],
                'input_range': [0, 255],
                'mean': [123.68, 116.779, 103.939],
                'std': [1, 1, 1],
                'num_classes': 1000,
            }
        }
    }


    class SpatialCrossMapLRN(nn.Module):
        """Local response normalization, across channels or within each channel."""

        def __init__(self, local_size=1, alpha=1.0, beta=0.75, k=1, ACROSS_CHANNELS=True):
            super().__init__()
            self.ACROSS_CHANNELS = ACROSS_CHANNELS
            pad = (local_size - 1) // 2
            if ACROSS_CHANNELS:
                # Average over a window of neighbouring channels at each pixel.
                self.average = nn.AvgPool3d(kernel_size=(local_size, 1, 1),
                                            stride=1,
                                            padding=(pad, 0, 0))
            else:
                # Average over a spatial window inside each channel.
                self.average = nn.AvgPool2d(kernel_size=local_size,
                                            stride=1,
                                            padding=pad)
            self.alpha = alpha
            self.beta = beta
            self.k = k

        def forward(self, x):
            div = x.pow(2)
            if self.ACROSS_CHANNELS:
                # (N, C, H, W) -> (N, 1, C, H, W) so that 3D pooling slides over C.
                div = self.average(div.unsqueeze(1)).squeeze(1)
            else:
                div = self.average(div)
            div = div.mul(self.alpha).add(self.k).pow(self.beta)
            return x.div(div)


    class VGGM(nn.Module):
        def __init__(self, num_classes=1000):
            super().__init__()
            self.num_classes = num_classes
            self.features = nn.Sequential(
                nn.Conv2d(3, 96, (7, 7), (2, 2)),  # conv1
                nn.ReLU(),
                SpatialCrossMapLRN(5, 0.0005, 0.75, 2),
                nn.MaxPool2d((3, 3), (2, 2), (0, 0), ceil_mode=True),
                nn.Conv2d(96, 256, (5, 5), (2, 2), (1, 1)),  # conv2
                nn.ReLU(),
                SpatialCrossMapLRN(5, 0.0005, 0.75, 2),
                nn.MaxPool2d((3, 3), (2, 2), (0, 0), ceil_mode=True),
                nn.Conv2d(256, 512, (3, 3), (1, 1), (1, 1)),  # conv3
                nn.ReLU(),
                nn.Conv2d(512, 512, (3, 3), (1, 1), (1, 1)),  # conv4
                nn.ReLU(),
                nn.Conv2d(512, 512, (3, 3), (1, 1), (1, 1)),  # conv5
                nn.ReLU(),
                nn.MaxPool2d((3, 3), (2, 2), (0, 0), ceil_mode=True)
            )
            self.classifier = nn.Sequential(
                nn.Linear(18432, 4096),  # 18432 = 512 channels * 6 * 6
                nn.ReLU(),
                nn.Dropout(0.5),
                nn.Linear(4096, 4096),
                nn.ReLU(),
                nn.Dropout(0.5),
                nn.Linear(4096, num_classes)
            )

        def forward(self, x):
            x = self.features(x)
            x = x.view(x.size(0), -1)  # flatten to (N, 18432)
            x = self.classifier(x)
            return x


    def vggm(num_classes=1000, pretrained='imagenet'):
        """Build VGG-M; pass pretrained=None to skip loading weights."""
        if pretrained:
            settings = pretrained_settings['vggm'][pretrained]
            assert num_classes == settings['num_classes'], \
                "num_classes should be {}, but is {}".format(settings['num_classes'], num_classes)
            model = VGGM(num_classes=num_classes)
            # Weights are loaded onto the CPU from a local file.
            model.load_state_dict(torch.load('../vggm.pth', map_location='cpu'))
            model.input_space = settings['input_space']
            model.input_size = settings['input_size']
            model.input_range = settings['input_range']
            model.mean = settings['mean']
            model.std = settings['std']
        else:
            model = VGGM(num_classes=num_classes)
        return model
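The across-channel branch of `SpatialCrossMapLRN` in the listing can be read as a per-pixel formula: each activation is divided by (k + alpha × mean of squared activations over the `local_size` neighbouring channels) raised to the power beta. The sketch below restates that in plain Python for a single pixel's channel vector, assuming an odd `local_size`. The function name `lrn_across_channels` is hypothetical, and the code is an illustration of the formula, not a replacement for the module.

```python
def lrn_across_channels(values, local_size=5, alpha=0.0005, beta=0.75, k=2.0):
    """Normalize one pixel's channel vector like the across-channel branch.

    out[c] = values[c] / (k + alpha * mean_sq[c]) ** beta, where mean_sq[c]
    is the sum of squares in the channel window around c divided by
    local_size.
    """
    half = (local_size - 1) // 2
    out = []
    for c, v in enumerate(values):
        lo = max(0, c - half)
        hi = min(len(values), c + half + 1)
        # The pooling layer pads with zeros and still divides by local_size,
        # so clipped windows at the first and last channels use the same divisor.
        mean_sq = sum(u * u for u in values[lo:hi]) / local_size
        out.append(v / (k + alpha * mean_sq) ** beta)
    return out


if __name__ == "__main__":
    # Defaults mirror SpatialCrossMapLRN(5, 0.0005, 0.75, 2) from the listing.
    print(lrn_across_channels([10.0, 20.0, 30.0, 40.0, 50.0, 60.0]))
```

PyTorch's built-in `torch.nn.LocalResponseNorm(size, alpha, beta, k)` appears to compute the same formula, so `nn.LocalResponseNorm(5, 0.0005, 0.75, 2)` may work as a substitute. It is worth comparing the two numerically on a random tensor before swapping one for the other.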

 

Source: blog.csdn.net. Author: 小小谢先生. Copyright belongs to the original author; please contact the author before reposting.

Original link: blog.csdn.net/xiewenrui1996/article/details/107253152
