Lightweight Backbone VGNetG: The "No Need to Choose, Take It All" Lightweight Backbone Network

Posted by 风吹稻花香 on 2022/08/05 00:35:05

Timing notes: 128×128 input, GTX 1060 — GPU 8 ms, CPU 11 ms

SkipNet: GPU 5 ms / GPU 18 ms

Modern efficient convolutional neural networks (CNNs) typically rely on depthwise separable convolutions (DSCs) and neural architecture search (NAS) to reduce the number of parameters and the computational complexity, but they ignore some inherent characteristics of the networks. Inspired by visualizing feature maps and N×N (N>1) convolution kernels, the paper introduces several guidelines to further improve parameter efficiency and inference speed.

VGNetG, the parameter-efficient CNN architecture designed from these guidelines, achieves better accuracy and lower latency than previous networks with roughly 30%–50% fewer parameters. On the ImageNet classification dataset, VGNetG-1.0MP reaches 67.7% top-1 accuracy with 0.99M parameters and 69.2% top-1 accuracy with 1.14M parameters.

Furthermore, the paper shows that edge detectors can replace the learnable depthwise convolutions used to mix features, by substituting the N×N kernels with fixed edge-detection kernels: VGNetF-1.5MP reaches 64.4% (-3.2%) top-1 accuracy, and 66.2% (-1.4%) top-1 accuracy with additional Gaussian kernels.

1 Method

The authors mainly study the kernels of three typical kinds of networks, grouped by convolution type:

  1. Standard convolution ==> ResNet-RS

  2. Group convolution ==> RegNet

  3. Depthwise separable convolution ==> MobileNet, ShuffleNetV2, EfficientNets

The visualizations show that the M×N×N kernels exhibit distinctly different patterns and distributions at different stages of a network.

1.1 CNNs Can Learn to Satisfy the Sampling Theorem

Previous work has long held that convolutional neural networks ignore the classical sampling theorem, but the authors find that CNNs can satisfy it to some degree by learning low-pass filters. This is especially true of DSC-based networks such as MobileNetV1 and EfficientNets, as shown in Figure 2.
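This claim is easy to probe on a trained model. Below is a minimal sketch (my own illustration, not code from the paper): it scores how Gaussian-like a learned 3×3 depthwise kernel is via cosine similarity against an ideal Gaussian kernel. If the observation holds, kernels taken from strided depthwise layers of, say, MobileNetV1 would be expected to score high.

import torch

def gaussian_kernel2d(ksize: int = 3, sigma: float = 1.0) -> torch.Tensor:
    half = (ksize - 1) / 2
    x = torch.linspace(-half, half, ksize)
    pdf = torch.exp(-0.5 * (x / sigma) ** 2)
    k1d = pdf / pdf.sum()
    return torch.outer(k1d, k1d)

def gaussian_similarity(kernel: torch.Tensor, sigma: float = 1.0) -> float:
    """Cosine similarity between a learned kernel and an ideal Gaussian kernel."""
    g = gaussian_kernel2d(kernel.shape[-1], sigma).flatten()
    k = kernel.flatten()
    return float(torch.dot(k, g) / (k.norm() * g.norm() + 1e-12))

# Stand-in for a kernel extracted from a trained strided depthwise layer.
w = torch.randn(3, 3)
print(gaussian_similarity(w))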

1. Standard Convolution / Group Convolution

As shown in Figures 2a and 2b, one or more salient N×N kernels, such as blur kernels, stand out among a layer's M×N×N kernels. This phenomenon also implies that the parameters of these layers are redundant. Note that salient kernels do not necessarily look like Gaussian kernels.

2. Depthwise Separable Convolution

The kernels of strided DSCs usually resemble Gaussian kernels; this holds for, among others, MobileNetV1, MobileNetV2, MobileNetV3, ShuffleNetV2, ReXNet, and EfficientNets. Moreover, the distribution of strided-DSC kernels is not a single Gaussian but a Gaussian mixture.

3. Kernels of the Last Convolution Layer

Modern CNNs almost always place a global pooling layer before the classifier to reduce dimensionality, so a similar phenomenon appears in the last depthwise convolution layer as well, as shown in Figure 4.

These visualizations suggest choosing depthwise convolutions over standard and group convolutions for the downsampling layers and the last layer. Moreover, fixed Gaussian kernels can be used in the downsampling layers; a minimal sketch follows.
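A minimal sketch of that guideline, assuming a fixed (non-learnable) 3×3 Gaussian kernel: anti-aliased stride-2 downsampling implemented as a depthwise convolution. The GaussianBlur module in the merged code below implements the same idea.

import torch
import torch.nn.functional as F

def gaussian_downsample(x: torch.Tensor, sigma: float = 1.1) -> torch.Tensor:
    """Blur each channel with a fixed 3x3 Gaussian low-pass kernel, then subsample by 2."""
    c = x.shape[1]
    grid = torch.linspace(-1.0, 1.0, 3)
    pdf = torch.exp(-0.5 * (grid / sigma) ** 2)
    k1d = pdf / pdf.sum()
    k2d = torch.outer(k1d, k1d)
    weight = k2d.repeat(c, 1, 1, 1)   # (C, 1, 3, 3): one fixed kernel per channel
    return F.conv2d(x, weight, stride=2, padding=1, groups=c)

x = torch.randn(1, 8, 32, 32)
print(gaussian_downsample(x).shape)   # torch.Size([1, 8, 16, 16])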

1.2 Reusing Feature Maps Between Adjacent Layers

Identity Kernels and Similar Feature Maps

As shown in the figure above, many depthwise kernels have a large value only at the center, acting like identity kernels in the middle of the network. Since such a convolution merely passes its input on to the next layer, identity kernels lead to duplicated feature maps and redundant computation. The figure below, in turn, shows that many feature maps are similar (duplicated) across adjacent layers.

Therefore, part of the convolutions can be replaced with identity mappings. Besides, depthwise convolutions are slow in the early layers because they usually cannot fully utilize modern accelerators, as reported in ShuffleNetV2. This replacement therefore improves both parameter efficiency and inference time.
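To see why an identity kernel is pure overhead, here is a small self-contained check (my own illustration): a depthwise convolution whose 3×3 kernel is 1 at the center and 0 elsewhere reproduces its input exactly, so it can be replaced by a free pass-through of those channels.

import torch
import torch.nn.functional as F

c = 4
weight = torch.zeros(c, 1, 3, 3)
weight[:, 0, 1, 1] = 1.0                 # identity kernel for every channel

x = torch.randn(1, c, 16, 16)
y = F.conv2d(x, weight, padding=1, groups=c)
print(torch.allclose(x, y))              # True: the layer computes nothing new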

1.3 Edge Detectors Instead of Learnable Depthwise Convolutions

Edge features carry important information about an image. As the figure below shows, most kernels approximate edge-detection kernels, such as Sobel and Laplacian filter kernels, and the proportion of such kernels decreases in later layers while the proportion of blur-like kernels increases.

Hence edge detectors might replace the depthwise convolutions of DSC-based networks for mixing features across spatial positions. The authors demonstrate this by replacing the learnable kernels with fixed edge-detection kernels; a sketch of the idea follows.
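A minimal sketch of the idea, not the paper's exact recipe: fixed Sobel and Laplacian kernels cycled across channels act as a depthwise convolution with no learnable N×N parameters.

import torch
import torch.nn.functional as F

SOBEL_X = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
SOBEL_Y = SOBEL_X.t()
LAPLACIAN = torch.tensor([[0., 1., 0.], [1., -4., 1.], [0., 1., 0.]])

def edge_depthwise(x: torch.Tensor) -> torch.Tensor:
    """Apply fixed edge-detection kernels as a stride-1 depthwise convolution."""
    c = x.shape[1]
    bank = torch.stack([SOBEL_X, SOBEL_Y, LAPLACIAN])      # (3, 3, 3)
    weight = bank[torch.arange(c) % 3].unsqueeze(1)        # (C, 1, 3, 3), kernels cycled
    return F.conv2d(x, weight, padding=1, groups=c)

x = torch.randn(1, 6, 16, 16)
print(edge_depthwise(x).shape)   # torch.Size([1, 6, 16, 16])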

2 Network Architecture

2.1 DownsamplingBlock

DownsamplingBlock halves the resolution and expands the number of channels. As shown in Figure (a), only the expanded channels are generated by a pointwise convolution, so the existing channels are reused. The kernels of the depthwise convolution can be randomly initialized or fixed Gaussian kernels. A simplified sketch of the channel arithmetic follows.
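A simplified sketch (the full DownsamplingBlock in the merged code below also handles SE blocks, uneven splits, and the fixed-Gaussian option): after stride-2 depthwise downsampling, the pointwise convolution only produces the (oup - inp) new channels, while the inp existing channels are reused by concatenation.

import torch
import torch.nn as nn

inp, oup = 56, 112
downsample = nn.Conv2d(inp, inp, 3, stride=2, padding=1, groups=inp, bias=False)
pw = nn.Conv2d(inp, oup - inp, 1, bias=False)   # generate only the expanded channels

x = torch.randn(1, inp, 56, 56)
y = downsample(x)
out = torch.cat([y, pw(y)], dim=1)              # reuse all inp channels + new ones
print(out.shape)                                # torch.Size([1, 112, 28, 28])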

2.2 HalfIdentityBlock

As shown in Figure (b), half of the depthwise convolution is replaced by an identity mapping, and half of the pointwise convolutions are removed, while the block width is kept.

Note that the right half of the input channels becomes the left half of the output channels, for better feature reuse; the sketch below walks through this data flow.
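A simplified, self-contained rewrite of the block (the real HalfIdentityBlock in the merged code below passes the two halves around as a list and concatenates once per stage): one half of the channels passes through untouched, the other half goes through a depthwise 3×3, and a pointwise convolution produces the new half.

import torch
import torch.nn as nn

class HalfIdentitySketch(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        half = channels // 2
        self.dw3x3 = nn.Conv2d(half, half, 3, padding=1, groups=half, bias=False)
        self.pw1x1 = nn.Conv2d(channels, half, 1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        left, right = torch.chunk(x, 2, dim=1)
        mixed = torch.cat([left, self.dw3x3(right)], dim=1)
        # 'right' is reused as the left half of the next block's input.
        return torch.cat([right, self.pw1x1(mixed)], dim=1)

x = torch.randn(1, 64, 28, 28)
print(HalfIdentitySketch(64)(x).shape)   # torch.Size([1, 64, 28, 28])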

2.3 VGNet Architecture

VGNets are built from DownsamplingBlock and HalfIdentityBlock under a constraint on the number of parameters. The overall VGNetG-1.0MP architecture is shown in Table 1.

2.4 Variants of VGNet

To further study the impact of the N×N kernels, several variants of VGNets are introduced: VGNetC, VGNetG, and VGNetF.

VGNetC: all parameters are randomly initialized and learnable.

VGNetG: all parameters are randomly initialized and learnable, except for the kernels of the DownsamplingBlocks.

VGNetF: all parameters of the depthwise convolutions are fixed. (A sketch of freezing these layers follows.)
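A minimal sketch of how a VGNetF-style setup could be approximated with the code below (my own helper, not part of the original file): freeze every depthwise N×N convolution so that only pointwise convolutions, normalization layers, and the classifier remain learnable. Initializing those frozen kernels with Gaussian or edge kernels is a separate step.

import torch.nn as nn

def freeze_depthwise(model: nn.Module) -> None:
    """Disable gradients for all depthwise (grouped, non-1x1) convolutions."""
    for m in model.modules():
        if isinstance(m, nn.Conv2d) and m.groups == m.in_channels and m.kernel_size != (1, 1):
            m.weight.requires_grad_(False)   # keep the N x N kernels fixed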

3 Experiments

4 References

[1] Efficient CNN Architecture Design Guided by Visualization.

Original code:

cv-models/vgnet.py at 8817ebcdc4a06e9843c88a730126b223a7869441 · ffiirree/cv-models · GitHub

I merged all the code into a single file:


  
import os
import time
from collections import OrderedDict
from functools import partial
from typing import Any, List, Union

import torch
import torch.nn as nn
import torch.nn.functional as F

_NORM_POSITION: str = 'before'
_NORMALIZER: nn.Module = nn.BatchNorm2d
_NONLINEAR: nn.Module = partial(nn.ReLU, inplace=True)
_SE_INNER_NONLINEAR: nn.Module = partial(nn.ReLU, inplace=True)
_SE_GATING_FN: nn.Module = nn.Sigmoid
_SE_DIVISOR: int = 8
_SE_USE_NORM: bool = False


def get_gaussian_kernel1d(kernel_size, sigma: torch.Tensor):
    ksize_half = (kernel_size - 1) * 0.5
    x = torch.linspace(-ksize_half, ksize_half, steps=kernel_size).to(sigma.device)
    pdf = torch.exp(-0.5 * (x / sigma).pow(2))
    return pdf / pdf.sum()


def get_gaussian_kernel2d(kernel_size, sigma: torch.Tensor):
    kernel1d = get_gaussian_kernel1d(kernel_size, sigma)
    return torch.mm(kernel1d[:, None], kernel1d[None, :])


class ChannelChunk(nn.Module):
    def __init__(self, groups: int):
        super().__init__()
        self.groups = groups

    def forward(self, x):
        return torch.chunk(x, self.groups, dim=1)

    def extra_repr(self):
        return f'groups={self.groups}'


class ChannelSplit(nn.Module):
    def __init__(self, sections):
        super().__init__()
        self.sections = sections

    def forward(self, x):
        return torch.split(x, self.sections, dim=1)

    def extra_repr(self):
        return f'sections={self.sections}'


class Combine(nn.Module):
    def __init__(self, method: str = 'ADD', *args, **kwargs):
        super().__init__()
        assert method in ['ADD', 'CONCAT'], f'Unknown combine method: {method}'
        self.method = method
        self._combine = self._add if self.method == 'ADD' else self._cat

    @staticmethod
    def _add(x):
        return x[0] + x[1]

    @staticmethod
    def _cat(x):
        return torch.cat(x, dim=1)

    def forward(self, x):
        return self._combine(x)

    def extra_repr(self):
        return f"method='{self.method}'"


def normalizer_fn(channels):
    return _NORMALIZER(channels)


def activation_fn():
    return _NONLINEAR()


def channel_shuffle(x, groups):
    batchsize, num_channels, height, width = x.size()
    channels_per_group = num_channels // groups
    # reshape (N, C, H, W) -> (N, g, C/g, H, W), swap the group dims, then flatten back
    x = x.view(batchsize, groups, channels_per_group, height, width)
    x = torch.transpose(x, 1, 2).contiguous()
    x = x.view(batchsize, -1, height, width)
    return x


def norm_activation(channels, normalizer_fn: nn.Module = None, activation_fn: nn.Module = None,
                    norm_position: str = None) -> List[nn.Module]:
    norm_position = norm_position or _NORM_POSITION
    assert norm_position in ['before', 'after', 'none'], f'Invalid norm position: {norm_position}'
    normalizer_fn = normalizer_fn or _NORMALIZER
    activation_fn = activation_fn or _NONLINEAR
    if normalizer_fn is None and activation_fn is None:
        return []
    if normalizer_fn is None:
        return [activation_fn()]
    if activation_fn is None:
        return [normalizer_fn(channels)]
    if norm_position == 'after':
        return [activation_fn(), normalizer_fn(channels)]
    return [normalizer_fn(channels), activation_fn()]


def make_divisible(value, divisor, min_value=None):
    if min_value is None:
        min_value = divisor
    new_value = max(min_value, int(value + divisor / 2) // divisor * divisor)
    # Make sure that rounding down does not go down by more than 10%.
    if new_value < 0.9 * value:
        new_value += divisor
    return new_value


class Stage(nn.Sequential):
    def __init__(self, *args):
        if len(args) == 1 and isinstance(args[0], list):
            args = args[0]
        super().__init__(*args)

    def append(self, m: Union[nn.Module, List[nn.Module]]):
        if isinstance(m, nn.Module):
            self.add_module(str(len(self)), m)
        elif isinstance(m, list):
            for i in m:
                self.append(i)
        else:
            raise ValueError('m must be an nn.Module or a list of nn.Module')


class Affine(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.dim = dim
        self.alpha = nn.Parameter(torch.ones(dim, 1, 1))
        self.beta = nn.Parameter(torch.zeros(dim, 1, 1))

    def forward(self, x):
        return self.alpha * x + self.beta

    def extra_repr(self):
        return f'{self.dim}'


class Conv2d3x3(nn.Conv2d):
    def __init__(self, in_channels: int, out_channels: int, stride: int = 1, padding: int = None,
                 dilation: int = 1, bias: bool = False, groups: int = 1):
        padding = padding if padding is not None else dilation
        super().__init__(in_channels, out_channels, 3, stride=stride, padding=padding,
                         dilation=dilation, bias=bias, groups=groups)


class Conv2d1x1(nn.Conv2d):
    def __init__(self, in_channels: int, out_channels: int, stride: int = 1, padding: int = 0,
                 bias: bool = False, groups: int = 1):
        super().__init__(in_channels, out_channels, 1, stride=stride, padding=padding,
                         bias=bias, groups=groups)


class DepthwiseConv2d(nn.Conv2d):
    def __init__(self, inp, oup, kernel_size: int = 3, stride: int = 1, padding: int = None,
                 dilation: int = 1, bias: bool = False):
        if padding is None:
            padding = ((kernel_size - 1) * (dilation - 1) + kernel_size) // 2
        super().__init__(inp, oup, kernel_size, stride=stride, padding=padding,
                         dilation=dilation, bias=bias, groups=inp)


class PointwiseConv2d(nn.Conv2d):
    def __init__(self, inp, oup, stride: int = 1, bias: bool = False, groups: int = 1):
        super().__init__(inp, oup, 1, stride=stride, padding=0, bias=bias, groups=groups)


class Conv2d3x3BN(nn.Sequential):
    def __init__(self, in_channels: int, out_channels: int, stride: int = 1, padding: int = None,
                 dilation: int = 1, bias: bool = False, groups: int = 1, normalizer_fn: nn.Module = None):
        normalizer_fn = normalizer_fn or _NORMALIZER
        padding = padding if padding is not None else dilation
        super().__init__(Conv2d3x3(in_channels, out_channels, stride=stride, padding=padding,
                                   dilation=dilation, bias=bias, groups=groups))
        if normalizer_fn:
            self.add_module(str(self.__len__()), normalizer_fn(out_channels))


class Conv2d1x1BN(nn.Sequential):
    def __init__(self, in_channels: int, out_channels: int, stride: int = 1, padding: int = 0,
                 bias: bool = False, groups: int = 1, normalizer_fn: nn.Module = None):
        normalizer_fn = normalizer_fn or _NORMALIZER
        super().__init__(Conv2d1x1(in_channels, out_channels, stride=stride, padding=padding,
                                   bias=bias, groups=groups))
        if normalizer_fn:
            self.add_module(str(self.__len__()), normalizer_fn(out_channels))


class Conv2d1x1Block(nn.Sequential):
    def __init__(self, in_channels: int, out_channels: int, stride: int = 1, padding: int = 0,
                 bias: bool = False, groups: int = 1, normalizer_fn: nn.Module = None,
                 activation_fn: nn.Module = None, norm_position: str = None):
        super().__init__(Conv2d1x1(in_channels, out_channels, stride=stride, padding=padding,
                                   bias=bias, groups=groups),
                         *norm_activation(out_channels, normalizer_fn, activation_fn, norm_position))


class Conv2dBlock(nn.Sequential):
    def __init__(self, in_channels, out_channels, kernel_size: int = 3, stride: int = 1,
                 padding: int = None, dilation: int = 1, bias: bool = False, groups: int = 1,
                 normalizer_fn: nn.Module = None, activation_fn: nn.Module = None,
                 norm_position: str = None):
        if padding is None:
            padding = ((kernel_size - 1) * (dilation - 1) + kernel_size) // 2
        super().__init__(nn.Conv2d(in_channels, out_channels, kernel_size=kernel_size, bias=bias,
                                   stride=stride, padding=padding, dilation=dilation, groups=groups),
                         *norm_activation(out_channels, normalizer_fn, activation_fn, norm_position))


class DropPath(nn.Module):
    """Stochastic Depth: drop paths per sample (when applied in the main path of residual blocks)."""

    def __init__(self, survival_prob: float):
        super().__init__()
        self.p = survival_prob

    def forward(self, x):
        if self.p == 1. or not self.training:
            return x
        # Works with tensors of any dimension, not just 2D ConvNets.
        shape = (x.shape[0],) + (1,) * (x.ndim - 1)
        probs = self.p + torch.rand(shape, dtype=x.dtype, device=x.device)
        # Re-calibrate the output by the survival probability p.
        return (x / self.p) * probs.floor_()

    def extra_repr(self):
        return f'survival_prob={self.p}'


class DepthwiseConv2dBN(nn.Sequential):
    def __init__(self, inp, oup, kernel_size: int = 3, stride: int = 1, padding: int = None,
                 dilation: int = 1, normalizer_fn: nn.Module = None):
        normalizer_fn = normalizer_fn or _NORMALIZER
        super().__init__(DepthwiseConv2d(inp, oup, kernel_size, stride=stride, padding=padding,
                                         dilation=dilation),
                         normalizer_fn(oup))


class DepthwiseBlock(nn.Sequential):
    def __init__(self, inp, oup, kernel_size: int = 3, stride: int = 1, padding: int = None,
                 dilation: int = 1, normalizer_fn: nn.Module = None, activation_fn: nn.Module = None,
                 norm_position: str = None):
        super().__init__(DepthwiseConv2d(inp, oup, kernel_size, stride, padding=padding, dilation=dilation),
                         *norm_activation(oup, normalizer_fn, activation_fn, norm_position))


class PointwiseBlock(nn.Sequential):
    def __init__(self, inp, oup, stride: int = 1, groups: int = 1, normalizer_fn: nn.Module = None,
                 activation_fn: nn.Module = None, norm_position: str = None):
        super().__init__(PointwiseConv2d(inp, oup, stride=stride, groups=groups),
                         *norm_activation(oup, normalizer_fn, activation_fn, norm_position))


class ChannelShuffle(nn.Module):
    def __init__(self, groups: int):
        super().__init__()
        self.groups = groups

    def forward(self, x):
        return channel_shuffle(x, self.groups)

    def extra_repr(self):
        return 'groups={}'.format(self.groups)


class SEBlock(nn.Sequential):
    """Squeeze-and-excitation block."""

    def __init__(self, channels, ratio, inner_activation_fn: nn.Module = None, gating_fn: nn.Module = None):
        squeezed_channels = make_divisible(int(channels * ratio), _SE_DIVISOR)
        inner_activation_fn = inner_activation_fn or _SE_INNER_NONLINEAR
        gating_fn = gating_fn or _SE_GATING_FN
        layers = OrderedDict([])
        layers['pool'] = nn.AdaptiveAvgPool2d((1, 1))
        layers['reduce'] = Conv2d1x1(channels, squeezed_channels, bias=True)
        if _SE_USE_NORM:
            layers['norm'] = _NORMALIZER(squeezed_channels)
        layers['act'] = inner_activation_fn()
        layers['expand'] = Conv2d1x1(squeezed_channels, channels, bias=True)
        layers['gate'] = gating_fn()
        super().__init__(layers)

    def _forward(self, input):
        for module in self:
            input = module(input)
        return input

    def forward(self, x):
        return x * self._forward(x)


class InvertedResidualBlock(nn.Module):
    def __init__(self, inp, oup, t, kernel_size: int = 3, stride: int = 1, padding: int = None,
                 dilation: int = 1, se_ratio: float = None, se_ind: bool = False,
                 survival_prob: float = None, normalizer_fn: nn.Module = None,
                 activation_fn: nn.Module = None, dw_se_act: nn.Module = None):
        super().__init__()
        self.inp = inp
        self.planes = int(self.inp * t)
        self.oup = oup
        self.stride = stride
        self.apply_residual = (self.stride == 1) and (self.inp == self.oup)
        self.se_ratio = se_ratio if se_ind or se_ratio is None else (se_ratio / t)
        self.has_se = (self.se_ratio is not None) and (self.se_ratio > 0) and (self.se_ratio <= 1)
        normalizer_fn = normalizer_fn or _NORMALIZER
        activation_fn = activation_fn or _NONLINEAR
        layers = []
        if t != 1:
            layers.append(Conv2d1x1Block(inp, self.planes, normalizer_fn=normalizer_fn,
                                         activation_fn=activation_fn))
        if dw_se_act is None:
            layers.append(DepthwiseBlock(self.planes, self.planes, kernel_size, stride=self.stride,
                                         padding=padding, dilation=dilation,
                                         normalizer_fn=normalizer_fn, activation_fn=activation_fn))
        else:
            layers.append(DepthwiseConv2dBN(self.planes, self.planes, kernel_size, stride=self.stride,
                                            padding=padding, dilation=dilation, normalizer_fn=normalizer_fn))
        if self.has_se:
            layers.append(SEBlock(self.planes, self.se_ratio))
        if dw_se_act:
            layers.append(dw_se_act())
        layers.append(Conv2d1x1BN(self.planes, oup, normalizer_fn=normalizer_fn))
        if self.apply_residual and survival_prob:
            layers.append(DropPath(survival_prob))
        self.branch1 = nn.Sequential(*layers)
        self.branch2 = nn.Identity() if self.apply_residual else None
        self.combine = Combine('ADD') if self.apply_residual else None

    def forward(self, x):
        if self.apply_residual:
            return self.combine([self.branch2(x), self.branch1(x)])
        else:
            return self.branch1(x)


class FusedInvertedResidualBlock(nn.Module):
    def __init__(self, inp, oup, t, kernel_size: int = 3, stride: int = 1, padding: int = None,
                 se_ratio: float = None, se_ind: bool = False, survival_prob: float = None,
                 normalizer_fn: nn.Module = None, activation_fn: nn.Module = None):
        super().__init__()
        self.inp = inp
        self.planes = int(self.inp * t)
        self.oup = oup
        self.stride = stride
        self.padding = padding if padding is not None else (kernel_size // 2)
        self.apply_residual = (self.stride == 1) and (self.inp == self.oup)
        self.se_ratio = se_ratio if se_ind or se_ratio is None else (se_ratio / t)
        self.has_se = (self.se_ratio is not None) and (self.se_ratio > 0) and (self.se_ratio <= 1)
        normalizer_fn = normalizer_fn or _NORMALIZER
        activation_fn = activation_fn or _NONLINEAR
        layers = [Conv2dBlock(inp, self.planes, kernel_size, stride=self.stride, padding=self.padding,
                              normalizer_fn=normalizer_fn, activation_fn=activation_fn)]
        if self.has_se:
            layers.append(SEBlock(self.planes, self.se_ratio))
        layers.append(Conv2d1x1BN(self.planes, oup, normalizer_fn=normalizer_fn))
        if self.apply_residual and survival_prob:
            layers.append(DropPath(survival_prob))
        self.branch1 = nn.Sequential(*layers)
        self.branch2 = nn.Identity() if self.apply_residual else None
        self.combine = Combine('ADD') if self.apply_residual else None

    def forward(self, x):
        if self.apply_residual:
            return self.combine([self.branch2(x), self.branch1(x)])
        else:
            return self.branch1(x)


class SharedDepthwiseConv2d(nn.Module):
    def __init__(self, channels, kernel_size: int = 3, stride: int = 1, padding: int = None,
                 dilation: int = 1, t: int = 2, bias: bool = False):
        super().__init__()
        self.channels = channels // t
        self.t = t
        if padding is None:
            padding = ((kernel_size - 1) * (dilation - 1) + kernel_size) // 2
        # One set of depthwise kernels shared by all t channel groups.
        self.mux = DepthwiseConv2d(self.channels, self.channels, kernel_size, stride, padding,
                                   dilation, bias=bias)

    def forward(self, x):
        x = torch.chunk(x, self.t, dim=1)
        x = [self.mux(xi) for xi in x]
        return torch.cat(x, dim=1)


class HalfIdentityBlock(nn.Module):
    def __init__(self, inp: int, se_ratio: float = 0.0):
        super().__init__()
        self.half3x3 = Conv2d3x3(inp // 2, inp // 2, groups=(inp // 2))
        self.combine = Combine('CONCAT')
        self.conv1x1 = PointwiseBlock(inp, inp // 2)
        if se_ratio > 0.0:
            self.conv1x1 = nn.Sequential(PointwiseBlock(inp, inp // 2), SEBlock(inp // 2, se_ratio))

    def forward(self, x):
        # x is a [left, right] pair; the right half is reused as the next block's left half.
        out = self.combine([x[0], self.half3x3(x[1])])
        return [x[1], self.conv1x1(out)]


class GaussianBlur(nn.Module):
    def __init__(self, channels: int, kernel_size: int = 3, stride: int = 1, padding: int = None,
                 dilation: int = 1, sigma: float = 1.0, learnable: bool = True):
        super().__init__()
        padding = padding or ((kernel_size - 1) * (dilation - 1) + kernel_size) // 2
        self.channels = channels
        self.kernel_size = (kernel_size, kernel_size)
        self.padding = (padding, padding)
        self.stride = (stride, stride)
        self.dilation = (dilation, dilation)
        self.padding_mode = 'zeros'
        self.learnable = learnable
        # Only sigma is (optionally) learnable; the kernel itself is always Gaussian.
        self.sigma = nn.Parameter(torch.tensor(sigma), learnable)

    def forward(self, x):
        return F.conv2d(x, self.weight, None, self.stride, self.padding, self.dilation, self.channels)

    @property
    def weight(self):
        kernel = get_gaussian_kernel2d(self.kernel_size[0], self.sigma)
        return kernel.repeat(self.channels, 1, 1, 1)

    @property
    def out_channels(self):
        return self.channels

    def extra_repr(self):
        s = ('{channels}, kernel_size={kernel_size}'
             ', learnable={learnable}, stride={stride}')
        if self.padding != (0,) * len(self.padding):
            s += ', padding={padding}'
        if self.dilation != (1,) * len(self.dilation):
            s += ', dilation={dilation}'
        if self.padding_mode != 'zeros':
            s += ', padding_mode={padding_mode}'
        return s.format(**self.__dict__)


class DownsamplingBlock(nn.Module):
    def __init__(self, inp, oup, stride: int = 2, method: str = 'blur', se_ratio: float = 0.0):
        assert method in ['blur', 'dwconv', 'maxpool'], f'Unknown downsampling method: {method}'
        super().__init__()
        if method == 'dwconv' or stride == 1:
            self.downsample = DepthwiseConv2d(inp, inp, 3, stride)
        elif method == 'maxpool':
            self.downsample = nn.MaxPool2d(kernel_size=3, stride=stride)
        elif method == 'blur':
            self.downsample = GaussianBlur(inp, stride=stride, sigma=1.1, learnable=False)
        else:
            raise ValueError(f'Unknown downsampling method: {method}.')
        # The pointwise conv only generates the expanded channels; the rest are reused.
        split_chs = 0 if inp > oup else min(oup // 2, inp)
        self.split = ChannelSplit([inp - split_chs, split_chs])
        self.conv1x1 = PointwiseBlock(inp, oup - split_chs)
        if se_ratio > 0.0:
            self.conv1x1 = nn.Sequential(PointwiseBlock(inp, oup - split_chs),
                                         SEBlock(oup - split_chs, se_ratio))
        self.halve = nn.Identity()
        if oup > 2 * inp or inp > oup:
            self.halve = nn.Sequential(Combine('CONCAT'), ChannelChunk(2))

    def forward(self, x):
        x = self.downsample(x)
        _, x2 = self.split(x)
        return self.halve([x2, self.conv1x1(x)])


class VGNet(nn.Module):
    def __init__(self, in_channels: int = 3, num_classes: int = 1000, channels: List[int] = None,
                 downsamplings: List[str] = None, layers: List[int] = None, se_ratio: float = 0.0,
                 thumbnail: bool = False, **kwargs: Any):
        super().__init__()
        FRONT_S = 1 if thumbnail else 2
        strides = [FRONT_S, 2, 2, 2]
        self.features = nn.Sequential(OrderedDict([
            ('stem', Conv2dBlock(in_channels, channels[0], stride=FRONT_S))]))
        for i in range(len(strides)):
            self.features.add_module(
                f'stage{i + 1}',
                self.make_layers(channels[i], channels[i + 1], strides[i], downsamplings[i],
                                 layers[i], se_ratio))
        self.features.stage4.append(nn.Sequential(
            # DepthwiseConv2d(channels[-1], channels[-1]),
            SharedDepthwiseConv2d(channels[-1], t=8),
            PointwiseBlock(channels[-1], channels[-1]),
        ))
        self.avg = nn.AdaptiveAvgPool2d((1, 1))
        self.classifier = nn.Linear(channels[-1], num_classes)

    def make_layers(self, inp, oup, s, m, n, se_ratio):
        layers = [DownsamplingBlock(inp, oup, stride=s, method=m, se_ratio=se_ratio)]
        for _ in range(n - 1):
            layers.append(HalfIdentityBlock(oup, se_ratio))
        layers.append(Combine('CONCAT'))
        return Stage(layers)

    def forward(self, x):
        x = self.features(x)
        x = self.avg(x)
        x = torch.flatten(x, 1)
        x = self.classifier(x)
        return x


def _vgnet(pretrained: bool = False, pth: str = None, progress: bool = True, **kwargs: Any):
    model = VGNet(**kwargs)
    if pretrained:
        if pth is not None:
            state_dict = torch.load(os.path.expanduser(pth))
        else:
            assert 'url' in kwargs and kwargs['url'] != '', 'Invalid URL.'
            state_dict = torch.hub.load_state_dict_from_url(kwargs['url'], progress=progress)
        model.load_state_dict(state_dict)
    return model


# Variant configurations from the original repo. The @export / @nonlinear decorators
# are not included in this merged file, so these builders stay commented out:
# @export
# @nonlinear(partial(nn.SiLU, inplace=True))
# def vgnetg_1_0mp_se(pretrained: bool = False, pth: str = None, progress: bool = True, **kwargs: Any):
#     kwargs['channels'] = [28, 56, 112, 224, 368]
#     kwargs['downsamplings'] = ['blur', 'blur', 'blur', 'blur']
#     kwargs['layers'] = [4, 7, 13, 2]
#     kwargs['se_ratio'] = 0.25
#     return _vgnet(pretrained, pth, progress, **kwargs)
#
# @export
# @nonlinear(partial(nn.SiLU, inplace=True))
# def vgnetg_1_5mp_se(pretrained: bool = False, pth: str = None, progress: bool = True, **kwargs: Any):
#     kwargs['channels'] = [32, 64, 128, 256, 512]
#     kwargs['downsamplings'] = ['blur', 'blur', 'blur', 'blur']
#     kwargs['layers'] = [3, 7, 14, 2]
#     kwargs['se_ratio'] = 0.25
#     return _vgnet(pretrained, pth, progress, **kwargs)
#
# @export
# @nonlinear(partial(nn.SiLU, inplace=True))
# def vgnetg_2_0mp_se(pretrained: bool = False, pth: str = None, progress: bool = True, **kwargs: Any):
#     kwargs['channels'] = [32, 72, 168, 376, 512]
#     kwargs['downsamplings'] = ['blur', 'blur', 'blur', 'blur']
#     kwargs['layers'] = [3, 6, 13, 2]
#     kwargs['se_ratio'] = 0.25
#     return _vgnet(pretrained, pth, progress, **kwargs)
#
# @export
# @nonlinear(partial(nn.SiLU, inplace=True))
# def vgnetg_2_5mp_se(pretrained: bool = False, pth: str = None, progress: bool = True, **kwargs: Any):
#     kwargs['channels'] = [32, 80, 192, 400, 544]
#     kwargs['downsamplings'] = ['blur', 'blur', 'blur', 'blur']
#     kwargs['layers'] = [3, 6, 16, 2]
#     kwargs['se_ratio'] = 0.25
#     return _vgnet(pretrained, pth, progress, **kwargs)
#
# @export
# @nonlinear(partial(nn.SiLU, inplace=True))
# def vgnetg_5_0mp_se(pretrained: bool = False, pth: str = None, progress: bool = True, **kwargs: Any):
#     kwargs['channels'] = [32, 88, 216, 456, 856]
#     kwargs['downsamplings'] = ['blur', 'blur', 'blur', 'blur']
#     kwargs['layers'] = [4, 7, 15, 5]
#     kwargs['se_ratio'] = 0.25
#     return _vgnet(pretrained, pth, progress, **kwargs)


if __name__ == '__main__':
    # VGNetG-1.0MP configuration with a 4-class head and a 128x128 input.
    kwargs = {}
    kwargs['channels'] = [28, 56, 112, 224, 368]
    kwargs['downsamplings'] = ['blur', 'blur', 'blur', 'blur']
    kwargs['layers'] = [4, 7, 13, 2]
    kwargs['se_ratio'] = 0.25
    kwargs['num_classes'] = 4
    model = _vgnet(False, None, True, **kwargs)
    model.eval()  # .cuda()
    data = torch.randn(1, 3, 128, 128)  # .cuda()
    for i in range(20):
        start = time.time()
        out = model(data)
        print('time', time.time() - start, out.size())

Source: blog.csdn.net. Author: AI视觉网奇. Copyright belongs to the original author; please contact the author for reprints.

Original link: blog.csdn.net/jacke121/article/details/126151698
