Lightweight Backbone VGNetG: a lightweight backbone network that refuses to choose and takes it all
128×128 input: GPU (1060) 8 ms, CPU 11 ms
skipnet: GPU 5 ms, GPU 18 ms
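Modern efficient convolutional neural networks (CNNs) almost always use depthwise separable convolutions (DSCs) and neural architecture search (NAS) to reduce the number of parameters and the computational complexity, but some inherent characteristics of the networks are overlooked. Inspired by visualizations of feature maps and of N×N (N>1) convolution kernels, the paper introduces several guidelines to further improve parameter efficiency and inference speed.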
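The parameter-efficient CNN architecture designed under these guidelines is called VGNetG. It achieves better accuracy and lower latency than previous networks with about 30%~50% fewer parameters. VGNetG-1.0MP achieves 67.7% top-1 accuracy with 0.99M parameters and 69.2% top-1 accuracy with 1.14M parameters on the ImageNet classification dataset. The paper also shows that edge detectors can replace learnable depthwise convolution layers for mixing features, by replacing the N×N kernels with fixed edge-detection kernels. VGNetF-1.5MP reaches 64.4% (-3.2%) top-1 accuracy, and 66.2% (-1.4%) top-1 accuracy with additional Gaussian kernels.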
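1 Method
The authors mainly study three typical families of networks, grouped by the type of convolution they are built from:
- Standard convolution ==> ResNet-RS
- Group convolution ==> RegNet
- Depthwise separable convolution ==> MobileNet, ShuffleNetV2 and EfficientNets
These visualizations show that the M×N×N kernels have clearly different patterns and distributions at different stages of a network.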
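To make the difference between the three convolution types concrete, the following sketch counts the weights of a single layer. The channel sizes and the group count are illustrative assumptions rather than values from the paper, the helper names are hypothetical, and biases and normalization parameters are ignored.

```python
def conv_params(c_in: int, c_out: int, k: int, groups: int = 1) -> int:
    """Weight count of a k x k convolution without bias."""
    assert c_in % groups == 0 and c_out % groups == 0
    return (c_in // groups) * c_out * k * k


def dsc_params(c_in: int, c_out: int, k: int) -> int:
    """Depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = conv_params(c_in, c_in, k, groups=c_in)
    pointwise = conv_params(c_in, c_out, 1)
    return depthwise + pointwise


# Illustrative layer sizes (assumed, not taken from the paper).
c_in, c_out, k = 128, 128, 3
print("standard :", conv_params(c_in, c_out, k))
print("group(8) :", conv_params(c_in, c_out, k, groups=8))
print("separable:", dsc_params(c_in, c_out, k))
```

In the separable case almost all weights sit in the pointwise convolution, which is why the N×N depthwise kernels are few enough to inspect one by one.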
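1.1 CNNs can learn to satisfy the sampling theorem
Previous work has held that CNNs ignore the classical sampling theorem. The authors find, however, that CNNs can satisfy the sampling theorem to some extent by learning low-pass filters, especially DSC-based networks such as MobileNetV1 and EfficientNets, as shown in Figure 2.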
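1. Standard convolution / group convolution
As shown in Figures 2a and 2b, within the whole M×N×N kernel there are one or more salient N×N kernels, such as blur kernels. This also implies that the parameters of these layers are redundant. Note that a salient kernel does not necessarily look like a Gaussian kernel.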
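2. Depthwise separable convolution
The kernels of strided DSCs usually resemble Gaussian kernels, in networks including but not limited to MobileNetV1, MobileNetV2, MobileNetV3, ShuffleNetV2, ReXNet and EfficientNets. Moreover, the distribution of strided-DSC kernels is not a single Gaussian but a mixture of Gaussians.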
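3. Kernels of the last convolutional layer
Modern CNNs always use a global pooling layer before the classifier to reduce dimensionality. A similar phenomenon therefore also appears in the last depthwise convolutional layer, as shown in Figure 4.
These visualizations suggest choosing depthwise convolution rather than standard or group convolution in the downsampling layers and in the last layer. In addition, fixed Gaussian kernels can be used in the downsampling layers.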
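A fixed Gaussian kernel is a simple low-pass filter. The sketch below builds a normalized N×N Gaussian kernel in plain Python, following the same construction as `get_gaussian_kernel2d` in the code listing of this post. The value sigma = 1.1 matches what the listing passes to `GaussianBlur`; the helper names used here are hypothetical.

```python
import math
from typing import List


def gaussian_kernel1d(kernel_size: int, sigma: float) -> List[float]:
    """Sample a Gaussian at unit-spaced offsets centered on zero and normalize it."""
    half = (kernel_size - 1) * 0.5
    offsets = [i - half for i in range(kernel_size)]
    pdf = [math.exp(-0.5 * (x / sigma) ** 2) for x in offsets]
    total = sum(pdf)
    return [p / total for p in pdf]


def gaussian_kernel2d(kernel_size: int, sigma: float) -> List[List[float]]:
    """Outer product of the 1-D kernel with itself."""
    k1 = gaussian_kernel1d(kernel_size, sigma)
    return [[a * b for b in k1] for a in k1]


kernel = gaussian_kernel2d(3, 1.1)
for row in kernel:
    print(" ".join(f"{v:.4f}" for v in row))
```

The weights are all positive, sum to 1, and peak at the center, so applying the kernel before striding averages neighboring pixels instead of discarding them.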
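1.2 Reusing feature maps between adjacent layers
Identity kernels and similar feature maps
As shown in the figure above, many depthwise convolution kernels in the middle part of the network have a large value only at the center, like identity kernels. Because such a kernel simply passes its input on to the next layer, convolution with an identity kernel duplicates feature maps and wastes computation. On the other hand, the figure below shows that many feature maps are similar (duplicated) between adjacent layers.
Part of the convolutions can therefore be replaced with identity mappings. In addition, depthwise convolutions are slow in early layers because they often cannot fully utilize modern accelerators, as reported in ShuffleNet V2. This approach can thus improve both parameter efficiency and inference time.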
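The redundancy argument can be checked on a toy example: a kernel that is 1 at the center and 0 elsewhere returns its input unchanged. The following is a minimal plain-Python sketch; the function name and the 3×3 feature map are made up for illustration.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_same(image: Matrix, kernel: Matrix) -> Matrix:
    """Cross-correlate a 2-D image with an odd-sized kernel using zero padding."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for u in range(kh):
                for v in range(kw):
                    y, x = i + u - ph, j + v - pw
                    # Positions outside the image count as zero.
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[u][v]
            out[i][j] = acc
    return out


identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
feature_map = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]

# The convolution does real work but produces a copy of its input.
print(conv2d_same(feature_map, identity) == feature_map)
```

Skipping the convolution for such channels gives the same result at no cost in parameters or computation.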
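1.3 Edge detectors as learnable depthwise convolutions
Edge features carry important information about an image. As shown in the figure below, a large portion of the kernels approximate edge-detection kernels, such as Sobel filter kernels and Laplacian filter kernels. The proportion of such kernels decreases in later layers, while the proportion of kernels that resemble blur kernels increases.
Edge detectors may therefore be able to replace the depthwise convolutions in DSC-based networks to mix features across spatial locations. The authors demonstrate this by replacing the learnable kernels with edge-detection kernels.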
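For readers unfamiliar with these filters, the sketch below applies the standard 3×3 Sobel (horizontal gradient) and Laplacian kernels to a toy step edge. It only illustrates what a fixed edge-detection kernel computes; the images and the helper name are assumptions, and this is not the authors' VGNetF implementation.

```python
from typing import List

Matrix = List[List[float]]

SOBEL_X: Matrix = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
LAPLACIAN: Matrix = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]


def correlate_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide the kernel over the image without padding (cross-correlation)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + u][j + v] * kernel[u][v] for u in range(kh) for v in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# A vertical step edge (dark left half, bright right half) and a flat patch.
step_edge = [[0, 0, 1, 1] for _ in range(4)]
flat = [[1, 1, 1, 1] for _ in range(4)]

print(correlate_valid(step_edge, SOBEL_X))    # non-zero response at the edge
print(correlate_valid(step_edge, LAPLACIAN))  # sign change across the edge
print(correlate_valid(flat, SOBEL_X))         # zero response on a flat region
```

Both kernels sum to zero, so they ignore constant regions and respond only where neighboring values differ, which is the spatial mixing a depthwise layer would otherwise have to learn.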
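2 Network architecture
2.1 DownsamplingBlock
DownsamplingBlock halves the resolution and expands the number of channels. As shown in Figure a, only the expanded channels are generated by pointwise convolution, so that existing features are reused. The depthwise convolution kernels can be randomly initialized or set to fixed Gaussian kernels.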
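The channel bookkeeping of this block can be followed without any tensors. The sketch below mirrors the arithmetic of `DownsamplingBlock.__init__` in the code listing of this post and reports how many output channels are reused from the downsampled input and how many are generated by the pointwise convolution. The function name is hypothetical, and the stage widths are the ones used in the listing's `__main__` section.

```python
from typing import Tuple


def downsampling_channel_plan(inp: int, oup: int) -> Tuple[int, int, Tuple[int, int]]:
    """Return (reused, generated, output halves) for one DownsamplingBlock."""
    # Channels copied straight from the downsampled input (feature reuse).
    reused = 0 if inp > oup else min(oup // 2, inp)
    # Channels produced by the pointwise convolution.
    generated = oup - reused
    if oup > 2 * inp or inp > oup:
        # The two parts are concatenated and re-split into two chunks.
        first = (oup + 1) // 2
        halves = (first, oup - first)
    else:
        halves = (reused, generated)
    return reused, generated, halves


channels = [28, 56, 112, 224, 368]  # stage widths from the listing's __main__ section
for inp, oup in zip(channels[:-1], channels[1:]):
    print(inp, "->", oup, downsampling_channel_plan(inp, oup))
```

For this configuration every stage reuses half of its output channels, so the pointwise convolution only has to produce the other half.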
2.2 HalfIdentityBlock
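As shown in Figure b, HalfIdentityBlock replaces half of the depthwise convolutions with identity mappings and removes half of the pointwise convolutions while keeping the block width.
Note that the right half of the input channels becomes the left half of the output channels, for better feature reuse.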
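The data flow and the saving can be sketched symbolically. The first function below tracks which features land in each half, following `HalfIdentityBlock.forward` in the code listing of this post; the other two compare weight counts against a plain depthwise separable block of the same width. All names are hypothetical, and normalization and SE parameters are ignored.

```python
from typing import List


def half_identity_block(halves: List[str], name: str) -> List[str]:
    """Symbolic forward pass: track which features end up in each half."""
    left, right = halves
    # Depthwise conv on the right half only, then pointwise over both halves.
    mixed = f"pw{name}([{left}, dw{name}({right})])"
    # The untouched right half is passed on as the new left half.
    return [right, mixed]


def half_identity_params(width: int, k: int = 3) -> int:
    """Weights of one block: depthwise on half the channels, pointwise to half."""
    half = width // 2
    return half * k * k + width * half


def full_dsc_params(width: int, k: int = 3) -> int:
    """Weights of a plain depthwise separable block of the same width."""
    return width * k * k + width * width


out = half_identity_block(["A", "B"], "1")
print(out)
print(half_identity_params(56), full_dsc_params(56))
```

Under these assumptions the block uses half the weights of a full depthwise separable block, and each feature half is convolved in one block and passed through unchanged in the next.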
2.3 VGNet Architecture
VGNets are built from DownsamplingBlock and HalfIdentityBlock under a budget on the number of parameters. The overall VGNetG-1.0MP architecture is shown in Table 1.
2.4 Variants of VGNet
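To further study the effect of the N×N kernels, several variants of VGNets are introduced: VGNetC, VGNetG and VGNetF.
VGNetC: all parameters are randomly initialized and learnable.
VGNetG: all parameters are randomly initialized and learnable, except for the kernels of DownsamplingBlock.
VGNetF: all parameters of the depthwise convolutions are fixed.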
3 Experiments
4 References
[1] Efficient CNN Architecture Design Guided by Visualization.
Original source:
cv-models/vgnet.py at 8817ebcdc4a06e9843c88a730126b223a7869441 · ffiirree/cv-models · GitHub
I merged all the code into a single file:

import os
import time
from collections import OrderedDict
from functools import partial
from typing import Any, List, Union

import torch
import torch.nn as nn
import torch.nn.functional as F

# Module-level defaults shared by the building blocks below.
_NORM_POSIITON: str = 'before'
_NORMALIZER: nn.Module = nn.BatchNorm2d
_NONLINEAR: nn.Module = partial(nn.ReLU, inplace=True)
_SE_INNER_NONLINEAR: nn.Module = partial(nn.ReLU, inplace=True)
_SE_GATING_FN: nn.Module = nn.Sigmoid
_SE_DIVISOR: int = 8
_SE_USE_NORM: bool = False

def get_gaussian_kernel1d(kernel_size, sigma: torch.Tensor):
    """Return a normalized 1-D Gaussian kernel with `kernel_size` taps."""
    ksize_half = (kernel_size - 1) * 0.5

    x = torch.linspace(-ksize_half, ksize_half, steps=kernel_size).to(sigma.device)
    pdf = torch.exp(-0.5 * (x / sigma).pow(2))
    return pdf / pdf.sum()


def get_gaussian_kernel2d(kernel_size, sigma: torch.Tensor):
    """Return the 2-D Gaussian kernel as the outer product of the 1-D kernel."""
    kernel1d = get_gaussian_kernel1d(kernel_size, sigma)
    return torch.mm(kernel1d[:, None], kernel1d[None, :])

class ChannelChunk(nn.Module):
    """Split the channel dimension into `groups` chunks."""

    def __init__(self, groups: int):
        super().__init__()

        self.groups = groups

    def forward(self, x):
        return torch.chunk(x, self.groups, dim=1)

    def extra_repr(self):
        return f'groups={self.groups}'


class ChannelSplit(nn.Module):
    """Split the channel dimension into the given section sizes."""

    def __init__(self, sections):
        super().__init__()

        self.sections = sections

    def forward(self, x):
        return torch.split(x, self.sections, dim=1)

    def extra_repr(self):
        return f'sections={self.sections}'


class Combine(nn.Module):
    """Merge a list of tensors: add the first two, or concatenate along channels."""

    def __init__(self, method: str = 'ADD', *args, **kwargs):
        super().__init__()
        assert method in ['ADD', 'CONCAT'], ''

        self.method = method
        self._combine = self._add if self.method == 'ADD' else self._cat

    @staticmethod
    def _add(x):
        return x[0] + x[1]

    @staticmethod
    def _cat(x):
        return torch.cat(x, dim=1)

    def forward(self, x):
        return self._combine(x)

    def extra_repr(self):
        return f'method=\'{self.method}\''

class PointwiseConv2d(nn.Conv2d):
    def __init__(self, inp, oup, stride: int = 1, bias: bool = False, groups: int = 1):
        super().__init__(inp, oup, 1, stride=stride, padding=0, bias=bias, groups=groups)


def normalizer_fn(channels):
    return _NORMALIZER(channels)


def activation_fn():
    return _NONLINEAR()


def channel_shuffle(x, groups):
    batchsize, num_channels, height, width = x.data.size()
    channels_per_group = num_channels // groups

    # reshape
    x = x.view(batchsize, groups, channels_per_group, height, width)
    x = torch.transpose(x, 1, 2).contiguous()

    # flatten
    x = x.view(batchsize, -1, height, width)
    return x

def norm_activation(channels, normalizer_fn: nn.Module = None, activation_fn: nn.Module = None, norm_position: str = None) -> List[nn.Module]:
    """Build the normalization and activation layers that follow a convolution."""
    norm_position = norm_position or _NORM_POSIITON
    assert norm_position in ['before', 'after', 'none'], ''

    normalizer_fn = normalizer_fn or _NORMALIZER
    activation_fn = activation_fn or _NONLINEAR

    if normalizer_fn is None and activation_fn is None:
        return []

    if normalizer_fn is None:
        return [activation_fn()]

    if activation_fn is None:
        return [normalizer_fn(channels)]

    if norm_position == 'after':
        return [activation_fn(), normalizer_fn(channels)]

    return [normalizer_fn(channels), activation_fn()]


def make_divisible(value, divisor, min_value=None):
    if min_value is None:
        min_value = divisor

    new_value = max(min_value, int(value + divisor / 2) // divisor * divisor)

    # Make sure that round down does not go down by more than 10%.
    if new_value < 0.9 * value:
        new_value += divisor

    return new_value

class Stage(nn.Sequential):
    def __init__(self, *args):
        if len(args) == 1 and isinstance(args[0], list):
            args = args[0]
        super().__init__(*args)

    def append(self, m: Union[nn.Module, List[nn.Module]]):
        if isinstance(m, nn.Module):
            self.add_module(str(len(self)), m)
        elif isinstance(m, list):
            for i in m:
                self.append(i)
        else:
            # The original created the exception without raising it.
            raise ValueError('Stage.append expects an nn.Module or a list of nn.Module.')


class Affine(nn.Module):
    def __init__(self, dim):
        super().__init__()

        self.dim = dim

        self.alpha = nn.Parameter(torch.ones(dim, 1, 1))
        self.beta = nn.Parameter(torch.zeros(dim, 1, 1))

    def forward(self, x):
        return self.alpha * x + self.beta

    def extra_repr(self):
        return f'{self.dim}'

class Conv2d3x3(nn.Conv2d):
    def __init__(self, in_channels: int, out_channels: int, stride: int = 1, padding: int = None, dilation: int = 1, bias: bool = False, groups: int = 1):
        padding = padding if padding is not None else dilation
        super().__init__(in_channels, out_channels, 3, stride=stride, padding=padding, dilation=dilation, bias=bias, groups=groups)


class Conv2d1x1(nn.Conv2d):
    def __init__(self, in_channels: int, out_channels: int, stride: int = 1, padding: int = 0, bias: bool = False, groups: int = 1):
        super().__init__(in_channels, out_channels, 1, stride=stride, padding=padding, bias=bias, groups=groups)


class Conv2d3x3BN(nn.Sequential):
    def __init__(self, in_channels: int, out_channels: int, stride: int = 1, padding: int = None, dilation: int = 1, bias: bool = False, groups: int = 1, normalizer_fn: nn.Module = None):
        normalizer_fn = normalizer_fn or _NORMALIZER
        padding = padding if padding is not None else dilation

        super().__init__(Conv2d3x3(in_channels, out_channels, stride=stride, padding=padding, dilation=dilation, bias=bias, groups=groups))
        if normalizer_fn:
            self.add_module(str(self.__len__()), normalizer_fn(out_channels))


class Conv2d1x1BN(nn.Sequential):
    def __init__(self, in_channels: int, out_channels: int, stride: int = 1, padding: int = 0, bias: bool = False, groups: int = 1, normalizer_fn: nn.Module = None):
        normalizer_fn = normalizer_fn or _NORMALIZER

        super().__init__(Conv2d1x1(in_channels, out_channels, stride=stride, padding=padding, bias=bias, groups=groups))
        if normalizer_fn:
            self.add_module(str(self.__len__()), normalizer_fn(out_channels))


class Conv2d1x1Block(nn.Sequential):
    def __init__(self, in_channels: int, out_channels: int, stride: int = 1, padding: int = 0, bias: bool = False, groups: int = 1, normalizer_fn: nn.Module = None, activation_fn: nn.Module = None,
                 norm_position: str = None):
        super().__init__(Conv2d1x1(in_channels, out_channels, stride=stride, padding=padding, bias=bias, groups=groups), *norm_activation(out_channels, normalizer_fn, activation_fn, norm_position))


class Conv2dBlock(nn.Sequential):
    def __init__(self, in_channels, out_channels, kernel_size: int = 3, stride: int = 1, padding: int = None, dilation: int = 1, bias: bool = False, groups: int = 1, normalizer_fn: nn.Module = None,
                 activation_fn: nn.Module = None, norm_position: str = None, ):
        if padding is None:
            padding = ((kernel_size - 1) * (dilation - 1) + kernel_size) // 2

        super().__init__(nn.Conv2d(in_channels, out_channels, kernel_size=kernel_size, bias=bias, stride=stride, padding=padding, dilation=dilation, groups=groups),
                         *norm_activation(out_channels, normalizer_fn, activation_fn, norm_position))

class DropPath(nn.Module):
    """Stochastic Depth: Drop paths per sample (when applied in main path of residual blocks)"""

    def __init__(self, survival_prob: float):
        super().__init__()

        self.p = survival_prob

    def forward(self, x):
        if self.p == 1. or not self.training:
            return x

        # work with diff dim tensors, not just 2D ConvNets
        shape = (x.shape[0],) + (1,) * (x.ndim - 1)

        probs = self.p + torch.rand(shape, dtype=x.dtype, device=x.device)
        # We therefore need to re-calibrate the outputs of any given function f
        # by the expected number of times it participates in training, p.
        return (x / self.p) * probs.floor_()

    def extra_repr(self):
        return f'survival_prob={self.p}'

class PointwiseBlock(nn.Sequential):
    def __init__(self, inp, oup, stride: int = 1, groups: int = 1, normalizer_fn: nn.Module = None, activation_fn: nn.Module = None, norm_position: str = None, ):
        super().__init__(PointwiseConv2d(inp, oup, stride=stride, groups=groups), *norm_activation(oup, normalizer_fn, activation_fn, norm_position))


class DepthwiseConv2dBN(nn.Sequential):
    def __init__(self, inp, oup, kernel_size: int = 3, stride: int = 1, padding: int = None, dilation: int = 1, normalizer_fn: nn.Module = None):
        normalizer_fn = normalizer_fn or _NORMALIZER

        super().__init__(DepthwiseConv2d(inp, oup, kernel_size, stride=stride, padding=padding, dilation=dilation), normalizer_fn(oup))


class DepthwiseBlock(nn.Sequential):
    def __init__(self, inp, oup, kernel_size: int = 3, stride: int = 1, padding: int = None, dilation: int = 1, normalizer_fn: nn.Module = None, activation_fn: nn.Module = None,
                 norm_position: str = None):
        super().__init__(DepthwiseConv2d(inp, oup, kernel_size, stride, padding=padding, dilation=dilation), *norm_activation(oup, normalizer_fn, activation_fn, norm_position))

class ChannelShuffle(nn.Module):
    def __init__(self, groups: int):
        super().__init__()

        self.groups = groups

    def forward(self, x):
        return channel_shuffle(x, self.groups)

    def extra_repr(self):
        return 'groups={}'.format(self.groups)


class DepthwiseConv2d(nn.Conv2d):
    def __init__(self, inp, oup, kernel_size: int = 3, stride: int = 1, padding: int = None, dilation: int = 1, bias: bool = False, ):
        if padding is None:
            padding = ((kernel_size - 1) * (dilation - 1) + kernel_size) // 2

        # groups=inp makes every input channel use its own N x N kernel.
        super().__init__(inp, oup, kernel_size, stride=stride, padding=padding, dilation=dilation, bias=bias, groups=inp)

class SEBlock(nn.Sequential):
    """Squeeze excite block
    """

    def __init__(self, channels, ratio, inner_activation_fn: nn.Module = None, gating_fn: nn.Module = None):
        squeezed_channels = make_divisible(int(channels * ratio), _SE_DIVISOR)
        inner_activation_fn = inner_activation_fn or _SE_INNER_NONLINEAR
        gating_fn = gating_fn or _SE_GATING_FN

        layers = OrderedDict([])

        layers['pool'] = nn.AdaptiveAvgPool2d((1, 1))
        layers['reduce'] = Conv2d1x1(channels, squeezed_channels, bias=True)
        if _SE_USE_NORM:
            layers['norm'] = _NORMALIZER(squeezed_channels)
        layers['act'] = inner_activation_fn()
        layers['expand'] = Conv2d1x1(squeezed_channels, channels, bias=True)
        layers['gate'] = gating_fn()

        super().__init__(layers)

    def _forward(self, input):
        for module in self:
            input = module(input)
        return input

    def forward(self, x):
        # Rescale each channel by its learned gate.
        return x * self._forward(x)

class InvertedResidualBlock(nn.Module):
    def __init__(self, inp, oup, t, kernel_size: int = 3, stride: int = 1, padding: int = None, dilation: int = 1, se_ratio: float = None, se_ind: bool = False, survival_prob: float = None,
                 normalizer_fn: nn.Module = None, activation_fn: nn.Module = None, dw_se_act: nn.Module = None):
        super().__init__()

        self.inp = inp
        self.planes = int(self.inp * t)
        self.oup = oup
        self.stride = stride
        self.apply_residual = (self.stride == 1) and (self.inp == self.oup)
        self.se_ratio = se_ratio if se_ind or se_ratio is None else (se_ratio / t)
        self.has_se = (self.se_ratio is not None) and (self.se_ratio > 0) and (self.se_ratio <= 1)

        normalizer_fn = normalizer_fn or _NORMALIZER
        activation_fn = activation_fn or _NONLINEAR

        layers = []
        if t != 1:
            layers.append(Conv2d1x1Block(inp, self.planes, normalizer_fn=normalizer_fn, activation_fn=activation_fn))

        if dw_se_act is None:
            layers.append(DepthwiseBlock(self.planes, self.planes, kernel_size, stride=self.stride, padding=padding, dilation=dilation, normalizer_fn=normalizer_fn, activation_fn=activation_fn))
        else:
            layers.append(DepthwiseConv2dBN(self.planes, self.planes, kernel_size, stride=self.stride, padding=padding, dilation=dilation, normalizer_fn=normalizer_fn))

        if self.has_se:
            layers.append(SEBlock(self.planes, self.se_ratio))

        if dw_se_act:
            layers.append(dw_se_act())

        layers.append(Conv2d1x1BN(self.planes, oup, normalizer_fn=normalizer_fn))

        if self.apply_residual and survival_prob:
            layers.append(DropPath(survival_prob))

        self.branch1 = nn.Sequential(*layers)
        self.branch2 = nn.Identity() if self.apply_residual else None
        self.combine = Combine('ADD') if self.apply_residual else None

    def forward(self, x):
        if self.apply_residual:
            return self.combine([self.branch2(x), self.branch1(x)])
        else:
            return self.branch1(x)

class FusedInvertedResidualBlock(nn.Module):
    def __init__(self, inp, oup, t, kernel_size: int = 3, stride: int = 1, padding: int = None, se_ratio: float = None, se_ind: bool = False, survival_prob: float = None,
                 normalizer_fn: nn.Module = None, activation_fn: nn.Module = None):
        super().__init__()

        self.inp = inp
        self.planes = int(self.inp * t)
        self.oup = oup
        self.stride = stride
        self.padding = padding if padding is not None else (kernel_size // 2)
        self.apply_residual = (self.stride == 1) and (self.inp == self.oup)
        self.se_ratio = se_ratio if se_ind or se_ratio is None else (se_ratio / t)
        self.has_se = (self.se_ratio is not None) and (self.se_ratio > 0) and (self.se_ratio <= 1)

        normalizer_fn = normalizer_fn or _NORMALIZER
        activation_fn = activation_fn or _NONLINEAR

        layers = [Conv2dBlock(inp, self.planes, kernel_size, stride=self.stride, padding=self.padding, normalizer_fn=normalizer_fn, activation_fn=activation_fn)]

        if self.has_se:
            layers.append(SEBlock(self.planes, self.se_ratio))

        layers.append(Conv2d1x1BN(self.planes, oup, normalizer_fn=normalizer_fn))

        if self.apply_residual and survival_prob:
            layers.append(DropPath(survival_prob))

        self.branch1 = nn.Sequential(*layers)
        self.branch2 = nn.Identity() if self.apply_residual else None
        self.combine = Combine('ADD') if self.apply_residual else None

    def forward(self, x):
        if self.apply_residual:
            return self.combine([self.branch2(x), self.branch1(x)])
        else:
            return self.branch1(x)

class SharedDepthwiseConv2d(nn.Module):
    """Apply one depthwise convolution to each of `t` channel chunks, sharing its weights."""

    def __init__(self, channels, kernel_size: int = 3, stride: int = 1, padding: int = None, dilation: int = 1, t: int = 2, bias: bool = False):
        super().__init__()

        self.channels = channels // t
        self.t = t

        if padding is None:
            padding = ((kernel_size - 1) * (dilation - 1) + kernel_size) // 2

        self.mux = DepthwiseConv2d(self.channels, self.channels, kernel_size, stride, padding, dilation, bias=bias)

    def forward(self, x):
        x = torch.chunk(x, self.t, dim=1)
        x = [self.mux(xi) for xi in x]
        return torch.cat(x, dim=1)

class HalfIdentityBlock(nn.Module):
    def __init__(self, inp: int, se_ratio: float = 0.0):
        super().__init__()

        self.half3x3 = Conv2d3x3(inp // 2, inp // 2, groups=(inp // 2))
        self.combine = Combine('CONCAT')
        self.conv1x1 = PointwiseBlock(inp, inp // 2)

        if se_ratio > 0.0:
            self.conv1x1 = nn.Sequential(PointwiseBlock(inp, inp // 2), SEBlock(inp // 2, se_ratio))

    def forward(self, x):
        # x is a pair [left half, right half]; only the right half is convolved.
        out = self.combine([x[0], self.half3x3(x[1])])
        # The untouched right half becomes the left half of the output.
        return [x[1], self.conv1x1(out)]

class GaussianBlur(nn.Module):
    def __init__(self, channels: int, kernel_size: int = 3, stride: int = 1, padding: int = None, dilation: int = 1, sigma: float = 1.0, learnable: bool = True):
        super().__init__()

        padding = padding or ((kernel_size - 1) * (dilation - 1) + kernel_size) // 2

        self.channels = channels
        self.kernel_size = (kernel_size, kernel_size)
        self.padding = (padding, padding)
        self.stride = (stride, stride)
        self.dilation = (dilation, dilation)
        self.padding_mode = 'zeros'
        self.learnable = learnable

        # sigma is the only parameter; it is trained only when learnable=True.
        self.sigma = nn.Parameter(torch.tensor(sigma), learnable)

    def forward(self, x):
        return F.conv2d(x, self.weight, None, self.stride, self.padding, self.dilation, self.channels)

    @property
    def weight(self):
        # The same Gaussian kernel is repeated for every channel (depthwise).
        kernel = get_gaussian_kernel2d(self.kernel_size[0], self.sigma)
        return kernel.repeat(self.channels, 1, 1, 1)

    @property
    def out_channels(self):
        return self.channels

    def extra_repr(self):
        s = ('{channels}, kernel_size={kernel_size}'
             ', learnable={learnable}, stride={stride}')
        if self.padding != (0,) * len(self.padding):
            s += ', padding={padding}'
        if self.dilation != (1,) * len(self.dilation):
            s += ', dilation={dilation}'
        if self.padding_mode != 'zeros':
            s += ', padding_mode={padding_mode}'
        return s.format(**self.__dict__)

class DownsamplingBlock(nn.Module):
    def __init__(self, inp, oup, stride: int = 2, method: str = 'blur', se_ratio: float = 0.0):
        assert method in ['blur', 'dwconv', 'maxpool'], f'{method}'

        super().__init__()

        if method == 'dwconv' or stride == 1:
            self.downsample = DepthwiseConv2d(inp, inp, 3, stride)
        elif method == 'maxpool':
            self.downsample = nn.MaxPool2d(kernel_size=3, stride=stride)
        elif method == 'blur':
            self.downsample = GaussianBlur(inp, stride=stride, sigma=1.1, learnable=False)
        else:
            # The original created the exception without raising it.
            raise ValueError(f'Unknown downsampling method: {method}.')

        # Number of downsampled input channels that are reused as-is.
        split_chs = 0 if inp > oup else min(oup // 2, inp)

        self.split = ChannelSplit([inp - split_chs, split_chs])
        # Only the remaining output channels are generated by the pointwise conv.
        self.conv1x1 = PointwiseBlock(inp, oup - split_chs)

        if se_ratio > 0.0:
            self.conv1x1 = nn.Sequential(PointwiseBlock(inp, oup - split_chs), SEBlock(oup - split_chs, se_ratio))

        self.halve = nn.Identity()
        if oup > 2 * inp or inp > oup:
            self.halve = nn.Sequential(Combine('CONCAT'), ChannelChunk(2))

    def forward(self, x):
        x = self.downsample(x)
        _, x2 = self.split(x)
        return self.halve([x2, self.conv1x1(x)])

class VGNet(nn.Module):
    def __init__(self, in_channels: int = 3, num_classes: int = 1000, channels: List[int] = None, downsamplings: List[str] = None, layers: List[int] = None, se_ratio: float = 0.0,
                 thumbnail: bool = False, **kwargs: Any):
        super().__init__()

        position = 'after'
        FRONT_S = 1 if thumbnail else 2
        strides = [FRONT_S, 2, 2, 2]

        self.features = nn.Sequential(OrderedDict([('stem', Conv2dBlock(in_channels, channels[0], stride=FRONT_S))]))

        for i in range(len(strides)):
            self.features.add_module(f'stage{i + 1}', self.make_layers(channels[i], channels[i + 1], strides[i], downsamplings[i], layers[i], se_ratio))

        self.features.stage4.append(nn.Sequential(
            # DepthwiseConv2d(channels[-1], channels[-1]),
            SharedDepthwiseConv2d(channels[-1], t=8),
            PointwiseBlock(channels[-1], channels[-1]),
        ))

        self.avg = nn.AdaptiveAvgPool2d((1, 1))
        self.classifier = nn.Linear(channels[-1], num_classes)

    def make_layers(self, inp, oup, s, m, n, se_ratio):
        # One DownsamplingBlock followed by n - 1 HalfIdentityBlocks.
        layers = [DownsamplingBlock(inp, oup, stride=s, method=m, se_ratio=se_ratio)]
        for _ in range(n - 1):
            layers.append(HalfIdentityBlock(oup, se_ratio))

        # Join the two halves back into a single tensor at the end of the stage.
        layers.append(Combine('CONCAT'))
        return Stage(layers)

    def forward(self, x):
        x = self.features(x)
        x = self.avg(x)
        x = torch.flatten(x, 1)
        x = self.classifier(x)
        return x

def _vgnet(pretrained: bool = False, pth: str = None, progress: bool = True, **kwargs: Any):
    model = VGNet(**kwargs)

    if pretrained:
        if pth is not None:
            state_dict = torch.load(os.path.expanduser(pth))
        else:
            assert 'url' in kwargs and kwargs['url'] != '', 'Invalid URL.'
            state_dict = torch.hub.load_state_dict_from_url(kwargs['url'], progress=progress)
        model.load_state_dict(state_dict)
    return model

# Commented-out factory functions kept from the original file. The `export` and
# `nonlinear` decorators they use are not defined in this merged listing.
#
# @export
# @nonlinear(partial(nn.SiLU, inplace=True))
# def vgnetg_1_0mp_se(pretrained: bool = False, pth: str = None, progress: bool = True, **kwargs: Any):
#     kwargs['channels'] = [28, 56, 112, 224, 368]
#     kwargs['downsamplings'] = ['blur', 'blur', 'blur', 'blur']
#     kwargs['layers'] = [4, 7, 13, 2]
#     kwargs['se_ratio'] = 0.25
#     return _vgnet(pretrained, pth, progress, **kwargs)
#
#
# @export
# @nonlinear(partial(nn.SiLU, inplace=True))
# def vgnetg_1_5mp_se(pretrained: bool = False, pth: str = None, progress: bool = True, **kwargs: Any):
#     kwargs['channels'] = [32, 64, 128, 256, 512]
#     kwargs['downsamplings'] = ['blur', 'blur', 'blur', 'blur']
#     kwargs['layers'] = [3, 7, 14, 2]
#     kwargs['se_ratio'] = 0.25
#     return _vgnet(pretrained, pth, progress, **kwargs)
#
#
# @export
# @nonlinear(partial(nn.SiLU, inplace=True))
# def vgnetg_2_0mp_se(pretrained: bool = False, pth: str = None, progress: bool = True, **kwargs: Any):
#     kwargs['channels'] = [32, 72, 168, 376, 512]
#     kwargs['downsamplings'] = ['blur', 'blur', 'blur', 'blur']
#     kwargs['layers'] = [3, 6, 13, 2]
#     kwargs['se_ratio'] = 0.25
#     return _vgnet(pretrained, pth, progress, **kwargs)
#
#
# @export
# @nonlinear(partial(nn.SiLU, inplace=True))
# def vgnetg_2_5mp_se(pretrained: bool = False, pth: str = None, progress: bool = True, **kwargs: Any):
#     kwargs['channels'] = [32, 80, 192, 400, 544]
#     kwargs['downsamplings'] = ['blur', 'blur', 'blur', 'blur']
#     kwargs['layers'] = [3, 6, 16, 2]
#     kwargs['se_ratio'] = 0.25
#     return _vgnet(pretrained, pth, progress, **kwargs)
#
#
# @export
# @nonlinear(partial(nn.SiLU, inplace=True))
# def vgnetg_5_0mp_se(pretrained: bool = False, pth: str = None, progress: bool = True, **kwargs: Any):
#     kwargs['channels'] = [32, 88, 216, 456, 856]
#     kwargs['downsamplings'] = ['blur', 'blur', 'blur', 'blur']
#     kwargs['layers'] = [4, 7, 15, 5]
#     kwargs['se_ratio'] = 0.25
#     return _vgnet(pretrained, pth, progress, **kwargs)

if __name__ == '__main__':
    kwargs = {}
    kwargs['channels'] = [28, 56, 112, 224, 368]
    kwargs['downsamplings'] = ['blur', 'blur', 'blur', 'blur']
    kwargs['layers'] = [4, 7, 13, 2]
    kwargs['se_ratio'] = 0.25
    kwargs['num_classes'] = 4

    model = _vgnet(False, "", True, **kwargs)

    model.eval()  # .cuda()

    data = torch.randn(1, 3, 128, 128)  # .cuda()

    # Inference only, so gradient tracking is disabled while timing.
    with torch.no_grad():
        for i in range(20):
            start = time.time()
            out = model(data)
            print('time', time.time() - start, out.size())
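The timing loop at the end of the listing prints every iteration, including the first calls, which usually carry one-off setup cost. A slightly more robust measurement discards a few warmup runs and reports the median. The helper below is a hypothetical, standard-library-only sketch; the workload in the example is a stand-in so that the snippet runs without PyTorch.

```python
import statistics
import time
from typing import Callable, List


def benchmark(fn: Callable[[], object], warmup: int = 5, runs: int = 20) -> List[float]:
    """Call fn repeatedly and return per-run wall-clock times in milliseconds."""
    for _ in range(warmup):
        fn()  # warmup runs are not timed
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings


# Stand-in workload; with the model one would pass `lambda: model(data)` instead.
timings_ms = benchmark(lambda: sum(i * i for i in range(10_000)))
print(f"median {statistics.median(timings_ms):.3f} ms over {len(timings_ms)} runs")
```

When timing on a GPU, CUDA kernels run asynchronously, so `torch.cuda.synchronize()` should be called inside the timed function before the clock is read; otherwise the measured time mostly reflects kernel launch overhead.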
Source: blog.csdn.net. Author: AI视觉网奇. Copyright belongs to the original author; please contact the author before reposting.
Original link: blog.csdn.net/jacke121/article/details/126151698