The kernel size of the layers in EfficientNet varies and can be an odd or even number. If the kernel size is an odd number, the padding margin can be calculated as `kernel_size // 2`; if the kernel size is an even number, this won't work, because the padding has to be asymmetric. `ConvLayerDynamicPadding` extends the fastai `ConvLayer` with an extra padding layer, ensuring padding is sufficient regardless of kernel size.
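A minimal sketch of the padding rule, assuming per-dimension asymmetric padding of `(k - 1) // 2` and `k // 2` applied via `nn.ConstantPad3d`; the helper name `asymmetric_padding` is illustrative, not the library code:

import torch
import torch.nn as nn

def asymmetric_padding(ks):
    # nn.ConstantPad3d expects pads for the last dim first:
    # (W_left, W_right, H_top, H_bottom, D_front, D_back)
    padding = []
    for k in reversed(ks):
        padding += [(k - 1) // 2, k // 2]  # sums to k - 1, so stride-1 convs keep the spatial size
    return tuple(padding)

pad = nn.ConstantPad3d(asymmetric_padding((2, 7, 5)), 0.)
conv = nn.Conv3d(3, 64, kernel_size=(2, 7, 5))
conv(pad(torch.randn(1, 3, 10, 10, 10))).size()
torch.Size([1, 64, 10, 10, 10])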

class ConvLayerDynamicPadding[source]

ConvLayerDynamicPadding(ni, nf, ks=3, stride=1, bias=None, ndim=3, norm_type=<NormType.Batch: 1>, bn_1st=True, act_cls=ReLU, transpose=False, init='auto', xtra=None, bias_std=0.01, padding:Union[int, Tuple[int, int, int]]=0, dilation:Union[int, Tuple[int, int, int]]=1, groups:int=1, padding_mode:str='zeros') :: Sequential

Same as fastai ConvLayer, but more accurately pads the input according to `ks` and `stride`
ConvLayerDynamicPadding(ni=3, nf=64, ks=(2,7,5), ndim=3)(torch.randn(1, 3, 10, 10, 10)).size()
torch.Size([1, 64, 10, 10, 10])

EfficientNet uses DropConnect and dropout. DropConnect needs to be implemented as a `Module` to work with `nn.Sequential`. See the stochastic depth paper (https://arxiv.org/abs/1603.09382) for why drop connect is used together with skip connections.

class DropConnect[source]

DropConnect(p) :: Module

Drops connections with probability p
DropConnect(0.5)(torch.randn(16, 1, 1, 1, 1)).flatten()
tensor([-0.0000,  0.0000, -0.0000,  0.0000, -0.0000,  0.9227,  0.1613, -0.0000,
         0.0000, -2.2940, -0.0000, -0.0000, -0.0000, -1.3887,  0.0000, -0.0000])
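For illustration, a minimal sketch of how such a module could look (assumed behaviour: drop whole samples with probability `p` during training and rescale survivors by `1 / (1 - p)`; the library implementation may differ in details such as the rescaling):

import torch
import torch.nn as nn

class DropConnectSketch(nn.Module):
    "Sketch: zero out whole samples with probability `p`"
    def __init__(self, p):
        super().__init__()
        self.p = p
    def forward(self, x):
        if not self.training or self.p == 0.: return x
        keep = 1. - self.p
        # one Bernoulli draw per sample, broadcast over the remaining dims
        mask = torch.bernoulli(torch.full((x.size(0),) + (1,) * (x.dim() - 1), keep, device=x.device))
        return x * mask / keep  # keep the expected activation unchanged

DropConnectSketch(0.5)(torch.randn(16, 1, 1, 1, 1)).flatten()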

The Mobile Inverted Residual Bottleneck Block (MBConv) is the main building block of EfficientNet. It is based on the inverted residuals introduced in MobileNetV2: https://arxiv.org/pdf/1801.04381.pdf

class MBConvBlock[source]

MBConvBlock(n_inp, n_out, kernel_size, stride, se_ratio, id_skip, expand_ratio, drop_connect_rate=0.2, act_cls=SiLU, norm_type=<NormType.Batch: 1>, **kwargs) :: Module

Mobile Inverted Residual Bottleneck Block
MBConvBlock(80, 112, 4, 1, 0.25, True, 6)(torch.randn(1, 80, 10, 14, 14)).size()
torch.Size([1, 112, 10, 14, 14])
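For intuition, a condensed sketch of the MBConv pattern (1x1 expansion -> depthwise conv -> squeeze-and-excitation -> 1x1 projection, with an identity skip when input and output shapes match). It assumes stride 1 and an odd kernel size and is not the library implementation:

import torch
import torch.nn as nn

class MBConvSketch(nn.Module):
    def __init__(self, n_inp, n_out, ks=3, expand_ratio=6, se_ratio=0.25):
        super().__init__()
        n_mid = n_inp * expand_ratio
        self.expand = nn.Sequential(nn.Conv3d(n_inp, n_mid, 1, bias=False),
                                    nn.BatchNorm3d(n_mid), nn.SiLU())
        self.dwise = nn.Sequential(
            nn.Conv3d(n_mid, n_mid, ks, padding=ks // 2, groups=n_mid, bias=False),
            nn.BatchNorm3d(n_mid), nn.SiLU())
        n_se = max(1, int(n_inp * se_ratio))  # squeeze-and-excitation bottleneck width
        self.se = nn.Sequential(nn.AdaptiveAvgPool3d(1),
                                nn.Conv3d(n_mid, n_se, 1), nn.SiLU(),
                                nn.Conv3d(n_se, n_mid, 1), nn.Sigmoid())
        self.project = nn.Sequential(nn.Conv3d(n_mid, n_out, 1, bias=False),
                                     nn.BatchNorm3d(n_out))
        self.use_skip = n_inp == n_out  # identity skip only when shapes match

    def forward(self, x):
        out = self.dwise(self.expand(x))
        out = out * self.se(out)   # channel-wise attention
        out = self.project(out)
        return out + x if self.use_skip else out

MBConvSketch(80, 80)(torch.randn(1, 80, 4, 8, 8)).size()
torch.Size([1, 80, 4, 8, 8])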

class EfficientNet[source]

EfficientNet(ni=3, num_classes=101, width_coefficient=1.0, depth_coefficient=1.0, dropout_rate=0.2, drop_connect_rate=0.2, depth_divisor=8, min_depth=None, act_cls=SiLU, norm_type=<NormType.Batch: 1>) :: Sequential

EfficientNet implementation for fastai, based on https://arxiv.org/abs/1905.11946 and the PyTorch implementation by lukemelas (GitHub username): https://github.com/lukemelas/EfficientNet-PyTorch

EfficientNet(num_classes = 2)(torch.randn(1, 3, 10, 224, 224))
tensor([[ 0.2666, -0.0851]], grad_fn=<AddmmBackward>)

Calling models follows the torchvision approach taken for the ResNets: we have a private function `_efficientnet`, which passes the building arguments to the `EfficientNet` class, and one function per variant (`efficientnet_b0`, `efficientnet_b1`, ...) that returns the respective model; a minimal sketch of the pattern follows below.
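A sketch of that pattern (names with the `_sketch` suffix are illustrative; pretrained weight loading is omitted because no assumption about weight hosting is made):

def _efficientnet_sketch(width_coefficient, depth_coefficient, dropout_rate, **kwargs):
    # pass the per-variant scaling coefficients through to the model class
    return EfficientNet(width_coefficient=width_coefficient,
                        depth_coefficient=depth_coefficient,
                        dropout_rate=dropout_rate, **kwargs)

def efficientnet_b0_sketch(pretrained=False, progress=True, **kwargs):
    # coefficients from the table below: width 1.0, depth 1.0, dropout 0.2
    return _efficientnet_sketch(1.0, 1.0, 0.2, **kwargs)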

Overview of the building arguments for the different EfficientNets. Keep in mind that the image size matters for the model; progressive resizing with EfficientNets must be done very carefully.

| model name        | width_coeff | depth_coeff | image_size | dropout |
|-------------------|-------------|-------------|------------|---------|
| 'efficientnet_b0' | 1.0         | 1.0         | 224        | 0.2     |
| 'efficientnet_b1' | 1.0         | 1.1         | 240        | 0.2     |
| 'efficientnet_b2' | 1.1         | 1.2         | 260        | 0.3     |
| 'efficientnet_b3' | 1.2         | 1.4         | 300        | 0.3     |
| 'efficientnet_b4' | 1.4         | 1.8         | 380        | 0.4     |
| 'efficientnet_b5' | 1.6         | 2.2         | 456        | 0.4     |
| 'efficientnet_b6' | 1.8         | 2.6         | 528        | 0.5     |
| 'efficientnet_b7' | 2.0         | 3.1         | 600        | 0.5     |
| 'efficientnet_b8' | 2.2         | 3.6         | 672        | 0.5     |
| 'efficientnet_l2' | 4.3         | 5.3         | 800        | 0.5     |
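For example, to match the input size to the variant (a 3D input with depth 10 is assumed here, as in the examples above; the output shape follows from `num_classes`):

efficientnet_b2(num_classes = 2)(torch.randn(1, 3, 10, 260, 260)).size()
torch.Size([1, 2])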

efficientnet_b0[source]

efficientnet_b0(pretrained=False, progress=True, **kwargs)

load efficientnet with specific scaling coefficients

efficientnet_b1[source]

efficientnet_b1(pretrained=False, progress=True, **kwargs)

load efficientnet with specific scaling coefficients

efficientnet_b2[source]

efficientnet_b2(pretrained=False, progress=True, **kwargs)

load efficientnet with specific scaling coefficients

efficientnet_b3[source]

efficientnet_b3(pretrained=False, progress=True, **kwargs)

load efficientnet with specific scaling coefficients

efficientnet_b4[source]

efficientnet_b4(pretrained=False, progress=True, **kwargs)

load efficientnet with specific scaling coefficients

efficientnet_b5[source]

efficientnet_b5(pretrained=False, progress=True, **kwargs)

load efficientnet with specific scaling coefficients

efficientnet_b6[source]

efficientnet_b6(pretrained=False, progress=True, **kwargs)

load efficientnet with specific scaling coefficients

efficientnet_b7[source]

efficientnet_b7(pretrained=False, progress=True, **kwargs)

load efficientnet with specific scaling coefficients

efficientnet_b8[source]

efficientnet_b8(pretrained=False, progress=True, **kwargs)

load efficientnet with specific scaling coefficients

efficientnet_l2[source]

efficientnet_l2(pretrained=False, progress=True, **kwargs)

load efficientnet with specific scaling coefficients
efficientnet_b0(num_classes = 2)(torch.randn(1, 3, 10, 224, 224))
tensor([[0.1697, 0.0239]], grad_fn=<AddmmBackward>)