The tales of convolutions


A quick tour of convolution variants: from the basic 1D case through dilated, transposed, separable and grouped convolutions, up to MixConv, AdderNet and deformable convolutions.

1D-convolution

cross-correlation
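What deep-learning frameworks call "convolution" is actually cross-correlation: the kernel is slid over the signal without being flipped. A minimal NumPy sketch (signal and kernel values are illustrative, not from the post):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
k = np.array([1.0, 0.0, -1.0])

# True convolution flips the kernel before sliding it over the signal...
conv = np.convolve(x, k, mode="valid")

# ...whereas deep-learning "convolution" skips the flip: cross-correlation.
def cross_correlate(x, k):
    n = len(x) - len(k) + 1
    return np.array([np.dot(x[i:i + len(k)], k) for i in range(n)])

xcorr = cross_correlate(x, k)

# Cross-correlating with the flipped kernel reproduces true convolution.
assert np.allclose(np.convolve(x, k[::-1], mode="valid"), xcorr)
print(conv)   # [2. 2. 2.]
print(xcorr)  # [-2. -2. -2.]
```

For learned kernels the distinction is irrelevant (the network can learn the flipped weights), which is why frameworks keep the simpler cross-correlation.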

2D-convolution

Atrous conv / dilated conv
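A dilated (atrous) convolution spaces out the kernel taps, enlarging the receptive field without adding weights. A small PyTorch sketch (input sizes are illustrative):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 1, 7, 7)

dense = nn.Conv2d(1, 1, kernel_size=3)               # receptive field 3x3
atrous = nn.Conv2d(1, 1, kernel_size=3, dilation=2)  # receptive field 5x5

# Effective kernel size: k + (k - 1) * (dilation - 1) = 3 + 2 * 1 = 5,
# but both layers hold only 3 * 3 = 9 weights.
print(dense(x).shape)   # torch.Size([1, 1, 5, 5])
print(atrous(x).shape)  # torch.Size([1, 1, 3, 3])
```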

Transposed convolutions
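A transposed convolution goes in the opposite direction of a strided convolution: it maps a small feature map back to a larger one, e.g. for upsampling in decoders. A minimal PyTorch sketch (channel counts are illustrative):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 8, 4, 4)

# Downsample 4x4 -> 2x2 with a strided convolution...
down = nn.Conv2d(8, 16, kernel_size=2, stride=2)
y = down(x)

# ...then map 2x2 back to 4x4 with a transposed convolution:
# out = (in - 1) * stride - 2 * padding + kernel_size = (2 - 1) * 2 + 2 = 4
up = nn.ConvTranspose2d(16, 8, kernel_size=2, stride=2)
z = up(y)

print(y.shape)  # torch.Size([1, 16, 2, 2])
print(z.shape)  # torch.Size([1, 8, 4, 4])
```

Note that only the spatial shape is restored, not the values: the transposed convolution is not the inverse of the convolution, just the transpose of its matrix form.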

Box convolutions

https://github.com/shrubb/box-convolutions

Octave convolutions

https://arxiv.org/abs/1904.05049

Spatial separable convolutions

split one N-dimensional kernel into N one-dimensional kernels applied in sequence

example with a 3x3 kernel factored into a 3x1 and a 1x3 kernel:

initially: 9 multiplications per output position

now: 2 * (3 multiplications) = 6

main issue: not all kernels are spatially separable — only rank-1 kernels factor this way
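The classic rank-1 example is the Sobel kernel, which factors into a smoothing column and a differencing row. A NumPy sketch checking that the two-pass separable version matches the full 3x3 pass (the `corr2d` helper is ours, written for the example):

```python
import numpy as np

def corr2d(img, k):
    """Valid-mode 2D cross-correlation."""
    H, W = img.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

# Sobel is rank 1: it is the outer product of a column and a row vector.
col = np.array([[1.0], [2.0], [1.0]])   # 3x1 smoothing
row = np.array([[1.0, 0.0, -1.0]])      # 1x3 differencing
sobel = col @ row                        # full 3x3 kernel, 9 weights

img = np.random.rand(8, 8)

full = corr2d(img, sobel)            # 9 multiplications per output position
sep = corr2d(corr2d(img, col), row)  # 3 + 3 = 6 multiplications

assert np.allclose(full, sep)
```

A random 3x3 kernel almost surely has rank 3 and admits no such factorization, which is the limitation noted above.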

Depthwise Separable convolutions

first perform a depthwise convolution: one kernel per input channel, so the kernels have no output-channel dimension

then perform pointwise convolution (conv1x1)

but with OUTPUT_CHANNELS kernels

with a 5x5 kernel, 3 input channels, 256 output channels and an 8x8 output, initially:

  • num parameters: 5x5x3x256 = 19200 parameters

  • num operations: num_parameters * (8*8) = 1.23M operations

Now:

  • num parameters: 3 * (5x5) + 3 * 256 = 75 + 768 = 843 parameters
  • num operations: (8 * 8) * (3 * 5 * 5) + (8 * 8) * 3 * 256 = 53952 operations
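The parameter counts above can be checked directly in PyTorch: the depthwise stage (`groups=3`, one 5x5 kernel per input channel) contributes 3 * 5 * 5 = 75 weights and the 1x1 pointwise stage 3 * 256 = 768, versus 19200 for the standard convolution:

```python
import torch.nn as nn

def n_params(m):
    return sum(p.numel() for p in m.parameters())

# Standard conv: 5x5 kernel, 3 -> 256 channels, no bias.
standard = nn.Conv2d(3, 256, kernel_size=5, bias=False)

# Depthwise: groups=3 gives each input channel its own 5x5 kernel,
# then a 1x1 pointwise conv mixes the 3 channels into 256.
depthwise = nn.Conv2d(3, 3, kernel_size=5, groups=3, bias=False)
pointwise = nn.Conv2d(3, 256, kernel_size=1, bias=False)

print(n_params(standard))                          # 19200
print(n_params(depthwise) + n_params(pointwise))   # 843
```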

Grouped convolutions
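A grouped convolution splits the input channels into `groups` independent groups, each convolved with its own set of kernels; parameters and operations shrink by the number of groups. A PyTorch sketch (channel counts are illustrative):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 8, 16, 16)

# groups=4 splits the 8 input channels into 4 groups of 2; each block of
# 4 output channels only sees the 2 input channels of its own group.
grouped = nn.Conv2d(8, 16, kernel_size=3, groups=4, bias=False)
full = nn.Conv2d(8, 16, kernel_size=3, bias=False)

# Weight shapes: (out_c, in_c / groups, k, k) vs (out_c, in_c, k, k)
print(grouped.weight.shape)  # torch.Size([16, 2, 3, 3])
print(full.weight.shape)     # torch.Size([16, 8, 3, 3])
```

Depthwise convolution is the extreme case `groups == in_channels`, which is what the depthwise separable section above relies on.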

MixConv

original: every output channel uses the same k x k kernel size: out_chan * (k * k)

num parameters: $k^2 \cdot in_c \cdot out_c$

num operations: $out_size^2 \cdot k^2 \cdot in_c \cdot out_c$

proposition: split out_chan into groups and apply increasingly large conv kernels

out_chan_group1 * (3 * 3) + out_chan_group2 * (5 * 5) + … + out_chan_groupN * (k * k)

num parameters: $in_c \cdot \frac{out_c}{(k - 1)/2} \cdot \sum\limits_{i=1}^{(k-1)/2} (2i + 1)^2 $

num operations: $out_size^2 \cdot in_c \cdot \frac{out_c}{(k-1)/2} \cdot \sum\limits_{i=1}^{(k-1)/2} (2i + 1)^2$
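The grouping idea can be sketched as a small PyTorch module. Note the assumptions: following the MixConv paper, each group uses a depthwise convolution, the channels are split evenly across the kernel sizes, and `MixConv` and its defaults are our own illustrative choices:

```python
import torch
import torch.nn as nn

class MixConv(nn.Module):
    """Sketch of MixConv: split the channels into groups and run each
    group through a depthwise conv with a different kernel size."""

    def __init__(self, channels, kernel_sizes=(3, 5, 7)):
        super().__init__()
        splits = [channels // len(kernel_sizes)] * len(kernel_sizes)
        splits[0] += channels - sum(splits)  # absorb any remainder
        self.splits = splits
        self.convs = nn.ModuleList(
            nn.Conv2d(c, c, k, padding=k // 2, groups=c, bias=False)
            for c, k in zip(splits, kernel_sizes)
        )

    def forward(self, x):
        chunks = torch.split(x, self.splits, dim=1)
        return torch.cat(
            [conv(c) for conv, c in zip(self.convs, chunks)], dim=1
        )

mix = MixConv(24)
y = mix(torch.randn(1, 24, 8, 8))
print(y.shape)  # torch.Size([1, 24, 8, 8])
```

Same-padding (`padding=k // 2`) keeps every group's spatial size identical so the groups can be concatenated back along the channel dimension.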

AdderNet

Deformable Conv

Related Posts

Non-linear activation

A good start eases the journey

Initialization for efficient training
