# #StackBounty: #neural-network #deep-learning #pytorch #deep-network Understanding depthwise convolution vs convolution with group param…

### Bounty: 50

So in the MobileNet-v1 network, depthwise conv layers are used, and I understand them as follows.

For an input feature map of `(C_in, F_in, F_in)`, we take only 1 kernel with `C_in` channels, say of size `(C_in, K, K)`, and convolve each channel of the kernel with the corresponding channel of the input, producing a `(C_in, F_out, F_out)` feature map. Then we do a pointwise conv to combine those channels, using `C_out` kernels of size `(C_in, 1, 1)`: each kernel produces a `(1, F_out, F_out)` map, so stacking them gives `(C_out, F_out, F_out)`. The kernel parameter reduction ratio compared to a normal conv is:

`(K*K*C_in+C_in*C_out)/(K*K*C_in*C_out) = 1/C_out + 1/(K*K)`
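This ratio can be checked directly in PyTorch. Below is a minimal sketch (not from the original post; `C_in=32`, `C_out=64`, `K=3` are example values I picked for illustration) that builds the depthwise + pointwise pair and compares its parameter count against a standard convolution:

```python
# Sketch: verify the depthwise-separable parameter reduction ratio.
# C_in, C_out, K are illustrative values, not from the question.
import torch
import torch.nn as nn

C_in, C_out, K = 32, 64, 3

# Depthwise: one K x K filter per input channel (groups=C_in).
depthwise = nn.Conv2d(C_in, C_in, K, padding=1, groups=C_in, bias=False)
# Pointwise: C_out kernels of size (C_in, 1, 1).
pointwise = nn.Conv2d(C_in, C_out, 1, bias=False)
# Standard convolution for comparison.
standard = nn.Conv2d(C_in, C_out, K, padding=1, bias=False)

x = torch.randn(1, C_in, 56, 56)
print(pointwise(depthwise(x)).shape)  # torch.Size([1, 64, 56, 56])

dw_params = sum(p.numel() for p in depthwise.parameters()) \
          + sum(p.numel() for p in pointwise.parameters())
std_params = sum(p.numel() for p in standard.parameters())

print(dw_params / std_params)       # measured ratio
print(1 / C_out + 1 / (K * K))      # 1/C_out + 1/(K*K), matches above
```

With these numbers, `dw_params = 32*9 + 32*64 = 2336` and `std_params = 64*32*9 = 18432`, and `2336/18432` equals `1/64 + 1/9` exactly.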

I also checked `Conv2d` (docs) in PyTorch, which says one can achieve depthwise convolution by setting the `groups` parameter equal to `C_in`. But as I read related articles, the logic behind `groups` looks different from the depthwise convolution operation MobileNet uses. Say we have `C_in=6` and `C_out=18`; `groups=6` means you divide both the input and output channels into `6` groups. In each group, `3` kernels, each having `1` channel, are convolved with one input channel, so a total of `18` output channels is produced.

But a normal convolution uses `18*6` kernel-channels in total (`18` kernels, each having `6` channels). So the reduction ratio is `18/(18*6) = 1/C_in = 1/groups`. Leaving the pointwise conv out of consideration, this number differs from the `1/C_out` term in the conclusion above.
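The `C_in=6`, `C_out=18`, `groups=6` example above can be sketched in PyTorch by inspecting the weight tensors (values chosen to match the text):

```python
# Sketch: the grouped-conv example from the text (C_in=6, C_out=18, groups=6).
import torch.nn as nn

C_in, C_out, K, groups = 6, 18, 3, 6

grouped = nn.Conv2d(C_in, C_out, K, groups=groups, bias=False)
normal = nn.Conv2d(C_in, C_out, K, bias=False)

# Each of the 18 grouped kernels has C_in/groups = 1 channel.
print(grouped.weight.shape)  # torch.Size([18, 1, 3, 3])
print(normal.weight.shape)   # torch.Size([18, 6, 3, 3])

# Kernel-channel ratio: 18 / (18*6) = 1/6 = 1/groups.
print(grouped.weight.numel() / normal.weight.numel())  # 0.1666...
```

This confirms that, without the pointwise step, the saving from `groups` alone is `1/groups`, not `1/C_out`.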

Can anyone explain where I am wrong? Is it because I missed something when `C_out = factor * C_in` (with factor > 1)?

