*Bounty: 50*

My CNN has the following structure:

- Output neurons: 10
- Input matrix (I): 28×28
- Convolutional layer (C): 3 feature maps with a 5×5 kernel (output dimension is 3x24x24)
- Max pooling layer (MP): size 2×2 (output dimension is 3x12x12)
- Fully connected layer (FC): 432×10 (3·12·12 = 432, the max pooling output flattened into a vector)
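To double-check the dimensions listed above, here is a small sketch of the shape arithmetic. It assumes a "valid" convolution with stride 1 and non-overlapping pooling, which is what the stated sizes imply; the helper names are only illustrative.

```python
def conv_output_size(input_size, kernel_size):
    """Output width of a 'valid' convolution with stride 1 (assumed)."""
    return input_size - kernel_size + 1


def pool_output_size(input_size, pool_size):
    """Output width of non-overlapping pooling (assumed)."""
    return input_size // pool_size


n_maps = 3
conv_size = conv_output_size(28, 5)           # 28 - 5 + 1 = 24
pooled_size = pool_output_size(conv_size, 2)  # 24 // 2 = 12
fc_inputs = n_maps * pooled_size * pooled_size

print((n_maps, conv_size, conv_size))      # shape after C
print((n_maps, pooled_size, pooled_size))  # shape after MP
print(fc_inputs)                           # number of inputs to FC
```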

After making the forward pass, I calculate the error delta in the output layer as:

$\delta^L = (a^L - y) \odot \sigma'(z^L) \quad (1)$

Here $a^L$ is the predicted value and $z^L$ is the weighted input to the output layer (the dot product of the weights with the previous layer's activations, plus the biases).

I calculate the error deltas for the preceding layers (moving backwards through the network) with:

$\delta^l = ((w^{l+1})^T \delta^{l+1}) \odot \sigma'(z^l) \quad (2)$

And the derivative of the cost w.r.t. the weights is:

$\frac{\partial C}{\partial w^l_{jk}} = a^{l-1}_k \delta^l_j \quad (3)$
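For concreteness, this is roughly how equations 1 and 3 look for the FC layer in NumPy. It is only a sketch: I'm assuming a sigmoid for $\sigma$, random placeholder data, and weights stored as 10×432 (the transpose of the 432×10 layout above) so that $z = w a + b$ works directly.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)


rng = np.random.default_rng(0)

# Placeholder data: flattened MP output, FC parameters, one-hot target.
a_prev = rng.random((432, 1))                # a^{L-1}
w = rng.standard_normal((10, 432)) * 0.01    # w^L, stored as (outputs, inputs)
b = np.zeros((10, 1))                        # b^L
y = np.zeros((10, 1))
y[3] = 1.0

# Forward pass through FC.
z = w @ a_prev + b
a = sigmoid(z)

# Equation 1: error delta in the output layer.
delta_out = (a - y) * sigmoid_prime(z)       # shape (10, 1)

# Equation 3: entry [j, k] is a^{L-1}_k * delta^L_j (an outer product).
grad_w = delta_out @ a_prev.T                # shape (10, 432)

# The bias gradient is the delta itself.
grad_b = delta_out
```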

I’m able to update the weights (and biases) of $FC$ with no problem. At this point, the error delta $\delta$ is 10×1.

To calculate the error delta for $MP$, I multiply the transposed $FC$ weights by the error delta, as in equation 2. That gives me an error delta of 432×1. Because this layer has no parameters, and the only other operation was the flattening into a vector, I just need to reverse that step and reshape the delta to 3x12x12, which is the error in $MP$.
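A minimal sketch of that step, with random placeholder values and the same 10×432 weight layout assumed as before. I'm also assuming the pooling output feeds $FC$ directly with no activation of its own, so the $\sigma'$ factor of equation 2 is 1 here. The reshape has to use the same element order as the forward flattening (NumPy's default C order in this sketch).

```python
import numpy as np

rng = np.random.default_rng(0)

w_fc = rng.standard_normal((10, 432))    # FC weights, (outputs, inputs)
delta_fc = rng.standard_normal((10, 1))  # error delta of FC, 10x1

# Forward pass flattened the pooled maps like this (C order).
pooled = rng.random((3, 12, 12))
flat = pooled.reshape(-1, 1)             # shape (432, 1)

# Equation 2 without the sigma' factor: (w^{l+1})^T delta^{l+1}.
delta_flat = w_fc.T @ delta_fc           # shape (432, 1)

# Undo the flattening to get the error delta of MP.
delta_mp = delta_flat.reshape(pooled.shape)  # shape (3, 12, 12)
```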

To find the error delta for $C$, I upsample the error delta by reversing the max pooling, which gives a 3x24x24 delta. Taking the Hadamard product of each of those matrices with the $\sigma'$ of the corresponding feature map gives me the error delta for $C$.
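Here is a sketch of the upsampling as I understand it: each pooled delta is routed back to the position that held the maximum of its 2×2 block, and every other position gets zero. The function names are illustrative, a sigmoid is assumed for $\sigma$, and pooling is assumed to act on the activations $\sigma(z)$ of $C$. If a block contains tied maxima, this version sends the delta to all of them.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)


def maxpool_forward(a, size=2):
    """Non-overlapping max pooling over maps of shape (n, h, w)."""
    n, h, w = a.shape
    blocks = a.reshape(n, h // size, size, w // size, size)
    return blocks.max(axis=(2, 4))


def maxpool_backward(delta_pooled, a, size=2):
    """Send each pooled delta back to the max position of its block."""
    pooled = maxpool_forward(a, size)
    # Repeat pooled values and deltas so they line up with the input grid.
    up_pooled = pooled.repeat(size, axis=1).repeat(size, axis=2)
    up_delta = delta_pooled.repeat(size, axis=1).repeat(size, axis=2)
    mask = a == up_pooled  # True where the block maximum was taken
    return up_delta * mask


rng = np.random.default_rng(0)
z_conv = rng.standard_normal((3, 24, 24))    # placeholder pre-activations of C
a_conv = sigmoid(z_conv)
delta_mp = rng.standard_normal((3, 12, 12))  # placeholder error delta of MP

delta_up = maxpool_backward(delta_mp, a_conv)  # shape (3, 24, 24)
delta_conv = delta_up * sigmoid_prime(z_conv)  # Hadamard product with sigma'
```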

But now, how am I supposed to update the kernels, if they’re 5×5 and $I$ is 28×28? I have the error delta for the layer, but I don’t know how to update the weights with it. The same goes for the bias, as it’s a single value for the whole feature map.