Bounty: 50
I have a neural network in a synthetic experiment where the scale of the weights matters, so I do not wish to remove it, and where the network is initialized with a prior that is non-zero and equal everywhere (i.e., every weight starts at the same constant value).
How do I add noise appropriately so that the network trains well under the gradient descent rule
$$ W^{\langle t+1 \rangle} := W^{\langle t \rangle} - \eta \, \nabla_W L(W^{\langle t \rangle}) $$
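To make the setup concrete, here is a minimal PyTorch sketch of what I mean; the architecture, the prior value `0.1`, the learning rate, and the random data are placeholders for illustration, not my actual experiment:

```python
import torch
import torch.nn as nn

# Placeholder network: a small MLP whose parameters are all initialized
# to the same non-zero constant (the "prior that is equal everywhere").
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
prior = 0.1  # assumed constant prior value
for p in model.parameters():
    nn.init.constant_(p, prior)

# Plain gradient descent step W <- W - eta * grad L(W), as in the update rule above.
eta = 0.01
x, y = torch.randn(16, 4), torch.randn(16, 1)  # dummy data
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
with torch.no_grad():
    for p in model.parameters():
        p -= eta * p.grad

# The question: how should noise be added to these constant weights
# (without destroying their scale) so that training with this rule works?
```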
cross-posted:
- Quora: https://www.quora.com/unanswered/How-do-I-add-appropriate-noise-to-a-neural-network-with-constant-weights-so-that-back-propagation-training-works
- SO: How to add appropriate noise to a neural network with constant weights so that back propagation training works?
- pytorch: https://discuss.pytorch.org/t/how-to-add-appropriate-noise-to-a-neural-network-with-constant-weights-so-that-back-propagation-training-works/93411