# #StackBounty: #machine-learning #neural-networks #tensorflow #keras #differential-equations On solving ode/pde with Neural Networks

### Bounty: 50

Recently, I watched this video on YouTube on solving ODEs/PDEs with neural networks, and it motivated me to write a short piece of code in Keras. I also believe the video is referencing this paper, found here.

I selected an example ODE
$$\frac{\partial^2 x(t)}{\partial t^2} + 14 \frac{\partial x(t)}{\partial t} + 49x(t) = 0$$

with initial conditions
$$x(0) = 0, \quad \frac{\partial x(t)}{\partial t}\Big\rvert_{t=0} = -3$$

According to the video, if I understand correctly, we let the neural network $$\hat{x}(t)$$ be the solution of our ODE, so $$x(t) \approx \hat{x}(t)$$.

Then, we minimize the residual of the ODE, which serves as our custom cost function. Since we have initial conditions, I created a piecewise (step) loss function for the individual data points:

At $$t=0$$:
$$loss_i = \left( \frac{\partial^2 \hat{x}(t_i)}{\partial t^2} + 14 \frac{\partial \hat{x}(t_i)}{\partial t} + 49\hat{x}(t_i) \right)^2 + \left( \frac{\partial \hat{x}(t_i)}{\partial t} + 3 \right)^2 + \left( \hat{x}(t_i) \right)^2$$

else
$$loss_i = \left( \frac{\partial^2 \hat{x}(t_i)}{\partial t^2} + 14 \frac{\partial \hat{x}(t_i)}{\partial t} + 49\hat{x}(t_i) \right)^2$$

Then, we minimize the batch loss
$$\min \frac{1}{b} \sum_{i=1}^{b} loss_i$$

where $$b$$ is the batch size in training.
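For reference, here is a minimal TensorFlow/Keras sketch of this setup. The architecture, optimizer settings, training interval, and number of collocation points are my own assumptions (they are not from the video); the derivatives of the network output are computed with nested `tf.GradientTape`s:

```python
import numpy as np
import tensorflow as tf

# Small fully connected network standing in for x_hat(t).
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(32, activation="sigmoid"),
    tf.keras.layers.Dense(1),
])
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)

def derivatives(t):
    """Return x_hat, dx_hat/dt, and d2x_hat/dt2 at the points t."""
    with tf.GradientTape() as tape2:
        tape2.watch(t)
        with tf.GradientTape() as tape1:
            tape1.watch(t)
            x = model(t)
        dx = tape1.gradient(x, t)
    d2x = tape2.gradient(dx, t)
    return x, dx, d2x

@tf.function
def train_step(t):
    with tf.GradientTape() as tape:
        x, dx, d2x = derivatives(t)
        residual = d2x + 14.0 * dx + 49.0 * x            # ODE residual
        ic_mask = tf.cast(tf.equal(t, 0.0), tf.float32)  # 1 only where t_i = 0
        loss_i = tf.square(residual) + ic_mask * (tf.square(dx + 3.0) + tf.square(x))
        loss = tf.reduce_mean(loss_i)                    # (1/b) * sum_i loss_i
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

# Collocation points on [0, 2]; linspace includes t = 0 for the initial conditions.
t_train = tf.constant(np.linspace(0.0, 2.0, 200).reshape(-1, 1), dtype=tf.float32)
for epoch in range(5000):
    loss = train_step(t_train)
```

The `ic_mask` term reproduces the piecewise loss above: the initial-condition penalties are active only at the collocation point $$t_i = 0$$.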

Unfortunately, the network always learns zero. This makes sense: for an output near zero, the first and second derivatives are very small, while the coefficient on $$x(t)$$ is very large ($$49$$), so the network learns that a zero output is a good way to minimize the loss.

Now, there is a chance that I have misinterpreted the video, because I believe my code is correct. If someone can shed some light on this, I would truly appreciate it.

Is my cost function correct? Do I need some other transformation?

Update:

I managed to improve the training by removing the conditional cost function. What was happening was that the initial-condition terms appeared very infrequently (only at the single point $$t_i = 0$$), so the network was not adjusting enough to the initial conditions.

By changing the cost function to the following, the network now has to satisfy the initial conditions on every step:

$$loss_i = \left( \frac{\partial^2 \hat{x}(t_i)}{\partial t^2} + 14 \frac{\partial \hat{x}(t_i)}{\partial t} + 49\hat{x}(t_i) \right)^2 + \left( \frac{\partial \hat{x}(t)}{\partial t}\Big\rvert_{t=0} + 3 \right)^2 + \left( \hat{x}(0) \right)^2$$
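In code (reusing `model`, `optimizer`, and `derivatives` from the sketch above, so still only a sketch under the same assumptions), the change amounts to evaluating the initial-condition terms at a fixed $$t = 0$$ on every step instead of masking them:

```python
@tf.function
def train_step(t):
    with tf.GradientTape() as tape:
        # Mean squared ODE residual over the batch.
        x, dx, d2x = derivatives(t)
        residual_loss = tf.reduce_mean(tf.square(d2x + 14.0 * dx + 49.0 * x))

        # Initial-condition penalties, now evaluated at t = 0 on every step.
        x0, dx0, _ = derivatives(tf.zeros((1, 1)))
        ic_loss = tf.reduce_sum(tf.square(dx0 + 3.0) + tf.square(x0))

        loss = residual_loss + ic_loss
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```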

The results are not perfect, but they are better. I have not managed to get the loss close to zero. Deep networks have not worked at all; only a shallow network with sigmoid activations and lots of epochs has.

I am surprised this works at all, since the cost function depends on derivatives of the network output with respect to the input $$t$$, which is not a trainable parameter.

I would appreciate any input on improving the solution. I have seen a lot of fancier methods, but this is the most straightforward one. For example, in the paper referenced above, the author uses a trial solution; I do not understand how that works at all.
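As far as I can tell (this is only my reading of the paper, which I have not tried), the trial solution builds the initial conditions into the ansatz itself:

$$\hat{x}_{trial}(t) = x(0) + \frac{\partial x(t)}{\partial t}\Big\rvert_{t=0}\, t + t^2 N(t) = -3t + t^2 N(t)$$

where $$N(t)$$ is the raw network output. Any choice of $$N$$ then satisfies both initial conditions exactly, so only the ODE residual term remains in the loss.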
