#StackBounty: #machine-learning #neural-networks #tensorflow #keras #differential-equations On solving ode/pde with Neural Networks

Bounty: 50

Recently, I watched this video on YouTube about solving ODEs/PDEs with a neural network, and it motivated me to write a short code in Keras. I also believe the video is referencing this paper found here.

I selected an example ODE:
$$
\frac{\partial^2 x(t)}{\partial t^2} + 14 \frac{\partial x(t)}{\partial t} + 49x(t) = 0
$$

with initial conditions
$$
x(0) = 0, \qquad \frac{\partial x(t)}{\partial t}\bigg\rvert_{t=0} = -3
$$
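(For reference, this problem has a closed-form solution to check against: the characteristic polynomial $r^2 + 14r + 49 = (r+7)^2$ has a double root at $r = -7$, so $x(t) = (C_1 + C_2 t)e^{-7t}$, and the initial conditions give)
$$
x(t) = -3t\,e^{-7t}
$$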

According to the video, if I understand correctly, we let the neural network $\hat{x}(t)$ be the solution of our ODE, so $x(t) \approx \hat{x}(t)$.

Then we minimize the ODE residual, which is our custom cost function, so to speak. Since we have initial conditions, I created a piecewise loss for individual data points:

At $t=0$:
$$
\text{loss}_i = \left( \frac{\partial^2 \hat{x}(t_i)}{\partial t^2} + 14 \frac{\partial \hat{x}(t_i)}{\partial t} + 49\hat{x}(t_i) \right)^2 +
\left( \frac{\partial \hat{x}(t_i)}{\partial t} + 3 \right)^2 +
\left( \hat{x}(t_i) \right)^2
$$

else
$$
\text{loss}_i = \left( \frac{\partial^2 \hat{x}(t_i)}{\partial t^2} + 14 \frac{\partial \hat{x}(t_i)}{\partial t} + 49\hat{x}(t_i) \right)^2
$$

Then we minimize the batch loss
$$
\min \frac{1}{b} \sum_{i=1}^{b} \text{loss}_i
$$

where $b$ is the batch size in training.
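A minimal TensorFlow 2 sketch of this setup (the network architecture, optimizer, and the collocation grid on $[0, 2]$ are illustrative choices, not necessarily what I used), with nested `tf.GradientTape`s to get the derivatives with respect to $t$:

```python
import numpy as np
import tensorflow as tf

# Small fully connected network mapping t -> x_hat(t).
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="sigmoid", input_shape=(1,)),
    tf.keras.layers.Dense(32, activation="sigmoid"),
    tf.keras.layers.Dense(1),
])
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)

def ode_loss(t):
    # Nested tapes give dx/dt and d2x/dt2 with respect to the input t.
    with tf.GradientTape() as tape2:
        tape2.watch(t)
        with tf.GradientTape() as tape1:
            tape1.watch(t)
            x = model(t)
        dx = tape1.gradient(x, t)
    d2x = tape2.gradient(dx, t)

    # ODE residual at every collocation point.
    loss = tf.square(d2x + 14.0 * dx + 49.0 * x)

    # Add the initial-condition penalties only where t == 0,
    # matching the piecewise loss above.
    at_zero = tf.cast(tf.equal(t, 0.0), tf.float32)
    loss += at_zero * (tf.square(dx + 3.0) + tf.square(x))
    return tf.reduce_mean(loss)

@tf.function
def train_step(t):
    with tf.GradientTape() as tape:
        loss = ode_loss(t)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

t_train = tf.constant(
    np.linspace(0.0, 2.0, 101, dtype=np.float32).reshape(-1, 1))
for epoch in range(5000):
    loss = train_step(t_train)
```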

Unfortunately, the network always learns zero. Looking at the trained output, the first and second derivatives are very small and the coefficient on $x$ is large ($49$), so the network learns that a zero output is a good minimizer.

*(Figure: plot of the learned solution, which collapses to zero.)*

There is a chance that I am misinterpreting the video, because I believe my code is correct. If someone can shed some light on this, I would truly appreciate it.

Is my cost function correct? Do I need some other transformation?

Update:

I managed to improve the training by removing the conditional cost function. What was happening was that the initial-condition terms appeared very infrequently, so the network was not adjusting enough for the initial conditions.

By changing the cost function to the following, the network now has to satisfy the initial conditions at every step:

$$
\text{loss}_i = \left( \frac{\partial^2 \hat{x}(t_i)}{\partial t^2} + 14 \frac{\partial \hat{x}(t_i)}{\partial t} + 49\hat{x}(t_i) \right)^2 +
\left( \frac{\partial \hat{x}(t)}{\partial t}\bigg\rvert_{t=0} + 3 \right)^2 +
\left( \hat{x}(0) \right)^2
$$
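In code, this amounts to evaluating the initial-condition terms at a fixed $t = 0$ point on every step. A sketch of the change, reusing `model` and the nested-tape pattern from above:

```python
t0 = tf.constant([[0.0]])

def ode_loss_v2(t):
    # ODE residual at the collocation points t, as before.
    with tf.GradientTape() as tape2:
        tape2.watch(t)
        with tf.GradientTape() as tape1:
            tape1.watch(t)
            x = model(t)
        dx = tape1.gradient(x, t)
    d2x = tape2.gradient(dx, t)
    residual_loss = tf.reduce_mean(tf.square(d2x + 14.0 * dx + 49.0 * x))

    # Initial conditions evaluated at t = 0 on every step.
    with tf.GradientTape() as tape:
        tape.watch(t0)
        x0 = model(t0)
    dx0 = tape.gradient(x0, t0)
    ic_loss = tf.square(dx0 + 3.0) + tf.square(x0)

    return residual_loss + tf.reduce_mean(ic_loss)
```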

The results are not perfect, but better; I have not managed to get the loss close to zero. Deep networks have not worked at all, only a shallow one with sigmoid activations and lots of epochs.

I am surprised this works at all, since the cost function depends on derivatives of the network output with respect to its input, which is not a trainable parameter.

*(Figure: plot of the improved solution after the cost-function change.)*

I would appreciate any input on improving the solution. I have seen a lot of fancy methods, but this is the most straightforward. For example, in the referenced paper above the author uses a trial solution, which I do not fully understand.
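If I am reading the paper correctly, the trial solution builds the initial conditions into the ansatz itself: writing $\hat{x}(t) = -3t + t^2 N(t)$, with $N(t)$ the raw network output, gives $\hat{x}(0) = 0$ and $\hat{x}'(0) = -3$ for any weights, so the loss reduces to just the ODE residual. A sketch of that variant (reusing `model` from above; `x_trial` and `trial_loss` are my own names):

```python
def x_trial(t):
    # Trial form x(0) + x'(0)*t + t^2 * N(t): satisfies both initial
    # conditions by construction, regardless of the network weights.
    return -3.0 * t + tf.square(t) * model(t)

def trial_loss(t):
    # Only the ODE residual is needed; no initial-condition penalties.
    with tf.GradientTape() as tape2:
        tape2.watch(t)
        with tf.GradientTape() as tape1:
            tape1.watch(t)
            x = x_trial(t)
        dx = tape1.gradient(x, t)
    d2x = tape2.gradient(dx, t)
    return tf.reduce_mean(tf.square(d2x + 14.0 * dx + 49.0 * x))
```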

