#StackBounty: #maximum-likelihood #likelihood #cross-entropy Can we derive the cross entropy formula as maximum likelihood estimation for soft labels?

Bounty: 100

For hard integer labels $\{0,1\}$, the cross entropy simplifies to the log loss. In this case it is easy to show that minimizing the cross entropy is equivalent to maximizing the log likelihood; see e.g. https://stats.stackexchange.com/a/364237/179312
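
For reference, a minimal sketch of that equivalence, assuming a Bernoulli model in which the classifier outputs a probability $\hat{y} \in (0,1)$ for the positive class: the likelihood of a hard label $y \in \{0,1\}$ is

$$L(\hat{y}; y) = \hat{y}^{\,y}\,(1-\hat{y})^{\,1-y},$$

so the log likelihood is

$$\log L = y \log \hat{y} + (1-y) \log (1-\hat{y}),$$

whose negation is exactly the log loss. Maximizing the log likelihood is therefore the same as minimizing the cross entropy.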

Can we also show this for soft float labels in $[0,1]$? This thread states that the cross entropy function is also appropriate here. But what does the log likelihood function look like in this case?
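
For concreteness, the cross entropy in question keeps the same form, only with a soft label $y \in [0,1]$ (again writing $\hat{y}$ for the predicted probability):

$$H(y, \hat{y}) = -\,y \log \hat{y} - (1-y) \log (1-\hat{y}).$$

The question is what probability model, if any, yields this expression as its negative log likelihood once $y$ is no longer restricted to $\{0,1\}$.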

