#StackBounty: #machine-learning #predictive-models #model #validation #calibration Maximum probability returned by classifier much less…

Bounty: 50

There’s a sklearn calibration curve example that shows curves for different classifiers. I modified it to reproduce an issue I am having on a real dataset by adding class imbalance (0.95 / 0.05). I get the following curve:

*(calibration curve plot)*

I see that for the blue curve, the “Mean predicted value” never goes above 0.8. `calibration_curve` has a `normalize` argument:

Whether y_prob needs to be normalized into the [0, 1] interval, i.e. is not a proper probability. If True, the smallest value in y_prob is mapped onto 0 and the largest one onto 1.

When I normalize:

*(normalized calibration curve plot)*

This pulls the blue curve to the right, but I am confused, because logistic regression does output proper probabilities. Has it simply not seen enough positive cases, because of the class imbalance, to give confident probability estimates (there are 4,500 positives, which is not small)? If I were to report this, should I leave the plot truncated, to show that there is an upper limit to the predicted probabilities? In general, I have found that calibrating the classifiers on independent data sometimes changes how close a curve sits to the diagonal, but it does not always pull the curve all the way over.
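For reference, here is a minimal sketch (not the exact modified example, and the dataset parameters are assumptions) of how the truncated curve can be reproduced: fit logistic regression on a 95/5 imbalanced dataset and inspect `calibration_curve` without normalization:

```python
# Minimal sketch: logistic regression on a 95/5 imbalanced dataset.
# Dataset size and features are illustrative assumptions.
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=100_000, n_features=20,
                           weights=[0.95, 0.05], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
prob_pos = clf.predict_proba(X_te)[:, 1]

# Fraction of positives vs. mean predicted probability per bin,
# without any normalization of y_prob.
frac_pos, mean_pred = calibration_curve(y_te, prob_pos, n_bins=10)
print(prob_pos.max())  # under heavy imbalance this can stay well below 1.0
```

Note that in recent sklearn versions the `normalize` argument has been removed from `calibration_curve`, so rescaling, if wanted, has to be done on `y_prob` before the call.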


The Figure 2 plots in Austin and Steyerberg (2013, DOI: 10.1002/sim.5941) are not normalized (note the left-hand side of the plots for low outcome prevalence):

*(Figure 2 from Austin and Steyerberg, 2013)*

So should the “truncated” curves be kept as-is, as an indication of poor calibration?

Also, it’s interesting that N=500 with prevalence = 0.1 (50 cases and 450 controls) shows better calibration than N=10,000 with prevalence = 0.01 (100 cases), suggesting that it’s not the raw number of cases but the relative number (the prevalence) that leads to poor calibration…
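This observation can be probed with a quick simulation. The sketch below (an assumed setup, not Austin and Steyerberg's actual simulation design) fits logistic regression at the two sample-size/prevalence combinations and compares the largest mean predicted probability reached by the calibration curve:

```python
# Hedged sketch: compare how far the calibration curve extends at
# two (N, prevalence) settings. Data generation is illustrative.
from sklearn.calibration import calibration_curve
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

def max_mean_pred(n, prev, seed=0):
    """Largest per-bin mean predicted probability for a given N and prevalence."""
    X, y = make_classification(n_samples=n, weights=[1 - prev, prev],
                               random_state=seed)
    p = LogisticRegression(max_iter=1000).fit(X, y).predict_proba(X)[:, 1]
    _, mean_pred = calibration_curve(y, p, n_bins=10)
    return mean_pred.max()

print(max_mean_pred(500, 0.10))     # N=500, 10% prevalence (50 cases)
print(max_mean_pred(10_000, 0.01))  # N=10,000, 1% prevalence (100 cases)
```

If the prevalence rather than the raw case count is what matters, the second setting should show the more truncated curve despite having twice as many cases.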
