# #StackBounty: #forecasting #model-evaluation #scoring-rules Evaluating probabilistic forecasts of K-most-likely events from an arbitrar…

### Bounty: 50

Suppose a populous nation has a high homicide rate and an understaffed police force. The police chief hires a statistician and together they decide to take a preventative approach by identifying would-be-murderers before they commit the crime, along the lines of Minority Report.

The police chief requires the statistician to provide the following on a daily basis:

1. A list of tomorrow’s top 100 most-likely murderers. (The statistician may have information about the entire citizen population, but the chief doesn’t have time to think about more than 100 cases.)
2. For each person on the list, the statistician’s best estimate of the probability that the person will commit a murder (in the absence of intervention).

The police chief will regularly evaluate the statistician’s forecasts and provide bonus pay for good performance. Unfortunately, the chief does not know how to score the forecasts in a way that incentivizes the statistician to honestly strive toward the objectives (1) and (2). Can you help?

Here are two basic proposals of increasing complexity:

• Score = recall: the number of actual murderers that the statistician included on the list. But this gives no incentive for accurate probabilities (objective 2).
• Score = $100 - \sum_{i=1}^{100} (O_i - p_i)^2$, similar to the Brier score. Here $p_i$ is the forecasted probability for the $i$th person on the list, and $O_i$ is the true outcome (0 or 1) of whether that person commits murder. But the statistician can easily maximize this by listing 100 people with no chance of being murderers and setting every $p_i$ to 0.
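To make the flaw in the second proposal concrete, here is a minimal sketch comparing the degenerate strategy against an honest forecaster. The uniform 30% risk level for the "honest" list is an arbitrary illustration, not part of the problem statement.

```python
import random

def score(outcomes, probs):
    """Proposal 2: 100 minus the sum of squared errors (Brier-style)."""
    return 100 - sum((o - p) ** 2 for o, p in zip(outcomes, probs))

random.seed(0)

# Gamed list: 100 people with essentially zero murder risk, all forecast
# at p_i = 0. Every outcome is 0, so the score is exactly 100 -- the maximum.
gamed_probs = [0.0] * 100
gamed_outcomes = [0] * 100

# Honest list: genuinely high-risk people, forecast truthfully at an
# (assumed) 30% risk each. Even perfectly calibrated forecasts lose points,
# since 0 < p_i < 1 guarantees a nonzero squared error on every case.
honest_probs = [0.3] * 100
honest_outcomes = [1 if random.random() < 0.3 else 0 for _ in range(100)]

print(score(gamed_outcomes, gamed_probs))    # exactly 100
print(score(honest_outcomes, honest_probs))  # strictly below 100
```

So the score rewards picking obviously safe people over identifying the true top-100, which is exactly backwards relative to objective (1).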

Any other ideas? I strongly suspect that this is not a new problem; a good reference may suffice.

