Suppose that we have four possible events $e_0, e_1, e_2$ and $e_3$. These events are mutually exclusive, $e_i \cap e_j = \emptyset$ for $i \neq j$, and exhaustive, $\bigcup_i e_i = \Omega$. Now we ask two experts for their opinions on the probabilities of the individual $e_i$. We get the results in the following table:
There are numerous ways to combine their opinions into a single probability distribution. Two papers in particular provide a very good overview of the entire topic:
Genest, C., & Zidek, J. V. (1986). Combining Probability Distributions: A Critique and an Annotated Bibliography. Statistical Science, 1(1), 114–135. https://doi.org/10.1214/ss/1177013825
Dietrich, F., & List, C. (2017). Probabilistic Opinion Pooling. (A. Hájek & C. Hitchcock, Eds.), Social Choice and Welfare (Vol. 1). Oxford University Press.
Some of the more prominent methods are:
- linear: take a weighted arithmetic mean of probabilities for each event
- geometric: take a weighted geometric mean of probabilities for each event and then normalize
- multiplicative: take the product of probabilities for each event and then normalize
The results of these methods are also included in the table. Each method can be justified based on firm mathematical results.
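To make the three rules concrete, here is a minimal Python sketch. The expert distributions and the equal weights are hypothetical values chosen for illustration (they are not the numbers from the table), and the function names are my own.

```python
import math

def normalize(values):
    """Rescale non-negative values so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]

def linear_pool(dists, weights):
    """Weighted arithmetic mean of the probabilities, event by event."""
    n = len(dists[0])
    return [sum(w * p[i] for w, p in zip(weights, dists)) for i in range(n)]

def geometric_pool(dists, weights):
    """Weighted geometric mean of the probabilities, then normalize."""
    n = len(dists[0])
    raw = [math.prod(p[i] ** w for w, p in zip(weights, dists)) for i in range(n)]
    return normalize(raw)

def multiplicative_pool(dists):
    """Plain product of the probabilities, then normalize."""
    n = len(dists[0])
    raw = [math.prod(p[i] for p in dists) for i in range(n)]
    return normalize(raw)

# Hypothetical opinions over (e_0, e_1, e_2, e_3); NOT the values from the table.
expert_a = [0.1, 0.2, 0.3, 0.4]
expert_b = [0.4, 0.3, 0.2, 0.1]
weights = [0.5, 0.5]  # assumed equal weights

linear = linear_pool([expert_a, expert_b], weights)
geometric = geometric_pool([expert_a, expert_b], weights)
multiplicative = multiplicative_pool([expert_a, expert_b])

print("linear:        ", linear)
print("geometric:     ", geometric)
print("multiplicative:", multiplicative)
```

Note that only the geometric and multiplicative rules need the normalization step; a weighted arithmetic mean of distributions already sums to 1 when the weights do.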
Now here is the catch. Suppose that we are no longer interested in the individual probabilities of $e_2$ and $e_3$, but would just like to know the probability of $e_2 \cup e_3$. There are two ways in which we can proceed:
- We can sum the resulting combined probabilities for $e_2$ and $e_3$.
- We can sum each expert's probabilities for $e_2$ and $e_3$ and then combine the summed probabilities.
As you can see in the table, these two approaches do not yield the same results.
For example, for the geometric method, the first approach yields 0.71 and the second approach yields 0.712.
If the two approaches were guaranteed to produce the same result, we would say that the combination method has the marginalization property.
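The difference between the two approaches can be checked numerically. The sketch below uses hypothetical expert opinions and equal weights (assumptions for illustration, not the table's values) and compares "pool, then sum" with "sum, then pool" for the linear and geometric rules.

```python
import math

def linear_pool(dists, weights):
    """Weighted arithmetic mean of the probabilities, event by event."""
    n = len(dists[0])
    return [sum(w * p[i] for w, p in zip(weights, dists)) for i in range(n)]

def geometric_pool(dists, weights):
    """Weighted geometric mean of the probabilities, then normalize."""
    n = len(dists[0])
    raw = [math.prod(p[i] ** w for w, p in zip(weights, dists)) for i in range(n)]
    total = sum(raw)
    return [r / total for r in raw]

def merge_last_two(dist):
    """Coarsen (e_0, e_1, e_2, e_3) to (e_0, e_1, e_2 or e_3)."""
    return [dist[0], dist[1], dist[2] + dist[3]]

# Hypothetical opinions over (e_0, e_1, e_2, e_3); NOT the values from the table.
experts = [[0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1]]
weights = [0.5, 0.5]  # assumed equal weights
coarse_experts = [merge_last_two(p) for p in experts]

# Approach 1: pool the full distributions, then sum the entries for e_2 and e_3.
lin_pool_then_sum = sum(linear_pool(experts, weights)[2:])
geo_pool_then_sum = sum(geometric_pool(experts, weights)[2:])

# Approach 2: sum each expert's entries for e_2 and e_3 first, then pool.
lin_sum_then_pool = linear_pool(coarse_experts, weights)[2]
geo_sum_then_pool = geometric_pool(coarse_experts, weights)[2]

print("linear:   ", lin_pool_then_sum, lin_sum_then_pool)
print("geometric:", geo_pool_then_sum, geo_sum_then_pool)
```

With these inputs the linear rule gives the same answer either way, while the geometric rule does not. For the geometric rule, the unnormalized value for the merged event changes from $\prod_k p_k(e_2)^{w_k} + \prod_k p_k(e_3)^{w_k}$ to $\prod_k \left(p_k(e_2) + p_k(e_3)\right)^{w_k}$, and the normalizing constant changes with it.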
Lindley, in his paper "Reconciliation of discrete probability distributions" (sorry, I cannot find the exact citation), tries to argue that the entire concept of the marginalization property is flawed, since in the first case we are given more information by the experts. He also gives a numerical illustration using his supra-Bayesian model, but I find it very hard to follow. Furthermore, I am not very convinced by his explanation about receiving "more information". His numerical results hold, but I have the feeling that this is because his model is much more complicated and must account for the correlations between the experts.
Are there any intuitive arguments that would reconcile the marginalization property with the simple geometric and multiplicative operators?