#StackBounty: #bayesian #feature-selection #bias #unbiased-estimator #conjugate-prior Are Bayes estimators immune to selection bias?

Bounty: 50

Are Bayes estimators immune to selection bias?

Most papers that discuss estimation in high dimensions, e.g., with whole-genome sequence data, raise the issue of selection bias. Selection bias arises because, although we have thousands of potential predictors, only a few are selected and inference is done on those few. So the process goes in two steps: (1) select a subset of predictors; (2) perform inference on the selected set, e.g., estimate odds ratios. Dawid, in his 1994 paper on selection paradoxes, focused on unbiased estimators and Bayes estimators. He simplifies the problem to estimating the largest effect, which could be a treatment effect, and argues that unbiased estimators are affected by selection bias. He used the following example: assume
$$
Z_i \sim N(\delta_i, 1), \quad i = 1, \ldots, N,
$$
then each $Z_i$ is unbiased for $\delta_i$. Let $\mathbf{Z} = (Z_1, Z_2, \ldots, Z_N)^T$; the estimator
$$
\gamma_1(\mathbf{Z}) = \max\{Z_1, Z_2, \ldots, Z_N\}
$$
is, however, positively biased for $\max\{\delta_1, \delta_2, \ldots, \delta_N\}$, which is easily shown with Jensen's inequality. Therefore, if we knew $i_{\max}$, the index of the largest $\delta_i$, we would just use $Z_{i_{\max}}$ as its estimator, and that estimator is unbiased. But because we do not know $i_{\max}$, we use $\gamma_1(\mathbf{Z})$ instead, which is positively biased.
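
To see the positive bias concretely, here is a minimal simulation sketch (assuming NumPy; the particular $\delta_i$ values are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical effect sizes; the largest is max(delta) = 2.0.
delta = np.array([0.0, 0.5, 1.0, 2.0])
delta_max = delta.max()

# Simulate Z_i ~ N(delta_i, 1) many times and average gamma_1(Z) = max_i Z_i.
n_sim = 100_000
Z = rng.normal(loc=delta, scale=1.0, size=(n_sim, delta.size))
gamma_1 = Z.max(axis=1)

print(f"true max(delta):    {delta_max:.3f}")
print(f"mean of gamma_1(Z): {gamma_1.mean():.3f}")  # noticeably larger than 2.0
print(f"estimated bias:     {gamma_1.mean() - delta_max:.3f}")
```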


But the worrying statement that Dawid, Efron, and other authors make is that Bayes estimators are immune to selection bias. If I now put a prior on $\delta_i$, say $\delta_i \sim g(\cdot)$, then the Bayes estimator of $\delta_i$ is given by Tweedie's formula,
$$
\mathrm{E}\{\delta_i \mid Z_i\} = z_i + \frac{d}{dz_i}\log m(z_i),
$$
where $m(z_i) = \int \varphi(z_i - \delta_i)\, g(\delta_i)\, d\delta_i$ is the marginal density of $Z_i$ and $\varphi(\cdot)$ is the standard Gaussian density.
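
As a concrete check (my own worked example, not from the papers): with a conjugate prior $\delta_i \sim N(0, \tau^2)$, the marginal $m(z)$ is the $N(0, 1+\tau^2)$ density, $\frac{d}{dz}\log m(z) = -z/(1+\tau^2)$, and the formula reduces to the familiar shrinkage estimator $\frac{\tau^2}{1+\tau^2}\, z$. A small numerical sketch (the value of $\tau^2$ is just an assumption for illustration):

```python
import numpy as np
from scipy.stats import norm

tau2 = 1.5          # assumed prior variance, delta_i ~ N(0, tau2)
z = np.linspace(-3.0, 3.0, 7)

# Closed-form posterior mean under the conjugate normal prior.
posterior_mean = tau2 / (1.0 + tau2) * z

# Tweedie's formula: z + d/dz log m(z), with m the N(0, 1 + tau2) marginal density.
eps = 1e-5
log_m = lambda x: norm.logpdf(x, loc=0.0, scale=np.sqrt(1.0 + tau2))
tweedie = z + (log_m(z + eps) - log_m(z - eps)) / (2 * eps)

print(np.allclose(posterior_mean, tweedie, atol=1e-6))  # True: both give the same shrinkage
```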

If we define the new estimator of $\delta_{i_{\max}}$ as
$$
\gamma_2(\mathbf{Z}) = \max\{\mathrm{E}\{\delta_1 \mid Z_1\}, \mathrm{E}\{\delta_2 \mid Z_2\}, \ldots, \mathrm{E}\{\delta_N \mid Z_N\}\},
$$
then whichever index $i$ $\gamma_1(\mathbf{Z})$ selects to estimate $\delta_{i_{\max}}$ is the same index that $\gamma_2(\mathbf{Z})$ selects, because $\mathrm{E}\{\delta_i \mid Z_i\}$ is monotone in $Z_i$. We also know that $\mathrm{E}\{\delta_i \mid Z_i\}$ shrinks $Z_i$ towards zero through the term $\frac{d}{dz_i}\log m(z_i)$, which removes some of the positive bias in $Z_i$. But how do we conclude that Bayes estimators are immune to selection bias? I really don't get it.
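
For intuition, here is my own simulation sketch (again assuming the conjugate normal prior, so the Bayes estimator has the closed form above). It draws the $\delta_i$ from the prior, selects the index with the largest $Z_i$, and compares the average error at the selected index for the raw estimate versus the posterior mean:

```python
import numpy as np

rng = np.random.default_rng(1)

tau2 = 1.5                      # assumed prior variance, delta_i ~ N(0, tau2)
N, n_sim = 10, 100_000
shrink = tau2 / (1.0 + tau2)    # posterior-mean shrinkage factor under this prior

delta = rng.normal(0.0, np.sqrt(tau2), size=(n_sim, N))  # effects drawn from the prior
Z = rng.normal(delta, 1.0)                               # Z_i ~ N(delta_i, 1)

idx = Z.argmax(axis=1)          # selected index (same under gamma_1 and gamma_2, by monotonicity)
rows = np.arange(n_sim)
bias_raw = (Z[rows, idx] - delta[rows, idx]).mean()              # gamma_1: clearly positive
bias_bayes = (shrink * Z[rows, idx] - delta[rows, idx]).mean()   # gamma_2: close to zero

print(f"avg error of gamma_1 at the selected index: {bias_raw:.3f}")
print(f"avg error of gamma_2 at the selected index: {bias_bayes:.3f}")
```

Note that the averaging here is over draws of the $\delta_i$ from the prior as well as over the data, i.e., it measures Bayes-average error rather than frequentist bias at a fixed $\delta$.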

