*Bounty: 50*

I have data D from 10 different subjects in a 2×3×2 factorial design. More specifically, D is an unbiased estimator of the amount of multivariate variance explained by a given model of fMRI (neuroimaging) data. Factors are:

– 2 different models

– 3 different regions of interest (brain area)

– 2 tasks performed by the subjects, from fastest to slowest.

I am interested in whether there is an interaction between model and task, for each of the three regions separately. My statistic is defined by applying a suitable contrast:

region 1: `-1 1 0 0 0 0 1 -1 0 0 0 0`

region 2: `0 0 -1 1 0 0 0 0 1 -1 0 0`

region 3: `0 0 0 0 -1 1 0 0 0 0 1 -1`
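For concreteness, here is how applying these contrasts could look in code. The ordering of the 12 cells is my assumption (task-major, then region, then model); under that ordering each row computes (model difference in task 1) minus (model difference in task 2) for one region:

```python
import numpy as np

# Contrast weights from the question. Assumed cell ordering:
# [t1r1m1, t1r1m2, t1r2m1, t1r2m2, t1r3m1, t1r3m2, t2r1m1, ..., t2r3m2].
contrasts = np.array([
    [-1, 1, 0, 0, 0, 0, 1, -1, 0, 0, 0, 0],  # region 1
    [0, 0, -1, 1, 0, 0, 0, 0, 1, -1, 0, 0],  # region 2
    [0, 0, 0, 0, -1, 1, 0, 0, 0, 0, 1, -1],  # region 3
])

def interaction_stats(d):
    """d: (n_subjects, 12) array of (standardized) D values per cell.
    Returns the three model-by-task interaction estimates."""
    return contrasts @ d.mean(axis=0)
```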

However, I have reason to believe that the measure D is generally different (by a constant factor) between tasks and regions. Because of that, I average the values of D across subjects and models, per task and region, and divide the values of D by these averages (a form of standardization). I think this standardization is necessary to avoid observing an interaction between model and task simply because different tasks evoke differently strong effects generally. I’m aware that 10 subjects and therefore 20 values per average is not a lot, but it is all I have at the moment. After standardization, I apply the contrast to the standardized data.
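In code, the standardization described above might look like this (the array shape and axis ordering are my assumptions, not something stated in Matlab terms in the post):

```python
import numpy as np

def standardize(D):
    """D: array of assumed shape (subjects, models, regions, tasks),
    e.g. (10, 2, 3, 2). Divides by the mean over subjects and models,
    i.e. one average of 10 x 2 = 20 values per (region, task) cell."""
    cell_mean = D.mean(axis=(0, 1), keepdims=True)  # shape (1, 1, 3, 2)
    return D / cell_mean
```

After this, every (region, task) cell has mean 1 across subjects and models, so a model-by-task interaction can no longer arise merely because some tasks evoke globally stronger effects.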

I would now like confidence intervals for the estimated interaction strength. For that I use the bias-corrected and accelerated (BCa) bootstrap CI as implemented in Matlab's Statistics and Machine Learning Toolbox, with 10,000 resamples of the 10 subjects. The result:

Circles indicate the original value of the statistic. Now the weird thing is that for the third contrast, the confidence interval lies far away from the original value, and it is tiny. In fact, its width is numerically 0.

To find out what’s happening, I stepped into Matlab’s `bootci` function and found that the normalized bias estimate $z_0$ is numerically infinite.
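For reference, the BCa bias correction is $z_0 = \Phi^{-1}\!\left(\#\{\hat\theta^*_b < \hat\theta\}/B\right)$, where $\hat\theta^*_b$ are the bootstrap replicates and $\hat\theta$ the original estimate (Efron & Tibshirani). A stdlib-only sketch (my own illustration, not Matlab's internal code) shows how it blows up:

```python
import math
from statistics import NormalDist

def z0(bootstat, theta_hat):
    # fraction of bootstrap replicates strictly below the original estimate
    p = sum(b < theta_hat for b in bootstat) / len(bootstat)
    # Matlab's norminv(0) / norminv(1) return -Inf / +Inf; Python's
    # NormalDist.inv_cdf raises at the endpoints, so handle them explicitly.
    if p == 0.0:
        return -math.inf
    if p == 1.0:
        return math.inf
    return NormalDist().inv_cdf(p)
```

If the original value lies entirely outside the bootstrap distribution, $p$ is exactly 0 or 1 and $z_0$ becomes infinite, which then degenerates the BCa percentile adjustment.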

How does that happen? Here are plots of the bootstrap replicates of the statistic:

Blue dots indicate the bootstrap replicates and the black horizontal lines the original values, one plot per contrast.

As you can see, for the first and second contrasts the original value lies somewhere in the middle of the bootstrap distribution, but for the third it lies far outside.

So, part of the problem is obviously a numerical one: there are so few bootstrap replicates larger than the original value that $z_0$ evaluates to `Inf`. This might be fixed by more bootstrap replications, albeit a very large number.

More importantly, however, the estimated bias appears to be genuinely very large for contrast 3. This might be a bias-estimation problem, e.g. because the bootstrap distribution has a heavy right tail (probably since only 20 values enter the standardization). However, the same is true for contrasts 1 and 2, without the same consequence.

I then looked at the data, to see whether there is something particular about the values entering contrast 3:

The third contrast concerns the values in the lower row, though standardization involves all values. I can’t see anything particular about them, except that they tend to be smaller than in the other regions.

I also tried (by trial and error) to find subjects whose exclusion reduces the problem, but did not succeed.

Since the contrasts are defined within region, I also tried a modified standardization where I average across subjects, models, *and* regions (60 values), per task, and divide by that, but it didn’t remove the problem.
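Assuming D is stored as an array of shape (subjects, models, regions, tasks), as a hypothetical layout, this modified standardization is a one-line change: average over subjects, models, and regions, leaving one divisor per task:

```python
import numpy as np

def standardize_per_task(D):
    """D: assumed shape (subjects, models, regions, tasks).
    Divides by one average of 10 x 2 x 3 = 60 values per task."""
    task_mean = D.mean(axis=(0, 1, 2), keepdims=True)  # shape (1, 1, 1, 2)
    return D / task_mean
```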

**My question:** How would you deal with this problem, or how would you go about its diagnosis? How would you explain that there’s a problem only for the third contrast, though the three contrasts have the same form and there aren’t any obvious irregularities in the underlying data? If you think that the problem has something to do with the standardization, do you have an idea how to stabilize it?