*Bounty: 50*

I have spent a lot of time reading book chapters, articles, online tutorials, etc., but with no clear answer (mostly because they only describe one-way ANOVA or other very specific applications). There have also been many similar questions on this site, but again no satisfactory answer for my purposes.

**In essence, I’d like to know the clear and straightforward (non-technical), and completely generalizable (and practically implementable) answer for how to test the (in)famous ANOVA normality assumption given any number of within-subject or between-subject factors (with any number of levels).**

At least this tutorial advises testing the normality of every single cell, i.e. every possible combination of the levels of all factors, but it gives no references or detailed reasoning, and the approach seems quite extreme for complex designs. Most other sources (e.g. this answer, this book chapter, or this video tutorial) suggest that only the residuals should be tested (regardless of within/between factors). Even if I assume that the latter is true, the question remains: which residuals should be tested?

In the following, I use the output of the `R` function `stats::aov` to illustrate some potential answers with an example.

I have a hypothetical dataset. Each individual subject is identified by `subject_id`. There are two between-subject factors, `btwn_X` and `btwn_Y`, and two within-subject factors, `wthn_X` and `wthn_Y`. The aov object `aov_obj` returns the following:

```
Grand Mean: 523.3064

Stratum 1: subject_id

Terms:
                  btwn_X  btwn_Y btwn_X:btwn_Y Residuals
Sum of Squares  393209.0 45184.5        9583.3 1768261.2
Deg. of Freedom        1       1             1       132

Residual standard error: 115.7407
9 out of 12 effects not estimable
Estimated effects may be unbalanced

Stratum 2: subject_id:wthn_X

Terms:
                   wthn_X btwn_X:wthn_X btwn_Y:wthn_X btwn_X:btwn_Y:wthn_X Residuals
Sum of Squares  273876.35     262325.82        192.19               663.69 199702.58
Deg. of Freedom         1             1             1                    1       132

Residual standard error: 38.89599
4 out of 8 effects not estimable
Estimated effects may be unbalanced

Stratum 3: subject_id:wthn_Y

Terms:
                 wthn_Y btwn_X:wthn_Y btwn_Y:wthn_Y btwn_X:btwn_Y:wthn_Y Residuals
Sum of Squares  20514.4       27879.9       85348.2              15667.3  325852.8
Deg. of Freedom       1             1             1                    1       132

Residual standard error: 49.68483
4 out of 8 effects not estimable
Estimated effects may be unbalanced

Stratum 4: subject_id:wthn_X:wthn_Y

Terms:
                wthn_X:wthn_Y btwn_X:wthn_X:wthn_Y btwn_Y:wthn_X:wthn_Y btwn_X:btwn_Y:wthn_X:wthn_Y Residuals
Sum of Squares        1042.83              1070.27              5202.57                     5791.20  78756.74
Deg. of Freedom             1                    1                    1                           1       132

Residual standard error: 24.42626
Estimated effects may be unbalanced
```
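For anyone who wants to experiment, a model with the same four error strata can be fit roughly as follows. This is only a sketch on simulated data: the data frame `dat`, the response `y`, the factor level labels, and the effect sizes are all placeholders invented for illustration, so the numbers will not match the output above. The subject count of 136 is inferred from the 132 residual degrees of freedom, and the `Error()` term is my assumption about how such strata are specified.

```r
# Simulated stand-in for the dataset (all values are hypothetical).
set.seed(1)
n_subj <- 136  # assumed: 132 residual df + 4 between-subject cells

# One row per subject and within-subject cell (136 * 2 * 2 = 544 rows).
dat <- expand.grid(
  subject_id = factor(seq_len(n_subj)),
  wthn_X     = factor(c("x1", "x2")),
  wthn_Y     = factor(c("y1", "y2"))
)

# Assign each subject to one of the four between-subject cells.
id <- as.integer(dat$subject_id)
dat$btwn_X <- factor(ifelse(id %% 2 == 0, "a", "b"))
dat$btwn_Y <- factor(ifelse(id %% 4 < 2, "c", "d"))

# Response: grand mean + subject-level noise + observation-level noise.
dat$y <- 520 + rnorm(n_subj, sd = 100)[id] + rnorm(nrow(dat), sd = 30)

# Mixed design: all fixed effects crossed, with one error stratum per
# combination of subject and within-subject term.
aov_obj <- aov(
  y ~ btwn_X * btwn_Y * wthn_X * wthn_Y +
    Error(subject_id / (wthn_X * wthn_Y)),
  data = dat
)

print(names(aov_obj))  # strata names, as used with `$` below
```

The strata names printed at the end are the ones used to index `aov_obj` when extracting residuals.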

I can access the following residuals (see here for more details):

```
aov_obj$subject_id$residuals
aov_obj$`subject_id:wthn_X`$residuals
aov_obj$`subject_id:wthn_Y`$residuals
aov_obj$`subject_id:wthn_X:wthn_Y`$residuals
```

Based on this answer, it would seem that each of these residual vectors should be tested separately for normality. Alternatively, perhaps only the residuals of the last stratum, `subject_id:wthn_X:wthn_Y`, should be tested. (Then again, perhaps each cell should be tested separately, and the residuals can be ignored.)
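To make the three candidates concrete, here is a minimal sketch of what each would look like in practice. It runs on simulated data (the data frame `dat`, the response `y`, the level labels, and the noise parameters are hypothetical placeholders), and it uses the Shapiro-Wilk test purely as an example of a normality check. It is not meant to suggest which option is the right one.

```r
# Simulated stand-in for the dataset (all values are hypothetical).
set.seed(1)
n_subj <- 136
dat <- expand.grid(
  subject_id = factor(seq_len(n_subj)),
  wthn_X     = factor(c("x1", "x2")),
  wthn_Y     = factor(c("y1", "y2"))
)
id <- as.integer(dat$subject_id)
dat$btwn_X <- factor(ifelse(id %% 2 == 0, "a", "b"))
dat$btwn_Y <- factor(ifelse(id %% 4 < 2, "c", "d"))
dat$y <- 520 + rnorm(n_subj, sd = 100)[id] + rnorm(nrow(dat), sd = 30)

aov_obj <- aov(
  y ~ btwn_X * btwn_Y * wthn_X * wthn_Y +
    Error(subject_id / (wthn_X * wthn_Y)),
  data = dat
)

# Candidate 1: test the residuals of every error stratum separately.
# The "(Intercept)" stratum is skipped because it has no residual df.
strata <- setdiff(names(aov_obj), "(Intercept)")
p_strata <- sapply(strata, function(s) {
  shapiro.test(aov_obj[[s]]$residuals)$p.value
})

# Candidate 2: test only the residuals of the last stratum.
p_last <- p_strata[["subject_id:wthn_X:wthn_Y"]]

# Candidate 3: ignore residuals and test the raw response in each of the
# 2 x 2 x 2 x 2 = 16 design cells.
p_cells <- aggregate(
  y ~ btwn_X + btwn_Y + wthn_X + wthn_Y,
  data = dat,
  FUN = function(v) shapiro.test(v)$p.value
)

print(p_strata)
print(p_last)
print(p_cells)
```

Candidate 1 yields four p-values, candidate 2 one, and candidate 3 sixteen, which already shows how quickly the per-cell approach grows with the complexity of the design.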