*Bounty: 50*

We are studying 3 different proteins, each under 9 different conditions, measured at 3 timepoints (Days 1, 2 and 3), with 3 biological replicates. That gives 81 experiments (3 proteins × 9 conditions × 3 replicates), each read on three consecutive days, for a total of 243 observations in a balanced design.
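As a sanity check on those counts, here is a small sketch that rebuilds the design skeleton in R. The factor labels (`P1`, `C1`, ...) are placeholders I made up for illustration, not our real protein or condition names.

```
# Rebuild the design skeleton with placeholder labels (not the real names)
design <- expand.grid(
  protein      = paste0("P", 1:3),
  condition    = paste0("C", 1:9),
  bioreplicate = 1:3
)
design$experiment <- factor(seq_len(nrow(design)))  # one id per experiment

# Cross every experiment with the three daily readings
# (merge() with no shared columns returns the Cartesian product)
long <- merge(design, data.frame(day = factor(1:3)))

n_experiments  <- nrow(design)  # 3 * 9 * 3
n_observations <- nrow(long)    # n_experiments * 3

# Balanced: every protein x condition x day cell holds the 3 replicates
cell_counts <- table(long$protein, long$condition, long$day)
all(cell_counts == 3)
```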

We would like to show which of these proteins and conditions are statistically different from each other. That is, we would like a comparison between proteins, and a comparison of the conditions within each protein. For this we were thinking of using a repeated-measures ANOVA (in R).

Here is a minimal working example (MWE) with the dataset:

```
library(RCurl)
library(dplyr)

# Download the CSV and read it with the design columns as factors
raw.data <- getURL("https://gist.githubusercontent.com/jp-um/1849ac4ac61411d0751cdbec4406e0cd/raw/4b014f986085665e75806c38a25f39093b2d19df/anon.csv")
exp.data <- read.csv(text = raw.data,
                     colClasses = c("experiment"   = "factor",
                                    "protein"      = "factor",
                                    "condition"    = "factor",
                                    "day"          = "factor",
                                    "bioreplicate" = "factor"))
summary(exp.data, maxsum = 10)

# Repeated-measures ANOVA with day as the within-experiment factor
aov.model <- aov(density ~ protein*condition*day + Error(experiment/day), data = exp.data)
summary(aov.model)
```

The output is:

```
Error: experiment
                  Df   Sum Sq Mean Sq F value Pr(>F)
protein            2 10729989 5364994  1166.3 <2e-16 ***
condition          8 16430568 2053821   446.5 <2e-16 ***
protein:condition 16 29649758 1853110   402.8 <2e-16 ***
Residuals         54   248404    4600
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Error: experiment:day
                       Df  Sum Sq Mean Sq F value  Pr(>F)
day                     2   92976   46488   13.24 7.2e-06 ***
protein:day             4 1776592  444148  126.49 < 2e-16 ***
condition:day          16 3419459  213716   60.87 < 2e-16 ***
protein:condition:day  32 7415908  231747   66.00 < 2e-16 ***
Residuals             108  379221    3511
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
```
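If I read the two strata correctly, each F value is the term's mean square divided by the residual mean square of its own stratum: the between-experiment terms are tested against the `experiment` residual (54 df), and every term involving `day` is tested against the `experiment:day` residual (108 df). The sketch below is only arithmetic on the numbers printed above (not a refit), so small rounding differences are expected.

```
# Residual mean squares, recomputed as Sum Sq / Df from the printed table
ms_res_between <- 248404 / 54    # "Error: experiment" stratum
ms_res_within  <- 379221 / 108   # "Error: experiment:day" stratum

# Between-experiment terms: Mean Sq / between-experiment residual
f_between <- c("protein"           = 5364994,
               "condition"         = 2053821,
               "protein:condition" = 1853110) / ms_res_between

# Terms involving day: Mean Sq / within-experiment residual
f_within <- c("day"                   = 46488,
              "protein:day"           = 444148,
              "condition:day"         = 213716,
              "protein:condition:day" = 231747) / ms_res_within

round(f_between, 1)
round(f_within, 2)

# p-value for the day main effect from the F distribution (2 and 108 df)
p_day <- pf(f_within[["day"]], df1 = 2, df2 = 108, lower.tail = FALSE)
p_day
```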

I have a few questions, please:

- Is repeated-measures ANOVA the way to go here (as opposed to mixed models)?
- Is this the correct way to specify the formula? Will this give me repeated measures over the timepoints (Day)? Specifically, what does `Error(experiment/day)` mean? My interpretation is that we have a random effect based on the biological replicate (`experiment` in my case) and repeated readings for the timepoint (`day`).
- What is the equivalent way to write this with `ezANOVA`?
- Is the above ANOVA equivalent to the linear mixed-effects model `lme(density ~ protein*condition*day, random = ~1 | experiment/day, data = exp.data)`?
- The output tells me that there are differences, but I would like to know which combinations differ. I know I can use a post-hoc test for this, but I found that `TukeyHSD` does not work on repeated-measures models. I have found that I can use `glht` instead, but I am unable to interpret its output. (I tried `glht(lme.model, linfct=mcp(protein="Tukey", condition="Tukey"))`, but I am not sure this is correct.)
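One way to probe the `Error(experiment/day)` question is a small simulation. The sketch below uses synthetic data with the same layout as ours (the values, the seed and the effect sizes are arbitrary assumptions, not our measurements) and compares `Error(experiment/day)` with the simpler `Error(experiment)`. With a single reading per experiment per day, my understanding is that the `experiment:day` stratum is just the within-experiment residual, so both specifications should give the same F values for the terms involving `day`. I would welcome confirmation of that.

```
set.seed(1)  # synthetic data only, not the real measurements

# Same layout as the real design: 3 proteins x 9 conditions x 3 replicates
design <- expand.grid(protein      = factor(1:3),
                      condition    = factor(1:9),
                      bioreplicate = factor(1:3))
design$experiment <- factor(seq_len(nrow(design)))
sim <- merge(design, data.frame(day = factor(1:3)))  # 3 readings each

# Arbitrary random intercept per experiment plus reading-level noise
exp_effect  <- rnorm(nlevels(sim$experiment), sd = 50)
sim$density <- 1000 + exp_effect[as.integer(sim$experiment)] +
  rnorm(nrow(sim), sd = 30)

fit_nested <- aov(density ~ protein*condition*day + Error(experiment/day),
                  data = sim)
fit_simple <- aov(density ~ protein*condition*day + Error(experiment),
                  data = sim)

# F values for the terms involving day in each specification
f_nested <- summary(fit_nested)[["Error: experiment:day"]][[1]][["F value"]]
f_simple <- summary(fit_simple)[["Error: Within"]][[1]][["F value"]]

all.equal(f_nested, f_simple)
```

If the two sets of F values agree, the `/day` part of the error term adds nothing beyond naming the residual stratum in this design.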

Apologies for the (many, and rather basic I am afraid) questions. I would appreciate your time and help.

Many thanks,