## #StackBounty: #bayesian #estimation #inference #prevalence Optimization of pool size and number of tests for prevalence estimation via …

### Bounty: 100

I’m trying to devise a protocol for pooling lab tests from a cohort in order to get prevalence estimates using as few reagents as possible.

Assuming perfect sensitivity and specificity (including them in the answer would be a plus), if I group testing material in pools of size $$s$$, and given an underlying (I don't like the term "real") mean probability $$p$$ of the disease, the probability of a pool being positive is:

$$p_w = 1 - (1 - p)^s$$

If I run $$w$$ such pools, the probability of having $$k$$ positive wells given a certain prevalence is:

$$p(k \mid w, p) = \binom{w}{k} \left(1 - (1 - p)^s\right)^k (1 - p)^{s(w-k)}$$

that is, $$k \sim \mathrm{Binom}(w, 1 - (1 - p)^s)$$.
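As a quick sanity check on this binomial model, here is a minimal Python sketch of the two probabilities. The function names and the example values (5% prevalence, pools of 10, 20 pools) are illustrative assumptions, not part of the protocol.

```python
from math import comb


def pool_positive_prob(p: float, s: int) -> float:
    """Probability that a pool of s samples contains at least one positive."""
    return 1.0 - (1.0 - p) ** s


def prob_k_positive(k: int, w: int, p: float, s: int) -> float:
    """Binomial probability of observing k positive pools out of w."""
    q = pool_positive_prob(p, s)
    return comb(w, k) * q ** k * (1.0 - q) ** (w - k)


# Assumed example values: p = 0.05, s = 10, w = 20.
# The probabilities over k = 0..w should sum to 1.
total = sum(prob_k_positive(k, 20, 0.05, 10) for k in range(21))
print(total)
```

With $$s = 1$$ the pool probability reduces to $$p$$ itself, which is the ordinary unpooled case.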

To get $$p$$ I just need to maximize the likelihood $$p(k \mid w, p)$$ or use the formula $$1 - \sqrt[s]{1 - k/w}$$ (not really sure about this second one…).
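The two routes should agree: the maximum-likelihood estimate of a binomial proportion is $$k/w$$, and inverting $$1 - (1 - p)^s = k/w$$ gives the closed form. The sketch below compares the closed form with a brute-force grid search over the log-likelihood. The counts ($$k = 6$$, $$w = 20$$, $$s = 10$$) and the grid are assumptions chosen only for illustration.

```python
from math import log


def mle_prevalence(k: int, w: int, s: int) -> float:
    """Closed-form estimate 1 - (1 - k/w)^(1/s)."""
    return 1.0 - (1.0 - k / w) ** (1.0 / s)


def log_likelihood(p: float, k: int, w: int, s: int) -> float:
    """Binomial log-likelihood in p, dropping the constant binomial coefficient."""
    q = 1.0 - (1.0 - p) ** s
    return k * log(q) + (w - k) * log(1.0 - q)


# Assumed example data: 6 positive pools out of 20, pools of size 10.
k, w, s = 6, 20, 10
p_hat = mle_prevalence(k, w, s)

# Grid restricted to (0, 0.5) so that 1 - q never underflows to 0 inside log().
grid = [i / 10000 for i in range(1, 5000)]
p_grid = max(grid, key=lambda p: log_likelihood(p, k, w, s))

print(p_hat, p_grid)
```

Note the edge cases: with $$k = 0$$ the estimate is 0, and with $$k = w$$ (every pool positive) it is 1, so very large pools can make the estimate uninformative.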

My question is: how do I optimize $$s$$ (maximize) and $$w$$ (minimize) according to a prior $$p$$ in order to have the most precise estimates, below a certain level of error?
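One possible way to frame this, offered as a sketch rather than a definitive answer, is the large-sample (delta-method) approximation. With $$q = 1 - (1 - p)^s$$, it gives $$\mathrm{Var}(\hat{p}) \approx q(1 - q) / \left(w \, [s (1 - p)^{s-1}]^2\right)$$. For a guessed $$p$$, one can then scan pool sizes for the smallest approximate variance and compute how many pools reach a target standard error. The function names, the guessed prevalence, the search cap `max_s` and the target error are all assumptions. The approximation also ignores the small-$$w$$ risk that every pool comes back positive.

```python
from math import ceil


def approx_variance(p: float, s: int, w: int) -> float:
    """Delta-method approximation to Var(p_hat) for w pools of size s."""
    q = 1.0 - (1.0 - p) ** s               # probability a pool is positive
    dq_dp = s * (1.0 - p) ** (s - 1)       # derivative of q with respect to p
    return q * (1.0 - q) / (w * dq_dp ** 2)


def best_pool_size(p: float, w: int, max_s: int = 200) -> int:
    """Pool size in 1..max_s with the smallest approximate variance."""
    return min(range(1, max_s + 1), key=lambda s: approx_variance(p, s, w))


def pools_needed(p: float, s: int, target_se: float) -> int:
    """Smallest w whose approximate standard error is at most target_se."""
    return ceil(approx_variance(p, s, 1) / target_se ** 2)


# Assumed planning values: guessed prevalence 5%, target standard error 0.01.
p_guess = 0.05
s_best = best_pool_size(p_guess, w=20)
w_min = pools_needed(p_guess, s_best, target_se=0.01)
print(s_best, w_min)
```

Because the result depends on the guessed $$p$$, it seems prudent to repeat the calculation over the plausible range of the prior and, if sensitivity and specificity are not perfect, to treat the numbers as optimistic.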
