#StackBounty: #bayesian #estimation #inference #prevalence Optimization of pool size and number of tests for prevalence estimation via …

Bounty: 100

I’m trying to devise a protocol for pooling lab tests from a cohort in order to get prevalence estimates using as few reagents as possible.

Assuming perfect sensitivity and specificity (including them in the answer would be a plus), if I group the testing material into pools of size $s$, and given an underlying (I don’t like the term “real”) mean probability $p$ of the disease, the probability of a pool being positive is:

$$p_w = 1 - (1 - p)^s$$

If I run $w$ such pools, the probability of observing $k$ positive wells given a certain prevalence is:

$$p(k \mid w, p) = \binom{w}{k} \left(1 - (1 - p)^s\right)^k (1 - p)^{s(w-k)}$$

that is, $k \sim \mathrm{Binom}(w, 1 - (1 - p)^s)$.
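To make the model concrete, here is a minimal sketch that evaluates the pool positivity probability and the binomial probability of $k$ positive wells. The function names and the values of `p`, `s`, and `w` are illustrative assumptions, not part of any protocol.

```python
from math import comb


def pool_positive_prob(p: float, s: int) -> float:
    """Probability that a pool of s samples contains at least one positive."""
    return 1.0 - (1.0 - p) ** s


def positive_pools_pmf(k: int, w: int, p: float, s: int) -> float:
    """P(k positive pools out of w), i.e. the Binom(w, p_w) pmf."""
    p_w = pool_positive_prob(p, s)
    return comb(w, k) * p_w ** k * (1.0 - p_w) ** (w - k)


# Illustrative values only (assumed, not taken from real data).
p, s, w = 0.05, 10, 20
pmf = [positive_pools_pmf(k, w, p, s) for k in range(w + 1)]

print(f"p_w = {pool_positive_prob(p, s):.4f}")
print(f"sum of pmf over k = {sum(pmf):.6f}")  # should be very close to 1
```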

To get $p$ I just need to maximize the likelihood $p(k | w, p)$ or use the formula $1 - \sqrt[s]{1 - k/w}$ (not really sure about this second one…).
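As a sanity check on the closed-form expression: $k/w$ is the maximum-likelihood estimate of the pool positivity probability, and that probability is a monotone function of $p$, so inverting it should give the same value as maximizing the likelihood directly (for $k < w$). The sketch below compares the two numerically with a simple grid search; the counts `k`, `w`, `s` are made-up example values and the helper names are hypothetical.

```python
from math import comb, log


def log_likelihood(p: float, k: int, w: int, s: int) -> float:
    """Binomial log-likelihood of k positive pools out of w, pools of size s."""
    p_w = 1.0 - (1.0 - p) ** s
    # log(1 - p_w) is written as s * log(1 - p) to avoid log(0) in floating point.
    return log(comb(w, k)) + k * log(p_w) + (w - k) * s * log(1.0 - p)


def closed_form_estimate(k: int, w: int, s: int) -> float:
    """Invert p_w = 1 - (1 - p)^s at p_w = k / w."""
    return 1.0 - (1.0 - k / w) ** (1.0 / s)


# Illustrative counts only (assumed).
k, w, s = 8, 20, 10

# Crude grid search over p in (0, 1) with step 1e-4.
grid = [i / 10000 for i in range(1, 10000)]
p_grid = max(grid, key=lambda p: log_likelihood(p, k, w, s))
p_closed = closed_form_estimate(k, w, s)

print(f"grid-search maximum: {p_grid:.4f}")
print(f"closed-form estimate: {p_closed:.4f}")
```

The two values should agree up to the grid resolution. When every pool is positive ($k = w$) the closed form returns 1, which is a degenerate estimate.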

My question is: how do I optimize $s$ (maximize) and $w$ (minimize) according to a prior $p$ in order to have the most precise estimates, below a certain level of error?
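One way to make the trade-off concrete, offered as a rough sketch rather than an answer: treat the prior as a single guessed value of $p$ and use the delta-method (large-$w$) approximation to the variance of the closed-form estimator, $\mathrm{Var}(\hat{p}) \approx \frac{(1 - (1-p)^s)\,(1-p)^{2-s}}{s^2 w}$. This assumes perfect sensitivity and specificity and ignores the chance that all pools come back positive; the function names, the guessed prevalence, and the target error are all hypothetical.

```python
from math import ceil, sqrt


def approx_std_error(p: float, s: int, w: int) -> float:
    """Delta-method standard error of the pooled estimate of p (asymptotic)."""
    q = 1.0 - p
    variance = (1.0 - q ** s) * q ** (2 - s) / (s ** 2 * w)
    return sqrt(variance)


def smallest_w(p: float, s: int, max_se: float) -> int:
    """Smallest number of pools whose approximate standard error is <= max_se."""
    q = 1.0 - p
    variance_per_pool = (1.0 - q ** s) * q ** (2 - s) / s ** 2
    return max(1, ceil(variance_per_pool / max_se ** 2))


# Assumed inputs: a guessed prevalence and a target standard error.
p_guess, max_se = 0.05, 0.01

for s in (1, 5, 10, 20, 40):
    w = smallest_w(p_guess, s, max_se)
    se = approx_std_error(p_guess, s, w)
    print(f"s={s:3d}  pools needed={w:4d}  samples used={s * w:5d}  approx SE={se:.4f}")
```

The table only shows how the required number of pools moves with the pool size under this approximation; it says nothing about how sensitive the result is to a wrong guess for $p$, which a full treatment of the prior would need to address.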
