*Bounty: 300*

*Bounty: 300*

**Short version**

I am trying to analytically solve/approximate the composite likelihood that results from independent Poisson draws and further sampling with or without replacement (I don’t really care which one). I want to use the likelihood with MCMC (Stan), so I need the solution only up to a constant term. Ultimately, I want to model a process where the initial draws are from neg. binomial distribution, but I think I will be able to get there with a solution for the Poisson case.

It is well possible that the solution is not feasible (I don’t understand the math enough to be able to tell whether this is a simple or very difficult problem). I am thus also interested in approximations, negative results or intuition why the problem is probably intractable (e.g. comparing to a known hard problem). Links to useful papers/theorems/tricks that will help me move forward are good answers even if their connection to the problem at hand is not fully worked out.

**Formal statement**

More formally, first $Y = (y_1, …, y_N), y_n sim Pois(lambda_n)$ is drawn independently and then I sample $k$ items at random from all of $Y$ to get $Z = (z_1,…,z_N)$. I.e. I draw $k$ coloured balls from an urn where the amount of balls of color $n$ is drawn from $Pois(lambda_n)$. Here, $k$ is assumed known and fixed and we condition on $forall n: y_n geq z_n$. Technically the sampling is done without replacement, but assuming sampling with replacement should not be a big deal.

I have tried two approaches to solve for sampling without replacement (as this seemed to be the easier case due to some terms cancelling out), but got stuck with both. The likelihood when sampling without replacement is:

$$

P(Z = (z_1, …, z_N) | Lambda = (lambda_1, …, lambda_N)) = frac{

sum_{Y;forall n: y_n geq z_n} left( frac{prod_{n=1}^N{y_n choose z_n}}{ {sum_{n=1}^N y_n} choose k}

prod_{n=1}^N Poisson(y_n |lambda_n)

right)

}{

prod_{n=1}^N P(y_n geq z_n|lambda_n)

}

$$

**Attempted solutions**

Let’s rephrase the likelihood using $bar{y}*n = y_n – z_n, s = sum*{n=1}^N bar{y}_n$ and move all terms independent of $Y$ outside of the summation.

$$

P(Z | Lambda) = c_1 sum_{bar{Y} geq 0}

frac{

prod_{n=1}^N frac{(bar{y}_n + z_n)!}{z_n! bar{y}_n!}

frac{1}{(bar{y}_n + z_n)!} lambda_n^{bar{y}_n + z_n} e^{-lambda_n}

}{

frac {(k + s)!}{k!s!}

}

$$

where $c_1 = left( prod_{n=1}^N P(y_n geq z_n|lambda_n) right)^{-1}$, further

$$

P(Z|Lambda) = c_2 sum_{bar{Y} geq 0}

frac{s!}{s+k!} prod_{n=1}^N frac{lambda_n^{bar{y}_n}}{bar{y}_n !}

$$

where $c_2 = c_1 k! prod_{n=1}^N frac{e^{-lambda_n} lambda_n^{z_n}}{z_n !}$

**Aproach 1: sum over $s$**

Here, I factor the multidimensional infinite sum into infinite summation over $s$ and finite product over possible $y_n$:

$$

P(Z|Lambda) = c_2

sum_{s=0}^{infty} frac{s!}{(s + k)!} prod_{{Y,sum_n bar{y}*n = s}}
prod*{n=1}^N frac{lambda^{bar{y}_n}}{bar{y}_n!}

$$

We can then get rid of the inner products by realizing that for any given $s$ , $n$ and $0 leq j leq s$ there are ${s – j + N – 2}choose{N – 2}$ combinations of $bar{Y}, sum_n bar{y}_n = s$ where $bar{y}_n = j$ (because we are splitting $s – j$ balls into $N – 1$ categories), hence:

$$

P(Z | Lambda) =

c_2 sum_{s=0}^{infty} frac{s!}{(s + k)!}

left( prod_{n=1}^N lambda_n^{sum_{j=0}^s j {{s – j + N – 2} choose {N – 2}}} right)

left( prod_{j=0}^s j!^{N {{s – j + N – 2} choose {N – 2}} } right)^{-1}

$$

Wolfram Alpha suggests a further simplification of the exponents of $lambda_n$, giving:

$$

P(Z | Lambda) =

c_2 sum_{s=0}^{infty} frac{s!}{(s + k)!}

left( prod_{n=1}^N lambda_n^{{N + s – 1} choose {N}} right)

left( prod_{j=0}^s j!^{N {{s – j + N – 2} choose {N – 2}} } right)^{-1}

$$

but then I’m stuck.

**Aproach 2: recursive formula**

I tried to solve summation over each dimension of $Y$ separately, leading to a recursive formula. The bottom is at the last element of $Y$ and $alpha_k$ represents the sum of $bar{y}_1, … bar{y}_k$:

$$

S_N(alpha_{N-1}) = sum_{bar{y}*N = 0}^{infty}
frac{lambda_N^{bar{y}_N}}{bar{y}_N!}
frac{(alpha*{N-1} + bar{y}

*N)!}{(alpha*{N-1} + bar{y}_N + k)!}

$$

$$

S_n(alpha_{n-1}) = sum_{bar{y}*n = 0}^{infty}
frac{lambda_n^{bar{y}_n}}{bar{y}_n!}
S*{n+1}(alpha_{n-1} + bar{y}_n)

$$

$$

p(Z | Lambda) = c_2 S_1(0)

$$

Wolfram Alpha suggests rephrasing $S_N$ in terms of the confluent hypergeometric function of the first kind (${}_1 F_1$):

$$

S_N(alpha_{N-1}) = frac{alpha_{N-1}!}{(alpha_{N-1} + k)!} {}*1 F_1(alpha*{N-1} + 1; k + alpha_{N-1} +1, lambda_N)

$$

But then I am stuck and unable to perform the recursion step.