# #StackBounty: #confidence-interval #binomial #pdf #proportion #cdf Calculating a Confidence Interval for a Proportion for a Sample of D…

### Bounty: 50

I’m interested in a (preferably analytic) solution or approximation to the following problem:

Let $$s_1$$ be a sample of size $$N_1$$ from an unknown distribution, with proportion of successes $$p_1$$. Let $$s_2$$ be an independent sample of size $$N_2$$ from the same distribution, with proportion of successes $$p_2$$. Given $$N_1$$, $$p_1$$, and $$N_2$$, can we calculate a confidence interval for $$p_2$$?

I would love a general purpose analytic solution if anyone has one, but for simplicity I am fine with considering the case where both $$s_1$$ and $$s_2$$ satisfy the conditions for their sampling distributions to be approximated by a Gaussian distribution.

Now, my approaches to solving this have led me to 2 options:

1. Find upper and lower bounds for the confidence interval of $$p$$ (the population proportion of “successes”), and plug these back into confidence intervals for $$p_2$$ using the sampling distribution for $$p$$ with size $$N_2$$. Then take the minimum of the lower bounds and the maximum of the upper bounds of those intervals; or
2. Treat $$p$$ as a normally distributed random variable with $$\mu=p_1$$ and $$\sigma=\sqrt{\frac{p_1(1-p_1)}{N_1}}$$, which would imply the CDF for $$p_2$$ can be found by:

$$CDF(x) = \int_0^1{NormPDF\left(\frac{y-p_1}{\sqrt{\frac{p_1(1-p_1)}{N_1}}}\right)\cdot NormCDF\left(\frac{x-y}{\sqrt{\frac{y(1-y)}{N_2}}}\right)dy}$$

where $$NormPDF$$ and $$NormCDF$$ are the PDF and CDF functions for the standard normal distribution.
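For anyone who wants to check a candidate approximation against this integral, here is a minimal numerical sketch of approach 2 in Python (standard library only). The function names `predictive_cdf` and `predictive_interval` are made up for illustration. Two choices in it are assumptions of the sketch rather than part of the formula above: the normal PDF is scaled by $$1/\sigma$$ so that it is a density in $$y$$, and the result is renormalized to account for the probability mass the normal places outside $$(0, 1)$$. It also assumes $$0 < p_1 < 1$$ and that $$N_1$$ is not so large that the density becomes narrower than the integration grid.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def predictive_cdf(x, p1, n1, n2, steps=2000):
    """Approximate P(p2 <= x) with a midpoint rule over y in (0, 1)."""
    sigma1 = math.sqrt(p1 * (1 - p1) / n1)
    h = 1.0 / steps
    total = 0.0
    weight = 0.0
    for i in range(steps):
        y = (i + 0.5) * h  # midpoints never hit y = 0 or y = 1
        # Density of p at y (assumption: NormPDF scaled by 1 / sigma1).
        density = STD_NORMAL.pdf((y - p1) / sigma1) / sigma1
        sigma2 = math.sqrt(y * (1 - y) / n2)
        total += density * STD_NORMAL.cdf((x - y) / sigma2) * h
        weight += density * h
    # Renormalize for the mass of the normal that falls outside (0, 1).
    return total / weight


def predictive_interval(p1, n1, n2, level=0.95):
    """Invert the CDF by bisection to get an equal-tailed interval for p2."""

    def quantile(q):
        lo, hi = 0.0, 1.0
        for _ in range(40):
            mid = (lo + hi) / 2
            if predictive_cdf(mid, p1, n1, n2) < q:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    alpha = 1 - level
    return quantile(alpha / 2), quantile(1 - alpha / 2)


if __name__ == "__main__":
    print(predictive_interval(0.5, 100, 100))
```

This does not give a closed form, but it provides reference values that an $$erf$$-based approximation could be compared against before graphing it.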

The problem with approach 1 is that the resulting interval is much wider than I would ideally want (this is what I am currently using in my equations). The problem with approach 2 is that I have no idea how to convert this into an analytic function (presumably through an approximation involving $$\operatorname{erf}$$, since I assume the integral has no closed-form solution). My goal is to graph these intervals as a function of $$p_1$$ in Desmos, along with other sampling/prediction strategies for comparison, which is why I would really like an analytic solution or approximation.
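To make approach 1 concrete, here is a rough Python sketch of the nested-interval calculation as described above. The helper names `wald_interval` and `nested_interval` are hypothetical, and the sketch assumes the simple normal-approximation (Wald) interval is used at both stages with the same confidence level, which the description above does not specify.

```python
import math
from statistics import NormalDist


def wald_interval(p, n, z):
    """Normal-approximation interval p +/- z * sqrt(p(1-p)/n), clipped to [0, 1]."""
    half = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half), min(1.0, p + half)


def nested_interval(p1, n1, n2, level=0.95):
    """Approach 1: bound p from sample 1, then bound p2 from each bound of p."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    p_lo, p_hi = wald_interval(p1, n1, z)   # interval for the population p
    lo_a, hi_a = wald_interval(p_lo, n2, z)  # interval for p2 if p = p_lo
    lo_b, hi_b = wald_interval(p_hi, n2, z)  # interval for p2 if p = p_hi
    # Take the extremes of the two inner intervals (assumes same level twice).
    return min(lo_a, lo_b), max(hi_a, hi_b)


if __name__ == "__main__":
    print(nested_interval(0.5, 100, 100))
```

With $$p_1 = 0.5$$ and $$N_1 = N_2 = 100$$ this gives a half-width of roughly 0.19, which illustrates the width problem: the two stages each add a full margin of error instead of combining the two sources of variance.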

If someone can solve this, or point me in the right direction toward a solution, that would be greatly appreciated!
