Many questions on this topic have already been addressed on CV, but I am still unsure whether this particular question has been answered directly.
- Is it possible, for any arbitrary parametric distribution, to properly calculate the p-value for a Kolmogorov-Smirnov test where the parameters of the null distribution are estimated from the data?
- Or does the choice of parametric distribution determine if this can be achieved?
- What about the Anderson-Darling and Cramér-von Mises tests?
- What is the general procedure for estimating the correct p-values?
My general understanding of the procedure would be the following. Assume we have data $X$ and a parametric distribution $F(x;\theta)$. Then I would:
- Estimate parameters $\hat\theta_{0}$ for $F(x;\theta)$.
- Calculate the Kolmogorov-Smirnov, Anderson-Darling and Cramér-von Mises test statistics: KS$_{0}$, AD$_{0}$ and CVM$_{0}$.
- For $i=1,2,\ldots,n$:
  - Simulate data $y$ from $F(\cdot;\hat\theta_{0})$
  - Estimate $\hat\theta_{i}$ for $F(y;\theta_{i})$
  - Calculate KS$_{i}$, AD$_{i}$ and CVM$_{i}$ statistics for $F(y;\hat\theta_{i})$
- Calculate $p$-values as the proportion of these statistics that are more extreme than KS$_{0}$, AD$_{0}$ and CVM$_{0}$, respectively.
Is this correct?
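
To make the steps above concrete, here is a minimal Python sketch of what I have in mind. It assumes a normal null distribution fitted by maximum likelihood with `scipy.stats.norm.fit`. The helper names `gof_statistics` and `bootstrap_pvalues`, the replicate count `n_boot` and the seed are my own illustrative choices, not part of any library. "More extreme" is taken to mean "at least as large as the observed statistic".

```python
import numpy as np
from scipy import stats


def gof_statistics(x, cdf):
    """Return (KS, CvM, AD) statistics of sample x against a fully specified CDF."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    # Clip to avoid log(0) in the Anderson-Darling term.
    u = np.clip(cdf(x), 1e-12, 1 - 1e-12)
    i = np.arange(1, n + 1)
    ks = max(np.max(i / n - u), np.max(u - (i - 1) / n))
    cvm = 1.0 / (12 * n) + np.sum((u - (2 * i - 1) / (2 * n)) ** 2)
    ad = -n - np.mean((2 * i - 1) * (np.log(u) + np.log1p(-u[::-1])))
    return np.array([ks, cvm, ad])


def fit_and_score(sample):
    """Fit a normal by MLE (assumed null family), then score the sample against the fit."""
    mu, sigma = stats.norm.fit(sample)
    scores = gof_statistics(sample, lambda t: stats.norm.cdf(t, mu, sigma))
    return scores, (mu, sigma)


def bootstrap_pvalues(x, n_boot=999, seed=0):
    """Parametric-bootstrap p-values; parameters are re-estimated on every replicate."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    observed, (mu0, sigma0) = fit_and_score(x)
    simulated = np.empty((n_boot, 3))
    for b in range(n_boot):
        y = rng.normal(mu0, sigma0, size=x.size)  # simulate from F(.; theta_hat_0)
        simulated[b], _ = fit_and_score(y)        # re-fit, then recompute statistics
    # Proportion of simulated statistics at least as large as the observed ones.
    p = np.mean(simulated >= observed, axis=0)
    return dict(zip(("KS", "CvM", "AD"), p))


if __name__ == "__main__":
    data = np.random.default_rng(42).normal(loc=5.0, scale=2.0, size=100)
    print(bootstrap_pvalues(data))
```

The key point of the sketch is that the parameters are re-estimated inside the loop, so the simulated statistics carry the same estimation effect as the observed ones. Other families would need their own fitting and sampling calls in place of the normal ones.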