## #StackBounty: #hypothesis-testing #variance #heteroscedasticity #breusch-pagan Test of heteroscedasticity for a categorical/ordinal pre…

### Bounty: 100

I have different numbers of measurements from various classes. I used a one-way ANOVA to test whether the class means differ from one another; this relies on the ratio of the between-class variance to the within-class variance.

Now, I want to test whether some classes (basically those with more observations) have a larger variance than expected by chance. What statistical test should I use? I could calculate the sample variance for each class, then compute the \$R^2\$ and p-value for the regression of the sample variance on the class size. In R, I could do

```
summary(lm(sampleVar ~ classSize))
```

But the variance of the estimator of the variance (the sample variance) depends on the sample size, even for random data.
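This dependence can be made explicit. For a sample of size \$n\$ from a normal population (an assumption added here for illustration), the sampling variance of the sample variance is

\$\$\operatorname{Var}(s^2) = \frac{2\sigma^4}{n-1},\$\$

so smaller classes yield noisier variance estimates even when every class has the same true \$\sigma^2\$.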

For example, I generate some random data:

```
library(data.table)
dt <- data.table(obs = rnorm(4000),
                 clabel = as.factor(sample(1:200, size = 4000, replace = TRUE, prob = 5 + 1:200)))
```

I compute the sample variance and class sizes

```
dt[, classSize := length(obs), by = clabel]
dt[, sampleVar := var(obs), by = clabel]
```

and then test to see if variance depends on the class size

```
summary(lm(sampleVar ~ classSize, data = unique(dt[, .(sampleVar, classSize), by = clabel])))

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.858047   0.056605  15.159   <2e-16 ***
classSize   0.006035   0.002393   2.521   0.0125 *
```

There seems to be a dependence of the variance on the class size, but this is simply because the variance of the estimator depends on the sample size. How do I construct a statistical test of whether the variances in the different classes are actually dependent on the class sizes?
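One option (my sketch, not part of the original question) is a permutation test: shuffling the class labels preserves the class sizes, and with them the size-dependent noise of each sample variance, while destroying any genuine association between a class and its variance. The observed slope can then be compared against the permutation null distribution:

```
library(data.table)

# Sketch of a permutation test for the slope of sampleVar ~ classSize,
# using the dt built above. Shuffling clabel keeps class sizes fixed but
# breaks any real link between a class and its variance.
slope <- function(labels, x) {
  d <- data.table(obs = x, clabel = labels)
  s <- d[, .(classSize = .N, sampleVar = var(obs)), by = clabel]
  coef(lm(sampleVar ~ classSize, data = s))[["classSize"]]
}

obs_slope   <- slope(dt$clabel, dt$obs)
null_slopes <- replicate(999, slope(sample(dt$clabel), dt$obs))

# Two-sided permutation p-value (the +1 counts the observed slope itself)
p_perm <- (1 + sum(abs(null_slopes) >= abs(obs_slope))) / (1 + length(null_slopes))
```

The number of permutations (999) is an arbitrary choice; more permutations give a finer-grained p-value.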

If the variable I was regressing against were continuous instead of the ordinal variable classSize, I could have used the Breusch-Pagan test.

For example, I could fit

```
fit <- lm(obs ~ clabel, data = dt)
```
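With that fit, a Breusch-Pagan-style test of whether the residual variance depends on class size could be run via `lmtest::bptest` (a sketch; it assumes the lmtest package is installed and the classSize column computed above):

```
library(lmtest)  # assumed installed; provides bptest()

fit <- lm(obs ~ clabel, data = dt)
# Test whether the squared residuals of fit depend on classSize
# rather than on the model's own regressors:
bptest(fit, varformula = ~ classSize, data = dt)
```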

Get this bounty!!!

## #StackBounty: #hypothesis-testing #t-test #p-value Estimating "population p-value" \$\Pi\$ using an observed p-value

### Bounty: 100

I asked a similar question last month, but from the responses, I see how the question can be asked more precisely.

Let’s suppose a population of the form

\$\$X \sim \mathcal{N}\left(100 + t_{n-1} \times \sigma / \sqrt{n},\ \sigma\right)\$\$

in which \$t_{n-1}\$ is the Student \$t\$ quantile corresponding to a specific value of a parameter \$\Pi\$ (\$0 < \Pi < 1\$). For the sake of illustration, we could suppose that \$\Pi\$ is 0.025.

When performing a one-sided \$t\$ test of the null hypothesis \$H_0: \mu = 100\$ on a sample taken from that population, the expected \$p\$ value is \$\Pi\$, irrespective of sample size (as long as simple random sampling is used).

I have 4 questions:

1. Is the \$p\$ value a maximum likelihood estimator (MLE) of \$\Pi\$? (Conjecture: yes, because it is based on a \$t\$ statistic, which is based on a likelihood ratio test);

2. Is the \$p\$ value a biased estimator of \$\Pi\$? (Conjecture: yes, because (i) MLEs tend to be biased, and (ii) in simulations, I noted that the median of many \$p\$s is close to \$\Pi\$ but their mean is much larger);

3. Is the \$p\$ value a minimum-variance estimator of \$\Pi\$? (Conjecture: yes asymptotically, but with no guarantee at a given sample size);

4. Can we get a confidence interval around a given \$p\$ value by taking the confidence interval of the observed \$t\$ value (obtained from the non-central Student \$t\$ distribution with \$n-1\$ degrees of freedom and non-centrality parameter \$t\$) and computing the \$p\$ values of the lower- and upper-bound \$t\$ values? (Conjecture: yes, because both the non-central Student \$t\$ quantiles and the \$p\$ values of a one-sided test are continuous, monotone functions.)
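Conjecture 2 can be checked with a small simulation (my sketch, not part of the original question; the values of n, sigma, and the number of replicates are arbitrary choices):

```
set.seed(1)
n     <- 20
sigma <- 15
Pi    <- 0.025
# Population mean as defined above: 100 plus the t quantile for Pi,
# scaled by the standard error
mu    <- 100 + qt(1 - Pi, df = n - 1) * sigma / sqrt(n)

# One-sided t test of H0: mu = 100 on repeated samples from the population
pvals <- replicate(10000, {
  x <- rnorm(n, mean = mu, sd = sigma)
  t.test(x, mu = 100, alternative = "greater")$p.value
})

median(pvals)  # close to Pi
mean(pvals)    # larger than Pi, consistent with upward bias of the mean
```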

Get this bounty!!!
