#StackBounty: #regression #confidence-interval #p-value #bootstrap #nonlinear-regression Efficient nonparametric estimation of confiden…

Bounty: 50

I’m estimating parameters for a complex, “implicit” nonlinear model $f(\mathbf{x}, \boldsymbol{\theta})$. It’s “implicit” in the sense that I don’t have an explicit formula for $f$: its value is the output of a complex computational fluid dynamics (CFD) code. After the NLS regression I had a look at the residuals, and they don’t look normal at all. I’m also having a lot of trouble estimating the variance-covariance matrix of the estimates: the methods available in nlstools fail with an error.

I suspect the assumption of normally distributed parameter estimators is not valid, so I would like to use some nonparametric method to estimate confidence intervals, $p$-values and confidence regions for the three parameters of my model. I thought of the bootstrap, but other approaches are welcome, as long as they don’t rely on normality of the parameter estimators. Would this work:

  1. Given the data set $D=\{P_i=(\mathbf{x}_i,f_i)\}_{i=1}^N$, generate data sets $D_1,\dots,D_m$ by sampling with replacement from $D$.
  2. For each $D_i$, use NLS (nonlinear least squares) to estimate the model parameters $\boldsymbol{\theta}^*_i=(\theta^*_{1i},\theta^*_{2i},\theta^*_{3i})$.
  3. I now have an empirical distribution for the NLS parameter estimator. The sample mean of this distribution would be the bootstrap estimate of my parameters; the 2.5% and 97.5% quantiles would give me confidence intervals. I could also make scatterplot matrices of each parameter against the others and get an idea of the correlation among them. This is the part I like most, because I believe that one parameter is weakly correlated with the others, while the remaining two are extremely strongly correlated with each other.
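The three steps above can be sketched in a few lines of Python. Here a cheap three-parameter exponential stands in for the CFD code (a pure assumption for illustration; the real $f$ has no explicit formula), and `curve_fit` plays the role of the NLS solver:

```python
import numpy as np
from scipy.optimize import curve_fit

# Toy stand-in for the expensive CFD model (hypothetical: the real f is
# implicit; here a 3-parameter exponential plays its role).
def model(x, t1, t2, t3):
    return t1 * np.exp(-t2 * x) + t3

rng = np.random.default_rng(0)
N = 40
x = np.linspace(0.0, 4.0, N)
theta_true = (2.5, 1.3, 0.5)
f = model(x, *theta_true) + rng.normal(0.0, 0.05, N)

# Steps 1-2: resample (x_i, f_i) pairs with replacement, refit by NLS each time
m = 200
boot = np.empty((m, 3))
for i in range(m):
    idx = rng.integers(0, N, N)
    popt, _ = curve_fit(model, x[idx], f[idx], p0=(2.0, 1.0, 0.4))
    boot[i] = popt

# Step 3: point estimates, percentile CIs, and parameter correlations
theta_hat = boot.mean(axis=0)
ci = np.percentile(boot, [2.5, 97.5], axis=0)
corr = np.corrcoef(boot, rowvar=False)
print("bootstrap estimates:", theta_hat)
print("95% percentile CIs: ", ci.T)
print("correlation matrix:\n", corr)
```

The off-diagonal entries of `corr` give exactly the pairwise dependence the scatterplot matrices would show visually.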

Is this correct? Then how do I compute the $p$-values? What is the null for nonlinear regression models? For example, for parameter $\theta_{3}$, is it that $\theta_{3}=0$, while the other two are not? How would I compute the $p$-value for such a hypothesis from my bootstrap sample $\boldsymbol{\theta}^*_1,\dots,\boldsymbol{\theta}^*_m$? I don’t see the connection with the null…
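One common (if crude) recipe tests $H_0: \theta_3 = 0$ by confidence-interval inversion: the two-sided $p$-value is the smallest $\alpha$ at which the $(1-\alpha)$ percentile interval excludes the null value, which reduces to counting bootstrap draws on either side of it. A minimal sketch, with hypothetical bootstrap draws in place of the real NLS output:

```python
import numpy as np

def bootstrap_p(samples, null_value=0.0):
    """Two-sided p-value for H0: theta = null_value by inverting the
    percentile interval: the smallest alpha whose (1 - alpha) interval
    excludes null_value."""
    samples = np.asarray(samples)
    frac_below = np.mean(samples <= null_value)
    frac_above = np.mean(samples >= null_value)
    return min(1.0, 2.0 * min(frac_below, frac_above))

# Hypothetical bootstrap draws of theta_3 (placeholder for the NLS output)
rng = np.random.default_rng(1)
theta3_boot = rng.normal(0.8, 0.3, 2000)
print(bootstrap_p(theta3_boot))   # small: 0 sits far in the left tail
```

The resolution of such a $p$-value is limited to $2/m$, so small $p$-values need a large number of bootstrap replicates.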

Also, each NLS fit takes quite some time (say, a few hours) because I need to run my fluid dynamics code $p\times N$ times, where $N$ is the size of $D$ and $p$ is about 40 in my case. The total CPU time for the bootstrap is then $40\times N \times m$ times the cost of a single CFD run, which is a lot. I would need a faster way. What can I do? I thought of building a metamodel of my CFD code (for example, a Gaussian process model) and using that for the bootstrap instead of the CFD code. What do you think? Would that work?
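The metamodel idea can be prototyped cheaply. Below is a minimal RBF-kernel Gaussian process regression in plain NumPy, fit once to a set of "expensive" evaluations (a sine curve standing in for CFD output, purely as an assumption); the bootstrap loop would then query the surrogate instead of the solver. A real surrogate would use a proper GP library, estimate hyperparameters, and handle vector-valued $\mathbf{x}$:

```python
import numpy as np

# Minimal RBF-kernel GP regression used as a surrogate (sketch only)
def rbf(A, B, ell=0.5):
    return np.exp(-0.5 * (A[:, None] - B[None, :]) ** 2 / ell**2)

def gp_fit(x_train, y_train, noise_var=1e-4, ell=0.5):
    K = rbf(x_train, x_train, ell) + noise_var * np.eye(len(x_train))
    alpha = np.linalg.solve(K, y_train)
    return lambda x_new: rbf(x_new, x_train, ell) @ alpha  # posterior mean

# Pretend these 25 points came from expensive CFD runs (here: a sine curve)
rng = np.random.default_rng(2)
x_cfd = np.linspace(0.0, 4.0, 25)
y_cfd = np.sin(x_cfd) + rng.normal(0.0, 0.01, 25)

surrogate = gp_fit(x_cfd, y_cfd)   # the bootstrap now queries this, not CFD
x_test = np.linspace(0.0, 4.0, 101)
err = np.max(np.abs(surrogate(x_test) - np.sin(x_test)))
print("max surrogate error:", err)
```

The caveat, of course, is that the bootstrap then quantifies uncertainty of the surrogate-based fit, and any surrogate bias propagates into the intervals.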


Get this bounty!!!

#StackBounty: #hypothesis-testing #t-test #p-value Estimating "population p-value" $Pi$ using an observed p-value

Bounty: 100

I asked a similar question last month, but from the responses, I see how the question can be asked more precisely.

Let’s suppose a population of the form

$$X \sim \mathcal{N}(100 + t_{n-1} \times \sigma / \sqrt{n},\ \sigma)$$

in which $t_{n-1}$ is the Student $t$ quantile based on a specific value of a parameter $\Pi$ ($0<\Pi<1$). For the sake of illustration, we could suppose that $\Pi$ is 0.025.

When performing a one-sided $t$ test of the null hypothesis $H_0: \mu = 100$ on a sample taken from that population, the expected $p$ value is $\Pi$, irrespective of sample size (as long as simple random sampling is used).

I have 4 questions:

  1. Is the $p$ value a maximum likelihood estimator (MLE) of $\Pi$? (Conjecture: yes, because it is based on a $t$ statistic, which is based on a likelihood ratio test.)

  2. Is the $p$ value a biased estimator of $\Pi$? (Conjecture: yes, because (i) MLEs tend to be biased, and (ii) based on simulations, I noted that the median of many $p$s is close to $\Pi$ but their mean is much larger.)

  3. Is the $p$ value a minimum-variance estimator of $\Pi$? (Conjecture: yes in the asymptotic case, but no guarantee for a given sample size.)

  4. Can we get a confidence interval around a given $p$ value by taking the confidence interval of the observed $t$ value (obtained from the noncentral Student $t$ distribution with $n-1$ degrees of freedom and noncentrality parameter $t$) and computing the $p$ values of the lower- and upper-bound $t$ values? (Conjecture: yes, because both the noncentral Student $t$ quantiles and the $p$ values of a one-sided test are continuous monotone functions.)
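Conjecture 2 and the construction in question 4 can be probed numerically. The sketch below uses illustrative values ($\Pi = 0.025$, $n = 20$, $\sigma = 1$, all assumptions, not taken from the question): it simulates the distribution of one-sided $p$ values under the stated population, then applies the question-4 interval via `scipy.stats.nct`:

```python
import numpy as np
from scipy import stats

# Illustrative values (assumptions): Pi = 0.025, n = 20, sigma = 1
rng = np.random.default_rng(3)
Pi, n, sigma, reps = 0.025, 20, 1.0, 50_000
delta = stats.t.ppf(1 - Pi, n - 1)           # t quantile tied to Pi
mu = 100 + delta * sigma / np.sqrt(n)

# Conjecture 2: simulate many one-sided t tests, compare median vs mean p
samples = rng.normal(mu, sigma, size=(reps, n))
t_obs = (samples.mean(1) - 100) / (samples.std(1, ddof=1) / np.sqrt(n))
p = stats.t.sf(t_obs, n - 1)
print("median p:", np.median(p))             # close to Pi
print("mean p:  ", np.mean(p))               # noticeably larger than Pi

# Question 4: interval for one observed p via the noncentral t with ncp = t
t0 = t_obs[0]
t_lo, t_hi = stats.nct.ppf([0.025, 0.975], n - 1, t0)
p_hi, p_lo = stats.t.sf([t_lo, t_hi], n - 1) # p is decreasing in t
print("95% interval for p:", (p_lo, p_hi))
```

Note that the $p \leftrightarrow t$ map is monotone decreasing, so the lower-bound $t$ yields the upper-bound $p$ and vice versa.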


Get this bounty!!!
