## #StackBounty: #machine-learning #statistical-significance #t-test #p-value How to decide if means of two sets are statistically signifi…

### Bounty: 50

I have a data set consisting of some number of pairs of real numbers. For example:

```
(1.2, 3.4), (3.2, 2.7), ..., (4.2, 1.0)
```

or

```
(x1, y1), (x2, y2), ..., (xn, yn)
```

I want to know if the second variable depends on the first one (it is known in advance that if there is a dependency, it is very weak, so it is hard to detect).

I split the data set into two parts by thresholding on the first number (the Xs). I then use the means of the Ys in the first and second subsets as “predictions”. I look for the split that minimises the squared deviation between these predictions and the actual values of the Ys. Basically, I do what decision trees do.
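The split search described above can be sketched as follows. This is a minimal illustration rather than the asker's actual code: the function name `best_split` is hypothetical, and it assumes a threshold is placed halfway between two neighbouring distinct X values.

```python
import numpy as np

def best_split(x, y):
    """Find the single threshold on x that minimises the squared error
    when each side is predicted by the mean of its y values.

    Returns (threshold, sse, mean_left, mean_right), or None when all
    x values are identical and no split is possible.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x, kind="stable")
    xs, ys = x[order], y[order]

    best = None
    for k in range(1, ys.size):
        if xs[k] == xs[k - 1]:
            continue  # no threshold separates two equal x values
        left, right = ys[:k], ys[k:]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[1]:
            threshold = (xs[k - 1] + xs[k]) / 2  # assumed midpoint convention
            best = (threshold, sse, left.mean(), right.mean())
    return best
```

For example, `best_split([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1])` picks the threshold 3.5, where both sides are predicted exactly.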

Now I want to know whether the split I found, and the corresponding difference between the two means, is significant. I could use a standard test to check whether the means of the two sets are significantly different, but I think that would be incorrect because the split was chosen to maximise this difference. How can I account for that?
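One possible way to account for the selection, offered here only as a sketch and not as the established answer, is a permutation test on the maximised statistic: shuffle the Ys relative to the Xs, redo the full split search on each shuffle, and compare the observed best improvement with this null distribution. The function names are hypothetical, and the sketch assumes the pairs are exchangeable under the null hypothesis of no dependence and that the Xs have no ties.

```python
import numpy as np

def max_split_gain(y_sorted):
    """Largest reduction in squared error over all two-way splits of y,
    where y is already ordered by its x value."""
    n = y_sorted.size
    k = np.arange(1, n)                    # sizes of the left subset
    left_sum = np.cumsum(y_sorted)[:-1]    # sum of y in the left subset
    total = y_sorted.sum()
    # The SSE reduction of a split equals the between-group sum of squares.
    gain = left_sum**2 / k + (total - left_sum) ** 2 / (n - k) - total**2 / n
    return gain.max()

def split_permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Permutation p-value for the best split, with the split search
    repeated on every permutation so the selection is part of the null."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    y_sorted = y[np.argsort(x, kind="stable")]
    observed = max_split_gain(y_sorted)
    null = np.array(
        [max_split_gain(rng.permutation(y_sorted)) for _ in range(n_perm)]
    )
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (1 + np.sum(null >= observed)) / (n_perm + 1)
    return observed, p_value
```

Because the maximisation over splits is repeated inside every permutation, the reference distribution already reflects the fact that the best split was searched for.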

## #StackBounty: #distributions #p-value #goodness-of-fit #kolmogorov-smirnov Goodness-of-fit test on arbitrary parametric distributions w…

### Bounty: 100

Many questions on this topic have already been addressed on CV. However, I am still unsure whether this question has been addressed directly.

1. Is it possible, for any arbitrary parametric distribution, to properly calculate the p-value for a Kolmogorov-Smirnov test where the parameters of the null distribution are estimated from the data?
2. Or does the choice of parametric distribution determine if this can be achieved?
3. What about the Anderson-Darling and Cramér-von Mises tests?
4. What is the general procedure for estimating the correct p-values?

My general understanding of the procedure would be the following. Assume we have data $X$ and a parametric distribution $F(x;\theta)$. Then I would:

• Estimate parameters $\hat\theta_{0}$ for $F(x;\theta)$.
• Calculate the Kolmogorov-Smirnov, Anderson-Darling and Cramér-von Mises test statistics: KS$_{0}$, AD$_{0}$ and CVM$_{0}$.
• For $i=1,2,\ldots,n$:
  1. Simulate data $y$ from $F(\cdot;\hat\theta_{0})$.
  2. Estimate $\hat\theta_{i}$ for $F(y;\theta_{i})$.
  3. Calculate the KS$_{i}$, AD$_{i}$ and CVM$_{i}$ statistics for $F(y;\hat\theta_{i})$.
• Calculate $p$-values as the proportion of these statistics that are more extreme than KS$_{0}$, AD$_{0}$ and CVM$_{0}$, respectively.

Is this correct?
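The steps listed above can be sketched in code for the Kolmogorov-Smirnov statistic. This is only an illustration under stated assumptions: the function name `bootstrap_ks_pvalue` is hypothetical, the normal distribution is an arbitrary default, the parameters are estimated by SciPy's maximum-likelihood `fit`, and the add-one correction in the p-value is a choice made here rather than part of the procedure as written.

```python
import numpy as np
from scipy import stats

def bootstrap_ks_pvalue(x, dist=stats.norm, n_sim=500, seed=0):
    """Parametric-bootstrap p-value for a KS test whose null parameters
    are estimated from the same data."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)

    # Estimate theta_0 and compute the observed statistic KS_0.
    theta0 = dist.fit(x)
    ks0 = stats.kstest(x, dist.cdf, args=theta0).statistic

    ks_sim = np.empty(n_sim)
    for i in range(n_sim):
        # 1. Simulate from the fitted null distribution.
        y = dist.rvs(*theta0, size=x.size, random_state=rng)
        # 2. Re-estimate the parameters on the simulated sample.
        theta_i = dist.fit(y)
        # 3. Compute the statistic against the re-fitted distribution.
        ks_sim[i] = stats.kstest(y, dist.cdf, args=theta_i).statistic

    # Large KS values are the extreme ones; add-one keeps p above zero.
    p_value = (1 + np.sum(ks_sim >= ks0)) / (n_sim + 1)
    return ks0, p_value
```

Re-fitting inside the loop is the important step: comparing KS$_{0}$ with statistics computed against the fixed $\hat\theta_{0}$ would ignore the effect of estimation on the null distribution. The same loop could in principle carry the Anderson-Darling and Cramér-von Mises statistics by swapping the statistic function.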

## #StackBounty: #p-value #intuition #application #communication #climate Evidence for man-made global warming hits 'gold standard'…

### How should we interpret the $5\sigma$ threshold in this research on climate change?

This message from a Reuters article of 25 February is currently all over the news:

> They said confidence that human activities were raising the heat at the Earth’s surface had reached a “five-sigma” level, a statistical gauge meaning there is only a one-in-a-million chance that the signal would appear if there was no warming.

I believe that this refers to the article “Celebrating the anniversary of three key events in climate change science”, which contains a plot that is shown schematically below. (It is a sketch because I could not find an open-source image of the original; similar free images are found here.) Another article from the same research group, which seems to be a more original source, is here (but it uses a 1% significance level instead of $5\sigma$).

The plot presents measurements from three different research groups: (1) Remote Sensing Systems, (2) the Center for Satellite Applications and Research, and (3) the University of Alabama in Huntsville.

The plot displays three rising curves of signal-to-noise ratio as a function of trend length.

So somehow scientists have measured an anthropogenic signal of global warming (or climate change?) at a $5\sigma$ level, which is apparently some scientific standard of evidence.
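For orientation, the tail probability that corresponds to a given number of sigmas under a standard normal reference distribution can be computed directly. This is only a sketch: whether the cited study uses a Gaussian reference, and whether its threshold is one-sided or two-sided, are assumptions made here, and the helper name `sigma_tail_probability` is hypothetical.

```python
from scipy.stats import norm

def sigma_tail_probability(k, two_sided=False):
    """Probability that a standard normal variable exceeds k sigma
    (or exceeds it in absolute value when two_sided is True)."""
    p = norm.sf(k)  # survival function: P(Z > k)
    return 2 * p if two_sided else p

p_one = sigma_tail_probability(5.0)
p_two = sigma_tail_probability(5.0, two_sided=True)
print(f"one-sided: {p_one:.3e}  (about 1 in {1 / p_one:,.0f})")
print(f"two-sided: {p_two:.3e}  (about 1 in {1 / p_two:,.0f})")
```

Under these assumptions the one-sided probability is roughly $2.9 \times 10^{-7}$, the same order of magnitude as the “one-in-a-million” wording in the quote, though not identical to it.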

For me, such a graph, which has a high level of abstraction, raises many questions$^{\dagger}$, and in general I wonder: ‘How did they do this?’ How do we explain this experiment in simple (but not too abstract) words, and also explain the meaning of the $5\sigma$ level?

I ask this question here because I do not want a discussion about climate. Instead I want answers regarding the statistical content and especially to clarify the meaning of such a statement that is using/claiming $5\sigma$.

$^{\dagger}$: What is the null hypothesis? How did they set up the experiment to get an anthropogenic signal? What is the effect size of the signal? Is it just a small signal that we can only measure now because the noise is decreasing, or is the signal increasing? What kind of assumptions are made to create the statistical model by which they determine the crossing of a $5\sigma$ threshold (independence, random effects, etc.)? Why are the three curves for the different research groups different: do they have different noise or different signals? In the case of the latter, what does that mean regarding the interpretation of probability and external validity?

