## #StackBounty: #mathematical-statistics #standard-deviation #subset Variance of set of subsets

### Bounty: 50

First of all, sorry for the sloppy terminology, but I am looking for the right name of a statistical concept.

I was asked to calculate the “turnover” of the Facebook friends commenting on my posts, so I am looking for an indicator that has a high value if the same, let’s say, 10 friends always comment on my posts, and a low value if different friends comment each time.

Obviously, the set of friends commenting on a given post forms a subset of my friends, so I am looking for a kind of “standard deviation” or “variance” of these subsets over all my posts.

What is the proper name of this statistical concept? How do you calculate it?
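One candidate indicator, offered only as a sketch and not as the established name the question asks for, is the mean pairwise Jaccard similarity between the commenter sets of different posts. It is 1 when every post has exactly the same commenters and 0 when no friend ever comments twice. The function names and the sample data below are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b|; defined as 1.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_pairwise_jaccard(commenter_sets):
    """Average Jaccard similarity over all unordered pairs of posts."""
    pairs = list(combinations(commenter_sets, 2))
    if not pairs:
        raise ValueError("need at least two posts")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical data: each set holds the friends who commented on one post.
same_crowd = [{"ann", "bob", "cy"}, {"ann", "bob", "cy"}, {"ann", "bob", "cy"}]
new_crowd = [{"ann", "bob"}, {"cy", "dee"}, {"eve", "fay"}]

print(mean_pairwise_jaccard(same_crowd))  # 1.0: always the same commenters
print(mean_pairwise_jaccard(new_crowd))   # 0.0: no commenter appears twice
```

Whether this matches the intended notion of “turnover” depends on how posts with very few comments should be weighted, which the question leaves open.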

Get this bounty!!!

## #StackBounty: #standard-deviation #validation Standard deviation for a reconciled value?

### Bounty: 50

This is probably a silly question, and I hope it will not be closed as off-topic.

I have $n$ sensors, each providing a value $y_i$ for a measurement with a corresponding standard deviation $\sigma_i$. I know that the most probable value $Y$ is given by

$$Y=\left(\sum_{i=1}^n \frac{y_i}{\sigma_i^2}\right)\left(\sum_{i=1}^n \frac{1}{\sigma_i^2}\right)^{-1}$$

My problem is that I do not remember how $\sigma_Y$ is computed.

Could you provide me with the formula? Thanks in advance.
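As a minimal sketch, the weighted mean above can be computed as follows. The standard deviation returned here relies on an assumption the question does not state: that the sensor errors are independent and the $\sigma_i$ are known. Under that assumption, propagating the variances through the weighted mean gives $\sigma_Y^2 = \left(\sum_{i=1}^n 1/\sigma_i^2\right)^{-1}$. The function name `reconcile` and the sample readings are hypothetical.

```python
import math


def reconcile(values, sigmas):
    """Inverse-variance weighted mean and its standard deviation.

    Assumes independent measurement errors with known standard deviations.
    """
    weights = [1.0 / s ** 2 for s in sigmas]
    total = sum(weights)
    y = sum(w * v for w, v in zip(weights, values)) / total
    sigma_y = math.sqrt(1.0 / total)
    return y, sigma_y


# Hypothetical readings: two sensors with equal uncertainty.
y, sigma_y = reconcile([10.0, 12.0], [1.0, 1.0])
print(y)        # 11.0
print(sigma_y)  # about 0.707, i.e. 1/sqrt(2)
```

If the sensor errors are correlated, this expression no longer holds and the full covariance matrix is needed.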


## #StackBounty: #standard-deviation #average Average and SD for not paired and different number replicates

### Bounty: 50

```r
mydf <- read.table(header=TRUE, text="
id    col1     col2     col3     col4
fow1  1        8        5        10
fow1  3        7        4        20
fow1  5        6        3        40
fow1  1        8        5        10
fow1  3        7        4        20
fow1  5        6        3        40
aow2  10       1        2        5
aow2  10       1        2        5
aow2  10       1        2        5
")
```

I want to calculate the average ratio and SD for aow2/fow1. I cannot calculate it for each single replicate because the replicates are not paired and the two groups have different numbers of replicates.
I could calculate the average and SD of each column for fow1 and aow2 separately, but how can I calculate the average and SD of the aow2/fow1 ratio? What scripts should I use?


## #HackerRank: Correlation and Regression Lines solutions

```python
import numpy as np
import scipy as sp
import scipy.stats  # loads the stats submodule so sp.stats.linregress is available
```

### Correlation and Regression Lines – A Quick Recap #1

Here are the test scores of 10 students in physics and history:

Physics Scores 15 12 8 8 7 7 7 6 5 3

History Scores 10 25 17 11 13 17 20 13 9 15

Compute Karl Pearson’s coefficient of correlation between these scores. Compute the answer correct to three decimal places.

Output Format

In the text box, enter the floating point/decimal value required. Do not leave any leading or trailing spaces. Your answer may look like: `0.255`

This is NOT the actual answer – just the format in which you should provide your answer.

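```python
physicsScores = [15, 12, 8, 8, 7, 7, 7, 6, 5, 3]
historyScores = [10, 25, 17, 11, 13, 17, 20, 13, 9, 15]

# The off-diagonal entry of the 2x2 correlation matrix is Pearson's r
print(np.corrcoef(historyScores, physicsScores)[0][1])
```

```
0.144998154581
```

Rounded to three decimal places, the answer is 0.145.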
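To show what the correlation coefficient is made of, here is a sketch that computes Pearson's r directly from its definition, r = Sxy / √(Sxx · Syy), where Sxy, Sxx and Syy are sums of products of deviations from the means. The helper name `pearson_r` is hypothetical; the data are the scores from the problem statement.

```python
import math

physics_scores = [15, 12, 8, 8, 7, 7, 7, 6, 5, 3]
history_scores = [10, 25, 17, 11, 13, 17, 20, 13, 9, 15]


def pearson_r(x, y):
    """Pearson correlation from sums of deviation products."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


print(round(pearson_r(physics_scores, history_scores), 3))  # 0.145
```

For these scores Sxy = 22, Sxx = 105.6 and Syy = 218, which reproduces the value returned by `np.corrcoef`.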

### Correlation and Regression Lines – A Quick Recap #2

Here are the test scores of 10 students in physics and history:

Physics Scores 15 12 8 8 7 7 7 6 5 3

History Scores 10 25 17 11 13 17 20 13 9 15

Compute the slope of the line of regression obtained while treating Physics as the independent variable. Compute the answer correct to three decimal places.

Output Format

In the text box, enter the floating point/decimal value required. Do not leave any leading or trailing spaces. Your answer may look like: `0.255`

This is NOT the actual answer – just the format in which you should provide your answer.

```python
print(sp.stats.linregress(physicsScores, historyScores).slope)
```

```
0.20833333333333331
```
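The slope can also be obtained without SciPy. The sketch below uses the least-squares formulas slope = Sxy / Sxx and intercept = ȳ − slope · x̄, with Physics as the independent variable. The helper name `least_squares_line` is hypothetical.

```python
physics_scores = [15, 12, 8, 8, 7, 7, 7, 6, 5, 3]
history_scores = [10, 25, 17, 11, 13, 17, 20, 13, 9, 15]


def least_squares_line(x, y):
    """Slope and intercept of the regression line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares_line(physics_scores, history_scores)
print(round(slope, 3))      # 0.208
print(round(intercept, 3))  # 13.375
```

The same line, evaluated at a Physics score of 10, gives the prediction asked for in the next recap.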

### Correlation and Regression Lines – A Quick Recap #3

Here are the test scores of 10 students in physics and history:

Physics Scores 15 12 8 8 7 7 7 6 5 3

History Scores 10 25 17 11 13 17 20 13 9 15

When a student scores 10 in Physics, what is his probable score in History? Compute the answer correct to one decimal place.

Output Format

In the text box, enter the floating point/decimal value required. Do not leave any leading or trailing spaces. Your answer may look like: `0.255`

This is NOT the actual answer – just the format in which you should provide your answer.

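```python
def predict(x_new, x, y):
    # Fit the least-squares line of y on x, then evaluate it at x_new
    result = sp.stats.linregress(x, y)
    return result.slope * x_new + result.intercept

print(predict(10, physicsScores, historyScores))
```

```
15.458333333333332
```

Rounded to one decimal place, the answer is 15.5.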

### Correlation and Regression Lines – A Quick Recap #4

The two regression lines of a bivariate distribution are:

`4x - 5y + 33 = 0` (line of y on x)

`20x - 9y - 107 = 0` (line of x on y).

Estimate the value of `x` when `y = 7`. Compute the correct answer to one decimal place.

Output Format

In the text box, enter the floating point/decimal value required. Do not leave any leading or trailing spaces. Your answer may look like: `7.2`

This is NOT the actual answer – just the format in which you should provide your answer.

```python
# Rearranging the two regression lines:
#   4x - 5y + 33 = 0    =>  y = (4x + 33) / 5     (line of y on x)
#   20x - 9y - 107 = 0  =>  x = (9y + 107) / 20   (line of x on y)
# To estimate x from a given y, use the line of x on y.
y = 7
print((9 * y + 107) / 20)
```

```
8.5
```

### Correlation and Regression Lines – A Quick Recap #5

The two regression lines of a bivariate distribution are:

`4x - 5y + 33 = 0` (line of y on x)

`20x - 9y - 107 = 0` (line of x on y).

Find the variance of y when σx = 3.

Compute the correct answer to one decimal place.

Output Format

In the text box, enter the floating point/decimal value required. Do not leave any leading or trailing spaces. Your answer may look like: `7.2`

This is NOT the actual answer – just the format in which you should provide your answer.

#### Q.3. If the two regression lines of a bivariate distribution are 4x - 5y + 33 = 0 and 20x - 9y - 107 = 0,

• calculate the arithmetic means of x and y respectively.
• estimate the value of x when y = 7.
• find the variance of y when σx = 3.

##### Solution:

We have

4x - 5y + 33 = 0 => y = 4x/5 + 33/5 (1)

and

20x - 9y - 107 = 0 => x = 9y/20 + 107/20 (2)

(i) Both regression lines pass through the point of means, so solving (1) and (2) simultaneously gives mean of x = 13 and mean of y = 17. [Ans.]

(ii) Equation (2) is the line of x on y, so

x = (9/20) × 7 + (107/20) = 170/20 = 8.5 [Ans.]

(iii) The regression coefficients are byx = 4/5 and bxy = 9/20, so r = √(byx · bxy) = √{(4/5)(9/20)} = 0.6. Since byx = r(σy/σx), we have 4/5 = 0.6 × σy/3, which gives σy = (4/5)(3/0.6) = 4. [Ans.]

Variance of y = σy² = 16. [Ans.]
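The three parts of the solution can be checked numerically. The sketch below solves the two lines with Cramer's rule to get the means, evaluates the line of x on y at y = 7, and recovers σy from byx = r · σy/σx. The helper name `solve_2x2` is hypothetical, and r is taken as positive because both regression coefficients are positive.

```python
import math


def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - b1 * a2
    return (c1 * b2 - b1 * c2) / det, (a1 * c2 - c1 * a2) / det


# (i) 4x - 5y = -33 and 20x - 9y = 107 intersect at the point of means.
mean_x, mean_y = solve_2x2(4, -5, -33, 20, -9, 107)
print(mean_x, mean_y)  # 13.0 17.0

# (ii) Line of x on y: x = (9y + 107) / 20, evaluated at y = 7.
x_at_7 = (9 * 7 + 107) / 20
print(x_at_7)  # 8.5

# (iii) r = sqrt(byx * bxy), then sigma_y = byx * sigma_x / r.
b_yx, b_xy, sigma_x = 4 / 5, 9 / 20, 3.0
r = math.sqrt(b_yx * b_xy)
sigma_y = b_yx * sigma_x / r
print(round(r, 3), round(sigma_y, 3), round(sigma_y ** 2, 1))  # 0.6 4.0 16.0
```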