## #StackBounty: #probability #mathematical-statistics #estimation #multivariate-analysis #covariance Methods to prove that a guess for th…

### Bounty: 50

Suppose we are interested in the covariance matrix $\Sigma$ of a few maximum likelihood estimators $\hat\theta_1, \hat\theta_2, \cdots, \hat\theta_n$. For each $j$, $\hat\theta_j$ is normally distributed and estimated from data. The data are multivariate normal with known covariance and mean $\vec 0$.

The problem is that I obtained the covariance matrix $\Sigma$ heuristically, because it was impossible to compute directly. Now I want to prove that I have found the correct expression. What methods could prove that I have found the correct covariance matrix?


## #StackBounty: #covariance #covariance-matrix #graphical-model #graph-theory Given an adjacency matrix, how can we fit a covariance matr…

### Bounty: 150

Suppose that I generate a k-regular graph like the following:

```r
game <- sample_k_regular(k, r)
```

Then, based on this graph's adjacency matrix, suppose I replace the $1$'s with a correlation parameter $\rho$ and replace the $0$'s on the diagonal with $1$. The result is meant to be a covariance matrix for something like a multivariate normal. However, a matrix built this way can fail to be positive definite. Is there a way to build a covariance structure from an adjacency matrix, with a correlation parameter $\rho$ wherever there is a tie and $1$'s on the diagonal for a common variance, that avoids the positive definiteness problem? Thanks.
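One way to see when the construction above works is to look at the eigenvalues of the adjacency matrix $A$: the matrix $I + \rho A$ has eigenvalues $1 + \rho\lambda_i$, so it is positive definite exactly when every $1 + \rho\lambda_i$ is positive. For a k-regular graph the eigenvalues of $A$ lie in $[-k, k]$, so $|\rho| < 1/k$ is always sufficient. The sketch below is in Python with NumPy rather than R, and it uses a ring lattice as a stand-in for the random graph from `sample_k_regular`; the helper names `ring_lattice_adjacency` and `rho_interval` are made up for illustration.

```python
import numpy as np

def ring_lattice_adjacency(n, k):
    """Adjacency matrix of a k-regular ring lattice (k even, k < n).

    This is only a stand-in for a random k-regular graph.
    """
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        for d in range(1, k // 2 + 1):
            A[i, (i + d) % n] = 1
            A[i, (i - d) % n] = 1
    return A

def rho_interval(A):
    """Open interval of rho for which I + rho * A is positive definite.

    Assumes the graph has at least one edge, so A has both a negative
    and a positive eigenvalue (its trace is zero).
    """
    eig = np.linalg.eigvalsh(A)  # sorted ascending
    lam_min, lam_max = eig[0], eig[-1]
    return -1.0 / lam_max, -1.0 / lam_min

n, k = 10, 4
A = ring_lattice_adjacency(n, k)
lo, hi = rho_interval(A)

rho = 0.9 * hi                    # any value strictly inside (lo, hi)
Sigma = np.eye(n) + rho * A       # 1 on the diagonal, rho on the ties
print("allowed rho interval:", (lo, hi))
print("smallest eigenvalue of Sigma:", np.linalg.eigvalsh(Sigma).min())
```

The interval depends on the particular graph, so for a randomly generated graph it would have to be recomputed from that graph's own adjacency matrix.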


## #StackBounty: #covariance #spatial #kriging Determining covariance of irregularly spaced spatial data

### Bounty: 50

I’m comparing the concentration $C$ of a contaminant in the same spatial region at two time points, 2000 and 2010, with sample sizes of $N_{2000} = 51$ and $N_{2010} = 26$ (not all the samples are from the same locations), means of $\mu(C)_{2000} = 47$ and $\mu(C)_{2010} = 27$ (determined by block kriging of all point observations), and variances of $V(C)_{2000} = 89$ and $V(C)_{2010} = 68$ (kriging variances). To determine whether there has been any significant change over those 10 years, we first need to determine the variance of the change in the area:

$V(\Delta C) = V(C)_{2000} + V(C)_{2010} - 2\,V(C)_{2000,2010}$

where $V(\Delta C)$ is the variance of the change over time and $V(C)_{2000,2010}$ is the covariance between the two temporal samples. Does anyone know how to determine the $V(C)_{2000,2010}$ term in the above equation?
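To show how much the unknown covariance term matters, here is a small Python sketch that applies the standard identity Var(X - Y) = Var(X) + Var(Y) - 2 Cov(X, Y) to the means and kriging variances quoted above. The covariance values it loops over are purely hypothetical placeholders, not estimates from any data, and the z-statistic is just one simple way to summarize the change.

```python
import math

def variance_of_change(var_a, var_b, cov_ab):
    """Var(A - B) = Var(A) + Var(B) - 2 * Cov(A, B)."""
    return var_a + var_b - 2.0 * cov_ab

mean_2000, mean_2010 = 47.0, 27.0   # block-kriged means from the question
var_2000, var_2010 = 89.0, 68.0     # kriging variances from the question

# Hypothetical covariance values, for illustration only.
for cov in (0.0, 20.0, 40.0):
    v = variance_of_change(var_2000, var_2010, cov)
    z = (mean_2000 - mean_2010) / math.sqrt(v)
    print(f"cov={cov:5.1f}  V(dC)={v:6.1f}  z={z:.2f}")
```

A positive covariance shrinks the variance of the change, so treating the two estimates as independent (covariance 0) would be the conservative choice in that case.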


## #HackerRank: Correlation and Regression Lines solutions

```python
import numpy as np
import scipy as sp
from scipy.stats import norm  # importing from scipy.stats also makes sp.stats available
```

### Correlation and Regression Lines – A Quick Recap #1

Here are the test scores of 10 students in physics and history:

Physics Scores 15 12 8 8 7 7 7 6 5 3

History Scores 10 25 17 11 13 17 20 13 9 15

Compute Karl Pearson’s coefficient of correlation between these scores. Compute the answer correct to three decimal places.

Output Format

In the text box, enter the floating point/decimal value required. Do not leave any leading or trailing spaces. Your answer may look like: `0.255`

This is NOT the actual answer – just the format in which you should provide your answer.

```python
physicsScores = [15, 12, 8, 8, 7, 7, 7, 6, 5, 3]
historyScores = [10, 25, 17, 11, 13, 17, 20, 13, 9, 15]

# Off-diagonal entry of the 2x2 correlation matrix is Pearson's r.
print(np.corrcoef(historyScores, physicsScores)[0][1])
```
```
0.144998154581
```
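As a cross-check on `np.corrcoef`, the coefficient can also be computed directly from the definition r = Sxy / √(Sxx · Syy), where the S terms are sums of products of deviations from the means. This is a minimal sketch in plain Python; `pearson_r` is a name chosen here for illustration.

```python
import math

physics = [15, 12, 8, 8, 7, 7, 7, 6, 5, 3]
history = [10, 25, 17, 11, 13, 17, 20, 13, 9, 15]

def pearson_r(x, y):
    """Pearson correlation from sums of squared and cross deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

r = pearson_r(physics, history)
print(round(r, 3))  # 0.145
```

Here Sxy = 22, Sxx = 105.6 and Syy = 218, so the value submitted to three decimal places is 0.145.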

### Correlation and Regression Lines – A Quick Recap #2

Here are the test scores of 10 students in physics and history:

Physics Scores 15 12 8 8 7 7 7 6 5 3

History Scores 10 25 17 11 13 17 20 13 9 15

Compute the slope of the line of regression obtained while treating Physics as the independent variable. Compute the answer correct to three decimal places.

Output Format

In the text box, enter the floating point/decimal value required. Do not leave any leading or trailing spaces. Your answer may look like: `0.255`

This is NOT the actual answer – just the format in which you should provide your answer.

```python
sp.stats.linregress(physicsScores, historyScores).slope
```
```
0.20833333333333331
```
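The slope of the line of history on physics is Sxy / Sxx, the cross-deviation sum divided by the squared-deviation sum of the independent variable. The sketch below recomputes it without SciPy; `regression_slope` is an illustrative name, not a library function.

```python
physics = [15, 12, 8, 8, 7, 7, 7, 6, 5, 3]
history = [10, 25, 17, 11, 13, 17, 20, 13, 9, 15]

def regression_slope(x, y):
    """Least-squares slope of y on x: S_xy / S_xx."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx

slope = regression_slope(physics, history)
print(round(slope, 3))  # 0.208
```

With Sxy = 22 and Sxx = 105.6 this gives 22 / 105.6 ≈ 0.208.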

### Correlation and Regression Lines – A Quick Recap #3

Here are the test scores of 10 students in physics and history:

Physics Scores 15 12 8 8 7 7 7 6 5 3

History Scores 10 25 17 11 13 17 20 13 9 15

When a student scores 10 in Physics, what is his probable score in History? Compute the answer correct to one decimal place.

Output Format

In the text box, enter the floating point/decimal value required. Do not leave any leading or trailing spaces. Your answer may look like: `0.255`

This is NOT the actual answer – just the format in which you should provide your answer.

```python
def predict(pi, x, y):
    # Fit the least-squares line of y on x, then evaluate it at x = pi.
    slope, intercept, rvalue, pvalue, stderr = sp.stats.linregress(x, y)
    return slope * pi + intercept

predict(10, physicsScores, historyScores)
```
```
15.458333333333332
```
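The same prediction can be cross-checked with `np.polyfit`, which fits a degree-1 polynomial by least squares and returns the coefficients highest power first (slope, then intercept). This is only an alternative route to the same number, not a different model.

```python
import numpy as np

physics = [15, 12, 8, 8, 7, 7, 7, 6, 5, 3]
history = [10, 25, 17, 11, 13, 17, 20, 13, 9, 15]

# Degree-1 fit of history on physics: coefficients are [slope, intercept].
slope, intercept = np.polyfit(physics, history, 1)
predicted = float(slope * 10 + intercept)
print(round(predicted, 1))  # 15.5
```

The fitted line is roughly history = 13.375 + 0.2083 × physics, so a physics score of 10 gives about 15.5 to one decimal place.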

### Correlation and Regression Lines – A Quick Recap #4

The two regression lines of a bivariate distribution are:

`4x – 5y + 33 = 0` (line of y on x)

`20x – 9y – 107 = 0` (line of x on y).

Estimate the value of `x` when `y = 7`. Compute the correct answer to one decimal place.

Output Format

In the text box, enter the floating point/decimal value required. Do not leave any leading or trailing spaces. Your answer may look like: `7.2`

This is NOT the actual answer – just the format in which you should provide your answer.

```python
# Line of y on x:  4x - 5y + 33 = 0
#   x = (5y - 33) / 4
#   y = (4x + 33) / 5
#
# Line of x on y:  20x - 9y - 107 = 0
#   x = (9y + 107) / 20
#   y = (20x - 107) / 9

# To estimate x from a given y, use the line of x on y.
t = 7
print((9 * t + 107) / 20)
```
```
8.5
```

### Correlation and Regression Lines – A Quick Recap #5

The two regression lines of a bivariate distribution are:

`4x – 5y + 33 = 0` (line of y on x)

`20x – 9y – 107 = 0` (line of x on y).

Find the variance of y when σx = 3.

Compute the correct answer to one decimal place.

Output Format

In the text box, enter the floating point/decimal value required. Do not leave any leading or trailing spaces. Your answer may look like: `7.2`

This is NOT the actual answer – just the format in which you should provide your answer.

#### Q.3. If the two regression lines of a bivariate distribution are 4x – 5y + 33 = 0 and 20x – 9y – 107 = 0,

• calculate the arithmetic means of x and y respectively.
• estimate the value of x when y = 7.
• find the variance of y when σx = 3.

##### Solution

We have,

4x – 5y + 33 = 0 => y = 4x/5 + 33/5    (1)

And

20x – 9y – 107 = 0 => x = 9y/20 + 107/20    (2)

(i) Both regression lines pass through the point of means, so solving (1) and (2) simultaneously gives mean of x = 13 and mean of y = 17. [Ans.]

(ii) The second line is the line of x on y, so

x = (9/20) × 7 + (107/20) = 170/20 = 8.5 [Ans.]

(iii) From (1) and (2), byx = 4/5 and bxy = 9/20, so r = √(byx · bxy) = √((4/5)(9/20)) = √0.36 = 0.6 (positive, since both coefficients are positive). Then byx = r(σy/σx) gives 4/5 = 0.6 × σy/3, so σy = (4/5)(3/0.6) = 4.

Variance of y = σy² = 16 [Ans.]
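The three parts of the solution can be checked numerically. The sketch below, in Python with NumPy, solves the two line equations as a linear system to get the means, then recovers r and the variance of y from the two regression coefficients; the variable names are chosen here for illustration.

```python
import math
import numpy as np

# Rewrite the lines as  4x - 5y = -33  and  20x - 9y = 107.
coeffs = np.array([[4.0, -5.0],
                   [20.0, -9.0]])
rhs = np.array([-33.0, 107.0])

# Both regression lines pass through (mean_x, mean_y).
mean_x, mean_y = np.linalg.solve(coeffs, rhs)

b_yx = 4 / 5     # slope of y on x, from 4x - 5y + 33 = 0
b_xy = 9 / 20    # slope of x on y, from 20x - 9y - 107 = 0

# r has the sign of the regression coefficients (both positive here).
r = math.sqrt(b_yx * b_xy)

sigma_x = 3.0
sigma_y = b_yx * sigma_x / r   # from b_yx = r * sigma_y / sigma_x
var_y = sigma_y ** 2

x_at_7 = (9 * 7 + 107) / 20    # line of x on y evaluated at y = 7

print("means:", float(mean_x), float(mean_y))
print("x when y = 7:", x_at_7)
print("r:", r, "variance of y:", var_y)
```

Up to floating-point rounding this reproduces the means 13 and 17, the estimate 8.5, r = 0.6 and a variance of 16.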