# #StackBounty: #r #mathematical-statistics #variance #sampling #mean Right way to compute mean and variance

### Bounty: 50

1. Suppose $$a_{\ell m}$$ follows a normal distribution with mean zero and variance $$C_\ell=\langle a_{\ell m}^2 \rangle=\text{Var}(a_{\ell m})$$, and consider the random variable $$Z$$ defined by:

$$\begin{aligned} Z = \sum_{\ell=\ell_{min}}^{\ell_{max}} \sum_{m=-\ell}^{\ell} a_{\ell m}^{2} \end{aligned}$$

Then, the goal is to compute $$\langle Z\rangle$$:

If I consider the random variable $$Y=\sum_{m=-\ell}^{\ell} C_\ell \bigg(\dfrac{a_{\ell m}}{\sqrt{C_\ell}}\bigg)^{2}$$, each term of this sum follows a $$\chi^2(1)$$ distribution weighted by $$C_\ell$$.

1. Can I conclude from this that the mean of $$Y$$ is equal to:

$$\langle Y\rangle =\bigg\langle\sum_{m=-\ell}^{\ell} a_{\ell m}^{2}\bigg\rangle = (2\ell+1)\,C_\ell$$

?

and so we would have:

$$\langle Z\rangle = \sum_{\ell=\ell_{min}}^{\ell_{max}}\,C_\ell\,(2\ell+1)$$

? I have serious doubts, since the $$a_{\ell m}$$ do not follow a standard normal distribution $$\mathcal{N}(0,1)$$.

Shouldn’t it rather be:

\begin{align}
Z &\equiv \sum_{\ell=\ell_{min}}^{\ell_{max}} \sum_{m=-\ell}^{\ell} a_{\ell,m}^2 \\[6pt]
&= \sum_{\ell=\ell_{min}}^{\ell_{max}} \sum_{m=-\ell}^{\ell} C_\ell \cdot \bigg( \frac{a_{\ell,m}}{\sqrt{C_\ell}} \bigg)^2 \\[6pt]
&\sim \sum_{\ell=\ell_{min}}^{\ell_{max}} \sum_{m=-\ell}^{\ell} C_\ell \cdot \text{ChiSq}(1) \\[6pt]
&= \sum_{\ell=\ell_{min}}^{\ell_{max}} C_\ell \sum_{m=-\ell}^{\ell} \text{ChiSq}(1) \\[6pt]
&= \sum_{\ell=\ell_{min}}^{\ell_{max}} C_\ell \cdot \text{ChiSq}(2 \ell + 1).
\end{align}
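
The doubt about the $$a_{\ell m}$$ not being standard normal can be examined directly. The following is only a sketch: it uses linearity of expectation (which needs no independence assumption) and the fact that a standard normal variable has second moment 1, with the rescaled variable $$a_{\ell m}/\sqrt{C_\ell}$$ taken to be $$\mathcal{N}(0,1)$$ as in the definitions above.

\begin{align}
\langle a_{\ell m}^2 \rangle &= C_\ell \, \bigg\langle \bigg( \frac{a_{\ell m}}{\sqrt{C_\ell}} \bigg)^2 \bigg\rangle = C_\ell \cdot 1 \\[6pt]
\langle Z \rangle &= \sum_{\ell=\ell_{min}}^{\ell_{max}} \sum_{m=-\ell}^{\ell} \langle a_{\ell m}^2 \rangle = \sum_{\ell=\ell_{min}}^{\ell_{max}} (2\ell+1)\, C_\ell
\end{align}

Under these assumptions the weight $$C_\ell$$ simply factors out of each term, so the rescaling does not change the form of the result.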

1. Now, I want to calculate the mean $$\langle Z\rangle$$ of $$Z$$:

Do you agree that my case here is the computation of the mean of a weighted sum of $$\chi^2$$ variables?

So the computation is not trivial, is it? Maybe I could compute the mean by starting from the analytical expression:

$$\langle Z\rangle=\sum_{\ell=\ell_{min}}^{\ell_{max}} C_\ell (2\ell + 1)\quad(1)$$

and directly doing the numerical computation:

$$\langle Z\rangle=\sum_{i=1}^{N} C_{\ell_{i}} (2\ell_{i} + 1)\quad(2)$$

I am confused about the difference between $$(1)$$ and $$(2)$$ above, since each $$C_\ell$$ corresponds to one $$\ell$$ (I mean that, numerically, each $$C_{\ell_{i}}$$ value is associated with one $$\ell_{i}$$ value).
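
To make the relation between $$(1)$$ and $$(2)$$ concrete, here is a small sketch with made-up numbers. The vectors `ell` and `Cl` are hypothetical stand-ins, not the actual data; the only assumption is that `Cl[i]` is the value of $$C_\ell$$ for the multipole `ell[i]`.

```r
# Hypothetical multipoles and power-spectrum values (not the author's data).
ell <- 2:4
Cl  <- c(1.0, 0.5, 0.25)

# Expression (1): sum over ell from ell_min to ell_max, vectorised.
mean_Z_1 <- sum(Cl * (2 * ell + 1))

# Expression (2): the same sum written explicitly over the index i = 1..N.
mean_Z_2 <- 0
for (i in seq_along(ell)) {
  mean_Z_2 <- mean_Z_2 + Cl[i] * (2 * ell[i] + 1)
}

print(c(mean_Z_1, mean_Z_2))
```

With this pairing of `Cl[i]` and `ell[i]`, both forms add up exactly the same terms, so they differ only in how the index is labelled.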

1. If the direct computation $$\langle Z\rangle=\sum_{i=1}^{N} C_{\ell_{i}} (2\ell_{i} + 1)$$ is not correct, then I have to treat $$Z$$ as a random variable following a weighted sum of different chi-squared distributions:

I have tried the following `R script`, where `nRed` is the number of bins considered (5), `nRow` is the number of values of $$\ell$$ (from $$\ell_{min}$$ to $$\ell_{max}$$), and `Cl_sp[,i]` is the vector of `nRow` values of $$C_\ell$$ for bin $$i$$.

``````
# Number of bins
nRed <- 5

# Number of rows (values of l per bin)
nRow <- 36

# Size of sample
nSample_var <- 1000

# Degrees of freedom 2*l + 1 for the nRow multipoles l
# (array_2D and Cl_sp are assumed to be loaded already)
L <- 2*(array_2D[,1])+1

# Weighted sum of chi-squared distributions
y3_1 <- array(0, dim=c(nSample_var,nRed))
for (i in 1:nRed) {
  for (j in 1:nRow) {
    y3_1[,i] <- y3_1[,i] + Cl_sp[j,i] * rchisq(nSample_var,df=L[j])
  }
}

# Print the mean of Z for each bin
for (i in 1:nRed) {
  print(paste0('red=',i,' mean_exp = ', mean(y3_1[,i])))
}
``````
1. Is this the right thing to implement to compute the mean of $$Z$$ if I can’t compute it analytically (see expression $$(2)$$ above)?
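
One way to check a simulation of this kind is to run it on inputs where the weighted sum $$\sum_\ell C_\ell (2\ell+1)$$ can be evaluated directly, and compare the two numbers. The sketch below does this for a single bin. Everything in it is an assumption made for illustration: the names `ell`, `Cl`, `df` and `nSample` are hypothetical, the spectrum is made up, and the chi-squared draws are taken to be independent.

```r
set.seed(1)  # for reproducibility

# Hypothetical inputs (stand-ins for array_2D[,1] and one column of Cl_sp).
ell <- 2:37                      # 36 multipoles, matching nRow = 36
Cl  <- 1 / (ell * (ell + 1))     # made-up spectrum, for illustration only
df  <- 2 * ell + 1               # degrees of freedom per multipole
nSample <- 10000

# Draw nSample realisations of Z = sum_l Cl * ChiSq(2l + 1).
Z <- numeric(nSample)
for (j in seq_along(ell)) {
  Z <- Z + Cl[j] * rchisq(nSample, df = df[j])
}

mean_mc       <- mean(Z)                 # Monte Carlo estimate of the mean
mean_analytic <- sum(Cl * df)            # weighted sum of degrees of freedom
se_mc         <- sd(Z) / sqrt(nSample)   # standard error of the estimate

print(c(mc = mean_mc, analytic = mean_analytic, se = se_mc))
```

If the two means agree to within a few standard errors, the sampling loop is at least consistent with the closed-form sum for these inputs.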

I would also like to compute the variance of $$Z$$, perhaps with a simple addition to my `R script` such as:

``````
# Print the variance of Z for each bin
for (i in 1:nRed) {
  print(paste0('red=',i,' var_exp = ', var(y3_1[,i])))
}
``````
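
For the variance, a similar check is possible if the $$a_{\ell m}$$ are assumed to be mutually independent, which is an extra assumption. In that case $$\text{Var}(\text{ChiSq}(k)) = 2k$$ gives $$\text{Var}(Z) = \sum_\ell 2\,(2\ell+1)\,C_\ell^2$$. The sketch below compares this with the sample variance for one bin; `ell`, `Cl`, `df`, `terms` and `nSample` are hypothetical names and the spectrum is made up.

```r
set.seed(2)  # for reproducibility

# Hypothetical inputs, for illustration only.
ell <- 2:37
Cl  <- 1 / (ell * (ell + 1))
df  <- 2 * ell + 1
nSample <- 20000

# One column per multipole: column j holds nSample draws of Cl[j] * ChiSq(df[j]).
terms <- sapply(seq_along(ell), function(j) Cl[j] * rchisq(nSample, df = df[j]))
Z <- rowSums(terms)

var_mc       <- var(Z)               # sample variance of the simulated Z
var_analytic <- sum(2 * df * Cl^2)   # sum of Cl^2 * Var(ChiSq(df)), independence assumed

print(c(mc = var_mc, analytic = var_analytic))
```

The sample variance converges more slowly than the sample mean, so a larger `nSample` may be needed before the two values agree closely.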