# Tag: normal-distribution

## #StackBounty: #logistic #normal-distribution #expected-value Expectation of Inverse Logit of Normal Random Variable

*Bounty: 100*

## #StackBounty: #distributions #normal-distribution #kolmogorov-smirnov #equivalence Testing equivalence of two censored distributions

*Bounty: 50*

## #StackBounty: #normal-distribution #stata #beta-distribution Convert normal random variable into beta random variable in STATA

*Bounty: 50*

## #StackBounty: #normal-distribution #normalization #z-score Accounting for differences between judge scoring

*Bounty: 100*

## #StackBounty: #normal-distribution #descriptive-statistics #outliers #boosting #extreme-value Decision trees, Gradient boosting and nor…

*Bounty: 50*

## #StackBounty: #machine-learning #normal-distribution #descriptive-statistics #outliers #extreme-value Decision trees, Gradient boosting…

*Bounty: 50*

## #StackBounty: #bayesian #normal-distribution #expectation-maximization #gaussian-mixture Estimating truncation point in Gaussian mixture

*Bounty: 100*

## #StackBounty: #bayesian #normal-distribution #expectation-maximization #gaussian-mixture Estimating missing data in gaussian mixture

*Bounty: 100*

## #StackBounty: #probability #distributions #normal-distribution #expected-value #ratio Direct way of calculating $mathbb{E} left[ fra…

*Bounty: 150*

I have a random variable $Y = \frac{e^{X}}{1 + e^{X}}$ and I know $X \sim N(\mu, \sigma^2)$.

Is there a way to compute $\mathbb{E}(Y)$? I have tried to work out the integral, but haven’t made much progress. Is it even possible?
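For what it’s worth, this is the logistic-normal integral, which has no closed form; it is usually handled numerically, or with the well-known approximation $\mathbb{E}(Y) \approx \operatorname{logit}^{-1}\!\left(\mu/\sqrt{1+\pi\sigma^{2}/8}\right)$. A minimal numerical sketch using Gauss-Hermite quadrature (the function name is mine; numpy only):

```python
import numpy as np

def expit_mean(mu, sigma, n=64):
    """E[e^X / (1 + e^X)] for X ~ N(mu, sigma^2), via Gauss-Hermite
    quadrature for the probabilists' weight exp(-t^2 / 2)."""
    nodes, weights = np.polynomial.hermite_e.hermegauss(n)
    x = mu + sigma * nodes
    # the weights sum to sqrt(2*pi), so divide to turn the sum into an expectation
    return np.sum(weights / (1.0 + np.exp(-x))) / np.sqrt(2.0 * np.pi)
```

By symmetry, `expit_mean(0, sigma)` is exactly 0.5 for any sigma, which is a handy correctness check.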

If we observe two censored distributions, where all observations above a cutoff are set at the value of the cutoff, **how can we test whether the observable distributions suggest that the two censored distributions come from the same true distribution?**

We could imagine this for income data where incomes above a certain level are reported in a form like ‘$250k/year and greater.’ Or we could imagine data on campaign contributions where people can donate at most some cap $X$, but some probably would have donated more in the absence of the cap.

For example:

```
# Two samples from the same N(0, 5^2) distribution,
# censored at different cutoffs (5 and 10)
d1 <- rnorm(n = 1000, sd = 5)
d2 <- rnorm(n = 1000, sd = 5)
d1 <- pmin(d1, 5)
d2 <- pmin(d2, 10)
par(mfrow = c(1, 2))
hist(d1, xlim = c(-20, 20), ylim = c(0, 200))
hist(d2, xlim = c(-20, 20), ylim = c(0, 200))
```
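One hedged option (a sketch, not a full answer): below the smaller of the two cutoffs both samples are fully observed, and if the two true distributions are equal then so are their conditional distributions in that region, so an ordinary two-sample Kolmogorov-Smirnov test applies there. This discards the information carried by the point masses at the caps. A Python sketch mirroring the R example above:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
cut1, cut2 = 5, 10
d1 = np.minimum(rng.normal(scale=5, size=1000), cut1)  # censored at 5
d2 = np.minimum(rng.normal(scale=5, size=1000), cut2)  # censored at 10

# Compare only the region where both samples are uncensored; the point
# masses at the caps would violate the KS test's continuity assumption.
c = min(cut1, cut2)
stat, pvalue = ks_2samp(d1[d1 < c], d2[d2 < c])
```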

I need to generate two random variables – lognormal and beta distributed – while ensuring that the correlation between the two variables is -0.3.

I generated two normal random variables with -0.3 correlation as follows:

```
matrix C = (1, -0.3 \ -0.3, 1)
drawnorm x y, mean(0.921, 0) sds(0.174, 1) corr(C)
// Here x is a normal rv with mean 0.921 and sd 0.174, while y is a standard normal rv.
```

Converting x to lognormal is simple:

```
gen price = exp(x)
// price is now the lognormal(0.921, 0.174)
```

The problem is converting $y \sim N(0,1)$ to $\text{Beta}(\alpha,\beta)$. Is there a way to do it?
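One standard route is the probability integral transform: push $y$ through the standard normal CDF to get a uniform, then through the Beta quantile function. A Python sketch (the Beta parameters 2 and 5 are illustrative, not from the question):

```python
import numpy as np
from scipy.stats import norm, beta

rng = np.random.default_rng(0)
rho, sd_x = -0.3, 0.174
# mirrors the drawnorm call: x ~ N(0.921, 0.174^2), y ~ N(0, 1), corr -0.3
cov = [[sd_x**2, rho * sd_x], [rho * sd_x, 1.0]]
x, y = rng.multivariate_normal([0.921, 0.0], cov, size=10_000).T

price = np.exp(x)                # lognormal, as in the question
u = norm.cdf(y)                  # N(0,1) -> Uniform(0,1)
z = beta.ppf(u, 2.0, 5.0)        # Uniform(0,1) -> Beta(2, 5)
```

In Stata this should correspond to something like `gen u = normal(y)` followed by `gen z = invibeta(2, 5, u)`, though you should verify the function names for your version. Note the transform preserves rank (Spearman) correlation exactly, while the Pearson correlation drifts somewhat from -0.3.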

Consider a competition with 10,000 entrants and 200 judges. Each entrant gets scored on a scale of 0-100 by 2 different judges for a total of 20,000 scores.

I want to remove any judge-to-judge variation in means and standard deviations. To do this I’m using a Z-score for each judge’s scores and converting that to a T-score to put it back on a 0-100 scale.

In R I’m doing

`df$z_score <- ave(df$score, df$judge, FUN=scale)`

`df$t_score <- ave(df$score, df$judge, FUN = function(x) rescale(x, mean=50, sd=10, df=FALSE))`

in Python the code would be

`df['Z-Score'] = df.groupby('judge')['score'].transform(lambda x: stats.zscore(x, ddof=1))`

`df['T-Score'] = df['Z-Score'].transform(lambda x: x * 10 + 50)`

However, for a variety of reasons, some judges only scored a handful of entrants. Let’s say between 3 and 20.

Is it valid to calculate a Z-score/T-score for those particular judges’ scores, given that the mean and standard deviation may be skewed by the small sample, or should I run a different test?
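As a quick sanity check on why n = 3 is a problem, one can simulate how variable the per-judge SD estimate is (illustrative true scores N(50, 10^2); names are mine). A common remedy is to shrink small-n judges’ means and SDs toward the pooled values rather than z-scoring them in isolation.

```python
import numpy as np

rng = np.random.default_rng(0)
true_sd = 10.0

# Sampling distribution of the SD estimate a z-score would be built on:
# judges who scored 3 entrants vs. judges who scored 100.
sd_n3 = rng.normal(50.0, true_sd, size=(100_000, 3)).std(axis=1, ddof=1)
sd_n100 = rng.normal(50.0, true_sd, size=(10_000, 100)).std(axis=1, ddof=1)

print(sd_n3.mean(), sd_n3.std())      # biased low, very noisy
print(sd_n100.mean(), sd_n100.std())  # tight around 10
```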

I have a question regarding the normality of predictors. I have 100,000 observations in my data. The problem I am analysing is a classification problem: 5% of the data (5,000 observations) is assigned to class 1 and 95,000 observations to class 0, so the data is highly imbalanced. However, the class 1 observations are expected to have extreme values.

- What I have done is trim the top 1% and bottom 1% of the data, removing any possible data-entry mistakes.
- Winsorised the data at the 5% and 95% levels (which I have checked and is an accepted practice when dealing with data like mine).

So:

I plot a density plot of one variable after no outlier manipulation

Here is the same variable after trimming the data at the 1% level

Here is the variable after being trimmed and after being winsorised

My question is: how should I approach this problem?

**First question**: should I just leave the data after trimming it, or should I continue to winsorise to condense the extreme values further into more meaningful values (since even after trimming I am still left with what I feel are extreme values)? If I just leave the data after trimming, I am left with long tails in the distribution (however, the observations that I am trying to classify mostly fall at the tail end of these plots).

**Second question**: since decision trees and gradient-boosted trees decide on splits, does the distribution matter? What I mean is: if the tree splits on a variable at (using the plots above) <= -10, then according to plot 2 (after trimming) and plot 3 (after winsorisation), all firms <= -10 will be classified as class 1.

Consider the decision tree I created below.

My argument is that, regardless of the spikes in the data (created by winsorisation), the decision tree will make the classification at all observations <= 0. So the distribution of that variable should not matter in making the split; it will only affect the value at which that split occurs, and I do not lose much predictive power in these tails. Is that right?
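The second-question intuition can be checked directly. A tree split depends only on the ordering of a feature’s values, so any strictly monotone transform leaves the chosen partition unchanged (winsorisation is monotone but not strictly so at the caps, where it creates ties). A self-contained sketch with a hand-rolled Gini split on hypothetical data:

```python
import numpy as np

def best_split_mask(x, y):
    """Boolean mask of the left side of the single best Gini split,
    scanning thresholds midway between consecutive distinct sorted values."""
    xs = np.sort(np.unique(x))
    best_gini, best_mask = np.inf, None
    for lo, hi in zip(xs[:-1], xs[1:]):
        left = x <= 0.5 * (lo + hi)
        gini = 0.0
        for mask in (left, ~left):
            p = y[mask].mean()                 # class-1 fraction on this side
            gini += mask.mean() * 2.0 * p * (1.0 - p)
        if gini < best_gini:
            best_gini, best_mask = gini, left
    return best_mask

rng = np.random.default_rng(0)
x = rng.normal(size=300)
y = (x > 1.5).astype(float)                 # the rare class lives in the right tail

raw = best_split_mask(x, y)
squashed = best_split_mask(np.tanh(x), y)   # strictly monotone squashing of the tails
print(np.array_equal(raw, squashed))        # True: same partition either way
```

The threshold values themselves differ between the two fits; only the induced partition of the observations is identical.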

I have data modeled as a mixture of two Gaussian distributions. The data is “clipped”, i.e. there is data only for values greater than a threshold $t$, even though it is feasible for data to exist in the range $(-\infty, t]$. How can I estimate the probability distribution of $t$, given the data?
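One route, sketched under strong assumptions (two known components, synthetic data, all names mine): write down the left-truncated mixture likelihood with the truncation point $t$ as a free parameter and maximise it numerically. Note the likelihood increases as $t$ approaches the sample minimum, so the plain MLE of $t$ ends up at (essentially) the smallest observed value; a prior on $t$ would regularise this.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)
# synthetic clipped data: only values above the threshold survive
t_true = 0.5
full = np.concatenate([rng.normal(-1, 1, 2000), rng.normal(3, 1, 2000)])
data = full[full > t_true]

def nll(params):
    """Negative log-likelihood of a 2-component normal mixture
    left-truncated at t; t is bounded above by min(data)."""
    w, m1, s1, m2, s2, t = params
    pdf = w * norm.pdf(data, m1, s1) + (1 - w) * norm.pdf(data, m2, s2)
    surv = 1.0 - (w * norm.cdf(t, m1, s1) + (1 - w) * norm.cdf(t, m2, s2))
    if surv <= 0 or not np.all(pdf > 0):
        return np.inf
    return -np.sum(np.log(pdf)) + len(data) * np.log(surv)

res = minimize(
    nll,
    x0=[0.5, 0.0, 1.0, 2.5, 1.0, float(data.min()) - 0.5],
    bounds=[(0.01, 0.99), (-10, 10), (0.05, 10),
            (-10, 10), (0.05, 10), (None, float(data.min()))],
    method="L-BFGS-B",
)
t_hat = res.x[5]   # driven toward the sample minimum
```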

Considering the following random vectors:

\begin{align}
\textbf{h} &= [h_{1}, h_{2}, \ldots, h_{M}]^{T} \sim \mathcal{CN}\left(\textbf{0}_{M}, d\,\textbf{I}_{M \times M}\right), \\[8pt]
\textbf{w} &= [w_{1}, w_{2}, \ldots, w_{M}]^{T} \sim \mathcal{CN}\left(\textbf{0}_{M}, \frac{1}{p}\,\textbf{I}_{M \times M}\right), \\[8pt]
\textbf{y} &= [y_{1}, y_{2}, \ldots, y_{M}]^{T} \sim \mathcal{CN}\left(\textbf{0}_{M}, \left(d + \frac{1}{p}\right)\textbf{I}_{M \times M}\right),
\end{align}

where $\textbf{y} = \textbf{h} + \textbf{w}$ and therefore $\textbf{y}$ and $\textbf{h}$ are not independent.

I’m trying to find the following expectation:

$$\mathbb{E}\left[\frac{\textbf{h}^{H}\,\textbf{y}\textbf{y}^{H}\,\textbf{h}}{\|\textbf{y}\|^{4}}\right],$$

where $\|\textbf{y}\|^{4} = (\textbf{y}^{H}\textbf{y})(\textbf{y}^{H}\textbf{y})$.

In order to find the desired expectation, I’m applying the following approximation:

$$\mathbb{E}\left[\frac{\textbf{x}}{\textbf{z}}\right] \approx \frac{\mathbb{E}[\textbf{x}]}{\mathbb{E}[\textbf{z}]} - \frac{\operatorname{cov}(\textbf{x},\textbf{z})}{\mathbb{E}[\textbf{z}]^{2}} + \frac{\mathbb{E}[\textbf{x}]}{\mathbb{E}[\textbf{z}]^{3}}\,\operatorname{var}(\textbf{z}).$$

However, applying this approximation to the desired expectation is time consuming and prone to errors, as it involves expansions with lots of terms.

I have been wondering if there is a more direct/smarter way of finding the desired expectation.

$\textbf{UPDATE 21-04-2018}$: I’ve created a simulation in order to identify the pdf shape of the ratio inside the expectation operator, and it looks much like the pdf of a Gaussian random variable. Additionally, I’ve noticed that the ratio results in real-valued terms; the imaginary part is always equal to zero.

Is there another kind of approximation that can be used to find the expectation (an analytical/closed-form result, not only the simulated value of the expectation), given that the pdf looks Gaussian and probably can be approximated as such?
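For reference, the quantity is cheap to simulate, and the zero imaginary part is structural: the numerator is $\textbf{h}^{H}\textbf{y}\textbf{y}^{H}\textbf{h} = |\textbf{h}^{H}\textbf{y}|^{2}$, which is real and non-negative. A minimal Monte Carlo sketch ($M$, $d$, $p$ are illustrative values of my own choosing):

```python
import numpy as np

rng = np.random.default_rng(0)
M, d, p, n = 4, 2.0, 1.0, 200_000     # illustrative, not from the post

def cn(var, shape):
    """Circularly-symmetric complex normal, CN(0, var) per entry."""
    return np.sqrt(var / 2.0) * (rng.standard_normal(shape)
                                 + 1j * rng.standard_normal(shape))

h = cn(d, (n, M))
w = cn(1.0 / p, (n, M))
y = h + w

inner = np.sum(np.conj(h) * y, axis=1)   # h^H y, one scalar per draw
ratio = np.abs(inner) ** 2 / np.sum(np.abs(y) ** 2, axis=1) ** 2
est = ratio.mean()
```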

$\textbf{UPDATE 24-04-2018}$: I’ve found an approximation for the case where $\textbf{h}$ and $\textbf{y}$ are independent.

$$\mathbb{E}\left[\frac{\textbf{h}_{l}^{H}\,\textbf{y}_{k}\textbf{y}_{k}^{H}\,\textbf{h}_{l}}{\|\textbf{y}_{k}\|^{4}}\right] = \frac{d_{l}\left[(M+1)(M-2)+4M+6\right]}{\zeta_{k}M(M+1)^{2}}$$

where $\zeta_{k} = d_{k} + \frac{1}{p}$, $\textbf{h}_{l} \sim \mathcal{CN}\left(\textbf{0}_{M}, d_{l}\textbf{I}_{M \times M}\right)$ and $\textbf{h}_{k} \sim \mathcal{CN}\left(\textbf{0}_{M}, d_{k}\textbf{I}_{M \times M}\right)$. Note that $\textbf{y}_{k} = \textbf{h}_{k} + \textbf{w}$ and that $\textbf{h}_{k}$ and $\textbf{h}_{l}$ are independent.

I have used the following approximation:

$$\mathbb{E}\left[\frac{\textbf{x}}{\textbf{z}}\right] \approx \frac{\mathbb{E}[\textbf{x}]}{\mathbb{E}[\textbf{z}]} - \frac{\operatorname{cov}(\textbf{x},\textbf{z})}{\mathbb{E}[\textbf{z}]^{2}} + \frac{\mathbb{E}[\textbf{x}]}{\mathbb{E}[\textbf{z}]^{3}}\,\operatorname{var}(\textbf{z}).$$
