#StackBounty: #probability #distributions #normal-distribution #expected-value #ratio Direct way of calculating $mathbb{E} left[ fra…

Bounty: 150

Considering the following random vectors:

$$\begin{aligned}
\textbf{h} &= [h_{1}, h_{2}, \ldots, h_{M}]^{T} \sim \mathcal{CN}\left(\textbf{0}_{M}, d\,\textbf{I}_{M \times M}\right), \\[8pt]
\textbf{w} &= [w_{1}, w_{2}, \ldots, w_{M}]^{T} \sim \mathcal{CN}\left(\textbf{0}_{M}, \frac{1}{p}\textbf{I}_{M \times M}\right), \\[8pt]
\textbf{y} &= [y_{1}, y_{2}, \ldots, y_{M}]^{T} \sim \mathcal{CN}\left(\textbf{0}_{M}, \left(d + \frac{1}{p}\right)\textbf{I}_{M \times M}\right),
\end{aligned}$$

where $\textbf{y} = \textbf{h} + \textbf{w}$ and therefore $\textbf{y}$ and $\textbf{h}$ are not independent.

I’m trying to find the following expectation:

$$\mathbb{E} \left[ \frac{\textbf{h}^{H} \textbf{y}\textbf{y}^{H} \textbf{h}}{ \| \textbf{y} \|^{4} } \right],$$

where $\| \textbf{y} \|^{4} = (\textbf{y}^{H} \textbf{y}) (\textbf{y}^{H} \textbf{y})$.

In order to find the desired expectation, I’m applying the following approximation:

$$\mathbb{E} \left[ \frac{\textbf{x}}{\textbf{z}} \right] \approx \frac{\mathbb{E}[\textbf{x}]}{\mathbb{E}[\textbf{z}]} - \frac{\text{cov}(\textbf{x},\textbf{z})}{\mathbb{E}[\textbf{z}]^{2}} + \frac{\mathbb{E}[\textbf{x}]}{\mathbb{E}[\textbf{z}]^{3}}\text{var}(\textbf{z}).$$

However, applying this approximation to the desired expectation is time-consuming and error-prone, as it involves expansions with many terms.

I have been wondering if there is a more direct/smarter way of finding the desired expectation.
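
A minimal numerical sketch for checking any candidate answer, using only the Python standard library. It assumes $\mathcal{CN}(0,\sigma^2)$ means independent real and imaginary parts with variance $\sigma^2/2$ each. The function `candidate_closed_form` is not from the question: it comes from writing $\textbf{h} = \alpha\textbf{y} + \textbf{e}$ with $\alpha = pd/(pd+1)$ and $\textbf{e}$ independent of $\textbf{y}$, which suggests the value $pd\,(pd + 1/(M-1))/(pd+1)^2$ for $M > 1$. Treat it as a hypothesis to be tested against the simulation, not as an established result.

```python
import math
import random


def complex_gaussian_vector(rng, size, variance):
    """Draw `size` i.i.d. CN(0, variance) samples (real/imag parts each N(0, variance/2))."""
    sigma = math.sqrt(variance / 2.0)
    return [complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for _ in range(size)]


def monte_carlo_ratio(M, d, p, trials=50_000, seed=0):
    """Monte Carlo estimate of E[ h^H y y^H h / ||y||^4 ] with y = h + w."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        h = complex_gaussian_vector(rng, M, d)
        w = complex_gaussian_vector(rng, M, 1.0 / p)
        y = [hi + wi for hi, wi in zip(h, w)]
        inner = sum(yi.conjugate() * hi for yi, hi in zip(y, h))  # y^H h
        norm_sq = sum(abs(yi) ** 2 for yi in y)                   # ||y||^2
        total += abs(inner) ** 2 / norm_sq ** 2                   # h^H y y^H h = |y^H h|^2
    return total / trials


def candidate_closed_form(M, d, p):
    """Hypothesised value pd (pd + 1/(M-1)) / (pd + 1)^2; an assumption, requires M > 1."""
    pd = p * d
    return pd * (pd + 1.0 / (M - 1)) / (pd + 1.0) ** 2
```

Comparing `monte_carlo_ratio(4, 2.0, 0.5)` with `candidate_closed_form(4, 2.0, 0.5)` gives a quick sanity check of both the Taylor approximation above and the candidate expression.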

Get this bounty!!!

#StackBounty: #probability #graphical-model #bayesian-network Predicting which cars and systems were fixed from the parts that were ord…

Bounty: 50

Interesting problem from the auto industry. Wondering if anyone has suggestions on a good approach. An auto manufacturer sells cars to independent dealers, who sell to end customers. When cars break down, the dealer fixes them and orders parts from the manufacturer, but seldom tells the manufacturer which car was fixed or even which system was being fixed. Some parts, for example, are used in multiple systems and multiple cars. The manufacturer would like to know, among other things, the durability of each part in each system of each car. In general, they would like to assign a car and a system to each of these servicings. What’s the best way to do that?

I figure this is a probabilistic problem that might benefit from some kind of graphical model. I also suspect the problem arises in other areas and might have some good known solutions.

They do know which parts are used in each system of each car. Sometimes they are given complete information.
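
One way to read the problem is as inference over a latent (car, system) label for each order. The sketch below is a naive Bayes illustration, not a known industry solution: every name, the `noise` parameter, and the likelihood model (each ordered part is drawn uniformly from the candidate's part list, with a small chance of an unrelated part) are assumptions made for illustration. The priors could plausibly be estimated from the servicings where complete information is given.

```python
def assignment_posterior(ordered_parts, candidates, priors, all_parts, noise=0.05):
    """Posterior over (car, system) labels for one order.

    candidates: {(car, system): set of parts used by that system of that car}
    priors:     {(car, system): prior probability of servicing it}
    noise:      hypothetical probability that a part is unrelated to the label
    """
    weights = {}
    for key, parts in candidates.items():
        likelihood = 1.0
        for part in ordered_parts:
            # Part explained by the candidate system, or by background noise.
            from_system = (1.0 - noise) / len(parts) if part in parts else 0.0
            likelihood *= from_system + noise / len(all_parts)
        weights[key] = priors[key] * likelihood
    total = sum(weights.values())
    return {key: weight / total for key, weight in weights.items()}
```

With shared parts the posterior stays spread over several labels, which is the ambiguity the question describes. A fuller treatment might fit the priors and part probabilities jointly (for example with EM), but that goes beyond this sketch.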


#StackBounty: #probability #distributions #binomial #graph-theory Distribution and Variance of Count of Triangles in Random Graph

Bounty: 50

Consider an Erdős-Rényi random graph $G=(V(n),E(p))$. The set of $n$ vertices $V$ is labelled by $V = \{1,2,\ldots,n\}$. The set of edges $E$ is constructed by a random process.

Let $p$ be a probability $0<p<1$, then each unordered pair $\{i,j\}$ of vertices ($i \neq j$) occurs as an edge in $E$ with probability $p$, independently of the other pairs.

A triangle in $G$ is an unordered triple $\{i,j,k\}$ of distinct
vertices, such that $\{i,j\}$, $\{j,k\}$, and $\{k,i\}$ are edges in $G$.

The maximum number of possible triangles is $\binom{n}{3}$. Define the random variable $X$ to be the observed count of triangles in the graph $G$.

The probability that three links are simultaneously present is $p^3$. Therefore, the expected value of $X$ is given by $E(X) = \binom{n}{3} p^3$. Naively, one may guess that the variance is given by $\text{Var}(X) = \binom{n}{3} p^3 (1-p^3)$, but this is not the case.

The following Mathematica code simulates the problem:

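(* myCounts holds the simulated triangle counts; n and p are the graph parameters *)
N[Mean[myCounts]]                       (* 4216. : similar to the expected mean *)
Binomial[n, 3] p^3                      (* 4233.6 *)
N[StandardDeviation[myCounts]]          (* 262.078 : not similar to the "expected" std *)
Sqrt[Binomial[n, 3] (p^3) (1 - p^3)]    (* 57.612 *)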

What is the variance of $X$?
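
A sketch of one standard way to get the variance: write $X$ as a sum of triangle indicators and add the covariances of pairs of triangles that share an edge (pairs sharing at most one vertex are independent). There are $12\binom{n}{4}$ ordered pairs sharing exactly one edge, each with covariance $p^5 - p^6$, which gives $\text{Var}(X) = \binom{n}{3}p^3(1-p^3) + 12\binom{n}{4}p^5(1-p)$. The function names below are illustrative, and the brute-force check is only feasible for very small $n$.

```python
from itertools import combinations
from math import comb


def triangle_count_variance(n, p):
    """Var(X) = C(n,3) p^3 (1 - p^3) + 12 C(n,4) p^5 (1 - p)."""
    return comb(n, 3) * p**3 * (1 - p**3) + 12 * comb(n, 4) * p**5 * (1 - p)


def exact_variance_by_enumeration(n, p):
    """Brute-force Var(X) by summing over all 2^C(n,2) graphs (tiny n only)."""
    pairs = list(combinations(range(n), 2))
    triples = list(combinations(range(n), 3))
    mean = second_moment = 0.0
    for mask in range(2 ** len(pairs)):
        edges = {pair for bit, pair in enumerate(pairs) if (mask >> bit) & 1}
        prob = p ** len(edges) * (1 - p) ** (len(pairs) - len(edges))
        count = sum(
            1 for i, j, k in triples
            if (i, j) in edges and (j, k) in edges and (i, k) in edges
        )
        mean += prob * count
        second_moment += prob * count ** 2
    return second_moment - mean ** 2
```

The reported figures $4233.6$ and $57.612$ are consistent with $n = 50$ and $p = 0.6$ (an inference from the numbers, not stated in the question). For those values the formula gives a standard deviation of roughly 299, the same order as the simulated 262.078 and far from 57.612.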


#StackBounty: #probability Calculate probability of being in the top-N based on pairwise probabilities

Bounty: 50

I have a set of, say, 20 documents and I have calculated all the pairwise probabilities that doc_x is more important than doc_y (e.g. P(doc_1 > doc_2) = 0.7). However, the probabilities may be inconsistent, i.e., there may exist three documents where P(doc_1 > doc_2) > P(doc_2 > doc_3) > P(doc_3 > doc_1).

Is there a way to estimate the probability of each document belonging in the top-N most important documents? In other words, how can I say that P(doc_1 belongs in the top-5 documents) = ...%?
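
One heuristic, offered as a sketch and not as the established answer: simulate many "tournaments" by sampling each pairwise comparison independently from the given probabilities, rank documents by their number of wins, and record how often each document lands in the top N. The independence of comparisons, the win-count ranking, and the random tie-breaking are all assumptions. The code reads only `pairwise[i][j]` for `i < j` and treats P(doc_j > doc_i) as its complement.

```python
import random


def top_n_probabilities(pairwise, top_n, trials=10_000, seed=0):
    """Estimate P(doc in top_n) where pairwise[i][j] = P(doc i > doc j) for i < j."""
    rng = random.Random(seed)
    k = len(pairwise)
    hits = [0] * k
    for _ in range(trials):
        wins = [0] * k
        for i in range(k):
            for j in range(i + 1, k):
                # Sample each comparison independently (a modelling assumption).
                if rng.random() < pairwise[i][j]:
                    wins[i] += 1
                else:
                    wins[j] += 1
        # Rank by number of wins, breaking ties at random.
        order = sorted(range(k), key=lambda doc: (wins[doc], rng.random()), reverse=True)
        for doc in order[:top_n]:
            hits[doc] += 1
    return [h / trials for h in hits]
```

Because the procedure never requires a consistent global ordering, it tolerates cyclic or otherwise inconsistent pairwise probabilities. The estimates always sum to `top_n`.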


#StackBounty: #probability #mathematical-statistics #confidence-interval #inference Confidence Intervals: how to formally deal with $P(…

Bounty: 100

I was thinking about the formal definition of confidence intervals.

Given a random sample $\textbf{X} = X_1,X_2,\dots,X_n$, a level $\alpha$ confidence interval for the population parameter $\theta$ is defined as a pair of estimators (functions of $\textbf{X}$), namely $L(\textbf{X})$ and $U(\textbf{X})$ with $L(\textbf{X}) \leq U(\textbf{X})$, with the property:
$$P(L(\textbf{X}) \leq \theta \leq U(\textbf{X})) = 1-\alpha$$

My interpretation of this last equality is that it is a joint probability statement about $L(\textbf{X})$ and $U(\textbf{X})$, i.e.

$$P(L(\textbf{X}) \leq \theta, U(\textbf{X}) \geq \theta) = 1-\alpha$$

My question is how to work with this expression to find $L(\textbf{X})$ and $U(\textbf{X})$.

If I call $f_{L,U}(l,u)$ the joint pdf of $L(\textbf{X})$ and $U(\textbf{X})$, it should be something like:

$$\int_{-\infty}^{\theta}\int_{\theta}^{+\infty} f_{L,U}(l,u)\,du\,dl = 1 - \alpha$$

and then I am stuck. I don’t know how to go on.

My first question is: since both $L(\textbf{X})$ and $U(\textbf{X})$ are functions of the same $\textbf{X}$, does something like the joint density even make sense?

I don’t know if my calculation is right, but I found that something similar to a joint CDF for $L(\textbf{X})$ and $U(\textbf{X})$ could be (if $X \in \mathbb{R}$)

$$F_{L,U}(l,u)= F_{\textbf{X}}\bigg(\min\big(L^{-1}(l),U^{-1}(u)\big)\bigg)$$
if that makes any sense at all. What is the correct way to think about this?
I know that there are methods like pivotal quantities, but the difference with respect to my case is that with a pivot the probabilistic statement is expressed in terms of only one random variable, so I don’t have a joint density.

Suppose that I call $Q(\textbf{X},\theta)$ my pivot. I can find, say, $l$ and $u$ such that:
$$P\big(l \leq Q(\textbf{X},\theta) \leq u \big) = 1 - \alpha$$
and then I can invert this relationship with respect to $\theta$. I suppose this equality could be rewritten as:

$$F_{Q}(u) - F_{Q}(l) = 1 - \alpha$$

This last equation has infinitely many solutions since I have 2 unknowns, $l$ and $u$. My understanding is that, in order to solve it, one has to choose a value for one of $l$ and $u$. For instance, I could choose $l$ such that $F_{Q}(l) = 2\%$ and solve for $u$ so that $F_{Q}(u) = 1 - \alpha + 2\%$.

And here is my second question: are there then infinitely many level-$\alpha$ confidence intervals? And is the difference between them their length? Is my reasoning correct?

Any help would be much appreciated! Thank you.
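
The pivot argument above can be made concrete with a small sketch. It assumes a normal sample with known standard deviation `sigma`, so that $Q = \sqrt{n}(\bar{X} - \theta)/\sigma$ is standard normal; that model and the function names are assumptions for illustration. Each choice of `lower_tail` $= F_Q(l)$ in $(0, \alpha)$ yields a different interval with the same coverage.

```python
from statistics import NormalDist


def pivot_interval(alpha, lower_tail):
    """Quantiles (l, u) with F_Q(l) = lower_tail and F_Q(u) = 1 - alpha + lower_tail."""
    if not 0.0 < lower_tail < alpha:
        raise ValueError("lower_tail must lie strictly between 0 and alpha")
    z = NormalDist()  # distribution of the pivot Q under the assumed model
    return z.inv_cdf(lower_tail), z.inv_cdf(1.0 - alpha + lower_tail)


def interval_for_mean(sample_mean, sigma, n, alpha, lower_tail):
    """Invert l <= sqrt(n) (sample_mean - theta) / sigma <= u for theta."""
    l, u = pivot_interval(alpha, lower_tail)
    scale = sigma / n ** 0.5
    return sample_mean - u * scale, sample_mean - l * scale
```

For example, `pivot_interval(0.05, 0.02)` and `pivot_interval(0.05, 0.025)` both cover with probability 0.95, but for this symmetric pivot the equal-tailed choice gives the shorter interval.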


#StackBounty: #probability #distributions #random-variable CDF of a function of two random variables

Bounty: 50

Define $Y_1$ and $Y_2$ to be two positive and independent random variables, for which the pdf (probability density function) is the same and is given as:
$f(y) = \beta \exp\left(- \beta y\right), \beta>0$.
Suppose that time is slotted, and $t$ ($=0,1,2,\ldots$) is the time index. $Y_1$ is associated with time $t$ and $Y_2$ with time $t+1$.
For $a,b,c>0$, $a>1$ and $c=\frac{\log(a)}{b}$, we define $p(s)$ to be some error probability that is a function of $s$, which is given as follows
$$p(s) = \begin{cases} a \exp\left( - b s \right), & \text{for } s \ge c \\ 1, & \text{for } 0 < s < c \end{cases}$$
At time $t$, $s=y_1$. And at time $t+1$, $s=y_1+y_2$.
Let $E_1$ represent the event of having an error at time $t$; the corresponding probability is $p(y_1)$. Also, define $E_2$ to be the event that there is an error at time $t+1$; the corresponding probability is $p(y_1+y_2)$. For $E_2$ to happen (at $t+1$), a necessary condition is that $E_1$ happens (at $t$).

Let $E$ be the event that there is an error at time $t$ and $t+1$. So we have $\mathbb{P}\{ E \}= \mathbb{P}\{ E_1, E_2 \}= \mathbb{P}\{ E_1 \} \mathbb{P}\{ E_2 \mid E_1 \} = p(y_1) p(y_1+y_2)$. So if the realisations $y_1$ and $y_2$ of $Y_1$ and $Y_2$ are known, the (global) error probability is $p(y_1) p(y_1+y_2)$.

I am interested in the case where at $t$ and $t+1$ the realisations of, respectively, $Y_1$ and $Y_2$ are not known. Let $Z= p(Y_1) p(Y_1+Y_2)$.
So in this case I want to derive the expected value of $Z$. I also want to derive the CCDF (or CDF) of $Z$.

Here is my first attempt at a solution:
$$\begin{aligned}
\mathbb{E} \left\{Z\right\} &= \int_0^\infty \int_0^\infty p(y_1)\, p(y_1+y_2) \, f(y_1) f(y_2) \,dy_1\, dy_2 \\
&= \int_0^c \int_0^c \beta \exp\left(- \beta y_1 \right) \beta \exp\left(- \beta y_2 \right) dy_2\, dy_1 \\
&\quad + \int_0^c \int_{c-y_1}^\infty a\exp\left(-b(y_1 +y_2)\right) \beta \exp\left(- \beta y_1 \right) \beta \exp\left(- \beta y_2 \right) dy_2\, dy_1 \\
&\quad + \int_c^\infty \int_{0}^\infty a\exp\left(-b y_1\right) a\exp\left(-b(y_1 +y_2)\right) \beta \exp\left(- \beta y_1 \right) \beta \exp\left(- \beta y_2 \right) dy_2\, dy_1.
\end{aligned}$$
Is the above derivation correct?

CCDF $= \Pr\left\{Z > z \right\} =$ ?

Please note that if explicit expressions are difficult to derive, I need to at least write these expressions as integrals in terms of $a$, $b$, $c$, and $\beta$, as done for $\mathbb{E}\left\{Z\right\}$.
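
A numerical check of $\mathbb{E}\{Z\}$ can help validate any such derivation. The sketch below splits the plane into three regions ($y_1 + y_2 < c$; $y_1 < c \le y_1 + y_2$; $y_1 \ge c$) and integrates each in closed form using $a e^{-bc} = 1$. Under this reading the first region's inner integral runs over $0 < y_2 < c - y_1$, which differs from the inner upper limit $c$ in the attempt above. The closed-form terms are this sketch's own working and should be treated as an assumption to be verified against the simulation.

```python
import math
import random


def error_probability(s, a, b):
    """p(s) = a exp(-b s) for s >= c and 1 for 0 < s < c, with c = log(a) / b."""
    c = math.log(a) / b
    return a * math.exp(-b * s) if s >= c else 1.0


def expected_z_closed_form(a, b, beta):
    """Sum of the three region integrals, simplified with a * exp(-b c) = 1."""
    c = math.log(a) / b
    decay = math.exp(-beta * c)
    both_one = 1.0 - decay * (1.0 + beta * c)                      # y1 + y2 < c
    second_decays = beta**2 * c * decay / (b + beta)               # y1 < c <= y1 + y2
    both_decay = beta**2 * decay / ((2 * b + beta) * (b + beta))   # y1 >= c
    return both_one + second_decays + both_decay


def expected_z_monte_carlo(a, b, beta, trials=200_000, seed=0):
    """Monte Carlo estimate of E[p(Y1) p(Y1 + Y2)] with Y1, Y2 i.i.d. Exp(beta)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        y1 = rng.expovariate(beta)
        y2 = rng.expovariate(beta)
        total += error_probability(y1, a, b) * error_probability(y1 + y2, a, b)
    return total / trials
```

The same sampler gives an empirical CCDF: the fraction of simulated values of $Z$ exceeding a threshold $z$ estimates $\Pr\{Z > z\}$.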


#StackBounty: #probability #maximum-likelihood #likelihood On the full likelihood of a transformed sample and the partial likelihood

Bounty: 50

I am following the 1975 paper by Cox entitled Partial Likelihood.

Consider a vector $y$ of observations represented by a random variable $Y$ having density $f_Y(y;\theta)$ and suppose that $Y$ is transformed into the sequence $(X_1,S_1,X_2, S_2, \dots,X_m, S_m)$. Then the full likelihood of this sequence is

$$\prod_{j=1}^m f_{X_j \mid X^{(j-1)} , S^{(j-1)}} (x_j \mid x^{(j-1)}, s^{(j-1)}; \theta ) \prod_{j=1}^m f_{S_j \mid X^{(j)} , S^{(j-1)}} (s_j \mid x^{(j)}, s^{(j-1)}; \theta )$$

  • Are we transforming $Y$ into a sequence by, as an example, applying a
    function $f:R \rightarrow R^n$ to $Y$ or is it a different
  • How is the full likelihood obtained? I tried repeated conditioning in the case $m=4$, but in Cox’s formula the marginal densities appear to be conditioned only on the previous term.
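
A short worked case may help with the second point. It assumes that $X^{(j)}$ denotes the whole history $(X_1, \dots, X_j)$, and $S^{(j)}$ likewise, with $X^{(0)}$ and $S^{(0)}$ empty; that reading of the notation is an assumption here and should be checked against the paper. Under it, repeated conditioning for $m = 2$ gives

$$\begin{aligned}
f(x_1, s_1, x_2, s_2; \theta) &= f_{X_1}(x_1; \theta)\, f_{S_1 \mid X_1}(s_1 \mid x_1; \theta)\, f_{X_2 \mid X_1, S_1}(x_2 \mid x_1, s_1; \theta)\, f_{S_2 \mid X_1, S_1, X_2}(s_2 \mid x_1, s_1, x_2; \theta) \\
&= \prod_{j=1}^{2} f_{X_j \mid X^{(j-1)}, S^{(j-1)}}(x_j \mid x^{(j-1)}, s^{(j-1)}; \theta) \prod_{j=1}^{2} f_{S_j \mid X^{(j)}, S^{(j-1)}}(s_j \mid x^{(j)}, s^{(j-1)}; \theta).
\end{aligned}$$

So each factor is conditioned on the full past, not only on the previous term; the superscript $(j-1)$ hides the history, and the two products are just the chain-rule factors regrouped into $X$-terms and $S$-terms.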


#StackBounty: #probability #variance #random-matrix Variance of Random Matrix

Bounty: 50

Let’s consider independent random vectors $\hat{\boldsymbol\theta}_i$, $i = 1, \dots, m$, which are all unbiased for $\boldsymbol\theta$ and satisfy
$$\mathbb{E}\left[\left(\hat{\boldsymbol\theta}_i - \boldsymbol\theta\right)^{T}\left(\hat{\boldsymbol\theta}_i - \boldsymbol\theta\right)\right] = \sigma^2\text{.}$$ Let
$\mathbf{1}_{n \times p}$ be the $n \times p$ matrix of all ones.

Consider the problem of finding
$$\mathbb{E}\left[\left(\hat{\boldsymbol\theta} - \boldsymbol\theta\right)^{T}\left(\hat{\boldsymbol\theta} - \boldsymbol\theta\right)\right]$$ where $$\hat{\boldsymbol\theta} = \dfrac{1}{m}\sum_{i=1}^{m}\hat{\boldsymbol\theta}_i\text{.}$$

My attempt is to notice the fact that $$\hat{\boldsymbol\theta} = \dfrac{1}{m}\underbrace{\begin{bmatrix}
\hat{\boldsymbol\theta}_1 & \hat{\boldsymbol\theta}_2 & \cdots & \hat{\boldsymbol\theta}_m
\end{bmatrix}}_{\mathbf{S}}\mathbf{1}_{m \times 1}$$
and thus
$$\text{Var}(\hat{\boldsymbol\theta}) = \dfrac{1}{m^2}\text{Var}(\mathbf{S}\mathbf{1}_{m \times 1})\text{.}$$
How does one find the variance of a random matrix times a constant vector? You may assume that I am familiar with finding variances of linear transformations of a random vector: i.e., if $\mathbf{x}$ is a random vector, $\mathbf{b}$ a vector of constants, and $\mathbf{A}$ a matrix of constants, assuming all are conformable,
$$\mathbb{E}[\mathbf{A}\mathbf{x}+\mathbf{b}] = \mathbf{A}\mathbb{E}[\mathbf{x}]+\mathbf{b}$$
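
A sketch of one route that avoids the random-matrix formulation altogether, assuming $\hat{\boldsymbol\theta}$ is the average $\frac{1}{m}\sum_{i}\hat{\boldsymbol\theta}_i$ (equivalently $\frac{1}{m}\mathbf{S}\mathbf{1}_{m \times 1}$). Expanding the quadratic form term by term gives

$$\begin{aligned}
\mathbb{E}\left[\left(\hat{\boldsymbol\theta} - \boldsymbol\theta\right)^{T}\left(\hat{\boldsymbol\theta} - \boldsymbol\theta\right)\right]
&= \dfrac{1}{m^2}\sum_{i=1}^{m}\sum_{j=1}^{m}\mathbb{E}\left[\left(\hat{\boldsymbol\theta}_i - \boldsymbol\theta\right)^{T}\left(\hat{\boldsymbol\theta}_j - \boldsymbol\theta\right)\right] \\
&= \dfrac{1}{m^2}\sum_{i=1}^{m}\mathbb{E}\left[\left(\hat{\boldsymbol\theta}_i - \boldsymbol\theta\right)^{T}\left(\hat{\boldsymbol\theta}_i - \boldsymbol\theta\right)\right]
= \dfrac{\sigma^2}{m}\text{.}
\end{aligned}$$

The cross terms with $i \neq j$ vanish because independence lets the expectation factor into $\mathbb{E}[\hat{\boldsymbol\theta}_i - \boldsymbol\theta]^{T}\,\mathbb{E}[\hat{\boldsymbol\theta}_j - \boldsymbol\theta]$, and each factor is zero by unbiasedness.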


#StackBounty: #probability #count-data #frequency #frequentist How to estimate a probability of an event to occur based on its count?

Bounty: 50

I have a generator of random symbols (a single act of generation produces exactly one symbol). I know all the symbols that could be generated, and for each symbol I would like to estimate the probability that it is generated in a single act of generation.

The number of observations (acts of generation) is significantly smaller than the total number of possible symbols. As a consequence, most of the symbols have never been observed in our experiment, and a large number of the observed symbols were observed only once.

The simplest and most straightforward way to estimate the probability of each symbol is to use this formula: $p_i = n_i/\sum_j n_j$, where $n_i$ is the count of symbol $i$.

Is there a better way to estimate the probabilities $p_i$?
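
Two commonly used alternatives, sketched here under stated assumptions and not claimed to be the best choice for this generator: additive (Laplace) smoothing, which gives every symbol in the known alphabet a pseudo-count, and the Good-Turing estimate of the total probability of unseen symbols, taken as the fraction of observations that are singletons. The function names and the default `pseudo_count` are illustrative, and the code assumes every observation belongs to `alphabet`.

```python
from collections import Counter


def additive_smoothing(observations, alphabet, pseudo_count=1.0):
    """p_i = (n_i + pseudo_count) / (N + pseudo_count * |alphabet|)."""
    counts = Counter(observations)
    total = len(observations) + pseudo_count * len(alphabet)
    return {symbol: (counts[symbol] + pseudo_count) / total for symbol in alphabet}


def good_turing_missing_mass(observations):
    """Estimated total probability of never-seen symbols: (#symbols seen once) / N."""
    counts = Counter(observations)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(observations)
```

Unlike $n_i/\sum_j n_j$, the smoothed estimate never assigns probability zero to a symbol that simply has not been observed yet. With many singletons, the Good-Turing figure indicates how much mass the raw frequencies are likely to be misplacing.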
