## #StackBounty: #probability #taylor-series Rate of convergence in probability for likelihood ratio-type function

### Bounty: 50

I define the function
\$\$
\lambda_n(\theta_1,\theta_2)=\frac{P_{\theta_1}(x)}{P_{\theta_2}(x)}=\frac{\theta_1^x(1-\theta_1)^{n-x}}{\theta_2^x(1-\theta_2)^{n-x}}
\$\$
where \$0<\theta_1<1\$ and \$0<\theta_2<1\$. The subscript \$n\$ indicates the dependence of \$\lambda_n(\cdot)\$ on \$n\$. Furthermore, \$0<\lambda_n(\cdot)<1\$.

Let \$\boldsymbol{\hat{\theta}}=(\hat{\theta}_{0,n},\hat{\theta}_n)^T\$ and \$\boldsymbol{\theta}=(\theta_0,\theta)^T\$, where \$\hat{\theta}_{0,n}\$ and \$\hat{\theta}_{n}\$ are the maximum likelihood estimates of \$\theta_0\$ and \$\theta\$, respectively. I’m interested in understanding the rate of convergence (in probability) of \$\lambda_n(\boldsymbol{\hat{\theta}})\stackrel{p}{\to}\lambda_n(\boldsymbol{\theta})\$. This convergence holds by consistency of the MLEs (assuming some regularity conditions), continuity of \$\lambda_n\$, and the continuous mapping theorem.

By invariance, \$\lambda_n(\boldsymbol{\hat{\theta}})\$ is the MLE of \$\lambda_n(\boldsymbol{\theta})\$, and \$(\hat{\theta}_{0,n},\hat{\theta}_n)\$ converges in probability to \$(\theta_0,\theta)\$ with \$\sqrt{n}\$-consistency, i.e. \$\sqrt{n}(\hat{\theta}_{0,n}-\theta_0)=O_p(1)\$ and \$\sqrt{n}(\hat{\theta}_{n}-\theta)=O_p(1)\$.

Question 1: Does the invariance property of MLEs allow us to retain the rate of convergence?

If \$\lambda_n(\cdot)\$ didn’t depend on \$n\$, then this would likely be true. To get at the rate of convergence, I started with a first-order Taylor expansion:

\$\$
\lambda_n(\boldsymbol{\hat{\theta}})=\lambda_n(\boldsymbol{\theta})+(\boldsymbol{\hat{\theta}}-\boldsymbol{\theta})^T\frac{\partial\lambda_n(\boldsymbol{\theta})}{\partial\boldsymbol{\theta}}+o_{p,n}(|\boldsymbol{\hat{\theta}}-\boldsymbol{\theta}|)
\$\$

Note that the remainder term has dependence on \$n\$. Now I’d like to show that
\$\$
\frac{\partial\lambda_n(\boldsymbol{\theta})}{\partial\boldsymbol{\theta}}=O_p(1)
\$\$
in order to get the linear term to be \$o_p(1)\$. I also need to remove the dependence of the remainder term on \$n\$ by showing some uniform (over \$n\$) boundedness of the second derivative, i.e.

\$\$
\left\vert\frac{\partial^2\lambda_n(\boldsymbol{\theta})}{\partial\boldsymbol{\theta}^2}\right\vert\leq M
\$\$

My main concern is that the first and second derivatives involve a factor \$x-\theta n\$, which blows up as \$n\to\infty\$. Is there a way to handle this?
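As a sanity check on that concern, here is a small Monte Carlo sketch (an assumption on my part: \$x \sim \text{Binomial}(n,\theta)\$, as the form of \$P_{\theta}\$ suggests). The factor \$x-\theta n\$ grows only like \$\sqrt{n}\$, so it is \$O_p(\sqrt{n})\$ rather than unbounded after \$\sqrt{n}\$-rescaling:

```python
import numpy as np

# Assumption: x ~ Binomial(n, theta). Then (x - n*theta)/sqrt(n) has standard
# deviation sqrt(theta*(1-theta)) for every n, i.e. x - n*theta is O_p(sqrt(n)).
rng = np.random.default_rng(1)
theta = 0.3
for n in (100, 10_000, 1_000_000):
    x = rng.binomial(n, theta, size=50_000)
    scaled = (x - n * theta) / np.sqrt(n)
    print(n, scaled.std())   # close to sqrt(0.21) ~ 0.458 for every n
```

This suggests the blow-up can be absorbed by the \$\sqrt{n}\$-rate of the MLE, which is exactly what the question is after.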

Thank you very much for your help!

Get this bounty!!!

## #StackBounty: #probability #normal-distribution #z-score #odds Calculating Odds of Getting a Sample w/ a Specific Standard Deviation

### Bounty: 50

I’m trying to calculate the odds on something and was getting myself confused. I’ll try to summarize it as a simple problem with made-up numbers.

Say a cannon fires projectiles with a population mean of 100 m/s and a standard deviation of 10 m/s, represented by a normal distribution.

I wanted to calculate the odds of firing off 15 rounds in a row that would have a standard deviation between 0 m/s and 2 m/s.

I basically calculated two z-scores:

Z1 = (101-100)/10 and Z2 = (99-100)/10.

Then assumed the probability of getting one round within that range was (using table for standardized z-scores):

P = P(X < Z1) - P(X < Z2)

To fire 15 rounds within that range, then I said P_15 = P^15.

Although, I feel more like I am calculating the odds that the sample has something more like a 3+ sigma of 2 m/s, since with a 1-sigma of 2 m/s the rounds in the sample don’t all have to fall within the +/- 1 m/s range, just ~68% of them. But I really would like the sample itself to have a 1-sigma between 0 m/s and 2 m/s.

Question: what is the correct way to formulate this problem and what are the details of the calculation?
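One standard way to formulate this (a sketch of an assumed approach, not taken from the question): for a normal sample of size \$n\$, \$(n-1)s^2/\sigma^2\$ follows a chi-square distribution with \$n-1\$ degrees of freedom, so the probability that the sample standard deviation of 15 rounds is at most 2 m/s can be read off the chi-square CDF and checked by simulation.

```python
import numpy as np
from scipy import stats

n, sigma = 15, 10.0
# (n-1) * s^2 / sigma^2 ~ chi-square with n-1 degrees of freedom,
# so P(s <= 2) = P(chi2_{14} <= 14 * 2^2 / 10^2)
p_exact = stats.chi2.cdf((n - 1) * 2.0**2 / sigma**2, df=n - 1)

# Monte Carlo check: fraction of 15-round samples with sample sd <= 2 m/s
rng = np.random.default_rng(0)
shots = rng.normal(100, sigma, size=(200_000, n))
p_mc = (shots.std(axis=1, ddof=1) <= 2.0).mean()
print(p_exact, p_mc)   # both tiny: the event is extremely rare
```

Note this treats the sample standard deviation as the random quantity of interest, which matches the intent of the question better than multiplying per-round probabilities.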

Thanks.


## #StackBounty: #probability #random-variable #probability-inequalities #inequality Implications of inequalities

### Bounty: 100

For \$i=1,2,3\$, consider a random variable \$Y_i\$ taking value in
\$\$
\mathcal{Y}:=\{(1,1), (1,0), (0,1), (0,0)\}
\$\$ and a random closed set \$S_i\$ taking value in \$\mathcal{S}\$, the collection of nonempty subsets of \$\mathcal{Y}\$, i.e.
\$\$
\mathcal{S}:=\{\{(1,1)\}, \{(1,0)\}, \{(0,1)\}, \{(0,0)\}, \{(1,1), (1,0)\}, \{(1,1), (0,1)\}, \{(1,1), (0,0)\}, \{(1,0), (0,1)\}, \{(1,0), (0,0)\}, \{(0,1), (0,0)\}, \{(1,1), (1,0), (0,1)\}, \{(1,1), (1,0), (0,0)\}, \{(1,1), (0,1), (0,0)\}, \{(1,0), (0,1), (0,0)\}, \{(1,1), (1,0), (0,1), (0,0)\}\}
\$\$
\$Y_i,S_i\$ are defined on the same probability space \$(\Omega, \mathcal{F}, P)\$.

Suppose that
\$\$
P(Y_i\in K)\leq P(S_i\cap K\neq \emptyset) \text{ } \forall K \in \mathcal{S} \text{ for } i=1,2,3
\$\$

For example, for \$K=\{(1,1), (0,1)\}\$ and \$i=1\$
\$\$
P(Y_1=(1,1))+P(Y_1=(0,1))\leq\\
P(S_1=\{(1,1)\})+P(S_1=\{(0,1)\})+P(S_1=\{(1,1), (1,0)\})+P(S_1=\{(1,1), (0,1)\})+P(S_1=\{(1,1), (0,0)\})+P(S_1=\{(1,0), (0,1)\})+P(S_1=\{(0,1), (0,0)\})+P(S_1=\{(1,1), (1,0), (0,1)\})+P(S_1=\{(1,1), (1,0), (0,0)\})+P(S_1=\{(1,1), (0,1), (0,0)\})+P(S_1=\{(1,0), (0,1), (0,0)\})+P(S_1=\{(1,1), (1,0), (0,1), (0,0)\})
\$\$
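The twelve terms on the right-hand side can be checked by enumerating the members of \$\mathcal{S}\$ that intersect \$K\$ (an illustrative check only, not part of the proof):

```python
from itertools import combinations

Y = [(1, 1), (1, 0), (0, 1), (0, 0)]
# the collection S: all nonempty subsets of Y
subsets = [frozenset(c) for r in range(1, 5) for c in combinations(Y, r)]
K = {(1, 1), (0, 1)}
hits = [s for s in subsets if s & K]   # subsets with nonempty intersection with K
print(len(subsets), len(hits))   # 15 subsets in total, 12 of them intersect K
```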

I would like your help to show that
\$\$
(\star) \hspace{1cm}
P(Y_1=(1,1))\times P(Y_2=(1,1))\times P(Y_3=(1,1)) +\\ P(Y_1=(0,0))\times P(Y_2=(0,0))\times P(Y_3=(0,0))\leq\\
P(S_1\cap \{(1,1)\}\neq \emptyset \text{ and } S_2\cap \{(1,1)\}\neq \emptyset \text{ and } S_3\cap \{(1,1)\}\neq \emptyset \text{ OR }\\
S_1\cap \{(0,0)\}\neq \emptyset \text{ and } S_2\cap \{(0,0)\}\neq \emptyset \text{ and } S_3\cap \{(0,0)\}\neq \emptyset)
\$\$

My attempt

(A) I take the inequalities corresponding to \$K=\{(1,1), (0,0)\}\$ for \$i=1,2,3\$ and multiply them across \$i\$:
\$\$
[P(Y_1=(1,1))+P(Y_1=(0,0))]\times [P(Y_2=(1,1))+P(Y_2=(0,0))]\times [P(Y_3=(1,1))+P(Y_3=(0,0))]\leq\\ [P(S_1\cap \{(1,1),(0,0)\}\neq \emptyset)]\times [P(S_2\cap \{(1,1),(0,0)\}\neq \emptyset)]\times [P(S_3\cap \{(1,1),(0,0)\}\neq \emptyset)]
\$\$

(B) On the lhs the terms “in excess” with respect to \$(star)\$ are
\$\$
P(Y_1=(1,1))\times P(Y_2=(0,0))\times P(Y_3=(0,0))+\\
P(Y_1=(0,0))\times P(Y_2=(1,1))\times P(Y_3=(0,0))+\\
P(Y_1=(0,0))\times P(Y_2=(0,0))\times P(Y_3=(1,1))+\\
P(Y_1=(1,1))\times P(Y_2=(1,1))\times P(Y_3=(0,0))+\\
P(Y_1=(1,1))\times P(Y_2=(0,0))\times P(Y_3=(1,1))+\\
P(Y_1=(0,0))\times P(Y_2=(1,1))\times P(Y_3=(1,1))
\$\$

(C) On the rhs the terms “in excess” with respect to \$(star)\$ are
\$\$
P(S_1\cap \{(1,1)\}\neq \emptyset \text{ and } S_1\cap \{(0,0)\}=\emptyset)\times P(S_2\cap \{(0,0)\}\neq \emptyset \text{ and } S_2\cap \{(1,1)\}=\emptyset)\times P(S_3\cap \{(0,0)\}\neq \emptyset \text{ and } S_3\cap \{(1,1)\}=\emptyset)+\\
P(S_1\cap \{(0,0)\}\neq \emptyset \text{ and } S_1\cap \{(1,1)\}=\emptyset)\times P(S_2\cap \{(1,1)\}\neq \emptyset \text{ and } S_2\cap \{(0,0)\}=\emptyset)\times P(S_3\cap \{(0,0)\}\neq \emptyset \text{ and } S_3\cap \{(1,1)\}=\emptyset)+\\
P(S_1\cap \{(0,0)\}\neq \emptyset \text{ and } S_1\cap \{(1,1)\}=\emptyset)\times P(S_2\cap \{(0,0)\}\neq \emptyset \text{ and } S_2\cap \{(1,1)\}=\emptyset)\times P(S_3\cap \{(1,1)\}\neq \emptyset \text{ and } S_3\cap \{(0,0)\}=\emptyset)+\\
P(S_1\cap \{(1,1)\}\neq \emptyset \text{ and } S_1\cap \{(0,0)\}=\emptyset)\times P(S_2\cap \{(1,1)\}\neq \emptyset \text{ and } S_2\cap \{(0,0)\}=\emptyset)\times P(S_3\cap \{(0,0)\}\neq \emptyset \text{ and } S_3\cap \{(1,1)\}=\emptyset)+\\
P(S_1\cap \{(1,1)\}\neq \emptyset \text{ and } S_1\cap \{(0,0)\}=\emptyset)\times P(S_2\cap \{(0,0)\}\neq \emptyset \text{ and } S_2\cap \{(1,1)\}=\emptyset)\times P(S_3\cap \{(1,1)\}\neq \emptyset \text{ and } S_3\cap \{(0,0)\}=\emptyset)+\\
P(S_1\cap \{(0,0)\}\neq \emptyset \text{ and } S_1\cap \{(1,1)\}=\emptyset)\times P(S_2\cap \{(1,1)\}\neq \emptyset \text{ and } S_2\cap \{(0,0)\}=\emptyset)\times P(S_3\cap \{(1,1)\}\neq \emptyset \text{ and } S_3\cap \{(0,0)\}=\emptyset)
\$\$

(D) One strategy could be to show that
\$\$
P(Y_1=(1,1))\geq P(S_1\cap \{(1,1)\}\neq \emptyset \text{ and } S_1\cap \{(0,0)\}=\emptyset)
\$\$
and, similarly, for the other terms, so that (B) \$\geq\$ (C) and, hence, because of (A), \$(\star)\$ holds. However, I am unable to do it.

(E) What I have shown, instead, is that
\$\$
P(Y_1=(1,1))+P(Y_1=(1,0))+P(Y_1=(0,1))\geq\\ P(S_1\cap \{(1,1)\}\neq \emptyset \text{ and } S_1\cap \{(0,0)\}=\emptyset)
\$\$
and that
\$\$
P(Y_1=(1,1))\geq P(S_1=\{(1,1)\})
\$\$
which, however, do not seem to be useful.


## #HackerRank: Correlation and Regression Lines solutions

```python
import numpy as np
import scipy as sp
from scipy.stats import norm
```

### Correlation and Regression Lines – A Quick Recap #1

Here are the test scores of 10 students in physics and history:

Physics Scores 15 12 8 8 7 7 7 6 5 3

History Scores 10 25 17 11 13 17 20 13 9 15

Compute Karl Pearson’s coefficient of correlation between these scores. Compute the answer correct to three decimal places.

Output Format

In the text box, enter the floating point/decimal value required. Do not leave any leading or trailing spaces. Your answer may look like: `0.255`

This is NOT the actual answer – just the format in which you should provide your answer.

```python
physicsScores = [15, 12, 8, 8, 7, 7, 7, 6, 5, 3]
historyScores = [10, 25, 17, 11, 13, 17, 20, 13, 9, 15]
print(np.corrcoef(historyScores, physicsScores)[0][1])
# 0.144998154581
```
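As a cross-check (my own sketch, not part of the original solution), the same value follows directly from Pearson’s definition \$r=\sum_i(x_i-\bar{x})(y_i-\bar{y})/(n\,s_x s_y)\$ with population standard deviations:

```python
import numpy as np

x = np.array([15, 12, 8, 8, 7, 7, 7, 6, 5, 3], dtype=float)
y = np.array([10, 25, 17, 11, 13, 17, 20, 13, 9, 15], dtype=float)
# np.std defaults to ddof=0 (population sd), matching the n in the denominator
r = ((x - x.mean()) * (y - y.mean())).sum() / (len(x) * x.std() * y.std())
print(round(r, 3))   # 0.145
```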

### Correlation and Regression Lines – A Quick Recap #2

Here are the test scores of 10 students in physics and history:

Physics Scores 15 12 8 8 7 7 7 6 5 3

History Scores 10 25 17 11 13 17 20 13 9 15

Compute the slope of the line of regression obtained while treating Physics as the independent variable. Compute the answer correct to three decimal places.

Output Format

In the text box, enter the floating point/decimal value required. Do not leave any leading or trailing spaces. Your answer may look like: `0.255`

This is NOT the actual answer – just the format in which you should provide your answer.

```python
print(sp.stats.linregress(physicsScores, historyScores).slope)
# 0.20833333333333331
```

### Correlation and Regression Lines – A quick recap #3

Here are the test scores of 10 students in physics and history:

Physics Scores 15 12 8 8 7 7 7 6 5 3

History Scores 10 25 17 11 13 17 20 13 9 15

When a student scores 10 in Physics, what is his probable score in History? Compute the answer correct to one decimal place.

Output Format

In the text box, enter the floating point/decimal value required. Do not leave any leading or trailing spaces. Your answer may look like: `0.255`

This is NOT the actual answer – just the format in which you should provide your answer.

```python
def predict(pi, x, y):
    slope, intercept, rvalue, pvalue, stderr = sp.stats.linregress(x, y)
    return slope * pi + intercept

print(predict(10, physicsScores, historyScores))
# 15.458333333333332
```

### Correlation and Regression Lines – A Quick Recap #4

The two regression lines of a bivariate distribution are:

`4x – 5y + 33 = 0` (line of y on x)

`20x – 9y – 107 = 0` (line of x on y).

Estimate the value of `x` when `y = 7`. Compute the correct answer to one decimal place.

Output Format

In the text box, enter the floating point/decimal value required. Do not leave any leading or trailing spaces. Your answer may look like: `7.2`

This is NOT the actual answer – just the format in which you should provide your answer.

```python
'''
4x - 5y + 33 = 0   (line of y on x)
x = ( 5y - 33 ) / 4
y = ( 4x + 33 ) / 5

20x - 9y - 107 = 0   (line of x on y)
x = (9y + 107)/20
y = (20x - 107)/9
'''
# to estimate x from y, use the line of x on y
t = 7
print((9 * t + 107) / 20)
# 8.5
```

### Correlation and Regression Lines – A Quick Recap #5

The two regression lines of a bivariate distribution are:

`4x – 5y + 33 = 0` (line of y on x)

`20x – 9y – 107 = 0` (line of x on y).

Find the variance of y when σx = 3.

Compute the correct answer to one decimal place.

Output Format

In the text box, enter the floating point/decimal value required. Do not leave any leading or trailing spaces. Your answer may look like: `7.2`

This is NOT the actual answer – just the format in which you should provide your answer.

#### Q.3. If the two regression lines of a bivariate distribution are 4x – 5y + 33 = 0 and 20x – 9y – 107 = 0,

• calculate the arithmetic means of x and y respectively;
• estimate the value of x when y = 7;
• find the variance of y when σx = 3.
##### Solution : –

We have

4x – 5y + 33 = 0 => y = 4x/5 + 33/5 ————— (i)

and

20x – 9y – 107 = 0 => x = 9y/20 + 107/20 ————- (ii)

(i) Solving (i) and (ii), we get mean of x = 13 and mean of y = 17. [Ans.]

(ii) The second line is the line of x on y, so

x = (9/20) × 7 + (107/20) = 170/20 = 8.5 [Ans.]

(iii) Here byx = 4/5 and bxy = 9/20, so r = √(byx·bxy) = √{(4/5)(9/20)} = 0.6. From byx = r(σy/σx): 4/5 = 0.6 × σy/3, hence σy = (4/5)(3/0.6) = 4. [Ans.]

variance = σy² = 16
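The arithmetic above can be verified numerically (a sketch; the variable names are mine): the two regression lines intersect at the means, and the slopes byx = 4/5 and bxy = 9/20 give r and σy.

```python
import numpy as np

# intersection of 4x - 5y = -33 and 20x - 9y = 107 gives (mean x, mean y)
A = np.array([[4.0, -5.0], [20.0, -9.0]])
b = np.array([-33.0, 107.0])
mean_x, mean_y = np.linalg.solve(A, b)

b_yx, b_xy = 4 / 5, 9 / 20          # regression slopes of y on x and of x on y
r = (b_yx * b_xy) ** 0.5            # r = sqrt(b_yx * b_xy)
sigma_y = b_yx * 3 / r              # from b_yx = r * sigma_y / sigma_x, sigma_x = 3
print(mean_x, mean_y, sigma_y**2)   # means 13 and 17; variance of y = 16
```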

## #StackBounty: #r #time-series #probability #modeling The effect size of difference

### Bounty: 50

I have this interesting data where I would like to estimate a parameter of the difference (between \$A\$ and \$B\$, and between \$A\$ and \$C\$, using both for inference) that would allow me to infer the development of \$A\$ (whether there is a propensity to decrease or increase).

Any hint as to how to approach it, including the type of modeling/estimation procedure?

Here is part of the data. The data itself is a rate of the observed number of species per day.

These have been calculated in R based on this formula for \$A\$:

```r
A = obs / mean(obs.window)
```

The values of \$B\$ and \$C\$ in R are based on the formulas:

```r
B = obs / min(obs.window)
```

and

`C = obs / max(obs.window)`

where `obs` is the observed number of species per day and `obs.window` is a sliding window of \$10\$ days (so `mean(obs.window)` is a \$10\$-day moving average).

```r
x <- "A B C
1  0.63 0.67 0.61
2  0.62 0.64 0.60
3  0.64 0.65 0.59
4  0.70 0.70 0.63
5  0.71 0.73 0.68
6  0.70 0.75 0.69
7  0.71 0.75 0.70
8  0.74 0.76 0.71
9  0.79 0.81 0.74
10 0.80 0.83 0.76
11 0.82 0.84 0.78
12 0.82 0.84 0.80
13 0.83 0.85 0.81
14 0.81 0.88 0.80
15 0.78 0.84 0.77
16 0.75 0.79 0.74
17 0.73 0.77 0.72
18 0.72 0.75 0.71
19 0.73 0.75 0.71
20 0.73 0.75 0.71
21 0.74 0.76 0.72
22 0.72 0.76 0.71
23 0.71 0.74 0.69
24 0.73 0.75 0.70
25 0.78 0.79 0.71
26 0.82 0.84 0.77
27 0.80 0.84 0.78
28 0.77 0.81 0.76
29 0.79 0.81 0.75
30 0.83 0.84 0.78
31 0.86 0.87 0.82
32 0.85 0.87 0.83
33 0.83 0.84 0.82
34 0.78 0.85 0.77
35 0.74 0.80 0.72
36 0.72 0.76 0.71
37 0.74 0.77 0.70
38 0.75 0.75 0.70
39 0.78 0.81 0.72
40 0.78 0.82 0.75"
``````

```r
data <- read.table(text = x, header = TRUE)

data$diff_AC <- with(data, A - C)
data$diff_AB <- with(data, A - B)

with(data, plot(A ~ 1, col = 1))
with(data, points(B ~ 1, col = 2))
with(data, points(C ~ 1, col = 3))
```


## #StackBounty: #probability #stochastic-processes #combinatorics #random-walk Number of coins problem

### Bounty: 50

We start at time \$t=0\$ with a single coin placed at position \$i=0\$. At each later time step \$t\$, at our current position \$i\$, if we have neighbour coins on left and right of \$i\$ then we just change position randomly to either of them (so, to \$i-1\$ or \$i+1\$). If we have only one neighbour coin (so, at the end of the string of coins), then we either move to that neighbour (so, to \$i-1\$ if our single neighbour is on our left and to \$i+1\$ if it is on our right), or we add a coin and fill out our empty side. We assume the latter has a higher probability (call it \$p\$) of occurring than the moving possibility. So for example at the start, since we have only one coin, during the first time step we have no choice but to add a coin either to left or right randomly.

Can we determine how many coins we have on average at a given time step \$t\$?

I imagine the answer depends on the value of \$p\$.
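A Monte Carlo sketch of one reading of the process (assumptions mine: the coins always occupy a contiguous block, and when a coin is added on the empty side the walker steps onto it) gives an empirical answer for any \$p\$:

```python
import random

def avg_coins(p, t_max, trials=2000, seed=0):
    rng = random.Random(seed)
    totals = [0] * (t_max + 1)
    for _ in range(trials):
        lo = hi = pos = 0                      # coins occupy positions [lo, hi]
        for t in range(1, t_max + 1):
            if lo < pos < hi:                  # interior: simple random walk
                pos += rng.choice((-1, 1))
            else:                              # at an end of the block
                left_end = rng.random() < 0.5 if lo == hi else pos == lo
                if lo == hi or rng.random() < p:   # must (or chooses to) add a coin
                    if left_end:
                        lo -= 1
                        pos = lo
                    else:
                        hi += 1
                        pos = hi
                else:                          # otherwise move to the single neighbour
                    pos += 1 if left_end else -1
            totals[t] += hi - lo + 1
    return [s / trials for s in totals]

print(avg_coins(p=0.7, t_max=20)[-1])   # average number of coins at t = 20
```

Varying `p` and `t_max` in this sketch gives an empirical growth curve to compare against any analytic guess.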



## #StackBounty: #probability #mathematical-statistics #references #interpretation #locality-sensitive-hash Reference / resource request f…

### Bounty: 50

The hash table is defined by the function family \$G = \{g: S \rightarrow U^k\}\$ such that \$g(p) = (h_1(p),\ldots,h_k(p))\$, where \$h_i \in H\$. The query point \$q\$ is hashed into all the hash tables \$g_1(q),\ldots,g_l(q)\$. The candidate set \$\{p_1,p_2,\ldots,p_m\}\$ consists of the points, across all the hash tables, that are hashed into the same bucket as the query point \$q\$.

The properties of LSH are:

1) \$g_j(p') \neq g_j(q)\$,

2) if \$p^* \in B(q,r)\$ then \$g_j(p^*) = g_j(q)\$.

How can I prove these two properties, and where can I find a simple, easy-to-understand proof of them? I cannot understand how to proceed with the proof of the two properties. Any study material / tutorial where I can find the proof would really help. Please help.
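While not the paper’s exact construction, a small random-hyperplane LSH experiment (all names and parameters here are illustrative) makes the two properties concrete: concatenating \$k\$ hashes makes far points collide rarely (property 1), while near points still collide with high probability (property 2):

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, trials = 50, 8, 500
q = rng.normal(size=d)

def collision_rate(x):
    # fraction of freshly drawn g = (h_1, ..., h_k) under which x and q
    # collide, with h_i(v) = sign(<plane_i, v>) (random-hyperplane hashing)
    hits = 0
    for _ in range(trials):
        planes = rng.normal(size=(k, d))
        hits += np.array_equal(planes @ x > 0, planes @ q > 0)
    return hits / trials

near_rate = collision_rate(q + 0.1 * rng.normal(size=d))   # point close to q
far_rate = collision_rate(rng.normal(size=d))              # unrelated point
print(near_rate, far_rate)   # near points collide far more often than far ones
```

The proofs in the literature formalize exactly this gap: per-hyperplane collision probability is \$1-\theta/\pi\$ for angle \$\theta\$, and concatenating \$k\$ hashes raises it to the \$k\$-th power.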


## #StackBounty: #probability #combinatorics Schuette–Nesbitt formula

### Bounty: 100

I was reading the article about the Schuette–Nesbitt formula, which is described as “a generalization of the inclusion–exclusion principle” and has both combinatorial and probabilistic versions. Another website gave a proof for dependent events (pdf download), and I found a third that compares it to Waring’s Theorem (pdf).

However, I am still confused. I tried to find a clearly worked-out example using discrete probabilities (for simplicity), where the steps are clear from one line to the next, to help in overall understanding of the formula.

Is there a good reference, or an answer that can give a short worked-out example?
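Not a reference, but here is a small self-contained check of the probabilistic identity \$P(N=m)=\sum_{k\ge m}(-1)^{k-m}\binom{k}{m}S_k\$ (the core of the Schuette–Nesbitt formula), using three deliberately dependent events on two coin flips (an example made up for concreteness):

```python
from fractions import Fraction
from itertools import combinations
from math import comb

# sample space: two fair coin flips, each outcome with probability 1/4
omega = [(a, b) for a in (0, 1) for b in (0, 1)]
P = {w: Fraction(1, 4) for w in omega}
events = [
    {w for w in omega if w[0] == 1},      # A1: first flip is heads
    {w for w in omega if w[1] == 1},      # A2: second flip is heads
    {w for w in omega if w[0] == w[1]},   # A3: flips agree (dependent on A1, A2)
]
n = len(events)

# S_k = sum over all k-subsets of events of P(intersection)
S = [Fraction(1)] + [Fraction(0)] * n
for k in range(1, n + 1):
    for idx in combinations(range(n), k):
        inter = set.intersection(*(events[i] for i in idx))
        S[k] += sum(P[w] for w in inter)

# N = number of events that occur; compare P(N = m) computed directly
# with the Schuette-Nesbitt / Waring expression in the S_k
for m in range(n + 1):
    direct = sum(P[w] for w in omega if sum(w in A for A in events) == m)
    formula = sum((-1) ** (k - m) * comb(k, m) * S[k] for k in range(m, n + 1))
    assert direct == formula
print("identity verified for m = 0, 1, 2, 3")
```

Because the arithmetic is exact (fractions), every step of the identity can be traced by hand from the printed `S_k` values if desired.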


## #StackBounty: #probability #distributions #stochastic-processes PDF of sum of multinomial and Gaussian distribution

### Bounty: 50

I have a model where the
signal \$y_n \in \mathcal{R}\$ (signals in the real domain) can be expressed as
\begin{align}
y_n &= s_n * h_n + v_n = \sum_{k=0}^{L-1}h_k s_{n-k} + v_n,
\end{align}
where \$*\$ is the convolution operator and \$v_n\$ is a zero-mean AWGN. In an earlier question (http://dsp.stackexchange.com/questions/37698/help-in-proper-notations-and-mathematical-formulation),

the input information source \$s_n\$ is an independent multinomial process with the probability parameter \$p \in (0,1)\$. Let there be \$m\$ distinct symbols \$a_1, a_2, \ldots, a_m\$ in the sequence, with probabilities of occurrence \$p_1,\ldots,p_m\$, respectively.
Rewriting,
\begin{align}
y_n &= \mathbf{h}^T\mathbf{s}_n + v_n
\end{align}

The unknowns are the channel coefficients, the input, and the noise variance. So the parameter vector of unknowns is \$\mathbf{\theta} = [\mathbf{h},\mathbf{s},p_1,\ldots,p_m,\sigma^2_v]^T\$.

Since the input is also unknown, the Fisher information must include the input as well. But I don’t know how to write the log-likelihood expression so that the Fisher information matrix includes a term for the unknown input. This is what I have tried, but I don’t know if I am doing it correctly.

The conditional probability density function of \$\mathbf{y}\$ can be written as:
\begin{align}
P(\mathbf{y}|\mathbf{\theta}) &= \prod_{n=1}^{N}P(y_n|\mathbf{s}_n) \nonumber\\
&= (2 \pi \sigma^2_v)^{-N/2} \exp \left(-\frac{\sum_{n=1}^N {(y_n-\mathbf{h}^T \mathbf{s}_n)}^2}{2\sigma_v^2} \right)
\end{align}
The log-likelihood, i.e. the logarithm of the joint conditional PDF, is:
\begin{align}
F &= -\frac{N}{2} \ln(2 \pi \sigma^2_v) - \frac{1}{2\sigma^2_v} \sum_{n=1}^N {(y_n - {\mathbf{h}}^T \mathbf{s}_n)}^2
\end{align}
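As a numerical sanity check of the expression above (a sketch that treats \$\mathbf{s}_n\$ as given; the data and dimensions are made up), \$F\$ can be evaluated and compared against a sum of Gaussian log-densities:

```python
import numpy as np
from scipy.stats import norm

def loglik(y, h, S, sigma2):
    # F = -(N/2) log(2 pi sigma^2) - sum_n (y_n - h^T s_n)^2 / (2 sigma^2);
    # row n of S holds the input vector s_n
    resid = y - S @ h
    return -0.5 * len(y) * np.log(2 * np.pi * sigma2) - resid @ resid / (2 * sigma2)

# toy data with binary symbols (an illustrative choice)
rng = np.random.default_rng(0)
h = np.array([1.0, 0.5, -0.3])
S = rng.choice([-1.0, 1.0], size=(200, 3))
sigma2 = 0.25
y = S @ h + rng.normal(0.0, np.sqrt(sigma2), size=200)

# identical to summing the N(h^T s_n, sigma_v^2) log-density at each y_n
print(np.isclose(loglik(y, h, S, sigma2),
                 norm.logpdf(y, loc=S @ h, scale=np.sqrt(sigma2)).sum()))
# True
```

The open question of how to also differentiate with respect to the unknown \$\mathbf{s}\$ and the symbol probabilities \$p_1,\ldots,p_m\$ remains; this only checks the Gaussian part.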
