#StackBounty: #assumptions #joint-distribution #marginal Is it possible to derive joint probabilities from marginals with assumptions a…

Bounty: 100

I understand the title is too generic. I looked for similar questions, and although a few seemed to address the same issue, they either answered in the negative, had no convincing answers, or suggested the use of copulas.

I have no working knowledge of copulas. If they are in fact the answer to my problem, I will have to invest some time in getting acquainted with them, but before I do, I would like to know whether that investment is worthwhile in the first place. Hence this question.

I have a population of individuals with a certain number of characteristics, e.g. unemployed persons over some period of time. I know how many of them are located in a certain district (characteristic #1), and I know how many of them have achieved a certain education level, e.g. MSc or an equivalent level (characteristic #2), but I don’t have data on location and education for the same individual.

Given that the available info is something like the following table (for simplicity I don’t include all the relevant characteristics, just 'location' (rows) and 'education' (columns)):

                  | "MSc or higher"   "other edu"  |   sum
      "Region A"  |       x               a        |   n_A        (unemployed in Region A)
      "rest regs" |       y               b        |   n_U - n_A  (unemployed in other regions)
      sum         |     n_MSc         n_U - n_MSc  |   n_U        (unemployed persons)
                  |  (unemployed      (unemployed  |
                  |   with MSc)        with other  |
                  |                    education)  |

  1. Is it warranted to claim that, e.g., $\frac{n_{MSc}}{n_{U}}$ is a measure of the risk of unemployment faced by a person with an education level equivalent to or better than an MSc degree? Similarly, is $\frac{n_{A}}{n_{U}}$ a measure of the risk of unemployment for a person situated in Region A?
  2. If the table above is reinterpreted as representing the unemployment risk associated with each cell (i.e. if we divide the rightmost column and bottom row by $n_U$ to obtain marginal probabilities for the corresponding rows/columns, and replace $x, y, a$ and $b$ with $p_x, p_y, p_a$ and $p_b$, the respective unknown joint probabilities), is there a way to retrieve those joint probabilities using only the information contained in the table above?
  3. Are there plausible assumptions or restrictions (e.g. some assumed relation between conditional frequencies) that would make it possible to calculate the joint probabilities within reasonable bounds? The purpose is to have a rough estimate of what the actual joint frequencies of the characteristics would be if more refined data sources (e.g. sources detailing those joint frequencies) were consulted.
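
To make questions 2 and 3 concrete, here is a minimal Python sketch of what the margins do and do not pin down in the 2x2 table above. The function names and the counts in the demo are hypothetical and do not come from any real data. The sketch relies on three standard facts: once $x$ is known, the margins fix $a$, $y$ and $b$; the margins alone only bound $x$ (the Fréchet bounds); and an extra assumption, such as independence or an assumed odds ratio $\frac{x\,b}{a\,y}$, selects a single table. The odds-ratio restriction is just one possible assumption, chosen here for illustration.

```python
import math


def frechet_bounds(n_A, n_MSc, n_U):
    """Smallest and largest x that are compatible with the margins."""
    lower = max(0, n_A + n_MSc - n_U)
    upper = min(n_A, n_MSc)
    return lower, upper


def fill_table(x, n_A, n_MSc, n_U):
    """Given x, the margins determine the remaining cells a, y and b."""
    a = n_A - x                  # Region A, other education
    y = n_MSc - x                # rest of regions, MSc or higher
    b = n_U - n_A - n_MSc + x    # rest of regions, other education
    return x, a, y, b


def independence_estimate(n_A, n_MSc, n_U):
    """x under the ASSUMPTION that location and education are independent."""
    return n_A * n_MSc / n_U


def x_from_odds_ratio(n_A, n_MSc, n_U, odds_ratio):
    """x under the ASSUMPTION that (x * b) / (a * y) equals odds_ratio.

    Substituting a, y, b from fill_table gives a quadratic in x:
    (t - 1) x^2 - (t (n_A + n_MSc) + n_U - n_A - n_MSc) x + t n_A n_MSc = 0
    """
    t = odds_ratio
    if abs(t - 1.0) < 1e-12:
        # An odds ratio of 1 is the independence case.
        return independence_estimate(n_A, n_MSc, n_U)
    B = t * (n_A + n_MSc) + n_U - n_A - n_MSc
    disc = B * B - 4.0 * (t - 1.0) * t * n_A * n_MSc
    # The root with the minus sign is the one inside the Frechet bounds.
    return (B - math.sqrt(disc)) / (2.0 * (t - 1.0))


# Hypothetical counts, for illustration only.
n_U, n_A, n_MSc = 1000, 200, 100

print(frechet_bounds(n_A, n_MSc, n_U))         # (0, 100)
print(independence_estimate(n_A, n_MSc, n_U))  # 20.0

x = x_from_odds_ratio(n_A, n_MSc, n_U, odds_ratio=2.0)
print(fill_table(x, n_A, n_MSc, n_U))
```

Dividing every cell by $n_U$ turns the counts into the joint proportions $p_x, p_a, p_y, p_b$. Without some extra assumption of this kind, every value of $x$ between the two bounds is equally consistent with the margins.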

(I apologize for the crude table layout, but I was unable to use LaTeX properly.)
