I recently appeared for a college interview and was asked the following question. The interviewer said it was a data science question. He asked a friend of mine the same question as well.
Suppose 7.5% of the population has a certain bone disease. During the COVID-19 pandemic, you go to a hospital and look at the records: 25% of the COVID-infected patients also had the bone disease. Can we say for sure whether the bone disease is a symptom of COVID-19?
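To make the two figures concrete, here is a minimal sketch of the arithmetic. The variable names are illustrative and not part of the interview question; only the 7.5% and 25% come from it.

```python
# Figures quoted in the question.
prevalence_population = 0.075   # P(bone disease) in the general population
prevalence_infected = 0.25      # P(bone disease | COVID) in the hospital records

# How over-represented the bone disease is among infected patients.
over_representation = prevalence_infected / prevalence_population
print(f"Over-representation factor: {over_representation:.2f}")
```

The factor comes out at about 3.3. By Bayes' rule the same factor equals P(COVID | bone disease) / P(COVID), so the ratio is symmetric: on its own it says nothing about which condition came first.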
I said no, and explained that COVID-19 is not necessarily causing the bone disease: it could well be that the 7.5% of the country’s population who already had the disease are more susceptible to the virus due to lowered immunity. Hence we cannot draw a conclusion from these numbers alone.
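The susceptibility explanation can be checked with a small calculation. This is only a sketch under a simplifying assumption that is not in the interview question: everyone with the bone disease had it before the pandemic, and their infection risk is some multiple (here called relative_risk, a hypothetical name) of everyone else's.

```python
def share_with_disease_among_infected(prevalence, relative_risk):
    """Share of infected people who have the disease, if the disease only
    raises infection risk by `relative_risk` and is never caused by infection."""
    weighted = prevalence * relative_risk
    return weighted / (weighted + (1 - prevalence))


def relative_risk_needed(prevalence, share_among_infected):
    """Relative risk of infection that would reproduce the observed share."""
    return (share_among_infected * (1 - prevalence)) / (
        prevalence * (1 - share_among_infected)
    )


# Figures from the question: 7.5% in the population, 25% among the infected.
rr = relative_risk_needed(0.075, 0.25)
print(f"Relative risk needed: {rr:.2f}")
print(f"Resulting share: {share_with_disease_among_infected(0.075, rr):.2f}")
```

Under this assumption, people with the bone disease being roughly four times as likely to get infected would be enough to produce the 25% figure, without COVID-19 causing a single case of the disease.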
Then the interviewer asked me, "How can we be sure whether it is a symptom or not?"
I replied that we could go to more hospitals, collect more data, and see whether the same correlation shows up everywhere.
The interviewer then asked, "If we get the same results everywhere, will you conclude it’s a symptom?"
I had no good answer, but I replied that correlation alone is not sufficient: we also need to check whether the people who have COVID-19 had the bone disease before getting infected or developed it afterwards, and see whether that percentage is elevated as well.
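A rough sketch of that before-or-after check might look like the following. The records are made-up toy values purely for illustration, not real hospital data, and the encoding (days between infection and bone-disease diagnosis) is an assumption of mine.

```python
# Made-up toy records for ten infected patients (NOT real data).
# Each value is the bone-disease diagnosis day relative to infection:
# negative = diagnosed before infection, positive = after, None = never.
records = [-400, None, 30, None, -20, None, None, 12, None, None]

pre_existing = sum(1 for d in records if d is not None and d < 0)
new_onset = sum(1 for d in records if d is not None and d >= 0)

# Only patients without the disease before infection can develop it afterwards.
at_risk = len(records) - pre_existing

share_pre_existing = pre_existing / len(records)
new_onset_rate = new_onset / at_risk

print(f"Had the disease before infection: {share_pre_existing:.0%}")
print(f"Developed it after infection (among those at risk): {new_onset_rate:.0%}")
```

A high pre-existing share would point towards susceptibility, while a new-onset rate well above what is seen in comparable uninfected people over the same period would point towards the disease being a consequence of the infection.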
He stopped questioning me at that point; however, I couldn’t tell whether I was right or wrong.
I am in Grade 12, so I have no real experience in data science. I do know a fair bit of statistics, but I have never solved questions like this. Can someone give me some insight into how to approach such questions and draw meaningful conclusions?
I have asked the same question on Data Science SE; however, I noticed that the other questions there were quite different, so I wasn’t sure whether this question is appropriate there. On Maths SE I was told it is appropriate for Stats SE as well, so I’m posting it here too.