#StackBounty: #python #machine-learning #scikit-learn #classification #data-science Mapping – Feature Importance vs Label classification

Bounty: 50

I have the dataset below, and I am using Python (sklearn) to find the top 3 features affecting Food_Taste (the label):

Id,Cook_Temp_C,Cook_Time_Min,Ingredients_Count,Salt_Level_g,Meat_Freshness,Food_Taste(Bad:0,OK:1,Good:2)
0,40,15,5,3,0,1
1,28,5,7,3,1,2
2,43,15,4,2,0,0
3,48,20,5,3,1,0
4,22,7,8,3,1,2
5,25,8,6,3,1,2
6,34,13,6,1,1,0
7,30,8,8,1,1,2
8,11,11,5,2,0,1
9,15,16,6,1,1,0

After fitting a classifier, e.g. RandomForestClassifier(), the feature importances rank Cook_Temp_C, Cook_Time_Min and Meat_Freshness as the 3 most important features.
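
For reference, the fitting step could look roughly like the sketch below. It assumes the label column is renamed to Food_Taste (the original header contains commas, which would break CSV parsing), and the n_estimators and random_state values are arbitrary choices of mine. With only ten rows the ranking may change from seed to seed, so it should be treated as indicative only.

```python
import io

import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# The ten rows from the question; the label column is renamed to Food_Taste (assumption).
CSV = """Id,Cook_Temp_C,Cook_Time_Min,Ingredients_Count,Salt_Level_g,Meat_Freshness,Food_Taste
0,40,15,5,3,0,1
1,28,5,7,3,1,2
2,43,15,4,2,0,0
3,48,20,5,3,1,0
4,22,7,8,3,1,2
5,25,8,6,3,1,2
6,34,13,6,1,1,0
7,30,8,8,1,1,2
8,11,11,5,2,0,1
9,15,16,6,1,1,0
"""

df = pd.read_csv(io.StringIO(CSV))
features = ["Cook_Temp_C", "Cook_Time_Min", "Ingredients_Count",
            "Salt_Level_g", "Meat_Freshness"]
X = df[features]
y = df["Food_Taste"]

# Fixed seed so repeated runs give the same forest (value chosen arbitrarily).
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X, y)

# Impurity-based importances, one per feature, sorted from most to least important.
importances = pd.Series(model.feature_importances_, index=features)
importances = importances.sort_values(ascending=False)
print(importances)

top3 = importances.head(3).index.tolist()
print("Top 3 features:", top3)
```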

Now I am trying to answer the following research question:

Which value ranges of Cook_Temp_C, Cook_Time_Min and Meat_Freshness contribute, in a statistically meaningful way, to a good Food_Taste (Good:2)?

Possible Expected Result:

Cook_Temp_C = [22,25,28,30]
Cook_Time_Min = [5,7,8,8]
Meat_Freshness = [1]

The above result basically says that if a person wants a good meal, they need to cook fresh meat (Meat_Freshness = 1), in any way they like, for 5-8 minutes at 22-30ºC.
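
The expected result above is essentially a per-class summary, so a minimal sketch of how it could be produced with pandas is shown here. The column name Food_Taste and the variable names are my own assumptions; only the three features and the label are loaded to keep the example short.

```python
import io

import pandas as pd

# Same ten rows as in the question, restricted to the three features and the label.
CSV = """Id,Cook_Temp_C,Cook_Time_Min,Meat_Freshness,Food_Taste
0,40,15,0,1
1,28,5,1,2
2,43,15,0,0
3,48,20,1,0
4,22,7,1,2
5,25,8,1,2
6,34,13,1,0
7,30,8,1,2
8,11,11,0,1
9,15,16,1,0
"""

df = pd.read_csv(io.StringIO(CSV))
top_features = ["Cook_Temp_C", "Cook_Time_Min", "Meat_Freshness"]

# Keep only the rows labelled Good (2) and take the observed min and max per feature.
good = df[df["Food_Taste"] == 2]
good_ranges = good[top_features].agg(["min", "max"])
print(good_ranges)

# The same summary for every class, to see where the classes overlap.
all_ranges = df.groupby("Food_Taste")[top_features].agg(["min", "max"])
print(all_ranges)
```

For the Good rows this gives 22-30 for Cook_Temp_C, 5-8 for Cook_Time_Min and 1 for Meat_Freshness. Note that these are only the observed ranges; they say nothing yet about statistical significance.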

Question

Could you guide me on how to approach this research question?
Is there a library, in sklearn or otherwise, that can extract this information? Any additional information, such as confidence intervals or outliers, is a bonus.
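
One direction I have considered, as a rough sketch only: a shallow DecisionTreeClassifier exposes its split thresholds through sklearn.tree.export_text, and each path to a "class: 2" leaf reads as a set of value ranges. The depth limit of 2 and the seed are my own assumptions, and with ten rows the thresholds are illustrative rather than statistically reliable.

```python
import io

import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Same ten rows as in the question, restricted to the three features and the label.
CSV = """Id,Cook_Temp_C,Cook_Time_Min,Meat_Freshness,Food_Taste
0,40,15,0,1
1,28,5,1,2
2,43,15,0,0
3,48,20,1,0
4,22,7,1,2
5,25,8,1,2
6,34,13,1,0
7,30,8,1,2
8,11,11,0,1
9,15,16,1,0
"""

df = pd.read_csv(io.StringIO(CSV))
top_features = ["Cook_Temp_C", "Cook_Time_Min", "Meat_Freshness"]

# A shallow tree keeps the rules short enough to read (max_depth=2 is an arbitrary choice).
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(df[top_features], df["Food_Taste"])

# Text dump of the learned splits; each leaf line ends with the predicted class.
rules = export_text(tree, feature_names=top_features)
print(rules)
```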

