#StackBounty: #variance #ancova Find variables most responsible for variance between groups

Bounty: 100

I have a data set with continuous features $x_1, x_2, \dots, x_n$ and a continuous response $y$, which is some complicated, unknown function of the $x_i$. Each data point also has a discrete label (category). I want to quantify which variables $x_i$ are most responsible for the variance of $y$ between the groups.

Below is a simple example. The blue and red dots are in different categories. Clearly most of the variation in $y$ between the two categories is due to $x_2$.

[Figure: plot of the example data, with blue and red dots marking the two categories]
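
To make the example concrete, here is a minimal sketch that simulates a data set of this kind and computes one naive baseline. Everything in it is an assumption for illustration and not part of the question: the sample size, the shift of 3 in $x_2$, the linear form of $y$, and the coefficients are all made up. The baseline fits a linear model and attributes the gap in mean $y$ between the two categories to each feature as (fitted coefficient) times (difference in that feature's group means). For two groups, the between-group variance of $y$ is driven by this gap in means, but the attribution is only meaningful to the extent that $y$ is roughly linear in the $x_i$, which the real problem does not guarantee.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500  # points per category (arbitrary choice)

# Hypothetical toy data: the two categories differ in x2 but not in x1.
labels = np.repeat([0, 1], n)
x1 = rng.normal(0.0, 1.0, size=2 * n)
x2 = rng.normal(0.0, 1.0, size=2 * n) + 3.0 * labels  # shift only in x2
X = np.column_stack([x1, x2])

# Assumed linear response; in the real problem y is an unknown function.
y = 1.0 * x1 + 2.0 * x2 + rng.normal(0.0, 0.5, size=2 * n)

# Fit y ~ intercept + x1 + x2 by ordinary least squares.
design = np.column_stack([np.ones(len(y)), X])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
beta = coef[1:]  # drop the intercept

# Contribution of each feature to the gap in mean y between categories:
# beta_i * (mean of x_i in category 1 - mean of x_i in category 0).
mean_gap_x = X[labels == 1].mean(axis=0) - X[labels == 0].mean(axis=0)
contributions = beta * mean_gap_x
total_gap = y[labels == 1].mean() - y[labels == 0].mean()
shares = contributions / total_gap

for name, c, s in zip(["x1", "x2"], contributions, shares):
    print(f"{name}: contribution {c:+.3f}, share of gap {s:.1%}")
```

In this simulated case nearly all of the gap is attributed to $x_2$, matching what the plot suggests by eye. The shares need not sum exactly to one, because the fitted model has no category term and the residual means can differ slightly between groups.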

Are there any statistical methods that I can use for this?

