*Bounty: 50*

I would like to find out which variables (from a set of 30 binary variables) have the most impact on an ordinal satisfaction measurement, which ranges from 1 (not happy at all) to 4 (absolutely happy).

Unfortunately, most of the binary independent variables are (highly) correlated with each other.
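To make "highly correlated" concrete, here is a minimal sketch of how the pairwise association between binary variables could be quantified with the phi coefficient (which equals the Pearson correlation when both variables are coded 0/1). The helper names `phi_coefficient` and `highly_correlated_pairs`, the 0.7 threshold, and the toy values are all assumptions for illustration and are not taken from my real data.

```python
import math
from itertools import combinations


def phi_coefficient(x, y):
    """Phi coefficient between two equal-length sequences of 0/1 values."""
    # Cell counts of the 2x2 contingency table.
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    n10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    n00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        # One of the variables is constant, so phi is undefined.
        return float("nan")
    return (n11 * n00 - n10 * n01) / denom


def highly_correlated_pairs(columns, threshold=0.7):
    """Return (name_a, name_b, phi) for every pair with |phi| >= threshold."""
    pairs = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        phi = phi_coefficient(a, b)
        if not math.isnan(phi) and abs(phi) >= threshold:
            pairs.append((name_a, name_b, phi))
    return pairs


# Made-up toy columns standing in for three of the D1..D30 variables.
toy = {
    "D1": [1, 1, 0, 0, 1, 0, 1, 0],
    "D2": [1, 1, 0, 0, 1, 0, 0, 0],
    "D3": [0, 1, 0, 1, 0, 1, 0, 1],
}

for name_a, name_b, phi in highly_correlated_pairs(toy):
    print(f"{name_a} vs {name_b}: phi = {phi:.3f}")
```

With 30 predictors this yields 435 pairs, so a threshold like the one above only flags candidates; it does not by itself say which variable of a pair to keep.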

The product is sold in about 20 different shops, and I also want to check whether different customer types have different drivers.

My dataset looks like this (with D1 to D30 being the dichotomous independent variables):

I wanted to use a hierarchical regression, but I do not think it is appropriate for an ordinal dependent variable. Another problem might be the high correlation between the binary independent variables.

I have since read about random forest classification, but I am not sure whether this is the right way to go.

Do you have any suggestions for a proper method for my problem?

And more generally, are there any methods to deal with high correlation in binary predictors?