I would like to find out which variables (from a set of 30 binary variables) have the most impact on an ordinal satisfaction measurement (it can reach from 1 – not happy at all to 4 – absolutely happy).
Unfortunately most of the binary independent variables are (highly) correlated.
There are about 20 different shops to sell the product and I also want to check if different customer-types have different drivers.
My dataset looks like this (with D1 to D30 being the dichotomous independent Variables):
I wanted to use a hierarchical regression, but I think it will not be appropriate for the ordinal dependent variable. Another problem might be the high correlation between the binary independent variables.
So now I read about random forest classification, but I am not sure if this is the right way to go?
Do you have any suggestions about a proper method for my problem?
And more generally, are there any methods to deal with high correlation in binary predictors?