At work, we use a nearest neighbor algorithm to classify records. Part of this process is deciding which features to use as auxiliary information in the algorithm. We also allow the selected features to be weighted, so that more important features contribute more to the distance calculation.
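For concreteness, here is a minimal sketch of the kind of weighted nearest neighbor classification I mean. It assumes a weighted Euclidean distance and a simple majority vote; the function names, the choice of `k`, and the toy data are illustrative only and are not our actual implementation.

```python
import math
from collections import Counter

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance: sqrt(sum_j w_j * (a_j - b_j)^2)."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def weighted_knn_predict(X_train, y_train, x_new, weights, k=3):
    """Majority vote among the k training records closest to x_new."""
    # Sort training indices by weighted distance to the new record.
    order = sorted(range(len(X_train)),
                   key=lambda i: weighted_distance(X_train[i], x_new, weights))
    votes = Counter(y_train[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Toy data (made up): feature 0 separates the classes, feature 1 does not.
X_train = [[0, 0], [0, 10], [1, 0], [1, 10]]
y_train = ["a", "a", "b", "b"]
x_new = [0.9, 9]

print(weighted_knn_predict(X_train, y_train, x_new, weights=[1, 0], k=1))  # b
print(weighted_knn_predict(X_train, y_train, x_new, weights=[0, 1], k=1))  # a
```

The same record is assigned to a different class depending only on the weights, which is why the choice of weights matters so much to us.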
When it comes to selecting features and deciding how to weight them, my colleagues' first thought is often to regress the variable of interest on the auxiliary information, and then use the coefficients and/or the p-values from that regression to decide which features to include in the nearest neighbor algorithm and what weights to give them.
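A rough sketch of one way this could be done is below. It is only my assumption of what such a procedure might look like: it fits ordinary least squares on standardized features with NumPy and turns the absolute coefficients into weights that sum to 1. The function name `regression_weights` and the toy data are hypothetical, and a p-value based variant would need a statistics package on top of this.

```python
import numpy as np

def regression_weights(X, y):
    """Fit OLS on standardized features; return |coefficients| scaled to sum to 1."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Standardize so coefficient magnitudes are comparable across features.
    # (Assumes no feature is constant, otherwise the std would be zero.)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    A = np.column_stack([np.ones(len(Z)), Z])  # prepend an intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    importance = np.abs(beta[1:])  # drop the intercept
    return importance / importance.sum()

# Toy data (made up): y depends linearly on feature 0 only.
X = [[0, 1], [1, 0], [2, 1], [3, 0], [4, 1], [5, 0]]
y = [0, 2, 4, 6, 8, 10]

weights = regression_weights(X, y)
print(weights)  # nearly all of the weight goes to feature 0
```

These weights would then be passed straight into the distance calculation of the nearest neighbor algorithm.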
My initial thought is that this is probably not the best way to go about it, but I can’t come up with any concrete reasons against it, other than that features that work well in a regression context may not work well in a nearest neighbor context.
Does anyone have any thoughts on the validity of this method? Is it an appropriate feature selection method or am I correct in thinking that it is not the best way?