#StackBounty: #machine-learning #k-nearest-neighbour How to overcome the computational cost of the KNN algorithm?

Bounty: 50

How can I overcome the computational cost of the KNN algorithm when the dataset is large?

Bear in mind that I am using my own function to implement the KNN algorithm in R.

Is there anything I can do to speed it up?

I have tried a dataset with 6,000 instances and it takes a long time.

Note

wdbc_n is my normalized data set.

Lbl is the vector of class labels for the data set.

0.1 is the p of the Lp norm, as defined in my lpnorm function, which I do not include here.

fx1NN <- function(TrainS, TestS, TrainLabel, p) {
  n <- dim(TrainS)[1]
  Dist <- matrix(0, nrow = n, ncol = 1)
  for (i in 1:n) {
    # Lp distance between the i-th training row and the test point
    Dist[i] <- lpnorm(as.matrix(TrainS[i, ] - TestS), p)
  }
  Dist2 <- which.min(Dist)
  predL <- TrainLabel[Dist2]   # label of the nearest neighbour
  return(predL)
}
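
For each test point, fx1NN loops over every training row in interpreted R, and that loop is what dominates the running time. Below is a minimal sketch of a vectorized alternative, assuming lpnorm(x, p) computes (sum(|x_i|^p))^(1/p); fx1NN_vec is a hypothetical name for illustration, not part of my original code.

# Hypothetical vectorized variant of fx1NN: computes all Lp distances in one
# pass over the training matrix instead of looping row by row
# (assumes the same Lp definition as lpnorm).
fx1NN_vec <- function(TrainS, TestS, TrainLabel, p) {
  Diff <- sweep(as.matrix(TrainS), 2, as.numeric(as.matrix(TestS)))  # subtract the test point from every row
  Dist <- rowSums(abs(Diff)^p)^(1/p)                                 # Lp distance of every training row
  TrainLabel[which.min(Dist)]                                        # label of the nearest neighbour
}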


# requires the caret package for confusionMatrix()
library(caret)

LOOCV1NN <- function(Data, Lbl, p) {
  N <- dim(Data)[1]
  predLOOCV <- matrix(0, nrow = N, ncol = 1)
  for (i in 1:N) {
    LOOCV_train <- Data[-i, ]   # leave observation i out
    LOOCV_test  <- Data[i, ]
    TLabel      <- Lbl[-i]
    predLOOCV[i] <- fx1NN(LOOCV_train, LOOCV_test, TLabel, p)
  }

  confM <- confusionMatrix(factor(predLOOCV), factor(Lbl))
  R <- as.matrix(confM)
  Recall    <- R[1, 1] / sum(R[, 1])
  Precision <- R[1, 1] / sum(R[1, ])
  F1score   <- 2 / ((1 / Recall) + (1 / Precision))
  CorClass  <- sum(predLOOCV == Lbl)   # number of correctly classified observations
  LL <- list(confM, CorClass, F1score)
  return(LL)
}
LOOCV1NN(wdbc_n,Lbl,0.1)
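
If Euclidean distance (p = 2) were acceptable, one option is to drop the hand-rolled loops entirely and use a compiled implementation; for example, class::knn.cv performs leave-one-out 1-NN classification directly. The sketch below assumes that substitution and does not cover fractional exponents such as p = 0.1, which standard packages do not support.

# Sketch: leave-one-out 1-NN with Euclidean distance via the 'class' package
# (assumes wdbc_n is numeric and Lbl is a vector/factor of class labels).
library(class)
library(caret)

pred_loocv <- knn.cv(train = wdbc_n, cl = Lbl, k = 1)  # compiled LOOCV 1-NN
confusionMatrix(pred_loocv, factor(Lbl))               # same confusion-matrix summary as above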

