Abstract: The k-Nearest Neighbour (kNN) classifier is a widely used non-parametric machine learning model. It can model complex data distributions and generalize well. The kNN algorithm coherently groups data into subsets and labels a test instance according to its most similar (nearest) training instances. Accurate classification therefore depends on selecting the nearest neighbours optimally. Our work implements the kNN algorithm with a similarity measure that identifies the optimal nearest neighbours of each test instance and assigns it the majority class label among those neighbours. The proposed similarity measure takes the data distribution into account, which helps in selecting the optimal nearest neighbours. The effectiveness of the proposed work is evaluated on several datasets against classifiers such as J48 and Naive Bayes (NB). In comparison with other learning techniques such as Multilayer Perceptron (MLP) and the ensemble method Random Forest (RF), the proposed method achieves higher classification accuracy.
Keywords: similarity measure, nearest neighbour, k-fold cross-validation, classification, accuracy.
DOI: 10.17148/IJARCCE.2022.11706
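
The classification rule summarized in the abstract (rank training instances by similarity to the test instance, keep the k most similar, and take the majority class label) can be illustrated with a minimal sketch. The abstract does not define the proposed similarity measure, so the sketch assumes negative Euclidean distance as a stand-in; the names `euclidean_similarity` and `knn_predict` are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def euclidean_similarity(a, b):
    """Stand-in similarity (an assumption, not the paper's measure):
    negative Euclidean distance, so a higher value means more similar."""
    return -math.dist(a, b)


def knn_predict(train_X, train_y, query, k=3, similarity=euclidean_similarity):
    """Label `query` with the majority class among its k most similar
    training instances."""
    # Rank every training instance by its similarity to the query, best first.
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: similarity(pair[0], query),
        reverse=True,
    )
    # Keep the labels of the k nearest neighbours.
    top_labels = [label for _, label in ranked[:k]]
    # Majority vote; on a tie, the label seen first among the neighbours wins.
    return Counter(top_labels).most_common(1)[0][0]
```

Because the similarity function is passed as a parameter, a distribution-aware measure such as the one the paper proposes could be substituted without changing the voting logic.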