GA-Facilitated KNN Classifier Optimization with Varying Similarity Measures

Michael R. Peterson, Travis E. Doom, Michael L. Raymer

Research output: Contribution to conferencePaper

Abstract

Genetic algorithms are powerful tools for k-nearest neighbors classifier optimization. While traditional knn classification techniques typically employ Euclidian distance to assess pattern similarity, other measures may also be utilized. Previous research demonstrates that GAs can improve predictive accuracy by searching for optimal feature weights and offsets for a cosine similarity-based knn classifier. GA-selected weights determine the classification relevance of each feature, while offsets provide alternative points of reference when assessing angular similarity. Such optimized classifiers perform competitively with other contemporary classification techniques. This paper explores the effectiveness of GA weight and offset optimization for knowledge discovery using knn classifiers with varying similarity measures. Using Euclidian distance, cosine similarity, and Pearson correlation, untrained classifiers are compared with weight-optimized classifiers for several datasets. Simultaneous weight and offset optimization experiments are also performed for cosine similarity and Pearson correlation. This type of optimization represents a novel technique for maximizing Pearson correlation-based knn performance. While unoptimized cosine and Pearson classifiers often perform worse than their Euclidian counterparts, optimized cosine and Pearson classifiers typically show equivalent or improved performance over optimized Euclidian classifiers. In some cases, offset optimization provides further improvement for knn classifiers employing cosine similarity or Pearson correlation.

Original languageAmerican English
DOIs
StatePublished - Sep 1 2005
EventProceedings of the 2005 IEEE Congress on Evolutionary Computation -
Duration: Sep 1 2005 → …

Conference

ConferenceProceedings of the 2005 IEEE Congress on Evolutionary Computation
Period9/1/05 → …

Disciplines

  • Bioinformatics
  • Communication
  • Communication Technology and New Media
  • Computer Sciences
  • Databases and Information Systems
  • Life Sciences
  • OS and Networks
  • Physical Sciences and Mathematics
  • Science and Technology Studies
  • Social and Behavioral Sciences

Cite this