Adaptive Distance Metrics for Nearest Neighbour Classification Based on Genetic Programming

  • Alexandros Agapitos
  • Michael O’Neill
  • Anthony Brabazon
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7831)

Abstract

Nearest Neighbour (NN) classification is a widely used, effective method for both binary and multi-class problems. It relies on the assumption that class conditional probabilities are locally constant. However, this assumption becomes invalid in high dimensions, where severe bias can be introduced that degrades the performance of the method. Employing a locally adaptive distance metric then becomes crucial for keeping class conditional probabilities approximately uniform within a neighbourhood, whereby better classification performance can be attained. This paper presents a locally adaptive distance metric for NN classification based on a supervised learning algorithm (Genetic Programming) that learns a vector of feature weights for the features composing an instance query. Combined with a weighted Euclidean distance metric, this has the effect of adapting neighbourhood shapes to query locations, stretching the neighbourhood along the directions in which the class conditional probabilities do not change much. Initial empirical results on a set of real-world classification datasets show that the proposed method enhances the generalisation performance of the standard NN algorithm, and that it is a competent method for pattern classification compared to other learning algorithms.
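
To make the role of the feature weights concrete, the following is a minimal sketch of NN classification under a weighted Euclidean distance. It is not the authors' implementation: the function names, the toy data and the weight vectors are all hypothetical, and the weights are fixed by hand here, whereas in the paper they are learned by Genetic Programming for each query.

```python
import math
from collections import Counter


def weighted_euclidean(x, q, w):
    """Weighted Euclidean distance: sqrt(sum_i w_i * (x_i - q_i)^2)."""
    return math.sqrt(sum(wi * (xi - qi) ** 2 for xi, qi, wi in zip(x, q, w)))


def knn_predict(train_X, train_y, query, weights, k=1):
    """Majority vote among the k training points closest to the query."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: weighted_euclidean(pair[0], query, weights),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: the class depends only on feature 0,
# while feature 1 varies without affecting the label.
train_X = [(0.0, 0.0), (1.0, 3.0)]
train_y = ["a", "b"]
query = (0.2, 2.5)

# Equal weights: the irrelevant feature 1 dominates the distance.
print(knn_predict(train_X, train_y, query, weights=(1.0, 1.0)))

# Hand-picked weights (an assumption, not learned): a small weight on
# feature 1 stretches the neighbourhood along that direction.
print(knn_predict(train_X, train_y, query, weights=(1.0, 0.01)))
```

With equal weights the query is assigned to the point that happens to be close along the irrelevant feature. Down-weighting that feature elongates the neighbourhood along it, so the prediction follows the informative feature instead.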

Keywords

Output Vector, Near Neighbour, Radial Basis Function Network, Feature Weight, Query Pattern
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Alexandros Agapitos (1)
  • Michael O’Neill (1)
  • Anthony Brabazon (1)
  1. Financial Mathematics and Computation Research Cluster, Complex and Adaptive Systems Laboratory, University College Dublin, Ireland
