A binary neural k-nearest neighbour technique

Abstract

K-nearest neighbour (k-NN) is a widely used technique for classifying and clustering data. It is effective but is often criticised for its polynomial run-time growth, because k-NN calculates the distance from each record to every other record in the data set. This paper evaluates a novel k-NN classifier, built from binary neural networks, that has linear run-time growth and a faster run-time. The binary neural approach uses robust encoding to map standard ordinal, categorical and real-valued data sets onto a binary neural network, which then uses high-speed pattern matching to recall the k best matches. We compare various configurations of the binary approach with a conventional approach in terms of memory overhead, training speed, retrieval speed and retrieval accuracy. We demonstrate that the binary approach outperforms the standard approach in speed and memory requirements, and we identify the optimal configurations.
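The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `brute_force_knn`, `bin_index` and `BinaryMemory`, the equal-width binning, the default of 10 bins, and the weighting scheme are all assumptions made for illustration. The weighting is only loosely suggested by the "parabolic kernel" keyword, since this section does not define the kernel used in the paper. The binary matrix is represented here as sets of record indices per (attribute, bin) row rather than as packed bit vectors.

```python
import heapq
import math


def brute_force_knn(records, query, k):
    """Conventional k-NN: measure the distance to every stored record."""
    dists = [(math.dist(rec, query), idx) for idx, rec in enumerate(records)]
    return [idx for _, idx in heapq.nsmallest(k, dists)]


def bin_index(value, lo, hi, n_bins):
    """Map a real value onto one of n_bins equal-width bins (clamped)."""
    if hi == lo:
        return 0
    pos = int((value - lo) / (hi - lo) * n_bins)
    return min(max(pos, 0), n_bins - 1)


class BinaryMemory:
    """Toy binary matrix: one row per (attribute, bin), one column per record."""

    def __init__(self, records, n_bins=10):
        self.n_bins = n_bins
        self.n_records = len(records)
        n_attrs = len(records[0])
        # Value range of each attribute, used to place values into bins.
        self.ranges = [
            (min(r[a] for r in records), max(r[a] for r in records))
            for a in range(n_attrs)
        ]
        # rows[a][b] holds the records whose attribute a falls in bin b,
        # i.e. the set bits of that row of the binary matrix.
        self.rows = [[set() for _ in range(n_bins)] for _ in range(n_attrs)]
        for idx, rec in enumerate(records):
            for a, value in enumerate(rec):
                lo, hi = self.ranges[a]
                self.rows[a][bin_index(value, lo, hi, n_bins)].add(idx)

    def recall(self, query, k):
        """Score every record by weighted row sums and return the k best."""
        scores = [0.0] * self.n_records
        for a, value in enumerate(query):
            lo, hi = self.ranges[a]
            centre = bin_index(value, lo, hi, self.n_bins)
            for b in range(self.n_bins):
                # Assumed parabolic weight: largest at the query's bin and
                # falling off with the squared bin distance.
                weight = self.n_bins ** 2 - (b - centre) ** 2
                for idx in self.rows[a][b]:
                    scores[idx] += weight
        return heapq.nlargest(k, range(self.n_records), key=lambda i: scores[i])


# Hypothetical toy data, for illustration only.
records = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (9.0, 9.0), (10.0, 10.0)]
query = (0.5, 0.4)
print(brute_force_knn(records, query, 2))
print(BinaryMemory(records).recall(query, 2))
```

In this sketch each record's score is a constant minus the sum of squared bin offsets, so ranking by score approximates ranking by squared Euclidean distance in the binned space. Accuracy depends on the number of bins, which is one kind of configuration choice the abstract says the paper compares.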

Keywords

Binary neural network, Correlation matrix memory, K-nearest neighbour, Parabolic kernel

Copyright information

© Springer-Verlag 2005

Authors and Affiliations

  1. Dept. of Computer Science, University of York, York, UK
