Abstract
K-Nearest Neighbour (k-NN) is a widely used technique for classifying and clustering data. k-NN is effective but is often criticised for its polynomial run-time growth, because it calculates the distance from each record in turn to every other record in the data set. This paper evaluates a novel k-NN classifier, built from binary neural networks, that has linear growth and a faster run-time. The binary neural approach uses robust encoding to map standard ordinal, categorical and real-valued data sets onto a binary neural network. The binary neural network then uses high-speed pattern matching to recall the k best matches. We compare various configurations of the binary approach with a conventional approach in terms of memory overheads, training speed, retrieval speed and retrieval accuracy. We demonstrate that the binary approach outperforms the standard approach in speed and memory requirements, and we identify the optimal configurations.
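To make the cost of the conventional approach concrete, the following is a minimal Python sketch of a brute-force k-NN classifier. It illustrates only the standard baseline that the abstract describes, not the binary neural approach evaluated in the paper. The function name `knn_classify`, the Euclidean distance metric, the majority vote and the toy data are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def knn_classify(train, labels, query, k=3):
    """Brute-force k-NN: compare the query against every training record."""
    # One distance computation per training record, so the cost of a single
    # query grows with the size of the data set.
    distances = [
        (math.dist(record, query), label)
        for record, label in zip(train, labels)
    ]
    # Sort by distance and keep the k closest records.
    distances.sort(key=lambda pair: pair[0])
    # Majority vote among the labels of the k nearest neighbours.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two small clusters labelled "a" and "b".
train = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.9)]
labels = ["a", "a", "b", "b"]
print(knn_classify(train, labels, (1.1, 0.9), k=3))
```

Classifying every record in a data set this way repeats the full scan once per record, which is the source of the run-time growth criticised above.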
Cite this article
Hodge, V., Austin, J. A binary neural k-nearest neighbour technique. Knowl Inf Syst 8, 276–291 (2005). https://doi.org/10.1007/s10115-004-0191-4
DOI: https://doi.org/10.1007/s10115-004-0191-4