Using k-Nearest Neighbor and Feature Selection as an Improvement to Hierarchical Clustering

Mylonas, Phivos; Wallace, Manolis; Kollias, Stefanos

doi:10.1007/978-3-540-24674-9_21

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3025)

Included in the following conference series:

Hellenic Conference on Artificial Intelligence

Abstract

Clustering of data is a difficult problem that arises in many fields and applications. The challenge grows as the dimensionality of the input space increases and as features are measured on different scales. Hierarchical clustering methods are more flexible than their partitioning counterparts, as they do not need the number of clusters as input. Still, plain hierarchical clustering does not provide a satisfactory framework for extracting meaningful results in such cases. Major drawbacks have to be tackled, such as the curse of dimensionality and initial error propagation, as well as complexity and data set size issues. In this paper we propose an unsupervised extension to hierarchical clustering by means of feature selection, in order to overcome the first drawback, thus increasing the robustness of the whole algorithm. The results of applying this clustering to a portion of the data set in question are then refined and extended to the whole data set through a classification step, using the k-nearest neighbor classification technique, in order to tackle the latter two problems. The performance of the proposed methodology is demonstrated through its application to a variety of well-known, publicly available data sets.
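The two-stage idea in the abstract (cluster a portion of the data hierarchically, then extend the result to the remaining points with k-nearest neighbor classification) can be illustrated with a minimal sketch. This is not the authors' implementation: the feature selection step is omitted because the abstract does not specify it, and the average linkage, the distance threshold `max_merge_dist`, the value `k=3`, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def dist(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(points, max_merge_dist):
    """Average-linkage agglomerative clustering (assumed linkage choice).

    Merges the two closest clusters until the smallest inter-cluster
    distance exceeds max_merge_dist, so no cluster count is required.
    Returns one cluster label per input point.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(c1, c2):
        total = sum(dist(points[i], points[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the closest pair of clusters (a < b).
        d, a, b = min(
            (linkage(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if d > max_merge_dist:
            break
        clusters[a] += clusters[b]
        del clusters[b]

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def knn_label(query, sample, sample_labels, k=3):
    """Majority label among the k sample points nearest to query."""
    nearest = sorted(range(len(sample)), key=lambda i: dist(query, sample[i]))[:k]
    return Counter(sample_labels[i] for i in nearest).most_common(1)[0][0]


# Toy data: two well-separated groups in 2-D (made up for illustration).
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2),
        (5.0, 5.1), (5.2, 4.9), (4.9, 5.3), (5.1, 5.0)]

# Step 1: cluster only a portion of the data hierarchically.
sample = [data[0], data[1], data[2], data[4], data[5], data[6]]
sample_labels = agglomerate(sample, max_merge_dist=1.0)

# Step 2: extend the clustering to the remaining points with k-NN.
rest = [data[3], data[7]]
rest_labels = [knn_label(p, sample, sample_labels, k=3) for p in rest]
```

Running the expensive pairwise clustering only on the sample keeps its cost bounded, while each remaining point needs only distances to the sample. That is one plausible reading of how the classification step addresses the complexity and data set size issues named in the abstract.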

Author information

Authors and Affiliations

School of Electrical and Computer Engineering, National Technical University of Athens, 9, Iroon Polytechniou Str., 157 73, Zographou, Athens, Greece
Phivos Mylonas, Manolis Wallace & Stefanos Kollias

Editor information

Editors and Affiliations

Info and Communication Systems Eng, Aegean University, 83200, Karlovassi, Samos, Greece
George A. Vouros
Department of Informatics, University of Piraeus, Piraeus, Greece
Themistoklis Panayiotopoulos

About this paper

Cite this paper

Mylonas, P., Wallace, M., Kollias, S. (2004). Using k-Nearest Neighbor and Feature Selection as an Improvement to Hierarchical Clustering. In: Vouros, G.A., Panayiotopoulos, T. (eds) Methods and Applications of Artificial Intelligence. SETN 2004. Lecture Notes in Computer Science, vol 3025. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24674-9_21

DOI: https://doi.org/10.1007/978-3-540-24674-9_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21937-8
Online ISBN: 978-3-540-24674-9
eBook Packages: Springer Book Archive
