Triangular Kernel Nearest-Neighbor-Based Clustering Algorithm for Discovering True Clusters

Musdholifah, Aina; Hashim, Siti Zaiton Mohd

doi:10.1007/978-3-642-36778-6_11

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7769)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

Clustering is a powerful exploratory technique for extracting knowledge from data. Many of the clustering techniques that have been proposed require the number of clusters to be fixed in advance. In contrast, triangular kernel nearest-neighbor-based clustering (TKNN) has been shown to determine both the number of clusters and their members automatically. TKNN provides good solutions for clustering non-spherical and high-dimensional data without prior knowledge of data labels. However, there is no definitive measure for evaluating the accuracy of a clustering result. To evaluate the performance of the proposed TKNN clustering algorithm, we therefore used various benchmark classification datasets, treating TKNN as a method for discovering the true clusters of arbitrary shape, size and density contained in those datasets. The experimental results on the benchmark datasets showed the effectiveness of our technique. The proposed TKNN achieved more accurate clustering results and required less processing time than k-means, ILGC, DBSCAN and KFCM.
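
The abstract does not describe how TKNN scores points. The following is therefore only a minimal sketch of the two ingredients the name suggests, a triangular kernel and a nearest-neighbor bandwidth, and should not be read as the authors' algorithm. The function names, the choice of bandwidth (the distance to the k-th nearest neighbor), and the unnormalized division by h^dim are all assumptions made for illustration.

```python
import math


def triangular_kernel(u):
    """Triangular kernel: 1 - |u| for |u| <= 1, and 0 outside that range."""
    return max(0.0, 1.0 - abs(u))


def knn_triangular_density(points, k):
    """Unnormalized local density score for each point (illustrative only).

    Assumption: the bandwidth h of a point is the Euclidean distance to its
    k-th nearest neighbor, and the score is the sum of triangular-kernel
    weights over those k neighbors divided by h ** dim.
    """
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")
    dim = len(points[0])
    scores = []
    for i, p in enumerate(points):
        # Sorted distances from p to every other point.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        h = dists[k - 1]
        if h == 0.0:
            # All k nearest neighbors coincide with p: treat as infinitely dense.
            scores.append(math.inf)
            continue
        weight = sum(triangular_kernel(d / h) for d in dists[:k])
        scores.append(weight / h ** dim)
    return scores


# Four points in a tight square plus one distant point.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
scores = knn_triangular_density(data, k=3)
```

In this toy example the distant point receives the lowest score, which is the kind of signal a density-based method can use to separate clusters from isolated points without being told how many clusters to expect.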

Author information

Authors and Affiliations

Soft Computing Research Group, Universiti Teknologi Malaysia, Malaysia
Aina Musdholifah & Siti Zaiton Mohd Hashim
Department of Computer Science and Electronics, Universitas Gadjah Mada, Indonesia
Aina Musdholifah

Editor information

Editors and Affiliations

ISIR, Osaka University, 8-1, Mihogaoka, Ibaraki, Osaka, Japan
Takashi Washio
Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, 1068 Xueyuan Boulevard, 518055, Shenzhen, Guangdong, China
Jun Luo

About this paper

Cite this paper

Musdholifah, A., Hashim, S.Z.M. (2013). Triangular Kernel Nearest-Neighbor-Based Clustering Algorithm for Discovering True Clusters. In: Washio, T., Luo, J. (eds) Emerging Trends in Knowledge Discovery and Data Mining. PAKDD 2012. Lecture Notes in Computer Science, vol 7769. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36778-6_11

DOI: https://doi.org/10.1007/978-3-642-36778-6_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36777-9
Online ISBN: 978-3-642-36778-6
eBook Packages: Computer Science, Computer Science (R0)
