Triangular Kernel Nearest-Neighbor-Based Clustering Algorithm for Discovering True Clusters

  • Conference paper
Emerging Trends in Knowledge Discovery and Data Mining (PAKDD 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7769)

Abstract

Clustering is a powerful exploratory technique for extracting knowledge from data. Many proposed clustering techniques require a predetermined number of clusters. In contrast, triangular kernel nearest-neighbor-based clustering (TKNN) has been shown to determine both the number of clusters and their members automatically. TKNN provides good solutions for clustering non-spherical and high-dimensional data without prior knowledge of data labels. However, there is no definitive measure for evaluating the accuracy of a clustering result. To evaluate the performance of the TKNN clustering algorithm, we therefore used various benchmark classification datasets. TKNN is proposed for discovering the true clusters of arbitrary shape, size, and density contained in these datasets. The experimental results on the benchmark datasets show the effectiveness of our technique: TKNN achieved more accurate clustering results and required less processing time than k-means, ILGC, DBSCAN, and KFCM.
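The abstract does not say which score was used to compare clusterings against the class labels of the benchmark classification datasets. As a rough illustration of the general idea only, the sketch below computes cluster purity, a common external measure chosen here as an assumption. The function name `purity` and the toy data are hypothetical and do not come from the paper.

```python
from collections import Counter


def purity(cluster_ids, class_labels):
    """Fraction of points that belong to the majority class of their cluster.

    Assumption: purity stands in for whatever external measure the paper
    actually used; the abstract does not name one.
    """
    if len(cluster_ids) != len(class_labels):
        raise ValueError("cluster_ids and class_labels must have the same length")
    if not cluster_ids:
        raise ValueError("inputs must be non-empty")

    # Count the known class labels inside each discovered cluster.
    by_cluster = {}
    for cid, label in zip(cluster_ids, class_labels):
        by_cluster.setdefault(cid, Counter())[label] += 1

    # Each cluster contributes the size of its largest class.
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(cluster_ids)


# Toy example (made-up data): three discovered clusters, three true classes.
clusters = [0, 0, 0, 1, 1, 1, 2, 2]
labels = ["a", "a", "b", "b", "b", "b", "c", "a"]
score = purity(clusters, labels)
```

Purity rewards clusters that are dominated by a single class, but it also increases as the number of clusters grows, so it is usually read alongside the number of clusters an algorithm discovers.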

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Musdholifah, A., Hashim, S.Z.M. (2013). Triangular Kernel Nearest-Neighbor-Based Clustering Algorithm for Discovering True Clusters. In: Washio, T., Luo, J. (eds) Emerging Trends in Knowledge Discovery and Data Mining. PAKDD 2012. Lecture Notes in Computer Science, vol 7769. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36778-6_11

  • DOI: https://doi.org/10.1007/978-3-642-36778-6_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-36777-9

  • Online ISBN: 978-3-642-36778-6

  • eBook Packages: Computer Science, Computer Science (R0)
