
Fuzzy Shared Nearest Neighbor Clustering

International Journal of Fuzzy Systems

Abstract

The shared nearest neighbor (SNN) clustering algorithm is a robust, efficient, graph-based clustering method that can handle high-dimensional data. SNN clustering works well when the data consist of clusters that are diverse in shape, density, and size, but it does not accurately assign data points lying in the boundary regions of overlapping clusters. To overcome this problem, we present an extension of the shared nearest neighbor algorithm that uses fuzzy concepts to better handle data points lying in the boundary regions, specifically for overlapping clusters. Extensive experiments were carried out to compare the proposed approach, fuzzy shared nearest neighbor clustering (FSNN), with the existing clustering methods K-means, Fuzzy C-means, Density_clust, and Shared Nearest Neighbor. The effectiveness of FSNN is evaluated on benchmark datasets. Experimental results show that FSNN can accurately cluster the data points lying in the overlapping partition and generates compact and well-separated clusters compared with state-of-the-art clustering algorithms. The results obtained using the different clustering methods are validated by standard cluster validation measures.
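The shared-nearest-neighbor idea that the abstract builds on can be illustrated with a minimal sketch. It assumes the common definition of SNN similarity as the number of points two k-nearest-neighbor lists have in common. The function names, the toy data, and in particular the `soft_memberships` rule are hypothetical illustrations; the abstract does not state the FSNN membership formula, so this should not be read as the authors' method.

```python
import math

def knn_lists(points, k):
    """For each point, return the set of indices of its k nearest neighbors (Euclidean)."""
    neighbors = []
    for i, p in enumerate(points):
        others = (j for j in range(len(points)) if j != i)
        order = sorted(others, key=lambda j: math.dist(p, points[j]))
        neighbors.append(set(order[:k]))
    return neighbors

def snn_similarity(neighbors, i, j):
    """SNN similarity: how many neighbors points i and j share."""
    return len(neighbors[i] & neighbors[j])

def soft_memberships(neighbors, i, clusters):
    """ASSUMPTION (not from the paper): the degree of membership of point i in
    each cluster is its summed SNN similarity to that cluster's members,
    normalized so the degrees sum to 1."""
    scores = [
        sum(snn_similarity(neighbors, i, j) for j in members if j != i)
        for members in clusters
    ]
    total = sum(scores)
    if total == 0:
        # No shared neighbors with any cluster: fall back to equal degrees.
        return [1.0 / len(clusters)] * len(clusters)
    return [s / total for s in scores]

# Toy data: two compact groups and one point (index 8) between them.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (5, 0), (5, 1), (6, 0), (6, 1),
          (3, 0.5)]
clusters = [[0, 1, 2, 3], [4, 5, 6, 7]]
nb = knn_lists(points, k=3)
boundary_degrees = soft_memberships(nb, 8, clusters)
```

A hard assignment would force the in-between point into a single cluster, whereas graded degrees such as `boundary_degrees` keep its ambiguity visible, which is the motivation the abstract gives for a fuzzy extension.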



Author information

Correspondence to Rika Sharma or Kesari Verma.

About this article

Cite this article

Sharma, R., Verma, K. Fuzzy Shared Nearest Neighbor Clustering. Int. J. Fuzzy Syst. 21, 2667–2678 (2019). https://doi.org/10.1007/s40815-019-00699-7

  • DOI: https://doi.org/10.1007/s40815-019-00699-7
