
Improving the performance of visualized clustering method

  • Original Article
International Journal of System Assurance Engineering and Management

Abstract

Clustering is a form of exploratory data analysis in which similar objects are grouped into subsets according to shared properties. Determining the number of clusters is an important problem: k-means, for example, gives poor clustering results when the user supplies an incorrect value of k. Visual assessment of cluster tendency (VAT) is a widely used technique for discovering the number of clusters. Bezdek et al. later introduced extensions of VAT such as SpecVAT and iVAT. SpecVAT uses a spectral approach and produces more accurate clustering results than VAT, but it cannot resolve the clustering tendency problem for path-based clustered data; iVAT addresses this issue. All of these techniques compute the dissimilarity matrix in a Euclidean space. In this paper, we use a multi-viewpoint based similarity (MVS) cosine metric to achieve more robust results. We propose two methods, cSpecVAT and GMMMVS-VAT. cSpecVAT combines the cosine metric with spectral concepts and produces effective clustering results across a range of datasets: synthetic, real, genetic and image. For audio datasets we propose GMMMVS-VAT, which models the speech data with a Gaussian mixture model (GMM) and then applies MVS to extract similarity features with reference to multiple viewpoints; it therefore works more effectively on speech datasets. Because MVS uses a number of viewpoints as reference, it is more robust than a single-viewpoint approach.
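To make the VAT idea concrete, the following is a minimal sketch of the standard VAT reordering step applied to a cosine dissimilarity matrix rather than a Euclidean one. It is not the cSpecVAT or GMMMVS-VAT method of the paper (no spectral step, GMM modelling, or multi-viewpoint similarity is included); the function names `cosine_dissimilarity` and `vat_order` are illustrative assumptions.

```python
import numpy as np

def cosine_dissimilarity(X):
    """Pairwise cosine dissimilarity, 1 - cos(x_i, x_j), for the rows of X."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.where(norms == 0, 1.0, norms)   # leave zero rows unscaled
    D = 1.0 - Xn @ Xn.T
    np.fill_diagonal(D, 0.0)                    # remove rounding noise on the diagonal
    return np.clip(D, 0.0, 2.0)

def vat_order(D):
    """Prim-like VAT ordering of a symmetric dissimilarity matrix D.

    Returns the permutation and the reordered matrix; dark diagonal blocks
    in an image of the reordered matrix suggest the number of clusters.
    """
    n = D.shape[0]
    # Start from one endpoint of the largest dissimilarity.
    i, _ = np.unravel_index(np.argmax(D), D.shape)
    order = [int(i)]
    remaining = [k for k in range(n) if k != i]
    while remaining:
        # Pick the unselected object closest to any already selected object.
        sub = D[np.ix_(order, remaining)]
        j = remaining[int(np.argmin(sub.min(axis=0)))]
        order.append(j)
        remaining.remove(j)
    order = np.array(order)
    return order, D[np.ix_(order, order)]
```

Displaying the reordered matrix as a grayscale image (for example with `matplotlib.pyplot.imshow`) gives the usual VAT picture, in which each dark block along the diagonal corresponds to a candidate cluster.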

References

  • Bezdek JC (2002) VAT: a tool for visual assessment of cluster tendency. Proc Int Joint Conf Neural Netw 3:2225–2230

  • Bezdek JC, Pal NR (1998) Some new indexes of clustering validity. IEEE Trans Syst Man Cybernet 28(3):301–315

  • Bolshakova N, Azuaje F (2003) Cluster validation techniques for genome expression data. Sig Process 83:825–833

  • Cai D, He X, Han J (2005) Document clustering using locality preserving indexing. IEEE Trans Knowl Data Eng 17(12):1624–1637

  • Cheng Y (1995) Mean shift, mode seeking, and clustering. IEEE Trans Pattern Anal Mach Intell 17(8):790–799

  • Comaniciu D, Meer P (2002) Mean shift: a robust approach toward feature space analysis. IEEE Trans Pattern Anal Mach Intell 24(5):603–619

  • Dehak N, Dehak R, Glass J, Reynolds D, Kenny P (2010) Cosine similarity scoring without score normalization techniques. In: Proceedings of IEEE Odyssey workshop, Brno

  • Duda RO, Hart PE, Stork DG (2000) Pattern classification, 2nd edn. Wiley, New York

  • Eswara Reddy B, Rajendra Prasad K (2012) Reducing runtime values in minimum spanning tree based clustering by visual access tendency. Int J Data Min Knowl Manag Process 2(3):11–22

  • Fukunaga K, Hostetler L (1975) The estimation of the gradient of a density function with applications in pattern recognition. IEEE Trans Inf Theory 21(1):32–40

  • Halkidi M, Batistakis Y, Vazirgiannis M (2001) On clustering validity techniques. J Intell Inform Syst 17(2):107–145

  • Havens TC, Bezdek JC (2010) An efficient formulation of the improved visual assessment of cluster tendency (iVAT) algorithm. IEEE Trans Knowl Data Eng 22(10):1401–1413

  • Jain AK, Murty MN, Flynn PJ (1999) Data clustering: a review. ACM Comput Surv 31(3):266–320

  • Kenny P, Boulianne G (2007) Speaker and session variability in GMM based speaker verification. IEEE Trans Audio Speech Lang Process 15(4):1448–1460

  • Lovasz L, Plummer M (1986) Matching theory. Akadémiai Kiadó, Budapest

  • Nguyen DT (2012) Clustering with multi-viewpoint based similarity measure. IEEE Trans Knowl Data Eng 24(6):988–1001

  • Otsu N (1979) A threshold selection method from gray-level histograms. IEEE Trans Syst Man Cybernet 9(1):62–66

  • Pekalska E, Harol A, Duin RPW, Spillmann B, Bunke H (2006) Non-Euclidean or non-metric measures can be informative. In: Yeung D-Y et al (eds) SSPR & SPR 2006. LNCS, vol 4109. Springer, Heidelberg, pp 871–880

  • Popescu M, Bezdek JC, Havens TC, Keller JM (2013) A clustering validity framework based on induced partition dissimilarity. IEEE Trans Cybern 43(1):308–320

  • Ramze Rezaee M, Lelieveldt BPF (1998) A new cluster validity index for the fuzzy c-mean. Pattern Recognit Lett 19:237–246

  • Reynolds DA (1995) Speaker identification and verification using Gaussian mixture speaker models. Speech Commun 17:91–108

  • Reynolds D, Quatieri T, Dunn R (2000) Speaker verification using adapted Gaussian mixture models. Digit Signal Process 10(3):19–41

  • Senoussaoui M, Kenny P (2014) A study of the cosine distance-based mean shift for telephone speech diarization. IEEE/ACM Trans Audio Speech Lang Process 22(1):217–227

  • Senoussaoui M, Kenny P, Stafylakis T, Dumouchel P (2013) Efficient iterative mean shift based cosine dissimilarity for multi-recording speaker clustering. In: Proceedings of ICASSP, pp 7712–7715

  • Senoussaoui M, Kenny P, Stafylakis T, Dumouchel P (2014) A study of the cosine distance-based mean shift for telephone speech diarization. IEEE Trans Audio Speech Lang Process 22(1):217–227

  • Tang H, Chu SM (2012) Partially supervised speaker clustering. IEEE Trans Pattern Anal Mach Intell 34(5):959–971

  • Wang L, Bezdek JC (2009) Automatically determining the number of clusters in unlabeled datasets. IEEE Trans Knowl Data Eng 21(3):335–349

  • Wang L, Bezdek JC (2010) Enhanced visual analysis for cluster tendency assessment and data partitioning. IEEE Trans Knowl Data Eng 22(10):1401–1413

  • Wang X, Wang X, Wilkes DM (2009) A divide-and-conquer approach for minimum spanning tree-based clustering. IEEE Trans Knowl Data Eng 21(7):945–958

  • (1998) http://archive.ics.uci.edu/ml/datasets.html

  • Georghiades A et al (2001) Yale face database. http://vision.ucsd.edu/leekc/ExtYaleDatabase/ExtYaleB.html

  • (2012) http://www.exploredata.net/Downloads/Gene-Expression-Data-Set

  • Yan Y, Chen L, Nguyen DT (2012) Semi-supervised clustering with multi-viewpoint based similarity measure. In: WCCI 2012 IEEE world congress on computational intelligence, Brisbane, pp 1–8

Author information

Corresponding author

Correspondence to K. Rajendra Prasad.

About this article

Cite this article

Eswara Reddy, B., Rajendra Prasad, K. Improving the performance of visualized clustering method. Int J Syst Assur Eng Manag 7 (Suppl 1), 102–111 (2016). https://doi.org/10.1007/s13198-015-0342-x

  • DOI: https://doi.org/10.1007/s13198-015-0342-x
