Improving the performance of visualized clustering method

Eswara Reddy, B.; Rajendra Prasad, K.

doi:10.1007/s13198-015-0342-x

Original Article
Published: 07 February 2015

Volume 7, pages 102–111, (2016)
International Journal of System Assurance Engineering and Management

B. Eswara Reddy¹ &
K. Rajendra Prasad²

Abstract

In data analysis, clustering is an exploratory technique that groups similar objects into subsets according to shared properties. Discovering the number of clusters is an important issue in clustering: k-means, for example, gives poor results when the user supplies an incorrect value of k. Visual assessment of cluster tendency (VAT) is a widely used technique for discovering the number of clusters. Recently, Bezdek et al. introduced extensions of VAT such as SpecVAT and iVAT. SpecVAT uses a spectral approach and produces more accurate clustering results than VAT, but it is unable to solve the clustering tendency problem for path-based clustered data; the iVAT technique solves this issue. These techniques compute the dissimilarity matrix in a Euclidean space. In this paper, we use a multi-viewpoint based similarity (MVS) cosine metric to achieve more robust results. We propose two methods, cSpecVAT and GMMMVS-VAT. cSpecVAT combines the cosine metric with spectral concepts and extracts efficient clustering results over comprehensive datasets, including synthetic, real, genetic and image data. For audio datasets we propose GMMMVS-VAT, which includes the following steps: modelling the speech data with a Gaussian mixture model (GMM), and applying MVS to extract similarity features with reference to multiple viewpoints; hence, it works more effectively on speech datasets. Because MVS uses a number of viewpoints as reference, it is more robust than a single-viewpoint approach.
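
As a rough illustration of the idea behind the abstract, the sketch below builds a cosine dissimilarity matrix and applies the standard VAT reordering, so that well-separated clusters show up as dark blocks along the diagonal of the reordered matrix. It is not the cSpecVAT or GMMMVS-VAT method of the paper, whose spectral, GMM and MVS steps are not specified here; the function names `cosine_dissimilarity`, `vat_order` and `vat_image` and the toy data are assumptions made for this example.

```python
import numpy as np

def cosine_dissimilarity(X):
    """Pairwise 1 - cos(x_i, x_j) for the rows of X (hypothetical helper)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    unit = X / np.clip(norms, 1e-12, None)   # guard against zero vectors
    D = 1.0 - unit @ unit.T
    np.fill_diagonal(D, 0.0)                 # an object is identical to itself
    return np.clip(D, 0.0, 2.0)              # remove tiny negative round-off

def vat_order(D):
    """VAT ordering: a Prim-like traversal of the dissimilarity matrix D."""
    n = D.shape[0]
    # Start from one endpoint of the largest dissimilarity in D.
    first = int(np.unravel_index(np.argmax(D), D.shape)[0])
    order = [first]
    remaining = [i for i in range(n) if i != first]
    # min_dist[j]: smallest dissimilarity from j to any already selected object.
    min_dist = D[first].copy()
    while remaining:
        nxt = min(remaining, key=lambda j: min_dist[j])
        order.append(nxt)
        remaining.remove(nxt)
        min_dist = np.minimum(min_dist, D[nxt])
    return np.array(order)

def vat_image(D):
    """Return the reordered dissimilarity matrix and the ordering used."""
    order = vat_order(D)
    return D[np.ix_(order, order)], order

# Toy data (assumed): two groups of vectors pointing in different directions.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.1],
              [0.1, 1.0], [1.0, 0.05], [0.05, 1.0]])
reordered, order = vat_image(cosine_dissimilarity(X))
```

Displaying `reordered` as a grayscale image (for example with matplotlib's `imshow`) gives the visual cue: the number of dark diagonal blocks suggests the number of clusters. Swapping `cosine_dissimilarity` for a Euclidean distance matrix recovers the setting the abstract attributes to the earlier VAT variants.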

References

Bezdek J (2002) VAT: a tool for visual assessment of cluster tendency. Proc Int Joint Conf Neural Netw 3:2225–2230
Bezdek JC, Pal NR (1998) Some new indexes of clustering validity. IEEE Trans Syst Man Cybernet 28(3):301–315
Bolshakova N, Azuaje F (2003) Cluster validation techniques for genome expression data. Sig Process 83:825–833
Cai D, He X, Han J (2005) Document clustering using locality preserving indexing. IEEE Trans Knowl Data Eng 17(2):1624–1637
Cheng Y (1995) Mean shift, mode seeking, and clustering. IEEE Trans Pattern Anal Mach Intell 17(8):790–799
Comaniciu D, Meer P (2002) Mean shift: a robust approach toward feature space analysis. IEEE Trans Pattern Anal Mach Intell 24(5):603–619
Dehak N, Dehak R, Glass J, Reynolds D, Kenny P (2010) Cosine similarity scoring without score normalization techniques. In: Proceedings of IEEE Odyssey workshop, Brno
Duda RO, Hart PE, Stork DG (2000) Pattern Classification, 2nd edn. Wiley, New York
Eswara Reddy B, Rajendra Prasad K (2012) Reducing runtime values in minimum spanning tree based clustering by visual access tendency. Int J Data Min Knowl Manag Process 2(3):11–22
Fukunaga K, Hostetler L (1975) The estimation of the gradient of a density function with applications in pattern recognition. IEEE Trans Inf Theory 21(1):32–40
Halkidi M, Batistakis Y, Vazirgiannis M (2001) On clustering validity techniques. J Intell Inform Syst 17(2):107–145
Havens TC, Bezdek JC (2010) An efficient formulation of the improved visual assessment of cluster tendency (iVAT) algorithm. IEEE Trans Knowl Data Eng 22(10):1401–1413
Jain AK, Murty MN, Flynn PJ (1999) Data clustering: a review. ACM Comput Surv 31(3):266–320
Kenny P, Boulianne G (2007) Speaker and session variability in GMM based speaker verification. IEEE Trans Audio Speech Lang Process 15(4):1448–1460
Lovasz L, Plummer M (1986) Matching theory. Akadémiai Kiadó, Budapest
Nguyen DT (2012) Clustering with multi-viewpoint based similarity measure. IEEE Trans Knowl Data Eng 24(6):988–1001
Otsu N (1979) A threshold selection method from gray-level histograms. IEEE Trans Syst Man Cybernet 9(1):62–66
Pekalska E, Harol A, Duin RPW, Spillmann B, Bunke H (2006) Non-Euclidean or non-metric measures can be informative. In: Yeung D-Y et al (eds) SSPR & SPR 2006. LNCS, vol 4109. Springer, Heidelberg, pp 871–880
Popescu M, Bezdek JC, Havens TC, Keller JM (2013) A clustering validity framework based on induced partition dissimilarity. IEEE Trans Cybern 43(1):308–320
Ramze Rezaee M, Lelieveldt BPF (1998) A new cluster validity index for the fuzzy c-mean. Pattern Recognit Lett 19:237–246
Reynolds DA (1995) Speaker identification and verification using Gaussian mixture speaker models. Speech Commun 17:91–108
Reynolds D, Quatieri T, Dunn R (2000) Speaker verification using adapted Gaussian mixture models. Digit Signal Process 10(3):19–41
Senoussaoui M, Kenny P (2014) A study of the cosine distance-based mean shift for telephone speech diarization. IEEE/ACM Trans Audio Speech Lang Process 22(1):217–227
Senoussaoui M, Kenny P, Stafylakis T, Dumouchel P (2013) Efficient iterative mean shift based cosine dissimilarity for multi-recording speaker clustering. In: Proceedings of ICASSP, pp 7712–7715
Senoussaoui M, Kenny P, Stafylakis T, Dumouchel P (2014) A study of the cosine distance-based mean shift for telephone speech diarization. IEEE Trans Audio, Speech Lang Process 22(1):217–227
Tang H, Chu SM (2012) Partially supervised speaker clustering. IEEE Trans Pattern Anal Mach Intell 34(5):959–971
Wang L, Bezdek J (2009) Automatically determining the number of clusters in unlabeled datasets. IEEE Trans Knowl Data Eng 21(3):335–349
Wang L, Bezdek J (2010) Enhanced visual analysis for cluster tendency assessment and data partitioning. IEEE Trans Knowl Data Eng 22(10):1401–1413
Wang X, Wang X, Wilkes DM (2009) A divide-and-conquer approach for minimum spanning tree-based clustering. IEEE Trans Knowl Data Eng 21(7):945–958
(1998) http://archive.ics.uci.edu/ml/datasets.html
Georghiades A et al (2001) Yale face database. http://vision.ucsd.edu/leekc/ExtYaleDatabase/ExtYaleB.html
(2012) http://www.exploredata.net/Downloads/Gene-Expression-Data-Set
Yan Y, Chen L, Nguyen DT (2012) Semi-supervised clustering with multi-viewpoint based similarity measure. In: WCCI 2012 IEEE world congress on computational intelligence, Brisbane, pp 1–8

Author information

Authors and Affiliations

CSE Department, JNTUA College of Engineering, Ananthapur, Andhra Pradesh, India
B. Eswara Reddy
JNTUA College of Engineering, Ananthapur, Andhra Pradesh, India
K. Rajendra Prasad

Corresponding author

Correspondence to K. Rajendra Prasad.

About this article

Cite this article

Eswara Reddy, B., Rajendra Prasad, K. Improving the performance of visualized clustering method. Int J Syst Assur Eng Manag 7 (Suppl 1), 102–111 (2016). https://doi.org/10.1007/s13198-015-0342-x

Received: 16 August 2014
Revised: 08 December 2014
Published: 07 February 2015
Issue Date: December 2016
DOI: https://doi.org/10.1007/s13198-015-0342-x
