Pattern Analysis and Applications, Volume 20, Issue 1, pp 21–31

Cluster validity index based on Jeffrey divergence

  • Ahmed Ben Said
  • Rachid Hadjidj
  • Sebti Foufou
Theoretical Advances

Abstract

Cluster validity indexes are important tools designed for two purposes: comparing the performance of clustering algorithms and determining the number of clusters that best fits the data. These indexes are generally constructed by combining a measure of compactness with a measure of separation. A classical measure of compactness is the variance. For separation, the distance between cluster centers is typically used. However, such a distance does not always reflect the quality of the partition between clusters and sometimes gives misleading results. In this paper, we propose a new cluster validity index in which Jeffrey divergence is used to measure the separation between clusters. Experiments are conducted on different types of data, and a comparison with widely used cluster validity indexes shows that the proposed index outperforms them.
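
The abstract does not state the formula of the index, so the following is only a minimal sketch of the underlying idea, not the authors' method. It assumes one common definition of Jeffrey divergence (the symmetrised Kullback-Leibler divergence) and assumes each cluster is modelled as a one-dimensional Gaussian; the function name and the example values are hypothetical. It illustrates why a divergence can separate two situations that the distance between centers cannot.

```python
import math


def jeffrey_divergence_gaussian(mu1, var1, mu2, var2):
    """Symmetrised KL divergence, KL(p||q) + KL(q||p), for two 1-D Gaussians.

    Assumption: clusters are summarised by a mean and a variance.
    The log-determinant terms of the two KL divergences cancel in the sum.
    """
    if var1 <= 0 or var2 <= 0:
        raise ValueError("variances must be positive")
    squared_center_distance = (mu1 - mu2) ** 2
    # Contribution from the mismatch between the two spreads.
    spread_term = var1 / var2 + var2 / var1 - 2.0
    # Contribution from the centers, scaled by the inverse variances.
    mean_term = squared_center_distance * (1.0 / var1 + 1.0 / var2)
    return 0.5 * (spread_term + mean_term)


# Two pairs of clusters whose centers are the same distance apart (3.0).
tight = jeffrey_divergence_gaussian(0.0, 0.25, 3.0, 0.25)  # compact clusters
loose = jeffrey_divergence_gaussian(0.0, 4.0, 3.0, 4.0)    # heavily overlapping

print(f"tight clusters: {tight:.2f}")
print(f"loose clusters: {loose:.2f}")
```

In both pairs the center distance is identical, yet the divergence is much larger for the compact pair, because the cluster spread enters the separation measure directly.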

Keywords

Clustering, Cluster validity index, Jeffrey divergence

Notes

Acknowledgments

This publication was made possible by NPRP Grant # 4-1165-2-453 from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors.


Copyright information

© Springer-Verlag London 2015

Authors and Affiliations

  • Ahmed Ben Said (1, 2)
  • Rachid Hadjidj (1)
  • Sebti Foufou (1)

  1. CSE Department, College of Engineering, Qatar University, Doha, Qatar
  2. LE2I Lab, UMR CNRS 6306, University of Burgundy, Dijon, France