Overview on Cluster Analysis

  • Kweku-Muata Osei-Bryson
  • Sergey Samoilenko
Part of the Integrated Series in Information Systems book series (ISIS, volume 34)


This chapter provides an overview of cluster analysis. Its main purpose is to introduce the reader to the major concepts underlying this data mining (DM) technique, particularly those relevant to the chapter that applies it. It also provides an illustrative example of cluster analysis.


Cluster Algorithm; Domain Expert; Cluster Validity; Fraud Detection; Natural Cluster



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Department of Information Systems, Virginia Commonwealth University, Richmond, USA
  2. Department of Computer Science, Averett University, Danville, USA
