Exploring Network Behavior Using Cluster Analysis



Innovation increasingly occurs in network environments. Identifying the important players in the innovative process, namely “the innovators”, is key to understanding how innovation happens. Doing this requires flexible analysis tools tailored to work well with the complex data sets generated within such environments. One such tool, cluster analysis, organizes a large data set into discrete groups based on patterns of similarity. It can be used to discover data patterns in networks without requiring strong ex ante assumptions about the properties of either the data-generating process or the environment. This paper reviews key procedures and algorithms related to cluster analysis. Further, it demonstrates how to choose among these methods to identify the characteristics of players in a network experiment where innovation emerges endogenously.

JEL Classification

C46 C81 



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Department of Economics, Weber State University, Ogden, USA
  2. ICES, Department of Economics, George Mason University, Fairfax, USA
