
Lowdimensional Additive Overlapping Clustering

Journal of Classification

Abstract

To reveal the structure underlying two-way two-mode object by variable data, Mirkin (1987) proposed an additive overlapping clustering model. This model implies an overlapping clustering of the objects and a reconstruction of the data in which the reconstructed variable profile of an object is the sum of the variable profiles of the clusters to which it belongs. Grasping the additive (overlapping) clustering structure of object by variable data may, however, be seriously hampered when the data include a very large number of variables. To deal with this problem, we propose a new model that simultaneously clusters the objects into overlapping clusters and reduces the variable space; as such, the model implies that the cluster profiles and, hence, the reconstructed data profiles are constrained to lie in a low-dimensional space. An alternating least squares (ALS) algorithm for fitting the new model to a given data set is presented, along with a simulation study and an illustrative example that uses empirical data.
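
The additive reconstruction described in the abstract can be illustrated with a minimal NumPy sketch. The notation is an assumption made here for illustration and is not taken from the abstract: `A` is a binary object by cluster membership matrix, `P` holds one variable profile per cluster, and the data are approximated by `A @ P`. The optional rank restriction uses a generic reduced-rank regression step, assuming `A` has full column rank; it is not claimed to be the ALS update of the article.

```python
import numpy as np

def reconstruct(A, P):
    """Additive reconstruction: row i of the result is the sum of the
    profiles (rows of P) of all clusters that object i belongs to."""
    return A @ P

def fit_profiles(X, A, rank=None):
    """Least-squares cluster profiles for fixed memberships A.

    If `rank` is given, the profiles are restricted to a subspace of that
    dimension via a reduced-rank regression step (an illustrative choice,
    not necessarily the update used in the article).
    """
    P, *_ = np.linalg.lstsq(A, X, rcond=None)  # unconstrained profiles
    if rank is None:
        return P
    # Best rank-`rank` approximation of the fitted values A @ P:
    # project the profiles onto the leading right singular vectors.
    _, _, Vt = np.linalg.svd(A @ P, full_matrices=False)
    B = Vt[:rank].T  # variables x rank, orthonormal columns
    return P @ B @ B.T

# Toy data: 4 objects, 2 overlapping clusters, 3 variables.
A = np.array([[1, 0], [0, 1], [1, 1], [0, 0]], dtype=float)
P = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
X = reconstruct(A, P)
print(X[2])  # object 3 belongs to both clusters: [5. 7. 9.]
```

In a full fitting procedure, a profile update of this kind would alternate with an update of the memberships in `A` until the least-squares loss stops decreasing; that membership step is omitted here.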

References

  • ANDERSON, T. (1951), “Estimating Linear Restrictions on Regression Coefficients for Multivariate Normal Distributions,” The Annals of Mathematical Statistics 22, 327–351.

  • ARABIE, P., and HUBERT, L. (1994), “Cluster Analysis in Marketing Research,” in Handbook of Marketing Research, ed. R. Bagozzi, Oxford: Blackwell, pp. 160–189.

  • BERKOWITZ, L. (1989), “Frustration-aggression Hypothesis: Examination and Reformulation,” Psychological Bulletin 106, 59–73.

  • BOCK, H.-H. (1987), “On the Interface Between Cluster Analysis, Principal Component Analysis and Multidimensional Scaling,” in Multivariate Statistical Modeling and Data Analysis: Proceedings of the Advanced Symposium on Multivariate Modeling and Data Analysis May 15–16, 1986, eds. H. Bozdogan and A. Gupta, Dordrecht, The Netherlands: Reidel Publishing Company, pp. 17–34.

  • CARROLL, J. D., and CHATURVEDI, A. (1995), “A General Approach to Clustering and Multidimensional Scaling of Two-way, Three-way or Higher-way Data,” in Geometric Representations of Perceptual Phenomena: Papers in honor of Tarow Indow on his 70th birthday, eds. D. R. Luce, M. D’Zmura, D. Hoffman, G. J. Iverson, and K. A. Romney, Mahwah, New Jersey: Lawrence Erlbaum Associates, pp. 295–318.

  • CATTELL, R. B. (1966), “The Meaning and Strategic Use of Factor Analysis,” in Handbook of Multivariate Experimental Psychology, ed. R. B. Cattell, Chicago: Rand McNally, pp. 174–243.

  • CEULEMANS, E., and KIERS, H. A. L. (2006), “Selecting Among Three-mode Principal Component Models of Different Types and Complexities: A Numerical Convex Hull Based Method,” British Journal of Mathematical and Statistical Psychology 59, 133–150.

  • CEULEMANS, E., TIMMERMAN, M. E., and KIERS, H. A. L. (2011), “The CHull Procedure for Selecting Among Multilevel Component Solutions,” Chemometrics and Intelligent Laboratory Systems 106, 12–20.

  • CEULEMANS, E., and VAN MECHELEN, I. (2005), “Hierarchical Classes Models for Three-way Three-mode Binary Data: Interrelations and Model Selection,” Psychometrika 70, 461–480.

  • CEULEMANS, E., and VAN MECHELEN, I. (2004), “Tucker2 Hierarchical Classes Analysis,” Psychometrika 69, 375–399.

  • CEULEMANS, E., VAN MECHELEN, I., and LEENEN, I. (2007), “The Local Minima Problem in Hierarchical Classes Analysis: An Evaluation of a Simulated Annealing Algorithm and Various Multistart Procedures,” Psychometrika 72, 377–391.

  • CEULEMANS, E., VAN MECHELEN, I., and LEENEN, I. (2003), “Tucker3 Hierarchical Classes Analysis,” Psychometrika 68, 413–433.

  • CHANG, W.-C. (1983), “On Using Principal Components Before Separating a Mixture of Two Multivariate Normal Distributions,” Applied Statistics 32, 267–275.

  • CHATURVEDI, A., and CARROLL, J. D. (1994), “An Alternating Combinatorial Optimization Approach to Fitting the INDCLUS and Generalized INDCLUS Models,” Journal of Classification 11, 155–170.

  • COHEN, J., and COHEN, P. (1983), Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences (2nd ed.), Hillsdale, NJ: Erlbaum.

  • DEPRIL, D., VAN MECHELEN, I., and MIRKIN, B. G. (2008), “Algorithms for Additive Clustering of Rectangular Data Tables,” Computational Statistics and Data Analysis 52, 4923–4938.

  • DE SOETE, G., and CARROLL, J. D. (1994), “K-means Clustering in a Low-dimensional Euclidean Space,” in New Approaches in Classification and Data Analysis, eds. E. Diday, Y. Lechevallier, M. Schader, P. Bertrand, and B. Burtschy, Berlin, Germany: Springer-Verlag, pp. 212–219.

  • EVERITT, B. (1977), “Cluster Analysis,” in The Analysis of Survey Data, Vol. 1: Exploring Data Structures, eds. C. A. O’Muircheartaigh and C. Payne, London: Wiley, pp. 63–88.

  • HUBERT, L. J., ARABIE, P., and HESSON-MCINNES, M. (1992), “Multidimensional Scaling in the City-block Metric - A Combinatorial Approach,” Journal of Classification 9, 211–236.

  • KRZANOWSKI, W. (1979), “Between-groups Comparison of Principal Components,” Journal of the American Statistical Association 74, 703–707.

  • KUPPENS, P., and VAN MECHELEN, I. (2007), “Determinants of the Anger Appraisals of Threatened Self-esteem, Other-blame, and Frustration,” Cognition and Emotion 21, 56–77.

  • KUPPENS, P., VAN MECHELEN, I., and SMITS, D. J. M. (2003), “The Appraisal Basis of Anger: Specificity, Necessity and Sufficiency of Components,” Emotion 3, 254–269.

  • LEE, D. D., and SEUNG, S. H. (2001), “Algorithms for Non-negative Matrix Factorization,” Advances in Neural Information Processing Systems 13, 556–562.

  • LEE, D. D., and SEUNG, S. H. (1999), “Learning the Parts of Objects by Non-negative Matrix Factorization,” Nature 401, 788–791.

  • MIRKIN, B. G. (1987), “Method of Principal Cluster Analysis,” Automation and Remote Control 48, 1379–1386.

  • ROCCI, R., and VICHI, M. (2005), “Three-mode Component Analysis with Crisp or Fuzzy Partition of Units,” Psychometrika 70, 715–736.

  • SCHEPERS, J., CEULEMANS, E., and VAN MECHELEN, I. (2008), “Selecting Among Multi-mode Partitioning Models of Different Complexities: A Comparison of Four Model Selection Criteria,” Journal of Classification 25, 67–85.

  • SHEPARD, R. N., and ARABIE, P. (1979), “Additive Clustering Representations of Similarities as Combinations of Discrete Overlapping Properties,” Psychological Review 86, 87–123.

  • SPIELBERGER, C. D., JOHNSON, E. H., RUSSELL, S. F., CRANE, J. C., JACOBS, G. A., and WORDEN, T. J. (1985), “The Experience and Expression of Anger: Construction and Validation of an Anger Expression Scale,” in Anger and Hostility in Cardiovascular and Behavioral Disorders, eds. M. A. Chesney and R. H. Rosenman, New York: Hemisphere, pp. 5–30.

  • STEINLEY, D. (2003), “Local Optima in K-means Clustering: What You Don’t Know May Hurt You,” Psychological Methods 8, 294–304.

  • STEINLEY, D., and BRUSCO, M. J. (2007), “Initializing K-means Batch Clustering: A Critical Evaluation of Several Techniques,” Journal of Classification 24, 99–121.

  • STOICA, P., and VIBERG, M. (1996), “Maximum Likelihood Parameter and Rank Estimation in Reduced-Rank Multivariate Linear Regressions,” IEEE Transactions on Signal Processing 44, 3069–3078.

  • TRYON, R. C., and BAILEY, D. E. (1970), Cluster Analysis, New York: McGraw-Hill.

  • VICHI, M., and KIERS, H. A. L. (2001), “Factorial K-means Analysis for Two-Way Data,” Computational Statistics and Data Analysis 37, 49–64.

  • VICHI, M., ROCCI, R., and KIERS, H. A. L. (2007), “Simultaneous Component and Clustering Models for Three-Way Data: Within and Between Approaches,” Journal of Classification 24, 71–98.

  • WILDERJANS, T. F., CEULEMANS, E., and KUPPENS, P. (2012), “Clusterwise HICLAS: A Generic Modeling Strategy to Trace Similarities and Differences in Multi-Block Binary Data,” Behavior Research Methods 44, 532–545.

  • WILDERJANS, T. F., CEULEMANS, E., and MEERS, K. (in press), “CHull: A Generic Convex Hull Based Model Selection Method,” Behavior Research Methods.

  • WILDERJANS, T. F., CEULEMANS, E., and VAN MECHELEN, I. (in press), “The SIMCLAS Model: Simultaneous Analysis of Coupled Binary Data Matrices with Noise Heterogeneity Between and Within Data Blocks,” Psychometrika.

  • WILDERJANS, T. F., CEULEMANS, E., VAN MECHELEN, I., and DEPRIL, D. (2011), “ADPROCLUS: A Graphical User Interface for Fitting Additive Profile Clustering Models to Object by Variable Data Matrices,” Behavior Research Methods 43, 56–65.

Author information

Corresponding author

Correspondence to Tom F. Wilderjans.

Additional information

The research in this paper was partially supported by the Research Fund of KU Leuven (PDM-kort project 3 H100377, dr. Tom F. Wilderjans; GOA 2005/04, Prof. dr. Iven Van Mechelen), by the Belgian Science Policy (IAP P6/03, Prof. dr. Iven Van Mechelen), and by the Fund of Scientific Research (FWO)-Flanders (project G.0546.09, Prof. dr. I. Van Mechelen). The simulation study was conducted using high performance computational resources provided by the KU Leuven (http://ludit.kuleuven.be/hpc). Requests for reprints should be sent to Tom F. Wilderjans. The authors are obliged to Prof. dr. Peter Kuppens for kindly providing the data of Section 5 and to the anonymous reviewers for most helpful remarks on previous versions of this paper.

About this article

Cite this article

Depril, D., Van Mechelen, I. & Wilderjans, T.F. Lowdimensional Additive Overlapping Clustering. J Classif 29, 297–320 (2012). https://doi.org/10.1007/s00357-012-9112-5

  • DOI: https://doi.org/10.1007/s00357-012-9112-5
