
Simultaneous Method of Orthogonal Non-metric Non-negative Matrix Factorization and Constrained Non-hierarchical Clustering

Journal of Classification

Abstract

For multivariate categorical data, it is important to detect both a clustering structure and a low-dimensional space in which the clusters are discriminated, because the estimated low dimensions make the features of the clusters easy to interpret. Existing methods for dimensional reduction clustering are certainly useful for this purpose; however, interpretation sometimes becomes complicated because of the signs of the estimated parameters. We therefore propose a new dimensional reduction clustering method with non-negativity constraints on all parameters. The proposed method has several advantages. First, the features of the clusters are easier to interpret because the effects of sign need not be considered. In addition, the non-negativity and orthogonality constraints together give the estimated components a perfect simple structure, which yields an interpretable description. Second, we showed through simulations that the clustering results are not inferior to those of existing methods, even though the constraints of the proposed method are strong.
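The link between the two constraints and perfect simple structure can be illustrated with a small numerical check. If two non-negative columns are orthogonal, every term in their inner product is non-negative and the sum is zero, so no row can have non-zero entries in both columns. The sketch below only demonstrates this general fact; the matrices `A` and `B` and the helper `has_perfect_simple_structure` are hypothetical illustrations, not the authors' estimation algorithm, which is not shown in this preview.

```python
import numpy as np

def has_perfect_simple_structure(M, tol=1e-12):
    """Return True if every row of M has at most one entry with magnitude above tol."""
    nonzero_per_row = (np.abs(M) > tol).sum(axis=1)
    return bool(np.all(nonzero_per_row <= 1))

# Hypothetical 6x2 component matrix: non-negative with orthonormal columns.
A = np.array([
    [0.6, 0.0],
    [0.8, 0.0],
    [0.0, 0.0],
    [0.0, 1.0 / np.sqrt(2)],
    [0.0, 1.0 / np.sqrt(2)],
    [0.0, 0.0],
])

a_nonnegative = bool(np.all(A >= 0))
a_orthonormal = bool(np.allclose(A.T @ A, np.eye(2)))
a_simple = has_perfect_simple_structure(A)

# Hypothetical counterexample: orthonormal columns but mixed signs.
# Orthogonality alone does not force one non-zero entry per row.
B = np.array([[1.0, 1.0],
              [1.0, -1.0]]) / np.sqrt(2)

b_orthonormal = bool(np.allclose(B.T @ B, np.eye(2)))
b_simple = has_perfect_simple_structure(B)

print(a_nonnegative, a_orthonormal, a_simple)
print(b_orthonormal, b_simple)
```

In `A`, each row loads on at most one component, so each variable category can be read against a single dimension without weighing signs. `B` satisfies orthogonality but not non-negativity, and both of its rows load on both components.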



Acknowledgments

We thank the editor and the reviewers for their useful comments. This work was supported by JSPS KAKENHI Grant Number JP40782818.

Author information

Corresponding author

Correspondence to Kensuke Tanioka.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Tanioka, K., Yadohisa, H. Simultaneous Method of Orthogonal Non-metric Non-negative Matrix Factorization and Constrained Non-hierarchical Clustering. J Classif 36, 73–93 (2019). https://doi.org/10.1007/s00357-018-9284-8

  • DOI: https://doi.org/10.1007/s00357-018-9284-8
