A unified representation of simultaneous analysis methods of reduction and clustering

Mitsuhiro, Masaki; Yadohisa, Hiroshi

doi:10.1007/s42081-018-0022-6

Published: 15 October 2018

Volume 1, pages 393–412, (2018)

Japanese Journal of Statistics and Data Science

Abstract

In data analytics, objects and variables are often displayed in a reduced space that is easy to interpret, so that features can be extracted from multivariate data. When clustering is used in these displays, the cluster structure of the original data is often obtained with an approach called “simultaneous analysis,” which combines two multivariate analysis methods. Stratifying objects can also extract features of the variables that we would like to interpret. Simultaneous analysis methods for these tasks estimate the unknown parameters of the two methods simultaneously and can find a low-dimensional subspace that reflects the cluster structure. However, although these methods have much in common, the method must be changed depending on the aim of the analysis and the data type, which makes them inconvenient in practice. To address this shortcoming, we propose a simultaneous analysis framework composed of several possible reduction methods integrated with clustering methods. The unified framework is applicable to numerical, categorical, and mixed data. Using this framework, we can display objects and variables in a low-dimensional subspace that reflects the cluster structure. Moreover, we discuss the framework’s extensions and how it relates to several previously proposed simultaneous analysis methods.
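
To make the idea of estimating reduction and clustering parameters simultaneously more concrete, the following is a minimal sketch of reduced k-means, one existing simultaneous method for numerical data. It is not the unified framework proposed in the article, whose formulation is not given in this abstract. The function name `reduced_kmeans`, its parameters, the random initialization, and the handling of empty clusters are all assumptions made for illustration. The sketch alternates between updating an orthonormal loading matrix `A` and reassigning objects to the nearest centroid in the reduced space.

```python
import numpy as np

def reduced_kmeans(X, n_clusters, n_dims, n_iter=100, seed=0):
    """Alternating least squares sketch of reduced k-means (illustrative only).

    Minimizes ||X - U F A^T||_F^2 over a hard membership matrix U (n x k),
    reduced-space centroids F (k x q), and column-orthonormal loadings A (p x q).
    """
    rng = np.random.default_rng(seed)
    X = X - X.mean(axis=0)                      # column-center the data
    n, p = X.shape
    labels = rng.integers(n_clusters, size=n)   # random initial partition
    prev_loss = np.inf
    for _ in range(n_iter):
        # Cluster means in the original space. An empty cluster is reseeded
        # from a random object (an ad hoc choice for this sketch).
        means = np.empty((n_clusters, p))
        for k in range(n_clusters):
            members = X[labels == k]
            means[k] = members.mean(axis=0) if len(members) else X[rng.integers(n)]
        # Reduction step: A holds the leading eigenvectors of the
        # cross-product matrix of the cluster-mean-fitted data.
        fitted = means[labels]                  # n x p matrix of cluster means
        _, eigvecs = np.linalg.eigh(fitted.T @ fitted)
        A = eigvecs[:, ::-1][:, :n_dims]        # eigh sorts ascending, so reverse
        F = means @ A                           # centroids in the subspace
        # Clustering step: nearest centroid in the reduced space.
        scores = X @ A
        d2 = ((scores[:, None, :] - F[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        loss = np.linalg.norm(X - F[labels] @ A.T) ** 2
        if prev_loss - loss < 1e-10:            # stop when the loss stalls
            break
        prev_loss = loss
    return labels, A, F, loss

# Usage on synthetic data: three groups separated in two of five variables.
demo_rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [8.0, 0.0], [0.0, 8.0]])
signal = np.vstack([c + demo_rng.normal(size=(30, 2)) for c in centers])
X_demo = np.hstack([signal, demo_rng.normal(size=(90, 3))])
labels, A, F, loss = reduced_kmeans(X_demo, n_clusters=3, n_dims=2)
```

Because the result depends on the random initial partition, a practical run would typically repeat the procedure from several starts and keep the solution with the smallest loss.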

Acknowledgements

The authors would like to thank the reviewers and editors for their helpful comments on this manuscript.

Author information

Authors and Affiliations

Nikkei Research Inc., 2-2-1 Uchikanda, Chiyoda-ku, Tokyo, 101-0047, Japan
Masaki Mitsuhiro
Department of Culture and Information Science, Doshisha University, 1-3 Tatara Miyakodani, Kyotanabe, Kyoto, 610-0394, Japan
Hiroshi Yadohisa

Corresponding author

Correspondence to Masaki Mitsuhiro.

About this article

Cite this article

Mitsuhiro, M., Yadohisa, H. A unified representation of simultaneous analysis methods of reduction and clustering. Jpn J Stat Data Sci 1, 393–412 (2018). https://doi.org/10.1007/s42081-018-0022-6

Received: 28 February 2018
Accepted: 26 September 2018
Published: 15 October 2018
Issue Date: December 2018
DOI: https://doi.org/10.1007/s42081-018-0022-6
