Abstract
In this paper, we propose a method for extracting nonlinear structure from multi-dimensional data. Dimension reduction methods such as principal component analysis (PCA) and projection pursuit (PP) (Friedman in J Am Stat Assoc 82(397):249–266, 1987) search for projection directions that maximize an index: the variance in PCA, or a projection index in PP. Various measures of dependence, including the maximal information coefficient, MIC (Reshef et al. in Science 334(6062):1518–1524, 2011), and the total information coefficient, TIC (Reshef et al. in J Mach Learn Res 17(211):1–63, 2016), have been proposed to evaluate the strength of linear or nonlinear relationships between two variables. We adopt these measures in place of the index used in dimension reduction and use them to extract nonlinear structures. We confirm the performance of the method through numerical examples.
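The idea in the abstract, replacing the index of a projection search with a measure of dependence, can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the sample distance correlation of Székely et al. (2007) as a stand-in for MIC or TIC, scores a pair of orthonormal projection directions by the dependence between the two projected variables, and uses a plain random search rather than a constrained optimizer. The function names `distance_correlation` and `search_directions` and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np


def distance_correlation(x, y):
    """Sample distance correlation between two 1-D arrays (stand-in measure)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise absolute distances within each variable.
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])
    # Double-centre each distance matrix.
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    dcov2 = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    if denom == 0:
        return 0.0  # one variable is constant
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def search_directions(X, measure, n_trials=200, seed=0):
    """Random search for two orthonormal directions whose projections
    have the largest value of `measure` (an assumption of this sketch)."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    best_score, best_pair = -np.inf, None
    for _ in range(n_trials):
        # Reduced QR of a Gaussian matrix gives a random orthonormal pair.
        Q, _ = np.linalg.qr(rng.standard_normal((p, 2)))
        score = measure(X @ Q[:, 0], X @ Q[:, 1])
        if score > best_score:
            best_score, best_pair = score, Q
    return best_score, best_pair


# Toy data: a parabola hidden in the first two of three coordinates.
rng = np.random.default_rng(1)
t = rng.uniform(-1.0, 1.0, size=200)
X = np.column_stack([t, t**2 + 0.05 * rng.standard_normal(200),
                     rng.standard_normal(200)])
X = X - X.mean(axis=0)

score, directions = search_directions(X, distance_correlation)
print("best dependence score:", round(score, 3))
print("directions (columns):")
print(directions)
```

Any other dependence measure with the same two-argument signature could be passed as `measure`. A random search is used here only to keep the sketch short; it does not scale well with the dimension of the data.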
References
Clark, M. (2013). A comparison of correlation measures. Retrieved May 7, 2022, from http://www.stats.ox.ac.uk/~cucuring/Lecture_2_Correlations_Dependence.pdf
Friedman, J. H. (1987). Exploratory projection pursuit. Journal of the American Statistical Association, 82(397), 249–266.
Gretton, A., Bousquet, O., Smola, A., & Schölkopf, B. (2005). Measuring statistical dependence with Hilbert–Schmidt norms. Algorithmic learning theory (pp. 63–77). Springer.
Hestenes, M. R. (1969). Multiplier and gradient methods. Journal of Optimization Theory and Applications, 4(5), 303–320.
Hoeffding, W. (1948). A non-parametric test of independence. The Annals of Mathematical Statistics, 19(4), 546–557.
Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical Review E, 69, 066138.
Lopez-Paz, D., Hennig, P., & Schölkopf, B. (2013). The randomized dependence coefficient. Advances in Neural Information Processing Systems, 26, 1–9.
Powell, M. J. D. (1969). A method for nonlinear constraints in minimization problems. In R. Fletcher (Ed.), Optimization (pp. 283–298). Academic Press.
Reshef, D. N., Reshef, Y. A., Finucane, H. K., Grossman, S. R., McVean, G., Turnbaugh, P. J., et al. (2011). Detecting novel associations in large data sets. Science, 334(6062), 1518–1524.
Reshef, D. N., Reshef, Y. A., Sabeti, P. C., & Mitzenmacher, M. (2018). An empirical study of leading measures of dependence. The Annals of Applied Statistics, 12(1), 123–155.
Reshef, Y. A., Reshef, D. N., Finucane, H. K., Sabeti, P. C., & Mitzenmacher, M. (2016). Measuring dependence powerfully and equitably. Journal of Machine Learning Research, 17(211), 1–63.
Ross, B. C. (2014). Mutual information between discrete and continuous data sets. PLoS One, 9(2), e0087357.
Speed, T. (2011). A correlation for the 21st century. Science, 334(6062), 1502–1503.
Suzana, S. S., Daniel, Y. T., Asuka, N., & Andre, F. (2014). A comparative study of statistical methods used to identify dependencies between gene expression signals. Briefings in Bioinformatics, 15(6), 906–918.
Székely, G. J., Rizzo, M. L., & Bakirov, N. K. (2007). Measuring and testing dependence by correlation of distances. The Annals of Statistics, 35(6), 2769–2794.
Funding
This work was supported by the Japan Society for the Promotion of Science KAKENHI [Grant number 18H03207].
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ishimoto, S., Minami, H. & Mizuta, M. A method for extracting nonlinear structure based on measures of dependence. Jpn J Stat Data Sci 5, 661–674 (2022). https://doi.org/10.1007/s42081-022-00177-9
DOI: https://doi.org/10.1007/s42081-022-00177-9