A method for extracting nonlinear structure based on measures of dependence

  • Original Paper
  • Japanese Journal of Statistics and Data Science

Abstract

In this paper, we propose a method for extracting nonlinear structure from multi-dimensional data. Dimension reduction methods such as principal component analysis (PCA) and projection pursuit (PP) (Friedman in J Am Stat Assoc 82(397):249–266, 1987) search for projection directions that maximize an index: the variance in PCA, or a projection index in PP. Various measures of dependence, including MIC (Reshef et al. in Science 334(6062):1518–1524, 2011) and TIC (Reshef et al. in J Mach Learn Res 17(211):1–63, 2016), have been proposed to evaluate the strength of linear or nonlinear relationships between two variables. We adopt these measures in place of the usual indices in dimension reduction and use them to extract nonlinear structures. We confirm the performance of the method through numerical examples.
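The idea of using a dependence measure as the index to be maximized can be illustrated with a minimal sketch. It is not the authors' implementation: the preview does not show their exact index or optimizer, so the sketch assumes distance correlation (Székely et al., 2007) as a stand-in for MIC or TIC, and a plain random search over pairs of orthonormal directions as a stand-in for the optimizer. The function names `distance_correlation`, `random_orthonormal_pair` and `search_projection` are hypothetical.

```python
import numpy as np


def distance_correlation(x, y):
    """Sample distance correlation between two 1-D arrays (stand-in index)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise absolute distances within each variable.
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])
    # Double centering: subtract row and column means, add the grand mean.
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    dcov2 = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    if denom == 0.0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def random_orthonormal_pair(p, rng):
    """Draw two orthonormal directions in R^p via a QR decomposition."""
    q, _ = np.linalg.qr(rng.standard_normal((p, 2)))
    return q  # shape (p, 2), orthonormal columns


def search_projection(X, index=distance_correlation, n_trials=200, seed=0):
    """Random search (an assumption, not the paper's optimizer) for a
    2-D projection whose two coordinates maximize the dependence index."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)  # center the data
    best_score, best_dirs = -np.inf, None
    for _ in range(n_trials):
        W = random_orthonormal_pair(X.shape[1], rng)
        Z = X @ W  # projected data, shape (n, 2)
        score = index(Z[:, 0], Z[:, 1])
        if score > best_score:
            best_score, best_dirs = score, W
    return best_dirs, best_score


if __name__ == "__main__":
    # Synthetic data: a parabola hidden in the first two of four coordinates.
    rng = np.random.default_rng(1)
    t = rng.uniform(-1.0, 1.0, 150)
    X = np.column_stack([
        t,
        t**2 + 0.05 * rng.standard_normal(150),
        rng.standard_normal(150),
        rng.standard_normal(150),
    ])
    W, score = search_projection(X, n_trials=100)
    print("best index value:", round(score, 3))
    print("directions (columns):")
    print(np.round(W, 2))
```

Any other two-variable dependence measure can be passed through the `index` argument. A random search is used here only because it keeps the sketch short; it scales poorly with dimension, and a gradient-free or constrained optimizer would be the natural replacement.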

References

  • Clark, M. (2013). A comparison of correlation measures. Retrieved May 7, 2022, from http://www.stats.ox.ac.uk/~cucuring/Lecture_2_Correlations_Dependence.pdf

  • Friedman, J. H. (1987). Exploratory projection pursuit. Journal of the American Statistical Association, 82(397), 249–266.

  • Gretton, A., Bousquet, O., Smola, A., & Schölkopf, B. (2005). Measuring statistical dependence with Hilbert–Schmidt norms. Algorithmic learning theory (pp. 63–77). Springer.

  • Hestenes, M. R. (1969). Multiplier and gradient methods. Journal of Optimization Theory and Applications, 4(5), 303–320.

  • Hoeffding, W. (1948). A non-parametric test of independence. The Annals of Mathematical Statistics, 19(4), 546–557.

  • Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical Review E, 69, 066138.

  • Lopez-Paz, D., Hennig, P., & Schölkopf, B. (2013). The randomized dependence coefficient. Advances in Neural Information Processing Systems, 26, 1–9.

  • Powell, M. J. D. (1969). A method for nonlinear constraints in minimization problems. In R. Fletcher (Ed.), Optimization (pp. 283–298). Academic Press.

  • Reshef, D. N., Reshef, Y. A., Finucane, H. K., Grossman, S. R., McVean, G., Turnbaugh, P. J., et al. (2011). Detecting novel associations in large data sets. Science, 334(6062), 1518–1524.

  • Reshef, D. N., Reshef, Y. A., Sabeti, P. C., & Mitzenmacher, M. (2018). An empirical study of leading measures of dependence. The Annals of Applied Statistics, 12(1), 123–155.

  • Reshef, Y. A., Reshef, D. N., Finucane, H. K., Sabeti, P. C., & Mitzenmacher, M. (2016). Measuring dependence powerfully and equitably. Journal of Machine Learning Research, 17(211), 1–63.

  • Ross, B. C. (2014). Mutual information between discrete and continuous data sets. PLoS One, 9(2), e0087357.

  • Speed, T. (2011). A correlation for the 21st century. Science, 334(6062), 1502–1503.

  • de Siqueira Santos, S., Takahashi, D. Y., Nakata, A., & Fujita, A. (2014). A comparative study of statistical methods used to identify dependencies between gene expression signals. Briefings in Bioinformatics, 15(6), 906–918.

  • Székely, G. J., Rizzo, M. L., & Bakirov, N. K. (2007). Measuring and testing dependence by correlation of distances. The Annals of Statistics, 35(6), 2769–2794.

Funding

This work was supported by the Japan Society for the Promotion of Science KAKENHI [Grant number 18H03207].

Author information

Corresponding author

Correspondence to Shoma Ishimoto.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ishimoto, S., Minami, H. & Mizuta, M. A method for extracting nonlinear structure based on measures of dependence. Jpn J Stat Data Sci 5, 661–674 (2022). https://doi.org/10.1007/s42081-022-00177-9

  • DOI: https://doi.org/10.1007/s42081-022-00177-9
