
Part of the book series: Undergraduate Topics in Computer Science ((UTICS))

Abstract

After a short introduction to the general concept of a decision rule relating input and target features, this chapter describes some generic and most popular methods for learning correlations over two or more features. Four of them pertain to quantitative targets (linear regression, canonical correlation, neural networks, and regression trees), and seven to categorical ones (linear discrimination, support vector machines, the naïve Bayes classifier, classification trees, contingency tables, distances between partition and ranking relations, and correspondence analysis). Of these, classification trees are treated in the most detail, including a number of theoretical results that are not well known. These establish firm relations between popular scoring functions and bivariate measures: Quetelet indexes in contingency tables and, rather unexpectedly, normalization options for dummy variables representing target categories. Some related concepts, such as Bayesian decision rules, the bag-of-words model in text analysis, the VC dimension, and kernels for non-linear classification, are introduced as well. The chapter outlines several important characteristics of summarization and correlation between two features, and displays some of their properties. They are:

  • linear regression and correlation coefficient for two quantitative variables (Sect. 3.2);

  • tabular regression and correlation ratio for the mixed scale case (Sect. 3.8.3); and

  • contingency table, Quetelet index, statistical independence, and Pearson’s chi-squared for two nominal variables; the latter is treated as a summary correlation measure, in contrast to the conventional view of it as just a criterion of statistical independence (Sect. 3.6.1); moreover, a few less known least-squares based concepts are outlined, including canonical correlation and correspondence analysis.
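The measures in the list above can be sketched in a few lines of Python. This is an illustrative sketch, not code from the chapter, and it assumes only NumPy; the final assertion checks one of the "firm relations" mentioned in the abstract, namely that Pearson's chi-squared divided by N equals the p-weighted average of the Quetelet indexes.

```python
# A minimal sketch, assuming NumPy, of three measures named above:
# the correlation coefficient, the Quetelet index, and Pearson's chi-squared.
import numpy as np

def pearson_r(x, y):
    """Correlation coefficient for two quantitative variables (Sect. 3.2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()
    return (dx * dy).sum() / np.sqrt((dx * dx).sum() * (dy * dy).sum())

def quetelet_index(counts):
    """Quetelet indexes q_kl = p_kl / (p_k+ * p_+l) - 1 for a contingency
    table: the relative change in probability of column category l
    brought about by row category k."""
    p = counts / counts.sum()
    return p / np.outer(p.sum(axis=1), p.sum(axis=0)) - 1.0

def chi_squared(counts):
    """Pearson's chi-squared, read as a summary correlation measure."""
    n = counts.sum()
    p = counts / n
    expected = np.outer(p.sum(axis=1), p.sum(axis=0))
    return n * ((p - expected) ** 2 / expected).sum()

# Toy data: two quantitative variables and a 2x2 contingency table.
x, y = [1, 2, 3, 4, 5], [2, 4, 5, 4, 5]
table = np.array([[30.0, 10.0], [10.0, 50.0]])

# Chi-squared over N coincides with the average Quetelet index,
# sum over cells of p_kl * q_kl.
p = table / table.sum()
assert abs(chi_squared(table) / table.sum()
           - (p * quetelet_index(table)).sum()) < 1e-12
```

The identity checked by the assertion holds for any contingency table, since both sides reduce to the sum of p_kl^2 / (p_k+ p_+l) minus one.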


Notes

  1. Both Francis Galton and Charles Darwin were grandsons of a celebrated medical doctor and philosopher, Erasmus Darwin.


Author information

Correspondence to Boris Mirkin.


Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Mirkin, B. (2019). Learning Correlations. In: Core Data Analysis: Summarization, Correlation, and Visualization. Undergraduate Topics in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-030-00271-8_3


  • DOI: https://doi.org/10.1007/978-3-030-00271-8_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00270-1

  • Online ISBN: 978-3-030-00271-8

  • eBook Packages: Computer Science (R0)
