Abstract
After a short introduction of the general concept of a decision rule relating input and target features, this chapter describes some of the most popular generic methods for learning correlations over two or more features. Four of them pertain to quantitative targets (linear regression, canonical correlation, neural networks, and regression trees), and seven to categorical ones (linear discrimination, support vector machines, the naïve Bayes classifier, classification trees, contingency tables, distances between partition and ranking relations, and correspondence analysis). Of these, classification trees are treated in the most detail, including a number of theoretical results that are not well known. These establish firm relations between popular scoring functions and bivariate measures: Quetelet indexes in contingency tables and, rather unexpectedly, normalization options for dummy variables representing target categories. Some related concepts, such as Bayesian decision rules, the bag-of-words model in text analysis, the VC dimension, and kernels for non-linear classification, are introduced as well. The chapter outlines several important characteristics of summarization and correlation between two features and displays some of their properties. They are:
- linear regression and the correlation coefficient for two quantitative variables (Sect. 3.2);
- tabular regression and the correlation ratio for the mixed-scale case (Sect. 3.8.3); and
- the contingency table, Quetelet index, statistical independence, and Pearson's chi-squared for two nominal variables; the latter is treated as a summary correlation measure, in contrast to the conventional view of it as just a criterion of statistical independence (Sect. 3.6.1); moreover, a few lesser-known least-squares based concepts are outlined, including canonical correlation and correspondence analysis.
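The view of chi-squared as a summary correlation measure can be illustrated in a few lines of plain Python. The sketch below, using a made-up 2×2 table of counts, checks the identity underlying that view: Pearson's chi-squared equals N times the sum, over all cells, of p(i,j)·q(i,j), where q(i,j) = p(i,j)/(p(i+)p(+j)) − 1 is the Quetelet index.

```python
# Illustrative contingency table of counts (made-up data)
counts = [
    [30, 10],
    [20, 40],
]
N = sum(sum(row) for row in counts)
row_tot = [sum(row) for row in counts]
col_tot = [sum(col) for col in zip(*counts)]

# Joint and marginal proportions
p = [[counts[i][j] / N for j in range(len(col_tot))] for i in range(len(row_tot))]
pr = [t / N for t in row_tot]
pc = [t / N for t in col_tot]

# Quetelet index q(i,j): relative change of the probability of column j
# when row i is known, against the marginal probability of column j
q = [[p[i][j] / (pr[i] * pc[j]) - 1 for j in range(len(pc))] for i in range(len(pr))]

# Conventional Pearson chi-squared: sum of (observed - expected)^2 / expected
chi2 = sum((counts[i][j] - N * pr[i] * pc[j]) ** 2 / (N * pr[i] * pc[j])
           for i in range(len(pr)) for j in range(len(pc)))

# Chi-squared as a summary correlation measure: N * sum of p(i,j) * q(i,j)
chi2_quetelet = N * sum(p[i][j] * q[i][j]
                        for i in range(len(pr)) for j in range(len(pc)))

assert abs(chi2 - chi2_quetelet) < 1e-9
```

The two quantities coincide because summing p(i,j)·q(i,j) over all cells reduces to the sum of p(i,j)²/(p(i+)p(+j)) minus one, which is exactly chi-squared divided by N.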
Notes
- 1. Both Francis Galton and Charles Darwin were grandsons of a celebrated medical doctor and philosopher, Erasmus Darwin.
References
J.-P. Benzécri, Correspondence Analysis Handbook (CRC Press, 1992). ISBN-10: 0824784375
M. Berthold, D. Hand, Intelligent Data Analysis (Springer, Berlin, 2003)
L. Breiman, J.H. Friedman, R.A. Olshen, C.J. Stone, Classification and Regression Trees (Wadsworth, Belmont, CA, 1984)
A.C. Davison, D.V. Hinkley, Bootstrap Methods and Their Application, 7th edn. (Cambridge University Press, Cambridge, 2005)
H.B. Demuth, M.H. Beale, O. De Jesús, M.T. Hagan, Neural Network Design (Martin Hagan, 2014)
R.O. Duda, P.E. Hart, D.G. Stork, Pattern Classification (Wiley-Interscience, 2001). ISBN 0-471-05669-3
S.B. Green, N.J. Salkind, Using SPSS for Windows and Macintosh: Analyzing and Understanding Data (Prentice Hall, 2003)
M. Greenacre, Correspondence Analysis in Practice (CRC Press, 2017)
P.D. Grünwald, The Minimum Description Length Principle (MIT Press, 2007)
J.F. Hair, W.C. Black, B.J. Babin, R.E. Anderson, Multivariate Data Analysis, 7th edn. (Prentice Hall, 2010). ISBN-10: 0-13-813263-1
J. Han, M. Kamber, J. Pei, Data Mining: Concepts and Techniques, 3rd edn. (Elsevier, 2011). ISBN: 978-9380931913
S.S. Haykin, Neural Networks, 2nd edn. (Prentice Hall, 1999). ISBN: 0132733501
A.Z. Israëls, Eigenvalue Techniques for Qualitative Data (Leiden, DSWO Press, 1987)
J. Kemeny, L. Snell, Mathematical Models in the Social Sciences (Blaisdell, New York, 1962)
M.G. Kendall, A. Stuart, Advanced Statistics: Inference and Relationship, 2nd edn. (Griffin, London, 1967)
L. Lebart, A. Morineau, M. Piron, Statistique Exploratoire Multidimensionelle (Dunod, Paris, 1995). ISBN 2-10-002886-3
C.D. Manning, P. Raghavan, H. Schütze, Introduction to Information Retrieval (Cambridge University Press, Cambridge, 2008)
B. Mirkin, Group Choice (Halsted Press, Washington, DC, 1979)
B. Mirkin, Grouping in Socio-Economic Research (Finansy i Statistika Publishers, Moscow, Russia, 1985)
B. Mirkin, Mathematical Classification and Clustering (Kluwer, AP, Dordrecht, 1996)
B. Mirkin, Clustering: A Data Recovery Approach (Chapman & Hall/CRC, 2012). ISBN 978-1-4398-3841-9
F. Murtagh, Correspondence Analysis and Data Coding with Java and R (Chapman & Hall/CRC, Boca Raton, FL, 2005)
T.M. Mitchell, Machine Learning (McGraw Hill, 2010)
S. Nishisato, Elements of Dual Scaling: An introduction to practical data analysis (Psychology Press, 2014)
B. Polyak, Introduction to Optimization (Optimization Software, Los Angeles, 1987). ISBN 0911575146
J.R. Quinlan, C4.5: Programs for Machine Learning (Morgan Kaufmann, San Mateo, 1993)
B. Schölkopf, A.J. Smola, Learning with Kernels (The MIT Press, 2005)
V. Vapnik, Estimation of Dependences Based on Empirical Data, 2nd edn. (Springer Science + Business Media Inc., 2006)
A. Webb, Statistical Pattern Recognition (Wiley, 2002). ISBN-0-470-84514-7
Articles
L. Breiman, Random forests. Mach. Learn. 45(1), 5–32 (2001)
J. Bring, How to standardize regression coefficients. Am. Stat. 48(3), 209–213 (1994)
J. Carpenter, J. Bithell, Bootstrap confidence intervals: when, which, what? A practical guide for medical statisticians. Stat. Med. 19, 1141–1163 (2000)
H.E. Daniels, The relation between measures of correlation in the universe of sample permutations. Biometrika, 33(2), 129–135 (1944)
F. Esposito, D. Malerba, G. Semeraro, A comparative analysis of methods for pruning decision trees. IEEE Trans. Pattern Anal. Mach. Intell. 19(5), 476–491 (1997)
T. Fawcett, An introduction to ROC analysis. Pattern Recogn. Lett. 27, 861–873 (2006)
D.H. Fisher, Knowledge acquisition via incremental conceptual clustering. Mach. Learn. 2, 139–173 (1987)
P.J.F. Groenen, G. Nalbantov, J.C. Bioch, SVM-Maj: a majorization approach to linear support vector machines with different hinge errors. Adv. Data Anal. Classif. 2(1), 17–43 (2008)
J.G. Kemeny, Mathematics without numbers. Daedalus 88(4), 577–591 (1959)
L. Lebart, B.G. Mirkin, Correspondence analysis and classification, in Multivariate Analysis: Future Directions, vol. 2, ed. by C. Cuadras, C.R. Rao (North Holland, 1993), pp. 341–357
Y. LeCun, Y. Bengio, G. Hinton, Deep learning. Nature 521(7553), 436 (2015)
W.Y. Loh, Y.S. Shih, Split selection methods for classification trees. Stat. Sin. 815–840 (1997)
R. Lombardo, E.J. Beh, P. Kroonenberg, Modelling trends in ordered correspondence analysis using orthogonal polynomials. Psychometrika 81, 325–349 (2016)
G. Louppe, L. Wehenkel, A. Sutera, P. Geurts, Understanding variable importances in forests of randomized trees. in Advances in Neural Information Processing Systems (NIPS), (2013), pp. 431–439
M. Meilă, Comparing clusterings—an information based distance, J. Multivar. Anal. 98(5), 873–895 (2007)
B. Mirkin, L. Cherny, Some properties of the partition space, in Mathematical Analysis of Economic Models III, ed. by K. Bagrinovsky, E. Berland (Institute of Economics of the Siberian Branch of the USSR Academy of Sciences, Novosibirsk, 1972), pp. 126–147
B. Mirkin, Eleven ways to look at the chi-squared coefficient for contingency tables. Am. Stat. 55(2), 111–120 (2001)
B. Mirkin, T.I. Fenner, Tied rankings, ordered partitions, and weak orders: distance and consensus. J. Classif. 36(2) (2019)
J.N. Morgan, J.A. Sonquist, Problems in the analysis of survey data, and a proposal. J. Am. Stat. Assoc. 58, 415–435 (1963)
I. Morlini, S. Zani, A new class of weighted similarity indices using polytomous variables. J. Classif. 29(2), 199–226 (2012)
K. Pearson, On a criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen in random sampling. Phil. Mag. 50, 157–175 (1900)
J. Schmidhuber, Deep learning in neural networks: an overview. Neural Netw 61, 85–117 (2015)
K. Steele, H.O. Stefánsson, Decision Theory, The Stanford Encyclopedia of Philosophy (Winter 2015 edn.), Edward N. Zalta (Ed.), http://plato.stanford.edu/archives/win2015/entries/decision-theory/
N.G. Waller, J.A. Jones, Correlation weights in multiple regression. Psychometrika 75(1), 58–69 (2010)
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this chapter
Mirkin, B. (2019). Learning Correlations. In: Core Data Analysis: Summarization, Correlation, and Visualization. Undergraduate Topics in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-030-00271-8_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00270-1
Online ISBN: 978-3-030-00271-8