Abstract
One goal of statistical studies is to highlight associations between pairs of variables. This is particularly useful when one wants to get a clear picture of a multi-dimensional data set and motivate a specific policy intervention (Sect. 4.1). Yet the choice of a method is not straightforward. Testing for correlation is the relevant approach to investigate a linear association between two numerical variables (Sect. 4.2). The chi-square test is an inferential test that uses data from a sample to draw conclusions about the relationship between two categorical variables (Sect. 4.3). When one variable is numerical and the other is categorical, the usual approach is to test for differences between means or to implement an analysis of variance (Sect. 4.4). When faced with more than two variables, it is also possible to provide a multidimensional representation of the problem using methods such as principal component analysis (Sect. 4.5) and multiple correspondence analysis (Sect. 4.6). The idea is to reduce the dimensionality of a data set by plotting all the observations on 2D graphs that describe how observations cluster with respect to various characteristics. The resulting groups can, for instance, serve to identify the beneficiaries of a particular intervention. Using R-CRAN, several examples are included in this chapter to illustrate the different methods.
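The first two cases named in the abstract (two numerical variables, two categorical variables) can be sketched in a few lines. The chapter's own examples use R-CRAN; the block below is only a minimal Python illustration using the standard library. The function names `pearson_r` and `chi_square_statistic` and all data values are hypothetical, chosen for illustration and not taken from the chapter.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two numerical variables."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations around the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def chi_square_statistic(table):
    """Chi-square statistic and degrees of freedom for a contingency table.

    `table` is a list of rows of observed counts for two categorical variables.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


if __name__ == "__main__":
    # Invented data, for illustration only
    income = [12.0, 15.5, 19.0, 22.5, 30.0]
    spending = [8.0, 9.5, 13.0, 14.0, 21.0]
    print("r =", pearson_r(income, spending))

    # Rows: treated / not treated; columns: outcome yes / no (invented counts)
    counts = [[10, 20], [20, 10]]
    print("chi-square, dof =", chi_square_statistic(counts))
```

The functions return only the descriptive statistics. Turning them into a test requires comparing them with the relevant reference distribution, which is what R functions such as `cor.test` and `chisq.test` do.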
Copyright information
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Josselin, JM., Le Maux, B. (2017). Measuring and Visualizing Associations. In: Statistical Tools for Program Evaluation. Springer, Cham. https://doi.org/10.1007/978-3-319-52827-4_4
DOI: https://doi.org/10.1007/978-3-319-52827-4_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-52826-7
Online ISBN: 978-3-319-52827-4
eBook Packages: Economics and Finance (R0)