Abstract
Language testing researchers often use statistical models to approximate and study a true model (i.e., the underlying system that is responsible for generating data). Building a model that successfully approximates the true model is not easy and typically involves data-driven model selection. However, available tools for model selection cannot guarantee successful reproduction of the true model. Moreover, model selection has consequences that affect the quality of inferences. The goal of this chapter is to introduce and illustrate some of these issues. In particular, I focus on three: (1) uncertainty due to model selection in statistical inference, (2) successful approximation of data by an incorrect model, and (3) the existence of substantively different models whose statistical counterparts are highly comparable. I conclude with a call for explicitly acknowledging and justifying model selection processes, as laid out in Bachman’s research use argument framework (2006, 2009).
References
Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19, 716–723.
Bachman, L. F. (1982). The trait structure of cloze test scores. TESOL Quarterly, 16, 61–70.
Bachman, L. F. (1990). Fundamental considerations in language testing. Oxford: Oxford University Press.
Bachman, L. F. (2000). Modern language testing at the turn of the century: Assuring that what we count counts. Language Testing, 17, 1–42.
Bachman, L. F. (2006). Generalizability: A journey into the nature of empirical research in applied linguistics. In M. Chalhoub-Deville, C. A. Chapelle, & P. Duff (Eds.), Inference and generalizability in applied linguistics: Multiple perspectives (pp. 165–207). Amsterdam, The Netherlands: John Benjamins.
Bachman, L. F. (2009). Generalizability and research use arguments. In K. Ercikan & W.-M. Roth (Eds.), Generalizing from educational research (pp. 127–148). New York, NY: Taylor & Francis.
Bachman, L. F. (2013). Ongoing challenges in language assessment. In A. J. Kunnan (Ed.), The companion to language assessment. Hoboken, NJ: Wiley-Blackwell.
Bachman, L. F., & Palmer, A. S. (1981). The construct validation of the FSI oral interview. Language Learning, 31, 67–86.
Bachman, L. F., & Palmer, A. S. (1982). The construct validation of some components of communicative proficiency. TESOL Quarterly, 16, 444–465.
Bae, J., & Bachman, L. F. (1998). A latent variable approach to listening and reading: Testing factorial invariance across two groups of children in the Korean/English two-way immersion program. Language Testing, 15, 380–414.
Bae, J., & Bachman, L. F. (2010). An investigation of four writing traits and two tasks across two languages. Language Testing, 27, 213–234.
Bellman, R. E. (1961). Adaptive control processes. Princeton, NJ: Princeton University Press.
Berk, R. A. (2016). Statistical learning from a regression perspective (2nd ed.). New York, NY: Springer.
Berk, R. A., Brown, L., Buja, A., Zhang, K., & Zhao, L. (2013). Valid post-selection inference. The Annals of Statistics, 41, 802–837.
Berk, R. A., Brown, L., & Zhao, L. (2010). Statistical inference after model selection. Journal of Quantitative Criminology, 26, 217–236.
Berk, R. A., & Freedman, D. A. (2003). Statistical assumptions as empirical commitments. In T. G. Blomberg & S. Cohen (Eds.), Law, punishment, and social control: Essays in honor of Sheldon Messinger (pp. 235–254). New York, NY: Aldine de Gruyter.
Box, G. E. P. (1976). Science and statistics. Journal of the American Statistical Association, 71, 791–799.
Breiman, L. (2001a). Statistical modeling: The two cultures. Statistical Science, 16, 199–231.
Breiman, L. (2001b). Random forests. Machine Learning, 45, 5–32.
Brown, L. D. (1967). The conditional level of Student’s t test. The Annals of Mathematical Statistics, 38, 1068–1071.
Buehler, R. J., & Feddersen, A. P. (1963). Note on a conditional property of Student’s t. The Annals of Mathematical Statistics, 34, 1098–1100.
Chatfield, C. (1995). Model uncertainty, data mining and statistical inference. Journal of the Royal Statistical Society, Series A, 158, 419–466.
Cox, D. R., & Snell, E. J. (1974). The choice of variables in observational studies. Journal of the Royal Statistical Society, Series C, 23, 51–59.
Cudeck, R., & Henly, S. J. (1991). Model selection in covariance structures analysis and the “problem” of sample size: A clarification. Psychological Bulletin, 109, 512–519.
Educational Testing Service. (2019). About the TOEFL iBT® test. https://www.ets.org/toefl/ibt/about.
Faraway, J. J. (2016). Does data splitting improve prediction? Statistics and Computing, 26, 40–60.
Fouly, K., Bachman, L. F., & Cziko, G. (1990). The divisibility of language competence: A confirmatory approach. Language Learning, 40, 1–21.
Gelman, A., & Nolan, D. (2002). Teaching statistics: A bag of tricks. Oxford: Oxford University Press.
Kabaila, P. (1998). Valid confidence intervals in regression after variable selection. Econometric Theory, 14, 463–482.
Kadane, J. B., & Lazar, N. A. (2004). Methods and criteria for model selection. Journal of the American Statistical Association, 99, 279–290.
Lee, S., & Hershberger, S. (1990). A simple rule for generating equivalent models in covariance structure modeling. Multivariate Behavioral Research, 25, 313–334.
Lee, T., MacCallum, R. C., & Browne, M. W. (2018). Fungible parameter estimates in structural equation modeling. Psychological Methods, 23, 58–75.
Leeb, H., & Pötscher, B. M. (2005). Model selection and inference: Facts and fiction. Econometric Theory, 21, 21–59.
Leeb, H., & Pötscher, B. M. (2006). Can one estimate the conditional distribution of post-model-selection estimators? The Annals of Statistics, 34, 2554–2591.
Leeb, H., & Pötscher, B. M. (2008). Model selection. In T. G. Andersen, R. A. Davis, J.-P. Kreiss, & T. Mikosch (Eds.), The handbook of financial time series (pp. 785–821). New York, NY: Springer.
MacCallum, R. C., Wegener, D. T., Uchino, B. N., & Fabrigar, L. R. (1993). The problem of equivalent models in applications of covariance structure analysis. Psychological Bulletin, 114, 185–199.
MacKay, D. J. C. (1992). Bayesian interpolation. Neural Computation, 4, 415–447.
McCullagh, P., & Nelder, J. A. (1989). Generalized linear models (2nd ed.). London: Chapman & Hall.
Meehl, P. E., & Waller, N. G. (2002). The path analysis controversy: A new statistical approach to strong appraisal of verisimilitude. Psychological Methods, 7, 283–300.
R Core Team. (2019). R: A language and environment for statistical computing [Computer software]. Vienna, Austria: R Foundation for Statistical Computing.
Schapire, R. E. (1999). A brief introduction to boosting. In Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence (pp. 1401–1406). Stockholm, Sweden.
Sen, P. K. (1979). Asymptotic properties of maximum likelihood estimators based on conditional specification. The Annals of Statistics, 7, 1019–1033.
Shmueli, G. (2010). To explain or to predict? Statistical Science, 25, 289–310.
Waller, N. G. (2008). Fungible weights in multiple regression. Psychometrika, 73, 691–703.
Waller, N. G., & Jones, J. A. (2009). Locating the extrema of fungible regression weights. Psychometrika, 74, 589–602.
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Choi, I. (2020). The Curse of Explanation: Model Selection in Language Testing Research. In: Ockey, G.J., Green, B.A. (eds) Another Generation of Fundamental Considerations in Language Assessment. Springer, Singapore. https://doi.org/10.1007/978-981-15-8952-2_9
DOI: https://doi.org/10.1007/978-981-15-8952-2_9
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-8951-5
Online ISBN: 978-981-15-8952-2
eBook Packages: Education (R0)