Abstract
This chapter introduces mixed-effects regression modeling, mostly using alternation modeling as an example. Mixed-effects models are one option for dealing with cases where observations vary by group (such as speaker, register, or lemma): so-called random effects are introduced into the model specification. It is stressed that using a categorical variable as a random effect is just an alternative to using it as a normal fixed effect in a Generalised Linear Model (GLM) as introduced in Chap. 21, but that the two options have different mathematical advantages and disadvantages. Simple random intercepts, which capture per-group tendencies, are introduced, as are random slopes (for situations where fixed effects vary per group) and multilevel models (for situations where group-wise tendencies can be predicted from other variables, for example when lemma frequency is useful to predict lemma-specific tendencies). Criteria for including random effects in models and for evaluating the model fit (for example through pseudo-coefficients of determination) are discussed. The demonstration in R uses the popular lme4 package.
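As a rough illustration of the three model types named in the abstract, the following sketch shows how they might be specified in lme4. The data are simulated, and all variable names (variant, length, freq, lemma) are hypothetical; none of this is taken from the chapter or its accompanying script.

```r
# Minimal sketch on simulated data; every name below is an assumption.
library(lme4)

set.seed(1)
n.lemmas <- 30   # number of groups (levels of the grouping factor)
n.per    <- 40   # observations per group

d <- data.frame(
  lemma  = factor(rep(seq_len(n.lemmas), each = n.per)),
  length = rnorm(n.lemmas * n.per)            # first-level predictor
)

# Second-level predictor: one value per lemma, repeated for each observation.
lemma.freq <- rnorm(n.lemmas)
d$freq <- lemma.freq[as.integer(d$lemma)]

# Simulate a binary alternation with lemma-specific intercepts and slopes.
u   <- rnorm(n.lemmas, sd = 0.8)              # per-lemma intercept deviations
v   <- rnorm(n.lemmas, sd = 0.4)              # per-lemma slope deviations
idx <- as.integer(d$lemma)
eta <- -0.5 + (0.7 + v[idx]) * d$length + 0.5 * d$freq + u[idx]
d$variant <- rbinom(nrow(d), size = 1, prob = plogis(eta))

# Random intercept: per-lemma tendencies.
m.int <- glmer(variant ~ length + (1 | lemma),
               data = d, family = binomial)

# Random slope: the effect of length is allowed to vary per lemma.
m.slope <- glmer(variant ~ length + (1 + length | lemma),
                 data = d, family = binomial)

# Multilevel model: a lemma-level predictor helps predict lemma tendencies.
m.multi <- glmer(variant ~ length + freq + (1 | lemma),
                 data = d, family = binomial)

print(fixef(m.multi))
```

On real data, the random-slope model in particular may produce convergence or singular-fit messages; these are messages to inspect, not errors.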
Notes
1. Trivially, grouping factors should never be ordinal variables. They are always categorical. Terminologically, the “groups” are the collections of observations, and each such group corresponds to one “level” of the categorical grouping factor.
2. The fixed effects discussed so far, which are interpreted at the level of observations, are consequently called “first-level fixed effects” or “first-level predictors”.
3. Choosing one dummy as a reference level is necessary because otherwise, infinitely many equivalent estimates of the model coefficients exist, as one could simply add any arbitrary constant to the intercept and shift the other coefficients accordingly. However, the estimator works under the assumption that there is a unique maximum likelihood estimate. This extends to any other appropriate coding for categorical variables.
4. There is one other practical difference. If models are used to make actual predictions (which is rarely the case in linguistics), a random effect allows one to make predictions for unseen groups. See Gelman and Hill (2007, 272–275).
5. Shrinkage is thus stronger (and the conditional mode/mean is closer to 0) if there is less evidence that a group deviates from the overall tendency. The lower the number of observations per group, the less evidence there is.
6. Again, we do not assume them to be fixed population parameters, which would be the case for true estimates such as fixed-effects coefficients.
7. There are, of course, elegant ways of pulling the frequency values from another data frame on the fly in R.
8. The variance-covariance matrix of glmm.01 can also be extracted directly using the VarCorr(glmm.01) command.
9. Since the bootstrap (especially with smaller original sample sizes) tends to run into replications where the estimation of the variance fails and is thus returned as 0, the bootstrap interval is sometimes skewed towards 0 when the profile confidence interval frames the true value symmetrically. The bootstrap is thus not always more robust or intrinsically better. Comparing both methods is recommended.
10. Again, the accompanying script contains all necessary code.
11. This entails that GLMMs with only one simple random effect cannot be compared with a model without it, as such a model would be a GLM and not a nested GLMM.
12. Notice that the results reported in the paper differ slightly from the sample script included with this chapter because the random number generator was in a different state.
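The points about VarCorr and about comparing profile and bootstrap intervals for the variance estimate can be illustrated with a small sketch. Since glmm.01 is defined only in the chapter's accompanying script, the model below (glmm.demo) is a stand-in fitted to the cbpp data set that ships with lme4; it is an assumption for illustration and not the model discussed in the chapter. The number of bootstrap replications is kept deliberately small here.

```r
library(lme4)

# Stand-in for glmm.01 (assumption): a binomial GLMM with one random
# intercept, fitted to lme4's built-in cbpp data.
glmm.demo <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
                   data = cbpp, family = binomial)

# Extract the variance components of the random effects directly.
vc <- VarCorr(glmm.demo)
print(vc)

# Profile and parametric-bootstrap confidence intervals for the standard
# deviation of the random intercept ("theta_" selects the random-effects
# parameters only).
ci.prof <- confint(glmm.demo, parm = "theta_", method = "profile")

set.seed(1)  # bootstrap results depend on the random number generator state
ci.boot <- confint(glmm.demo, parm = "theta_", method = "boot", nsim = 100)

# Put both intervals side by side for comparison.
print(rbind(profile = ci.prof[1, ], bootstrap = ci.boot[1, ]))
```

If the bootstrap interval reaches 0 or is pulled towards it while the profile interval is not, some replications may have returned a variance estimate of 0, which is the situation the note on the bootstrap describes.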
References
Bates, D. M. (2010). lme4: Mixed-effects modeling with R. http://lme4.r-forge.r-project.org/lMMwR/lrgprt.pdf.
Bates, D., Kliegl, R., Vasishth, S., & Baayen, R. (2015a). Parsimonious mixed models. https://arxiv.org/abs/1506.04967.
Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015b). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. https://doi.org/10.18637/jss.v067.i01.
Biber, D., Finegan, E., & Atkinson, D. (1994). ARCHER and its challenges: Compiling and exploring a representative corpus of historical English registers. In U. Fries, P. Schneider, & G. Tottie (Eds.), Creating and using English language corpora (pp. 1–13). Amsterdam: Rodopi.
Fox, J., & Monette, G. (1992). Generalized collinearity diagnostics. Journal of the American Statistical Association, 87, 178–183. https://doi.org/10.2307/2290467.
Gelman, A., & Hill, J. (2007). Data analysis using regression and multilevel/hierarchical models. Cambridge: Cambridge University Press.
Halekoh, U., & Højsgaard, S. (2014). A Kenward-Roger approximation and parametric bootstrap methods for tests in linear mixed models – the R package pbkrtest. Journal of Statistical Software, 59(9), 1–30. https://doi.org/10.18637/jss.v059.i09.
Matuschek, H., Kliegl, R., Vasishth, S., Baayen, H., & Bates, D. (2017). Balancing type I error and power in linear mixed models. Journal of Memory and Language, 94, 305–315. https://doi.org/10.1016/j.jml.2017.01.001.
Nakagawa, S., & Schielzeth, H. (2013). A general and simple method for obtaining R2 from generalized linear mixed-effects models. Methods in Ecology and Evolution, 4(2), 133–142. https://doi.org/10.1111/j.2041-210x.2012.00261.x.
Schäfer, R. (2018). Abstractions and exemplars: The measure noun phrase alternation in German. Cognitive Linguistics, 29(4), 729–771. https://doi.org/10.1515/cog-2017-0050.
Schäfer, R., Barbaresi, A., & Bildhauer, F. (2013). The good, the bad, and the hazy: Design decisions in web corpus construction. In S. Evert, E. Stemle, & P. Rayson (Eds.), Proceedings of the 8th Web as Corpus Workshop (WAC-8) (pp. 7–15). Lancaster: SIGWAC.
Schäfer, R., & Bildhauer, F. (2012). Building large corpora from the web using a new efficient tool chain. In N. Calzolari (Chair), K. Choukri, T. Declerck, M. U. Doğan, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of the Eight International Conference on Language Resources and Evaluation (LREC’12) (pp. 486–493). Istanbul: European Language Resources Association (ELRA).
Schielzeth, H., & Forstmeier, W. (2009). Conclusions beyond support: Overconfident estimates in mixed models. Behavioral Ecology, 20(2), 416–420. https://doi.org/10.1093/beheco/arn145.
Zuur, A. F., Ieno, E. N., & Elphick, C. S. (2010). A protocol for data exploration to avoid common statistical problems. Methods in Ecology and Evolution, 1(1), 3–14. https://doi.org/10.1111/j.2041-210x.2009.00001.x.
Zuur, A. F., Ieno, E. N., Walker, N., Saveliev, A. A., & Smith, G. M. (2009). Mixed effects models and extensions in ecology with R. Berlin: Springer.
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Schäfer, R. (2020). Mixed-Effects Regression Modeling. In: Paquot, M., Gries, S.T. (eds) A Practical Handbook of Corpus Linguistics. Springer, Cham. https://doi.org/10.1007/978-3-030-46216-1_22
DOI: https://doi.org/10.1007/978-3-030-46216-1_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-46215-4
Online ISBN: 978-3-030-46216-1
eBook Packages: Religion and Philosophy, Philosophy and Religion (R0)