An autoregressive growth model for longitudinal item analysis

Abstract

A first-order autoregressive growth model is proposed for longitudinal binary item analysis where responses to the same items are conditionally dependent across time given the latent traits. Specifically, the item response probability for a given item at a given time depends on the latent trait as well as the response to the same item at the previous time, or the lagged response. An initial conditions problem arises because there is no lagged response at the initial time period. We handle this problem by adapting solutions proposed for dynamic models in panel data econometrics. Asymptotic and finite sample power for the autoregressive parameters are investigated. The consequences of ignoring local dependence and the initial conditions problem are also examined for data simulated from a first-order autoregressive growth model. The proposed methods are applied to longitudinal data on Korean students’ self-esteem.
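To make the dependence structure concrete, the following Python sketch simulates binary responses in which the logit for an item at a given time depends on a linearly growing latent trait and on the response to the same item at the previous time. It is an illustration only, not the authors' estimation code: the function name `simulate_ar1_items`, the Rasch-type linear predictor, and all parameter values (item intercepts, growth mean and variances, lag effect `lam`) are assumptions. At the initial time point the lagged response is simply set to 0, which is the naive treatment that an initial conditions adjustment is meant to improve on.

```python
import math
import random


def simulate_ar1_items(n_persons=200, n_items=3, n_times=6, lam=0.8, seed=1):
    """Simulate binary responses with a first-order lagged-response effect.

    All numeric values below are hypothetical and chosen for illustration.
    """
    rng = random.Random(seed)
    beta = [-0.5 + 0.5 * i for i in range(n_items)]  # hypothetical item intercepts
    data = []  # data[s][t][i] is the 0/1 response of person s at time t to item i
    for _ in range(n_persons):
        intercept = rng.gauss(0.0, 1.0)  # person-specific starting level
        slope = rng.gauss(0.2, 0.3)      # person-specific growth rate
        person = []
        for t in range(n_times):
            theta = intercept + slope * t  # latent trait at time t
            row = []
            for i in range(n_items):
                # No lagged response exists at t = 0, so it is set to 0 here
                # (the naive handling of the initial condition).
                lag = person[t - 1][i] if t > 0 else 0
                eta = beta[i] + lam * lag + theta
                p = 1.0 / (1.0 + math.exp(-eta))
                row.append(1 if rng.random() < p else 0)
            person.append(row)
        data.append(person)
    return data


data = simulate_ar1_items()
print(len(data), len(data[0]), len(data[0][0]))  # persons, time points, items
```

Setting `lam=0` in this sketch removes the serial dependence, so responses to the same item are conditionally independent across time given the latent trait.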


Acknowledgments

The research reported here was supported by the Institute of Education Sciences, U.S. Department of Education, through Grant R305D110027 to Educational Testing Service. The opinions expressed are those of the authors and do not represent the views of the Institute or the U.S. Department of Education.

Author information

Correspondence to Minjeong Jeon.

Appendix

Here we illustrate how to estimate the proposed model using gllamm. We use model M1, which includes a first-order lagged effect for item 1 (used in the empirical study in Sect. 7).

Let \(t\) represent the years 2003 to 2008, coded as \(t=0, 1, 2,\ldots , 5\), respectively (i.e., \(t=0\) for year 2003). We can formulate model M1 for the initial time point (\(t=0\)) as

$$\begin{aligned} g \left( \text {Pr} (y_{tis}=1 | \delta _{1s}, \gamma _{2s} ) \right)&= \beta _i' + \alpha _i' \delta _{1s} + \alpha _i' \gamma _{2s} \text {time}_{ts} + \alpha _i' \epsilon _{ts}, \end{aligned}$$
(8)

where \(\gamma _{2s} = b+ \delta _{2s}\), with \(b\) being the mean of \(\gamma _{2s}\) (see Eq. (5)). For the following time points (\(t>0\)), we formulate the model as

$$\begin{aligned} g \left( \text {Pr} (y_{tis}=1 | y_{(t-1)1s}, \delta _{1s}, \gamma _{2s} ) \right)&= \beta _i + \lambda _{1}y_{(t-1)1s}r_{i=1} + \alpha _i \delta _{1s} + \alpha _i \gamma _{2s} \text {time}_{ts} + \alpha _i \epsilon _{ts}, \end{aligned}$$
(9)

where \(\lambda _{1}\) is the parameter for the lagged response \(y_{(t-1)1s}\) to item 1 and \(r_{i=1}\) is a dummy variable for item 1. We can formulate a combined model for \(t=0\) and \(t>0\) by using the dummy variable \(d_{t=0,i=1}\), which indicates item 1 at the initial time point (\(t=0\)), as follows:

$$\begin{aligned} g \left( \text{Pr} (y_{tis}=1 | y_{(t-1)1s}, \delta_{1s}, \gamma_{2s} ) \right) ={}& \beta_i + \beta_1^* d_{t=0,i=1} + \lambda_{1} y_{(t-1)1s} r_{i=1} && \text{(10)} \\ & + \delta_{1s} (\alpha_i + \alpha_1^* d_{t=0,i=1} ) && \text{(11)} \\ & + \gamma_{2s} (\alpha_i \text{time}_{ts} + \alpha_1^* \text{time}_{ts} d_{t=0,i=1}) && \text{(12)} \\ & + \epsilon_{ts} ( \alpha_i + \alpha_1^* d_{t=0,i=1}), && \text{(13)} \end{aligned}$$

where Eq. (10) constitutes the fixed part of the model and Eqs. (11) to (13) constitute the random part of the model. In the fixed part, \(\beta _1^* =\beta _1'-\beta _1\) and in the random part \(\alpha _1^*=\alpha _1'-\alpha _1\) and \(\gamma _{2s} = b+\delta _{2s}\).
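As a consistency check, the combined model can be evaluated at the two values of the dummy variable \(d_{t=0,i=1}\). The sketch below assumes that the lagged-response term \(\lambda _{1}y_{(t-1)1s}r_{i=1}\) is set to zero at \(t=0\), where no lagged response exists; this convention is an assumption made here and is not stated explicitly above. For item 1 at the initial time point (\(d_{t=0,i=1}=1\)), substituting \(\beta _1 + \beta _1^* = \beta _1'\) and \(\alpha _1 + \alpha _1^* = \alpha _1'\) gives

$$\begin{aligned} g \left( \text{Pr} (y_{01s}=1 | \delta_{1s}, \gamma_{2s} ) \right) &= (\beta_1 + \beta_1^*) + (\alpha_1 + \alpha_1^*) \delta_{1s} + (\alpha_1 + \alpha_1^*) \gamma_{2s} \text{time}_{0s} + (\alpha_1 + \alpha_1^*) \epsilon_{0s} \\ &= \beta_1' + \alpha_1' \delta_{1s} + \alpha_1' \gamma_{2s} \text{time}_{0s} + \alpha_1' \epsilon_{0s}, \end{aligned}$$

which is Eq. (8) for \(i=1\). In all other cases (\(d_{t=0,i=1}=0\)), the terms involving \(\beta _1^*\) and \(\alpha _1^*\) drop out and

$$\begin{aligned} g \left( \text{Pr} (y_{tis}=1 | y_{(t-1)1s}, \delta_{1s}, \gamma_{2s} ) \right) &= \beta_i + \lambda_{1} y_{(t-1)1s} r_{i=1} + \alpha_i \delta_{1s} + \alpha_i \gamma_{2s} \text{time}_{ts} + \alpha_i \epsilon_{ts}, \end{aligned}$$

which is Eq. (9). Read this way, the combined model implies that only item 1, the item with a lagged effect, receives separate intercept and loading parameters at the initial time point.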

[Code listing: gllamm syntax for model M1]

To save computation time, users may first run the model with a small number of quadrature points (e.g., 2) and then use the resulting estimates as starting values for a run with a larger number of quadrature points (e.g., 5).
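The trade-off behind this advice can be illustrated with a small Python sketch that approximates a marginal response probability by ordinary Gauss-Hermite quadrature with 2 and with 5 points. This is a hedged illustration, not a reproduction of gllamm's internal quadrature scheme: the names `RULE_2`, `RULE_5`, and `marginal_prob`, the single normally distributed latent variable, and the values `beta = 1.0` and `sd = 1.5` are all assumptions. The nodes and weights are the standard closed-form Gauss-Hermite rules rescaled to a standard normal density.

```python
import math

# Gauss-Hermite rules rescaled for a standard normal variable Z:
# E[f(Z)] is approximated by the sum of w * f(z) over the (z, w) pairs.
RULE_2 = [(-1.0, 0.5), (1.0, 0.5)]

_A = math.sqrt(5.0 - math.sqrt(10.0))            # inner nodes of the 5-point rule
_B = math.sqrt(5.0 + math.sqrt(10.0))            # outer nodes of the 5-point rule
_WA = (7.0 + 2.0 * math.sqrt(10.0)) / 60.0       # weight of each inner node
_WB = (7.0 - 2.0 * math.sqrt(10.0)) / 60.0       # weight of each outer node
RULE_5 = [(-_B, _WB), (-_A, _WA), (0.0, 8.0 / 15.0), (_A, _WA), (_B, _WB)]


def marginal_prob(beta, sd, rule):
    """Approximate Pr(y = 1) = E[logistic(beta + sd * Z)] for Z ~ N(0, 1)."""
    return sum(w / (1.0 + math.exp(-(beta + sd * z))) for z, w in rule)


# Hypothetical intercept and latent standard deviation.
p2 = marginal_prob(1.0, 1.5, RULE_2)
p5 = marginal_prob(1.0, 1.5, RULE_5)
print(f"2 points: {p2:.4f}, 5 points: {p5:.4f}")
```

The 2-point rule is cheap but integrates only low-order polynomials exactly, which is why it serves as a quick first pass whose estimates are then refined with more points.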

About this article

Cite this article

Jeon, M., Rabe-Hesketh, S. An autoregressive growth model for longitudinal item analysis. Psychometrika 81, 830–850 (2016). https://doi.org/10.1007/s11336-015-9489-2

Keywords

  • autoregressive models
  • initial conditions problem
  • measurement invariance
  • serial dependence