Abstract
Purpose
Construct validity is commonly assessed by applying statistical methods to data. However, purely empirical methods cannot explain what happens between the attribute and the instrument scores, which is the core of construct validity. Linear Logistic Test Models (LLTMs) can provide such an explanation by decomposing item difficulties into a weighted sum of theoretical item properties. In this study, we aim to support the construct validity of the Evaluation of Daily Activity Questionnaire (EDAQ) by identifying item properties that account for its item difficulties.
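The decomposition the LLTM rests on can be written explicitly; the notation below is the standard one (Fischer's formulation), not taken from this abstract:

```latex
% LLTM: the Rasch difficulty of item i as a weighted sum of item properties
\beta_i = \sum_{k=1}^{K} q_{ik}\,\eta_k
% LLTM plus error: an item-specific residual relaxes the exact decomposition
\beta_i = \sum_{k=1}^{K} q_{ik}\,\eta_k + \varepsilon_i,
\qquad \varepsilon_i \sim N(0,\sigma^{2})
```

Here \(\beta_i\) is the difficulty of item \(i\), \(q_{ik}\) the rating of item \(i\) on property \(k\), and \(\eta_k\) the weight of property \(k\). The second line corresponds to the "LLTM plus error" model compared in the Results.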
Methods
Dichotomized responses to the EDAQ were analyzed with (1) the Rasch model (to estimate item difficulties), and (2) LLTMs (to predict item difficulties). Seven properties of the items were identified and rated on ordinal scales by 39 Occupational Therapists worldwide. Aggregated metric estimates—the weights used to predict item difficulties in LLTMs—were derived from the ratings using seven cumulative link mixed models. Estimated and predicted item difficulties were then compared.
Results
The Rasch model showed acceptable fit and unidimensionality for a sample of 42 locally independent EDAQ items. The LLTM plus error showed significantly better fit than the LLTM. In the LLTM plus error model, three of the seven properties were not significant, so a reduced model retaining only the four significant properties was used to predict item difficulties; the predicted difficulties explained 77.5% of the variance in the estimated item difficulties.
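The comparison between estimated and predicted difficulties can be sketched as follows. This is a minimal illustration with hypothetical numbers (5 items, 2 properties), not the EDAQ data or results; the function names are our own:

```python
import numpy as np

def predicted_difficulties(Q, eta):
    """LLTM prediction: each item difficulty is a weighted sum of its
    property ratings (rows of Q) with property weights eta."""
    return Q @ eta

def variance_explained(estimated, predicted):
    """Share of variance in Rasch-estimated difficulties reproduced by
    the LLTM predictions (squared Pearson correlation)."""
    r = np.corrcoef(estimated, predicted)[0, 1]
    return r ** 2

# Hypothetical toy data (NOT the EDAQ results): 5 items, 2 properties.
Q = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [2.0, 1.0],
              [0.0, 2.0],
              [2.0, 2.0]])
eta = np.array([0.5, -0.3])   # hypothetical property weights

# Pretend Rasch estimates = LLTM prediction plus small item residuals,
# mimicking the "LLTM plus error" structure.
beta_hat = Q @ eta + np.array([0.05, -0.02, 0.01, -0.04, 0.03])

pred = predicted_difficulties(Q, eta)
print(variance_explained(beta_hat, pred))
```

In the study, the analogous quantity computed on the 42 EDAQ items was 77.5%.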
Conclusion
A satisfactory theoretical explanation of what makes one activity of daily living task more difficult than another has been provided by an LLTM plus error model, thereby supporting the construct validity of the EDAQ.
Acknowledgements
We are extremely thankful to the 39 Occupational Therapists who participated and made this project possible. We would also like to thank Armin Gemperli and Cristina Ehrmann for their comments on a previous version of this paper. This paper is part of the cumulative PhD thesis of NDA.
Funding
This work was supported by Swiss Paraplegic Research and University of Lucerne.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
A confirmation that ethics approval was not required for this study was granted from the Ethics Committee Northwest and Central Switzerland (EKNZ) on May 9, 2018.
Informed consent
All the occupational therapists contributed to the study on a voluntary basis and gave explicit consent to participate by returning the completed Excel file via email.
Adroher, N.D., Tennant, A. Supporting construct validity of the Evaluation of Daily Activity Questionnaire using Linear Logistic Test Models. Qual Life Res 28, 1627–1639 (2019). https://doi.org/10.1007/s11136-019-02146-4