Abstract
Purpose
To extend knowledge about the measurement equivalence of depression measures administered by tablet and by paper and pencil, the present study evaluated the effect of mode of administration (MoA) at the scale and item level for the Patient Health Questionnaire (PHQ-9) and the Aachen Depression Item Bank (ADIB) in elderly patients.
Methods
Primary care patients (N = 193, ≥60 years) were assessed in a crossover design in Leipzig, Germany. All participants completed the PHQ-9 and the ADIB in both MoAs. Effects of MoA were analyzed using intra-class correlation, mixed-effects regression, and differential item functioning (DIF) analysis. Additionally, detection rates of the two MoAs were compared using receiver operating characteristic (ROC) analysis, with a diagnostic interview (SCID-I, N = 163) as the reference standard.
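As an illustration of two of the analyses named above, the following minimal sketch computes an intra-class correlation between two modes and an area under the ROC curve against a binary reference diagnosis. The abstract does not state which ICC form or which software the authors used; the two-way random-effects, absolute-agreement, single-measure form (often written ICC(2,1)) is an assumption here, the function names `icc_2_1` and `auc` are hypothetical, and the numbers are made-up toy values, not study data.

```python
from statistics import mean


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    scores: list of rows, one row per participant, one column per mode
    (for example [tablet_score, paper_score]). Assumed form, see lead-in.
    """
    n = len(scores)        # participants
    k = len(scores[0])     # modes of administration
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Sums of squares of the two-way ANOVA without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-participant mean square
    msc = ss_cols / (k - 1)              # between-mode mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def auc(scores, labels):
    """Area under the ROC curve as the probability that a randomly chosen
    case (label 1) scores higher than a randomly chosen non-case (label 0),
    counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy values for illustration only (not data from the study)
tablet = [4, 12, 7, 15, 2, 9]
paper = [5, 11, 7, 16, 3, 8]
diagnosis = [0, 1, 0, 1, 0, 1]   # hypothetical interview result

print("ICC(2,1):", icc_2_1([list(pair) for pair in zip(tablet, paper)]))
print("AUC tablet:", auc(tablet, diagnosis))
print("AUC paper:", auc(paper, diagnosis))
```

Similar AUC values for both modes would correspond to comparable detection rates; a formal comparison of two correlated AUCs needs a dedicated test, which this sketch does not implement.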
Results
No effect of MoA was found for the PHQ-9 at the scale score or item level. Two ADIB items showed DIF by MoA. In terms of discriminatory power, MoA did not influence the detection rates of either instrument.
Conclusions
In summary, our findings suggest that no severe effect of MoA on self-report assessments of depression should be expected. We conclude that tablets provide a valid way to assess depressive symptoms electronically in elderly patients. However, changes in item presentation can influence psychometric properties and require equivalence testing with item-level analyses such as DIF.


Acknowledgments
This study was funded by a junior research grant from the Medical Faculty, University of Leipzig.
Conflict of interest
None.
Cite this article
Spangenberg, L., Glaesmer, H., Boecker, M. et al. Differences in Patient Health Questionnaire and Aachen Depression Item Bank scores between tablet versus paper-and-pencil administration. Qual Life Res 24, 3023–3032 (2015). https://doi.org/10.1007/s11136-015-1040-5
DOI: https://doi.org/10.1007/s11136-015-1040-5