Differences in Patient Health Questionnaire and Aachen Depression Item Bank scores between tablet versus paper-and-pencil administration

Quality of Life Research

Abstract

Purpose

To extend knowledge about the measurement equivalence of depression measures administered by tablet and by paper–pencil, the present study evaluated the effect of mode of administration (MoA) at the scale and item level for the Patient Health Questionnaire (PHQ-9) and the Aachen Depression Item Bank (ADIB) in elderly patients.

Methods

Primary care patients (N = 193, ≥60 years) were assessed in a crossover design in Leipzig, Germany. All participants completed the PHQ-9 and the ADIB in both MoAs under study. Effects of MoA were analyzed by intra-class correlation, mixed-effects regression, and differential item functioning (DIF). Additionally, detection rates under the two MoAs were compared by receiver operating characteristic (ROC) analysis against a diagnostic interview (SCID-I, N = 163).
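The Methods name intra-class correlation as one check of agreement between the two modes. As a minimal sketch of that idea, the Python below computes a two-way random-effects, absolute-agreement, single-measure ICC (often written ICC(2,1)) for paired scores. The abstract does not state which ICC form was used, so this choice, the function name `icc_2_1`, and the toy scores are assumptions for illustration only, not study data.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    scores: list of rows, one row per participant and one column per mode,
    e.g. [[tablet_score, paper_score], ...]. Assumes no missing values.
    """
    n = len(scores)        # participants
    k = len(scores[0])     # modes of administration
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-participant mean square
    msc = ss_cols / (k - 1)              # between-mode mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical toy data: the second mode scores every person one point higher.
toy = [[1, 2], [5, 6], [9, 10]]
print(round(icc_2_1(toy), 3))  # 0.97
```

Because this form measures absolute agreement, a constant shift between modes lowers the coefficient even when the rank order of participants is identical. That makes it sensitive to the kind of systematic mode effect the study looks for.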

Results

No effect of MoA was found for the PHQ-9 at the scale score or item level. Two ADIB items showed DIF according to MoA. In terms of discriminatory power, MoA did not influence the detection rates of either instrument.
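Discriminatory power in a ROC analysis is commonly summarized by the area under the curve (AUC). The sketch below computes the AUC per mode through its rank-based (Mann-Whitney) interpretation: the probability that a randomly chosen case scores higher than a randomly chosen non-case, with ties counted as one half. The function name `auc`, the toy scores, and the toy diagnoses are hypothetical. The abstract does not report how the two curves were compared statistically, so no significance test is shown.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of cases and non-cases.

    scores: questionnaire sum scores (higher = more depressive symptoms).
    labels: 1 if the reference interview indicates depression, else 0.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    # Each case/non-case pair counts 1 if the case scores higher, 0.5 if tied.
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not study results: same people, same diagnoses,
# one score per mode of administration.
diagnosis = [0, 0, 0, 1, 1]
tablet = [2, 4, 9, 8, 14]
paper = [3, 4, 7, 9, 15]
print("tablet AUC:", auc(tablet, diagnosis))
print("paper AUC:", auc(paper, diagnosis))
```

Similar AUCs across modes would be in line with the finding reported above. A formal comparison of correlated curves would need a dedicated test, which this sketch leaves out.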

Conclusions

Our findings suggest that mode of administration has no severe effect on self-report assessments of depression, and that tablets provide a valid way to assess depressive symptoms electronically in elderly patients. However, changes in item presentation can influence psychometric properties, so equivalence should be tested with item-level analyses such as DIF.


Acknowledgments

This study was funded by a junior research grant from the Medical Faculty, University of Leipzig.

Conflict of interest

None.

Author information

Corresponding author

Correspondence to Lena Spangenberg.

Cite this article

Spangenberg, L., Glaesmer, H., Boecker, M. et al. Differences in Patient Health Questionnaire and Aachen Depression Item Bank scores between tablet versus paper-and-pencil administration. Qual Life Res 24, 3023–3032 (2015). https://doi.org/10.1007/s11136-015-1040-5

  • DOI: https://doi.org/10.1007/s11136-015-1040-5
