Quality of Life Research, 16:143

Development and evaluation of a computer adaptive test for ‘Anxiety’ (Anxiety-CAT)

  • Otto B. Walter
  • Janine Becker
  • Jakob B. Bjorner
  • Herbert Fliege
  • Burghard F. Klapp
  • Matthias Rose


Within the framework of item response theory (IRT), we developed a German version of an item bank, as well as a software application that can be used to measure anxiety by means of a computer adaptive test (CAT). A sample of n = 2348 psychiatric and psychosomatic patients answered a set of up to 13 standardized questionnaires. From these questionnaires, 81 items were considered pertinent to the anxiety construct. Various tests were conducted to ensure the suitability of these items for an IRT-based assessment. After these tests, 50 items remained in the item bank and were calibrated using the Generalized Partial Credit Model. Simulation studies conducted on an independent sample of n = 1528 respondents indicate that 6–8 items suffice to measure the latent trait with high precision (standard error ≤ 0.32). CAT scores correlated highly with scores estimated from all available items (r = .97) and with scale scores of the State Trait Anxiety Inventory (STAI, state scale, r = .93). Within a routine clinical setting, 102 in-patients answered the Anxiety-CAT along with a number of established anxiety questionnaires. The correlation between the Anxiety-CAT and the STAI state scale was still high (r = .60), but lower than the correlations found in the simulation studies. The Anxiety-CAT differentiated between mental health disorders in a manner similar to established questionnaires. These results suggest that the Anxiety-CAT does exhibit the advantages expected from theory, but further studies are needed to judge its full potential for research and clinical practice.
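To make the mechanics behind the abstract more concrete, the following is a minimal sketch of how an adaptive test with a precision-based stopping rule could work under the Generalized Partial Credit Model. It is not the authors' software. The item parameters, the maximum-information selection rule, the EAP scoring on a fixed grid with a standard normal prior, and all function names are assumptions made for illustration; only the model family and the stopping threshold (standard error ≤ 0.32) come from the text. On a latent scale with unit variance, a standard error of 0.32 corresponds to a reliability of roughly 1 − 0.32² ≈ 0.90.

```python
import math

def gpcm_probs(theta, a, thresholds):
    """Category probabilities under the Generalized Partial Credit Model."""
    # Cumulative logits: z_0 = 0, z_k = sum over v <= k of a * (theta - b_v)
    z = [0.0]
    for b in thresholds:
        z.append(z[-1] + a * (theta - b))
    top = max(z)  # subtract the maximum for numerical stability
    w = [math.exp(v - top) for v in z]
    total = sum(w)
    return [v / total for v in w]

def item_information(theta, a, thresholds):
    """Fisher information of a GPCM item: a^2 times the category-score variance."""
    p = gpcm_probs(theta, a, thresholds)
    mean = sum(k * pk for k, pk in enumerate(p))
    var = sum((k - mean) ** 2 * pk for k, pk in enumerate(p))
    return a * a * var

# Quadrature grid for the latent trait (assumed range -4 to 4, step 0.1)
GRID = [-4.0 + 0.1 * i for i in range(81)]

def eap_estimate(answers, bank):
    """Posterior mean and SD of theta given answers {item index: category}."""
    log_post = []
    for t in GRID:
        lw = -0.5 * t * t  # log of a standard normal prior, up to a constant
        for idx, cat in answers.items():
            a, bs = bank[idx]
            lw += math.log(gpcm_probs(t, a, bs)[cat])
        log_post.append(lw)
    top = max(log_post)
    w = [math.exp(v - top) for v in log_post]
    total = sum(w)
    mean = sum(t * v for t, v in zip(GRID, w)) / total
    var = sum((t - mean) ** 2 * v for t, v in zip(GRID, w)) / total
    return mean, math.sqrt(var)

def run_cat(respond, bank, se_target=0.32):
    """Administer items until the posterior SD drops to se_target or the bank is used up."""
    answers = {}
    theta, se = eap_estimate(answers, bank)
    while se > se_target and len(answers) < len(bank):
        remaining = [i for i in range(len(bank)) if i not in answers]
        # Pick the unused item that is most informative at the current estimate
        nxt = max(remaining, key=lambda i: item_information(theta, *bank[i]))
        answers[nxt] = respond(nxt)
        theta, se = eap_estimate(answers, bank)
    return theta, se, list(answers)

# Hypothetical bank: 20 four-category items as (discrimination, thresholds)
bank = []
for i in range(20):
    loc = -2.0 + 0.2 * i
    bank.append((1.5 + 0.1 * (i % 5), [loc - 1.0, loc, loc + 1.0]))

TRUE_THETA = 1.0  # assumed trait level of a simulated respondent

def respond(idx):
    """Simulated respondent who always picks the most probable category."""
    p = gpcm_probs(TRUE_THETA, *bank[idx])
    return p.index(max(p))

theta_hat, se_hat, used = run_cat(respond, bank)
print(f"items used: {len(used)}, theta = {theta_hat:.2f}, SE = {se_hat:.2f}")
```

The loop stops as soon as the posterior standard deviation reaches the target, so the number of items administered varies from respondent to respondent. A real implementation would use calibrated parameters from the item bank and might add content balancing or exposure control, none of which are modelled here.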


Item response theory, Computer adaptive testing, Anxiety, Questionnaire



Copyright information

© Springer Science+Business Media B.V. 2007

Authors and Affiliations

  • Otto B. Walter (1)
  • Janine Becker (2, 3, 4)
  • Jakob B. Bjorner (2, 3)
  • Herbert Fliege (4)
  • Burghard F. Klapp (4)
  • Matthias Rose (2, 3)

  1. Psychological Institute IV, Statistics and Quantitative Methods, University of Münster, Münster, Germany
  2. Health Assessment Lab, Waltham, USA
  3. QualityMetric Inc., Lincoln, USA
  4. Department of Psychosomatic Medicine and Psychotherapy, Clinic for Internal Medicine, Charité, University Medicine Berlin, Berlin, Germany
