Volume 21, Issue 1, pp 56-73
Date: 28 Jan 2011

Human vs. Computer Diagnosis of Students’ Natural Selection Knowledge: Testing the Efficacy of Text Analytic Software


Our study examines the efficacy of Computer Assisted Scoring (CAS) of open-response text relative to expert human scoring within the complex domain of evolutionary biology. Specifically, we explored whether CAS can diagnose the explanatory elements (or Key Concepts, KCs) that comprise undergraduate students’ explanatory models of natural selection with the same fidelity as expert human scorers in a sample of >1,000 essays. We used SPSS Text Analysis 3.0 (SPSSTA) to perform our CAS and measured Kappa values (inter-rater reliability) for KC detection (i.e., computer–human rating correspondence). Our first analysis indicated that the text analysis functions (or extraction rules) developed and deployed in SPSSTA to extract individual KCs from three items differing in several surface features (e.g., taxon, trait, type of evolutionary change) produced “substantial” (Kappa 0.61–0.80) or “almost perfect” (Kappa 0.81–1.00) agreement. The second analysis explored the measurement of human–computer correspondence for KC diversity (the number of different accurate knowledge elements) in the combined sample of all 827 essays. Here we found outstanding correspondence; extraction rules generated using one prompt type are broadly applicable to other evolutionary scenarios (e.g., bacterial resistance, cheetah running speed). This result is encouraging, as it suggests that developing new item sets may not require developing new text analysis rules. Overall, our findings suggest that CAS tools such as SPSS Text Analysis may compensate for some of the intrinsic limitations of currently used multiple-choice Concept Inventories designed to measure student knowledge of natural selection.
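
The Kappa statistic referred to above can be illustrated with a minimal sketch of Cohen's kappa for two raters. This is not the SPSS Text Analysis procedure used in the study. It assumes one human and one computer rating per essay for a single Key Concept, coded 1 (present) or 0 (absent). The function names `cohen_kappa` and `agreement_band` are hypothetical, and the ten ratings are invented for illustration, not taken from the study's data.

```python
from collections import Counter


def cohen_kappa(human, computer):
    """Cohen's kappa for two equal-length lists of categorical ratings."""
    if len(human) != len(computer) or not human:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(human)
    # Observed agreement: share of essays on which both raters agree.
    observed = sum(h == c for h, c in zip(human, computer)) / n
    # Chance agreement: sum over categories of the product of the two
    # raters' marginal proportions.
    h_counts, c_counts = Counter(human), Counter(computer)
    categories = set(h_counts) | set(c_counts)
    expected = sum(h_counts[k] * c_counts[k] for k in categories) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical category; kappa is undefined,
        # so this sketch reports complete agreement by convention.
        return 1.0
    return (observed - expected) / (1 - expected)


def agreement_band(kappa):
    """Label kappa using only the two bands quoted in the abstract."""
    if kappa >= 0.81:
        return "almost perfect"
    if kappa >= 0.61:
        return "substantial"
    return "below substantial"  # placeholder label, not from the source


# Invented ratings: 1 = Key Concept detected in the essay, 0 = not detected.
human_ratings = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
computer_ratings = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]

kappa = cohen_kappa(human_ratings, computer_ratings)
print(round(kappa, 2), agreement_band(kappa))
```

In this invented example the raters agree on 9 of 10 essays (observed agreement 0.9) while chance agreement is 0.5, so kappa is (0.9 - 0.5) / (1 - 0.5) = 0.8, which falls in the “substantial” band.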