Refined Distractor Generation with LSA and Stylometry for Automated Multiple Choice Question Generation

  • Josef Robert Moser
  • Christian Gütl
  • Wei Liu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7691)


As lifelong learning becomes increasingly important in our society, students need mechanisms for evaluating their own progress. A commonly used and widely accepted feedback mechanism is the multiple-choice test. Manual creation of multiple-choice questions is often a time-consuming process involving many iterations of trial and error. Thanks to text processing and natural language processing techniques, automated multiple-choice question generation has in recent years come closer to practical use than ever before. However, one of the most difficult tasks in both manual creation and automated generation of such tests is the creation of distractors, because unsuitable distractors allow students to guess the correct answer easily, which defeats the purpose of these questions. In this paper, we investigate the desired properties of distractors and identify relevant text processing algorithms, specifically latent semantic analysis and stylometry, for distractor selection. The refined distractors are compared with baseline distractors generated by our existing Automated Question Creator (AQC). Our preliminary evaluation shows that this novel combined approach produces distractors of higher quality than those of the baseline AQC system.
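The abstract does not describe how the two signals are combined, so the following is only a minimal sketch of the general idea and not the authors' implementation. It assumes each candidate already has a semantic similarity to the correct answer (hypothetical numbers standing in for LSA cosine similarities) and penalises candidates whose surface style differs from the answer. The feature set, the `style_weight` value, and all function names are illustrative assumptions.

```python
import math


def style_features(phrase):
    # Toy stylometric profile (an assumption, not the paper's feature set):
    # word count, mean word length, share of capitalised words.
    # Assumes a non-empty phrase.
    words = phrase.split()
    n = len(words)
    mean_len = sum(len(w) for w in words) / n
    cap_ratio = sum(1 for w in words if w[0].isupper()) / n
    return (float(n), mean_len, cap_ratio)


def style_distance(a, b):
    # Euclidean distance between the two stylometric profiles.
    fa, fb = style_features(a), style_features(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)))


def rank_distractors(answer, semantic_sim, style_weight=0.1, top_k=2):
    # semantic_sim maps candidate -> similarity to the answer in [0, 1].
    # Here these are hand-picked placeholders for LSA cosine similarities.
    scored = [
        (sim - style_weight * style_distance(answer, cand), cand)
        for cand, sim in semantic_sim.items()
        if cand != answer
    ]
    scored.sort(reverse=True)  # best combined score first
    return [cand for _, cand in scored[:top_k]]


answer = "latent semantic analysis"
candidates = {
    "principal component analysis": 0.8,
    "probabilistic topic modelling": 0.7,
    "Graz": 0.75,
    "a very long phrase that looks nothing like the answer": 0.8,
}
print(rank_distractors(answer, candidates))
```

In this toy setup the one-word and ten-word candidates are pushed down the ranking even though their assumed semantic similarity is high, because a distractor that looks stylistically unlike the correct answer is easy for a student to rule out.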


Automated Multiple-Choice Question Generation; Distractors; Text Processing; Latent Semantic Analysis; Stylometry





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Josef Robert Moser (1, 3)
  • Christian Gütl (1, 2)
  • Wei Liu (3)
  1. Institute for Information Systems and Computer Media, Graz University of Technology, Austria
  2. Business School, Curtin University, Australia
  3. School of Computer Science and Software Engineering, The University of Western Australia, Australia
