Multiple-choice question generation with auto-generated distractors for computer-assisted educational assessment

Multimedia Tools and Applications


Multiple-choice questions (MCQs) are an instrumental tool for assessment, not only in various competitive examinations but also in contemporary information and communications technology (ICT)-based education, active learning, and similar settings. Therefore, automatic generation of multiple-choice test items from text-based learning material is a demanding task in computer-aided assessment. Many systems have been developed over the past two decades for this purpose, but the system-generated questions have failed to satisfy the needs of computer-based automated assessment. As a consequence, this remains an open area of research in education technology and natural language processing. This article presents an automated system for generating multiple-choice test items with distractors. The system first selects informative sentences using topic words or keywords (each consisting of one or more words). The best keyword in a selected sentence is chosen as the answer key. Next, the system removes the answer key from the sentence and transforms the sentence into a question sentence (stem). The wrong options, or distractors, are generated automatically using a feature-based clustering approach, without using any external information or knowledge base. The results highlight the efficiency of the proposed system for generating MCQs with distractors.
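The pipeline described in the abstract (pick an answer key, blank it out to form a stem, then draw distractors from a cluster of similar terms) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not name the features or the clustering algorithm, so the surface features below (word count, capitalisation, presence of digits) and the grouping by identical feature tuple are hypothetical stand-ins, as are all function names.

```python
import re
from collections import defaultdict


def make_stem(sentence: str, answer_key: str, blank: str = "______") -> str:
    """Replace the first occurrence of the answer key with a blank."""
    pattern = re.compile(re.escape(answer_key), flags=re.IGNORECASE)
    return pattern.sub(blank, sentence, count=1)


def features(term: str) -> tuple:
    """Hypothetical surface features (an assumption, not from the paper):
    number of words, starts with a capital letter, contains a digit."""
    return (
        len(term.split()),
        term[:1].isupper(),
        any(ch.isdigit() for ch in term),
    )


def cluster_terms(terms):
    """Group terms whose feature tuples are identical.
    A real system would more likely cluster numeric feature vectors."""
    clusters = defaultdict(list)
    for term in terms:
        clusters[features(term)].append(term)
    return clusters


def pick_distractors(answer_key: str, terms, n: int = 3):
    """Return up to n terms from the answer key's cluster, excluding the key."""
    clusters = cluster_terms(terms)
    same_cluster = clusters.get(features(answer_key), [])
    candidates = [t for t in same_cluster if t.lower() != answer_key.lower()]
    return candidates[:n]


# Toy input: the sentence and keyword list are made up for illustration.
sentence = "Paris is the capital of France."
keywords = ["Paris", "Berlin", "Madrid", "Rome", "capital", "1789"]

stem = make_stem(sentence, "Paris")
distractors = pick_distractors("Paris", keywords)
print(stem)
print(distractors)
```

In this toy run, only the keywords that share the answer key's feature tuple are eligible as distractors, which mirrors the idea of drawing wrong options from the same cluster without consulting an external knowledge base.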




This study received no funding.

Author information

Corresponding author

Correspondence to Bidyut Das.

Ethics declarations

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Conflict of Interests

The authors declare that there is no conflict of interest regarding the publication of this paper.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Das, B., Majumder, M., Phadikar, S. et al. Multiple-choice question generation with auto-generated distractors for computer-assisted educational assessment. Multimed Tools Appl 80, 31907–31925 (2021).
