Introduction

The use of Multiple Choice Questions (MCQs) as an assessment tool is gaining attention in the current education field (Yaneva et al., 2018). Some of the benefits of using MCQs are that they: a) offer the opportunity to measure intelligence, knowledge, or cognitive skills, b) are easy to evaluate, and c) can be administered to large groups. Due to these advantages, MCQs have been widely used to aid decision-making during job placements and college admissions. In addition, MCQ stems can evaluate whether the relevant outcomes of a particular course have been satisfied, which can help to review and revise the instructional activities if needed.

Despite these benefits, there are challenges associated with developing and using MCQs. One of these is the need to develop many distinct MCQ stems for each course (Haladyna & Rodriguez, 2013; D'Sa & Visbal-Dionaldo, 2017). According to Wood (2009), reusing the same MCQ stems allows answers to be memorized, thus posing a threat to the validity of the exam. Furthermore, for a given course, the design of MCQ stems should address the course objectives along with the course outcomes (Tarrant & Ware, 2012). Hence, the manual construction of MCQ stems is time-consuming, cumbersome, and error-prone (Hansen & Dexter, 1997; Tarrant et al., 2006). Most MCQ stem developers have inadequate expertise and training to develop high-quality MCQ stems (Tarrant et al., 2006). The resulting item-writing flaws lead to ambiguous MCQ stems, which undermine the validity of the questions as established by expert panel review (Considine et al., 2005; Xie et al., 2022). According to Considine et al. (2005), the content validity of MCQ stems can be established only by expert panel review.

Technology for automatically generating MCQ stems holds much promise for addressing the problems mentioned above. Rus et al. (2008) define Automatic Question Generation (AQG) as the task of developing questions automatically from various inputs such as text, databases, or semantic representations. Automatically generated questions assist in measuring learning capability and offer a quicker solution for large-scale assessment tests (Gierl et al., 2017). The construction of MCQ stems through AQG also facilitates the use of MCQs in drills and practice sessions. AQG can also be customized to design personalized MCQs for test takers, accounting for their preferences and learning ability (Mostow & Chen, 2009; Shah et al., 2017).

According to Le et al. (2014), recent AQG research deals with techniques to generate questions from knowledge resources that are either structured (e.g., ontologies) or unstructured (textual data). The approach using structured knowledge resources is called a semantic-based or ontology-based approach. In contrast, the Machine-Learning (ML) based or text-based approach uses unstructured data resources. An ontology supports MCQ stem generation because its semantics and precise syntax represent the domain knowledge (Alsubait, 2015), and this knowledge representation then describes the question stem. Therefore, semantic-based approaches have generated heterogeneous MCQs in various domains using structured knowledge resources. ML-based techniques have also gained popularity, where classifiers trained with textual data features identify the relevant sentences (Kurdi et al., 2020), which are converted primarily into Cloze questions.

In the existing literature, it is observed that AQG towards MCQs has been applied predominantly in the language learning domain (Alsubait, 2015); the contribution of AQG to the technical domain has been minimal (Heilman, 2011). In the current education system, the technical domain uses MCQs extensively (O'Dwyer, 2012). According to Narayanan et al. (2015), MCQs in engineering education need to comprise questions that: a) involve real-life problem solving and inductive learning with reasoning, b) satisfy the cognitive skills based on Bloom's Taxonomy, and c) satisfy the required course outcomes for a given course. Constructing MCQs manually that fulfill these guidelines requires tremendous effort (Testa et al., 2018). Therefore, this research attempts to generate MCQ stems automatically for a technical domain based on Bloom's Taxonomy cognitive levels (Testa et al., 2018) to structure and characterize the assessment in terms of complexity and higher-order skills.

AQG in the current research has generated a variety of MCQs, like Cloze and Wh-type questions (questions starting with 'Where', 'What', and so on). However, based on observations by Kurdi et al. (2020), ontologies fail to generate grammatically correct Cloze questions compared to ML techniques. The main reason is that verbalizing an ontology into Cloze questions yields grammatically incorrect questions (Faizan & Lohmann, 2018). In addition, generating Cloze questions does not require inferring any semantic reasoning, so unstructured data can suitably be used to generate Cloze questions with ML. Based on the review by Ch and Saha (2018), ML generates reasonably good Cloze questions but fails to generate grammatically correct Wh-type questions compared to the ontology technique. Given the above issues and limitations, it is imperative to develop a system that can generate MCQ stems automatically for a technical domain. Hence the objectives are formalized into the following research questions:

  1. RQ1: Can a system automatically generate different Wh-type and Cloze question stems for a technical domain?

  2. RQ2: Can this system generate MCQ stems that assess cognitive skills as categorized in Bloom's Taxonomy?

  3. RQ3: Can this system generate useful and grammatically correct Wh-type and Cloze MCQ stems using a hybrid combination of ontology and ML?

This research proposes a hybrid approach using an Ontology-Based Technique (OBT) and a Machine-Learning Based Technique (MBT) to generate different types of Wh-type and Cloze question stems for a technical domain. The objectives are as follows:

  • A hybrid framework of OBT and MBT to generate heterogeneous MCQ stems for a technical domain

  • Generation of Wh-type question stems using OBT and Cloze question stems using MBT

  • Evaluation of MCQ stems based on Bloom's Taxonomy

The rest of the article is organized as follows: the "Background" section presents the background, and the "Related Work" section discusses the related work. The "Methodology" section describes the methodology, and the "Results of Experiment and Analysis" section provides the experimental results and analysis. The "Evaluation" section discusses the evaluation of the system, and the "Conclusion" section concludes the article.

Background

Semantic-based approaches predominantly utilize an ontology and its components for the automatic generation of MCQs. This section briefly introduces ontologies, MCQs, the different types of MCQs, and the cognitive skills classified under Bloom's Taxonomy.

Ontology

Gruber (1993) defines an ontology as an explicit specification of a conceptualization of a domain. Ontologies represent knowledge that can be used as a foundation to build many intelligent applications. Recent advances in publishing knowledge in the form of ontologies have led to the increased use of these structures in educational applications (Vinu & Kumar, 2015). Researchers use existing or hand-crafted ontologies for a given domain to generate assessment questions. An ontology can be built either (a) using the open-source software Protégé (Stanford Center for Biomedical Research, 2019) or (b) using an ontology language, i.e., the Web Ontology Language (OWL). An ontology provides an explicit specification of a domain modelled through the ontology components of concepts, instances, attributes (datatype properties), relations (object-type properties), and axioms (Gruber, 1995).

Concepts are classes; instances are individuals of a concept; relations are attributes of a concept or relationships between concepts; axioms are restrictions or constraints on the concepts. Description Logic (DL), from the family of logic-based knowledge representations, conveys the assertions or facts on concepts, instances, and relations through the axioms of the ontology (Horrocks, 2005). The axioms added are called Terminological axioms (TBox) and Assertional axioms (ABox). According to Grosof et al. (2003), the TBox structures the domain, i.e., the schema of the domain, while the ABox holds the instances of the domain. Hence an ontology is analogous to a database model, except that a database reflects data in tables while an ontology reflects data in a knowledge graph. Additional higher-level constraints or rules on specific roles satisfied by concepts extend the ontology's semantics (Eiter et al., 2008). According to Horrocks et al. (2004), the Semantic Web Rule Language (SWRL) is a rule-based markup language comprising a set of rules, each with an antecedent and a consequent; a rule implies that whenever the conditions in the antecedent hold, the conditions in the consequent must also hold. SWRL can assert facts such as 'Father is a male having a human child' or 'Parent' as the inverse relationship of 'child'.

Consider an example: For a university domain, the three classes or concepts are Person, Faculty, and Student. A Person can be either a Faculty or a Student, so Faculty and Student are sub-classes under Person. Each Faculty has datatype properties—Name, Age, Address, EmpId, Dept, and Salary. Each instance of Student has datatype properties—RegNo, RollNo, Section, Name, and DOB. Faculty teaches Students, so 'teaches' is an object-type property. X is an instance under Faculty, while Y is under Student. Figure 1 shows the ontology for this domain built with Protégé.

Fig. 1 Sample of university ontology
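For illustration, the sketch below builds this university ontology programmatically, assuming the owlready2 Python library and a placeholder IRI; the paper itself constructs ontologies in the Protégé GUI, so the library choice is an assumption.

```python
# Minimal sketch of the university ontology, assuming the owlready2
# library and a placeholder IRI (the paper builds its ontologies in
# the Protégé GUI instead).
from owlready2 import Thing, DataProperty, ObjectProperty, get_ontology

onto = get_ontology("http://example.org/university.owl")  # hypothetical IRI

with onto:
    class Person(Thing): pass
    class Faculty(Person): pass        # sub-class of Person
    class Student(Person): pass        # sub-class of Person

    class teaches(ObjectProperty):     # object-type property: Faculty teaches Student
        domain = [Faculty]
        range = [Student]

    class Name(DataProperty): range = [str]    # datatype properties
    class EmpId(DataProperty): range = [str]
    class RegNo(DataProperty): range = [str]

# Instances X (a Faculty) and Y (a Student), linked by 'teaches'
x = onto.Faculty("X", Name=["X"], EmpId=["E01"])
y = onto.Student("Y", Name=["Y"], RegNo=["R42"])
x.teaches = [y]

onto.save(file="university.owl")
```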

DL also provides reasoning capability, along with constructs of conjunction, disjunction, negation, and quantifiers, to the ontology (Baader et al., 2005). Hence an ontology is said to represent the domain at the semantic level. Reasoning techniques can infer certain assertions and add them to a built ontology. Thus an ontology not only represents knowledge but also adds and extends semantics by inference through reasoning. Due to this semantic representation, research uses ontology models to generate MCQs automatically.

MCQs

An MCQ comprises a question called the stem (Majumder & Saha, 2015) and has one correct answer called the key, with three to four wrong options called distractors. It is either a single-response question with only one key or a multiple-response question having multiple keys. The approach in this research is towards the automatic generation of single-response MCQs. Figure 2 shows the structure of a single-response MCQ. MCQ questions come in different variations, like Wh-type, Definition, One-word answer, Synonym, T/F (True/False), Odd one out, Analogy, and Cloze questions (Agarwal, 2012).

Fig. 2 Sample of MCQ

Interrogative questions that start with a Wh-word, e.g., 'When', 'Who', 'Where', 'Why', 'Which', and 'What', are Wh-type questions. 'What' is mostly used in questions about a term, directly or indirectly. Direct questions of the form 'What is X?' represent Definition questions. Questions referring to a concept indirectly, of the form 'What is the concept which yields X?', are answered with one word; henceforth, such questions are referred to as One-word answer questions. Questions that extract the equivalent name of a given concept are Synonym questions, e.g., 'What is the equivalent name of X?'. True/False questions determine whether a given stem is true or false. Odd one out questions are Wh-type questions to identify the key which satisfies the stem, e.g., 'Which among the following is of type X?'. Analogy questions give a specific relationship and ask for another similar one, e.g., 'As Fuel : Car, then Food : ?'. Cloze questions are questions where a word is substituted by a blank, e.g., 'A part of the word from the sentence is termed as _________'. A given MCQ stem needs to satisfy a certain cognitive skill based on Bloom's Taxonomy (Dunham et al., 2015).

Bloom’s Taxonomy

Bloom (1956) identified three learning domains or educational activities: Cognitive (knowledge or mental skills), Affective (attitude or emotions), and Psychomotor (physical skills). Technical domain education requires questions assessing intellectual skills such as problem-solving and critical thinking, which belong to the cognitive domain (Anderson & Krathwohl, 2001). This research aims to generate such question stems automatically, so this section introduces only the educational objectives under the cognitive domain.

Under the cognitive domain, the six skills are remembering, understanding, applying, analyzing, evaluating, and creating (Palmer & Devitt, 2007). Table 1 shows the different levels of cognitive learning (Krathwohl, 2002). MCQs need to evaluate all the skills a student may demonstrate for a given course (Narayanan et al., 2015). However, MCQs are not appropriate for testing the highest level of creating (Carneson et al., 1996). Nevertheless, MCQs can test the other higher levels of evaluating, applying, and analyzing, though in practice they often test the lower-order skills of understanding and remembering (Carneson et al., 1996). Certain question words are used in MCQs to test the different cognitive skills (Anderson & Krathwohl, 2001), as shown in Table 2.

Table 1 Cognitive skills as stated in Bloom’s taxonomy (Krathwohl, 2002)
Table 2 Question words in MCQs to infer the cognitive skills (Anderson & Krathwohl, 2001)

Related Work

AQG research using either unstructured or structured knowledge resources has been active for three decades (Effenberger, 2015). According to Alsubait (2015), ontology-based and ML techniques have been prevalent and predominant in the automatic generation of MCQs. Most ML techniques have targeted generating Cloze questions in the language learning domain. ML approaches generate Cloze questions quickly due to the high similarity between the generated questions and the input text (Rakangor & Ghodasara, 2015). Furthermore, the distractors of Cloze questions are words with a different spelling or grammatical sense from the key (Brown et al., 2005). Leo et al. (2019) agree and mention that MCQs other than Cloze questions require semantics with varying syntactic structures of the text. In this context, ontology-based approaches have been successful and relevant for generating meaningful Wh-type questions (Ch & Saha, 2018). Hence, the related work discusses both ontology and ML approaches for generating MCQs.

Most ML approaches focus on generating Cloze questions. To generate Cloze questions, the first step is to identify informative sentences. The next step is identifying a target word, or key, in each informative sentence. The last step is modifying the informative sentence by replacing the target word with a blank, generating the Cloze question. The first step needs domain-based features for selecting informative sentences, and most approaches use conditional probability to identify the key in the second step. Therefore, the discussion of each related work in the ML approach is limited to the first step.

Pino et al. (2008) presented a strategy to select sentences that improved the quality of the Cloze questions. Here, the input sentences having the target word were chosen as informative sentences. Each sentence was given a weight based on features: grammar, context, complexity, and length of the sentence. Relevant sentences were those with scores greater than a threshold, and these were converted to Cloze questions for English vocabulary assessment.

Correia et al. (2012) made AQG efforts to generate Cloze questions in Portuguese, while Aldabe et al. (2009) attempted similar research in the Basque language. Both studies built a gold-standard corpus with the aid of domain experts in the respective languages and trained a Support Vector Machine (SVM) on the corpus features to identify the informative sentences. The chosen features comprised: sentence length (Pino et al., 2008), the position of the target word, proper nouns, foreign words, co-occurrences, verbs, acronyms, and numerical expressions. Once trained on this feature set, the SVM filtered and identified potential informative sentences from the input sentences. These studies tested which features could be used for SVM training to output the best informative sentences for a given input corpus in a language other than English.

Cloze questions from a biology textbook were generated by Agarwal and Mannem (2011), wherein informative sentences were extracted based on specific features. In the biology domain, words in a chapter title commonly recur in the sentences of that chapter. So the features used were the count of nouns and adjectives in the sentences, identical words in the title and sentence, abbreviations, and the position of the word in the sentence. Based on the features satisfied, each sentence was assigned a weight. Relevant sentences were those with scores greater than a threshold and were converted to Cloze questions. The limitation was that some features in the approach generated irrelevant Cloze questions.

Effenberger (2015) parsed input sentences and proposed different features to extract the informative sentences of news articles and transform them into Wh-type questions. The features comprised: the number of occurrences of the target word, whether the sentence was contained in the headline, and the target word's depth in the parse tree. Depending on the features satisfied, each sentence was assigned a weight, and the relevant sentences were those with scores greater than a threshold. The approach generated only Wh-type questions from the relevant sentences, and these were either too easy or ambiguous.

Majumder and Saha (2015) proposed a technique for selecting informative and relevant sentences from Wikipedia using topic modeling for the sports domain. This approach extracted the sentences containing the required topics. Then the syntactic parse trees of the input sentences were compared for structural similarity with the syntactic parse trees of reference sentences, which were compiled and extracted from MCQs of the sports domain. Relevant sentences were input sentences with parse trees similar to those of the reference sentences. The sports domain comprised words about locations, sportsperson names, tournament names, and trophies won. These domain-related words were the target words, and Wh-type questions were generated based on the identified key. The limitation of the approach was the generation of irrelevant Wh-type questions due to the false detection of topics from Wikipedia.

Using the Firefly Algorithm and preference learning, Sahathanavijayan et al. (2017) proposed an approach to generate Cloze questions from web pages. In this approach, a user-defined query retrieved all the web pages returned by the Google search API. Then only the text within the HTML paragraph tags was identified and extracted. This textual data was summarized and optimized using preference learning and the Firefly Algorithm. The summarized information comprised only those relevant sentences based on specific features: 1. the sentence length (Pino et al., 2008), 2. the absence of pronouns and adverbs, and 3. frequent co-occurring words. After tokenizing all words in these relevant sentences, only cardinal words, pronouns, or describing adjectives were substituted with blanks to generate Cloze questions. The limitation of the approach was that summarized sentences containing pronouns that were not resolved to their corresponding nouns generated irrelevant questions.

In the semantic-based approach, researchers used ontologies and their components, namely DL, SWRL, or SPARQL, to generate the different questions, using either existing or purpose-built ontologies. Initially, Papasalouros et al. (2008) made use of an ontology to generate MCQs by suggesting eleven strategies based on class, property, and terminology. The research lacked the technical details to clarify which strategy had to be used to generate each type of MCQ. Furthermore, the generated MCQs comprised questions with the stem 'Choose the correct sentence'. Stasaski and Hearst (2017) generated MCQs by randomly choosing a concept and exploring its outgoing links in an ontology. The concept and its outgoing links generated Wh-type questions with the individuals having specific data or object-type properties. This research generated factual questions with different MCQ stems but experimented only with the properties of the ontology.

In the medical domain, Leo et al. (2019) used an ontology to generate case-based multi-term MCQs whose answer was the diagnosis of the disease for a particular patient. The research used the Elsevier Merged Medical Taxonomy (EMMeT-OWL) ontology (Parsia et al., 2015), which had all the details about the terms, relations of clinical concepts, and annotations. The generated questions were rated by the reviewers as either too easy or too difficult and therefore could not be used in the medical exam.

Some researchers, like Cubric and Tosic (2011), extended domain ontologies by adding annotations and semantic interpretations between the target question ontology and the domain ontology. The prototype discussed in this research was an extension of the work of Holohan et al. (2005), which generated Wh-type stems at the knowledge, application, and analysis levels of Bloom's Taxonomy. The annotations and semantic interpretations were restrictions on the ontology, which helped to generate MCQs. Jelenković and Tošić (2015) implemented an automatic MCQ generator using an ontology, called OpenSeMCQ. Their implementation generated Wh-type questions to test domain knowledge, but the questions lacked the quality needed for actual use. Alsubait (2015) and Alsubait et al. (2012) utilized the TBox axioms of an ontology to generate MCQs in the direction of AQG. The former research generated definition questions, while the latter generated analogy questions. Both approaches generated questions assessing the factual cognitive level only.

Venugopal et al. (2016) proposed a method exploiting SPARQL rules and templates to frame questions of the type 'Choose an X which has object-type property with Y and data property Z?'. Venugopal and Kumar (2015) exploited an ontology's DL queries in terms of TBox and ABox axioms, generating Wh-type and Cloze questions for a non-technical domain. Along with these works, some researchers, like Zoumpatianos et al. (2011), made use of SWRL to create Cloze questions. The above three approaches defined the semantics of the question but failed to generate syntactically correct questions.

Summary of Related Work

Tables 3 and 4 show the existing works in the ontology and text-based approaches discussed above. From Tables 3 and 4, the following inferences can be derived:

  • The majority of the approaches have concentrated only on non-technical domains and therefore have no relevance in evaluating questions based on Bloom's Taxonomy

  • Among the approaches that focused on the technical domain, the generated Wh-type questions were limited to cognitive Level I

  • Furthermore, the Cloze questions generated by the existing ontology approaches are grammatically incorrect compared to the Cloze questions generated from ML approaches

Table 3 Existing ontology based approaches
Table 4 Existing machine-learning approaches

To summarize, in the existing literature there have been many efforts towards the automatic generation of MCQs, but only a few attempts towards developing MCQs in the technical domain that satisfy Bloom's Taxonomy. The text-based approaches generated grammatically correct Cloze questions, but most targeted the language learning domain. These approaches used essential features to identify the relevant sentences based on the grammar constructs of the language. Moreover, the transformation of relevant sentences into Cloze questions required only the substitution of a blank in the original sentence, so the questions were grammatically correct. Furthermore, Cloze questions generated by ML for the technical domain would not require evaluation for grammatical correctness, thereby reducing the cost of evaluation (Das & Majumder, 2017; Mostow & Jang, 2012). However, it is challenging to identify the features required to extract the relevant sentences for a technical domain from textual data. The semantic-based approach generates semantically rich questions from the different ontology constructs of DL or SWRL. However, research utilizing both the DL and SWRL constructs of an ontology to generate different MCQs for a technical domain has yet to be attempted. This research is a unique attempt to generate different MCQs using the structured representation of an ontology, along with unstructured textual data to generate Cloze questions, for a technical domain. In addition, the generated questions have been evaluated for satisfying the cognitive skills based on Bloom's Taxonomy, and domain experts have evaluated them on the required metrics.

Methodology

The proposed system utilizes both structured and unstructured knowledge resources to address the research objectives. In this context, the following sections present an overview of the proposed system, a working example, and the algorithms implemented to solve the problem.

Overview of the Proposed System

Figure 3 shows the proposed system to generate single-response MCQ stems automatically. The input PDF file is preprocessed automatically through a program to obtain a text file. The steps of the program, which reads the input PDF file and converts it to a text file, include the following (a code sketch follows the list):

  1. Convert the PDF file into DOCX (Microsoft Word Open XML Format Document) and read the DOCX file

  2. Identify the different sections of the file through XML tags

  3. Remove the irrelevant sections (figures, tables, exercises) and the sentences referring to them

  4. Segregate paragraphs into individual sentences using the Natural Language Toolkit (NLTK)

  5. Convert the DOCX file into a text file

  6. Send the text file to the subsequent OBT and MBT stages simultaneously

Fig. 3 Proposed system
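A rough sketch of this preprocessing pipeline is given below; the pdf2docx and python-docx libraries are assumptions (the paper does not name its tooling), and the filtering by style names and keywords only approximates the XML-tag-based section filtering described above.

```python
# Rough sketch of steps 1-6, assuming the pdf2docx and python-docx
# libraries (the paper does not name its tooling); the style-name and
# keyword filtering only approximates the XML-tag-based filtering.
import nltk
from pdf2docx import Converter
from docx import Document

nltk.download("punkt", quiet=True)

def pdf_to_sentences(pdf_path, docx_path, txt_path):
    cv = Converter(pdf_path)      # Step 1: PDF -> DOCX
    cv.convert(docx_path)
    cv.close()

    sentences = []
    for para in Document(docx_path).paragraphs:
        text = para.text.strip()
        # Steps 2-3: skip headings/captions and sentences that refer to
        # figures, tables, or exercises (a crude approximation)
        if not text or para.style.name.startswith(("Caption", "Heading")):
            continue
        if any(w in text.lower() for w in ("figure", "table", "exercise")):
            continue
        # Step 4: segregate the paragraph into sentences with NLTK
        sentences.extend(nltk.sent_tokenize(text))

    # Steps 5-6: write the text file consumed by OBT and MBT
    with open(txt_path, "w", encoding="utf-8") as f:
        f.write("\n".join(sentences))
    return sentences
```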

NLTK comprises a library of functions that perform Natural Language Processing (NLP) tasks on the input sentences, such as tokenization, chunking, and sentence segregation. The following sections give details on OBT and MBT to generate the different MCQs.

Ontology Based Technique – OBT

OBT, which generates Wh-type questions, involves three stages: Ontology Modeling, Instance Tree (ITree) Creation, and, lastly, Variable Representation and Wh-type Question Transformation.

Ontology Modeling

This stage encompasses the steps to build an ontology with its components: concepts, relations, instances, and axioms. With no existing ontology available on the given domain, an ontology is built semi-automatically using the Protégé software. For this construction, each sentence of the preprocessed text file is read and converted into a triple of the form <Subject, Predicate, Object>. This conversion is done automatically through the NLTK toolkit for every segmented sentence (a rough sketch is given below). The subject and object of each sentence form the concepts of the ontology, and the predicates represent the properties. DL or SWRL rules then add additional facts, instances, equivalent classes, and subset classes from the sentences into the ontology.
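The following is an illustrative sketch of the triple-extraction step, assuming NLTK POS tagging with a simple noun-phrase chunk grammar; the paper's actual extraction procedure is not published, so the grammar and heuristics here are assumptions.

```python
# Illustrative sketch of the sentence -> <Subject, Predicate, Object>
# conversion, assuming NLTK POS tagging and a simple noun-phrase chunk
# grammar; requires nltk.download('punkt') and
# nltk.download('averaged_perceptron_tagger').
import nltk

GRAMMAR = nltk.RegexpParser(r"NP: {<DT>?<JJ>*<NN.*>+}")

def sentence_to_triple(sentence):
    tags = nltk.pos_tag(nltk.word_tokenize(sentence))
    tree = GRAMMAR.parse(tags)
    nps = [" ".join(w for w, _ in st.leaves())
           for st in tree.subtrees(lambda t: t.label() == "NP")]
    verbs = [w for w, t in tags if t.startswith("VB")]
    if len(nps) >= 2 and verbs:
        return (nps[0], " ".join(verbs), nps[-1])  # <Subject, Predicate, Object>
    return None

# Expected (tagger-dependent): ('The kernel', 'controls', 'the hardware')
print(sentence_to_triple("The kernel controls the hardware"))
```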

Every rule comprises an antecedent and a consequent. A rule represents concepts or instances along with properties on the Left Hand Side (LHS), called the antecedent, and the Right Hand Side (RHS), called the consequent. This research uses DL rules to add TBox and ABox axioms. DL rules use either the subset operator (⊆, indicating that one class is a subset of another) or the equivalence operator (≡, denoting equivalent classes) to depict the required TBox axioms. Equivalent classes are two classes with different names but similar attributes. ABox axioms add the instances, datatype properties, and object properties into the ontology. Once these are added, constraints to be satisfied among classes are expressed using SWRL rules. SWRL rules use the implication operator (→) to represent constraints: whenever the antecedent is satisfied, the consequent holds. The naming convention of classes and properties follows a pattern required for the question generation, explained later.

A SWRL or DL rule, represented as SR in Eq. 2, comprises a set of variables $VR_p$, where $p$ ranges from 1 to $j$. Each $VR_p = (At_p, Ct_p)$, where $At_p$ is the antecedent and $Ct_p$ is the consequent, each consisting of classes or properties, as shown in Eq. 1. $At_p$ and $Ct_p$ are joined by implication (→) in SWRL, or by equivalence (≡) or subsumption (⊆) in DL. Equation 2 shows how each rule SR is a combination of the $VR_p$ together with the operators →, ≡, or ⊆.

$$VR_{p}=\left(At_{p}, Ct_{p}\right)\quad \text{where}\; 1 \le p \le j$$
(1)
$$SR = \left(VR_{1}, VR_{2}, \dots, VR_{j}\right)\ \text{joined by}\ \to \,/\, \equiv \,/\, \subseteq$$
(2)

Here, some sample axioms identified in the built ontology are explained. For simplicity and generalization, the naming conventions of classes and the representations of rules follow a pattern: classes are specified followed by the relations, so that the pattern can be easily extended and the system remains input-agnostic. Following this convention, the identified classes are 'OperatingSystem', 'Hardware', 'Interrupt', 'Software', 'SystemCalls', etc. The identified sub-classes are 'CPU' and 'Memory', both under the class 'Hardware', so 'Memory ⊆ Hardware' is a TBox axiom in the ontology. The identified TBox axiom for an equivalent class is 'MainMemory ≡ PhysicalMemory'. Similarly, the ABox axiom adding an instance is 'OperatingSystem(MS-DOS)'; for an object property it is 'isTriggeredBy(Interrupt, Software)'; and for a datatype property it is 'isVolatile(?Value)', where Value can be either true or false.

Some identified class attributes form the datatype properties 'hasSize', 'hasSpeed', and 'isVolatile' for the class 'Memory'. Relations or object-type properties represent the inter-class relationships between classes and individuals; e.g., 'controlsCoordinates' is an inter-class relation between the classes OperatingSystem and Hardware. To represent the fact 'An interrupt triggered by software is a system call', an SWRL constraint is added, given by 'Interrupt(?i), Software(?s), isTriggeredBy(?i, ?s) → SystemCalls(?i)'. The rule comprises the classes 'Interrupt' and 'Software' and the object-type property 'isTriggeredBy' on the antecedent side, while the consequent side comprises the class 'SystemCalls'. The letters in the SWRL rule symbolize instances of the ontology classes. Once all the ontology components along with the axioms are added, the rules are passed to the second stage of OBT, Instance Tree Creation.
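A sketch of how this SWRL constraint could be added programmatically is shown below, assuming the owlready2 library and a placeholder IRI; the paper adds rules through Protégé, so the tooling here is an assumption.

```python
# Sketch of adding the 'system call' constraint programmatically,
# assuming owlready2 (the paper adds rules via the Protégé SWRL tab).
from owlready2 import Thing, ObjectProperty, Imp, get_ontology

onto = get_ontology("http://example.org/os.owl")  # hypothetical IRI

with onto:
    class Interrupt(Thing): pass
    class Software(Thing): pass
    class SystemCalls(Thing): pass
    class isTriggeredBy(ObjectProperty):
        domain = [Interrupt]
        range = [Software]

    # 'An interrupt triggered by software is a system call'
    rule = Imp()
    rule.set_as_rule(
        "Interrupt(?i), Software(?s), isTriggeredBy(?i, ?s) -> SystemCalls(?i)")

# After running a reasoner (e.g. owlready2's sync_reasoner_pellet), any
# Interrupt instance linked to a Software instance via isTriggeredBy is
# classified as a SystemCalls instance.
```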

Instance Tree (ITree) Creation

Each DL and SWRL rule is transformed into a corresponding ITree; hence, for 'n' rules, this step yields 'n' ITrees. Each ITree is later converted into a Wh-type question according to its consequent. Initially, a root node is created for the consequent and named with the instance of the consequent. Next, child nodes holding the classes and relations for the same instance are added under the antecedent and consequent. The following section shows how ITrees are generated for sample SWRL and DL rules.

Sample SWRL Rules and Corresponding ITree Creation

SWRL Example 1: Hardware(?j) ∧ hasControlOver(?j, ?i) ∧ InputOutputDevices(?i) → DeviceController(?j)

The above rule adds a fact for the sentence 'THE HARDWARE THAT HAS CONTROL OVER INPUT OUTPUT DEVICES IS CALLED DEVICE CONTROLLER'.

In the rule given above, the class variables on the antecedent side are Hardware and InputOutputDevices, while DeviceController is the class variable on the consequent side. Here, i and j are instances, while 'hasControlOver' is an object-type property/inter-class relationship variable between the instances i and j. Initially, the instance in the consequent is taken, and a root node with its antecedent and consequent is created; so a root node with instance 'j' is created. Under the antecedent of 'j', the rule comprises the class variable 'Hardware' and the property variable 'hasControlOver' with instance 'i', while under the consequent of 'j', the rule comprises the class variable 'DeviceController'. As 'hasControlOver' has a relationship with 'i', an ITree for 'i' is created and added through steps similar to the ITree of 'j'. The ITree for SWRL Example 1 is shown in Fig. 4.

Fig. 4 Sample instance tree for SWRL Example 1
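For illustration, an ITree node could be represented by a small recursive structure such as the hypothetical sketch below; the field names are assumptions, as the paper does not publish its implementation.

```python
# Hypothetical representation of an ITree node (field names assumed;
# the paper does not publish its implementation).
from dataclasses import dataclass, field

@dataclass
class ITreeNode:
    instance: str                                    # e.g. 'j'
    antecedent: list = field(default_factory=list)   # class/property variables
    consequent: list = field(default_factory=list)
    children: dict = field(default_factory=dict)     # instance -> ITreeNode

# ITree for SWRL Example 1: root node for instance 'j'
root = ITreeNode("j",
                 antecedent=["Hardware", ("hasControlOver", "i")],
                 consequent=["DeviceController"])
root.children["i"] = ITreeNode("i", antecedent=["InputOutputDevices"])
```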

SWRL Example 2: AccessMethods(?m) ∧ hasSequentialAccess(?m, true) → SequentialAccess(?m)

The rule states the fact for the sentence 'ACCESS METHOD WHICH HAS SEQUENTIAL ACCESS IS CALLED SEQUENTIAL ACCESS'. The rule provides the fact for only one instance 'm' for which the ITree is created. The ITree creation for this SWRL rule follows the same steps mentioned for SWRL Example 1. Figure 5 shows the ITree generated for Example 2.

Fig. 5 Sample instance tree for SWRL Example 2

SWRL Example 3: SecondaryMemory(?m) → Volatile(?m, false)

This rule states the fact in the sentence 'SECONDARY MEMORY IS THE MEMORY WHICH IS NOT VOLATILE'. This rule is an example of using data type property on the consequent side of the SWRL rule. Figure 6 shows the ITree generated for Example 3.

Fig. 6 Sample instance tree for SWRL Example 3

Sample DL Rules and Corresponding ITree Creation

DL rules represent facts for either equivalent or subset classes, so the steps followed depend on the operator used. In the case of equivalence, since all instances satisfy the constraints of both classes, the DL rule does not explicitly specify instances in the expression. Therefore, in the ITree for a DL rule, 'e' is chosen as an arbitrary class instance for the root node, and, on similar lines, 'a' is chosen as an arbitrary class instance for the child node. Equivalence is shown by representing the concept equivalent to the class variable through the relationship variable 'theEquivalentNameOf'. In the case of a subset, the concept of the consequent is mentioned as a type variable: a type set comprising the different types under the antecedent of the root node 'e'.

DL Example 1: ShortTermScheduler ≡ CPUScheduler

The above example is a DL rule that states a fact for the sentence 'SHORT TERM SCHEDULER IS ALSO KNOWN AS CPU SCHEDULER'. In the rule given above, there are class variables on both the antecedent and consequent sides. This rule is an example of equivalence, so the instance tree is created with 'e' and 'a' as mentioned above. The ITree created for DL Example 1 is shown in Fig. 7.

Fig. 7 Sample instance tree for DL Example 1

DL Example 2: BatchOperatingSystem ∪ RealTimeOperatingSystem ∪ NetworkOperatingSystem ⊆ TypesOfOS

The above example states the fact for the sentence 'BATCH OPERATING SYSTEM, REAL TIME OPERATING SYSTEM, NETWORK OPERATING SYSTEM ARE DIFFERENT TYPES OF OS'. The DL rule states the fact using ∪, i.e., the union operator. So here, the ITree has a type set denoted in the antecedent part of the root node. This rule is an example of a set operation, and the ITree for Example 2 is shown in Fig. 8, with 'e' as the root node. At the end of this step, 'n' ITrees have been created and are passed to the third stage of OBT, Variable Representation and Wh-type Transformation.

Fig. 8 Sample instance tree for DL Example 2

Variable Representation and Wh-type Transformation

Each created ITree is traversed in this step, and different question stems are generated based on the consequent or antecedent. The consequent of a SWRL or DL rule can be any of the following:

  • 'Type variable'—in case of subset operator

  • 'Property variable' with 'Class variable' on the antecedent

  • 'Class variable' with antecedent 'theEquivalentNameOf'

  • 'Class variable' with antecedent having 'Relationship variable'

  • 'Class variable' with antecedent with 'Property variable'

If the consequent is a 'Type variable', then the Odd one out question stem generated is 'Which among the following is not a Type variable?'. If the consequent is a 'Property variable', then the T/F question stem generated is 'What is the Property variable value for Class variable?'. For the other rules, the consequent is a 'Class variable' but with a different antecedent, so the question stem is generated based on the antecedent. If the antecedent is 'theEquivalentNameOf', the Synonym question stem generated is 'What is the equivalent name of Class variable?'. If the antecedent has a 'Relationship variable' or 'Property variable', then the child node's corresponding class variables and property variables are added to the Class_List and Prop_List, respectively, and question stems are generated based on the values in these lists. The traversal runs from the first class variable encountered in the antecedent of the root node to the last class variable in the child node.

Class_List

This is a list comprising all class variables on the antecedent side. The list is constructed progressively as a new class variable is encountered for the given SWRL rule to generate a subject clause for each question stem.

Prop_List

This is a list comprising all data and object type/relationship type property variables encountered in the antecedents or consequent of the SWRL rule. This list is constructed for every new property encountered in a given rule for each class variable in the Class_List. This list helps to generate the predicate conditions for a given question stem between the two class variables.

If there is only one property variable with only one class variable in the antecedent, then a Definition question stem is generated: 'What is Class variable?'. For cases with more than one class variable in the Class_List, a One-word question stem is generated, e.g., 'What is the Class variable that has Property variable with Class variable?'. The algorithm 'GenerateQuestionsFromRules' for generating the different MCQ stems from the rules is shown in the next section. This algorithm covers the second and third stages after Ontology Modeling, transforming ontology rules into Wh-type MCQ stems. After the algorithm, the following section shows a sample example that follows the steps of 'GenerateQuestionsFromRules'.

Algorithm

Algorithm 1 GenerateQuestionsFromRules
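For illustration, the simplified Python sketch below reproduces only the One-word-answer branch of the transformation for rules of the form Class(?x), prop(?x, ?y), Class2(?y) → Class3(?x); the parsing, the Class_List/Prop_List handling, and all function names are hypothetical reductions of Algorithm 1.

```python
# Simplified sketch of the One-word-answer branch of Algorithm 1 for
# rules of the form Class(?x), prop(?x, ?y), Class2(?y) -> Class3(?x);
# all names are hypothetical.
import re

def camel_to_words(name):
    # 'hasControlOver' -> 'has control over'
    return re.sub(r"(?<!^)(?=[A-Z])", " ", name).lower()

def question_from_swrl(rule):
    antecedent, _consequent = [p.strip() for p in rule.split("->")]
    atoms = re.findall(r"(\w+)\(([^)]*)\)", antecedent)

    parts = ["What is"]
    for name, _args in atoms:
        if name[0].isupper():                 # class variable -> Class_List
            parts.append("a/an " + camel_to_words(name))
        else:                                 # property variable -> Prop_List
            parts.append("that " + camel_to_words(name))
    return " ".join(parts).upper() + "?"

rule = "Hardware(?j), hasControlOver(?j, ?i), InputOutputDevices(?i) -> DeviceController(?j)"
print(question_from_swrl(rule))
# WHAT IS A/AN HARDWARE THAT HAS CONTROL OVER A/AN INPUT OUTPUT DEVICES?
```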

Sample Example

The sample example shows how one ITree is taken and traversed to transform it into its equivalent Wh-type MCQ. Considering the ITree in Fig. 4, generated for SWRL Example 1, its corresponding Wh-type question is generated as follows:

Step 1: Extract the consequent of the rule and check its value

Step 2: The consequent corresponds to a Class variable along with a Relationship variable with another instance

Step 3: So str_expr → 'What is '

Step 4: Then Class_List → {Hardware}

Step 5: Appending to the string expression: str_expr → str_expr + 'a/an ' + 'Hardware'

Step 6: Then Prop_List → {hasControlOver}

Step 7: Appending to the string expression: str_expr → str_expr + 'that ' + 'hasControlOver'

Step 8: Again Class_List → {InputOutputDevices}

Step 9: Appending to the string expression: str_expr → str_expr + 'a/an ' + 'InputOutputDevices'

Step 10: Finally, the One-word answer question is generated: WHAT IS A/AN HARDWARE THAT HAS CONTROL OVER INPUT OUTPUT DEVICES?

Similarly, the ITree in Fig. 5, corresponding to SWRL Example 2, has one entry each in Class_List and Prop_List: Class_List = {AccessMethods} and Prop_List = {hasSequentialAccess}. So a Definition question is generated by the algorithm, i.e., WHAT IS SEQUENTIAL ACCESS?. The algorithm generates the T/F question WHAT IS THE VOLATILE VALUE FOR SECONDARY MEMORY? for the ITree in Fig. 6, corresponding to SWRL Example 3. Similarly, for the ITree in Fig. 7, the algorithm transforms the rule into the Synonym question WHAT IS THE EQUIVALENT NAME OF SHORT TERM SCHEDULER?. Lastly, for the ITree in Fig. 8, the algorithm generates the Odd one out question WHICH AMONG THE FOLLOWING IS NOT A/AN OPERATING SYSTEM? for DL Example 2.

Machine-learning Based Technique – MBT

The preprocessed text file is given to MBT to generate Cloze questions in three stages: Sentence Selection, Key Sentence Extraction, and Keyword Selection.

Sentence Selection

Sentence selection uses words from the Topic_list to identify relevant sentences in the input text file. The Topic_list is a list of relevant and important topics for the domain, created manually by a subject expert for use in MBT. The subject expert is a professor who has taught the Operating Systems course repeatedly and who identifies and keeps topics from the manually prepared MCQs of the course in the Topic_list. An input sentence is rejected if it does not contain a topic word. This stage produces the initial list of sentences sent to the second stage.
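A minimal sketch of this filter is shown below; the topics are taken from the sample example later in the paper, and the simple substring match is an assumption.

```python
# Minimal sketch of the Topic_list filter; the topics are taken from
# the sample example later in the paper, and the simple substring
# match is an assumption.
topic_list = {"execution", "binding", "physical address", "memory"}

def select_sentences(sentences, topics=topic_list):
    # Keep only sentences mentioning at least one expert-curated topic
    return [s for s in sentences
            if any(t in s.lower() for t in topics)]
```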

Key Sentences Extraction

The second stage classifies and identifies the key sentences from the sentences obtained in the first stage. For this, five classifiers are used, so that the classifier which outputs the maximum number of key sentences from the total set can be chosen; since the education domain needs the maximum number of relevant questions, this research takes the number of key sentences as the selection criterion. The classifiers are trained with the features in the Feature_set, and the sentences satisfying at least one feature are chosen as key sentences in this stage.

Feature_set

The Feature_set comprises the following features (a feature-extraction sketch follows the list):

  • Sentence Length (SL): Very short sentences lack context, while very long sentences are too complex to be chosen for Cloze questions (Pino et al., 2008). The proposed research takes the SL between 100 and 200 characters (Correia et al., 2012).

  • Nouns (NN): According to Agarwal and Mannem (2011), a greater number of nouns in a sentence increases its contextual information. Hence this feature counts the nouns in the sentence; a sentence rich in nouns is a good candidate for generating Cloze questions.

  • Abbreviations (ABR): A binary feature (T/F) checking whether the sentence contains an abbreviation (Agarwal & Mannem, 2011). The presence or absence of an ABR indicates the importance of the sentence.

  • Appendix Words (AW): The appendix at the end of a book lists important words as topics or sub-topics, which are used to identify relevant sentences. This binary feature (T/F) determines whether the sentence contains any AW from the ebook's appendix; sentences containing AW are good candidates for generating Cloze questions.
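A minimal sketch of the Feature_set extraction for a single sentence is given below, assuming NLTK for tokenization and POS tagging; the abbreviation test (a run of capital letters) and the appendix-word lookup are simplifications.

```python
# Sketch of Feature_set extraction for one sentence, assuming NLTK
# (requires the 'punkt' and 'averaged_perceptron_tagger' resources);
# the ABR test and the appendix-word lookup are simplifications.
import re
import nltk

def extract_features(sentence, appendix_words):
    tokens = nltk.word_tokenize(sentence)
    tags = nltk.pos_tag(tokens)
    return {
        "SL": 100 <= len(sentence) <= 200,                          # length window
        "NN": sum(1 for _, t in tags if t.startswith("NN")),        # noun count
        "ABR": any(re.fullmatch(r"[A-Z]{2,}", w) for w in tokens),  # abbreviation
        "AW": any(w.lower() in appendix_words for w in tokens),     # appendix word
    }
```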

With these features, the set of initial sentences from MBT's first stage is used to train the five classifiers: SVM, K-Nearest Neighbors (KNN), Decision Tree (DT), Gaussian Naive Bayes (GNB), and Logistic Regression (LR). The classifiers are trained and tested with a 75%-25% split, which is preferred over cross-validation because cross-validation is expensive in terms of the required time and computational capacity. Among the models, the most accurate classifier is the one providing the maximum number of extracted Key_sentences. The Key_sentences identified based on the technical domain features for the given course are then passed to the third stage of MBT, Keyword Selection.
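A sketch of this model-selection step is shown below, assuming scikit-learn and a feature matrix X with labels y built from the Feature_set; following the paper's criterion, the winner is the model that flags the most key sentences.

```python
# Sketch of the classifier comparison, assuming scikit-learn and a
# binary feature matrix X with labels y built from the Feature_set;
# the winner is the model flagging the most key sentences, per the
# paper's criterion.
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

def best_classifier(X, y):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
    models = [SVC(), KNeighborsClassifier(), DecisionTreeClassifier(),
              GaussianNB(), LogisticRegression(max_iter=1000)]
    best, best_count = None, -1
    for m in models:
        m.fit(X_tr, y_tr)
        count = int(m.predict(X).sum())   # sentences flagged as key sentences
        if count > best_count:
            best, best_count = m, count
    return best
```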

Keywords Selection

The third stage, keyword selection, identifies the target words to be replaced by blanks, thereby transforming Key_sentences into Cloze questions. For each key sentence, the following steps are carried out: 1. selecting words, 2. extracting word attributes, 3. calculating word probabilities, and 4. sorting the word probabilities. In word selection, all words that are not stopwords are added to the Word_list. Then, in word attribute extraction, the attributes of each word in the Word_list are extracted. The attributes include the Part-Of-Speech (POS) of the word, whether the word is a Named Entity (NE), the syntactic dependency relation of the word in the sentence (DEP), and the type of NE, i.e., whether it is a location, name, etc. The attribute features of each word are added to the Word_att list.

These features are converted to binary form and used in a Naive Bayes classifier, which accounts for the absence or presence of each feature in each word. Naive Bayes computes the probability of each word and places the word along with its probability into the Word_prd list. Word_prd is a list comprising each word and its probability of being replaced by a blank in the Key_sentences, computed based on Word_att. Lastly, the Word_prd list is sorted in descending order and used to generate Cloze questions. A Cloze question stem is generated from each key sentence by substituting a blank for a word from the Word_prd list. So if there are 'n' key sentences and each key sentence has 'm' words in the Word_prd list, there would be n * m Cloze question stems. These questions need a manual evaluation of usefulness, so to keep the evaluation sound while reducing the cumbersome human effort, the number of Cloze questions generated from each key sentence is limited by choosing only the first three words from the Word_prd list; thereby the system generates three Cloze questions per key sentence. The algorithm 'GenerateClozeQuestions' for generating the Cloze question stems from the preprocessed sentences is shown in the next section. This algorithm covers all three stages of MBT, which transform the preprocessed sentences into Cloze question stems. After this, the following section shows a sample example that follows the steps of 'GenerateClozeQuestions' for a sample Topic_list.

Algorithm

Algorithm 2 GenerateClozeQuestions
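The following sketch illustrates the third stage under simplifying assumptions: a pre-trained Bernoulli Naive Bayes model is assumed, Word_att is reduced to three binary attributes, and spaCy is assumed to supply the POS, NE, and dependency information (the paper does not name its NLP tooling for this step).

```python
# Sketch of the third MBT stage under simplifying assumptions: a
# pre-trained Bernoulli Naive Bayes model is assumed, Word_att is
# reduced to three binary attributes, and spaCy supplies POS, NE, and
# dependency information (requires the en_core_web_sm model and
# nltk.download('stopwords')).
import spacy
from nltk.corpus import stopwords

nlp = spacy.load("en_core_web_sm")

def cloze_questions(sentence, nb_model, top_k=3):
    doc = nlp(sentence)
    stops = set(stopwords.words("english"))
    candidates = [t for t in doc if t.is_alpha and t.text.lower() not in stops]
    # Word_att: binary attributes per word (POS, NE flag, dependency)
    X = [[int(t.pos_ == "NOUN"), int(t.ent_type_ != ""), int(t.dep_ == "nsubj")]
         for t in candidates]
    probs = nb_model.predict_proba(X)[:, 1]     # P(word should be blanked)
    ranked = sorted(zip(candidates, probs), key=lambda p: -p[1])
    # Blank out each of the top-k words (naive first-occurrence replace)
    return [sentence.replace(t.text, "________", 1) for t, _ in ranked[:top_k]]
```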

Sample Example

Considering a sample Topic_list = {execution, binding, physical address, memory} for the given dataset, the system generates Cloze questions as shown in the following steps.

Step 1: Sentences selected

  • S1—'If execution-time binding is being used, however, then a process can be swapped into a different memory space because the physical addresses are computed during execution time.'

  • S2—'The relocation register contains the value of the smallest physical address'

  • S3—'When a process is allocated space, a process is loaded into memory, and a process can then compete for CPU time'

  • S4—'Another possible solution to the external-fragmentation problem is to permit the logical address space of the processes to be noncontiguous, thus allowing a process to be allocated physical memory wherever such memory is available'

Step 2: Key sentences selected based on Feature_set

  • S1—Selected as SL criteria met and AW present in the sentence

  • S2—Discarded as SL criteria not met

  • S3—Selected as SL criteria met and contains ABR

  • S4—Discarded as SL criteria not met

Step 3: Key Word selection for Key sentence S1

  • Word_List = {execution, time, binding, process, swapped, different, memory, space, physical, addresses, computed}

Step 4: Sorted Word_prd = {binding—0.8, swapped—0.74, addresses—0.71, process—0.69, computed—0.68, different—0.61, physical—0.6, space—0.59, time—0.58, execution—0.55, memory—0.54}

Step 5: Selecting the first 3 words from Word_prd: binding, swapped, and addresses

Step 6: Cloze questions generated by replacing the words binding, swapped, and addresses with a blank

  • If execution-time ________ is being used, however, then a process can be swapped into a different memory space, because the physical addresses are computed during execution time.

  • If execution-time binding is being used, however, then a process can be _____________ into a different memory space, because the physical addresses are computed during execution time.

  • If execution-time binding is being used, however, then a process can be swapped into a different memory space, because the physical ______ are computed during execution time.

Results of Experiment and Analysis

For the proposed approach, a course from the technical domain is used; OBT and MBT generate the Wh-type and Cloze questions, which combine to give the entire set of MCQ stems.

Dataset

The PDF file of the course is downloaded from the ebook Operating System Principles (Silberschatz et al., 2006). The entire dataset of the book encompasses 23 chapters, each comprising roughly 30 to 40 pages, with each page consisting of 20 to 30 sentences. The experiment is conducted by randomly taking four chapters, comprising approximately 4500 sentences, as input to the system. After preprocessing, the dataset of approximately 2900 sentences is input to OBT and MBT.

Experiment

OBT

In this approach, with no pre-existing ontology available online for the Operating System course, one of the authors built the required OS ontology from the preprocessed file. Developing the OS ontology took about 25 to 30 days. In terms of size, the OS ontology comprises 57 object properties, 10 data properties, and 344 classes. The ontology models the concept 'Computer' and its components, such as 'ApplicationPrograms', 'Hardware', 'Interrupt', 'OperatingSystem', 'Software', and 'Users'. Figure 9 shows a sample of the constructed ontology. As shown in Fig. 9, there are various components under the 'Kernel' concept depicting terms such as 'ProcessManagement', 'MemoryManagement', 'ProcessScheduling', 'Protection', 'Security', 'TypesOfOperatingSystems', 'SystemSoftware', 'FileSystems', and so on. In the ontology file, the naming convention of classes and properties capitalizes the initial letter of each word when words are combined; e.g., when algorithm 1 encounters 'SequentialAccess', it understands the nomenclature and produces 'Sequential Access' while generating the Wh-type question.

Fig. 9 Sample of operating system ontology

This section explains some of the modeled TBox and ABox axioms shown in Tables 5 and 6. Under 'Computer' there are sub-concepts like 'ApplicationPrograms', 'Hardware', 'OperatingSystem', and 'Interrupt', seen in Table 5 in 1a, 1b, 1c, and 1d. The concepts 'DirectAccess' and 'SequentialAccess' are sub-concepts under 'AccessMethods', as shown in Table 5 in 1e, and similarly for 1f and 1g. The concept 'SecondaryMemory', a nonvolatile part of Memory, is shown in Table 5 in 2a. Meanwhile, 2b and 2c in Table 5 show that the concept 'ProcessIdentifier' is equivalently called 'PID' and that 'ShortTermScheduler' is also known as 'CpuScheduler'.

Table 5 Sample of TBox axioms of OS ontology
Table 6 Sample of ABox axioms of OS ontology

The ABox axioms depict the instances of the concepts: Spreadsheets are a kind of ApplicationPrograms, while FirstComeFirstServe is an instance of NonPreemptiveSchedulingAlgorithm; these axioms are represented in Table 6 in 1a and 1b. Entry 2a in Table 6 represents the object-type property 'isUsedBy' between Hardware and ApplicationPrograms, stating that application programs use hardware. The object-type properties represented in 2b, 2c, and 2d of Table 6 are 'isTriggeredBy', 'controlsCoordinates', and 'hasDirectAccess'. The datatype property 'isVolatile(?Value)' in entry 3 of Table 6 denotes the boolean value of the attribute 'isVolatile'.

Table 7 shows a sample of the SWRL rules of the constructed ontology. In Tables 5 and 7, each strategy is given a reference tag (TYPE A, TYPE B, TYPE C, TYPE D, and TYPE E) to aid understanding. TYPE A pertains to the subset notation, while TYPE B represents the equivalence used in DL, as shown in Table 5. TYPE C represents a datatype property in the consequent, denoting the concept 'SecondaryMemory' with the datatype property 'hasVolatile' having the value False. TYPE D represents a datatype property in the antecedent, representing that 'The process in the main memory is in the ready state'. TYPE E denotes a relationship or object property in the antecedent used in SWRL, as shown in Table 7; this rule represents the constraint that 'The scheduler which selects the process from the secondary memory and loads it to the main memory is called the Long Term Scheduler'.

Table 7 SWRL rules of OS ontology

The rules and axioms in Tables 5, 6 and 7 are used to create the ITrees and then generate Wh-type questions from the ontology, as discussed in the "Ontology Based Technique – OBT" section. There are a total of 126 rules, of which 87 are SWRL rules while the remaining 39 are DL rules stating TBox and ABox axioms. Of the 39 DL rules, 31 pertain to TYPE A, while the rest belong to TYPE B. Among the 87 SWRL rules, two belong to TYPE C, five cater to TYPE D, and the remaining 80 pertain to TYPE E. Each rule of the ontology generates one question, so the 126 rules yield 126 questions from OBT.

Table 8 shows the classification of the total number of Wh-type question stems generated, based on the strategies. Each generated question stem has been classified according to the type of question and whether it satisfies the required cognitive level based on Bloom's Taxonomy, as shown in Table 9. All generated Wh-type question stems satisfy the Level I and Level II cognitive levels. The question type classification has been done based on Tables 1 and 2.

Table 8 Total number of questions from OBT
Table 9 Sample of Wh-type Questions generated and classified based on Bloom’s Taxonomy

MBT

The MBT uses the dataset of 2900 preprocessed sentences. Based on the manually generated Topic_list, the first stage of sentence selection selects 1100 sentences as input for key sentence identification. In the second stage, the different classification models shown in algorithm 2 are trained with the Feature_set, and each model outputs its Key_sentences. The classifiers are trained by splitting the dataset in a 75-25% ratio, so of the 1100 sentences in the sample dataset, 825 form the training set while the remaining 275 form the testing set. The accurate classifier is the one that outputs the maximum number of Key_sentences; here, SVM is the accurate classifier, as it outputs the maximum number of Key_sentences compared to the other classification models, shown in Table 10a. Table 10b shows the results of the SVM classifier on all criteria in terms of a confusion matrix.

Table 10 MBT classifier’s results

The system is programmed to generate three Cloze questions per key sentence. To reduce the overhead in time complexity, only four sections of one chapter, different from the original dataset, are randomly chosen for Cloze question generation. The system uses SVM as the accurate classifier and, from an input of 103 sentences, identifies 79 Key_sentences in the second stage. The keywords are identified based on the Word_prd of each word, and 3 Cloze questions are generated for each Key_sentence. Therefore, in the third stage, MBT gives an output of 237 Cloze questions.

Figure 10 shows a sample of the Cloze question stems and keys generated in MBT as per algorithm 2. The Cloze questions cater to the recalling level of Bloom's Taxonomy, categorized based on Tables 1 and 2. The proposed system generates 237 Cloze questions through MBT and 126 Wh-type questions through OBT, making a total of 363 questions from the chosen dataset.

Fig. 10 Sample of cloze questions generated

Evaluation

In this section, an empirical evaluation is done to check the grammatical correctness and usefulness of the question items generated by the proposed system. Additionally, the OBT and MBT algorithms are verified by generating questions across various technical courses.

Manual Evaluation

Five domain experts have manually evaluated all 363 generated questions. The evaluators are subject experts who teach the course to students in the college. The entire set of MCQ stems, each with its correct answer (key), is given to the evaluators, together with guidelines for rating the set on a 3-point Likert scale. Table 11 tabulates the evaluators' points for the questions.

Table 11 3-point Likert scale evaluation

The highest score on the Likert scale indicates that a question satisfies the criterion, whereas the lowest score indicates that it does not. Each question in the question set is provided with check boxes and guidelines. The guidelines help the evaluator judge the question against each criterion and assign points accordingly.

  • Useful: Select the stem if it is useful for assessment of the domain. Such questions are given 3 points.

  • Useful with modified blank: Applicable only to Cloze question stems. Select the generated Cloze question if the blank should be a multi-word phrase rather than the single word chosen by Algorithm 2. Figure 11 shows an example of this parameter. Such questions are given 2 points.

  • Useless: For a Cloze question, select the stem if a verb has been chosen as the blank; for a Wh-type question, select the stem if it cannot be used for assessment. Such questions are given 1 point.

Fig. 11 Question ranking done by experts concerning the Usefulness parameter

The Wh-type stems are also evaluated on one additional parameter, grammar, with the following guidelines:

  • Correct Grammar: Select the stem if it is grammatically correct. Such questions are given 3 points.

  • Major modifications in Grammar: Select the stem if significant modifications to the grammar of the question are required, i.e., cases where one or more connectives are missing. Such questions are given 2 points.

  • Incorrect Grammar: Select the stem if it is grammatically incorrect, i.e., the question structure does not convey any meaning. Such questions are given 1 point.

Wh-type questions are rated useless only when they cannot be administered for assessment. Furthermore, Wh-type questions rated as needing significant grammar modifications can be administered once a subject expert modifies them. Therefore, for OBT, only the Wh-type questions rated 3 points are considered. In MBT, Cloze questions rated Useful with modified blank can be administered as MCQs but pertain to the Level I cognitive level of Bloom's Taxonomy; if modified so that two consecutive words are substituted by the blank, these questions can cater to the Level II cognitive level. Hence, in the MBT approach, Cloze questions rated 3 and 2 points are both considered useful. The reliability between the evaluators is tested using inter-rater reliability scores. Cohen's Kappa coefficient gives the inter-rater agreement between two evaluators for each metric (Cohen, 1968). In the proposed system, the five evaluators are compared pairwise over all questions on the same metric; therefore, scores are computed for the ten pairs of evaluators for both the usefulness and grammatical correctness criteria.

The coefficient for the usefulness criterion is computed separately for the Cloze and Wh-type questions. Table 12 provides the weighted Cohen's Kappa coefficients for the experiment. Per these values, the evaluators are in moderate agreement on the usefulness parameter for Wh-type questions, in substantial agreement on the grammar parameter for Wh-type questions, and in substantial agreement on the usefulness parameter for Cloze questions.
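
A minimal sketch of this computation is given below, assuming scikit-learn's cohen_kappa_score with linear weights and a simple list-of-lists layout for the ratings; the data layout and toy numbers are illustrative, not the study's data.

```python
# Minimal sketch: weighted Cohen's Kappa for every pair of the five
# evaluators on one criterion. ratings[i] holds the Likert points (1-3)
# evaluator i gave to each question.

from itertools import combinations
from sklearn.metrics import cohen_kappa_score

def pairwise_kappas(ratings):
    """Return the weighted kappa for each of the C(5,2) = 10 evaluator pairs."""
    return {
        (i, j): cohen_kappa_score(ratings[i], ratings[j], weights="linear")
        for i, j in combinations(range(len(ratings)), 2)
    }

# Toy example with five evaluators rating four questions:
ratings = [
    [3, 2, 3, 1],
    [3, 2, 3, 1],
    [3, 3, 3, 1],
    [2, 2, 3, 1],
    [3, 2, 2, 1],
]
for pair, kappa in pairwise_kappas(ratings).items():
    print(pair, round(kappa, 2))
```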

Table 12 Weighted Cohen's Kappa coefficients

Results

Table 13 shows sample Cloze questions with their evaluation ratings. Based on the manual evaluation, Table 14 shows how many of the 237 Cloze questions satisfied the overall usefulness metric.

Table 13 Sample of Cloze questions with evaluators’ ratings
Table 14 Manual evaluation of 237 Cloze questions

Table 15 shows different questions generated from OBT together with the experts' evaluation ratings. The manual evaluation of the 126 Wh-type questions generated from OBT by five domain experts, based on the Usefulness and Grammar metrics, is shown in Table 16. A question is identified as useful or grammatically correct only if at least three evaluators have categorized it as such. Based on this threshold, the number of questions satisfying each metric is tabulated in Table 17. Here the usefulness parameter for Cloze questions counts all stems rated Useful and Useful with modified blank, as both are useful; the latter merely require multi-word phrases to replace the blanks.
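
The acceptance threshold can be stated compactly; the following sketch (function name hypothetical) encodes it, with ratings of 3 and 2 both counting as useful for Cloze questions per the scheme above.

```python
# Minimal sketch of the acceptance threshold: a question satisfies a
# metric only if at least 3 of the 5 evaluators categorized it as such.
# For Cloze usefulness, points 3 (Useful) and 2 (Useful with modified
# blank) both count as useful votes.

def satisfies_metric(points, threshold=3, useful_values=(2, 3)):
    """points: the five evaluators' Likert points for one question."""
    return sum(p in useful_values for p in points) >= threshold

print(satisfies_metric([3, 2, 1, 3, 3]))  # True: four useful votes
print(satisfies_metric([1, 2, 1, 3, 1]))  # False: only two useful votes
```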

Table 15 Sample of Wh-type questions with evaluators’ ratings
Table 16 Manual evaluation of 126 Wh-type questions
Table 17 Summary of results for both techniques

As seen in Table 17, the manual evaluation shows that both the Cloze and Wh-type questions generated by the proposed approach are promising in terms of usefulness and grammatical correctness and can be used to assess courses in the technical domain.

MCQs Generation with Different Datasets

The applicability of the approach in a different context is checked by generating questions in OBT from two ontologies available online. One ontology is from a non-technical domain and contains SWRL rules, while the other is a pre-existing ontology built on a technical-domain course without SWRL rules. The ontology with SWRL rules is the Family Health History ontology (Peace, 2009), downloaded from the BioPortal site (one of the repositories of ontologies), which represents the health history conditions of family members. Its SWRL rules fall into two groups: one stating biological relationships across three generations based on parentage, and another stating family health history based on personal health findings. There are about 160 rules, more than 400 classes, one datatype property, and more than 150 object properties. Table 18 shows sample rules from the ontology and the questions the proposed approach generates for each of them.

Table 18 Sample of Rules and Questions generated from the Family Health History Ontology

As seen in Table 18, the generated questions query the relationships encoded in the SWRL rules; they are general questions pertaining to the SWRL rules added to the ontology. Moreover, the nomenclature of the ontology concepts has also yielded grammatically correct questions. The other ontology, from a technical domain, is the Data Structures and Algorithms (DSA) ontology, a preliminary ontology built and used for this comparison. Ten SWRL rules and six DL rules were manually added to this ontology. With the manually created rules, Table 19 shows sample questions for the DSA ontology generated by the proposed approach. As seen in Table 19, the added rules have generated all five types of Wh-questions. The nomenclature in this ontology follows the convention of the proposed approach, thereby giving grammatically correct questions. Faculty experts evaluated all 16 questions for the course and rated them useful for real-time administration.
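
As an illustration of how such rules can be attached to a pre-existing ontology, the sketch below uses owlready2; the file path, class names (DataStructure, Stack), property (hasOrdering), and individual (LIFO) are hypothetical placeholders for the DSA ontology's actual vocabulary, which must already be defined for the rule to parse.

```python
# Minimal sketch (hypothetical vocabulary): adding one SWRL rule to a
# pre-existing ontology with owlready2 so that OBT can generate Wh-type
# questions from it.

from owlready2 import get_ontology, Imp

onto = get_ontology("file://dsa.owl").load()  # placeholder path

with onto:
    rule = Imp()
    # Hypothetical rule: a data structure with LIFO ordering is a Stack.
    rule.set_as_rule("DataStructure(?d), hasOrdering(?d, LIFO) -> Stack(?d)")

onto.save(file="dsa_with_rules.owl")
```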

Table 19 Sample of questions generated from DSA ontology

To verify the MBT algorithm, Cloze questions are generated from a Computer Organization course. The PDF file of the course is preprocessed, key sentences are identified by the SVM, and these are then transformed into Cloze questions. From an input of 131 sentences, 98 are identified as key sentences, generating 294 Cloze questions. Figure 12 shows different Cloze questions generated from the Computer Organization course through MBT.
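
For context, a minimal sketch of this preprocessing step is shown below, assuming pypdf for text extraction and a naive regex sentence splitter; the file name is a placeholder, and the paper's actual preprocessing pipeline is more elaborate.

```python
# Minimal sketch: turn the course PDF into the sentence list that feeds
# the MBT pipeline.

import re
from pypdf import PdfReader

def pdf_to_sentences(path):
    text = " ".join(page.extract_text() or "" for page in PdfReader(path).pages)
    text = re.sub(r"\s+", " ", text)  # collapse whitespace across pages
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

sentences = pdf_to_sentences("computer_organization.pdf")
print(len(sentences), "sentences extracted")
```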

Fig. 12 Sample of Cloze questions generated from the “Computer Organization” ebook

An evaluation of about ten of the generated Cloze questions by a subject expert showed that these Cloze questions are also useful in real-time applications. Based on this observation, the proposed system can be adapted to any course in the technical domain to generate Cloze and Wh-type questions that satisfy the Level I and Level II cognitive skills of Bloom's Taxonomy. Moreover, the system can generate questions from a pre-existing ontology, provided certain modifications are made to the naming convention used for the ontology components.

Comparison of the Proposed Approach with Existing Approaches

Compared with the approaches listed in Tables 3 and 4, the proposed approach attempts to generate MCQs in the technical domain, which existing works rarely address. The research generates five types of MCQ stems that are better in relevance and grammatical correctness. Furthermore, the questions generated in this research pertain to either the Level I or Level II cognitive criteria. Moreover, the proposed approach is a unique attempt to combine both techniques, exploiting their respective advantages in generating different question types. In addition, based on the human evaluation, the questions generated in this research can also be administered in real-time scenarios. Table 20 compares the proposed approach with the existing approaches listed in Tables 3 and 4.

Table 20 Comparison of the proposed approach with existing approaches listed in Tables 3 and 4

Conclusion

E-learning facilitates the use of MCQs as an assessment tool for the concepts learned by examinees. MCQs can be evaluated quickly and administered to a vast set of examinees at a given time. In addition, MCQs have become very popular in current e-learning systems because they can assess examinees' different cognitive skills. However, the manual construction of MCQs is cumbersome and error-prone. As a result, the automatic generation of such questionnaires has gained popularity in recent decades to retain the benefits of MCQs. Existing efforts, however, rarely generate such questionnaires for a technical domain. Since semantic-based approaches apply semantics, they can automatically generate grammatically correct Wh-type questions, unlike ML approaches. Conversely, given the same input data, ML approaches perform better than semantic-based approaches at automatically generating Cloze questions. Moreover, existing research has largely not evaluated the generated questionnaires against the cognitive skills of Bloom's Taxonomy.

This research proposes a hybrid method using OBT and MBT to generate MCQ stems automatically that can assess cognitive skills based on Bloom's Taxonomy. The novel framework using OBT and MBT can generate grammatically correct Cloze questions, compared to existing works that utilize only the ML technique. The proposed approach demonstrates that using DL and SWRL rules in an ontology generates five different Wh-type question stems. The approach has been evaluated on a synthetic ontology built primarily for the experiment, along with a few other ontologies. The experiments show that the approach is capable of generating MCQ stems that are both useful and grammatically correct, as judged by the domain experts. An empirical study also generates Cloze and Wh-type questions from a real-time dataset, and hence the system can be adapted to assess students.

The proposed approach using OBT generates Wh-type questions comprising a single concept in the consequent; experimenting with multiple concepts in the consequent is a possible extension. Using MBT, the framework generates reasonably good Cloze questions, except for a few cases where a multi-word phrase needs to replace the blank to make the Cloze question more useful. This research has targeted only the generation of MCQ stems with their corresponding keys. To use these MCQ stems, distractors need to be generated, which is an exciting task to pursue in future research. In addition, there is scope for aligning the questions with the course objectives. Finally, a limitation observed in this research is the initial time required to create an ontology.