How Useful Are Educational Questions Generated by Large Language Models?

  • Conference paper
Part of the book series: Communications in Computer and Information Science (CCIS, volume 1831)

Abstract

Controllable text generation (CTG) by large language models has huge potential to transform education for teachers and students alike. Specifically, high-quality and diverse question generation can dramatically reduce the load on teachers and improve the quality of their educational content. Recent work in this domain has made progress with generation, but has not shown whether real teachers judge the generated questions to be sufficiently useful for the classroom setting, or whether the questions instead contain errors or pedagogically unhelpful content. We conduct a human evaluation with teachers to assess the quality and usefulness of outputs from combining CTG and question taxonomies (Bloom’s and a difficulty taxonomy). The results demonstrate that the generated questions are high quality and sufficiently useful, showing their promise for widespread use in the classroom setting.

Notes

  1. [10] show that subject matter experts can’t distinguish between machine and human written questions, but state that a future direction is to assess CTG with teachers.

  2. The passages, few-shot examples, prompt format, taxonomic level definitions, annotator demographics and raw results are available: https://tinyurl.com/y2hy8m4p.

  3. Despite not being a teacher’s opinion, this is evaluated because we want to know the model’s success here without relying on automatic assessment.

References

  1. Baidoo-Anu, D., Owusu Ansah, L.: Education in the Era of Generative Artificial Intelligence (AI): Understanding the Potential Benefits of ChatGPT in Promoting Teaching and Learning (2023). Available at SSRN 4337484

  2. Landis, J.R., Koch, G.G.: The measurement of observer agreement for categorical data. Biometrics, 159–174 (1977)

  3. Krathwohl, D.R.: A revision of Bloom’s taxonomy: An overview. Theory Practice 41(4), 212–218 (2002)

  4. Liu, P., Yuan, W., Fu, J., Jiang, Z., Hayashi, H., Neubig, G.: Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing. ACM Comput. Surv. 55(9), 1–35 (2023)

  5. Mulla, N., Gharpure, P.: Automatic question generation: a review of methodologies, datasets, evaluation metrics, and applications. Progress Artif. Intell., 1–32 (2023)

  6. Ouyang, L., et al.: Training language models to follow instructions with human feedback. arXiv preprint arXiv:2203.02155 (2022)

  7. Pérez, E.V., Santos, L.M.R., Pérez, M.J.V., de Castro Fernández, J.P., Martín, R.G.: Automatic classification of question difficulty level: Teachers’ estimation vs. students’ perception. In: 2012 Frontiers in Education Conference Proceedings, pp. 1–5. IEEE (2012)

  8. Terwiesch, C.: Would Chat GPT Get a Wharton MBA? A Prediction Based on Its Performance in the Operations Management Course. Mack Institute for Innovation Management at the Wharton School, University of Pennsylvania (2023)

  9. Wang, X., Fan, S., Houghton, J., Wang, L.: Towards Process-Oriented, Modular, and Versatile Question Generation that Meets Educational Needs. arXiv preprint arXiv:2205.00355 (2022)

  10. Wang, Z., Valdez, J., Basu Mallick, D., Baraniuk, R.G.: Towards human-like educational question generation with large language models. In: Artificial Intelligence in Education: 23rd International Conference, AIED 2022, Durham, UK, 2022, Proceedings, Part I, pp. 153–166. Springer International Publishing, Cham (2022)

  11. Zhang, H., Song, H., Li, S., Zhou, M., Song, D.: A survey of controllable text generation using transformer-based pre-trained language models. arXiv preprint arXiv:2201.05337 (2022)

Acknowledgements

We’d like to thank Mitacs for their grant for this project, and CIFAR for their continued support. We are grateful to both the annotators for their time and the anonymous reviewers for their valuable feedback.

Author information

Corresponding author

Correspondence to Sabina Elkins.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Elkins, S., Kochmar, E., Serban, I., Cheung, J.C.K. (2023). How Useful Are Educational Questions Generated by Large Language Models?. In: Wang, N., Rebolledo-Mendez, G., Dimitrova, V., Matsuda, N., Santos, O.C. (eds) Artificial Intelligence in Education. Posters and Late Breaking Results, Workshops and Tutorials, Industry and Innovation Tracks, Practitioners, Doctoral Consortium and Blue Sky. AIED 2023. Communications in Computer and Information Science, vol 1831. Springer, Cham. https://doi.org/10.1007/978-3-031-36336-8_83

  • DOI: https://doi.org/10.1007/978-3-031-36336-8_83

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-36335-1

  • Online ISBN: 978-3-031-36336-8

  • eBook Packages: Computer Science (R0)
