Abstract
The integration of formative practice questions with textbook content is a well-known method for increasing student learning, and recent advances in artificial intelligence have made automatic question generation a viable option for scaling this learning method to thousands of textbooks. To expand this method to even more students, a parallel construction approach was developed that uses the English question generation process to create questions in other languages, such as Spanish. However, validation of the Spanish questions by native-speaking subject matter experts is a necessary step to ensure that the questions generated through parallel construction are of the same quality and suitable for educational purposes. In this paper, questions were generated via parallel construction for six Spanish textbooks and evaluated by subject matter experts teaching those subjects at major universities in Mexico and Argentina. Results from this review are discussed, and implications for future use and research are outlined.
Acknowledgements
We gratefully acknowledge the SME reviewers and coordinators for their participation in this research, Reilly Fitzgibbons for her assistance with the project, and the issue editor and anonymous reviewers for their constructive comments on the manuscript.
Funding
This work was supported by VitalSource Technologies.
The authors are employed at the company supporting the research.
Author information
Contributions
Benny G. Johnson, Rachel Van Campenhout, Bill Jerome, and Jeffrey S. Dittel contributed to the study conception and design. Maria Fernanda Castro and Rodrigo Bistolfi contributed subject matter expertise for question validation and reviewer instructions and feedback. Rachel Van Campenhout and Benny G. Johnson completed the manuscript. All authors approve the manuscript.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Johnson, B.G., Van Campenhout, R., Jerome, B. et al. Automatic Question Generation for Spanish Textbooks: Evaluating Spanish Questions Generated with the Parallel Construction Method. Int J Artif Intell Educ (2024). https://doi.org/10.1007/s40593-024-00394-1
DOI: https://doi.org/10.1007/s40593-024-00394-1