Abstract
Online education platforms rely on NLP pipelines that use models such as BERT to aid in content curation. Since the advent of pre-trained language models such as BERT, there have been many efforts to adapt these models to specific domains. To the best of our knowledge, however, no model has been adapted specifically for the education domain (particularly K-12) across subjects. In this work, we train a language model, K-12BERT, on a corpus we curated from various sources covering multiple K-12 subjects, and evaluate it on downstream tasks such as hierarchical taxonomy tagging.
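The training described in the abstract amounts to continued (domain-adaptive) pre-training of BERT with the masked language modeling objective on a domain corpus. A minimal sketch of what this might look like with the Hugging Face Transformers library follows; the corpus file name, hyperparameters, and the choice of bert-base-uncased as the starting checkpoint are illustrative assumptions, not the authors' reported setup.

# Hypothetical sketch: continued MLM pre-training of BERT on a K-12 corpus.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# "k12_corpus.txt" is a placeholder: one document per line, drawn from the
# curated multi-subject K-12 corpus.
dataset = load_dataset("text", data_files={"train": "k12_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

# Randomly mask 15% of tokens, the standard BERT MLM objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="k12bert",
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=5e-5,
)

Trainer(model=model, args=args, train_dataset=tokenized, data_collator=collator).train()

The resulting checkpoint can then be fine-tuned on downstream tasks such as hierarchical taxonomy tagging in the usual way, by loading it with a task-specific classification head.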
References
Araci, D.: FinBERT: financial sentiment analysis with pre-trained language models (2019). https://arxiv.org/abs/1908.10063
Beltagy, I., Lo, K., Cohan, A.: SciBERT: a pretrained language model for scientific text (2019). https://arxiv.org/abs/1903.10676
Cer, D., et al.: Universal sentence encoder (2018)
Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding (2018). https://arxiv.org/abs/1810.04805
Gu, Y., et al.: Domain-specific language model pretraining for biomedical natural language processing. ACM Trans. Comput. Healthc. 3(1), 1–23 (2022). https://doi.org/10.1145/3458754
Lee, J.S., Hsiang, J.: PatentBERT: patent classification with fine-tuning a pre-trained BERT model (2019). https://arxiv.org/abs/1906.02124
Lee, J., et al.: BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics, September 2019. https://doi.org/10.1093/bioinformatics/btz682
Venktesh, V., Mohania, M., Goyal, V.: TagRec: automated tagging of questions with hierarchical learning taxonomy (2021). https://arxiv.org/abs/2107.10649
Zhu, H., Peng, H., Lyu, Z., Hou, L., Li, J., Xiao, J.: TravelBERT: pre-training language model incorporating domain-specific heterogeneous knowledge into a unified representation (2021). https://arxiv.org/abs/2109.01048
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Goel, V., Sahnan, D., Venktesh, V., Sharma, G., Dwivedi, D., Mohania, M. (2022). K-12BERT: BERT for K-12 Education. In: Rodrigo, M.M., Matsuda, N., Cristea, A.I., Dimitrova, V. (eds) Artificial Intelligence in Education. Posters and Late Breaking Results, Workshops and Tutorials, Industry and Innovation Tracks, Practitioners’ and Doctoral Consortium. AIED 2022. Lecture Notes in Computer Science, vol 13356. Springer, Cham. https://doi.org/10.1007/978-3-031-11647-6_123
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-11646-9
Online ISBN: 978-3-031-11647-6