A survey of the literature: how scholars use text mining in Educational Studies?

Education and Information Technologies

Abstract

The large and growing body of education-related text offers rich information that can support education in many ways, but its volume makes manual analysis impractical. Text mining can automatically analyze large-scale text and generate insights from it. However, many educational scholars are not fully aware of whether text mining is useful for their studies or how to apply it. To address this problem, we reviewed educational research that used text mining techniques. Specifically, we proposed an educational text mining workflow and, in alignment with that workflow, examined each article's bibliographic information, research methodology, and applications. We selected 161 articles published in educational journals from 2015 to 2020. We find that text mining is becoming more popular and essential in educational research. We conclude that text mining can be applied in educational studies in three steps: text source selection, application of text mining techniques, and educational information discovery. We also summarize the options available at each step. Our work should help educational scholars better understand educational text mining and provide supporting information for future research on text mining in educational contexts.
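To make the three-step workflow concrete, the sketch below walks through it on a tiny, invented set of student comments. It is only an illustration under stated assumptions: the sample comments, the stopword list, and the function names (`tokenize`, `term_frequencies`, `top_terms`) are hypothetical, and simple term counting stands in for whichever text mining technique a study would actually use.

```python
import re
from collections import Counter

# Step 1: text source selection.
# Hypothetical sample of student feedback; a real study would load
# forum posts, course evaluations, essays, or similar sources.
comments = [
    "The lectures were clear and the feedback was helpful.",
    "Feedback on assignments arrived late.",
    "Clear lectures, but the assignments were too long.",
]

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "and", "was", "were", "on", "but", "too", "a", "of"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def term_frequencies(docs):
    """Step 2: apply a text mining technique (here, plain term counting)."""
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc))
    return counts


def top_terms(counts, n=3):
    """Step 3: information discovery, reduced here to the most frequent terms."""
    return counts.most_common(n)


freqs = term_frequencies(comments)
print(top_terms(freqs, 4))
```

In this toy sample, the terms that recur across comments (lectures, feedback, assignments) point to the themes a researcher might examine further. A real study would replace the counting step with a technique suited to its question, such as topic modeling or sentiment analysis.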

Data Availability

We make sure that all data and materials support our published claims and comply with field standards.

Funding

Not applicable.

Author information

Corresponding author

Correspondence to Junhe Yang.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Yang, J., Kinshuk & An, Y. A survey of the literature: how scholars use text mining in Educational Studies?. Educ Inf Technol 28, 2071–2090 (2023). https://doi.org/10.1007/s10639-022-11193-3

  • DOI: https://doi.org/10.1007/s10639-022-11193-3
