Is ChatGPT a Threat to Formative Assessment in College-Level Science? An Analysis of Linguistic and Content-Level Features to Classify Response Types

Wang, Heqiao; Li, Tingting; Haudek, Kevin; Royse, Emily A.; Manzanares, Mandy; Adams, Sol; Horne, Lydia; Romulo, Chelsie

doi:10.1007/978-981-99-7947-9_13

Part of the book series: Lecture Notes on Data Engineering and Communications Technologies (LNDECT, volume 190)

Included in the following conference series:

International Conference on Artificial Intelligence in Education Technology

Abstract

The impact of OpenAI’s ChatGPT on education has led to a reexamination of traditional pedagogical methods and assessments. However, ChatGPT’s performance capabilities on a wide range of assessments remain to be determined. This study aims to classify ChatGPT-generated and student-constructed responses to a college-level environmental science question and to explore the linguistic- and content-level features that capture the differences in language use between them. The Coh-Metrix text analysis tool was used to identify and extract linguistic and textual features. We then employed a random forest feature selection method to determine the most representative and nonredundant text-based features. We also employed TF-IDF metrics to represent the content of the written responses. The true performance of the classification models was evaluated and compared in three scenarios: (a) using content-level features alone, (b) using linguistic-level features alone, and (c) using the two in combination. The results demonstrated that accuracy, specificity, sensitivity, and F1-score all increased when the two levels of features were combined. The results of this study hold promise to provide valuable insights for instructors to detect student responses and integrate ChatGPT into their course development. This study also highlights the significance of linguistic- and content-level features in AI education research.
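The four evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It assumes a binary task in which the ChatGPT-generated response is treated as the positive class; that choice, the function name binary_metrics, and the labels below are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float  # true positive rate (recall)
    specificity: float  # true negative rate
    f1: float


def binary_metrics(y_true, y_pred, positive="chatgpt"):
    """Compute the four metrics from paired true and predicted labels.

    Assumption: `positive` marks the ChatGPT-generated class; every other
    label is treated as the negative (student-written) class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    def ratio(num, den):
        # Return 0.0 rather than dividing by zero when a class is absent.
        return num / den if den else 0.0

    return BinaryMetrics(
        accuracy=ratio(tp + tn, len(pairs)),
        sensitivity=ratio(tp, tp + fn),
        specificity=ratio(tn, tn + fp),
        f1=ratio(2 * tp, 2 * tp + fp + fn),
    )


# Made-up labels for illustration only; these are not data from the study.
y_true = ["chatgpt", "chatgpt", "chatgpt",
          "student", "student", "student", "student", "student"]
y_pred = ["chatgpt", "chatgpt", "student",
          "student", "student", "student", "chatgpt", "student"]

m = binary_metrics(y_true, y_pred)
print(m)
```

Sensitivity here is the share of ChatGPT-generated responses that are flagged, and specificity is the share of student responses that are correctly left unflagged. Reporting both alongside accuracy matters when the two response types are not equally represented in the sample.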

Acknowledgements

We would like to thank Steven Anderson, Shirley Vincent, Ennea Fairchild and other members of the Next Generation Concept Inventory project for their assistance. This material is based upon work supported by the National Science Foundation under Grant No. 2013359. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Author information

Authors and Affiliations

Michigan State University, 620 Farm Lane, East Lansing, MI, 48824, USA
Heqiao Wang, Tingting Li & Kevin Haudek
University of Northern Colorado, Candelaria 2200, Box 115, Greeley, CO, 80639, USA
Emily A. Royse, Mandy Manzanares, Sol Adams & Chelsie Romulo
Rowan University, Discovery Hall 127, 201 Mullica Hill Road, Glassboro, NJ, 08028, USA
Lydia Horne

Corresponding author

Correspondence to Heqiao Wang.

Editor information

Editors and Affiliations

IU International University of Applied Sciences, Erfurt, Germany
Tim Schlippe
The Education University of Hong Kong, Hong Kong S.A.R., China
Eric C. K. Cheng
Swinburne University of Technology, Melbourne, VIC, Australia
Tianchong Wang

About this paper

Cite this paper

Wang, H. et al. (2023). Is ChatGPT a Threat to Formative Assessment in College-Level Science? An Analysis of Linguistic and Content-Level Features to Classify Response Types. In: Schlippe, T., Cheng, E.C.K., Wang, T. (eds) Artificial Intelligence in Education Technologies: New Development and Innovative Practices. AIET 2023. Lecture Notes on Data Engineering and Communications Technologies, vol 190. Springer, Singapore. https://doi.org/10.1007/978-981-99-7947-9_13

DOI: https://doi.org/10.1007/978-981-99-7947-9_13
Published: 09 November 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7946-2
Online ISBN: 978-981-99-7947-9
eBook Packages: Education, Education (R0)
