Abstract
Purpose
ChatGPT has gained popularity as a web application since its release in 2022. While artificial intelligence (AI) systems’ potential in scientific writing is widely discussed, their reliability in reviewing literature and providing accurate references remains unexplored. This study examines the reliability of references generated by ChatGPT language models in the Head and Neck field.
Methods
Twenty clinical questions spanning different Head and Neck disciplines were used to prompt ChatGPT versions 3.5 and 4.0 to produce texts on the assigned topics. The references cited in the generated texts were categorized as “true,” “erroneous,” or “inexistent” based on their congruence with existing records in scientific databases.
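The three-way categorization can be illustrated with a minimal sketch. The abstract does not state the exact matching criteria, so the rule below is an assumption made only for illustration: a reference is "inexistent" when no database record shares its title, "true" when a record with that title also agrees on first author, year, and journal, and "erroneous" otherwise. The `Reference` type, the `classify` function, and the toy `DATABASE` entries are hypothetical placeholders, not real citations and not the authors' procedure.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """Hypothetical minimal bibliographic record used for comparison."""
    title: str
    first_author: str
    year: int
    journal: str


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def classify(generated: Reference, database: list[Reference]) -> str:
    """Return 'true', 'erroneous', or 'inexistent' for one generated reference.

    Assumed rule (not taken from the paper): title match decides existence,
    and the remaining fields decide whether the existing record is congruent.
    """
    matches = [r for r in database if normalize(r.title) == normalize(generated.title)]
    if not matches:
        return "inexistent"
    for record in matches:
        if (
            normalize(record.first_author) == normalize(generated.first_author)
            and record.year == generated.year
            and normalize(record.journal) == normalize(generated.journal)
        ):
            return "true"
    return "erroneous"


# Toy stand-in for a scientific database; entries are placeholders.
DATABASE = [
    Reference("Placeholder Title One", "Author A", 2020, "Journal X"),
    Reference("Placeholder Title Two", "Author B", 2021, "Journal Y"),
]

exact = Reference("placeholder title one", "Author A", 2020, "Journal X")
wrong_year = Reference("Placeholder Title Two", "Author B", 2019, "Journal Y")
unknown = Reference("Title Found Nowhere", "Author C", 2022, "Journal Z")

for ref in (exact, wrong_year, unknown):
    print(ref.title, "->", classify(ref, DATABASE))
```

In practice the lookup would be a manual or programmatic search of bibliographic databases rather than an in-memory list, and the threshold between "erroneous" and "inexistent" would need to follow the definitions given in the full Methods section.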
Results
ChatGPT 4.0 outperformed version 3.5 in terms of reference reliability. However, both versions tended to provide erroneous or inexistent references.
Conclusions
It is crucial to address this challenge to maintain the reliability of scientific literature. Journals and institutions should establish strategies and good-practice principles in the evolving landscape of AI-assisted scientific writing.
Data availability
The data that support the findings of this study are openly available in Mendeley at https://doi.org/10.17632/y6wbt9snv7.1.
Acknowledgements
None.
Funding
None.
Author information
Contributions
Conceptualization, AF, GG and PG; methodology, AF and GG; formal analysis, AF, LF, GM, LAV; investigation, AF, LF, SB, LAV; resources, AF, SB, LAV; data curation, AF, LF, SB, GM, GG, LAV; writing (original draft preparation), AF, SB; writing (review and editing), AF, LF, GM, LAV; visualization, AF; supervision, GM, GG, PG, CdF; project administration, GM, GG, CdF, PG; funding acquisition, not applicable. All authors have read and agreed to the published version of the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Institutional review board
Not required.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Frosolini, A., Franz, L., Benedetti, S. et al. Assessing the accuracy of ChatGPT references in head and neck and ENT disciplines. Eur Arch Otorhinolaryngol 280, 5129–5133 (2023). https://doi.org/10.1007/s00405-023-08205-4
DOI: https://doi.org/10.1007/s00405-023-08205-4