Abstract
Objective
ChatGPT (Chat Generative Pre-trained Transformer) is an artificial intelligence language tool developed by OpenAI that utilises machine learning algorithms to generate text that closely mimics human language. It has recently attracted widespread attention, and concerns have been raised regarding the accuracy of the documents it generates. This study compares the accuracy and quality of several ChatGPT-generated academic articles with those written by human authors.
Material and methods
We assessed the accuracy of ChatGPT-generated radiology articles by comparing them with human-written articles that were either published or written and under review. The articles were independently analysed by two fellowship-trained musculoskeletal radiologists and graded on a scale of 1 to 5 (1 being bad and inaccurate, 5 being excellent and accurate).
Results
In total, 4 of the 5 articles written by ChatGPT were significantly inaccurate and contained fictitious references. One of the papers was well written, with a good introduction and discussion; however, all of its references were fictitious.
Conclusion
ChatGPT is able to generate coherent research articles, which on initial review may closely resemble authentic articles published by academic researchers. However, all of the articles we assessed were factually inaccurate and had fictitious references. Notably, the generated articles may nonetheless appear authentic to an untrained reader.
Ethics declarations
Competing interests
The authors declare no competing interests.
About this article
Cite this article
Ariyaratne, S., Iyengar, K.P., Nischal, N. et al. A comparison of ChatGPT-generated articles with human-written articles. Skeletal Radiol 52, 1755–1758 (2023). https://doi.org/10.1007/s00256-023-04340-5
DOI: https://doi.org/10.1007/s00256-023-04340-5